symfony2,behat,mink,guzzle,goutte
OK, I finally found the answer; I hope it helps someone. To upload a file, the correct way is: $fields = $table->getColumnsHash()[0]; /* array('name' => 'test', 'surname' => 'test') */ $fields["file"] = fopen($path, 'rb'); $this->client->request("POST", $url, array('Content-Type' => 'multipart/form-data'), array(), array(), $fields); The trick is that you must not use the fourth parameter...
php,curl,libcurl,simple-html-dom,goutte
function get_web_page( $url ) { $options = array( CURLOPT_RETURNTRANSFER => true, CURLOPT_HEADER => false, CURLOPT_FOLLOWLOCATION => true, ); $ch = curl_init( $url ); curl_setopt_array( $ch, $options ); $content = curl_exec( $ch ); curl_close( $ch ); return $content; } echo get_web_page("http://www.emirates.com/account/english/miles-calculator/miles-calculator.aspx?org=BOM&dest=JFK&trvc=0&h=7b1dc440b5eecbda143bd8e7b9ef53a27e364b"); ...
I should think this site is very crawlable. To understand what is going on, turn off JavaScript in your browser and try to browse the site (to do this, I use the Disable->Disable JavaScript menu in Firebug, which is a Firefox plugin). If you go to your first link, and...
Aha: this was neither Guzzle nor Goutte. Elsewhere in my code, I intercept the request.success event for the purposes of HTTP logging. There I call gethostbyname(), whose explicit purpose is to do a DNS lookup. Now that this call is disabled, the "mysterious" DNS calls have disappeared....
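A minimal sketch of how a logging hook could record the request's host without triggering a DNS lookup, assuming the log only needs the host name from the URL. The helper name `hostForLog` is hypothetical and is not part of Guzzle or Goutte; `parse_url()` only parses the string, whereas `gethostbyname()` asks the resolver.

```php
<?php
// Hypothetical helper for an HTTP logging hook: report the host of a URL
// without resolving it, so no DNS traffic is generated by the logger itself.
function hostForLog(string $url): string
{
    // parse_url() is pure string parsing; it never touches the network.
    $host = parse_url($url, PHP_URL_HOST);

    // parse_url() yields null when the URL has no host and false when it is malformed.
    return is_string($host) ? $host : '(unknown host)';
}

echo hostForLog('http://www.example.com/path?x=1'), PHP_EOL;
```

If the resolved IP address is genuinely needed in the log, the lookup could be made opt-in so that it does not run on every request.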
I got the answer: $client->getClient()->get($img_url, ['save_to' => $img_url_save_name, 'headers'=>['Referer'=>$src] ]); I can actually set the Referer header in Goutte\Client, but there is no option to give a path for saving the image, so in the end I used the Guzzle client instead....
php,symfony2,automation,ui-automation,goutte
Yes, it is possible. Live supports OAuth2: https://msdn.microsoft.com/en-us/library/hh243647.aspx There are plenty of OAuth2 clients written in PHP....
The code below will fix this issue: $crawler->filter('#most-popular > div.panel.open > ol > li.first-child.ol1 > a')->each(function ($node) { $href = $node->extract(array('href')); var_dump($href[0]); }); ...
php,symfony2,web-scraping,goutte
The tools you decided to use make real HTTP connections and are not suitable for what you want to do, at least not out of the box. Option 1: Implement your own BrowserKit Client. All Goutte does is extend BrowserKit's Client and implement HTTP requests with Guzzle. All you need to...
php,symfony2,web-crawler,goutte
Assuming that $msg is a Crawler object which contains this HTML: <div class="mola_wrap"> <span class="mola" title="titleinside">109</span> </div> your code is just fine. Maybe the website you are crawling does not have the .mola class on some pages ...
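A minimal sketch of guarding against pages that lack a .mola element. So that it runs without Composer packages, it uses PHP's built-in DOM extension as a stand-in for the Crawler; the function name `molaTitle` and the sample markup without .mola are assumptions made for illustration, not part of the original question.

```php
<?php
// Hypothetical helper: return the title of the first .mola element,
// or null when the page has no such element.
function molaTitle(string $html): ?string
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match elements whose class list contains the exact token "mola"
    // (so "mola_wrap" does not match).
    $nodes = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' mola ')]"
    );

    // Pages without a .mola element give an empty list: return null instead of failing.
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('title');
}

$withMola = '<div class="mola_wrap"><span class="mola" title="titleinside">109</span></div>';
$withoutMola = '<div class="mola_wrap"></div>';

var_dump(molaTitle($withMola));
var_dump(molaTitle($withoutMola));
```

With the Crawler itself, the same idea would be to check whether the filtered node list is empty before reading an attribute from it.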