
What does this error mean with mechanize in python?

python,mechanize,bots

You'll probably kick yourself about this, but you need to include the scheme in the URL, i.e. http. Try changing codecademy = 'www.codecademy.com' to codecademy = 'http://www.codecademy.com' ...
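
As a quick sketch of the idea (the helper name is mine, not from the answer), you can normalize the URL before handing it to mechanize:

```python
from urllib.parse import urlparse

def ensure_scheme(url, default="http"):
    """Prepend a scheme when the URL lacks one, so mechanize/urllib can open it."""
    if not urlparse(url).scheme:
        return "{}://{}".format(default, url)
    return url

print(ensure_scheme("www.codecademy.com"))   # http://www.codecademy.com
print(ensure_scheme("https://example.com"))  # https://example.com
```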

How is Ruby Mechanize fast after first get request?

ruby,web-scraping,mechanize

There are lots of possible sources for a speed change between requests. A few that immediately spring to mind: DNS lookup cached on your client. The first call must convert "xyz.com" to "123.45.67.89", involving a DNS lookup which may be slow. HTTP keep-alive. There is an initial conversation between client...

Using mechanize to input username and password

python,screen-scraping,mechanize

This is how I selected the first form in my code. br.select_form(nr=0) #Form fields to populate br.form['username'] = username br.form['password'] = password #Submit the login form br.submit() Modify it to suit your needs. The "nr=0" is probably what you're looking for. But the problem is the DOCTYPE. I tested the...

Change option in select using Ruby Mechanize

javascript,ruby,select,mechanize

Let's say the option looks like this: option = page.at('option[text()=foo]') You would do the action (change location to the option's value) with: page = agent.get option[:value] ...

Why am I getting an unsupportedSchemeError

ruby,mechanize

Where does the link point to? I.e., what's the href? I ask because "scheme" usually refers to something like http or https or ftp, so maybe the link has a weird scheme that mechanize doesn't know how to handle, hence Mechanize::UnsupportedSchemeError...

Python mechanize saying existing control does not exist

python,web-scraping,mechanize,scrape

It's zero-indexed. Try the code below: br.select_form(nr=0) ...

How to implement caching for results after web scraping with mechanize using Python

python,caching,web-scraping,mechanize

There are plenty of easy ways to implement caching. Write data to a file. This is especially easy if you are just dealing with small amounts of data. JSON blobs can also be easily written to a file. with open("my_cache_file", "a+") as file_: file_.write(my_json_blob) Use a key value store to...
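
A minimal sketch of the file-based approach (the cache path and keys here are invented for the demo): keep a dict of url → payload and dump it as JSON:

```python
import json
import os
import tempfile

# Hypothetical cache location; use whatever path suits your project.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "scrape_cache.json")

def load_cache():
    """Read the whole cache dict back, or start empty."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as fh:
            return json.load(fh)
    return {}

def save_to_cache(key, value):
    """Store one scraped result under its URL."""
    cache = load_cache()
    cache[key] = value
    with open(CACHE_PATH, "w") as fh:
        json.dump(cache, fh)

if os.path.exists(CACHE_PATH):   # start fresh for this demo
    os.remove(CACHE_PATH)
save_to_cache("http://example.com/page1", "<html>scraped body</html>")
print(load_cache()["http://example.com/page1"])   # <html>scraped body</html>
```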

Hidden HTML elements using Mechanize Python

python,html,web-scraping,mechanize,hidden

Pretty sure there is javascript involved which mechanize cannot handle. An alternative solution here would be to automate a real browser through selenium: from selenium import webdriver from selenium.common.exceptions import TimeoutException from selenium.webdriver.common.by import By from selenium.webdriver.support.ui import WebDriverWait from selenium.webdriver.support import expected_conditions as EC driver = webdriver.Firefox() # could...

How do I save data to a multi-dimensional Ruby hash then convert the hash to a single JSON file?

ruby,hash,web-scraping,nokogiri,mechanize

With help from here, here and here I have the fully working code: require 'mechanize' @hashes = [] # Initialize Mechanize object a = Mechanize.new # Begin scraping a.get('http://www.marktplaats.nl/') do |page| groups = page.search('//*[(@id = "navigation-categories")]//a') groups.each_with_index do |group, index_1| a.get(group[:href]) do |page_2| categories = page_2.search('//*[(@id = "category-browser")]//a') categories.each_with_index do...

Beautiful Soup trying to get information on /<— comment tag

python,facebook,web-scraping,beautifulsoup,mechanize

Or instead of scraping Facebook, you can do it the proper way - through their graph API ;) import requests url = "http://graph.facebook.com/{}".format("zuck") params = { "fields": "picture" } response = requests.get(url, params=params).json() picture_url = response['picture']['data']['url'] print(picture_url) # output: #...
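
No network is needed to see the shape of the request; the Graph API URL from the snippet can be assembled like this (using "zuck" as the example username from the answer):

```python
from urllib.parse import urlencode

# Build the same URL the requests call above would hit.
base = "http://graph.facebook.com/{}".format("zuck")
query = urlencode({"fields": "picture"})
full_url = "{}?{}".format(base, query)
print(full_url)   # http://graph.facebook.com/zuck?fields=picture
```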

Mechanize: Unable to redirect to final destination

ruby,mechanize,mechanize-ruby

Looks like there is a meta refresh in there (per your description). Try adding this to your Mechanize object: a.follow_meta_refresh = true Also, you may want to set your user_agent to an accepted value instead of your custom one: require 'mechanize' Mechanize::AGENT_ALIASES.each { |k,v| puts k } => Mechanize => Linux Firefox...

BeautifulSoup parse 'findAll' run error

python,beautifulsoup,mechanize

You need to supply a user agent: url = "http://www.marinetraffic.com/en/ais/index/positions/all/shipid:415660/mmsi:354975000/shipname:ADESSA%20OCEAN%20KING/_:6012a2741fdfd2213679de8a23ab60d3" import requests headers = {'User-agent': 'Mozilla/5.0'} html = requests.get(url,headers=headers).content soup = BeautifulSoup(html) table = soup.find("table") # only one table So just unpack the list with something like: for row in table.findAll('tr')[1:]: items = row.text.replace(u"kn","") # remove kn so items line...

Logging Into Google To Scrape A Private Google Group (over HTTPS)

ruby,web-scraping,mechanize

I worked with Ian (the OP) on this problem and just felt we should close this thread with some answers based on what we found when we spent some more time on the problem. 1) You can't scrape a Google Group with Mechanize. We managed to get logged in, but...

Error trying to image scrape

ruby,image,mechanize,scrape

open takes an argument that's a filename, not a URL. If you want to access the URL, you would normally have to do a lot more than simply open a file. Luckily, Ruby provides a nice wrapper for Net::HTTP, called open-uri. Just drop the following line at the top of...

Writing loop over multiple pages with BeautifulSoup

python,loops,beautifulsoup,mechanize,bs4

You don't actually need the requests module to iterate through paged search results; mechanize is more than enough. This is one possible approach using mechanize. First, get all paging links from the current page: links = br.links(url_regex=r"fuseaction=home.search&pageNumber=") Then iterate through the paging links, open each link and gather useful information from each...

Loop over all the <dd> tags and extract specific information via Mechanize/Nokogiri

html,ruby,nokogiri,mechanize

Here's an example of how you could parse the bold text and href attribute from the anchor elements you describe: require 'nokogiri' require 'open-uri' url = 'http://openie.allenai.org/sentences/?rel=contains&arg2=antioxidant&title=Green%20tea' doc = Nokogiri::HTML(open(url)) doc.xpath('//dd/*/a').each do |a| text = a.xpath('.//b').map {|b| b.text.gsub(/\s+/, ' ').strip} href = a['href'] puts "OK: text=#{text.inspect}, href=#{href.inspect}" end # OK:...

Scraping tabulated paginated data

ruby-on-rails-3,web-scraping,mechanize

When you click one of the page links, JavaScript on the page triggers a POST request to the same path with the new page number. This can be found in their js file at http://eserver.goutsi.com:8080/js/LoadBoard.js function gotoPage(pageNumber) { document.getElementById("PageNbr").value= pageNumber; // Set new page number document.getElementById("PageDir").value="R"; // Refresh document.getElementById("theForm").submit(); }...

href does not want to get printed although I followed the path

html,ruby,xpath,nokogiri,mechanize

It works for me; you can also try ;) array = page.search('//*[@class="visible-phone"]/a').each { |i| puts i['href'] } UPD: first_link_page = page.link_with(:href => array.first['href']).click UPD2: array = page.search('//*[@class="visible-phone"]/a') first_link_page = page.link_with(:href => array.first['href']).click ...

Ruby Mechanize form input field text

ruby,csv,automation,web-scraping,mechanize

Your error is telling you that something on line 19 in your code is causing the issue for line 442 in mechanize. I tried your sample out in IRB and it seems to work fine: 2.2.2 :001 > require 'mechanize' => true 2.2.2 :002 > agent = Mechanize.new => #<Mechanize:......

How would I find these grades and these class names using mechanize and BeautifulSoup?

python,beautifulsoup,mechanize,urllib

Assuming you know how to grab the page with requests, you would do the following: ... from bs4 import BeautifulSoup ... gradetd = BeautifulSoup(html).find_all('td',{'class':'fixed-column important'}) for row in gradetd: grades = row.find('span',{'class':'expandable-row'}).text.strip() if grades: avg, grade = grades.split(' ') print("{}/{}".format(avg,grade)) 89.83/A- 99.14/A+ 98.20/A+ 91.14/A- 94.32/A 95.76/A 91.28/A- 85.42/B 97.86/A+ 95.63/A...

Why does this JSON file get filled with 1747 times the last Hash data?

ruby,json,hash,web-scraping,mechanize

The same object, the result of @categories_hash['category'], is being updated each loop. Thus the array is filled with the same object 1747 times, and the object reflects the mutations done on the last loop when it is viewed later. While a fix might be to use @categories_hash[category_name] or similar (i.e....
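
The bug is language-agnostic; here is the same aliasing mistake and its fix, sketched in Python rather than the OP's Ruby:

```python
# Buggy: appending the SAME dict each iteration stores N references to one object.
shared = {}
rows_buggy = []
for i in range(3):
    shared["value"] = i
    rows_buggy.append(shared)          # every element aliases the same dict
print(rows_buggy)                      # [{'value': 2}, {'value': 2}, {'value': 2}]

# Fixed: build a fresh object on every iteration.
rows_ok = [{"value": i} for i in range(3)]
print(rows_ok)                         # [{'value': 0}, {'value': 1}, {'value': 2}]
```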

HTTP Error 999: Request denied

python,web-scraping,beautifulsoup,linkedin,mechanize

Try to set up the User-Agent header. Add this line after op.set_handle_robots(False): op.addheaders = [('User-Agent', "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36")] Edit: If you want to scrape web sites, first check whether they have an API or a library that deals with the API....

Recovering from HTTPError in Mechanize

python,mechanize,http-error

It's been a while since I've written Python, but I think I have a workaround for your problem. Try this method: import requests except mechanize.HTTPError: while True: ## DANGER ## ## You will need to format and/or decode the POST for your form response = requests.post('http://yourwebsite.com/formlink', data=None, json=None) ##...

How to get option values from second select list?

ruby,web-scraping,mechanize

Mechanize does not support JavaScript and since the model select box is populated via JavaScript, using the select boxes is not going to work. However, http://www.parkers.co.uk/ works fine (though differently) if you disable JavaScript. Disable JavaScript in your browser while using the site and you'll notice that you get a...

Crawling with urllib2 and mechanize is throwing error to me

python,urllib2,mechanize

Got the same error; try using a user agent or requests: import requests response=requests.get('http://proxygaz.com/country/india-proxy/') print(response.status_code) 200 Using a user agent also works fine: import urllib2 resp = urllib2.Request('http://proxygaz.com/country/india-proxy/') resp.add_header('User-Agent', 'FIREFOX') opener = urllib2.build_opener() print opener.open(resp).read() ...
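
For reference, the Python 3 equivalent of the urllib2 version sets the header without opening the connection (no request is actually sent in this sketch):

```python
from urllib.request import Request

# Build the request object with a User-Agent header; urlopen(req) would send it.
req = Request("http://proxygaz.com/country/india-proxy/",
              headers={"User-Agent": "Mozilla/5.0"})
print(req.get_header("User-agent"))    # Mozilla/5.0
```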

Submitting a Form using Mechanize (PubChem)

python,mechanize

You can get data from PubChem with their service PUG REST. A simple example: import urllib2 import json def get(url): req = urllib2.Request(url) response=urllib2.urlopen(req) return response.read() pugrest = 'http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/' cmpd = 'methane' prop ='/property/MolecularFormula,MolecularWeight,CanonicalSMILES,InChI,IUPACName/JSON' data = get(pugrest+cmpd+prop) print data This will give you this json: { "PropertyTable": { "Properties": [...

Mechanize select from dropdown

python,mechanize

You need to specify the value as a list: if br.form["list"] == ["---"]: br.form["list"].value = ["1"] According to the mechanize - Forms documentation: # Controls that represent lists (checkbox, select and radio lists) are # ListControl instances. Their values are sequences of list item names. # They come in two...

crawling through pagination mechanize python

python,mechanize

The problem is that mechanize doesn't support JavaScript. When mechanize reaches the page after submitting, the JavaScript doesn't run, so the pagination click is never triggered. I achieved what I wanted using Selenium and Beautiful Soup, with the following Selenium selector: elem1 = driver.find_element_by_link_text("Next Page >>")...

How to run functions in the background in a Rake task?

ruby-on-rails,ruby,background,web-crawler,mechanize

You can create a new thread and call the method there: http://www.ruby-doc.org/core-2.1.5/Thread.html

ruby mechanize website scraping always returns javascript data only

ruby-on-rails,ruby,web-scraping,nokogiri,mechanize

I used capybara-webkit to solve my problem

mechanize and Ruby multipart/form-data - content transfer encoding

ruby,mechanize,mechanize-ruby

I have managed to identify the culprit here. As my development machine is Windows-based, this seems to have been an issue with mechanize (or one of its dependencies) and Windows. By specifying the b (binary) part in the second argument of File.new, the problem went away on its own. tl;dr:...

ruby how to close a mechanize connection

ruby-on-rails,ruby,mechanize,open-uri

I think you'll want to use a Mechanize#start block: 10.times do Mechanize.start do |minion| minion.open_timeout = 15 minion.read_timeout = 15 minion.set_proxy '212.82.126.32', 80 page = minion.get("http://www.whatsmyip.org/") proxy_ip_adress = page.parser.css('#ip').text puts proxy_ip_adress end # minion definitely doesn't exist anymore end ...

Using Python and Mechanize to submit data in the website's html

python-2.7,mechanize

You can try out something like this: from mechanize import Browser from bs4 import BeautifulSoup br = Browser() br.set_handle_robots( False ) br.addheaders = [('User-agent', 'Firefox')] br.open("http://www.mcxindia.com/sitepages/BhavCopyCommodityWise.aspx") br.select_form("form1") #now enter the dates according to your choice br.form["mTbFromDate"] = "date-From" br.form["mTbToDate"] = "date-To" response = br.submit() #now read the response with BeautifulSoup...

click button (not in form) with mechanize/nokogiri

ruby,nokogiri,mechanize

You can probably use Watir instead. It can mimic actions (such as clicking or scrolling the page) in a browser.

In scraping, can't login with Mechanize

ruby-on-rails,ruby,login,web-scraping,mechanize

Instead of: docwatan = Nokogiri::HTML(open('http://www.elwatan.com/')) You want to do: docwatan = agent.get('http://www.elwatan.com/') otherwise the session cookie isn't getting sent in the request....

How to automate interaction for a website with POST method

python,selenium,mechanize

Pulled straight from the docs and changed to your example. from selenium import webdriver # Create a new instance of the Firefox driver driver = webdriver.Firefox() # go to the page driver.get("http://www.link.cs.cmu.edu/link/submit-sentence-4.html") # the page is ajaxy so the title is originally this: print driver.title # find the element that's...

Unable to execute python web scraping script successfully after user submits a form on a website built with Flask from the second time onwards

python,flask,screen-scraping,mechanize,cookiejar

Once your Flask app is started it only imports each package once. That means that when it runs into import webscrap for the second time it says “well, I already imported that earlier, so no need to take further action…” and moves on to the next line, rendering the template...
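
The import-caching behaviour is easy to demonstrate with a throwaway module (the module name and counter here are invented for the demo); importlib.reload, or better, calling a function instead of relying on import side effects, forces the work to happen again:

```python
import importlib
import os
import sys
import tempfile

# Create a tiny module whose top-level code counts its own executions.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "webscrap_demo.py"), "w") as fh:
    fh.write("import builtins\n"
             "builtins._runs = getattr(builtins, '_runs', 0) + 1\n")
sys.path.insert(0, tmpdir)

import builtins
import webscrap_demo                  # first import: top-level code runs
import webscrap_demo                  # cached in sys.modules: does NOT run again
print(builtins._runs)                 # 1
importlib.reload(webscrap_demo)       # forces re-execution of the top-level code
print(builtins._runs)                 # 2
```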

How to check for webpages' popups?

python,firefox,popup,mechanize,webpage

Mechanize cannot handle javascript and popup windows: How do I use Mechanize to process JavaScript? Mechanize and Javascript To accomplish the goal, you need to utilize a real browser, headless or not. This is where selenium would help. It has a built-in support for popup dialogs: Selenium WebDriver has built-in...

Python beautifulsoup not grabbing full table

python,beautifulsoup,mechanize

When combining the two it works without a problem so it might be related to the html.parser you are using. import mechanize from bs4 import BeautifulSoup URL = ('http://www.airchina.com.cn/www/jsp/airlines_operating_data/' 'exlshow_en.jsp') control_year = ['2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014'] control_month = ['01', '02', '03', '04', '05', '06', '07',...

Selenium interpret javascript on mac?

selenium,web-crawler,mechanize

Selenium is a browser automation tool. You can basically automate everything you can do in your browser. Start with going through the Getting Started section of the documentation. Example: from selenium import webdriver driver = webdriver.Firefox() driver.get("http://www.python.org") print driver.title driver.close() Besides automating common browsers, like Chrome, Firefox, Safari or Internet...

Extract data from HTML Table with mechanize

html,ruby-on-rails,ruby,parsing,mechanize

More succinct version relying more on the black magic of XPath :) require 'nokogiri' require 'open-uri' doc = Nokogiri::HTML(open('http://www.alpineascents.com/8000m-peaks.asp')) last_td = doc./("//tr[td[strong[text()='#{ARGV[0]}']]]/td[5]") puts last_td.text.gsub(/.*?;/, '').strip ...

Web Scraper for dynamic forms in python

python,web-scraping,web-crawler,mechanize

If you look at the request being sent to that site in developer tools, you'll see that a POST is sent as soon as you select a state. The response that is sent back has the form with the values in the city dropdown populated. So, to replicate this in...

Using the Mechanize gem with the Nokogirl gem?

ruby-on-rails,ruby,web-scraping,nokogiri,mechanize

You can use page.parser to gain access to the underlying Nokogiri object. http://mechanize.rubyforge.org/Mechanize/Page.html#method-i-parser require 'mechanize' agent = Mechanize.new agent.get("http://stackoverflow.com/questions/23064821/using-the-mechanize-gem-with-the-nokogirl-gem/") agent.page.parser.class # => Nokogiri::HTML::Document agent.page.parser.css("#answer-23065003 .user-details a").text # => "akatakritos" ...

WWW::Mechanize field methods

perl,mechanize,textfield

$mech -> field($name, $value) field() only lets you set one name at a time. But $mech -> set_fields($name => $value, $name2 => $value2,... $nameN => $valueN) ...set_fields() allows you to set multiple names at the same time. That's not really such a big deal because you could always use the...

How to access/set 'select' tag in HTML with python

python,html,selenium,web-scraping,mechanize

You can select the form by the order in which it appears on the page. First import and open: import mechanize br = mechanize.Browser() br.open('http://www.staffordshire-pcc.gov.uk/space/') Loop through all the forms on the page: forms = [f.name for f in br.forms()] Let's check whether form[0] is the correct index for the...

Error logging into instagram with python

python,beautifulsoup,mechanize

There's a library to access instagram from python. To login, you need the following code: from instagram.client import InstagramAPI access_token = "YOUR_ACCESS_TOKEN" # get this from instagram api = InstagramAPI(access_token=access_token) recent_media, next_ = api.user_recent_media(user_id="userid", count=10) for media in recent_media: print media.caption.text In other words, don't reinvent the wheel....

Automated filling in, submission and review of response of javascript form in ruby

javascript,ruby,gem,mechanize

In the end, without some extremely complex programming, this was only possible using Selenium, and unfortunately it wasn't reliable enough to depend on.

Getting a table with Mechanize in Ruby

ruby,table,web-scraping,nokogiri,mechanize

Using css selector, to print text and href attribute values: require 'nokogiri' doc = Nokogiri::HTML(page.body) doc.css('table#myTable tbody td[3] a').each {|a| puts a.text, a[:href] } ...

Python/Mechanize doesn't recognize input form

python,python-2.7,mechanize

Author here. It seems the form was generated by JavaScript. I used Selenium instead to send keys to the form.

Scraping successive pages until the last page using Nokogiri and Mechanize

ruby,web-scraping,nokogiri,mechanize

I would try this, replace lien.click with page = lien.click.

Python mechanize is not handling form exception

python,exception,mechanize

Can you try to change this to: except mechanize._mechanize.FormNotFoundError: instead of this: except FormNotFoundError: ...
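
The underlying gotcha: the name in an except clause is only looked up when the exception actually fires, so an unimported FormNotFoundError surfaces as a NameError. A small illustration with an invented exception class:

```python
class FakeFormError(Exception):
    """Stand-in for mechanize's form-lookup failure."""
    pass

def handler_with_unimported_name():
    try:
        raise FakeFormError("no form matching nr 0")
    except FormNotFoundError:   # never imported: evaluating this name fails
        return "caught"

try:
    handler_with_unimported_name()
    result = "handled"
except NameError as err:
    result = "NameError: {}".format(err)
print(result)   # NameError: name 'FormNotFoundError' is not defined
```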

Download File with Ruby Mechanize

ruby,mechanize

I've not seen it used that way. Normally you need to create an agent, then issue the get. Try this: require 'rubygems' require 'mechanize' uri = URI 'http://website.com/page.html' agent = Mechanize.new file = agent.get uri filename = file.save # saves to page.html puts filename # page.html ...

Mechanize search unable to find CSS selector (it's definitely present)

ruby,css-selectors,nokogiri,mechanize

The page you are searching doesn’t contain any tbody tags. When your browser parses the page it adds the missing tbody elements into the DOM that it creates. This means that when you examine the page through the browser’s inspector and console it acts like the tbody tags exist. Nokogiri...

Ruby Mechanize: Programmatically Clicking a Link Without Knowing the Name of the Link

ruby,mechanize

Mechanize has regular expressions: page.link_with(text: /foo/).click page.link_with(href: /foo/).click Here are the Mechanize criteria that generally work for links and forms: name: name_matcher id: id_matcher class: class_matcher search: search_expression xpath: xpath_expression css: css_expression action: action_matcher ... If you're curious, here's the Mechanize ElementMatcher code...

Mechanize Python page download does not work with HTTPS

python,mechanize,mechanize-python

Your code works for me, but I would remove the line ('Accept-Encoding', 'gzip, deflate, sdch'), to not having to reverse that encoding afterwards. To clarify: you are getting the content, but you expect it to be in "clear text". You get clear text by not requesting gzipped content....
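
To see why a gzipped body looks like gibberish: with 'gzip' in Accept-Encoding the server may send compressed bytes, which you would have to reverse yourself. A stdlib sketch (the HTML is a stand-in for a real response body):

```python
import gzip

# Stand-in for the raw, compressed body the server would hand back.
compressed_body = gzip.compress(b"<html>clear text</html>")
print(compressed_body[:2])                          # b'\x1f\x8b' -- gzip magic bytes
print(gzip.decompress(compressed_body).decode())    # <html>clear text</html>
```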

Python requests module : Post and go to next page

python,post,mechanize,twill

The problem here is that you need to actually interact with the javascript on the page. requests, while being an excellent library has no support for javascript interaction, it is just an http library. If you want to interact with javascript-rich web pages in a meaningful way I would suggest...

Map two Nokogiri objects

ruby,nokogiri,mechanize

Easiest way: details = doc.css('table > tr > th') details2 = doc.css('table > tr > td > p') details.map!.with_index { |d, i| {name: d.text, value: details2[i].text } } details will look like [{name: 'asd', value: '123'}, {name: 'qwe', value: '234'}]...

Python webpage scraping can't find form from this page

python,web-scraping,mechanize

Trying to access that page, what you are actually doing is being directed to an error page. Paste that URL in a browser and you get a page with "Not comply with the conditions of the inquiry data" and no forms at all. You need to access the page in...

Fix Character encoding of webpage using python Mechanize

python,mechanize

Your problem is some broken HTML comment tags, leading to an invalid website which mechanize's parser can't read. But you can use the included BeautifulSoup parser instead, which works in my case (Python 2.7.9, mechanize 0.2.5): #!/usr/bin/env python #-*- coding: utf-8 -*- import mechanize br = mechanize.Browser(factory=mechanize.RobustFactory()) br.open('http://mspc.bii.a-star.edu.sg/tankp/run_depth.html') br.select_form(nr=0) br['pdb_id']...

Enter data in textarea using mechanize Python

python,forms,beautifulsoup,mechanize,python-requests

There could be different solutions applied here, though it is pretty clear - this is not an easy case for mechanize. Better make that submission (POST request) using requests: import requests url = 'http://bioportal.bioontology.org/annotator' params = { 'text': 'Sample text', # this is the contents of the text area 'longest_only':...

Mechanize submit result is not the correct page

ruby,mechanize,scrape

Adding the following line to change the user agent worked in the end: agent.user_agent_alias = 'Mac Safari' ...

How to extract table browser results from ucsc genome browser by scraping

python,web-scraping,beautifulsoup,mechanize

You can always make the request using requests: import requests url = 'http://genome-euro.ucsc.edu/cgi-bin/hgTables?hgsid=201790284_dkwVYFu7V6ISmTzFGlXzo23aUhXk' session = requests.Session() params = { 'hgsid': '201790284_dkwVYFu7V6ISmTzFGlXzo23aUhXk', 'jsh_pageVertPos': '0', 'clade': 'mammal', 'org': 'Human', 'db': 'hg19', 'hgta_group': 'genes', 'hgta_track': 'refGene', 'hgta_table': 'refFlat', 'hgta_regionType': 'range', 'position': 'chr9:21802635-21865969', 'hgta_outputType': 'gff', 'boolshad.sendToGalaxy': '0',...

Any way to deal with dynamically loaded content with Mechanize?

html,ruby,web-scraping,nokogiri,mechanize

Here's the url for page 2 http://store.steampowered.com/search/results?sort_by=_ASC&page=2&auction=1&snr=1_7_7_230_7 That should be all you need, parse it the same as page 1....

can't access form from website using python mechanize

python,forms,mechanize,mechanize-python

In order to submit the calendarForm, you should access a different URL with mechanize: br.open('https://eappointment.ica.gov.sg/ibook/gethome.do') br.select_form(name="calendarForm") ...

How can I log into a simple web access login using Python?

python,web,browser,login,mechanize

That form submits data somewhere. You need to find out where and what method it uses. After you find out, you can use the requests library to do a one-liner, like: response = requests.post("https://controller.mobile.lan/101/portal/", data={'login': "username", 'password': "password"}) print response.text # Dumps the whole webpage after. Note that if that...

Mechanize gem method error

ruby,mechanize,nomethoderror

form['user[email]'] = '[email protected]' is working...

Mechanize, 2-Step Authentication: How to fill out form within form

python,amazon-ec2,mechanize

After you br.submit(), you go straight into br['Verication'] = str(input_var). This is incorrect, since after br.submit() your browser no longer has a form selected. After submitting, I would try: for form in br.forms(): print form to see if there is another form to be selected. Read up on...

mechanize gem : get html from other site => response html encoding issue

html,ruby-on-rails,encoding,utf-8,mechanize

OK, Google doesn't use the header encoding; we need to specify it in the request string with the param "ie=utf-8&oe=utf-8". My problem is fixed with this.

Accepting Terms and Conditions with Python and Mechanize

python,post,mechanize

Mechanize won't cut it because it doesn't evaluate Javascript code and you really need it because the input button triggers a Javascript function. Even if you try to use Mechanize to simply submit the form, you'll get the following message: The web site is unable to acknowledge your acceptance of...

How can I login this page and read it?

python,html,google-app-engine,login,mechanize

First of all, you should know that if there isn't a publicly available API to do all this without scraping then it's very likely that what you are doing is not welcomed by the website owners, against their terms of service and could even be illegal and punishable by law...

Messy Python install? (OS X)

python,osx,python-2.7,mechanize,python-3.4

OS X comes with Python pre-installed. Exactly which version(s) depends on your OS X version (e.g., 10.10 comes with 2.6 and 2.7, 10.8 comes with 2.5-2.7 plus a partial installation of 2.3, and 10.3 comes with 2.3). These are installed in /System/Library/Frameworks/Python.framework/Versions/2.*, and their site-packages are inside /Library/Python/2.*. (The reason...

ArrayList in Scala with Gistlabs Mechanize. Unable to use foreach

scala,arraylist,mechanize

java.util.List has no foreach method. But you can convert it to a Scala list using an implicit conversion. Just add import scala.collection.convert.wrapAsScala._ to your source file.

how do I build a hash when using serialization?

ruby-on-rails,ruby-on-rails-4,mechanize

Try this code. It mixes .map and .each_with_object to obtain the hash and array mix you describe: namespace :data do task scrap: :environment do agent = Mechanize.new results = (3000..25000).step(1000).each_with_object({}) do |amount, h| year_costs = (1..5).step(1).map do |year| total_cost = agent.get("https://www.cashbuddy.se/api/loancost?amount=#{amount}&numberOfYears=#{year}").body {year => total_cost} end # => an array of...

Python mechanize to submit form data

python,forms,web-scraping,mechanize

I don't know if you need to select the form first, but if you do, the following should do the trick: br.find_control(type="checkbox").items[0].selected = True If you want to select all checkboxes: for item in br.find_control(type="checkbox").items: item.selected = True Then submit: br.submit() ...

Python: mechanize don't find any form

python,forms,mechanize

According to the FAQ, mechanize can't handle invalid HTML like "<br/>", and you have such code on your website. You could use the BeautifulSoup parser: import mechanize browser = mechanize.Browser(factory=mechanize.RobustFactory()) browser.open("http://example.com/") print browser.forms Alternatively, you can process the HTML (and headers) arbitrarily: browser = mechanize.Browser() browser.open("http://example.com/") html = browser.response().get_data().replace("<br/>",...