Trying to fill in a search box contained in a <td> tag using Python Mechanize

# -*- coding: utf-8 -*-
import urllib2, sys, re, mechanize, itertools, csv

# Set the url for the online search
url = 'http://www.realtor.ca/PropertyResults.aspx?Page=1&vs=Residential&ret=300&curPage=PropertySearch.aspx&sts=0-0&beds=0-0&baths=0-0&ci=Victoria&pro=3&mp=200000-300000-0&mrt=0-0-4&trt=2&of=1&ps=10&o=A'
content = urllib2.urlopen(url).read()
text = str(content)

# Find all instances of "MLS®: " followed by digits to create a list of MLS numbers.
# "[0-9]+" matches one or more digits (in this case a 6-digit MLS number);
# the capture group keeps only the number itself.
findMLS = re.findall("MLS®: ([0-9]+)", text)

# "Page 1 of " precedes the number of pages in the search result (10 listings per page)
num_pages = re.findall("Page 1 of ([0-9]+)", text)
pages = int(num_pages[0])

for page in range(2, pages + 1):
    # Update the url with the next search page number.
    # Substituting the whole number also works for two-digit pages,
    # which overwriting the single character at index 48 does not.
    url = re.sub(r"Page=[0-9]+", "Page=%d" % page, url, count=1)

    # Read the new url to get more MLS numbers
    content = urllib2.urlopen(url).read()
    text = str(content)
    newMLS = re.findall("MLS®: ([0-9]+)", text)

    # Append the new MLS numbers to the list findMLS
    findMLS.extend(newMLS)

1 answer

User

#1 · Posted 2024-05-12 22:00:12

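I haven't used Mechanize, but I've had good luck navigating with Selenium. I know it's an extra module that you may or may not want to use, but since Selenium 2 came out it has been very user-friendly, and you can navigate a site pretty much any way you like.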

Edit: something like this would be pretty easy.

mls_search = driver.find_element_by_id('txtMlsNumber')
mls_search.send_keys('number that you scraped')

search = driver.find_element_by_id('lnkMlsSearch')
search.click()
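
One caveat if you are on a recent Selenium release: the find_element_by_id helpers used above were removed in Selenium 4, and the generic find_element(by, value) call is used instead. Below is a minimal sketch of the same two steps wrapped in a function. The name search_mls is made up for illustration, and it assumes that driver is a Selenium WebDriver which has already loaded the page containing the search box, and that the element ids are the ones given above.

```python
def search_mls(driver, mls_number):
    """Type an MLS number into the search box and click the search link.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver
    with the search page already loaded. The element ids are taken from
    the snippet above and have not been checked against the live site.
    """
    # "id" is the locator strategy string that selenium's By.ID stands for.
    mls_search = driver.find_element("id", "txtMlsNumber")
    mls_search.clear()                      # drop any text left in the box
    mls_search.send_keys(str(mls_number))   # type the scraped number

    search = driver.find_element("id", "lnkMlsSearch")
    search.click()                          # submit the search
```

You would call it once per scraped number, for example search_mls(driver, number) inside a loop over findMLS, after creating the driver (for instance with webdriver.Firefox()) and calling driver.get() on the search page.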
