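A pip package for scraping and parsing Google search results using SerpWow. Visit https://serpwow.com to get a free API key.
google-search-results-serpwow: detailed Python project description
Google Search Results in Python
This Python package allows you to scrape and parse Google search results using SerpWow. In addition to Search, you can also use this package to access the SerpWow Locations API, Batches API and Account API.
Installation
You can install google-search-results-serpwow with: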
$ pip install google-search-results-serpwow
and upgrade it with:
$ pip install google-search-results-serpwow --upgrade
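Documentation
We have included some examples here, but the full SerpWow API documentation is available in the API Docs.
You can also use the API Playground to visually build Google search requests with SerpWow.
Examples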
- Simple Example
- Example Response
- Getting an API Key
- Searching with a Location
- Searching Google Places, Google Videos, Google Images, Google Shopping and Google News
- Returning results as JSON, HTML and CSV
- Requesting mobile and tablet results
- Parsing Results
- Paginating results, returning up to 100 results per page
- Search example with all parameters
- Locations API Example
- Account API Example
- Batches API
Simple Example
The simplest example, for a standard query "pizza", returning the Google SERP (Search Engine Results Page) data as JSON:
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# set up a dict for the search parameters
params = {
    "q": "pizza"
}

# retrieve the search results as JSON
result = serpwow.get_json(params)

# pretty-print the result
print(json.dumps(result, indent=2, sort_keys=True))
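The examples in this README pass the API key as the literal string "API_KEY". As a minimal sketch of one way to keep the real key out of source code, the helper below reads it from an environment variable. The function name `load_api_key` and the variable name `SERPWOW_API_KEY` are hypothetical choices made for this illustration; they are not part of the package.

```python
import os


def load_api_key(env_var="SERPWOW_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical choice for this sketch,
    not a SerpWow convention.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your SerpWow API key"
        )
    return api_key


# The key could then replace the literal "API_KEY" when creating the client:
#   serpwow = GoogleSearchResults(load_api_key())
```

Failing early with a clear message is a design choice here: it avoids sending a request with an empty key.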
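Example Response
A snapshot of the JSON response returned (shortened for brevity) is shown below. For details of all of the fields in the parsed Google search results page, see the docs.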
{
  "request_info": {
    "success": true
  },
  "search_metadata": {
    "id": "20c8e44e9cacedabbdff2d9b7e854436056d4f33",
    "engine_url": "https://www.google.com/search?q=pizza&oq=pizza&sourceid=chrome&ie=UTF-8",
    "total_time_taken": 0.14
  },
  "search_parameters": {
    "q": "pizza"
  },
  "search_information": {
    "total_results": 1480000000,
    "time_taken_displayed": 0.45,
    "query_displayed": "pizza",
    "detected_location": "Ireland"
  },
  "local_map": {
    "link": "https://www.google.com/search?q=pizza&npsic=0&rflfq=1&rldoc=1&rlha=0&rllag=53350059,-6249133,1754&tbm=lcl&sa=X&ved=2ahUKEwiC3cLZ0JLgAhXHUxUIHQrsBC4QtgN6BAgBEAQ",
    "gps_coordinates": {
      "latitude": 53.350059,
      "longitude": -6.249133,
      "altitude": 1754
    },
    "local_results": [
      {
        "position": 1,
        "title": "Apache Pizza Temple Bar",
        "extensions": [
          "American-style pizza-delivery chain"
        ],
        "rating": 3.6,
        "reviews": 382,
        "type": "Pizza",
        "block_position": 1
      }
    ]
  },
  "knowledge_graph": {
    "title": "Pizza",
    "type": "Dish",
    "description": "Pizza is a savory dish of Italian origin, consisting of a usually round, flattened base of leavened wheat-based dough topped with tomatoes, cheese, and various other ingredients baked at a high temperature, traditionally in a wood-fired oven.",
    "source": {
      "name": "Wikipedia",
      "link": "https://en.wikipedia.org/wiki/Pizza"
    },
    "nutrition_facts": {
      "total_fat": [
        "10 g",
        "15%"
      ],
      "sugar": [
        "3.6 g"
      ]
    }
  },
  "related_searches": [
    {
      "query": "apache pizza",
      "link": "https://www.google.com/search?q=apache+pizza&sa=X&ved=2ahUKEwiC3cLZ0JLgAhXHUxUIHQrsBC4Q1QIoAHoECAUQAQ"
    }
  ],
  "organic_results": [
    {
      "position": 1,
      "title": "10 Best Pizzas in Dublin - A slice of the city for every price point ...",
      "link": "https://www.independent.ie/life/travel/ireland/10-best-pizzas-in-dublin-a-slice-of-the-city-for-every-price-point-37248689.html",
      "domain": "www.independent.ie",
      "displayed_link": "https://www.independent.ie/.../10-best-pizzas-in-dublin-a-slice-of-the-city-for-every-p...",
      "snippet": "Oct 20, 2018 - Looking for the best pizza in Dublin? Pól Ó Conghaile scours the city for top-notch pie... whatever your budget.",
      "cached_page_link": "https://webcache.googleusercontent.com/search?q=cache:wezzRov42dkJ:https://www.independent.ie/life/travel/ireland/10-best-pizzas-in-dublin-a-slice-of-the-city-for-every-price-point-37248689.html+&cd=4&hl=en&ct=clnk&gl=ie",
      "block_position": 2
    }
  ],
  "related_places": [
    {
      "theme": "Best dinners with kids",
      "places": "Pinocchio Italian Restaurant - Temple Bar, Cafe Topolisand more",
      "images": [
        "https://lh5.googleusercontent.com/p/AF1QipNhGt40OpSS408waVJUHeItGrrGqImmEKzuVbBv=w152-h152-n-k-no"
      ]
    }
  ],
  "pagination": {
    "current": "1",
    "next": "https://www.google.com/search?q=pizza&ei=fRZQXMKqL8en1fAPitiT8AI&start=10&sa=N&ved=0ahUKEwiC3cLZ0JLgAhXHUxUIHQrsBC4Q8NMDCOkB"
  }
}
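To make the shape of the response concrete, here is a small sketch that flattens the `organic_results` list into tuples. It runs against a hand-written dict that mirrors a few fields of the example response above; the helper name `list_organic_results` and the sample values are illustrative only.

```python
def list_organic_results(result):
    """Return (position, title, link) tuples for each organic result."""
    rows = []
    for item in result.get("organic_results", []):
        rows.append((item.get("position"), item.get("title"), item.get("link")))
    return rows


# A trimmed-down, hand-written dict shaped like the example response
sample = {
    "request_info": {"success": True},
    "organic_results": [
        {
            "position": 1,
            "title": "10 Best Pizzas in Dublin",
            "link": "https://example.com/best-pizzas",
        },
    ],
}

for position, title, link in list_organic_results(sample):
    print(position, title, link)
```

Using `.get` with defaults means a response without an `organic_results` section yields an empty list instead of raising `KeyError`.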
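Getting an API Key
To get a free API key, head over to app.serpwow.com/signup.
Searching with a Location
An example of a Google query geo-locating the query as if the user were located in New York.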
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# set up a dict for the query (q) and location parameters
# note that the "location" parameter should be a value
# returned from the Locations API
params = {
    "q": "pizza",
    "location": "New York,New York,United States"
}

# retrieve the search results as JSON
result = serpwow.get_json(params)

# pretty-print the result
print(json.dumps(result, indent=2, sort_keys=True))
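Searching Google Places, Google Videos, Google Images, Google Shopping and Google News
Use the search_type parameter to search Google Places, Videos, Images and News. See the Search API Parameters Docs for full details of the additional parameters available for each search type.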
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# perform a search on Google News, just looking at blogs, ordered by date, in the last year, filtering out duplicates
params = {
    "q": "football news",
    "search_type": "news",
    "news_type": "blogs",
    "sort_by": "date",
    "time_period": "last_year",
    "show_duplicates": "false"
}
result = serpwow.get_json(params)
print(json.dumps(result, indent=2, sort_keys=True))

# perform a search on Google Places for "plumber" in London
params = {
    "search_type": "places",
    "q": "plumber",
    "location": "London,England,United Kingdom"
}
result = serpwow.get_json(params)
print(json.dumps(result, indent=2, sort_keys=True))

# perform an image search on Google Images for "red flowers"
params = {
    "q": "red flowers",
    "search_type": "images"
}
result = serpwow.get_json(params)
print(json.dumps(result, indent=2, sort_keys=True))
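Returning results as JSON, HTML and CSV
SerpWow can return data in JSON, HTML and CSV formats using the get_json, get_html and get_csv methods. For CSV results, use the csv_fields parameter (docs) to request specific result fields.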
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# set up a dict for the query (q) and location parameters
# note that the "location" parameter should be a value
# returned from the Locations API
params = {
    "q": "pizza",
    "location": "New York,New York,United States"
}

# retrieve the Google search results as JSON
result_json = serpwow.get_json(params)

# retrieve the Google search results as HTML
result_html = serpwow.get_html(params)

# retrieve the Google search results as a CSV
result_csv = serpwow.get_csv(params)
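Once CSV data has been retrieved, it can be loaded with the standard library. The sketch below assumes the value returned by get_csv is CSV text with a header row (if it is bytes, decode it first); this README does not confirm that. The sample text and its column names are made up for illustration, and `csv_text_to_rows` is a hypothetical helper.

```python
import csv
import io


def csv_text_to_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Made-up CSV text standing in for the value returned by get_csv
sample_csv = (
    "organic_results.position,organic_results.domain\n"
    "1,www.example.com\n"
    "2,www.example.org\n"
)

rows = csv_text_to_rows(sample_csv)
for row in rows:
    print(row["organic_results.position"], row["organic_results.domain"])
```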
Requesting mobile and tablet results
To request that SerpWow render the Google search results via a mobile or tablet browser, use the device parameter:
from serpwow.google_search_results import GoogleSearchResults
import json

serpwow = GoogleSearchResults("API_KEY")

# set up the mobile params
params_mobile = {
    "q": "pizza",
    "device": "mobile"
}

# set up the tablet params
params_tablet = {
    "q": "pizza",
    "device": "tablet"
}

# set up the desktop params (note we omit the "device" param)
params_desktop = {
    "q": "pizza"
}

# retrieve the mobile search results
result_mobile_json = serpwow.get_json(params_mobile)

# retrieve the tablet search results
result_tablet_json = serpwow.get_json(params_tablet)

# retrieve the desktop search results
result_desktop_json = serpwow.get_json(params_desktop)
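The three parameter dicts above differ only in the device value, so they can be produced by one small helper. This is a sketch, not part of the package: `build_params` is a hypothetical name, and it follows the example above by leaving out "device" for desktop requests.

```python
def build_params(query, device="desktop"):
    """Build search parameters, adding "device" only for mobile or tablet."""
    if device not in ("desktop", "mobile", "tablet"):
        raise ValueError(f"Unsupported device: {device!r}")
    params = {"q": query}
    if device != "desktop":
        # desktop is requested by omitting the "device" parameter
        params["device"] = device
    return params


for device in ("desktop", "mobile", "tablet"):
    print(device, build_params("pizza", device))
```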
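Parsing Results
When you make a request via the get_json method, a standard Python dict is returned. You can inspect this dict to iterate over, parse and store the results in your app.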
from serpwow.google_search_results import GoogleSearchResults
import json

# make a simple query, returning JSON
serpwow = GoogleSearchResults("API_KEY")
result = serpwow.get_json({"q": "pizza"})

# determine if the request was successful
success = result["request_info"]["success"]

if success:
    # extract the time taken and number of organic results
    time_taken = result["search_metadata"]["total_time_taken"]
    organic_result_count = len(result["organic_results"])

    # print
    print(str(organic_result_count) + " organic results returned in " + str(time_taken) + "s")
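Direct indexing raises KeyError if a section is missing from the response. As a minimal sketch of a more defensive variant of the same check, the hypothetical helper `summarize` below uses `.get` with defaults. It assumes only the field names shown in the example response.

```python
def summarize(result):
    """Return a one-line summary, or None if the request did not succeed."""
    if not result.get("request_info", {}).get("success"):
        return None
    time_taken = result.get("search_metadata", {}).get("total_time_taken")
    count = len(result.get("organic_results", []))
    return f"{count} organic results returned in {time_taken}s"


# Hand-written dicts standing in for get_json responses
ok = {
    "request_info": {"success": True},
    "search_metadata": {"total_time_taken": 0.14},
    "organic_results": [{"position": 1}, {"position": 2}],
}
failed = {"request_info": {"success": False}}

print(summarize(ok))
print(summarize(failed))
```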
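Paginating results, returning up to 100 results per page
Use the page and num parameters to paginate through Google search results. The maximum number of results returned per page (controlled by the num param) is 100 (a Google-imposed limitation) for all search_type values, apart from Google Places, where the maximum is 20. Here's an example.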
from serpwow.google_search_results import GoogleSearchResults
import json

# request the first 100 results
serpwow = GoogleSearchResults("API_KEY")
params = {
    "q": "pizza",
    "page": 1,
    "num": 100
}
result_page_1 = serpwow.get_json(params)

# request the next 100 results
params["page"] = 2
result_page_2 = serpwow.get_json(params)

# pretty-print the result
print("Page 1")
print(json.dumps(result_page_1, indent=2, sort_keys=True))
print("Page 2")
print(json.dumps(result_page_2, indent=2, sort_keys=True))
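The same idea extends to a loop over pages. The sketch below is one possible approach, not the package's own pagination support: `iter_pages` is a hypothetical helper that accepts any callable taking a params dict (for example the client's get_json method). It stops when a page has no organic_results, which is an assumption about how the end of the results appears. A stand-in `fake_fetch` lets it run without an API key.

```python
def iter_pages(fetch, params, max_pages=3):
    """Yield (page_number, result) pairs until a page has no organic results.

    `fetch` is any callable that takes a params dict and returns a result
    dict, for example the get_json method of the client.
    """
    for page in range(1, max_pages + 1):
        page_params = dict(params, page=page)  # copy so the caller's dict is untouched
        result = fetch(page_params)
        if not result.get("organic_results"):
            break
        yield page, result


# Stand-in for serpwow.get_json so the sketch runs without an API key
def fake_fetch(params):
    if params["page"] > 2:
        return {"organic_results": []}
    title = f"Result from page {params['page']}"
    return {"organic_results": [{"position": 1, "title": title}]}


pages = list(iter_pages(fake_fetch, {"q": "pizza", "num": 100}, max_pages=5))
for page, result in pages:
    print(page, result["organic_results"][0]["title"])
```

The `max_pages` cap keeps the number of requests bounded even if every page returns results.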
Search example with all parameters
from serpwow.google_search_results import GoogleSearchResults

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# set up a dict for the search parameters, retrieving results as CSV (note the csv_fields param)
params = {
    "q": "pizza",
    "gl": "us",
    "hl": "en",
    "location": "New York,New York,United States",
    "google_domain": "google.com",
    "time_period": "custom",
    "sort_by": "date",
    "time_period_min": "02/01/2018",
    "time_period_max": "02/08/2019",
    "device": "mobile",
    "csv_fields": "search.q,organic_results.position,organic_results.domain",
    "page": "1",
    "num": "100"
}

# retrieve the search results as CSV
result = serpwow.get_csv(params)
print(result)
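The csv_fields value in the example above is a single comma-separated string. When the field names live in a list, a small helper can build that string. This is an illustrative sketch; `join_csv_fields` is a hypothetical name, and the field names are the ones used in the example.

```python
def join_csv_fields(fields):
    """Join field names into the comma-separated form used by csv_fields."""
    cleaned = [field.strip() for field in fields if field.strip()]
    if not cleaned:
        raise ValueError("At least one field name is required")
    return ",".join(cleaned)


params = {
    "q": "pizza",
    "csv_fields": join_csv_fields(
        ["search.q", "organic_results.position", "organic_results.domain"]
    ),
}
print(params["csv_fields"])
```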
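Locations API Example
The Locations API allows you to search for SerpWow-supported Google search locations. You can supply the full_name returned by the Locations API as the location parameter in a Search API query (see the Searching with a Location example above) to retrieve search results geo-located to that location.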
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# set up a dict for the location query parameters
params = {
    "q": "mumbai"
}

# retrieve locations matching the query parameters as JSON
result = serpwow.get_locations(params)

# pretty-print the result
print(json.dumps(result, indent=2, sort_keys=True))
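To connect the two APIs, the full_name from a Locations API result can be copied into the location parameter of a search. The sketch below assumes the matches arrive in a list under a "locations" key; that key, the sample full_name value and the helper name `first_location_name` are assumptions for illustration. Only the full_name field itself is named in this README.

```python
def first_location_name(locations_result):
    """Return the full_name of the first matching location, or None."""
    # The "locations" key is an assumption about the response layout
    locations = locations_result.get("locations", [])
    if not locations:
        return None
    return locations[0].get("full_name")


# Hand-written dict standing in for a get_locations response
sample_locations = {
    "locations": [
        {"full_name": "Mumbai,Maharashtra,India"},
    ]
}

full_name = first_location_name(sample_locations)
search_params = {"q": "pizza"}
if full_name is not None:
    search_params["location"] = full_name
print(search_params)
```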
Account API Example
The Account API allows you to check your current SerpWow usage and billing information.
from serpwow.google_search_results import GoogleSearchResults
import json

# create the serpwow object, passing in our API key
serpwow = GoogleSearchResults("API_KEY")

# get our account info
result = serpwow.get_account()

# pretty-print the result
print(json.dumps(result, indent=2, sort_keys=True))
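Batches API
The Batches API allows you to create, update and delete batches on your SerpWow account (batches allow you to save up to 15,000 searches and have SerpWow run them on a schedule).
For more information and extensive code samples, see the Batches API Docs.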
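Because a batch holds up to 15,000 searches, a larger list of searches has to be split before it is saved. The sketch below shows only that splitting step with plain Python lists; `chunk_searches` is a hypothetical helper, and the calls that actually create batches are covered in the Batches API Docs rather than here.

```python
def chunk_searches(searches, batch_limit=15000):
    """Split a list of search parameter dicts into lists of at most batch_limit."""
    if batch_limit < 1:
        raise ValueError("batch_limit must be at least 1")
    return [
        searches[start:start + batch_limit]
        for start in range(0, len(searches), batch_limit)
    ]


# 32,000 made-up searches split into groups that fit the 15,000 limit
searches = [{"q": f"pizza near location {n}"} for n in range(32000)]
chunks = chunk_searches(searches)
print([len(chunk) for chunk in chunks])
```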