Getting information from a website in an organized way

Posted 2024-05-14 16:27:12


I am trying to scrape data from a website with Python, but I have run into some problems. I have gone through a lot of posts and questions online, but I still cannot do what I need. I have this website:

https://beta.nhs.uk/find-a-pharmacy/results?latitude=51.2457238068354&location=Little%20London%2C%20Hampshire%2C%20SP11&longitude=-1.45959328501975

I need to print the name and address of each shop and save them to a file (CSV or Excel would both be fine). I have tried Selenium, pandas and BeautifulSoup, but none of them worked for me.

Can anyone help me?


2 Answers
import requests
from bs4 import BeautifulSoup

# Fetch the results page and parse it with BeautifulSoup
page = requests.get("https://beta.nhs.uk/find-a-pharmacy/results?latitude=51.2457238068354&location=Little%20London%2C%20Hampshire%2C%20SP11&longitude=-1.45959328501975")
soup = BeautifulSoup(page.content, 'html.parser')

# Each pharmacy sits in a div with the class "results__details"
data = soup.find_all("div", class_="results__details")

for container in data:
    pharmacy_names = container.find_all("h2")  # pharmacy name heading(s)
    pharmacy_adds = container.find_all("p")    # address, phone and other details
    for pharmacy in pharmacy_names:
        for add in pharmacy_adds:
            print(add.text)
        print(pharmacy.text)

Output:

Shepherds Spring Pharmacy Ltd is 1.8 miles away

       The Oval, 
       Cricketers Way, 

       Andover, 
       Hampshire, 
       SP10 5DN
      01264 355700

Map and directions for Shepherds Spring Pharmacy Ltd at The Oval
Services available in Shepherds Spring Pharmacy Ltd at The Oval
Open until 6:15pm today
Shepherds Spring Pharmacy Ltd
Tesco Instore Pharmacy is 2.1 miles away

       Tesco Superstore, 
       River  Way, 

       Andover, 
       Hampshire, 
       SP10 1UZ
      0345 677 9007

      .
      .
      .

Note: You could create separate lists for pharmacy_name and pharmacy_add to store the data and then write them to a file. PS: You could also strip the unwanted text from the lists (say, everything after the phone number for each pharmacy); see the sketch below.
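A minimal sketch of that idea, assuming the same results__details structure as above and using the standard-library csv module. Picking the second <p> as the address follows the other answer on this page; the file name pharmacies.csv and the trimming are illustrative choices, not part of the original answer.

import csv
import requests
from bs4 import BeautifulSoup

URL = ("https://beta.nhs.uk/find-a-pharmacy/results?latitude=51.2457238068354"
       "&location=Little%20London%2C%20Hampshire%2C%20SP11&longitude=-1.45959328501975")

page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")

rows = []
for container in soup.find_all("div", class_="results__details"):
    # The first <h2> holds the pharmacy name; the <p> tags hold address, phone, etc.
    name = container.find("h2").get_text(strip=True)
    paragraphs = [p.get_text(" ", strip=True) for p in container.find_all("p")]
    # Keep only the paragraph assumed to be the address (the second <p>),
    # which drops the "miles away" text and the service/opening-hours lines.
    address = paragraphs[1] if len(paragraphs) > 1 else ""
    rows.append([name, address])

# Write the collected data to a CSV file, one pharmacy per row
with open("pharmacies.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Address"])
    writer.writerows(rows)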

import requests
from bs4 import BeautifulSoup
import re
import xlsxwriter

# Create the Excel workbook and a worksheet to write into
workbook = xlsxwriter.Workbook('File.xlsx')
worksheet = workbook.add_worksheet()

request = requests.get("https://beta.nhs.uk/find-a-pharmacy/results?latitude=51.2457238068354&location=Little%20London%2C%20Hampshire%2C%20SP11&longitude=-1.45959328501975")
soup = BeautifulSoup(request.content, 'html.parser')
data = soup.find_all("div", class_="results__details")

formed_data = []
for results_details in data:
    # The first <h2> holds the pharmacy name; the second <p> holds the address
    name = results_details.find_all("h2")[0].text
    # Remove newlines and collapse repeated spaces in the address
    address = re.sub(' +', ' ', results_details.find_all("p")[1].text.replace('\n', ''))
    formed_data.append([name, address])

# Write one pharmacy per row: name in column A, address in column B
row = col = 0
for name, address in formed_data:
    worksheet.write(row, col, name)
    worksheet.write(row, col + 1, address)
    row += 1
workbook.close()
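If you want the spreadsheet to be self-describing, a small sketch of one possible extension is shown below. It assumes the same workbook, worksheet and formed_data objects as the snippet above; the header labels, header_format name and column width are illustrative choices.

# Add a bold header row, then write the data starting on the next row
header_format = workbook.add_format({'bold': True})
worksheet.write_row(0, 0, ["Pharmacy name", "Address"], header_format)
worksheet.set_column(0, 1, 40)  # widen columns A and B (40 is an arbitrary width)

row = 1  # data rows now start below the header
for name, address in formed_data:
    worksheet.write(row, 0, name)
    worksheet.write(row, 1, address)
    row += 1
workbook.close()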

