用靓汤剔除数据后如何注入弹性搜索

2024-04-26 09:24:17 发布

您现在位置:Python中文网/ 问答频道 /正文

您好,我正在取消网站并尝试进入弹性搜索。
我能编字典。我想知道如何进入弹性搜索。每个医生在这里都是一份文件。我正在粘贴下面代码的输出

 import urllib.request
 import urllib.request
 import urllib.parse
 from bs4 import BeautifulSoup
 url = 'https://health.usnews.com/doctors/new-jersey'
 #data = data.encode('utf-8')
 headers = {}
 headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686)"
 req = urllib.request.Request(url, headers=headers)
 resp = urllib.request.urlopen(req)
 resp_data = resp.read()
 #print(resp_data)
 soup = BeautifulSoup(resp_data, 'html.parser')
 doc = soup.findAll('a', {'class': 'search-result-link bar-tighter'})
 links = ['https://health.usnews.com' + do.get('href', None) for do in doc]
 for link in links:
    headers = {}
    doctor = []
    headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686)"
    doc_req = urllib.request.Request(link,headers=headers)
    doc_resp = urllib.request.urlopen(doc_req)
    doc_resp_data = doc_resp.read()
    doc_soup = BeautifulSoup(doc_resp_data, 'html.parser')
    doc_name = doc_soup.find('h1')
    doc_name_text =  (doc_name.text).strip()
    doc_name_text_mod = (re.sub('\s+', ' ', doc_name_text))
    doc_name_text_mod_1  = ('Name' ':' +doc_name_text_mod)
    doctor.append(doc_name_text_mod_1)
    doc_overview = doc_soup.find('p')
    doc_overview_text = (doc_overview.text).strip()
    doc_overview_text_mod = (re.sub('\n\| ', ', ', doc_overview_text))
    doc_overview_text_mod_1  = ('Specialised and Location' ':' + doc_overview_text_mod)
    doctor.append(doc_overview_text_mod_1)
    #print (doctor)
    dicto =  (dict(s.split(':') for s in doctor))
    print(dicto)

>>>Output
 {'Name': 'Dr. Tajwar Aamir MD', 'Specialised and Location': 'Pediatrics, Princeton, NJ'}
 {'Name': 'Dr. Bernard Aaron MD', 'Specialised and Location': 'Gastroenterology, Brick, NJ'}

Tags: textnameimportmoddatadocrequestoverview