AttributeError: 'NoneType' object has no attribute 'text' when the class is not found

import re  # regex
import requests  # fetches HTML page content
from requests import get
from bs4 import BeautifulSoup  # parses HTML page content
import pandas as pd
import numpy as np

# initialize empty list where we can store data
categories = []

# Get the contents of the page we're looking at by requesting the URL
# (headers is assumed to be defined earlier; it is not shown in the question)
results = requests.get("https://www.canpages.ca/business/AB/edmonton/restaurants/183-720200-p42.html", headers=headers)
soup = BeautifulSoup(results.text, "html.parser")

# grab the container of each company by result id
companies_div = soup.find_all('div', {'id': re.compile('result-id-.*')})

for x in companies_div:
    # Extract category class and split by white space. Category should follow
    # [City Category] but sometimes typos result in [Category]
    categoryChunk = x.find('div', class_='result__business-category').text.split()
    # if list does not have [City Category] format and therefore list length of 2, mark as "-"
    category = categoryChunk[1] if len(categoryChunk) == 2 else '-'
    categories.append(category)

# initialize pd dataframe
companies = pd.DataFrame({
    'category': categories,
})
print(companies)
companies.to_csv('companiestest6.csv')

categoryDiv = x.find('div', class_='result__business-category')
if categoryDiv:
    categoryChunk = categoryDiv.text.split()
    if len(categoryChunk) == 2:
        category = categoryChunk[1]
        categories.append(category)
    else:
        category = '-'
        categories.append(category)
else:
    category = '-'
    categories.append(category)

1 Answer

User

#1 · Posted on 2024-05-26 21:51:03

It seems you should be able to test what .find returns fairly simply:

div = x.find('div', class_='result__business-category')
if div:
    categoryChunk = div.text.split()
    category = categoryChunk[1]
else:
    category = '-'

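This does not explicitly test for the length-2 case, but I assume that check was only there to try to handle the case where the element is not found.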
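If both checks are wanted (a missing element and a malformed category string), they could be combined in one small helper. The following is only a sketch: the function name `extract_category` and the `placeholder` parameter are hypothetical and not from the question. It assumes `container` is one of the elements in `companies_div`, or any object whose `.find` behaves the same way.

```python
def extract_category(container, placeholder="-"):
    """Return the category from one result container, or `placeholder`."""
    # .find returns None when no matching element exists, so check it
    # before touching .text to avoid the AttributeError.
    div = container.find("div", class_="result__business-category")
    if div is None:
        return placeholder

    chunks = div.text.split()
    # The question expects [City, Category]; any other length is treated
    # as malformed and replaced with the placeholder.
    return chunks[1] if len(chunks) == 2 else placeholder
```

With a helper like this, the loop in the question could be reduced to `categories = [extract_category(x) for x in companies_div]`.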
