How to append data to a DataFrame built in a loop with Python and BeautifulSoup

Posted 2024-05-14 00:21:26


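The loop below is supposed to add the rows of several tables (HTML pages) to a single DataFrame. The loop itself works: it builds a DataFrame for each table, one at a time. The problem is that each pass replaces the previous table's data in the DataFrame, and that is what I want to fix. Each table's data should be appended to the same DataFrame instead of replacing what is already there. Please help.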

import pandas as pd

column_headers = ['state', 'sr_no', 'district_name', 'country']

headers = ['district_id']

# one list of hrefs per table row
district_link = [[a.get('href') for a in data_rows_link[i].findAll('a')]
                 for i in range(len(data_rows))]

district_data_02 = []  # create an empty list to hold all the data

for i in range(len(data_rows)):  # for each table row
    district_row = []  # create an empty list for each district row
    district_row.append("a")

    # for each table data element from each table row
    for li in data_rows[i].findAll('li'):
        # get the text content and append to the district_row
        district_row.append(li.getText())

    # then append each district row to the district_data_02 matrix
    district_data_02.append(district_row)

district_data = district_data_02

# DataFrame of the district text columns
districtlist = pd.DataFrame(district_data, columns=column_headers)

# DataFrame of the district links
districtid = pd.DataFrame(district_link, columns=headers)

#df_row_merged = pd.concat([df, df1])

# combine both side by side into the final DataFrame
final_districtlist = pd.concat([districtlist, districtid], axis=1)
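
One common way to keep every table's rows is to collect the per-table DataFrames in a list and call pd.concat once after the loop, rather than rebuilding final_districtlist on each pass. The sketch below is only an illustration under assumptions: build_table_frame is a hypothetical helper, and scraped_tables holds made-up rows and links standing in for what the BeautifulSoup code above would extract from each page.

```python
import pandas as pd

column_headers = ['state', 'sr_no', 'district_name', 'country']
headers = ['district_id']


def build_table_frame(rows, links):
    """Build one DataFrame for a single table (hypothetical helper).

    rows  -- list of row lists, one per table row, matching column_headers
    links -- list of one-element lists holding each row's href
    """
    districtlist = pd.DataFrame(rows, columns=column_headers)
    districtid = pd.DataFrame(links, columns=headers)
    # axis=1 places the link column beside the text columns of the same row
    return pd.concat([districtlist, districtid], axis=1)


# Made-up stand-ins for the data scraped from two pages (assumption).
scraped_tables = [
    ([['a', '1', 'District A', 'X'], ['a', '2', 'District B', 'X']],
     [['/d/1'], ['/d/2']]),
    ([['a', '1', 'District C', 'X']],
     [['/d/3']]),
]

frames = []  # one DataFrame per table, so earlier tables are never overwritten
for rows, links in scraped_tables:
    frames.append(build_table_frame(rows, links))

# Stack all tables once, after the loop; ignore_index renumbers the rows
final_districtlist = pd.concat(frames, ignore_index=True)
print(final_districtlist)
```

In the real script, the body of the loop would be the scraping code from the question, run once per page. Concatenating once at the end also avoids copying the growing DataFrame on every iteration.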
