Change strings in a big list with another string

Posted 2024-06-06 22:26:22


I have a list called t_pre_eks_tfberita (the original post showed its contents only as a screenshot).

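In the row whose first element is "Label", I want to replace every string containing "BUKAN HOAX (1)" with "BUKAN HOAX", and every string containing "HOAX (1)" with "HOAX". But I get an error when I use this code: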

import re
import pandas as pd

for i in range(len(t_pre_eks_tfberita)):
    if t_pre_eks_tfberita[i][0] == "Label":
        j = 1
        while j in range(len(t_pre_eks_tfberita[i])):
            cek = re.search("BUKAN", t_pre_eks_tfberita[i][j])

            if cek:
                t_pre_eks_tfberita[i][j] = "BUKANHOAX"
            else:
                t_pre_eks_tfberita[i][j] = "HOAX"
            j += 1

dfr_eks_tfberita = pd.DataFrame(list(map(list, zip(*t_pre_eks_tfberita))))
new_header = dfr_eks_tfberita.iloc[0] #grab the first row for the header
dfr_eks_tfberita = dfr_eks_tfberita[1:] #take the data less the header row
dfr_eks_tfberita.columns = new_header

for i in range(len(new_header)):
    if new_header[i] != 'Label' and new_header[i] != 'Isi_Dokumen':
        dfr_eks_tfberita[new_header[i]] = dfr_eks_tfberita[new_header[i]].astype('int')

dfr_eks_tfberita

When I run it, I get an error (the original post showed it only as a screenshot).

Is there a solution to this problem?


2 Answers

Using re here is overkill. You only need to iterate over the values and check whether each one contains "BUKAN HOAX (1)" or "HOAX (1)":

if "HOAX (1)" in t_pre_eks_tfberita[i][j]:
    dosomething()
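The substring check above can be expanded into a full loop over the list. This is only a minimal sketch: the sample rows and the helper name normalize_labels are hypothetical, and it assumes the list is row-oriented with the row name as the first element, as in the question. Note that "BUKAN" has to be tested first, because "HOAX (1)" is also a substring of "BUKAN HOAX (1)".

```python
# Hypothetical sample data: each inner list is one row,
# and the first element of each row is the row name.
rows = [
    ["Isi_Dokumen", "doc a", "doc b", "doc c"],
    ["Label", "HOAX (1)", "BUKAN HOAX (1)", "BUKAN HOAX"],
]


def normalize_labels(table):
    """Normalize the values of the row named "Label", in place."""
    for row in table:
        if row[0] != "Label":
            continue
        # Skip index 0, which holds the row name itself.
        for j in range(1, len(row)):
            # Check "BUKAN" first: every "BUKAN HOAX" also contains "HOAX".
            row[j] = "BUKAN HOAX" if "BUKAN" in row[j] else "HOAX"
    return table


normalize_labels(rows)
print(rows[1])  # ['Label', 'HOAX', 'BUKAN HOAX', 'BUKAN HOAX']
```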

In practice, though, you can do this inside the DataFrame itself, using its own methods such as iterrows().
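As a rough illustration of the iterrows() route, the sketch below assumes the data has already been turned into a DataFrame with a "Label" column. The sample frame is made up for the example and is not the asker's data.

```python
import pandas as pd

# Hypothetical sample frame standing in for the asker's DataFrame.
df = pd.DataFrame(
    {
        "Isi_Dokumen": ["doc a", "doc b", "doc c"],
        "Label": ["HOAX (1)", "BUKAN HOAX (1)", "BUKAN HOAX"],
    }
)

# iterrows() yields (index, row) pairs; each row is a copy,
# so the new value is written back through df.at.
for idx, row in df.iterrows():
    df.at[idx, "Label"] = "BUKAN HOAX" if "BUKAN" in row["Label"] else "HOAX"

print(df["Label"].tolist())  # ['HOAX', 'BUKAN HOAX', 'BUKAN HOAX']
```

For large frames a row-by-row loop tends to be slow, so a vectorized string method on the column is usually the better choice.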

IIUC, try pandas.Series.str.replace followed by strip:

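import pandas as pd

# Sample input
s = pd.Series(['HOAX', 'HOAX (1)', 'BUKAN HOAX', 'BUKAN HOAX (1000)'])
# Remove a parenthesized number, then trim the leftover whitespace.
# regex=True is required in pandas 2.0+, where the default is a literal match.
new_s = s.str.replace(r'\(\d+\)', '', regex=True).str.strip()
print(new_s)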

Output:

0          HOAX
1          HOAX
2    BUKAN HOAX
3    BUKAN HOAX
dtype: object
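The same replacement can be applied to the "Label" column after the list has been transposed into a DataFrame, as the question's code does. The following is a sketch under assumptions: the sample rows (including the "kata" count row) are invented for illustration, and only the transposition step is taken from the question.

```python
import pandas as pd

# Hypothetical row-oriented input, shaped like the list in the question.
rows = [
    ["Isi_Dokumen", "doc a", "doc b"],
    ["Label", "HOAX (1)", "BUKAN HOAX (1)"],
    ["kata", "2", "0"],
]

# Transpose rows into columns, then promote the first row to the header.
df = pd.DataFrame(list(map(list, zip(*rows))))
df.columns = df.iloc[0]
df = df[1:].reset_index(drop=True)

# Strip the parenthesized counter from every label in one vectorized call.
df["Label"] = df["Label"].str.replace(r"\(\d+\)", "", regex=True).str.strip()

print(df["Label"].tolist())  # ['HOAX', 'BUKAN HOAX']
```

Cleaning the column after building the DataFrame avoids the nested index loop entirely.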
