Cannot find a substring match for a column in my DataFrame

Asked 2025-04-14 16:09

import os

import pandas as pd
import streamlit as st  # assumption: st in the original snippet is Streamlit


def process_and_predict(folder_path):
    image_files = os.listdir(folder_path)
    results_df = pd.DataFrame(columns=['name', 'prediction', 'actual'])

    for image_file in image_files:
        # some preprocessing
        # os.listdir already returns str; str() and strip() are only a precaution
        str1 = str(image_file).strip()
        st.write("string ", str1)
        # df is the DataFrame described below, defined elsewhere;
        # look up the BIRADS value of the row whose path contains this file name
        actual = df.loc[df['Image_filename'].str.contains(str1), 'BIRADS'].values[0]

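I have a DataFrame with a column named 'Image_filename' that stores file paths. I am iterating over some test images and want to find the row that matches image_file and extract the value of its 'BIRADS' column.

For example, "inst/BIRADS 2/birads - 2 (11).bmp" is one of the values in df['Image_filename'].

During the iteration, image_file (which becomes str1) takes the value 'birads - 2 (11).bmp'.

In theory the code above should find the matching entry, but it does not, and I get this warning: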

UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
  new_info['Image_filename'].str.contains(x)

This is strange, because when str1 is 'case001.png', the same code finds the match without any problem.

The matching entry in 'Image_filename' is 'BrEaST-Lesions_USG-images_and_masks/case001.png'.

1 Answer

Try this: new_info['Image_filename'].str.contains(x, regex=False)
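A minimal sketch of that call, applied to the two paths quoted in the question. The two-row table, its BIRADS values and the variable names here are made up for illustration; the real data will differ.

```python
import warnings

import pandas as pd

# Hypothetical table mirroring the two paths quoted in the question;
# the BIRADS values are invented for this sketch.
df = pd.DataFrame({
    "Image_filename": [
        "inst/BIRADS 2/birads - 2 (11).bmp",
        "BrEaST-Lesions_USG-images_and_masks/case001.png",
    ],
    "BIRADS": [2, 3],
})

name = "birads - 2 (11).bmp"

# Default (regex=True): the name is compiled as a pattern, and no row matches.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)  # silence the match-groups warning
    regex_mask = df["Image_filename"].str.contains(name)

# regex=False: the name is searched for as a plain substring.
literal_mask = df["Image_filename"].str.contains(name, regex=False)

# Guard against "no match" instead of indexing [0] on an empty result.
matches = df.loc[literal_mask, "BIRADS"]
actual = matches.iloc[0] if not matches.empty else None

print(regex_mask.tolist(), literal_mask.tolist(), actual)
```

The check on matches.empty is an optional extra: .values[0] on an empty selection raises IndexError, which is probably what the loop in the question runs into when nothing matches.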

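By default, x is interpreted as a regular expression, in which (...) has a special meaning: the parentheses form a capture group instead of matching literal '(' and ')' characters, so '(11)' in the pattern only matches the text '11'. That is also why 'case001.png', which contains no parentheses, still matches. For more details, see the pandas documentation for Series.str.contains.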
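The behaviour can be reproduced with the standard re module alone. This is only an illustrative sketch using the file name and path from the question; the variable names are arbitrary.

```python
import re

path = "inst/BIRADS 2/birads - 2 (11).bmp"
name = "birads - 2 (11).bmp"

# As a pattern, "(11)" is a capture group that matches the text "11",
# so the literal parentheses in the path are never matched.
unescaped_match = re.search(name, path)
group_count = re.compile(name).groups

# re.escape backslash-escapes the metacharacters, so the pattern
# matches the file name literally.
escaped_match = re.search(re.escape(name), path)

# A name without parentheses still matches as a regex
# (the "." simply matches any character, including a real dot).
plain_match = re.search(
    "case001.png", "BrEaST-Lesions_USG-images_and_masks/case001.png"
)

print(unescaped_match is None)     # True
print(group_count)                 # 1
print(escaped_match is not None)   # True
print(plain_match is not None)     # True
```

If a regular expression is still wanted for other reasons, passing re.escape(x) to str.contains is an alternative to regex=False.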

Answered 2025-04-14 by Python大师
