Searching a list to see whether it contains strings stored in a different list in Python

import csv

word_list = ["Slam", "Slams", "Slammed", "Slamming", "Blast", "Blasts", "Blasting", "Blasted"]
slam_list = []
csv_data = []

# Creating the list I need by opening a csv and getting the column I need
with open("website_headlines.csv", encoding="utf8") as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        csv_data.append(row)

headline_col = [headline[2] for headline in csv_data]

2 Answers

User

#1 · Edited 2024-05-16 08:59:42

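Since you are reading a CSV here, it may be easier to reach your goal with pandas.

What you want to do is identify the column by its index, which looks like it is 2, and then find the values in that third column that appear in word_list.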

import pandas as pd

df = pd.read_csv("website_headlines.csv")
col = df.columns[2]
df.loc[df[col].isin(word_list), col]

Consider the following example:

import numpy as np
import pandas as pd

word_list = ["Slam", "Slams", "Slammed", "Slamming",
             "Blast", "Blasts", "Blasting", "Blasted"]

# add some extra characters to see if limited to exact matches
word_list_mutated = np.random.choice(word_list + [item + '_extra' for item in word_list], 10)

data = {'a': range(1, 11), 'b': range(1, 11), 'c': word_list_mutated}
df = pd.DataFrame(data)
col = df.columns[2]

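>>> df
    a   b               c
0   1   1           Slams
1   2   2           Slams
2   3   3   Blasted_extra
3   4   4          Blasts
4   5   5     Slams_extra
5   6   6  Slamming_extra
6   7   7            Slam
7   8   8     Slams_extra
8   9   9            Slam
9  10  10        Blasting

# isin keeps only exact matches, so the "_extra" rows are dropped
# (for the frame above, rows 0, 1, 3, 6, 8 and 9 remain)
>>> df.loc[df[col].isin(word_list), col]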
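
Note that `isin` compares each cell as a whole, so it only catches cells that equal one of the words exactly. If the third column holds full headlines, as in the question, a word-boundary regular expression with `Series.str.contains` may be closer to what is needed. A minimal sketch, assuming made-up sample headlines (the data and the column name `headline` are illustrative, not taken from the original CSV):

```python
import re

import pandas as pd

word_list = ["Slam", "Slams", "Slammed", "Slamming",
             "Blast", "Blasts", "Blasting", "Blasted"]

# Hypothetical sample data standing in for the third column of the CSV
df = pd.DataFrame({"headline": [
    "Senator Slams new budget proposal",
    "Local bakery wins award",
    "Critics blast the sequel",
    "Slamdunk contest draws record crowd",
]})

# \b...\b matches whole words only, so "Slamdunk" is not a hit
pattern = r"\b(?:" + "|".join(re.escape(w) for w in word_list) + r")\b"

# case=False also catches lowercase "blast"; na=False treats missing cells as non-matches
mask = df["headline"].str.contains(pattern, case=False, na=False, regex=True)
slam_headlines = df.loc[mask, "headline"].tolist()
print(slam_headlines)
```

Whether matching should ignore case is a judgment call; drop `case=False` if only the capitalized forms in word_list should count.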

User

#2 · Edited 2024-05-16 08:59:42

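So, as you mentioned, using a set is definitely the way to go. Membership tests on a set are much faster than on a list: a set uses a hash lookup, while a list has to be scanned item by item. If you want to know more, a quick search on hashing will explain it. To make this change, just replace the square brackets around word_list with curly braces.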
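
The change can be sketched as follows. The timing lines are only an illustration: the numbers vary by machine, and with just eight words the gap is small, though it grows with the size of the collection. The probe word "Praised" is made up for the example.

```python
import timeit

word_list = ["Slam", "Slams", "Slammed", "Slamming",
             "Blast", "Blasts", "Blasting", "Blasted"]

# Either write the literal with curly braces, or convert an existing list
word_set = set(word_list)

# Membership tests read the same for both containers
print("Blasted" in word_list, "Blasted" in word_set)
print("Praised" in word_list, "Praised" in word_set)

# Rough comparison: the list is scanned item by item, the set uses a hash lookup
list_time = timeit.timeit(lambda: "Praised" in word_list, number=100_000)
set_time = timeit.timeit(lambda: "Praised" in word_set, number=100_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```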

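The real problem you need to deal with is that a headline is a string made up of many words, while word_list holds single words.

What you need to do is iterate over those words. I am assuming headline_col is a list of headlines, where each headline is a string containing one or more words. We will loop over all the headlines, and then over every word in each headline.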

word_list = {"Slam", "Slams", "Slammed", "Slamming", "Blast", "Blasts", "Blasting", "Blasted"}

# Iterate over each headline
for headline in headline_col:

    # Iterate over each word in headline
    # headline.split() breaks the headline into a list of words (splitting on whitespace)
    for word in headline.split():

        # if we've found our word
        if word in word_list:
            # add the whole headline to our list of matches
            slam_list.append(headline)
            # we're done with this headline, so break from the inner for loop
            break
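
One caveat with `split()` is that it keeps punctuation and case, so a token such as `"Slams,"` or `"slams"` would not match. A possible variation, assuming case-insensitive matching is wanted (the helper name `find_slam_headlines` and the sample headlines are made up for illustration):

```python
import string

word_set = {"slam", "slams", "slammed", "slamming",
            "blast", "blasts", "blasting", "blasted"}


def find_slam_headlines(headlines, words=word_set):
    """Return the headlines that contain at least one of the given words."""
    matches = []
    for headline in headlines:
        # Lowercase each token and strip surrounding punctuation before the lookup
        tokens = (tok.strip(string.punctuation).lower() for tok in headline.split())
        # any() stops at the first hit, like the break in the loop above
        if any(tok in words for tok in tokens):
            matches.append(headline)
    return matches


# Hypothetical sample data standing in for headline_col
sample = [
    "Mayor slams critics, vows reform",
    "Weather: sunny all week",
    "Fans 'Blasted' by ticket prices",
]
print(find_slam_headlines(sample))
```

Storing the words in lowercase and lowercasing each token keeps the lookup a plain set membership test.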
