List comprehension to remove Python list elements that contain only digits (possibly with "_" or "-")
I have a lot of lists like this:
synonyms = ["3,2'-DIHYDROXYCHALCONE", '36574-83-1', '36574831', "2',3-Dihydroxychalcone", '(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one', 'MLS002693861']
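I need to remove every element that contains only digits. I can't figure out how to get rid of element [1], because although it is a number, it has dashes scattered through it.
Of course, this doesn't work, because the dashes make that element non-numeric: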
synonym_subset = [x for x in synonym_subset if not (x.isdigit())]
And I can't simply strip the dashes, because I want to keep the dashes in the other elements:
synonym_subset = [x.replace('-','') for x in synonym_subset]
I could run the code above to find the indexes of the elements to remove and then delete them by index, but I'm hoping for a one-liner.
Thanks.
3 Answers
0
You can use the filter function [Edit: although in this case you probably shouldn't, as mentioned in the comments]:
import re
synonyms = [
    "3,2'-DIHYDROXYCHALCONE",
    "36574-83-1",
    "36574831",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]
filtered_synonyms = list(
    filter(lambda x: not re.sub(r"[-_]", "", x).isdigit(), synonyms)
)
The result is:
["3,2'-DIHYDROXYCHALCONE", "2',3-Dihydroxychalcone", '(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one', 'MLS002693861']
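Since the question asks for a list comprehension, here is a minimal sketch of the same idea without filter or lambda. It assumes that "digits" means ASCII 0-9 only; the name DIGITS_AND_SEPARATORS is made up for illustration and is not from the original answer.

```python
import re

synonyms = [
    "3,2'-DIHYDROXYCHALCONE",
    "36574-83-1",
    "36574831",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]

# Hypothetical name: matches strings made up solely of ASCII digits, "_" and "-".
DIGITS_AND_SEPARATORS = re.compile(r"[0-9_-]+")

# fullmatch() succeeds only if the whole string fits the pattern,
# so names that merely contain digits or dashes are kept.
filtered_synonyms = [s for s in synonyms if not DIGITS_AND_SEPARATORS.fullmatch(s)]
print(filtered_synonyms)
```

One difference from the re.sub version: this pattern also drops a string made only of separators, such as "--", whereas re.sub(...).isdigit() keeps it because "".isdigit() is False.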
2
As a small addition to the answers already posted: depending on the length of the strings, using set() may be a better fit, for example:
synonyms = [
    "3,2'-DIHYDROXYCHALCONE",
    "36574-83-1",
    "36574831",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]
myset = set("0123456789-_")
[s for s in synonyms if not set(s).issubset(myset)]
Edit: as @no comment pointed out, this can be improved further by using issuperset, like so:
isdigits = set("0123456789-_").issuperset
[s for s in synonyms if not isdigits(s)]
Each of them returns:
["3,2'-DIHYDROXYCHALCONE", "2',3-Dihydroxychalcone", '(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one', 'MLS002693861']
P.S. Another option is to use ord(), but that is generally slower and less readable:
[s for s in synonyms if not all(ord(k) in (*range(48,58),45,95) for k in s)]
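If you want to check the speed claims on your own data, a rough timeit sketch could look like the following. The function names and the repetition count are made up for illustration, and the timings will vary by machine and by string length, so no numbers are shown here.

```python
import timeit

synonyms = [
    "3,2'-DIHYDROXYCHALCONE",
    "36574-83-1",
    "36574831",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]
allowed = set("0123456789-_")


def with_issubset():
    # Builds a set from every string, then checks it against the allowed set.
    return [s for s in synonyms if not set(s).issubset(allowed)]


def with_issuperset():
    # Iterates over the string directly; no per-string set is built by hand.
    return [s for s in synonyms if not allowed.issuperset(s)]


def with_ord():
    # Same test via code points: 48-57 are "0"-"9", 45 is "-", 95 is "_".
    return [s for s in synonyms if not all(ord(k) in (*range(48, 58), 45, 95) for k in s)]


# Time each variant; 2000 repetitions is an arbitrary choice.
for func in (with_issubset, with_issuperset, with_ord):
    seconds = timeit.timeit(func, number=2000)
    print(f"{func.__name__}: {seconds:.4f} s")
```

All three functions return the same list for this input, so only the timings differ.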
3
Try this:
synonyms = [
    "3,2'-DIHYDROXYCHALCONE",
    "36574-83-1",
    "36574831",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]
out = [s for s in synonyms if not all(ch in "0123456789-_" for ch in s)]
print(out)
The output is:
[
    "3,2'-DIHYDROXYCHALCONE",
    "2',3-Dihydroxychalcone",
    "(E)-1-(2-hydroxyphenyl)-3-(3-hydroxyphenyl)prop-2-en-1-one",
    "MLS002693861",
]
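One thing to be aware of: the approaches in this thread agree on the sample list but not on every input. Here is a small sketch of where the re.sub(...).isdigit() version and the all(ch in ...) version differ. The edge_cases list is made up for illustration.

```python
import re

# Hypothetical inputs: empty string, separators only, non-ASCII digits,
# digits with an underscore, and plain letters.
edge_cases = ["", "--", "\u0661\u0662\u0663", "12_34", "abc"]

# Strip separators, then ask str.isdigit(). "".isdigit() is False, so empty
# and separator-only strings are kept; isdigit() also accepts non-ASCII
# digits (here Arabic-Indic 1, 2, 3), so that string is removed.
via_isdigit = [s for s in edge_cases if not re.sub(r"[-_]", "", s).isdigit()]

# Check every character against an explicit ASCII whitelist. all() over an
# empty string is True, so "" is removed; non-ASCII digits are kept.
via_all = [s for s in edge_cases if not all(ch in "0123456789-_" for ch in s)]

print(via_isdigit)  # ['', '--', 'abc']
print(via_all)      # ['\u0661\u0662\u0663' (shown as Arabic-Indic digits), 'abc']
```

Which behaviour is right depends on your data. If empty strings or non-ASCII digits can appear in your lists, pick the variant that treats them the way you need.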