Can I use a regex search or match on a Python column where each cell is a list?

import pandas as pd
import numpy as np

cycling = pd.DataFrame(
    {
        'qty' : [1,0,2,1,1],
        'item' : ['frame','frame',np.nan,'order including a saddle and other things','brake'],
        'desc' : [np.nan,['bike','wheel'],['bike',['tire','tube']],['saddle',['seatpost','bag']],['bike','brakes']]
    }
)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-45-4c72cdaa87a4> in <module>()
----> 1 cycling['saddle2'] = [int(bool(re.search(r"saddle",x))) for y in cycling['desc'].replace(np.nan,'missing') for x in y]
      2 cycling.head()

1 frames
/usr/lib/python3.6/re.py in search(pattern, string, flags)
    180     """Scan through string looking for a match to the pattern, returning
    181     a match object, or None if no match was found."""
--> 182     return _compile(pattern, flags).search(string)
    183
    184 def sub(pattern, repl, string, count=0, flags=0):

TypeError: expected string or bytes-like object

1 Answer

User

#1 · Posted 2024-04-23 11:52:53

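You can use map instead of running a for loop (which is slow). re.search only accepts a string or bytes-like object, so convert each list to a str before passing it to the regex. For example: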

import pandas as pd
import numpy as np
import re

cycling = pd.DataFrame(
    {
        'qty' : [1,0,2,1,1],
        'item' : ['frame','frame',np.nan,'order including a saddle and other things','brake'],
        'desc' : [np.nan,['bike','wheel'],['bike',['tire','tube']],['saddle',['seatpost','bag']],['bike','brakes']]
    }
)
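# 'item' holds plain strings, so only NaN needs replacing before the search
cycling['saddle1'] = cycling['item'].replace(np.nan,'missing').map(lambda x :int(bool(re.search(r"saddle",x))))
# 'desc' holds (nested) lists; str(x) turns each one into its text form so re.search can scan it
cycling['saddle2'] = cycling['desc'].replace(np.nan,'missing').map(lambda x :int(bool(re.search(r"saddle",str(x)))))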

cycling
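
One caveat with str(x): the regex then runs over the list's printed form, including its brackets, quotes, and commas, so a pattern can match across element boundaries. If that matters, a possible alternative is to walk the nested list and test each string on its own. The sketch below assumes that approach; the helper name contains_match is hypothetical and not part of pandas or re.

```python
import re

import numpy as np
import pandas as pd


def contains_match(cell, pattern):
    """Return 1 if any string nested inside `cell` matches `pattern`, else 0.

    Hypothetical helper written for this example only.
    """
    if isinstance(cell, str):
        return int(bool(re.search(pattern, cell)))
    if isinstance(cell, (list, tuple)):
        # Recurse into nested lists such as ['saddle', ['seatpost', 'bag']].
        return int(any(contains_match(item, pattern) for item in cell))
    # NaN and any other non-string, non-list value count as "no match".
    return 0


cycling = pd.DataFrame(
    {
        'qty': [1, 0, 2, 1, 1],
        'item': ['frame', 'frame', np.nan,
                 'order including a saddle and other things', 'brake'],
        'desc': [np.nan, ['bike', 'wheel'], ['bike', ['tire', 'tube']],
                 ['saddle', ['seatpost', 'bag']], ['bike', 'brakes']],
    }
)

# No NaN replacement is needed: the helper treats NaN as "no match".
cycling['saddle2'] = cycling['desc'].map(lambda cell: contains_match(cell, r"saddle"))
print(cycling['saddle2'].tolist())  # prints [0, 0, 0, 1, 0]
```

The same helper also works on the 'item' column, since a plain string is handled by the first branch.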

Hope this helps!
