Pandas .apply on a spaCy Doc column returns None

Posted 2024-06-09 13:55:40


I run the following on my sp500news3 DataFrame, and it returns None for every row:

def extract_ticker(title):
    for word in title:
        if word in constituents['Symbol']:
            return word

sp500news3['tickers'] = sp500news3['title'].apply(extract_ticker)

# sp500news3 sample:

    index   date_publish         title                                                  tickers
0   79944   2007-01-29 19:08:35  (MSFT, Vista, corporate, sales, go, very, well)        None
1   181781  2007-12-14 19:39:06  (WMB, No, Anglican, consensus, on, Episcopal, Church)  None
2   213175  2008-01-22 11:17:19  (CSX, quarterly, profit, rises)                        None
3   93554   2008-01-22 18:52:56  (C, says, 30, bln, capital, helps, exceed, target)     None

constituents['Symbol'] sample:

0      TWX  
1      C  
2      MSFT  
3      WMB ...

To reproduce the spaCy Doc column:

import pandas as pd
import spacy

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})

sp500news3 = pd.DataFrame({"title": [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]})

nlp = spacy.load('en_core_web_sm')

# Replace each title string with a spaCy Doc
sp500news3['title'] = sp500news3['title'].apply(nlp)


1 answer

User

#1 · Posted 2024-06-09 13:55:40

You have to use word.text, because iterating over a Doc yields Token objects, and a Token does not compare equal to a plain string:

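def extract_ticker(title):
    # title is a spaCy Doc; iterating over it yields Token objects
    for word in title:
        # Compare the token's string form against the Series values, not its index
        if word.text in constituents['Symbol'].values:
            return word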
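The .values in the check above matters as much as word.text. The in operator on a pandas Series tests the index labels, not the stored values, so the original condition would fail even for plain strings. A minimal sketch using only the constituents frame from the question (the names symbols and symbol_set are illustrative, not from the original post):

```python
import pandas as pd

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
symbols = constituents["Symbol"]

# `in` on a Series looks at the index labels (0, 1, 2, 3), not the values.
print("MSFT" in symbols)          # False
print(2 in symbols)               # True, because 2 is an index label

# Checking against the underlying array tests the values themselves.
print("MSFT" in symbols.values)   # True

# A set of the values gives the same answer with a hash lookup.
symbol_set = set(symbols)
print("MSFT" in symbol_set)       # True
```
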

With your example:

In [11]: sp500news3['title'].apply(extract_ticker)
Out[11]:
0    MSFT
1     WMB
2    None
3       C
4     TWX
Name: title, dtype: object
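The function above stores Token objects in the result. If plain strings are preferred, one possible variant returns word.text and looks symbols up in a set built once. This is a sketch, not part of the original answer: extract_ticker_text, symbol_set and fake_doc are hypothetical names, and fake_doc is a whitespace-splitting stand-in for nlp(...) so the example runs without a spaCy model installed.

```python
from types import SimpleNamespace

import pandas as pd

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
symbol_set = set(constituents["Symbol"])  # built once, reused for every title


def extract_ticker_text(title):
    """Return the text of the first token that is a known symbol, else None."""
    for word in title:
        if word.text in symbol_set:
            return word.text
    return None


def fake_doc(sentence):
    # Assumption: stand-in for nlp(sentence). Each item only mimics the
    # .text attribute of a spaCy Token; it is not a real Doc.
    return [SimpleNamespace(text=w) for w in sentence.split()]


raw_titles = [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]
titles = pd.Series([fake_doc(t) for t in raw_titles])
tickers = titles.apply(extract_ticker_text)
print(tickers)
```

With a real Doc column the function is applied the same way, for example sp500news3['title'].apply(extract_ticker_text). The third title still gets no match, because CSX is not in the sample constituents list.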
