apply function returns a random substring instead of the full string
I have tried:
def extract_ticker(title):
    for word in title:
        word_str = word.encode('utf-8')
        if word_str in constituents['Symbol'].values:
            return word_str

sp500news3['tickers'] = sp500news3['title'].apply(extract_ticker)
It returns:
sp500news3['tickers']
79944 M
181781 M
213175 C
93554 C
257327 T
instead of the expected output:
79944 MSFT
181781 WMB
213175 CSX
93554 C
257327 TWX
The sample data can be created as follows:
import pandas as pd

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
sp500news3 = pd.DataFrame({"title": [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]})
Why not use a regular expression to extract the ticker instead?
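A minimal sketch of the regex idea, using the sample data from the question. It assumes pandas' Series.str.extract is an acceptable tool here; the name pattern is introduced only for illustration. The symbols are escaped, joined with |, and wrapped in word boundaries so that C cannot match inside CSX.

```python
import re

import pandas as pd

constituents = pd.DataFrame({"Symbol": ["TWX", "C", "MSFT", "WMB"]})
sp500news3 = pd.DataFrame({"title": [
    "MSFT Vista corporate sales go very well",
    "WMB No Anglican consensus on Episcopal Church",
    "CSX quarterly profit rises",
    "C says 30 bln capital helps exceed target",
    "TWX plans cable spinoff",
]})

# Build one alternation of all symbols, escaped and wrapped in word boundaries.
# "pattern" is a name chosen for this sketch, not part of the original code.
pattern = r"\b(" + "|".join(re.escape(s) for s in constituents["Symbol"]) + r")\b"

# expand=False returns a Series holding the first match per title (NaN if none).
sp500news3["tickers"] = sp500news3["title"].str.extract(pattern, expand=False)
print(sp500news3["tickers"])
```

With this sample, the CSX title yields NaN because CSX is not in the sample constituents frame; with the full symbol list it would presumably be matched.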
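Use a regular-expression extraction with word boundaries and the symbol values joined with |. Your own solution should also work if you split each title on whitespace, because iterating over the title string directly yields single characters rather than words. encode may also be necessary, depending on your Python version (on Python 3 it would turn each word into bytes, which never equals a str symbol).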