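How do I flatten a pandas Series?
When I use the string extract method on a DataFrame column (a Series), the result appears to be a Series that itself contains Series. How can I flatten this result and get rid of the nested Series without processing it row by row?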
from pandas import DataFrame
df = DataFrame({"a": ["asf cat llma 3X dog elepnt", "cat dog child monkey 5X sljka", "fsdjkl; dsfaa"]})
sr = df[df.a.str.contains("[3-6]X", na=False)].a.str.extract("([3-6]X)")
# sr is a series of series
[type(sr.iloc[i]) for i in range(sr.shape[0])]
# --> [<class 'pandas.core.series.Series'>, <class 'pandas.core.series.Series'>]
# I can add the expand parameter to no effect
sr = df[df.a.str.contains("[3-6]X", na=False)].a.str.extract("([3-6]X)", expand=True)
[type(sr.iloc[i]) for i in range(sr.shape[0])]
# --> [<class 'pandas.core.series.Series'>, <class 'pandas.core.series.Series'>]
1 Answer
3
If I understand you correctly, pass expand=False:
sr = df[df.a.str.contains("[3-6]X", na=False)].a.str.extract("([3-6]X)", expand=False)
[type(sr.iloc[i]) for i in range(sr.shape[0])]
Output:
[str, str]
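To make the difference easier to see, here is a minimal, self-contained sketch that uses the same data as the question. It assumes a reasonably recent pandas, where `expand=True` is the default and `str.extract` returns a DataFrame with one column per capture group, so each `iloc[i]` is a row Series. The variable names (`mask`, `as_frame`, `as_series`, `first_col`, `no_filter`) are mine, not from the original post.

```python
import pandas as pd

df = pd.DataFrame({"a": ["asf cat llma 3X dog elepnt",
                         "cat dog child monkey 5X sljka",
                         "fsdjkl; dsfaa"]})

# Rows whose text contains a token such as 3X, 4X, 5X or 6X
mask = df["a"].str.contains("[3-6]X", na=False)

# expand=True: a DataFrame with one column per capture group
as_frame = df.loc[mask, "a"].str.extract("([3-6]X)", expand=True)

# expand=False with a single capture group: a flat Series of strings
as_series = df.loc[mask, "a"].str.extract("([3-6]X)", expand=False)

# Alternative: take the only column of the DataFrame result
first_col = as_frame.iloc[:, 0]

# Alternative without the contains() filter: non-matching rows
# come back as NaN and can be dropped afterwards
no_filter = df["a"].str.extract("([3-6]X)", expand=False).dropna()

print(type(as_frame).__name__)
print(type(as_series).__name__)
print(as_series.tolist())
```

If you already have the one-column DataFrame, selecting its single column as in `first_col` should give you the same flat values as `expand=False`.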