Python - pandas - Append a Series to an empty DataFrame

Asked 2025-04-18 08:09

Suppose I have two pandas Series in Python:

import pandas as pd
h = pd.Series(['g',4,2,1,1])
g = pd.Series([1,6,5,4,"abc"])

I can first create a DataFrame from h and then append g to it:

df = pd.DataFrame([h])
df1 = df.append(g, ignore_index=True)

This gives me:

>>> df1
   0  1  2  3    4
0  g  4  2  1    1
1  1  6  5  4  abc

But now suppose I have an empty DataFrame and want to append h to it:

df2 = pd.DataFrame([])
df3 = df2.append(h, ignore_index=True)

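This does not work. I think the problem is in the second-to-last line of code: I need to define the empty DataFrame in some way so that it has the correct number of columns.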
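One way to give the empty DataFrame the right columns up front is sketched below. It is a minimal illustration, not something taken from the answer: it assumes the columns should simply mirror the Series index (0 to 4), and it adds the row with `.loc` rather than `append`.

```python
import pandas as pd

h = pd.Series(['g', 4, 2, 1, 1])

# Assumption: the empty frame should have one column per Series entry (0..4)
df2 = pd.DataFrame(columns=h.index)

# Assigning to a row label that does not exist yet enlarges the frame by one row
df2.loc[len(df2)] = h.tolist()

print(df2)
```

Row-by-row enlargement like this copies data on each step, so it is best kept for small frames.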

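By the way, the reason I want to do this is that I am scraping text from the web with requests and BeautifulSoup, processing it, and want to write the results into a DataFrame one row at a time.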
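For the row-at-a-time scraping use case, a common alternative is to collect the rows in a plain Python list and build the DataFrame once at the end. The sketch below assumes that approach fits the workflow; `scraped_rows` is a hypothetical stand-in for whatever the requests and BeautifulSoup processing produces.

```python
import pandas as pd

# Hypothetical stand-in for rows produced by the scraping/processing step
scraped_rows = [
    ['g', 4, 2, 1, 1],
    [1, 6, 5, 4, 'abc'],
]

records = []
for row in scraped_rows:
    # In a real scraper, each processed row would be appended here
    records.append(row)

# Build the DataFrame once, after all rows have been collected
df = pd.DataFrame(records)
print(df)
```

Appending to a list is cheap, whereas growing a DataFrame row by row creates a new object each time.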

1 Answer

So, if you don't pass an empty list to the DataFrame constructor, it works:

In [16]:

df = pd.DataFrame()
h = pd.Series(['g',4,2,1,1])
df = df.append(h,ignore_index=True)
df
Out[16]:
   0  1  2  3  4
0  g  4  2  1  1

[1 rows x 5 columns]
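Readers on recent pandas releases should note that `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the calls in this thread will not run there. A rough equivalent using `pd.concat` is sketched below; it assumes the Series should become a single row, which is why it is converted with `to_frame().T` first.

```python
import pandas as pd

df = pd.DataFrame()
h = pd.Series(['g', 4, 2, 1, 1])

# Turn the Series into a one-row DataFrame, then concatenate it onto df
row = h.to_frame().T
df = pd.concat([df, row], ignore_index=True)

print(df)
```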

The difference between the two constructions is the dtype of the index: it is int64 when an empty list is passed, and object when nothing is passed:

In [21]:

df = pd.DataFrame()
print(df.index.dtype)
df = pd.DataFrame([])
print(df.index.dtype)
object
int64

I'm not sure why this would affect the behaviour (this is only a guess on my part).
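To see how the two constructions compare on a given installation, a small check along these lines may help. The reported index type and dtype depend on the pandas version, so no particular output is assumed here; `index_info` is just an illustrative name.

```python
import pandas as pd

# Build both kinds of empty DataFrame and record what their index looks like
frames = {
    "pd.DataFrame()": pd.DataFrame(),
    "pd.DataFrame([])": pd.DataFrame([]),
}

index_info = {}
for label, frame in frames.items():
    index_info[label] = (type(frame.index).__name__, str(frame.index.dtype))
    print(label, index_info[label])
```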

Update

After looking at this question again, I can confirm that this appears to be a bug in pandas 0.12.0, because your original code runs fine for me:

In [13]:

import pandas as pd
df = pd.DataFrame([])
h = pd.Series(['g',4,2,1,1])
df.append(h,ignore_index=True)

Out[13]:
   0  1  2  3  4
0  g  4  2  1  1

[1 rows x 5 columns]

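I'm running pandas 0.13.1 and numpy 1.8.1 (64-bit) on Python 3.3.5.0. I believe the problem lies in pandas, but to be safe I'd suggest upgrading both pandas and numpy. I don't think this is a 32-bit versus 64-bit Python issue.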
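Since the behaviour depends on the installed versions, it can be worth printing them before and after upgrading. A minimal check:

```python
import sys

import numpy as np
import pandas as pd

# Report the versions in use so they can be compared with the ones mentioned above
print("python", sys.version.split()[0])
print("pandas", pd.__version__)
print("numpy", np.__version__)
```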

Answered 2025-04-18 by Python大师
