Python Pandas: creating an empty DataFrame with specified column data types

import pandas

def create_empty_dataframe():
    index = pandas.Index([], name="id", dtype=int)
    column_names = ["name", "score", "height", "weight"]
    series = [pandas.Series(dtype=str), pandas.Series(dtype=int),
              pandas.Series(dtype=float), pandas.Series(dtype=float)]
    columns = dict(zip(column_names, series))
    # Passing columns=column_names pins the column order; before Python 3.7,
    # a dict did not guarantee that it keeps its insertion order.
    return pandas.DataFrame(columns, index=index, columns=column_names)

3 Answers

User

#1 · Edited 2024-05-19 03:05:31

You can also set a column's data type by replacing the DataFrame column with a converted copy:

df['column_name'] = df['column_name'].astype(float)
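
Applied to the empty frame from the question, the same idea might look like the sketch below. It assumes the column names from the question and uses explicit dtype strings ("int64", "float64") so the result does not depend on the platform's default integer width. `astype` also accepts a column-to-dtype mapping, which converts several columns in one call.

```python
import pandas as pd

# Start from an empty frame; columns created this way typically
# begin with the generic object dtype.
df = pd.DataFrame(columns=["name", "score", "height", "weight"])

# astype returns a new frame, so the result has to be assigned back.
df = df.astype({"score": "int64", "height": "float64", "weight": "float64"})

print(df.dtypes)
```

The "name" column is left alone here because text columns already default to a dtype that can hold strings.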

User

#2 · Edited 2024-05-19 03:05:31

You can simplify things a little by using a dict comprehension:

import pandas

def create_empty_dataframe():
    index = pandas.Index([], name="id", dtype=int)
    # specify each column name together with its data type
    columns = [('name', str),
               ('score', int),
               ('height', float),
               ('weight', float)]
    # create the dataframe from a dict of empty, typed Series;
    # index=index attaches the named "id" index defined above
    return pandas.DataFrame({k: pandas.Series(dtype=t) for k, t in columns},
                            index=index)

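This is not very different from what you are already doing, but because each column name and its dtype are declared in one place, it is easier to create arbitrary DataFrames without editing the code in several spots.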
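
To illustrate the "arbitrary DataFrames" point, here is a minimal sketch of the same pattern with the schema passed in as an argument. The helper name `make_empty_frame` and its parameters are hypothetical, chosen only for this example; they are not part of pandas.

```python
import pandas as pd

def make_empty_frame(schema, index_name="id", index_dtype="int64"):
    """Hypothetical helper: empty DataFrame with one typed column per schema entry."""
    index = pd.Index([], name=index_name, dtype=index_dtype)
    # Each Series is built directly on the target index, so no realignment is needed.
    data = {name: pd.Series(dtype=dtype, index=index) for name, dtype in schema.items()}
    return pd.DataFrame(data, index=index)

schema = {"name": "object", "score": "int64", "height": "float64", "weight": "float64"}
df = make_empty_frame(schema)
print(df.dtypes)
```

A different table then only needs a different `schema` dict; the function body stays the same.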

User

#3 · Edited 2024-05-19 03:05:31

Unfortunately, the DataFrame constructor accepts only a single dtype descriptor, but you can cheat a little by using read_csv:

In [143]:
import pandas as pd
import io
cols=["id", "name", "score", "height", "weight"]
df = pd.read_csv(io.StringIO(""), names=cols, dtype=dict(zip(cols,[int, str, int, float, float])), index_col=['id']) 
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 0 entries
Data columns (total 4 columns):
name      0 non-null object
score     0 non-null int32
height    0 non-null float64
weight    0 non-null float64
dtypes: float64(2), int32(1), object(1)
memory usage: 0.0+ bytes

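So you can see that the dtypes are as requested and that the index is set as desired. (This output comes from an older pandas release; Int64Index was removed in pandas 2.0, where the index is shown as a plain Index instead.)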

In [145]:

df.index
Out[145]:
Int64Index([], dtype='int64', name='id')
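
If parsing an empty CSV feels too indirect, another way around the single-dtype limitation is to hand the constructor a zero-length NumPy structured array, whose fields each carry their own dtype. This is an alternative sketch rather than the approach from the answer above, and it assumes NumPy is available; the field names are the ones from the question.

```python
import numpy as np
import pandas as pd

# One (field name, dtype) pair per column, including the future index.
record_dtype = np.dtype([
    ("id", np.int64),
    ("name", object),
    ("score", np.int64),
    ("height", np.float64),
    ("weight", np.float64),
])

# A structured array with zero records becomes a frame with zero rows
# and one column per field; "id" is then moved into the index.
df = pd.DataFrame(np.empty(0, dtype=record_dtype)).set_index("id")

print(df.dtypes)
print(df.index.dtype)
```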
