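Add a header to a DataFrame and drop the index in Pandas
I have a DataFrame that has been resampled into a smaller DataFrame, keeping its datetime index. I then transposed it, and now I want to drop the date index and replace it with strings (labels), then export it to .csv for use in JavaScript (all the data processing is done in Python).
I tried writing it to a .csv file without the header (to drop the dates) and then reading it back in to add the labels, but that seems inefficient.
Here is a link to the .csv file: https://www.dropbox.com/s/qy72yht2m7lk2pg/17_predicted.csv
Python/pandas code: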
import pandas as pd
import numpy as np
from dateutil.parser import parse
from datetime import datetime
from pandas import *
# Load csv into pandas DataFrame
df = pd.read_csv("17_predicted_dummydata.csv", parse_dates=True, dayfirst=False, keep_date_col=True, index_col=0)
#Creating a date range
df.index = pd.to_datetime(pd.date_range(start='1/1/2000 00:30:00', end='1/1/2000 05:00:00', freq='30T'))
#Rename index
df.index.name = 'date'
df_year = df.resample('D', how='sum')
df_year = np.round(df_year, 0)
df_year.index.name = 'label'
df_year.column = ['value']
df_year = df_year.T
print df_year.head()
print df_year.index.name
df_year.to_csv("17_dummy.csv") #drop index through header=False
CSV input:
Date/Time,InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),InteriorEquipment:Electricity:Zone:3993 [J](TimeStep)
01/01 00:30:00,0.583979872,0.428071889,0.044676234
01/01 01:00:00,0.583979872,0.428071889,0.044676234
01/01 01:30:00,0.583979872,0.428071889,0.044676234
01/01 02:00:00,0.583979872,0.428071889,0.044676234
01/01 02:30:00,0.583979872,0.428071889,0.044676234
01/01 03:00:00,0.583979872,0.428071889,0.044676234
01/01 03:30:00,0.583979872,0.428071889,0.044676234
01/01 04:00:00,0.583979872,0.428071889,0.044676234
01/01 04:30:00,0.583979872,0.428071889,0.044676234
01/01 05:00:00,0.583979872,0.428071889,0.044676234
Desired CSV output:
label,value
InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),6.0
InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),4.0
InteriorEquipment:Electricity:Zone:3993 [J](TimeStep),0.0
I tried following this approach (inserting a row into a pandas DataFrame), but could not get it to work.
Any help is much appreciated!
1 Answer
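You can assign the index name and the column names of the transposed DataFrame directly, and it will then be written out the way you want.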
In [288]: df_year.index.name = 'label'
In [289]: df_year.columns = ['value']
In [290]: print df_year.to_csv()
label,value
Equipment:Electricity:LGF,79468.0
Equipment:Electricity:GF,66724.0
Equipment:Electricity:1st,30700.0
Equipment:Electricity:2nd,24126.0
Lights:Electricity:LGF,30596.0
Lights:Electricity:GF,30596.0
Lights:Electricity:1st,14078.0
Lights:Electricity:2nd,11063.0
General:Equipment:Electricity,201018.0
General:Lights:Electricity,86334.0
Electricity:Facility,314318.0
Electricity:Building,287352.0
Electricity:Plant,6329.0
Gas:Facility,279252.0
Electricity:HVAC,20637.0
General:Fans:Electricity,3554.0
Cooling:Electricity,17083.0
Pumps:Electricity,3708.0
WaterSystems:Electricity,2621.0
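For readers on Python 3 and a recent pandas release, the same idea can be sketched end to end as below. This is a minimal illustration rather than the original poster's exact script: the inline CSV and its shortened column names (`Zone:4419`, `Zone:3967`) are hypothetical stand-ins for the real file, and `resample("D").sum()` is used because recent pandas releases no longer accept the `how='sum'` argument from the question.

```python
import io
import pandas as pd

# Hypothetical stand-in for the question's CSV (three rows, two columns).
raw = io.StringIO(
    "Date/Time,Zone:4419,Zone:3967\n"
    "01/01 00:30:00,0.583979872,0.428071889\n"
    "01/01 01:00:00,0.583979872,0.428071889\n"
    "01/01 01:30:00,0.583979872,0.428071889\n"
)
df = pd.read_csv(raw, index_col=0)

# The timestamps carry no year, so build the index explicitly,
# as the question does with pd.date_range.
df.index = pd.date_range(start="2000-01-01 00:30:00", periods=len(df), freq="30min")

# Daily totals, rounded to whole numbers.
daily = df.resample("D").sum().round(0)

# Transpose first, then label: after .T the old column names are the index
# and the single resampled date is the only column.
out = daily.T
out.index.name = "label"
out.columns = ["value"]

csv_text = out.to_csv()
print(csv_text)
```

The key design point is the order of operations: the labels are assigned after the transpose, so `index.name` becomes the header of the first CSV column and `columns` replaces the date. Passing a file path to `to_csv` instead of calling it with no arguments writes the same text to disk.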