Converting strings to time in a DataFrame with a for loop

Posted 2024-04-19 00:24:18


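I don't want to think about how long I've spent on this trivial problem, but I'm trying to convert a string to a date for every row of a specific column. Here is my DataFrame: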

table:
        day  date           rank  gross_budget
    0   Fri  Sep. 18, 2015  5     $2,298,380
    1   Sat  Sep. 19, 2015  5     $2,993,960
    2   Sun  Sep. 20, 2015  5     $1,929,695
    3   Mon  Sep. 21, 2015  5     $617,410
    4   Tue  Sep. 22, 2015  5     $851,220

My failed attempt at converting the date column to a date format is below:

for d in table.date :
        table.date[d] = time.strptime(table.date[d],'%b. %d, %Y')

It throws this error:

TypeError                                 Traceback (most recent call last)
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:7200)()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-8-cc64c6038ec8> in <module>()
     21 
     22 for d in table.date :
---> 23         table.date[d] = time.strptime(table.date[d],'%b. %d, %Y')
     24 
     25 table.head()

/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/core/series.py in __getitem__(self, key)
    519     def __getitem__(self, key):
    520         try:
--> 521             result = self.index.get_value(self, key)
    522 
    523             if not np.isscalar(result):

/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/core/index.py in get_value(self, series, key)
   1593 
   1594         try:
-> 1595             return self._engine.get_value(s, k)
   1596         except KeyError as e1:
   1597             if len(self) > 0 and self.inferred_type in ['integer','boolean']:

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:3113)()

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:2844)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3761)()

KeyError: 'Sep. 18, 2015'

Where am I going wrong? Any help with this would be greatly appreciated :)


2 Answers

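Your loop variable d holds the values of the column, not its index labels, so table.date[d] looks up the label 'Sep. 18, 2015'. The Series has no such label (its index is 0, 1, 2, ...), hence the KeyError. If you print the table.date column you will see this: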

In [19]: df.date
Out[19]: 
0    Sep. 18, 2015
1    Sep. 19, 2015
Name: date, dtype: object
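To make the cause concrete, here is a small sketch on mock data (the two-row `table` below is an assumption standing in for the asker's DataFrame). It shows that iterating over a Series yields values rather than labels, reproduces the KeyError, and shows one way a loop could be written if you really want one. It assumes an English locale so that `%b` matches "Sep".

```python
import time
import pandas as pd

# Mock data standing in for the asker's `table` (an assumption).
table = pd.DataFrame({"date": ["Sep. 18, 2015", "Sep. 19, 2015"]})

# Iterating over a Series yields its values, not its index labels.
values = [d for d in table["date"]]
labels = list(table.index)

# Using a value as if it were a label is what raises the KeyError.
try:
    table["date"]["Sep. 18, 2015"]
    raised = False
except KeyError:
    raised = True

# A loop that works: iterate over (label, value) pairs and parse each value.
parsed = []
for label, text in table["date"].items():
    parsed.append(time.strptime(text, "%b. %d, %Y"))

print(values, labels, raised)
```

The loop collects the results in a plain list rather than writing back into the Series cell by cell, which sidesteps the label lookup entirely.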

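You should normally use the apply() method for this. apply() takes a function as its argument and applies it to every value of the column you call it on: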

import time

# Create the function that transforms the string.
to_time = lambda x: time.strptime(x, '%b. %d, %Y')

# Pass it to apply for the column "date".
table["date"] = table["date"].apply(to_time)

For mock data, the result is:

Out[17]: 
0    (2015, 9, 18, 0, 0, 0, 4, 261, -1)
1    (2015, 9, 19, 0, 0, 0, 5, 262, -1)
Name: date, dtype: object
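The tuples in that output are what time.strptime returns (a time.struct_time), which is why the column stays dtype object. The sketch below, on assumed mock data, reads the fields by position and also shows a variant that swaps in datetime.strptime to get datetime objects instead. That swap is my suggestion, not part of the answer above.

```python
import time
from datetime import datetime
import pandas as pd

FMT = "%b. %d, %Y"  # matches strings like "Sep. 18, 2015" (English locale assumed)

# Mock data standing in for the asker's `table` (an assumption).
table = pd.DataFrame({"date": ["Sep. 18, 2015", "Sep. 19, 2015"]})

# Same approach as the answer: each cell becomes a struct_time tuple.
as_struct = table["date"].apply(lambda s: time.strptime(s, FMT))
first = as_struct.iloc[0]
year, month, day = first[0], first[1], first[2]
weekday = first[6]      # tm_wday: Monday is 0, so Friday is 4
day_of_year = first[7]  # tm_yday: 1-based day of the year

# Variant: datetime.strptime yields datetime objects, which are usually
# easier to work with than struct_time tuples.
as_datetime = table["date"].apply(lambda s: datetime.strptime(s, FMT))

print(year, month, day, weekday, day_of_year)
print(as_datetime.iloc[0])
```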

A simpler option is to use to_datetime().

df['date'] = pd.to_datetime(df['date'])

This gives you:

df['date']
0   2015-09-18
1   2015-09-19
2   2015-09-20
3   2015-09-21
4   2015-09-22
Name: date, dtype: datetime64[ns]
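If you would rather not rely on pandas guessing the layout, to_datetime() also accepts an explicit format string. The following is a sketch on assumed mock data; the "not a date" row and the `parsed` and `weekday` column names are made up for illustration. With errors="coerce", unparseable rows become NaT instead of raising.

```python
import pandas as pd

# Mock data (an assumption), including one deliberately bad row.
df = pd.DataFrame({"date": ["Sep. 18, 2015", "Sep. 19, 2015", "not a date"]})

# Parse with an explicit format; rows that do not match become NaT.
df["parsed"] = pd.to_datetime(df["date"], format="%b. %d, %Y", errors="coerce")

# Once the column is datetime-typed, the .dt accessor is available.
df["weekday"] = df["parsed"].dt.day_name()

print(df)
```

Spelling out the format means a stray row in a different layout shows up as NaT rather than being silently parsed under another interpretation.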
