Converting the time zone of a pandas DataFrame
I have some data:
                             Symbol      bid      ask
Timestamp
2014-01-01 21:55:34.378000  EUR/USD  1.37622  1.37693
2014-01-01 21:55:40.410000  EUR/USD  1.37624  1.37698
2014-01-01 21:55:47.210000  EUR/USD  1.37619  1.37696
2014-01-01 21:55:57.963000  EUR/USD  1.37616  1.37696
2014-01-01 21:56:03.117000  EUR/USD  1.37616  1.37694
These timestamps are recorded in Greenwich Mean Time (GMT). Is there a way to convert them to Eastern time?
Note that when I run:
data.index
I get:
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-01-01 21:55:34.378000, ..., 2014-01-01 21:56:03.117000]
Length: 5, Freq: None, Timezone: None
4 Answers
0
This worked for me:
# Import pandas
import pandas as pd
# Import pytz
import pytz
# If the column is not the index
# Assuming the column is in UTC or GMT times
# Convert to your desired time zone and remove the time zone information after conversion
df_1['time'] = pd.to_datetime(df_1['time'], unit='s', utc=True).dt.tz_convert('Europe/Paris').dt.tz_localize(None)
# To create a separate column with only the date values and without time
df_1['date'] = pd.to_datetime(df_1['time']).dt.normalize()
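The snippet above passes unit='s', which suggests the 'time' column holds Unix epoch seconds. Here is a minimal sketch under that assumption; the sample values are made up for illustration and are not from the original post:

```python
import pandas as pd

# Hypothetical sample: epoch seconds for 2014-01-01 21:55:34 and 21:55:40 UTC
df_1 = pd.DataFrame({'time': [1388613334, 1388613340]})

# Parse as UTC, convert to Paris time, then drop the time zone information
df_1['time'] = (
    pd.to_datetime(df_1['time'], unit='s', utc=True)
    .dt.tz_convert('Europe/Paris')
    .dt.tz_localize(None)
)

# Date-only column: the time of day is reset to midnight
df_1['date'] = df_1['time'].dt.normalize()

print(df_1)
```

Calling tz_localize(None) after the conversion keeps the local wall-clock time but makes the column tz-naive again, so later code can no longer tell which zone the values are in.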
6
To convert from Eastern Standard Time (EST) to an Asian time zone:
df.index = df.index.tz_localize('EST')
df.index = df.index.tz_convert('Asia/Kolkata')
pandas now has built-in time zone conversion.
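One caveat worth checking before using 'EST': as far as I know, that zone name is a fixed UTC-5 offset with no daylight saving time, whereas 'America/New_York' follows the DST rules. A small sketch with a made-up summer timestamp shows the one-hour difference:

```python
import pandas as pd

# Hypothetical tz-naive timestamp in July, when New York observes daylight time
idx = pd.DatetimeIndex(['2014-07-01 12:00:00'])

# 'EST' is treated as a fixed UTC-5 offset all year round
fixed_utc = idx.tz_localize('EST').tz_convert('UTC')

# 'America/New_York' is UTC-4 in summer
dst_utc = idx.tz_localize('America/New_York').tz_convert('UTC')

print(fixed_utc[0])  # 17:00 UTC
print(dst_utc[0])    # 16:00 UTC
```

If the source timestamps are US Eastern wall-clock times, 'America/New_York' is probably the safer choice for data that spans the summer months.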
49
The simplest way is to use to_datetime with utc=True:
df = pd.DataFrame({'Symbol': ['EUR/USD'] * 5,
'bid': [1.37622, 1.37624, 1.37619, 1.37616, 1.37616],
'ask': [1.37693, 1.37698, 1.37696, 1.37696, 1.37694]})
df.index = pd.to_datetime(['2014-01-01 21:55:34.378000',
'2014-01-01 21:55:40.410000',
'2014-01-01 21:55:47.210000',
'2014-01-01 21:55:57.963000',
'2014-01-01 21:56:03.117000'],
utc=True)
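If you want more flexibility, you can convert time zones with tz_convert(). If your column or index is not timezone-aware, tz_convert raises a TypeError, and you first need to attach a time zone with tz_localize.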
df = pd.DataFrame({'Symbol': ['EUR/USD'] * 5,
'bid': [1.37622, 1.37624, 1.37619, 1.37616, 1.37616],
'ask': [1.37693, 1.37698, 1.37696, 1.37696, 1.37694]})
df.index = pd.to_datetime(['2014-01-01 21:55:34.378000',
'2014-01-01 21:55:40.410000',
'2014-01-01 21:55:47.210000',
'2014-01-01 21:55:57.963000',
'2014-01-01 21:56:03.117000'])
df.index = df.index.tz_localize('GMT')
df.index = df.index.tz_convert('America/New_York')
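To see why the tz_localize step is needed, here is a short sketch of what happens when tz_convert is called on a tz-naive index. In the pandas versions I have used this raises a TypeError; the exact message may differ between releases:

```python
import pandas as pd

# A tz-naive index, like the one in the question
naive = pd.to_datetime(['2014-01-01 21:55:34.378000'])

error_message = None
try:
    # Converting without localizing first is rejected
    naive.tz_convert('America/New_York')
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Localize first (declare what zone the values are in), then convert
aware = naive.tz_localize('GMT').tz_convert('America/New_York')
print(aware[0])
```

tz_localize does not shift the clock values; it only declares which zone they belong to. tz_convert is the step that actually changes the displayed time.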
This also works for datetime columns, but you need the dt accessor when working on a column:
df['column'] = df['column'].dt.tz_convert('America/New_York')
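A self-contained sketch of the column case, assuming a hypothetical tz-naive column named 'column' whose values are in UTC:

```python
import pandas as pd

# Hypothetical frame with a tz-naive datetime column (values assumed to be UTC)
df = pd.DataFrame({
    'column': pd.to_datetime(['2014-01-01 21:55:34', '2014-01-01 21:56:03'])
})

# Localize to UTC first, then convert; both steps go through the dt accessor
df['column'] = (
    df['column']
    .dt.tz_localize('UTC')
    .dt.tz_convert('America/New_York')
)

print(df['column'])
```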
56
Localize the index to UTC so that the timestamps become timezone-aware, then convert to Eastern time with tz_convert:
import pytz
eastern = pytz.timezone('US/Eastern')
df.index = df.index.tz_localize(pytz.utc).tz_convert(eastern)
For example:
import pandas as pd
import pytz
index = pd.date_range('20140101 21:55', freq='15S', periods=5)
df = pd.DataFrame(1, index=index, columns=['X'])
print(df)
# X
# 2014-01-01 21:55:00 1
# 2014-01-01 21:55:15 1
# 2014-01-01 21:55:30 1
# 2014-01-01 21:55:45 1
# 2014-01-01 21:56:00 1
# [5 rows x 1 columns]
print(df.index)
# <class 'pandas.tseries.index.DatetimeIndex'>
# [2014-01-01 21:55:00, ..., 2014-01-01 21:56:00]
# Length: 5, Freq: 15S, Timezone: None
eastern = pytz.timezone('US/Eastern')
df.index = df.index.tz_localize(pytz.utc).tz_convert(eastern)
print(df)
# X
# 2014-01-01 16:55:00-05:00 1
# 2014-01-01 16:55:15-05:00 1
# 2014-01-01 16:55:30-05:00 1
# 2014-01-01 16:55:45-05:00 1
# 2014-01-01 16:56:00-05:00 1
# [5 rows x 1 columns]
print(df.index)
# <class 'pandas.tseries.index.DatetimeIndex'>
# [2014-01-01 16:55:00-05:00, ..., 2014-01-01 16:56:00-05:00]
# Length: 5, Freq: 15S, Timezone: US/Eastern