Mysterious conversion in a pandas DataFrame: how can I disable it?

Posted 2024-06-16 10:22:42

I have code like this:

        series_tmp = pd.Series()
        series_tmp["date"] = pd.Timestamp(str(msg.date))
        series_tmp["Timestamp_downloaded"] = pd.Timestamp.now(self._timezone_to_use)
        series_tmp["contract_str"] = contract_str_now
        series_tmp["open"] = float(msg.open)
        series_tmp["high"] = float(msg.high)
        series_tmp["low"] = float(msg.low)
        series_tmp["close"] = float(msg.close)
        series_tmp["volume"] = float(msg.volume)
        series_tmp["count"] = float(msg.count)
        series_tmp["WAP"] = float(msg.WAP)
        series_tmp["hasGaps"] = float(msg.hasGaps)
        # print series_tmp
        self._df_to_record_historical_data = self._df_to_record_historical_data.append(pd.Series(series_tmp),ignore_index=True)

In the end, for some reason, the date-typed values in the resulting DataFrame become meaningless, for example:

date                                              1517270400000000000
Timestamp_downloaded                              1518209472212471000
contract_str            ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)
open                                                           158.74
high                                                           159.18
low                                                            158.73
close                                                          158.91
volume                                                         797602
count                                                           57142
WAP                                                           158.989
hasGaps                                                             0

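The dates have turned into numbers. What is the simplest way to prevent this? Note that date is a Timestamp and Timestamp_downloaded is a Timestamp with a specified time zone.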

Edit: to make the question clearer:

First, doing this:

series_tmp = pd.Series()
series_tmp["date"] = pd.Timestamp("20180101")

gives:

date   2018-01-01
dtype: datetime64[ns]

Doing this:

series_tmp = pd.Series()
series_tmp["date"] = pd.Timestamp("20180101")
series_tmp["Timestamp_downloaded"] = pd.Timestamp.now(time_zone_to_use)

loses the time zone information (which I actually need to keep):

date                   2018-01-01 00:00:00.000000
Timestamp_downloaded   2018-02-09 21:31:58.566041
dtype: datetime64[ns]

Doing this:

series_tmp = pd.Series()
series_tmp["date"] = pd.Timestamp("20180101")
series_tmp["Timestamp_downloaded"] = pd.Timestamp.now(time_zone_to_use)
series_tmp["name"] = "Name1"

turns it into:

date                    1514764800000000000
Timestamp_downloaded    1518212057225521000
name                                  Name1
dtype: object

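Now the Series can no longer store pd.Timestamp values; it converts them to integers. This is a real problem for me, because in the end I get a DataFrame like this: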

Timestamp_downloaded    WAP close   contract_str    count   date    hasGaps high    low open    volume
0   1.518212e+18    159.0520    158.71  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   63672.0 1.517184e+18    0.0 159.64  158.66  159.59  957215.0
1   1.518212e+18    158.9895    158.91  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   57142.0 1.517270e+18    0.0 159.18  158.73  158.74  797602.0
2   1.518212e+18    159.0235    158.82  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   60825.0 1.517357e+18    0.0 159.33  158.50  158.96  878128.0
3   1.518212e+18    158.4750    158.60  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   70543.0 1.517443e+18    0.0 158.81  158.15  158.55  1012469.0
4   1.518212e+18    158.0410    157.87  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   67786.0 1.517530e+18    0.0 158.36  157.71  158.17  976233.0
5   1.518212e+18    158.2065    158.09  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   59744.0 1.517789e+18    0.0 159.30  157.62  157.67  825094.0
6   1.518212e+18    158.8200    158.86  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   107830.0    1.517875e+18    0.0 159.24  158.48  158.96  1222665.0
7   1.518212e+18    158.4925    158.23  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   67543.0 1.517962e+18    0.0 158.92  157.92  158.68  895965.0
8   1.518212e+18    157.7935    157.74  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   77263.0 1.518048e+18    0.0 158.23  157.26  157.92  1077249.0
9   1.518212e+18    158.0740    158.01  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   76398.0 1.518134e+18    0.0 158.65  157.70  158.05  866737.0

However, I need Timestamp_downloaded to be a time-zone-aware pd.Timestamp column and date to be a time-zone-naive column. I do have reasons for wanting this.
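For illustration, here is a minimal sketch of a row that holds both kinds of timestamp next to a string. It is not taken from the code above: it assumes that building the Series in one call with an explicit dtype=object is acceptable, and the zone name America/New_York is a placeholder for time_zone_to_use. With an explicit object dtype the values are stored as Python objects, so each element should come back as a pd.Timestamp.

```python
import pandas as pd

# Assumption: placeholder zone name standing in for time_zone_to_use.
time_zone_to_use = "America/New_York"

# Build the whole row at once with an explicit object dtype, so the values
# are stored as Python objects rather than coerced to a common numeric type.
series_tmp = pd.Series(
    {
        "date": pd.Timestamp("20180101"),                       # tz-naive
        "Timestamp_downloaded": pd.Timestamp.now(time_zone_to_use),  # tz-aware
        "name": "Name1",
    },
    dtype=object,
)

print(series_tmp)
print(type(series_tmp["date"]))
print(series_tmp["Timestamp_downloaded"].tz)
```

Whether this carries through to the final DataFrame depends on how the rows are combined afterwards, so it is worth checking the column dtypes of the result.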


1 answer
User
#1 · Posted 2024-06-16 10:22:42

I could not solve this at the Series level. But once I had the resulting DataFrame, I did the following:

historical_data_df["date"] = pd.to_datetime(historical_data_df["date"])

historical_data_df["Timestamp_downloaded"] = pd.to_datetime(historical_data_df["Timestamp_downloaded"]).dt.tz_localize(time_zone_to_use)

That solved the problem:

Timestamp_downloaded    WAP close   contract_str    count   date    hasGaps high    low open    volume
0   2018-02-09 21:53:21.974567936-05:00 159.0520    158.71  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   63672.0 2018-01-29  0.0 159.64  158.66  159.59  957215.0
1   2018-02-09 21:53:21.986366976-05:00 158.9895    158.91  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   57142.0 2018-01-30  0.0 159.18  158.73  158.74  797602.0
2   2018-02-09 21:53:21.998098944-05:00 159.0235    158.82  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   60825.0 2018-01-31  0.0 159.33  158.50  158.96  878128.0
3   2018-02-09 21:53:22.010740992-05:00 158.4750    158.60  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   70543.0 2018-02-01  0.0 158.81  158.15  158.55  1012469.0
4   2018-02-09 21:53:22.022957056-05:00 158.0410    157.87  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   67786.0 2018-02-02  0.0 158.36  157.71  158.17  976233.0
5   2018-02-09 21:53:22.033947904-05:00 158.2065    158.09  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   59744.0 2018-02-05  0.0 159.30  157.62  157.67  825094.0
6   2018-02-09 21:53:22.043888896-05:00 158.8200    158.86  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   107830.0    2018-02-06  0.0 159.24  158.48  158.96  1222665.0
7   2018-02-09 21:53:22.055529984-05:00 158.4925    158.23  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   67543.0 2018-02-07  0.0 158.92  157.92  158.68  895965.0
8   2018-02-09 21:53:22.067960064-05:00 157.7935    157.74  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   77263.0 2018-02-08  0.0 158.23  157.26  157.92  1077249.0
9   2018-02-09 21:53:22.078354944-05:00 158.0740    158.01  ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)   76398.0 2018-02-09  0.0 158.65  157.70  158.05  866737.0
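The conversion can be tried on a small frame. The sketch below is only an illustration under stated assumptions: the integers are taken to be nanoseconds since the Unix epoch in UTC (which is what pd.Timestamp.value holds), the sample values are copied from the question, and the zone name America/New_York stands in for time_zone_to_use. Under that assumption, pd.to_datetime(..., utc=True) followed by dt.tz_convert keeps the original instant, whereas dt.tz_localize reinterprets the UTC wall-clock time as local time. Which of the two is wanted depends on how the numbers were produced.

```python
import pandas as pd

# Assumption: placeholder zone name standing in for time_zone_to_use.
time_zone_to_use = "America/New_York"

# Small frame with timestamps stored as integer nanoseconds since the epoch,
# mimicking the columns that lost their datetime type.
historical_data_df = pd.DataFrame(
    {
        "date": [1517184000000000000, 1517270400000000000],
        "Timestamp_downloaded": [1518212057225521000, 1518212057225521000],
        "close": [158.71, 158.91],
    }
)

# tz-naive column: integers are read as nanoseconds since the epoch.
historical_data_df["date"] = pd.to_datetime(historical_data_df["date"])

# tz-aware column: read the integers as UTC instants, then convert the zone.
historical_data_df["Timestamp_downloaded"] = pd.to_datetime(
    historical_data_df["Timestamp_downloaded"], utc=True
).dt.tz_convert(time_zone_to_use)

print(historical_data_df)
print(historical_data_df.dtypes)
```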

More specifically:

Extracting a row from the converted DataFrame gives:

Timestamp_downloaded              2018-02-09 17:31:32.948958976-05:00
WAP                                                           158.074
close                                                          158.01
contract_str            ('GBL', 'FUT', 'DTB', 'EUR', '20180308', 0.0)
count                                                           76398
date                                              2018-02-09 00:00:00
hasGaps                                                             0
high                                                           158.65
low                                                             157.7
open                                                           158.05
volume                                                         866737
Name: 9, dtype: object

This is the result originally asked for in the question. If anyone knows a way to get there without this workaround, that would be great.
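One possible way to avoid the after-the-fact conversion, sketched here under assumptions rather than tested against the original program: collect each message as a plain dict in a list and build the DataFrame once at the end, so that pandas infers a dtype per column instead of per row. The names rows and time_zone_to_use, the sample values, and the zone America/New_York are placeholders for illustration.

```python
import pandas as pd

# Assumption: placeholder zone name standing in for time_zone_to_use.
time_zone_to_use = "America/New_York"

# Hypothetical buffer: one dict per received message instead of one Series.
rows = []
for day, close in [("2018-01-29", 158.71), ("2018-01-30", 158.91)]:
    rows.append(
        {
            "date": pd.Timestamp(day),                               # tz-naive
            "Timestamp_downloaded": pd.Timestamp.now(time_zone_to_use),  # tz-aware
            "contract_str": str(("GBL", "FUT", "DTB", "EUR", "20180308", 0.0)),
            "close": close,
        }
    )

# Build the frame in one step; each column gets its own inferred dtype.
historical_data_df = pd.DataFrame(rows)

print(historical_data_df.dtypes)
```

Building the frame once also avoids growing it row by row, which tends to be slow for long histories.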
