Compute the mean of the 15 records before the current record as a new column

                     bidopen  bidhigh  bidlow  bidclose  bidvolume
currencypair
2007-03-30 16:01:00   1.9687  1.96900  1.9686    1.9686     877.40
2007-03-30 16:02:00   1.9686  1.96905  1.9686    1.9686     897.20
2007-03-30 16:03:00   1.9686  1.96900  1.9686    1.9690    1076.11
2007-03-30 16:04:00   1.9689  1.96910  1.9688    1.9690     849.70
2007-03-30 16:05:00   1.9690  1.96900  1.9688    1.9689    1402.80

currencypair,datetime,bidopen,bidhigh,bidlow,bidclose,askopen,askhigh,asklow,askclose,bidvolume,askvolume
GBPUSD,2007-03-30 16:01:00,1.96870,1.96900,1.96860,1.96860,1.96850,1.96880,1.96845,1.96850,877.40,1386.70
GBPUSD,2007-03-30 16:02:00,1.96860,1.96905,1.96860,1.96860,1.96850,1.96890,1.96840,1.96840,897.20,1272.30
GBPUSD,2007-03-30 16:03:00,1.96860,1.96900,1.96860,1.96900,1.96850,1.96890,1.96840,1.96880,1076.11,1333.30
GBPUSD,2007-03-30 16:04:00,1.96890,1.96910,1.96880,1.96900,1.96880,1.96890,1.96865,1.96880,849.70,765.10
GBPUSD,2007-03-30 16:05:00,1.96900,1.96900,1.96880,1.96890,1.96875,1.96890,1.96860,1.96870,1402.80,1240.90
GBPUSD,2007-03-30 16:06:00,1.96890,1.96890,1.96840,1.96860,1.96870,1.96870,1.96820,1.96850,769.50,1727.30
GBPUSD,2007-03-30 16:07:00,1.96860,1.96880,1.96820,1.96830,1.96850,1.96870,1.96810,1.96820,842.00,1865.60
GBPUSD,2007-03-30 16:08:00,1.96830,1.96930,1.96830,1.96910,1.96820,1.96920,1.96820,1.96890,1096.60,1197.70
GBPUSD,2007-03-30 16:09:00,1.96910,1.96920,1.96880,1.96890,1.96895,1.96910,1.96865,1.96880,368.60,432.10

<class 'pandas.core.frame.DataFrame'>
Index: 2362159 entries, 2007-03-30 16:01:00 to 2013-09-02 18:59:00
Data columns (total 5 columns):
bidopen      2362159  non-null values
bidhigh      2362159  non-null values
bidlow       2362159  non-null values
bidclose     2362159  non-null values
bidvolume    2362159  non-null values
dtypes: float64(5)

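import pandas as pd

# `path` is the location of the CSV file shown above
usecols = ['currencypair', 'datetime', 'bidopen', 'bidhigh', 'bidlow', 'bidclose', 'bidvolume']
# Parse the datetime column and select it as the index by name
df = pd.read_csv(path, parse_dates=['datetime'], index_col='datetime', usecols=usecols)
df = df.drop(columns='currencypair')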

                     bidopen  bidhigh  bidlow  bidclose  bidvolume
datetime
2007-03-30 16:01:00   1.9687  1.96900  1.9686    1.9686     877.40
2007-03-30 16:02:00   1.9686  1.96905  1.9686    1.9686     897.20
2007-03-30 16:03:00   1.9686  1.96900  1.9686    1.9690    1076.11
2007-03-30 16:04:00   1.9689  1.96910  1.9688    1.9690     849.70
2007-03-30 16:05:00   1.9690  1.96900  1.9688    1.9689    1402.80

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2362159 entries, 2007-03-30 16:01:00 to 2013-09-02 18:59:00
Data columns (total 5 columns):
bidopen      2362159  non-null values
bidhigh      2362159  non-null values
bidlow       2362159  non-null values
bidclose     2362159  non-null values
bidvolume    2362159  non-null values
dtypes: float64(5)

1 answer

Anonymous user

Reply #1 · posted 2024-05-14 14:04:50

This is pretty straightforward when you only need a few columns, e.g. the max of a and the min of b.

In [65]: df = DataFrame(randn(100,4),columns=list('abcd'),
        index=date_range('20130101 16:00',periods=100,freq='T'))

In [66]: df.head(20)
Out[66]: 
                            a         b         c         d
2013-01-01 16:00:00  0.404056  0.115774 -0.202356  0.998315
2013-01-01 16:01:00 -0.231966  0.262609  1.192302 -0.702163
2013-01-01 16:02:00 -0.467005  0.744724 -0.871782 -0.308637
2013-01-01 16:03:00 -0.175704  0.036244  1.404604 -0.106320
2013-01-01 16:04:00  0.046306 -0.098140  0.535573 -0.306300
2013-01-01 16:05:00 -0.115620 -1.069991  0.790965 -0.504283
2013-01-01 16:06:00  1.496555  0.373582  1.028092 -0.816990
2013-01-01 16:07:00  0.432081  0.182106  0.115107  1.239192
2013-01-01 16:08:00 -0.245789 -2.030840  0.118330 -1.922616
2013-01-01 16:09:00 -0.358188 -0.121750  1.768505 -2.096908
2013-01-01 16:10:00 -1.634722 -0.808355 -0.773417  0.095078
2013-01-01 16:11:00 -0.396295  0.168568 -0.901945 -0.073811
2013-01-01 16:12:00 -1.364391  2.052481 -0.175291  0.927363
2013-01-01 16:13:00 -0.523331  0.042475  0.361593 -2.239468
2013-01-01 16:14:00  1.573967 -0.709043  0.551812  0.452311
2013-01-01 16:15:00  0.180578  0.846856 -2.304107 -1.283507
2013-01-01 16:16:00  0.065386  0.356015 -0.174369  1.167562
2013-01-01 16:17:00 -1.747416  1.279114  0.559075  0.200927
2013-01-01 16:18:00 -2.041764 -0.085398  2.032789  0.195671
2013-01-01 16:19:00 -0.639329  0.268832  0.394621 -0.271260

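The rolling functions compute over a window that ends at the current point, so we apply a time shift (which only changes the index) to align each value with the start of its window rather than the end.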

[Code block missing in the source; it created the max_a and min_b columns used below.]
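The rolling-then-shift step described above can be sketched as follows. This is only an illustration under stated assumptions, not the answerer's original code: it uses the current `Series.rolling` API, a row shift instead of an index time shift, a window of 15 rows, and the hypothetical column names `range_ab` and `mean_prev_a` (the latter covers the case in the question title, the mean of the 15 rows before the current one).

```python
import numpy as np
import pandas as pd

# Synthetic minute data, analogous to the example frame above
rng = np.random.default_rng(0)
idx = pd.date_range("2013-01-01 16:00", periods=100, freq="min")
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("abcd"), index=idx)

window = 15  # rows per window (assumption: one row per minute, no gaps)

# rolling() labels each result with the LAST row of its window ...
max_a_end = df["a"].rolling(window).max()
min_b_end = df["b"].rolling(window).min()

# ... so shifting by -(window - 1) rows relabels it with the FIRST row:
# row t now holds the statistic over rows t .. t+14.
df["max_a"] = max_a_end.shift(-(window - 1))
df["min_b"] = min_b_end.shift(-(window - 1))
df["range_ab"] = df["max_a"] - df["min_b"]  # hypothetical column name

# Mean of the 15 rows strictly BEFORE the current row (hypothetical name):
# the window ending at t-1 is moved down one row to land on t.
df["mean_prev_a"] = df["a"].rolling(window).mean().shift(1)
```

Shifting by `-window` instead of `-(window - 1)` would exclude the current row from the forward-looking window. Which of the two is wanted depends on the use case.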

The high-low difference is then just

df['max_a'] - df['min_b']

It looks like your series has gaps; in that case use asfreq:

In [16]: df = DataFrame(randn(10,2),columns=list('ab'),index=date_range('20130101 9:00',freq='T',periods=10))

In [17]: df
Out[17]: 
                            a         b
2013-01-01 09:00:00  0.516518 -1.497564
2013-01-01 09:01:00  1.747399  1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00  0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675  1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00  1.626510 -0.596204
2013-01-01 09:08:00 -0.767588  0.496110
2013-01-01 09:09:00 -0.014556  0.012049

In [18]: df.index
Out[18]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:09:00]
Length: 10, Freq: T, Timezone: None

In [19]: df.append(Series(name=[Timestamp('20130101 09:15')]))
Out[19]: 
                            a         b
2013-01-01 09:00:00  0.516518 -1.497564
2013-01-01 09:01:00  1.747399  1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00  0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675  1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00  1.626510 -0.596204
2013-01-01 09:08:00 -0.767588  0.496110
2013-01-01 09:09:00 -0.014556  0.012049
2013-01-01 09:15:00       NaN       NaN

In [20]: df.append(Series(name=[Timestamp('20130101 09:15')])).index
Out[20]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:15:00]
Length: 11, Freq: None, Timezone: None

In [21]: df.append(Series(name=[Timestamp('20130101 09:15')])).asfreq('T')
Out[21]: 
                            a         b
2013-01-01 09:00:00  0.516518 -1.497564
2013-01-01 09:01:00  1.747399  1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00  0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675  1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00  1.626510 -0.596204
2013-01-01 09:08:00 -0.767588  0.496110
2013-01-01 09:09:00 -0.014556  0.012049
2013-01-01 09:10:00       NaN       NaN
2013-01-01 09:11:00       NaN       NaN
2013-01-01 09:12:00       NaN       NaN
2013-01-01 09:13:00       NaN       NaN
2013-01-01 09:14:00       NaN       NaN
2013-01-01 09:15:00       NaN       NaN

In [22]: df.append(Series(name=[Timestamp('20130101 09:15')])).asfreq('T').index
Out[22]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:15:00]
Length: 16, Freq: T, Timezone: None
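The session above relies on `DataFrame.append`, which was removed in pandas 2.0. A rough equivalent with `pd.concat` is sketched below. The data is synthetic and the names `gap_row`, `with_gap` and `regular` are made up for illustration; unlike the session above, the extra row carries real values rather than NaN.

```python
import numpy as np
import pandas as pd

# Ten consecutive one-minute rows
rng = np.random.default_rng(0)
idx = pd.date_range("2013-01-01 09:00", periods=10, freq="min")
df = pd.DataFrame(rng.standard_normal((10, 2)), columns=list("ab"), index=idx)

# One more row at 09:15 leaves 09:10 .. 09:14 missing
gap_row = pd.DataFrame(
    {"a": [0.5], "b": [-0.5]},
    index=pd.DatetimeIndex(["2013-01-01 09:15"]),
)
with_gap = pd.concat([df, gap_row])

# asfreq reindexes onto a regular one-minute grid; the missing
# minutes appear as all-NaN rows
regular = with_gap.asfreq("min")
```

Once the index is regular, a fixed number of rows corresponds to a fixed span of time, so a 15-row rolling window covers 15 minutes. By default, windows that touch the NaN rows produce NaN.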
