数据帧中的移位是什么意思?

2024-04-24 03:44:11 发布

您现在位置:Python中文网/ 问答频道 /正文

我陷入了以下困境

import quandl,math
import pandas as pd
import numpy as np
from  sklearn import preprocessing ,cross_validation , svm
from sklearn.linear_model import  LinearRegression


df = quandl.get('WIKI/GOOGL')




df = df[['Adj. Open','Adj. High','Adj. Low','Adj. Close','Adj. Volume']]

df['HL_PCT'] = (df["Adj. High"] - df['Adj. Close'])/df['Adj. Close'] * 100
df['PCT_CHANGE'] = (df["Adj. Close"] - df['Adj. Open'])/df['Adj. Open'] * 100

df = df[['Adj. Close','HL_PCT','PCT_CHANGE','Adj. Open']]

forecast_col = 'Adj. Close'

df.fillna(-99999,inplace = True)

forecast_out = int(math.ceil(.1*len(df)))

df['label'] = df[forecast_col].shift(-forecast_out)
print df.head()

我不明白df[forecast-col].shift(-forecast-out)是什么意思

请解释一下命令是什么??


Tags: fromimportdfcloseascolmathsklearn
1条回答
网友
1楼 · 发布于 2024-04-24 03:44:11

pandas的Shift函数。Dataframe使用可选的时间频率按所需的周期数移动索引。有关Shift函数的更多信息,请参阅link

下面是正在移位的列值的小示例:

import pandas as pd 
import numpy as np
df = pd.DataFrame({"date": ["2000-01-03", "2000-01-03", "2000-03-05", "2000-01-03", "2000-03-05",
                        "2000-03-05", "2000-07-03", "2000-01-03", "2000-07-03", "2000-07-03"],
               "variable": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D"],
               "no": [1, 2.2, 3.5, 1.5, 1.5, 1.2, 1.3, 1.1, 2, 3],
               "value": [0.469112, -0.282863, -1.509059, -1.135632, 1.212112, -0.173215,
                         0.119209, -1.044236, -0.861849, None]})

下面是列在移动之前的值

df['value']

输出

0    0.469112
1   -0.282863
2   -1.509059
3   -1.135632
4    1.212112
5   -0.173215
6    0.119209
7   -1.044236
8   -0.861849
9         NaN

使用移位函数值根据给定的周期进行移位

例如,使用带正整数的shift将行值下移:

df['value'].shift(1)

输出

0         NaN
1    0.469112
2   -0.282863
3   -1.509059
4   -1.135632
5    1.212112
6   -0.173215
7    0.119209
8   -1.044236
9   -0.861849
Name: value, dtype: float64

使用带负整数的shift将行值上移:

df['value'].shift(-1)

输出

0   -0.282863
1   -1.509059
2   -1.135632
3    1.212112
4   -0.173215
5    0.119209
6   -1.044236
7   -0.861849
8         NaN
9         NaN
Name: value, dtype: float64

相关问题 更多 >