Rolling PCA on a pandas DataFrame

1 answer

User

#1 · Posted 2024-06-02 05:06:15

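Unfortunately, pandas.DataFrame.rolling() does not, by default, pass a two-dimensional window of rows to the applied function: apply() processes each column separately and hands over a one-dimensional window of values. It therefore cannot be used, as one might expect, to roll over the rows of df and pass each window of rows to PCA.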
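To see this behavior directly, the following minimal sketch records the shape of every window that rolling().apply() hands to the callback. The toy DataFrame, the column names and the helper `record` are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Toy frame with 6 rows and 2 columns (illustrative data only)
df_demo = pd.DataFrame(np.arange(12, dtype=float).reshape(6, 2),
                       columns=["a", "b"])

seen_shapes = []

def record(window_values):
    # With raw=True the callback receives a NumPy array
    seen_shapes.append(window_values.shape)
    return 0.0  # apply() needs a scalar back

df_demo.rolling(3).apply(record, raw=True)

# Every recorded shape has a single dimension: the callback never
# sees a (rows x columns) block, only one column at a time.
print(set(seen_shapes))
```

Because each call only ever sees one column, a multivariate method such as PCA has nothing to work with inside the callback, which is what motivates rolling over indices instead.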

Below is a workaround that rolls over row indices instead of the rows themselves. It may not be very elegant, but it works:

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Generate some data (1000 time points, 10 features)
data = np.random.random(size=(1000, 10))
df = pd.DataFrame(data)

# Set the window size
window = 100

# Initialize an empty df of appropriate size for the output
df_pca = pd.DataFrame(np.zeros((data.shape[0] - window + 1, data.shape[1])))

# Define PCA fit-transform function
# Note: Instead of attempting to return the result,
#       it is written into the previously created output array.
def rolling_pca(window_idx):
    # rolling() hands the indices over as floats, so cast them back to int
    idx = np.asarray(window_idx).astype(int)
    pca = PCA()
    transf = pca.fit_transform(df.iloc[idx])
    # Store the transformed first row of the window at the window's start position
    df_pca.iloc[idx[0]] = transf[0, :]
    return 0.0  # apply() requires a scalar return value; it is discarded

# Create a df containing row indices for the workaround
df_idx = pd.DataFrame(np.arange(df.shape[0]))

# Use `rolling` to apply the PCA function
# (raw=True passes each window as a NumPy array)
_ = df_idx.rolling(window).apply(rolling_pca, raw=True)

# The results are now contained here:
print(df_pca)

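A quick check shows that this produces the same values as control values obtained by manually slicing the corresponding windows and running PCA on each of them.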
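The control computation is not shown in the answer; one way it might look is a plain loop over window start positions. This is a sketch under assumptions: the function name `rolling_pca_loop`, the seeded generator and the smaller data size are illustrative choices, and, like the workaround, it keeps only the transformed first row of each window.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Smaller illustrative data set (200 time points, 5 features)
rng = np.random.default_rng(0)
df_small = pd.DataFrame(rng.random(size=(200, 5)))
window_small = 20

def rolling_pca_loop(frame, window):
    """Fit PCA on each window of rows, sliced manually, and keep the
    transformed first row of every window."""
    n_out = frame.shape[0] - window + 1
    out = np.zeros((n_out, frame.shape[1]))
    for start in range(n_out):
        block = frame.iloc[start:start + window]
        out[start] = PCA().fit_transform(block)[0, :]
    return pd.DataFrame(out)

control = rolling_pca_loop(df_small, window_small)
print(control.shape)
```

Running the same function on the answer's df and window and comparing the result with df_pca via np.allclose would reproduce the check described above. The explicit loop is also a reasonable alternative to the rolling() workaround in its own right, since it avoids writing into a global DataFrame from inside a callback.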
