Pandas DataFrame "ValueError: cannot reindex from a duplicate axis": is there a brute-force solution for duplicate indexes?

Posted 2024-06-01 04:25:21

Male | A programmer who enjoys writing Python code.

import pandas as pd

df_avocado = pd.read_csv("avocado.csv")
df_avocado.set_index("Date", inplace=True)

The problem is here:

'''
determines all unique regions (ex: "Alabama", "Alaska", "Arkansas") in dataframe "df_avocado"
finds all data-points belonging to that unique region
dumps those data-points into a temporary dataframe "df_region"
calculates the 25sma of every df_region
dumps the 25sma to "df_avocado_region_25ma" so I can compare 25sma of every region
'''

df_avocado_region_25ma = pd.DataFrame()
for region in df_avocado["region"].unique():
    df_region = df_avocado.copy()[df_avocado["region"] == region]
    df_avocado_region_25ma[f"{region}_25ma"] = df_region["AveragePrice"].rolling(25).mean()

Jupyter raises "ValueError: cannot reindex from a duplicate axis" when each df_region is added to df_avocado_region_25ma.

I looked up what the ValueError means. Quoting "What does `ValueError: cannot reindex from a duplicate axis` mean?": "This error usually occurs when you join/assign to a column while the index has duplicate values."
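To make the quoted explanation concrete, here is a minimal sketch that reproduces the same kind of failure without the avocado data. The names `frame` and `dup` and the labels "d1" to "d3" are made up for illustration. The exact wording of the message depends on the pandas version (newer releases say "cannot reindex on an axis with duplicate labels"), so the sketch only captures it rather than assuming one form.

```python
import pandas as pd

# Hypothetical toy data: the target frame has a unique index...
frame = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=["d1", "d2", "d3"])
# ...while the Series being assigned has a repeated label ("d1").
dup = pd.Series([10.0, 11.0, 12.0], index=["d1", "d1", "d2"])

raised = False
message = ""
try:
    # Column assignment aligns `dup` to frame.index, which means reindexing
    # a Series whose index contains duplicates. pandas refuses to do that.
    frame["b"] = dup
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```

pandas cannot decide which of the two "d1" values belongs in the single "d1" row of the target, so it raises instead of guessing.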

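That makes sense, because the "Date" column (which I set as the index) has many overlapping values. However, I don't mind having duplicate index values (they give a high/low for the 25sma), and I don't want to overwrite earlier entries (I'd rather keep every data point). Is there a way to force it and add all of the points anyway?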

Data: www.kaggle.com/neuromics/avocado-prices

import pandas as pd

df_avocado = pd.read_csv("avocado.csv")
wanted_columns = ["Date", "AveragePrice", "region"]
df_avocado = df_avocado[wanted_columns]
df_avocado["Date"] = pd.to_datetime(df_avocado["Date"])
df_avocado.set_index("Date", inplace=True)
df_avocado.sort_index(inplace=True)

df_avocado_region_25ma = pd.DataFrame()
for region in df_avocado["region"].unique():
    df_region = df_avocado.copy()[df_avocado["region"] == region]
    df_avocado_region_25ma[f"{region}_25ma"] = df_region["AveragePrice"].rolling(25).mean()
df_avocado_region_25ma.plot()
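One possible way around the error, sketched under assumptions: keep "Date" as an ordinary column so the index stays unique, and compute the rolling mean per region with `groupby(...).transform(...)`. This keeps every row and never needs to align on duplicate dates. The toy frame `df_demo`, its region names and prices, the `price_ma` column, and `WINDOW = 2` are all made up for illustration (the question uses a window of 25 on avocado.csv); this is not necessarily what the asker ended up doing.

```python
import pandas as pd

# Hypothetical stand-in for avocado.csv: two regions, and every date appears
# twice per region, so using "Date" as the index would create duplicates.
dates = pd.to_datetime(["2018-01-07", "2018-01-07", "2018-01-14", "2018-01-14"])
df_demo = pd.DataFrame(
    {
        "Date": list(dates) * 2,
        "AveragePrice": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
        "region": ["Albany"] * 4 + ["Boston"] * 4,
    }
)

WINDOW = 2  # the question uses 25; 2 keeps the toy data readable

# Sort by region and date, then keep a unique RangeIndex instead of
# setting "Date" as the index.
df_demo = df_demo.sort_values(["region", "Date"]).reset_index(drop=True)

# Rolling mean within each region; transform returns one value per
# original row, so no data point is dropped or overwritten.
df_demo["price_ma"] = df_demo.groupby("region")["AveragePrice"].transform(
    lambda prices: prices.rolling(WINDOW).mean()
)

print(df_demo)
```

The result is in long format (one row per data point, with the region in its own column). To draw one line per region, you could loop over `df_demo.groupby("region")` and plot each group's "Date" against "price_ma".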


