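I have a database from which I add new data on each iteration, and I am trying to combine that data on the datetime column.
I am using this part of the code: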
# Iterate by days
for row in rows:
    i += 1
    df_name = f"{row[0]}_{row[1]}"
    print(f"Getting {df_name} {i}/{len(rows)}{spaces}", end="\r")
    if predictionPoint == row[0]:
        currentDf = pd.read_sql(f"SELECT updated_at, c as '{df_name}_c', "
                                f"v as '{df_name}_v', o FROM commons "
                                f"WHERE cid LIKE '{predictionMeasure}%' AND s = '{row[0]}' AND cid = '{row[1]}' "
                                "ORDER BY updated_at DESC", con=sqlite)
    else:
        currentDf = pd.read_sql(f"SELECT updated_at, c as '{df_name}_c', "
                                f"v as '{df_name}_v' FROM commons "
                                f"WHERE cid LIKE '{predictionMeasure}%' AND s = '{row[0]}' AND cid = '{row[1]}' "
                                "ORDER BY updated_at DESC", con=sqlite)
    currentDf["updated_at"] = currentDf["updated_at"].apply(convertDatetime)
    if not df.empty:
        df = pd.merge(left=df, right=currentDf, on="updated_at", how="inner")
    else:
        df = currentDf
if not os.path.exists(f"{dirName}/{datasetFilename}"):
    df.to_csv(f"{dirName}/{datasetFilename}", encoding="utf-8", index=False)
else:
    tempDf = pd.read_csv(f"{dirName}/{datasetFilename}", parse_dates=["updated_at"])
    df = pd.concat([tempDf, df], axis=0, sort=False)
    df.to_csv(f"{dirName}/{datasetFilename}", encoding="utf-8", index=False)
print(f"Dataset created {a}/{len(archives)}{spaces}")
df = pd.read_csv(f"{dirName}/{datasetFilename}", parse_dates=["updated_at"])
df = df.set_index("updated_at", drop=False)
print("Sorting, filling N/A, cleaning...")
df = df.sort_index(ascending=False)
df = df.ffill().bfill()
I have a problem at line 27 of this code: it returns an unmerged frame with duplicated updated_at values, or the DataFrame gets cut off. What I want is:
updated_at one two three four
2019-06-02 23:59:45 1 2 3 4
2019-06-02 23:59:30 2 3 4 5
2019-06-02 23:59:15 3 4 5 6
2019-06-02 23:59:00 4 5 6 7
2019-06-02 23:58:45 5 6 7 8
That is, no duplicated updated_at values and no gaps, since the data is added by merging. I have already tried join and other types of merge.
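To make the desired result concrete, here is a minimal sketch of one way to combine two frames on updated_at so that each timestamp appears once and columns from both sides are kept. It is only an illustration under stated assumptions, not a confirmed fix: the helper name combine_on_timestamp and the sample frames are hypothetical, and timestamps are assumed to be unique within each frame. For context, pd.concat with axis=0 stacks rows, so a timestamp present in both frames shows up twice, and how="inner" in pd.merge keeps only timestamps present in both frames, which may be why the frame looks cut.

```python
import pandas as pd


def combine_on_timestamp(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Combine two frames that share an "updated_at" column (hypothetical helper).

    Assumes "updated_at" is unique within each frame. Rows with the same
    timestamp collapse into one row, columns found in only one frame are
    kept, and values from `new` win where both frames have data.
    """
    old_idx = old.set_index("updated_at")
    new_idx = new.set_index("updated_at")
    # combine_first aligns on index and columns (outer union) and fills
    # missing cells in new_idx with the values from old_idx.
    combined = new_idx.combine_first(old_idx)
    combined = combined.rename_axis("updated_at")
    # Newest rows first, then turn the index back into a regular column.
    return combined.sort_index(ascending=False).reset_index()


# Made-up sample data, loosely following the expected table in the question.
old = pd.DataFrame({
    "updated_at": pd.to_datetime(["2019-06-02 23:59:45", "2019-06-02 23:59:30"]),
    "one": [1, 2],
    "two": [2, 3],
})
new = pd.DataFrame({
    "updated_at": pd.to_datetime(["2019-06-02 23:59:30", "2019-06-02 23:59:15"]),
    "three": [4, 5],
    "four": [5, 6],
})

result = combine_on_timestamp(old, new)
print(result)
```

Cells with no value on either side stay NaN here, so a forward/backward fill like the one at the end of the code above would still be needed to close the gaps.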