How to order rows with old/new value columns so that the i-th old value equals the (i-1)-th new value

import pandas as pd
import itertools

df = pd.DataFrame({'group': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'],
                   'date': [0, 1, 1, 1, 1, 2, 3, 4, 4],
                   'old': [1, 8, 2, 2, 5, 5, 4, 10, 7],
                   'new': [2, 5, 5, 8, 2, 4, 7, 1, 10]})
print(df)
### jumbled: the `new` value of a row is not the same as the next row's `old` value
#   group  date  old  new
# 0     a     0    1    2
# 1     a     1    8    5
# 2     a     1    2    5
# 3     a     1    2    8
# 4     a     1    5    2
# 5     a     2    5    4
# 6     b     3    4    7
# 7     b     4   10    1
# 8     b     4    7   10

df1 = df.copy()
df1 = (df1.groupby(['group'], as_index=False, sort=False)
          .apply(order_rows)
          .reset_index(drop=True))
print(df1)
### correct: the `old` value in each row equals the `new` value of the previous row
#   group  date  old  new
# 0     a     0    1    2
# 1     a     1    2    5
# 2     a     1    5    2
# 3     a     1    2    8
# 4     a     1    8    5
# 5     a     2    5    4
# 6     b     3    4    7
# 7     b     4    7   10
# 8     b     4   10    1

import networkx as nx
import numpy as np

df = pd.DataFrame({'group': ['a', 'a', 'a', 'a', 'a'],
                   'date': [1, 1, 1, 1, 1],
                   'old': [8, 1, 2, 2, 5],
                   'new': [5, 2, 5, 8, 2]})

g = nx.from_pandas_edgelist(df[['old', 'new']],
                            source='old',
                            target='new',
                            create_using=nx.DiGraph)

ordered = np.asarray(list(nx.algorithms.traversal.edge_dfs(g, df.old[0])))
ordered
# array([[8, 5],
#        [5, 2],
#        [2, 5],
#        [2, 8]])

1 Answer

User

#1 · Posted 2024-04-19 21:41:57

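This is a graph problem. You can build the graph with networkx and then work with the result in numpy. A simple traversal algorithm such as depth-first search will visit all the edges reachable from a source.

The source is simply your first node (i.e. df.old[0]).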
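To make the idea concrete before turning to networkx, here is a minimal, dependency-free sketch of an edge-based depth-first traversal. The helper name `edge_dfs_order` is made up for illustration and is not part of any library. Like a `DiGraph`, it treats duplicate (old, new) pairs as a single edge, and it only returns edges reachable from the source.

```python
from collections import defaultdict

def edge_dfs_order(pairs, source):
    """Return the (old, new) pairs in depth-first edge order, starting at `source`."""
    # Adjacency list: for each node, its outgoing edges in input order.
    out_edges = defaultdict(list)
    for old, new in pairs:
        out_edges[old].append((old, new))

    visited = set()                      # edges already emitted
    ordered = []
    stack = [iter(out_edges[source])]    # one edge iterator per node on the path
    while stack:
        edge = next(stack[-1], None)
        if edge is None:
            stack.pop()                  # no unused edge left here: backtrack
            continue
        if edge in visited:
            continue
        visited.add(edge)
        ordered.append(edge)
        stack.append(iter(out_edges[edge[1]]))  # continue from the edge's target
    return ordered

# The (old, new) pairs from the question's 9-row frame.
pairs = [(1, 2), (8, 5), (2, 5), (2, 8), (5, 2), (5, 4), (4, 7), (10, 1), (7, 10)]
ordered_pairs = edge_dfs_order(pairs, source=pairs[0][0])
print(ordered_pairs)
```

Each emitted edge starts where the previous one ended as long as the traversal does not have to backtrack, which is what produces the old_i = new_(i-1) chain for this data.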

With your example:

import networkx as nx
import numpy as np

g = nx.from_pandas_edgelist(df[['old', 'new']], 
                            source='old', 
                            target='new', 
                            create_using=nx.DiGraph)

ordered = np.asarray(list(nx.algorithms.traversal.edge_dfs(g, df.old[0])))

>>> ordered
array([[ 1,  2],
       [ 2,  5],
       [ 5,  2],
       [ 2,  8],
       [ 8,  5],
       [ 5,  4],
       [ 4,  7],
       [ 7, 10],
       [10,  1]])

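You can simply assign the result back to your DataFrame: df[['old', 'new']] = ordered. You may need to adjust some small details, for example if your groups are not connected to each other. But if you start from a df sorted on group and date, and the dependency old_i = new_(i-1) also holds across group boundaries, then reassigning the ordered array is all you need.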
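Before assigning back, it may be worth checking that the traversal really returned every row and that the chain holds. The sketch below is one way to do that with plain lists of pairs; `check_reordering` is a hypothetical helper, not a pandas or networkx function. With a numpy array, pass `ordered.tolist()`.

```python
from collections import Counter

def check_reordering(original, ordered):
    """Return a list of problems; an empty list means the result looks safe to assign back."""
    original = [tuple(p) for p in original]
    ordered = [tuple(p) for p in ordered]
    problems = []
    # Every original row must appear exactly as often in the result.
    if Counter(original) != Counter(ordered):
        problems.append("ordered is not a permutation of the original rows")
    # Each row's `old` must equal the previous row's `new`.
    for (_, prev_new), (next_old, _) in zip(ordered, ordered[1:]):
        if prev_new != next_old:
            problems.append(f"chain breaks: new={prev_new} followed by old={next_old}")
    return problems

# The 9-row example: traversal result is complete and chained.
full = [(1, 2), (8, 5), (2, 5), (2, 8), (5, 2), (5, 4), (4, 7), (10, 1), (7, 10)]
full_ordered = [(1, 2), (2, 5), (5, 2), (2, 8), (8, 5), (5, 4), (4, 7), (7, 10), (10, 1)]
ok_report = check_reordering(full, full_ordered)

# The 5-row example from the question: edge (1, 2) is unreachable from node 8,
# and the traversal backtracks, so the result is shorter and the chain breaks.
small = [(8, 5), (1, 2), (2, 5), (2, 8), (5, 2)]
small_ordered = [(8, 5), (5, 2), (2, 5), (2, 8)]
bad_report = check_reordering(small, small_ordered)
print(ok_report, bad_report)
```

If the report is not empty, a different source node or a per-group traversal is probably needed before the assignment.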

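Still, I think you should look into your timestamps. I believe this is a simple problem that can be solved by sorting on the timestamps. When reading or writing files, make sure the precision of the timestamps is not reduced.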
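As a rough illustration of why precision matters, the sketch below sorts some made-up records twice: once on the full timestamp and once after truncating to whole seconds. The rows and the timestamp format are assumptions for the example, not data from the question.

```python
from datetime import datetime

# Hypothetical records: (group, timestamp string, old, new), in jumbled order.
rows = [
    ("a", "2024-01-01 00:00:01.300", 5, 2),
    ("a", "2024-01-01 00:00:01.100", 2, 5),
    ("a", "2024-01-01 00:00:00.900", 1, 2),
    ("a", "2024-01-01 00:00:01.400", 2, 8),
]
fmt = "%Y-%m-%d %H:%M:%S.%f"

# Sorting on the full-precision timestamp recovers the chain.
by_full = sorted(rows, key=lambda r: (r[0], datetime.strptime(r[1], fmt)))

# Truncating to seconds creates ties, so the jumbled input order survives.
by_second = sorted(
    rows,
    key=lambda r: (r[0], datetime.strptime(r[1], fmt).replace(microsecond=0)),
)

print([r[2:] for r in by_full])
print([r[2:] for r in by_second])
```

With full precision each `old` follows the previous `new`; after truncation, three rows share the same second and keep their original jumbled order.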
