How to convert selected data to the same length (shape)

Posted 2024-04-28 14:42:21


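I am reading several .csv files into pandas DataFrames that all have the same shape. For some index labels, some of the values are zero. For each index label I want to select the values so that all three files end up with the same shape: wherever one file has a zero, set the value at the same position to zero in the other files, then delete the zeros so that the rows have the same length: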

import numpy as np
import pandas as pd

# read_csv already returns a DataFrame, so no pd.DataFrame(...) wrapper is needed
a = pd.read_csv("path_a", index_col=0)
b = pd.read_csv("path_b", index_col=0)
c = pd.read_csv("path_c", index_col=0)
print(a, "\n", b, "\n", c)
L = np.array(a.shape)
X = L[0]
d = a.index.values
a = np.array(a)
b = np.array(b)
c = np.array(c)
for i in range(0, X):
    xdata  = a[i]
    xdata1 = b[i]
    xdata2 = c[i]
    # propagate zeros: a zero in any one row becomes a zero in all three
    xdata  = np.where(xdata2 == 0, 0, xdata)
    xdata1 = np.where(xdata2 == 0, 0, xdata1)
    xdata1 = np.where(xdata == 0, 0, xdata1)
    xdata2 = np.where(xdata == 0, 0, xdata2)
    xdata  = np.where(xdata1 == 0, 0, xdata)
    xdata2 = np.where(xdata1 == 0, 0, xdata2)
    # find the zero positions and delete them from each row
    indexX  = np.argwhere(xdata == 0)
    index1X = np.argwhere(xdata1 == 0)
    index2X = np.argwhere(xdata2 == 0)
    xdata  = np.delete(xdata, indexX)
    xdata1 = np.delete(xdata1, index1X)
    xdata2 = np.delete(xdata2, index2X)
    print(d[i], "\n", xdata, "\n", xdata1, "\n", xdata2)
     1980  1985  1990  1995  2000  2005  2010
ISO3                                          
AFG    0.0   0.0   3.8   0.0   0.0   9.8   0.0
AGO    2.0   0.0   3.0   4.0   0.0   0.0   0.0
ALB    0.0   0.2   0.5   0.2   1.3   1.6   2.7
AND    0.0   0.0   0.0   0.0   0.0   0.0   0.0
ARE    0.7   0.8   0.9   1.7   2.3   2.7   3.0
ARG    3.1   6.7   5.3  15.1  17.2  18.2  18.7
ARM    0.4   0.5   0.5   0.5   0.4   1.2   1.3 
      1980  1985  1990  1995  2000  2005  2010
ISO3                                          
AFG    2.5   0.0   0.0   4.7   0.0   0.0   0.0
AGO   13.1  14.9  15.8  16.4  16.9  17.6  18.1
ALB    1.4   1.5   1.6   1.6   1.6   1.6   1.7
AND    0.2   0.2   0.2   0.2   0.1   0.4   0.6
ARE    0.0   0.0   0.0   0.0   0.0   0.0   0.0
ARG    1.8   1.8   1.7   1.8   1.8   1.9   1.9
ARM    1.8   1.8   1.7   0.0   1.8   1.9   1.5 
      1980  1985  1990  1995  2000  2005  2010
ISO3                                          
AFG    0.0   0.0   0.0   0.0   0.0   0.0   0.0
AGO    0.0   0.0   4.7   5.8   6.0   0.0   0.0
ALB    0.0   0.2   0.5   0.2   1.3   1.6   2.7
AND    1.4   1.8   2.3   3.7   0.0   0.0   5.4
ARE    0.7   0.8   0.9   1.7   2.3   2.7   3.0
ARG    3.1   6.7   5.3  15.1  17.2  18.2  18.7
ARM    0.4   0.5   0.5   0.5   0.4   1.2   1.3

AFG 
[] 
[] 
[]
AGO 
[ 3.  4.] 
[ 15.8  16.4] 
[ 4.7  5.8]
ALB 
[ 0.2  0.5  0.2  1.3  1.6  2.7] 
[ 1.5  1.6  1.6  1.6  1.6  1.7] 
[ 0.2  0.5  0.2  1.3  1.6  2.7]
AND 
[] 
[] 
[]
ARE 
[] 
[] 
[]
ARG 
[  3.1   6.7   5.3  15.1  17.2  18.2  18.7] 
[ 1.8  1.8  1.7  1.8  1.8  1.9  1.9] 
[  3.1   6.7   5.3  15.1  17.2  18.2  18.7]
ARM 
[ 0.4  0.5  0.5  0.4  1.2  1.3] 
[ 1.8  1.8  1.7  1.8  1.9  1.5] 
[ 0.4  0.5  0.5  0.4  1.2  1.3]

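This code works, but it is a trial-and-error approach and is inefficient when the data is large. Can you suggest a more efficient method, and show how to select the data according to the index with the minimum length?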


1 answer
User
#1 · Posted 2024-04-28 14:42:21

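One idea is to multiply all 3 arrays and then test where the product is not 0. You can also loop over the 3 arrays by putting them in the list L1. The logic changes as well: instead of finding the zeros with np.argwhere and removing them with np.delete, select the non-zero positions directly with the boolean mask.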

L = np.array(a.shape)
X = L[0]
d = a.index.values
a = np.array(a)
b = np.array(b)
c = np.array(c)
m = (a * b * c) != 0
L1 = [a,b,c]

for i in range (0,X):
    for arr in L1:
        xdata  = arr[i][m[i]]
        print (xdata)
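To make the mask idea easier to try without the CSV files, here is a minimal self-contained sketch. The arrays are hard-coded copies of the AGO and ARM rows from the question's tables (stand-ins for the real files), and the names `ago` and `arm` are only illustrative.

```python
import numpy as np

# Rows AGO and ARM copied from the question's three tables
# (hard-coded stand-ins for the CSV files).
a = np.array([[2.0, 0.0, 3.0, 4.0, 0.0, 0.0, 0.0],
              [0.4, 0.5, 0.5, 0.5, 0.4, 1.2, 1.3]])
b = np.array([[13.1, 14.9, 15.8, 16.4, 16.9, 17.6, 18.1],
              [1.8, 1.8, 1.7, 0.0, 1.8, 1.9, 1.5]])
c = np.array([[0.0, 0.0, 4.7, 5.8, 6.0, 0.0, 0.0],
              [0.4, 0.5, 0.5, 0.5, 0.4, 1.2, 1.3]])

# The product is zero wherever at least one of the three arrays is zero,
# so the mask is True only where all three values are non-zero.
m = (a * b * c) != 0

# Boolean indexing keeps the same positions in every array,
# so the three selected rows always have equal length.
ago = [arr[0][m[0]] for arr in (a, b, c)]
arm = [arr[1][m[1]] for arr in (a, b, c)]
print(ago)
print(arm)
```

Because the same mask row is applied to all three arrays, the six np.where calls and the argwhere/delete pairs from the question collapse into one comparison.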

If you use pandas 0.24+, a better way to convert to a NumPy array is to use to_numpy:

L = np.array(a.shape)
X = L[0]
d = a.index.to_numpy()
a = a.to_numpy()
b = b.to_numpy()
c = c.to_numpy()
m = (a * b * c) != 0
L1 = [a,b,c]

for i in range (0,X):
    for arr in L1:
        xdata  = arr[i][m[i]]
        print (xdata)
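One caveat with the product mask, offered as a side note rather than part of the original answer: if a CSV file has an empty cell, pandas reads it as NaN, the product becomes NaN, and `NaN != 0` evaluates to True, so that position would be kept. The sketch below assumes you also want to drop missing values and builds the mask from explicit comparisons instead. The miniature frames and the names `frames`, `valid`, and `product_m` are hypothetical.

```python
import numpy as np
import pandas as pd

cols = [1980, 1985, 1990]
idx = ["AGO", "ARM"]
# Hypothetical miniature frames standing in for the three CSV files;
# one cell in `a` is deliberately missing (NaN).
a = pd.DataFrame([[2.0, 0.0, 3.0], [0.4, np.nan, 0.5]], index=idx, columns=cols)
b = pd.DataFrame([[13.1, 14.9, 15.8], [1.8, 1.8, 0.0]], index=idx, columns=cols)
c = pd.DataFrame([[0.0, 0.0, 4.7], [0.4, 0.5, 0.5]], index=idx, columns=cols)

frames = [a, b, c]

# True only where a frame holds a value that is neither zero nor missing.
valid = [((f != 0) & f.notna()).to_numpy() for f in frames]
# Combine the per-frame masks: a position survives only if it is valid in all frames.
m = np.logical_and.reduce(valid)

# For comparison: the product mask keeps the NaN position.
product_m = ((a * b * c) != 0).to_numpy()

for i, label in enumerate(a.index):
    rows = [f.to_numpy()[i][m[i]] for f in frames]
    print(label, rows)
```

If the files never contain missing values, the two masks are identical and the product form is shorter.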

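Edit: to collect the three filtered rows for each index into a single 2-D array, stack them with np.vstack: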

L = np.array(a.shape)
X = L[0]
d = a.index.to_numpy()
a = a.to_numpy()
b = b.to_numpy()
c = c.to_numpy()
m = (a * b * c) != 0
L1 = [a,b,c]

for i in range (0,X):
    out = []
    for arr in L1:
        xdata  = arr[i][m[i]]
        out.append(xdata)
    data = np.vstack((out))
    print (data)

[]
[[ 3.   4. ]
 [15.8 16.4]
 [ 4.7  5.8]]
[[0.2 0.5 0.2 1.3 1.6 2.7]
 [1.5 1.6 1.6 1.6 1.6 1.7]
 [0.2 0.5 0.2 1.3 1.6 2.7]]
[]
[]
[[ 3.1  6.7  5.3 15.1 17.2 18.2 18.7]
 [ 1.8  1.8  1.7  1.8  1.8  1.9  1.9]
 [ 3.1  6.7  5.3 15.1 17.2 18.2 18.7]]
[[0.4 0.5 0.5 0.4 1.2 1.3]
 [1.8 1.8 1.7 1.8 1.9 1.5]
 [0.4 0.5 0.5 0.4 1.2 1.3]]
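As a possible follow-up, the sketch below stacks the three arrays once into a 3-D array and stores each non-empty result in a dictionary keyed by index label, so the results can be looked up later instead of only printed. The data is a hard-coded copy of the AFG, AGO, and ARM rows from the question, and the names `labels`, `stacked`, and `result` are illustrative. Skipping empty rows is an assumption about what is wanted.

```python
import numpy as np

# Rows AFG, AGO and ARM copied from the question's tables
# (hard-coded stand-ins for the CSV files).
labels = ["AFG", "AGO", "ARM"]
a = np.array([[0.0, 0.0, 3.8, 0.0, 0.0, 9.8, 0.0],
              [2.0, 0.0, 3.0, 4.0, 0.0, 0.0, 0.0],
              [0.4, 0.5, 0.5, 0.5, 0.4, 1.2, 1.3]])
b = np.array([[2.5, 0.0, 0.0, 4.7, 0.0, 0.0, 0.0],
              [13.1, 14.9, 15.8, 16.4, 16.9, 17.6, 18.1],
              [1.8, 1.8, 1.7, 0.0, 1.8, 1.9, 1.5]])
c = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 4.7, 5.8, 6.0, 0.0, 0.0],
              [0.4, 0.5, 0.5, 0.5, 0.4, 1.2, 1.3]])

# Shape (3, rows, cols): one layer per source file.
stacked = np.stack([a, b, c])
# True where all three layers are non-zero at that (row, col) position.
m = (stacked != 0).all(axis=0)

# For each label, take that row from every layer (shape (3, cols)) and
# keep only the masked columns. Rows with no surviving column are skipped.
result = {
    label: stacked[:, i][:, m[i]]
    for i, label in enumerate(labels)
    if m[i].any()
}

for label, block in result.items():
    print(label)
    print(block)
```

Each value in `result` has one row per source file, which matches the layout that np.vstack produces in the answer.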
