将基于regex的选定列更改为percentag

2024-04-28 20:51:34 发布

您现在位置:Python中文网/ 问答频道 /正文

我有以下数据:

Shiver  Shiver - H2O -Multi Shiver-SHOWME -Multi    Shiver - STARWARS - Multi   Shiver - UFC - Multi    Total   no_of_prize
3.30    46.94               14.83                   10.76                       0.00                    575.30  3
4.01    21.31               21.28                   10.77                       0.00                    568.16  3
1.24    0.15                0.46                    0.00                        0.00                    482.74  2
0.90    0.43                0.94                    0.00                        0.00                    128.38  2
2.91    2.93                0.75                    0.86                        0.00                    265.52  3
0.31    8.09                4.70                    0.72                        0.00                    117.81  3
0.25    0.83                2.12                    0.00                        0.00                    55.84   2
0.04    2.43                1.39                    1.64                        0.00                    457.23  3
5.95    12.28               4.33                    5.73                        0.00                    540.54  3

对于每一列(除了总奖金和无奖金外)将更改为总奖金的百分比。 现在,列的数量可能因数据而异,因此我想使用regex来标识像Shiver和Shiver-Multi这样的列。 例如,另一个数据可能是

恐怖-弗兰克斯汀-多总无奖

我想通过使用regex或者其他可行的方式来自动处理这个问题。你知道吗

预期产量:

Shiver  Shiver - H2O -Multi Shiver-SHOWME -Multi    Shiver - STARWARS - Multi   Shiver - UFC - Multi    Total   no_of_prem
0.57%   8.16%               2.58%                   1.87%                       0.00%                   575.30  3
0.71%   3.75%               3.75%                   1.89%                       0.00%                   568.16  3
0.26%   0.03%               0.09%                   0.00%                       0.00%                   482.74  2
0.71%   0.34%               0.73%                   0.00%                       0.00%                   128.38  2
1.10%   1.10%               0.28%                   0.33%                       0.00%                   265.52  3
0.27%   6.87%               3.99%                   0.61%                       0.00%                   117.81  3
0.46%   1.48%               3.80%                   0.00%                       0.00%                   55.84   2
0.01%   0.53%               0.30%                   0.36%                       0.00%                   457.23  3
1.10%   2.27%               0.80%                   1.06%                       0.00%                   540.54  3

有人能帮我吗?你知道吗


Tags: of数据no数量multiregextotal百分比
1条回答
网友
1楼 · 发布于 2024-04-28 20:51:34

首先按^{}过滤掉不必要的列,然后按Total除以^{}再赋值:

cols = df.columns.difference(['Total','no_of_prize'])
print (cols)
Index(['Shiver', 'Shiver - H2O -Multi', 'Shiver - STARWARS - Multi',
       'Shiver - UFC - Multi', 'Shiver-SHOWME -Multi'],
      dtype='object')

df[cols] = df[cols].div(df['Total'], axis=0).mul(100)

另一种解决方案是将boolean mask^{}一起使用,并通过~反转-需要所有不带Totalno_of_prize的列:

col_mask = ~df.columns.isin(['Total','no_of_prize'])
print (col_mask)
[ True  True  True  True  True False False]

df.loc[:, col_mask] = df.loc[:, col_mask].div(df['Total'], axis=0).mul(100)

print (df)
     Shiver  Shiver - H2O -Multi  Shiver-SHOWME -Multi  \
0  0.573614             8.159221              2.577786   
1  0.705787             3.750704              3.745424   
2  0.256867             0.031073              0.095289   
3  0.701044             0.334943              0.732201   
4  1.095963             1.103495              0.282465   
5  0.263136             6.866989              3.989475   
6  0.447708             1.486390              3.796562   
7  0.008748             0.531461              0.304005   
8  1.100751             2.271802              0.801051   

   Shiver - STARWARS - Multi  Shiver - UFC - Multi   Total  no_of_prize  
0                   1.870329                   0.0  575.30            3  
1                   1.895593                   0.0  568.16            3  
2                   0.000000                   0.0  482.74            2  
3                   0.000000                   0.0  128.38            2  
4                   0.323893                   0.0  265.52            3  
5                   0.611154                   0.0  117.81            3  
6                   0.000000                   0.0   55.84            2  
7                   0.358682                   0.0  457.23            3  
8                   1.060051                   0.0  540.54            3 

将数值列转换为字符串有两种方法:

如果需要在.后面加两位数的百分比,请使用format

cols = df.columns.difference(['Total','no_of_prize'])
df[cols] = df[cols].div(df['Total'], axis=0).mul(100).round(2).applymap('{:,.2f}%'.format)
print (df)
  Shiver Shiver - H2O -Multi Shiver-SHOWME -Multi Shiver - STARWARS - Multi  \
0  0.57%               8.16%                2.58%                     1.87%   
1  0.71%               3.75%                3.75%                     1.90%   
2  0.26%               0.03%                0.10%                     0.00%   
3  0.70%               0.33%                0.73%                     0.00%   
4  1.10%               1.10%                0.28%                     0.32%   
5  0.26%               6.87%                3.99%                     0.61%   
6  0.45%               1.49%                3.80%                     0.00%   
7  0.01%               0.53%                0.30%                     0.36%   
8  1.10%               2.27%                0.80%                     1.06%   

  Shiver - UFC - Multi   Total  no_of_prize  
0                0.00%  575.30            3  
1                0.00%  568.16            3  
2                0.00%  482.74            2  
3                0.00%  128.38            2  
4                0.00%  265.52            3  
5                0.00%  117.81            3  
6                0.00%   55.84            2  
7                0.00%  457.23            3  
8                0.00%  540.54            3   

或者如果不需要两位数:

df[cols] = df[cols].div(df['Total'], axis=0).mul(100).round(2).astype(str) + '%'
print (df)
  Shiver Shiver - H2O -Multi Shiver-SHOWME -Multi Shiver - STARWARS - Multi  \
0  0.57%               8.16%                2.58%                     1.87%   
1  0.71%               3.75%                3.75%                      1.9%   
2  0.26%               0.03%                 0.1%                      0.0%   
3   0.7%               0.33%                0.73%                      0.0%   
4   1.1%                1.1%                0.28%                     0.32%   
5  0.26%               6.87%                3.99%                     0.61%   
6  0.45%               1.49%                 3.8%                      0.0%   
7  0.01%               0.53%                 0.3%                     0.36%   
8   1.1%               2.27%                 0.8%                     1.06%   

  Shiver - UFC - Multi   Total  no_of_prize  
0                 0.0%  575.30            3  
1                 0.0%  568.16            3  
2                 0.0%  482.74            2  
3                 0.0%  128.38            2  
4                 0.0%  265.52            3  
5                 0.0%  117.81            3  
6                 0.0%   55.84            2  
7                 0.0%  457.23            3  
8                 0.0%  540.54            3  

相关问题 更多 >