Pandas: ordering values from a DataFrame

Third party unique identifier Qsex Qage Qfamilystatus QeducationSingle Qincomeevaluation Qjobstatus QRuCitySize QRuDistrict Qcountry
9ea3e3cb6719f3d336d324c446f486bd 1 32 1 5 1 1 1 1
cb570bb986808a5f4d2629287297b902 2 25 5 2 1 1 1
78b3a44eb7c7f7c687ffbcfed57647a4 1 30 4 1 3 6 1
1c728b223a4c2c267f3a3630b4a63f6e 2 45 4 1 1 1 1
8852ecd198fddfa557186c863f2c6fdf 2 41 4 1 7 7 1
1adc146b9ec35f7c632902f480d7e95c 1 70 5 3 1 1 1
0fb0c903a6b2b68f1b0a7cd1962f353c 1 29 5 1 5 7 1

QRuDistrict 1  ЦФО
QRuDistrict 2  ЮФО
QRuDistrict 3  СЗФО
QRuDistrict 4  ДВФО
QRuDistrict 5  СФО
QRuDistrict 6  УФО
QRuDistrict 7  ПФО
QRuDistrict 8  СКФО
QRuDistrict 9  Крымский ФО

d = (df_1[df_1['sign']=='Qcountry'].set_index('number')['result'].to_dict())
df['Country'] = df.Qcountry.map(d)
df2 = pd.crosstab(df.Country, df.Qcountry, margins=True)
df3 = np.round(df2[["All"]] / df['Country'].count() * 100, 2).rename(columns={"All": '%'})
country = pd.concat([df2[["All"]], df3], axis=1)
less = country[country['%'] < 5]
country = country[country['%'] >= 5]
country['All'] = ((all_users * df3.divide(100)).astype(int))
country['%'] = country['%'].astype(str) + '%'
country.to_excel(writer, sheet_name=sheet_name, startrow=48, startcol=4)

            Federal Districts  Россия
                            N         %
ДВФО                      131     5.33%
Крымский ФО                11     0.48%
ПФО                       416    16.91%
СЗФО                      420    17.09%
СКФО                       43     1.75%
СФО                       259    10.53%
УФО                       208     8.48%
ЦФО                       764    31.08%
ЮФО                       205     8.35%
Total                    2461    100.0%

            Federal Districts  Россия
                            N         %
ЦФО                       764    31.08%
ЮФО                       205     8.35%
СЗФО                      420    17.09%
ДВФО                      131     5.33%
СФО                       259    10.53%
УФО                       208     8.48%
ПФО                       416    16.91%
СКФО                       43     1.75%
Крымский ФО                11     0.48%
Total                    2461    100.0%

1 answer

User

#1 · Posted on 2024-04-24 05:25:55

I think you can use reindex with the values from the second DataFrame, but you need to append the last item ['Total'] to the list:

print (df)
               a            b
0  QRuDistrict 1          ЦФО
1  QRuDistrict 2          ЮФО
2  QRuDistrict 3         СЗФО
3  QRuDistrict 4         ДВФО
4  QRuDistrict 5          СФО
5  QRuDistrict 6          УФО
6  QRuDistrict 7          ПФО
7  QRuDistrict 8         СКФО
8  QRuDistrict 9  Крымский ФО

print (df1)
            Federal Districts  Россия  
                            N         %
ДВФО                      131     5.33%
Крымский ФО                11     0.48%
ПФО                       416    16.91%
СЗФО                      420    17.09%
СКФО                       43     1.75%
СФО                       259    10.53%
УФО                       208     8.48%
ЦФО                       764    31.08%
ЮФО                       205     8.35%
Total                    2461    100.0%

idx = df.b.tolist() + ['Total']
print (idx)
['ЦФО', 'ЮФО', 'СЗФО', 'ДВФО', 'СФО', 'УФО', 'ПФО', 'СКФО', 'Крымский ФО', 'Total']
df1 = df1.reindex(idx)
print (df1)
            Federal Districts  Россия  
                            N         %
ЦФО                       764    31.08%
ЮФО                       205     8.35%
СЗФО                      420    17.09%
ДВФО                      131     5.33%
СФО                       259    10.53%
УФО                       208     8.48%
ПФО                       416    16.91%
СКФО                       43     1.75%
Крымский ФО                11     0.48%
Total                    2461    100.0%
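
The same idea as a small self-contained sketch. The names lookup, counts and order are made up for illustration, and the data is a shortened toy version of the tables above. It also shows one thing to watch for: a label that is in the lookup but not in the summary comes back as NaN from reindex unless you pass fill_value.

```python
import pandas as pd

# Lookup table in the order we want to display (toy subset of the question's data).
lookup = pd.DataFrame({
    "a": ["QRuDistrict 1", "QRuDistrict 2", "QRuDistrict 3", "QRuDistrict 4"],
    "b": ["ЦФО", "ЮФО", "СЗФО", "ДВФО"],
})

# Summary in alphabetical order; "ДВФО" is left out on purpose
# to show what reindex does with a label it cannot find.
counts = pd.Series({"СЗФО": 420, "ЦФО": 764, "ЮФО": 205}, name="N")
counts["Total"] = counts.sum()

# Desired order = lookup order plus the margin row at the end.
order = lookup["b"].tolist() + ["Total"]

# fill_value=0 replaces the NaN that reindex would otherwise insert for "ДВФО".
ordered = counts.reindex(order, fill_value=0)
print(ordered)
```

Without fill_value the missing district would show up as NaN and the column would become float, which may or may not be what you want in the Excel output.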

If you use sort_index, the order is different:

df1 = df1.sort_index(ascending=False)
print (df1)
            Federal Districts  Россия  
                            N         %
ЮФО                       205     8.35%
ЦФО                       764    31.08%
УФО                       208     8.48%
СФО                       259    10.53%
СКФО                       43     1.75%
СЗФО                      420    17.09%
ПФО                       416    16.91%
Крымский ФО                11     0.48%
ДВФО                      131     5.33%
Total                    2461    100.0%
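
If you would rather stay with sorting, one possible alternative is to give sort_index a key that maps each label to its position in the wanted order. This is only a sketch: it assumes pandas 1.1 or later (where sort_index accepts key), and the names wanted, rank and counts are invented for the example.

```python
import pandas as pd

# Wanted display order, including the margin row (toy subset).
wanted = ["ЦФО", "ЮФО", "СЗФО", "Total"]
rank = {label: position for position, label in enumerate(wanted)}

counts = pd.Series({"СЗФО": 420, "ЦФО": 764, "ЮФО": 205}, name="N")
counts["Total"] = counts.sum()

# Plain sort_index compares the labels as strings.
alphabetical = counts.sort_index()

# The key receives the whole index and returns the values to sort by,
# here the position of each label in `wanted`.
custom = counts.sort_index(key=lambda idx: idx.map(rank))
print(custom)
```

Every label in the index has to be present in rank, otherwise its sort value is NaN. For this question reindex is probably still the shorter route.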

Edit based on the comment:

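I changed the column names. It seems you only need the values of column sign where the first column, number, contains QRuDistrict. You can then build a boolean mask with str.contains and use it to select those rows: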

print (df)
          number         sign
0  QRuDistrict 1          ЦФО
1  QRuDistrict 2          ЮФО
2  QRuDistrict 3         СЗФО
3  QRuDistrict 4         ДВФО
4  QRuDistrict 5          СФО
5  QRuDistrict 6          УФО
6  QRuDistrict 7          ПФО
7  QRuDistrict 8         СКФО
8  QRuDistrict 9  Крымский ФО

# .ix was removed in pandas 1.0; .loc accepts the same boolean mask
idx = df.loc[df.number.str.contains('QRuDistrict'), 'sign'].tolist() + ['Total']
print (idx)
['ЦФО', 'ЮФО', 'СЗФО', 'ДВФО', 'СФО', 'УФО', 'ПФО', 'СКФО', 'Крымский ФО', 'Total']
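
To make the mask step easier to follow, here is a minimal sketch with a lookup table that mixes several questions. The "Qsex 1" row and its label are hypothetical and only there so that the mask has something to filter out; lookup and mask are illustration names too.

```python
import pandas as pd

# Lookup table mixing several questions; the first row is a made-up example.
lookup = pd.DataFrame({
    "number": ["Qsex 1", "QRuDistrict 1", "QRuDistrict 2", "QRuDistrict 3"],
    "sign": ["hypothetical label", "ЦФО", "ЮФО", "СЗФО"],
})

# Boolean Series: True where the text in `number` contains the literal substring.
mask = lookup["number"].str.contains("QRuDistrict", regex=False)

# Keep only the matching rows of `sign`, preserving the lookup order,
# then append the margin label used by crosstab(margins=True) after renaming.
idx = lookup.loc[mask, "sign"].tolist() + ["Total"]
print(idx)
```

The resulting list can be passed to reindex in the same way as before.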
