我已经试过了熊猫.融化和.stack和.pivot,但没有任何进展。你知道吗
我有一个excel表格,格式如下:
1997 1998 1999 2000
Total, N (%) 3350 (34) 3387 (33) 4778 (33) 3588 (33)
Age category N, (%)
A 231 (24) 227 (24) 222 (23) 211 (22)
B 492 (24) 481 (24) 487 (24) 405 (24)
C 759 (28) 759 (27) 746 (26) 746 (26)
D 1901 (45) 1873 (44) 1233 (44) 1903 (44)
Sex, N (%)
F 1650 (33) 1493 (33) 1673 (33) 1628 (32)
M 1734 (35) 1794 (34) 1705 (34) 1760 (34)
Diet
Vegan 1553 (32) 1442 (31) 1453 (31) 1422 (31)
Carnivore 1857 (36) 1063 (36) 1225 (35) 1926 (34)
Favorite movie
horror 1036 (24) 1033 (24) 1458 (24) 1742 (24)
romance 732 (41) 743 (40) 735 (40) 799 (38)
comedy 514 (34) 498 (32) 518 (32) 496 (32)
silent 1110 (47) 1933 (47) 1967 (46) 1751 (46)
* Percents are in relation to 100% of children who filled out survey
我正在尝试处理这些数据,以便可以创建以下条形图:
我用于构建df的excel表中的代码:
import pandas as pd
categories = ['Age category N, (%)', 'Sex, N (%)', 'Diet', 'Favorite movie']
subcategories = ['A','B','C','D','F','M',"Vegan","Carnivore",'horror','romance','comedy','silent']
df = pd.DataFrame(
{'1997': [33850 (34), NaN ,231 (24), 492 (24), 759 (28), 1901 (45), NaN , 1650 (33), 1734 (35),NaN ,
1553 (32), 1857 (36),NaN , 1036 (24), 732 (41), 514 (34), 1110 (47)],
'1998': [33687 (33),NaN ,227 (24), 481 (24), 759 (27), 1873 (44),NaN ,1493 (33), 1794 (34), NaN ,1442 (31), 1063 (36),NaN , 1033 (24), 743 (40), 498 (32), 1933 (47)],
'1999': [3778 (33), NaN ,222 (23), 487 (24), 746 (26), 1233 (44),NaN , 1673 (33), 1705 (34),NaN , 1453 (31), 1225 (35),NaN , 1458 (24), 735 (40), 518 (32), 1967 (46)],
'2000' : [3588 (33),NaN ,211 (22), 405 (24), 746 (26), 1903 (44), NaN , 1628 (32), 1760 (34),NaN , 1422 (31), 1926 (34),NaN , 1742 (24), 799 (38), 496 (32), 1751 (46)]},
index = pd.MultiIndex.from_tuples(
[('Age category N, (%)','A'),('Age category N, (%)','B'),('Age category N, (%)','C'),
('Age category N, (%)', 'D'), ('Sex, N (%)', 'F'), ('Sex, N (%)', 'M'),
('Diet', 'Vegan'), ('Diet', 'Carnivore'),
('Favorite Movie','horror'),('Favorite Movie','romance'),('Favorite Movie','comedy'),('Favorite Movie','silent')],
names = [categories, subcategories]))
我有两个问题 我正在寻找透视表,以便“age”、“sex”、“diet”和“favorite movie”行是多索引列,每个列下都有类别,并且以年份作为行(观察) 所以最终产品看起来像:
'age' 'sex' 'diet' 'favorite movie'
A B C D F M Vegan Carnivore horror romance comedy silent
1997
1998
1999
2000
关键是我必须将“(%”从计数中分离出来,同时将其与数字保持“关联”(以通知右y轴)
感谢您的指导!你知道吗
目前没有回答
相关问题 更多 >
编程相关推荐