Frequency of elements in a list from a file

[['N', '25-29', 'Eesti', 'Harju maakond', 'N', '06.03.2020'], ['N', '35-39', 'Eesti', 'Harju maakond', 'N', '06.03.2020'], ['N', '40-44', 'Eesti', 'Saare maakond', 'N', '06.03.2020'], ['N', '35-39', 'Eesti', 'Tartu maakond', 'N', '06.03.2020'], ['M', '40-44', 'Eesti', 'Harju maakond', 'N', '06.03.2020']]

1 answer

User

#1 · posted 2024-06-17 10:19:41

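Maybe this is what you want.

Since a name like Eesti, Harju maakond refers to a single location, I made some changes to the given data table: I merged those two fields into one. You provided 5 headers, but the data has 6 columns, which is why I had to do this. You will probably need to make the same change in the code that generated the table, because I am guessing this is the name of one place in Estonia.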
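The merge described above could also be done in code rather than by hand. This is only a minimal sketch, assuming every raw row has exactly six fields with the country at index 2 and the county at index 3; the names raw_rows and merge_location are made up for this illustration.

```python
# Raw rows as posted in the question: six fields each.
raw_rows = [
    ['N', '25-29', 'Eesti', 'Harju maakond', 'N', '06.03.2020'],
    ['N', '35-39', 'Eesti', 'Harju maakond', 'N', '06.03.2020'],
    ['N', '40-44', 'Eesti', 'Saare maakond', 'N', '06.03.2020'],
    ['N', '35-39', 'Eesti', 'Tartu maakond', 'N', '06.03.2020'],
    ['M', '40-44', 'Eesti', 'Harju maakond', 'N', '06.03.2020'],
]


def merge_location(row):
    """Join fields 2 and 3 (assumed: country and county) into one location string."""
    return row[:2] + [', '.join(row[2:4])] + row[4:]


# Each merged row now has five fields, matching the five headers.
merged_rows = [merge_location(row) for row in raw_rows]
print(merged_rows[0])
# ['N', '25-29', 'Eesti, Harju maakond', 'N', '06.03.2020']
```

If the real file has rows with a different number of fields, the slice positions would need to be adjusted.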

Always use pandas when you are working with columns of data.

import pandas as pd  # install with: pip install pandas


headers = ['sex', 'age', 'location', 'coronatestanswer', 'date']

datatable = [['N', '25-29', 'Eesti, Harju maakond', 'N', '06.03.2020'],
             ['N', '35-39', 'Eesti, Harju maakond', 'N', '06.03.2020'],
             ['N', '40-44', 'Eesti, Saare maakond', 'N', '06.03.2020'],
             ['N', '35-39', 'Eesti, Tartu maakond', 'N', '06.03.2020'],
             ['M', '40-44', 'Eesti, Harju maakond', 'N', '06.03.2020']]


df = pd.DataFrame(datatable, columns=headers)  # build a DataFrame from the list of lists
print(df)  # take a look at the organized DataFrame
print(df['age'].value_counts())  # count how often each value occurs in the column

Output of print(df):

  sex    age              location coronatestanswer        date
0   N  25-29  Eesti, Harju maakond                N  06.03.2020
1   N  35-39  Eesti, Harju maakond                N  06.03.2020
2   N  40-44  Eesti, Saare maakond                N  06.03.2020
3   N  35-39  Eesti, Tartu maakond                N  06.03.2020
4   M  40-44  Eesti, Harju maakond                N  06.03.2020

Output of the frequency count:

35-39    2
40-44    2
25-29    1
Name: age, dtype: int64

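Without pandas it is even shorter. The problem is that if you want to do this for every column, the code becomes redundant, needless repetition. That is why pandas is awesome. Python is all about making tasks simpler and more efficient. :)

But anyway, here is the code without pandas: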

from collections import Counter  # standard library, nothing to install

# List comprehension: take the age (index 1) from every row, including the first one.
age_list = [row[1] for row in datatable]
print(Counter(age_list))

Output:

Counter({'35-39': 2, '40-44': 2, '25-29': 1})

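A Counter is a dict subclass. If you assign the Counter to a variable, you can look up the frequency of any age group whenever you like. Something like this:

age_counts = Counter(row[1] for row in datatable)  # count the age of every row

print(age_counts['40-44'])

The output, of course, is 2.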
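The per-column repetition mentioned earlier can be avoided without pandas by looping over the columns. This is a rough sketch, assuming the same five-column rows and headers used in this answer; column_counts is a name invented here.

```python
from collections import Counter

headers = ['sex', 'age', 'location', 'coronatestanswer', 'date']

datatable = [['N', '25-29', 'Eesti, Harju maakond', 'N', '06.03.2020'],
             ['N', '35-39', 'Eesti, Harju maakond', 'N', '06.03.2020'],
             ['N', '40-44', 'Eesti, Saare maakond', 'N', '06.03.2020'],
             ['N', '35-39', 'Eesti, Tartu maakond', 'N', '06.03.2020'],
             ['M', '40-44', 'Eesti, Harju maakond', 'N', '06.03.2020']]

# zip(*datatable) transposes the rows into columns,
# so every header gets a Counter over its own column.
column_counts = {name: Counter(column)
                 for name, column in zip(headers, zip(*datatable))}

print(column_counts['sex'])
# Counter({'N': 4, 'M': 1})
print(column_counts['location'].most_common(1))
# [('Eesti, Harju maakond', 3)]
```

most_common(n) returns the n most frequent values with their counts, which is handy when a column has many distinct values.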
