Reading statistics from a .txt file and printing them

Posted 2024-03-28 12:41:35


I'm supposed to read a .txt file and print certain information from it. This is what I need:

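  • The state with the largest population
  • The state with the smallest population
  • The average state population
  • The population of Texas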

The data looks like this:

Alabama
AL
4802982
Alaska
AK
721523
Arizona
AZ
6412700
Arkansas
AR
2926229
California
CA
37341989

Here is my code, which doesn't actually do any of what I need:

^{pr2}$

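All I've managed so far is to read the state name and the abbreviation, and to convert the population to an int. The code doesn't do anything with those values yet, and I'm not sure how to do what the assignment asks. Any hints would be much appreciated! I've been trying things for the past few hours without success.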

Update:

This is my updated code, but I get the following error:

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    if population > max_population:
TypeError: unorderable types: str() > int()

Code:

with open('StateCensus2010.txt', 'r') as census_file:
    while True:
        try:
            state_name = census_file.readline()
            state_abv = census_file.readline()
            population = int(census_file.readline())
        except IOError:
            break

        # data processing here
        max_population = 0
        for population in census_file:
          if population > max_population:
            max_population = population

        print(max_population)
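
A note on the traceback above: `for population in census_file` iterates over the raw lines of the file, which are strings, so each one is compared with the integer `max_population`. Separately, `readline()` returns an empty string at end of file rather than raising `IOError`, so the `except` clause never ends the loop. The following is only a minimal sketch of one way to read the records, assuming the three-line layout shown in the sample data and a complete file; the helper name `read_states` and the in-memory `sample` are illustrative and not part of the original code.

```python
import io


def read_states(lines):
    """Yield (name, abbreviation, population) from an iterable of lines.

    Assumes every record is exactly three lines long (hypothetical helper).
    """
    it = iter(lines)
    for name in it:
        name = name.strip()
        if not name:
            continue  # skip blank lines, e.g. one left by a trailing newline
        abbrev = next(it).strip()
        population = int(next(it))  # int() tolerates the trailing '\n'
        yield name, abbrev, population


# In-memory stand-in for the census file, using two records from the sample
sample = io.StringIO("Alabama\nAL\n4802982\nAlaska\nAK\n721523\n")

max_population = 0
for name, abbrev, population in read_states(sample):
    if population > max_population:  # int compared with int
        max_population = population

print(max_population)
```

With a real file, `read_states(census_file)` could be called inside the existing `with open(...)` block in place of the `while True` loop.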

3 answers

Another pandas solution, from the interpreter:

>>> import pandas as pd
>>>
>>> records = [line.strip() for line in open('./your.txt', 'r')]
>>>
>>> df = pd.DataFrame([records[i:i+3] for i in range(0, len(records), 3)], 
...     columns=['State', 'Code', 'Pop']).dropna()
>>>
>>> df['Pop'] = df['Pop'].astype(int)
>>>
>>> df
        State Code       Pop
0     Alabama   AL   4802982
1      Alaska   AK    721523
2     Arizona   AZ   6412700
3    Arkansas   AR   2926229
4  California   CA  37341989
>>>
>>> df.loc[df['Pop'].idxmax()]
State    California
Code             CA
Pop        37341989
Name: 4, dtype: object
>>>
>>> df.loc[df['Pop'].idxmin()]
State    Alaska
Code         AK
Pop      721523
Name: 1, dtype: object
>>>
>>> df['Pop'].mean()
10441084.6
>>>
>>> df.loc[df['Code'] == 'AZ']
     State Code      Pop
2  Arizona   AZ  6412700

This problem is easy to solve with pandas.

Code:

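import pandas as pd

# `data` must be an iterator over the file's lines (for example an
# open file object), because next(data) pulls the following two lines.
states = []
for line in data:
    states.append(
        dict(state=line.strip(),
             abbrev=next(data).strip(),
             pop=int(next(data)),
             )
    )

df = pd.DataFrame(states)
print(df)

# .loc replaces the .ix indexer, which recent pandas versions no longer provide
print('\nmax population:\n', df.loc[df['pop'].idxmax()])
print('\nmin population:\n', df.loc[df['pop'].idxmin()])
print('\navg population:\n', df['pop'].mean())
print('\nAZ population:\n', df[df.abbrev == 'AZ'])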

Test data:

^{pr2}$

Result:

  abbrev       pop       state
0     AL   4802982     Alabama
1     AK    721523      Alaska
2     AZ   6412700     Arizona
3     AR   2926229    Arkansas
4     CA  37341989  California

max population:
abbrev            CA
pop         37341989
state     California
Name: 4, dtype: object

min population:
abbrev        AK
pop       721523
state     Alaska
Name: 1, dtype: object

avg population:
10441084.6

AZ population:
  abbrev      pop    state
2     AZ  6412700  Arizona

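The data is in a consistent order: state name, state abbreviation, population. So you only need to read through the lines once and then display the results. Here is some sample code.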

average = 0.0
total = 0.0
state_min = 999999999999
state_max = 0
statename_min = ''
statename_max = ''
texas_population = 0
with open('StateCensus2010.txt','r') as file:
    # split new line, '\n' here means newline

    data = file.read().split('\n')

    # Each state occupies 3 lines: state name, state abbreviation,
    # population. With 50 states the file has 150 lines, so the
    # number of states is 150 // 3 = 50. Integer division also
    # ignores the empty string left by a trailing newline.
    state_total = len(data) // 3


    # this count is used as an index for the list 
    count = 0
    for i in range(int(state_total)):

        statename = data[count]
        state_abv = data[count+1]
        population = int(data[count+2])

        print('Statename : ',statename)
        print('State Abv : ',state_abv)
        print('Population: ',population)
        print()

        # sum all states population
        total += population

        if population > state_max:
            state_max = population
            statename_max = statename

        if population < state_min:
            state_min = population
            statename_min = statename

        if statename == 'Texas':
            texas_population = population


        # add 3 because we want to jump to next state
        # for example the first three lines is Alabama info
        # the next three lines is Alaska info and so on
        count += 3


    # divide the total population with number of states 
    average = total/state_total
    print(str(average))

    print('Lowest population state :', statename_min)
    print('Highest population state :', statename_max)
    print('Texas population :', texas_population)
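
For comparison, the same single pass can be written more compactly with only the standard library. This is a sketch rather than a drop-in replacement: it assumes the lines have already been read and stripped into a list (the five sample records from the question are hard-coded here), and the names `records` and `by_abbrev` are illustrative.

```python
from statistics import mean

# The five sample records from the question, one value per line
lines = [
    "Alabama", "AL", "4802982",
    "Alaska", "AK", "721523",
    "Arizona", "AZ", "6412700",
    "Arkansas", "AR", "2926229",
    "California", "CA", "37341989",
]

# Zipping one iterator with itself three times yields consecutive triples
it = iter(lines)
records = [(name, abbrev, int(pop)) for name, abbrev, pop in zip(it, it, it)]

largest = max(records, key=lambda r: r[2])   # record with the highest population
smallest = min(records, key=lambda r: r[2])  # record with the lowest population
average = mean(r[2] for r in records)

# Lookup by abbreviation; .get returns None when a state is absent,
# as Texas is in this five-state sample
by_abbrev = {abbrev: pop for _, abbrev, pop in records}

print('Highest population state :', largest[0])
print('Lowest population state :', smallest[0])
print('Average population :', average)
print('Texas population :', by_abbrev.get('TX'))
```

Using `max` and `min` with a `key` function removes the need for the sentinel starting values `state_min` and `state_max`.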
