Slicing rows across Year and Month columns using YY or YYMM user input

Posted 2024-06-17 13:18:48


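After the helpful answers to my earlier questions, I'd like to ask about this one, which is a bit more involved. I have a simple output, extracted from a CSV file with pandas: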

Timeline: 1900 - 1999 ← Did a simple print("Timeline: 1900 - 1999")

     Year Month
0    1900   Jan
1    1900   Feb
2    1900   Mar
3    1900   Apr
4    1900   May
..    ...   ...
1185  1999   Aug
1186  1999   Sep
1187  1999   Oct
1188  1999   Nov
1189  1999   Dec

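My task is to take user input for a start period (YY or YYMM) and an end period (YY or YYMM) and use them to slice the rows. This is what I have in mind: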

start_time = input('YY/YYMM: ')  # e.g. 1910 Jan
end_time = input('YY/YYMM: ')  # e.g. 1930 Nov
Note: the user should also be able to enter just the year, without a month, e.g. 1911

Given the inputs above, the output should look like this:

Timeline: YY/YYMM - YY/YYMM  ← Changes based on start_time & end_time

     Year Month
0    1910   Jan
1    1910   Feb
2    1910   Mar
3    1910   Apr
4    1910   May
..    ...   ...
231  1930   Nov

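My problem is that I have little experience with pandas in Python and I'm not used to this kind of slicing. I'm still experimenting with how pandas works with other functions, and I'd appreciate anyone taking the time to help.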


1 answer

Answer 1 · Posted 2024-06-17 13:18:48

Here is one way to do it:

import pandas as pd

# Inputs
start_time = input('Start Period: ') # 1900 Jan or 1900
end_time = input('End Period: ') # 1910 May or 1910

# Start period: use the month if one was entered, otherwise default to January
if len(start_time.split()) > 1:
    start_year, start_month = start_time.split()
else:
    start_year = start_time
    start_month = 'Jan'

# End period: use the month if one was entered, otherwise default to December
if len(end_time.split()) > 1:
    end_year, end_month = end_time.split()
else:
    end_year = end_time
    end_month = 'Dec'

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

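frames = []  # one small DataFrame per year, concatenated after the loop

# Iterate between start and end year
for i in range(int(start_year), int(end_year)+1):
    month_list = months
    # Two separate ifs (not if/elif) so that a single-year range
    # is trimmed at both the end month and the start month
    if i == int(end_year):
        month_list = month_list[:months.index(end_month)+1]
    if i == int(start_year):
        month_list = month_list[months.index(start_month):]

    frames.append(pd.DataFrame({'Year': i, 'Month': month_list}))

# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
if frames:
    df = pd.concat(frames, ignore_index=True)
else:
    df = pd.DataFrame(columns=['Year', 'Month'])  # start year after end year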
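
The loop above builds a new Year/Month table rather than slicing the one read from the CSV. If the existing DataFrame should be filtered directly, a boolean mask over a combined year-month key is one alternative. This is only a minimal sketch: it assumes the columns are named Year and Month as in the question's output, that Year is numeric, and that months are three-letter abbreviations. The helpers `parse_period` and `slice_timeline` are hypothetical names, and `sample` is a stand-in for the CSV data.

```python
import pandas as pd

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def parse_period(text, default_month):
    """Turn '1910 Jan' or '1910' into (year, month_number)."""
    parts = text.split()
    year = int(parts[0])
    month = parts[1].capitalize() if len(parts) > 1 else default_month
    return year, MONTHS.index(month) + 1


def slice_timeline(df, start_text, end_text):
    """Return the rows of df between the two periods, inclusive."""
    start_year, start_month = parse_period(start_text, 'Jan')
    end_year, end_month = parse_period(end_text, 'Dec')
    # Sortable key: year * 100 + month number, e.g. 1910 Jan -> 191001
    month_num = df['Month'].map({m: n for n, m in enumerate(MONTHS, start=1)})
    key = df['Year'] * 100 + month_num
    mask = (key >= start_year * 100 + start_month) & (key <= end_year * 100 + end_month)
    return df[mask]


# Stand-in for the CSV data (an assumption): every month from 1900 to 1999
sample = pd.DataFrame(
    [(y, m) for y in range(1900, 2000) for m in MONTHS],
    columns=['Year', 'Month'],
)

result = slice_timeline(sample, '1910 Jan', '1930 Nov')
print('Timeline: 1910 Jan - 1930 Nov')
print(result)
```

The sliced rows keep their original index labels; calling `result.reset_index(drop=True)` would renumber them from 0 if that is preferred.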
