How to replace a trailing 9999 with 0101 in a Pandas DataFrame (Python)

0 votes
2 answers
40 views
Asked 2025-04-12 05:48

I have a DataFrame that looks like this:

OrdNo  year
1     20059999
2     20070830
3     20070719
4     20030719
5     20039999
6     20070911
7     20050918
8     20070816
9     20069999

Wherever the last four digits in this DataFrame are 9999, I want to replace them with 0101. How can I do that?

Thanks!

2 Answers

0

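I wrote a script that walks through how to handle this. Note that this version is deliberately verbose and could be simplified, but I tried to keep it easy to understand and follow along with.

If you are a beginner, a good way to practice is to first think through the steps needed to solve the problem (or write them down), and then check the documentation of the relevant library to see whether it offers a good solution.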

import pandas as pd

# Creating dataframe
data = [[1, 20059999], [2, 20070830], [3, 20070719], [4, 20030719], [5, 20039999], [6, 20070911], [7, 20050918], [8, 20070816], [9, 20069999]]
df = pd.DataFrame(data, columns=['OrdNo', 'year'])

# Iterating through dataframe
for index, row in df.iterrows():
    # Here we take the columns from the row we are in right now
    OrdNo = row['OrdNo']
    year = row['year']

    # Taking last four digits from year int. We need to convert the year int to string to do this. -4: basically
    # tells the code to start at the end (-), move 4 characters back (4) and return everything from that point to the
    # end (:)
    lastfour = str(year)[-4:]

    # Check if last four digits are 9999 (as string, because lastfour is a string)
    if lastfour == "9999":
        # If true, replace the 9999 with 0101
        # First we take the year but remove the last four digits (the 9999)
        year = str(year)[:-4]

        # Then we add 0101 to the year
        newyear = year + "0101"

        # Now convert it back to int
        newyear = int(newyear)

        # And put it back in the dataframe
        # We use loc to find based on the OrdNo and then we replace the year column by our new value
        df.loc[df['OrdNo'] == OrdNo, 'year'] = newyear

# Let's print the result
print(df.to_string(index=False))
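
As noted, the loop can be simplified. Below is a minimal vectorized sketch, assuming `year` is an integer column as in the sample data. The last four digits are `year % 10000`, and swapping 9999 for 0101 amounts to subtracting 9999 and adding 101. The name `mask` is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "OrdNo": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "year": [20059999, 20070830, 20070719, 20030719, 20039999,
             20070911, 20050918, 20070816, 20069999],
})

# True for rows whose last four digits are 9999
mask = df["year"] % 10000 == 9999

# Replacing 9999 with 0101 means subtracting 9999 and adding 101, i.e. subtracting 9898
df.loc[mask, "year"] -= 9898

print(df.to_string(index=False))
```

This keeps the column as integers throughout and avoids the string round trip, at the cost of being a little less obvious to read than the loop.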
2

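Assuming your year column is a string (text) type:

df["year"] = df["year"].str.replace(r"(9999)$", "0101", regex=True)

If it is a numeric type:

df["year"] = pd.to_numeric(df["year"].astype(str).str.replace(r"(9999)$", "0101", regex=True), errors="coerce")

Note that since pandas 2.0, str.replace treats the pattern as a literal string by default, so regex=True is needed for the $ anchor to take effect.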
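
Here is a quick check of both variants on a small sample, sketched under the assumption of a recent pandas version where `str.replace` only interprets the pattern as a regular expression when `regex=True` is passed. The names `df_str` and `df_num` and the sample values are made up for illustration; the value 99990830 is included to show that the `$` anchor only matches at the end of the value.

```python
import pandas as pd

# Case 1: the column holds strings
df_str = pd.DataFrame({"year": ["20059999", "20070830", "99990830"]})
df_str["year"] = df_str["year"].str.replace(r"9999$", "0101", regex=True)

# Case 2: the column holds integers; convert to text, replace, convert back
df_num = pd.DataFrame({"year": [20059999, 20070830, 99990830]})
df_num["year"] = pd.to_numeric(
    df_num["year"].astype(str).str.replace(r"9999$", "0101", regex=True),
    errors="coerce",
)

print(df_str["year"].tolist())  # ['20050101', '20070830', '99990830']
print(df_num["year"].tolist())  # [20050101, 20070830, 99990830]
```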
