Remove numbers from a DataFrame column

   Name                Height(inches)
0  2 Snigdho Hasan     65
1  3 Michael Valentin  69
2  4 Andres Vargas     72
3  7 Jasper Diangco    70
4  9 Sayuj Zachariah   74
5  13 Omar Rezika      74
6  14 Gabriel Pjatak   75
7  16 Ryan Chabel      71

2 Answers

User

#1 · Edited 2024-06-16 14:41:05

You can try using a regular expression:

import re
string_1 = '612156 jose mauricio'
# Raw string so that \d and \s reach the regex engine unchanged
re.sub(r"^[\d-]*\s*", '', string_1)

Your output will be:

'jose mauricio'

You can wrap the code above in a function and apply it to the DataFrame, like this:

def remove_first_numbers(text):
    # Strip leading digits/hyphens and the whitespace that follows them
    return re.sub(r"^[\d-]*\s*", '', text).lstrip()
# I'm adding the .lstrip() to remove any leading whitespace, just in case!

df_final['Name'] = df_final['Name'].apply(remove_first_numbers)
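If you would rather avoid a Python-level function call per row, the same pattern can probably be handed to pandas' string accessor. This is only a sketch under assumptions: the rows are copied from the question, the Name column holds plain strings with no missing values, and `df_final` is the name used in the answer above.

```python
import pandas as pd

# A few sample rows copied from the question.
df_final = pd.DataFrame({
    "Name": ["2 Snigdho Hasan", "3 Michael Valentin", "13 Omar Rezika"],
    "Height(inches)": [65, 69, 74],
})

# Same anchored pattern as remove_first_numbers: drop digits/hyphens at the
# start of the string plus the whitespace after them. Digits that appear
# later in a name are left alone because of the ^ anchor.
df_final["Name"] = df_final["Name"].str.replace(r"^[\d-]*\s*", "", regex=True)

names = df_final["Name"].tolist()
print(names)
```

Passing regex=True explicitly matters here, since newer pandas versions treat the pattern as a literal string by default.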

User

#2 · Edited 2024-06-16 14:41:05

Well, you have already handled the whitespace with:

df_final['Name'].replace(r'\s+|\\n', ' ', regex = True, inplace = True)

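To match a newline (\n) you don't need the double backslash, as long as you use a raw string literal (the r''). In a raw string, '\\n' matches a literal backslash followed by the letter n, not a newline.
Are you sure you want to replace \n with a space? I think you may want to remove it entirely. (Your example doesn't show any newlines, so it's hard to tell.)
Spaces around the = of a keyword argument are not recommended. If you break this convention your code will still run fine, but it will at least be harder for other programmers to read.
inplace is also not exactly recommended, and may even be deprecated in the future. It looks more memory-efficient, but in practice it usually creates a copy under the hood.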
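To make the first point concrete, here is a small sketch using only the standard re module. The sample strings are made up for illustration; they are not from the question's data.

```python
import re

text = "Omar\nRezika"        # contains a real newline character
literal = "Omar\\nRezika"    # contains a backslash followed by the letter n

# r"\n" is read by the regex engine as "newline".
newline_replaced = re.sub(r"\n", " ", text)

# r"\\n" means "a literal backslash, then n", so a real newline is untouched...
newline_untouched = re.sub(r"\\n", " ", text)

# ...but the two-character sequence backslash + n is matched.
literal_replaced = re.sub(r"\\n", " ", literal)

# \s already covers newlines, so r"\s+" alone handles them too.
whitespace_replaced = re.sub(r"\s+", " ", text)

print(repr(newline_replaced), repr(newline_untouched))
print(repr(literal_replaced), repr(whitespace_replaced))
```

So in the original `r'\s+|\\n'` pattern, the second alternative only does anything if the data contains a literal backslash followed by n.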

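Assuming full_name in your code is the Series (column) of names, this removes all digits and then strips all whitespace (spaces and/or newlines) from both ends, leaving only the first and last name: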

df_final['Name'] = full_name.replace(r'\d+', '', regex=True).str.strip()

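(This is a quick fix, but depending on the format of the raw data, I suspect there may be a way to scrape the data into the DataFrame that avoids this problem up front.)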
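For what it's worth, here is a rough sketch of how that one-liner behaves. The first two values are adapted from the question with extra whitespace added; the third is a made-up name, included only to show that an unanchored \d+ removes digits anywhere in the string.

```python
import pandas as pd

# full_name stands in for the name column; the values are illustrative.
full_name = pd.Series([
    "2 Snigdho Hasan",
    " 16 Ryan Chabel\n",
    "7 John Smith 3rd",   # hypothetical name containing a digit
])

# Remove every run of digits, then trim whitespace on both sides.
# Assigning the result (instead of inplace=True) keeps the original intact.
cleaned = full_name.replace(r"\d+", "", regex=True).str.strip()

result = cleaned.tolist()
print(result)
```

If digits inside a name need to survive, anchoring the pattern to the start of the string (for example `r"^\d+"`) would be one option.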
