AttributeError when running lower_case.translate with string.punctuation

# cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

import string
from collections import Counter

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# data is in excel formatted ugly and unclean
# columns are Artist Names rows are reviews for said Artist
df = pd.read_excel('sample-data.xlsx',encoding='utf8', errors='ignore')
lower_case = df.apply(lambda x: x.astype(str).str.lower())

#checking for nulls if present any
print("Number of rows with null values:")
print(lower_case.isnull().sum().sum())
lower_case.fillna("")

#cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-78-9f23b8a5e8e0> in <module>
      2 # cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)
      3 
----> 4 cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

~\anaconda3\envs\nlp_course\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5273                 return self[name]
-> 5274         return object.__getattribute__(self, name)
   5275 
   5276     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'translate'

1 Answer

Answer 1 · posted 2024-05-16 13:54:12

A pandas DataFrame has no .translate() method, but Python strings do. For example:

import string

my_str = "hello world!"
my_str.translate(str.maketrans('', '', string.punctuation))
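To make the behaviour of the two str.maketrans() forms from the question concrete, here is a minimal sketch using only the standard library. The sample text and the variable names are made up for illustration. Note that string.punctuation covers ASCII punctuation only, so characters such as curly quotes are left untouched.

```python
import string

text = "rock'n'roll, baby!"  # hypothetical sample review text

# Three-argument form: every character in the third argument is deleted.
delete_table = str.maketrans("", "", string.punctuation)
removed = text.translate(delete_table)

# Two-argument form: each punctuation character is mapped to a space instead.
space_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
spaced = text.translate(space_table)

# Collapse the runs of spaces that the replacement leaves behind.
normalized = " ".join(spaced.split())

print(removed)     # rocknroll baby
print(normalized)  # rock n roll baby
```

Deleting joins the pieces of a word such as "rock'n'roll" together, while replacing with spaces keeps them apart. Which one is preferable probably depends on how the text is tokenized afterwards.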

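If you want to apply that transformation to every value in a DataFrame column, you can use .map() on that column. The .map() method takes a function that accepts a column value as its argument and returns the transformed value: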

def remove_punctuation(value):
    return value.translate(str.maketrans('', '', string.punctuation))

df["my_cleaned_column"] = df["my_dirty_column"].map(remove_punctuation)
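One caveat with the function above: .map() passes each cell through as it is, so a cell that is not a string (a missing value or a number, for instance) has no .translate() and would raise the same kind of AttributeError. The following is a sketch of a guarded variant, not part of the original answer. The names remove_punctuation_safe and PUNCT_TABLE are hypothetical, and the plain list stands in for a column's values.

```python
import string

# Build the translation table once instead of on every call.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation_safe(value):
    """Strip ASCII punctuation from strings; return anything else unchanged."""
    if isinstance(value, str):
        return value.translate(PUNCT_TABLE)
    return value


# Hypothetical cell values: two strings, a missing value and a number.
samples = ["Great show!!!", None, 3.5, "Loved it, really."]
cleaned = [remove_punctuation_safe(v) for v in samples]
print(cleaned)  # ['Great show', None, 3.5, 'Loved it really']
```

The same function can be passed to .map() in place of remove_punctuation, for example df["my_dirty_column"].map(remove_punctuation_safe).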

You can also use a lambda function instead of defining a new function:

df["my_cleaned_column"] = df["my_dirty_column"].map(
    lambda x: x.translate(str.maketrans('', '', string.punctuation))
)

If you have many columns that need this transformation, you can loop over them:

for column_name in df.columns:
    df[column_name] = df[column_name].map(
        lambda x: x.translate(str.maketrans('', '', string.punctuation))
    )
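As an alternative to the loop above, pandas also offers a vectorized string accessor, Series.str, whose .lower() and .translate() methods skip missing values rather than raising. The sketch below assumes every column holds text. The DataFrame and its column names are invented stand-ins for the spreadsheet in the question.

```python
import string

import pandas as pd

# Hypothetical sample data: columns are artists, rows are reviews.
df = pd.DataFrame(
    {
        "artist_a": ["Great show!!!", None],
        "artist_b": ["Loved it, really.", "Meh..."],
    }
)

# Translation table that deletes ASCII punctuation.
table = str.maketrans("", "", string.punctuation)

# Lowercase and strip punctuation column by column; missing cells stay missing.
cleaned = df.apply(lambda col: col.str.lower().str.translate(table))
print(cleaned)
```

Unlike the astype(str) call in the question, this sketch does not turn missing cells into literal text, so they can still be detected with isnull() or filled with fillna("") afterwards (keeping in mind that fillna returns a new object unless its result is assigned).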
