How do I correctly trim whitespace across an entire DataFrame?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-31d35db1d48c> in <module>
      1 df = (pd.read_csv('C:\\Users\\wundermahn\Desktop\\aggregated_po_data.csv',
----> 2                   encoding = "ISO-8859-1", low_memory=False).apply(lambda x: x.str.strip() if (x.dtype == "object") else x))
      3 print(df.shape)
      4
      5 label = df['class']

c:\python367-64\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   6876             kwds=kwds,
   6877         )
-> 6878         return op.get_result()
   6879
   6880     def applymap(self, func) -> "DataFrame":

c:\python367-64\lib\site-packages\pandas\core\apply.py in get_result(self)
    184             return self.apply_raw()
    185
--> 186         return self.apply_standard()
    187
    188     def apply_empty_result(self):

c:\python367-64\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    294             try:
    295                 result = libreduction.compute_reduction(
--> 296                     values, self.f, axis=self.axis, dummy=dummy, labels=labels
    297                 )
    298             except ValueError as err:

pandas\_libs\reduction.pyx in pandas._libs.reduction.compute_reduction()

pandas\_libs\reduction.pyx in pandas._libs.reduction.Reducer.get_result()

<ipython-input-9-31d35db1d48c> in <lambda>(x)
      1 df = (pd.read_csv('C:\\Users\\wundermahn\Desktop\\aggregated_data.csv',
----> 2                   encoding = "ISO-8859-1", low_memory=False).apply(lambda x: x.str.strip() if (x.dtype == "object") else x))
      3 print(df.shape)
      4
      5 label = df['ON_TIME']

c:\python367-64\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5268             or name in self._accessors
   5269         ):
-> 5270             return object.__getattribute__(self, name)
   5271         else:
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):

c:\python367-64\lib\site-packages\pandas\core\accessor.py in __get__(self, obj, cls)
    185             # we're accessing the attribute of the class, i.e., Dataset.geo
    186             return self._accessor
--> 187         accessor_obj = self._accessor(obj)
    188         # Replace the property with the accessor object. Inspired by:
    189         # http://www.pydanny.com/cached-property.html

c:\python367-64\lib\site-packages\pandas\core\strings.py in __init__(self, data)
   2039
   2040     def __init__(self, data):
-> 2041         self._inferred_dtype = self._validate(data)
   2042         self._is_categorical = is_categorical_dtype(data)
   2043         self._is_string = data.dtype.name == "string"

c:\python367-64\lib\site-packages\pandas\core\strings.py in _validate(data)
   2096
   2097         if inferred_dtype not in allowed_types:
-> 2098             raise AttributeError("Can only use .str accessor with string values!")
   2099         return inferred_dtype
   2100

AttributeError: Can only use .str accessor with string values!

3 Answers

User

#1 · Edited 2024-05-12 23:54:32

What you need to check is not the column's dtype but the type of each individual value, so the code can be, for example:

df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)

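The reason is:

A column can have the object dtype,
with strings in almost all of its cells,
but some of those cells can be NaN, which is a special case of float, so you cannot call strip on it.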
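
To make the point above concrete, here is a minimal sketch. The sample Series and the helper name strip_if_str are made up for illustration and are not part of the original answer; the column is built with dtype=object explicitly so the behavior does not depend on how a given pandas version infers string columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an object column mixing strings and a missing value.
s = pd.Series(["  a ", np.nan, " b"], dtype=object)

print(s.dtype)                         # object
print([type(v).__name__ for v in s])   # ['str', 'float', 'str']


def strip_if_str(value):
    """Strip whitespace from strings; return any other value unchanged."""
    return value.strip() if isinstance(value, str) else value


# Checking each value's type means NaN is passed through instead of failing.
cleaned = s.map(strip_if_str)
print(cleaned.tolist())
```

Calling value.strip() unconditionally here would fail on the NaN cell, because a float has no strip method.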

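However, this needlessly runs the code on columns whose dtype is something other than object, where nothing would change anyway. If that bothers you, run the code only on the columns where something can actually change: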

cols = df.select_dtypes(include='object').columns
df[cols] = df[cols].applymap(lambda x: x.strip() if isinstance(x, str) else x)
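
One caveat on versions: in newer pandas releases (2.1 and later, as far as I know) DataFrame.applymap is deprecated in favor of DataFrame.map. The sketch below sidesteps that by using Series.map on each object column instead. The function name trim_object_columns and the sample frame are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def strip_if_str(value):
    """Strip whitespace from strings; return any other value unchanged."""
    return value.strip() if isinstance(value, str) else value


def trim_object_columns(df):
    """Return a copy of df with string cells in object columns stripped."""
    out = df.copy()
    for col in out.select_dtypes(include="object").columns:
        # Series.map applies the function value by value on this one column.
        out[col] = out[col].map(strip_if_str)
    return out


# Hypothetical sample frame; "text" is forced to object dtype on purpose.
df = pd.DataFrame({
    "num": [1, 2, 3],
    "text": pd.Series([" foo", np.nan, "bar "], dtype=object),
})

trimmed = trim_object_columns(df)
print(trimmed)
```

The numeric column is never touched, and the input frame is left unmodified because the function works on a copy.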

User
#2 · Edited 2024-05-12 23:54:32

First, use select_dtypes to select the right columns:
# example dataframe
df = pd.DataFrame({'col1':[1,2,3],
                   'col2':list('abc'),
                   'col3':[4.0, 5.0, 6.0],
                   'col4':[' foo', ' bar', 'foobar. ']})

   col1 col2  col3      col4
0     1    a   4.0       foo
1     2    b   5.0       bar
2     3    c   6.0  foobar. 

str_cols = df.select_dtypes('object').columns
df[str_cols] = df[str_cols].apply(lambda x: x.str.strip())

print(df)
   col1 col2  col3     col4
0     1    a   4.0      foo
1     2    b   5.0      bar
2     3    c   6.0  foobar.
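
A caveat worth knowing with the .str approach: on an object column that mixes strings with other values, .str.strip() returns NaN for the non-string cells rather than leaving them alone. The sketch below shows this and one possible way to restore the original values with Series.where. The sample data and variable names are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical column: object dtype holding a string and an integer.
mixed = pd.Series([" foo ", 1], dtype=object)

# .str.strip() handles the string but yields NaN for the integer.
stripped = mixed.str.strip()
print(stripped.tolist())

# Keep the stripped value where the original cell was a string,
# and fall back to the original cell everywhere else.
is_str = mixed.map(lambda v: isinstance(v, str))
restored = stripped.where(is_str, mixed)
print(restored.tolist())
```

If every object column is known to hold only strings (and missing values), this extra step is not needed.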

User
#3 · Edited 2024-05-12 23:54:32

You can use try/except:

def trim(x):
    try:
        return x.str.strip()
    except AttributeError:
        # .str is not available for this column; leave it unchanged
        return x

df = df.apply(trim)
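
As a rough usage sketch of the try/except approach, the hypothetical frame below includes an object column that holds only integers, which is the kind of column that triggers "Can only use .str accessor with string values!". Column names and data are made up for illustration, and the except clause is narrowed to AttributeError on the assumption that this is the only error worth swallowing here.

```python
import pandas as pd


def trim(column):
    """Strip whitespace if the column supports .str; otherwise return it as is."""
    try:
        return column.str.strip()
    except AttributeError:
        # Raised for numeric columns and for object columns without strings.
        return column


# Hypothetical sample frame.
df = pd.DataFrame({
    "ids": pd.Series([1, 2], dtype=object),          # object dtype, no strings
    "names": pd.Series([" ann", "bob "], dtype=object),
    "score": [0.5, 1.5],
})

result = df.apply(trim)
print(result)
```

Catching only AttributeError, rather than using a bare except, keeps unrelated failures visible.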
