Python: 'Series' objects are mutable (address parsing)

Posted 2024-04-20 10:02:17


I am trying to create a new column in a DataFrame by parsing an address out of a string. When I try to do this, I get the following error:

("'Series' objects are mutable, thus they cannot be hashed", u'occurred at index pk')

I have seen similar questions on this site, but I don't quite understand how they apply to my code:

import usaddress, re, pyodbc
import pandas as pd

conn = pyodbc.connect("DSN=TEST;UID=test;PWD=test")

sql = "select top 10 pk, address from test..test"
df = pd.read_sql(sql,conn)

pattern = re.compile(".+\\b[0-9]{5}\\b")

def extract(pat):
    print pat
    test = pattern.findall(pat)
    return str(test[0])

i = 0

for i in df.iterrows():
    df[i]['cleansed_address'] = df.apply(lambda x: extract(df[i]['descrsched']))
    i+=1

1 Answer
User
#1 · Posted 2024-04-20 10:02:17

MCVE

df = pd.DataFrame([[1, 2,], [3, 4]])
df

# This is a tuple (index value, Series object that represents row)
#   |
#   v    
for i in df.iterrows():
    print(df[i])
#            ^
#            |
# This is you trying to tell Pandas to use a tuple
# in which the second element is a Series as a reference for a column name
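If you do need to loop, the usual way is to unpack the tuple that `iterrows()` yields. A minimal sketch, assuming a small made-up frame (the column names `a` and `b` are just for illustration and not from the question):

```python
import pandas as pd

# Hypothetical two-column frame, used only to illustrate the unpacking
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])

rows = []
for idx, row in df.iterrows():
    # idx is the index label; row is a Series holding that row's values
    rows.append((idx, int(row["a"]), int(row["b"])))

print(rows)
```

Here `idx` and `row` are separate names, so `row["a"]` looks up a value inside the row instead of passing the whole tuple to `df[...]`.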

Solving the X/Y problem

df['cleansed_address'] = df['descrsched'].str.findall(pattern).str[0]
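To try this without the database, here is a self-contained sketch. The sample addresses are invented for illustration; the column name `descrsched` and the regex are taken from the question.

```python
import re

import pandas as pd

# Same regex as in the question: everything up to and including a 5-digit ZIP
pattern = re.compile(r".+\b[0-9]{5}\b")

# Made-up sample rows standing in for the SQL query result
df = pd.DataFrame({
    "pk": [1, 2, 3],
    "descrsched": [
        "123 Main St, Springfield, IL 62704 USA",
        "Attn: Bob, 9 Elm Rd, Austin, TX 73301 (rear)",
        "no zip code here",
    ],
})

# Vectorized: findall returns a list of matches per row, .str[0] takes the first
df["cleansed_address"] = df["descrsched"].str.findall(pattern).str[0]

print(df[["pk", "cleansed_address"]])
```

Rows with no match end up as NaN rather than raising, which the original `test[0]` would do on an empty list.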
