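My simple code isn't working and I'm looking for some help.
The DataFrame has 44,000 rows of chat conversations in XML format. The root/child structure is shown below. I need to take all of the "<body>" entries from one row/chat transcript and combine them into a single string; that output goes into the variable "test" in the DataFrame. My code works, but the loop never stops. I know it works because when I stop the loop with the timeout code and inspect the DataFrame, it has done what it should. I just want the code to work without the timeout statement.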
<chat>
<messages>
<chat-message>
<timestamp>2017-08-22T15:08:35.906-04:00</timestamp>
<name />
<body>Hello Mikey, I see you want to chat with us today about: Account
Assistance. If you are chatting on a mobile device or tablet, your
session will end if you navigate away from the chat window. A
representative will be with you momentarily.
</body>
<usertype>system</usertype>
</chat-message>
<chat-message>
import time
import xml.etree.ElementTree as ET

def msgg(row):
    root = ET.fromstring(row)
    toad = ['the'] #Saving something in toad since python will not let you append to an empty list
    for body in root.findall('messages/chat-message/body'):
        toad.append(body.text)
    return toad

timeout = time.time() + 60*10
for row in df5['chat']:
    df5['test'] = df5['chat'].apply(msgg)
    if time.time() > timeout: break
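The code does what I want, but it does not exit the for loop. If I don't add the
if time.time() > timeout: break
line, the program just keeps running. With the break in place I can let it run for 1 minute and the resulting dataset is complete. Without the break, it runs for an hour (maybe longer, but I hit the red stop button after an hour). Any idea why Python doesn't stop even though it has finished? Thanks in advance.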
P.S.: For anyone tempted to scream this is a duplicate post and advocate for its removal, please note it's a different question. My other post was asking about handling the parsing error. This is asking about handling a bad loop.
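One plausible reading of the hang, sketched on a toy frame: df5['chat'].apply(msgg) already processes every row in a single call, so wrapping it in "for row in df5['chat']" repeats the whole pass once per row (roughly 44,000 full passes on the real data). The three-row frame, the one-message XML string, and the call counter below are illustrative assumptions, not part of the original post.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical stand-in for df5: three tiny chat transcripts.
xml = "<chat><messages><chat-message><body>hi</body></chat-message></messages></chat>"
df5 = pd.DataFrame({"chat": [xml, xml, xml]})

calls = 0  # counts how many times msgg runs


def msgg(row):
    """Collect the text of every <body> element in one chat transcript."""
    global calls
    calls += 1
    root = ET.fromstring(row)
    return [body.text for body in root.findall("messages/chat-message/body")]


# Pattern from the question: the full apply is repeated on every iteration.
for row in df5["chat"]:
    df5["test"] = df5["chat"].apply(msgg)
looped_calls = calls

# Single call without the loop: each row is parsed once.
calls = 0
df5["test"] = df5["chat"].apply(msgg)
single_calls = calls

print(looped_calls, single_calls)
```

If this reading is right, the looped version parses rows-squared transcripts while the single call parses each transcript once, which would explain why the result looks complete after one minute: the first pass has already filled the column.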
I tried this:
for index,row in df5.iterrows():
    row['test'] = row['chat'].apply(msgg)
and got:
AttributeError: 'str' object has no attribute 'apply'
I tried this:
for index,row in df5.itertuples():
    row['test'][index] = row['chat'][index].apply(msgg)
and got:
ValueError: too many values to unpack (expected 2)
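Both errors seem to follow from what the two iterators yield. The sketch below assumes a two-row frame with "chat" and "test" columns (hypothetical data) and only inspects the yielded objects: iterrows() gives (index, Series) pairs whose cells are plain strings, and itertuples() gives one namedtuple per row with an Index field plus one field per column.

```python
import pandas as pd

# Hypothetical two-row frame with the same column names as the question.
df5 = pd.DataFrame({"chat": ["<chat/>", "<chat/>"], "test": [None, None]})

# iterrows() yields (index, Series); a single cell is a plain Python str,
# and str has no .apply method, hence the AttributeError.
index, row = next(df5.iterrows())
cell_type = type(row["chat"]).__name__

# itertuples() yields a namedtuple per row: Index plus one field per column.
first = next(df5.itertuples())
fields = first._fields

# Unpacking that namedtuple into two names fails once it has more than
# two fields, hence the ValueError.
try:
    for index, row in df5.itertuples():
        pass
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(cell_type, fields, unpack_error)
```

In both attempts the function would have to be called on the cell directly, as in msgg(row['chat']), rather than through .apply, although a single column-wide apply avoids the explicit loop altogether.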
This is what I was looking for:
import xml.etree.ElementTree as ET
import lxml.etree as et2
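The working code is not reproduced here beyond its imports, so the following is only a guess at what a loop-free version might look like, using the standard-library parser alone. The function name join_bodies, the whitespace collapsing, and the sample transcript (including the "customer" usertype) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def join_bodies(chat_xml):
    """Return the text of every <body> in one chat, joined into one string."""
    root = ET.fromstring(chat_xml)
    texts = []
    for body in root.findall("messages/chat-message/body"):
        if body.text:  # skip empty <body/> elements
            # Collapse the line breaks inside a message into single spaces.
            texts.append(" ".join(body.text.split()))
    return " ".join(texts)


# Hypothetical transcript shaped like the sample in the question.
sample = """<chat><messages>
<chat-message><body>Hello Mikey,
welcome.</body><usertype>system</usertype></chat-message>
<chat-message><body>Hi there</body><usertype>customer</usertype></chat-message>
</messages></chat>"""

joined = join_bodies(sample)
print(joined)
```

On the real data this would presumably be applied once, with no surrounding loop: df5['test'] = df5['chat'].apply(join_bodies).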
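Given how simple the answer turned out to be, I suppose I did a poor job of asking the question, since nobody answered. Thanks to everyone who helped.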