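So I have multiple embed codes in this csv file, and I want to read the URLs and copy their title tags into a new csv file.
I was able to do that, but the formatting of the output is a bit unexpected.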
from bs4 import BeautifulSoup
import requests
import csv

with open('test.csv','r') as f:
    csv_raw = f.read()
with open('newtest.csv','w') as ff:
    cw = csv.writer(ff)
    split_csv=csv_raw.split('\n')
    #split_csv.remove('')
    separator=","
    for each in split_csv:
        url_row_index=0
        url = each.split(separator)[url_row_index]
        url_delete = '<iframe width="560" height="315" src="' #delete extra texts
        url_delete2 = '" frameborder="0" allowfullscreen></iframe>' #delete extra texts
        url2 = url.replace(url_delete,'')
        url3 = url2.replace(url_delete2,'')
        html=requests.get(url3).content
        soup=BeautifulSoup(html,'html.parser')
        namelist = soup.title.string
        word_delete = 'Video ' #delete extra wordings - Video
        word_delete2 = '.mp4 (cloned)' #delete extra wordings - .mp4 (cloned)
        namelist2 = namelist.replace(word_delete,'')
        namelist3 = namelist2.replace(word_delete2,'')
        print(namelist3)
        cw.writerow(namelist3)
So say in the original csv file, these are the embed codes:
<iframe width="560" height="315" src="https://www.fembed.com/v/2222222" frameborder="0" allowfullscreen></iframe>
<iframe width="560" height="315" src="https://www.fembed.com/v/1111111" frameborder="0" allowfullscreen></iframe>
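The title tags are:
Video 111helloworld111.mp4 (cloned)
Video 222helloworld222.mp4 (cloned)
After running the code, I am able to print these out:
111helloworld111
222helloworld222
which is how I would like to see them in the new csv file.
However, in the new csv file, they look like this: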
1,1,1,h,e,l,l,o,w,o,r,l,d,1,1,1
2,2,2,h,e,l,l,o,w,o,r,l,d,2,2,2
Something is wrong with my code, but I can't figure out what.
Any help would be much appreciated.
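Sorted. Thanks to @njzk2, it really was my mistake. In this case I should have passed namelist3 to writerow as a list rather than as a string, as @njzk2 pointed out in his reply. writerow treats its argument as a sequence of fields, so a bare string gets written one character per column.
So the answer to my question is simply to add [] to my code, i.e. cw.writerow([namelist3])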
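For anyone who wants to see the difference in isolation, here is a minimal sketch that skips the network requests and writes to an in-memory buffer instead of newtest.csv. The helper name write_titles and the sample titles are only for illustration and are not part of my original script.

```python
import csv
import io


def write_titles(titles):
    """Write each title as a one-column row and return the CSV text.

    Hypothetical helper for illustration; it is not in the original script.
    """
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    for title in titles:
        # Wrapping the string in a list makes it a single field.
        writer.writerow([title])
    return buffer.getvalue()


def write_title_as_string(title):
    """Reproduce the bug: a bare string is iterated character by character."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerow(title)
    return buffer.getvalue()


if __name__ == '__main__':
    print(write_title_as_string('111helloworld111'))
    print(write_titles(['111helloworld111', '222helloworld222']))
```

When writing to a real file, the csv module documentation recommends opening it with newline='', e.g. open('newtest.csv', 'w', newline=''), which should avoid blank rows between lines on Windows.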