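I have a folder of 100 HTML files named page1 through page100, and I am trying to build a tree from them. Each page has hyperlinks that open other pages. I want the program to read the root node (page1.html), read its hyperlinks, create child nodes from those links, and then repeat this for the remaining nodes until the tree is complete. What is the best way to work with the hyperlinks? Here is my code so far.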
import os
from os.path import isfile, join

folder = "C:/Users/deonh/Downloads/intranets/intranet1"
entries = os.listdir(folder)  # Read the directory listing
onlyfiles = [f for f in entries if isfile(join(folder, f))]  # Keep only regular files (the web pages)
print(onlyfiles)

# Open the first page to check that it can be read; "with" closes the file automatically
with open(join(folder, onlyfiles[0])) as web:
    print(web.readable())   # True if the file can be read
    print(web.readlines())  # The content of the file as a list of lines
You can use os.walk to iterate over the directory and all of its subdirectories.
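The code that accompanied this answer is not available here, so the following is only a minimal sketch of what an os.walk traversal might look like. The function name find_html_files is hypothetical, and the sketch assumes the pages can be recognized by a .html extension.

```python
import os


def find_html_files(root_dir):
    """Return the paths of all .html files under root_dir, including subdirectories."""
    html_files = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory it visits
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # Assumption: every web page has a .html extension
            if name.lower().endswith(".html"):
                html_files.append(os.path.join(dirpath, name))
    return sorted(html_files)


if __name__ == "__main__":
    for path in find_html_files("C:/Users/deonh/Downloads/intranets/intranet1"):
        print(path)
```

If all 100 pages sit in a single folder, os.listdir as used in the question is enough; os.walk only matters when the pages are spread across subfolders.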
If anything is unclear, please ask.
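On the hyperlink part of the question, one possible approach is to parse each page with the standard library's html.parser and follow the links breadth-first, starting from page1.html. This is a sketch rather than a definitive solution. The names LinkCollector, extract_links, and build_tree are hypothetical, and it assumes every link is a plain relative file name pointing into the same folder.

```python
import os
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(path):
    """Return all href values found in the HTML file at path."""
    parser = LinkCollector()
    with open(path, encoding="utf-8", errors="replace") as page:
        parser.feed(page.read())
    return parser.links


def build_tree(folder, root_name="page1.html"):
    """Follow links breadth-first from root_name; return {page: [child pages]}."""
    tree = {}
    visited = {root_name}
    queue = deque([root_name])
    while queue:
        page = queue.popleft()
        tree[page] = []
        for href in extract_links(os.path.join(folder, page)):
            # Assumption: links are plain relative file names in the same folder
            child = os.path.basename(href)
            # Skip pages already placed in the tree and links to missing files
            if child in visited or not os.path.isfile(os.path.join(folder, child)):
                continue
            visited.add(child)
            tree[page].append(child)
            queue.append(child)
    return tree
```

Each page becomes a child of the first page that links to it, so pages that link back to one another do not cause an infinite loop. Calling build_tree on the folder from the question would return a dictionary mapping each reachable page to its list of children.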