Using sys.argv to set the filename of a script's .csv output?

Posted 2024-04-20 10:17:36


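I'm trying to get this code to create a .csv file whose name is given by the second argument when I run the script from the Run dialog (Win+R). So far I've managed to get the script to accept the first argument when I type: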

winkey+r -> pyscript https://stackoverflow.com

But when I type:

winkey+r -> pyscript https://stackoverflow.com ideal_filename.csv

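nothing happens. I still get the output on my clipboard, but no new .csv file is created. If I hard-code the filename in the script it works fine, but that makes the script less useful.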
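
One way to narrow this down is to check what the script actually receives and where a relative filename would end up. The following is a minimal sketch using only the standard library; `describe_invocation` is a hypothetical helper name, not part of the original script.

```python
import os
import sys


def describe_invocation(argv):
    """Return a short report of the arguments and the working directory.

    Hypothetical debugging helper: a relative output filename is resolved
    against the current working directory, which may differ from the
    folder the script lives in.
    """
    lines = [f"argument count: {len(argv)}"]
    for index, value in enumerate(argv):
        lines.append(f"argv[{index}] = {value!r}")
    lines.append(f"working directory: {os.getcwd()}")
    if len(argv) > 2:
        # Where a file opened with this name would actually be created.
        lines.append(f"output would be written to: {os.path.abspath(argv[2])}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_invocation(sys.argv))
```

If the report lists only two entries, the filename never reached the script. If it lists three, the file is probably being written to the working directory shown in the report rather than the folder being checked; a process started from the Run dialog often does not start in the script's own folder.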

I really don't know what to do here. I only started learning recently, and I've had some help getting this far.

Full code below:


import bs4 as bs
import urllib.request
import requests
from requests_html import HTMLSession
import pyperclip
import sys
import pandas as pd

# Command-line arguments: argv[1] is the URL, argv[2] is the output filename
url = sys.argv[1]
docTitle = sys.argv[2]

try:
    session = HTMLSession()
    response = session.get(url)

except requests.exceptions.RequestException as error:
    print(error)
    sys.exit(1)  # `response` is undefined past this point, so stop here

def crawl(url, docTitle=docTitle):
    source = urllib.request.urlopen(url).read()
    soup = bs.BeautifulSoup(source,'lxml')
    csv_from_soup(soup, output_filename=docTitle)
    
def csv_from_soup(soup, output_filename, print_to_console=True):
    title = soup.find('title')
    desc = soup.find_all(attrs={"name": "description"})
    h1Tag = soup.find_all('h1')[0].text.strip()
    metadata = {
        'Canonical' : response.html.xpath("//link[@rel='canonical']/@href"),
        'Page Title': title.string,
        'PT Length': len(title.string),
        'Meta Description': desc[0]['content'],
        'MD Length': len(desc[0]['content']),
        'H1 Tag': h1Tag,
    }
    metadata_strings = ["\n".join([str(k), str(v)]) for k,v in metadata.items()]
    metadata_strings = '\n--------------\n'.join(metadata_strings)
    tag_names = ["h2", "h3", "h4", "h5", "h6"]
    tag_data = [(tags.name + ' ',' ' + tags.text.strip()) for tags in soup.find_all(tag_names)]
    tag_df = pd.DataFrame(tag_data, columns=["H1-H6 Tags", " Text"])
    full_csv = "\n\n".join([metadata_strings, tag_df.to_csv(index=False)])
    pyperclip.copy(full_csv)
    if print_to_console:
        print(full_csv)
    with open(output_filename, "wb") as f:
        f.write(full_csv.encode('utf-8', errors='replace'))
    return full_csv

crawl(url)
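
A minimal sketch of one alternative for reading the two arguments, assuming the standard-library `argparse` module is acceptable here. `parse_args` and `resolve_output_path` are hypothetical names, the `output.csv` fallback is an assumption, and anchoring the file to the user's home directory is also an assumption rather than something the original script does.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the URL and the output filename from the command line."""
    parser = argparse.ArgumentParser(description="Export page metadata to a .csv file")
    parser.add_argument("url", help="page to crawl")
    parser.add_argument(
        "output",
        nargs="?",
        default="output.csv",  # assumed fallback name when no filename is given
        help="name of the .csv file to create",
    )
    return parser.parse_args(argv)


def resolve_output_path(name, base_dir=None):
    """Turn a bare filename into an absolute path under base_dir.

    base_dir defaults to the user's home directory (an assumption), so the
    result does not depend on the working directory the script starts in.
    """
    path = Path(name)
    if path.is_absolute():
        return path
    base = Path(base_dir) if base_dir is not None else Path.home()
    return base / path


# Example with explicit arguments; pass no list to read sys.argv instead.
args = parse_args(["https://stackoverflow.com", "ideal_filename.csv"])
print(resolve_output_path(args.output))
```

Passing the resolved path to `csv_from_soup` as `output_filename` would make the location of the file predictable regardless of how the script is launched.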
