How to write a CSV in Python when the number of values varies

Posted 2024-05-19 02:26:55


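I need to save the scraped values to a .csv file, but the number of values differs from product to product. I can't work out how to do this correctly so that each value is written under its own parameter (column), as shown in the attached file. Please advise.

I am also attaching a file to make it easier to understand what I need.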

from bs4 import BeautifulSoup
import requests
import time
HOST = 'https://samara.vseinstrumenti.ru'
URL = 'https://samara.vseinstrumenti.ru/santehnika/vse-dlya-vodosnabzheniya/avtonomnaya-kanalizatsiya/'
HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3'}


def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r

def get_url(html):
    soup = BeautifulSoup(html, 'html.parser')
    urls = soup.find_all('div',class_='product-tile grid-item')
    for item in urls: 
        time.sleep(5)
        data_collection(HOST + item.find(class_='title').find('a').get('href'))

def get_name(html):
    soup = BeautifulSoup(html, 'html.parser')
    name = soup.find('h1',class_='title').text
    return name

def get_description(html):
    soup = BeautifulSoup(html, 'html.parser')
    description = soup.find('div',itemprop="description").text
    return description

def get_specifications_parameter(html):
    soup = BeautifulSoup(html, 'html.parser')
    dotted_list = soup.find('ul',class_='dotted-list')
    parameters = dotted_list.find_all('span',class_='text')
    return parameters

def get_specifications_meaning(html):
    soup = BeautifulSoup(html, 'html.parser')
    dotted_list = soup.find('ul',class_='dotted-list')
    meaning = dotted_list.find_all('span',class_='value')
    return meaning

def get_photo(html):
    soup = BeautifulSoup(html, 'html.parser')
    photo = soup.find('div',class_="item -active").find('img').get('src')
    return photo

def get_price(html):
    soup = BeautifulSoup(html, 'html.parser')
    price = soup.find('span',class_='current-price').text
    return price

def data_collection(URL):
    html = get_html(URL)
    name = get_name(html.text)  
    description = get_description(html.text)
    specifications_parameter = get_specifications_parameter(html.text)
    meaning = get_specifications_meaning(html.text)
    # photo = get_photo(html.text)
    price = get_price(html.text)
    

def start():
    html = get_html(URL)
    if html.status_code == 200:
        get_url(html.text)
    else:
        print('Network error')
start()

I tried this, but it does not produce the result I need:

def save_file_walid(items, path):
    with open(path, 'w', newline='') as file:
        writer = csv.writer(file, delimiter=';')
        for item in items:
            writer.writerow(item)

https://drive.google.com/file/d/1uGoW1kpsDGDA-Zh7SiiCDcg9cf2lHQUd/view?usp=sharing


1 answer
Answered 2024-05-19 02:26:55

I would like to know more about the actual situation. First, is the output path correct? The path should include the file name, for example:

"/root/source path/content.csv"

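Looking at your code, at the end of the data_collection function you can add the following (path is the output file path, which you need to define yourself):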

data = [name, description, specifications_parameter, meaning, price]
save_file_walid(data, path)

to store the data in the CSV file. Then, in save_file_walid(), if you are writing a single list of data, you do not need the for loop. All you need is:

import csv

def save_file_walid(item, path):
    # 'a' appends one row per call; 'w' would truncate the file every time
    with open(path, 'a', newline='') as file:
        writer = csv.writer(file, delimiter=';')
        writer.writerow(item)
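
Note that specifications_parameter and meaning are lists whose length changes from product to product, so writing them as two cells will not put each value under its own column. Below is a minimal sketch of one way to handle that, not a definitive implementation. It assumes the parameter names and values have already been converted to plain strings (with BeautifulSoup tags, for example via get_text(strip=True)). The names build_row and save_rows, the sample products, and the file name products.csv are hypothetical. The sketch collects every column name seen across all products and lets csv.DictWriter leave missing cells empty.

```python
import csv

def build_row(name, description, parameters, values, price):
    """Merge the fixed fields with a variable number of (parameter, value) pairs."""
    row = {"name": name, "description": description, "price": price}
    # zip pairs the i-th parameter name with the i-th value
    row.update(zip(parameters, values))
    return row

def save_rows(rows, path):
    # Collect every column name that occurs in any row, in first-seen order
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="", encoding="utf-8") as file:
        # restval fills the cells of columns that a given product does not have
        writer = csv.DictWriter(file, fieldnames=fieldnames,
                                delimiter=";", restval="")
        writer.writeheader()
        writer.writerows(rows)

# Made-up sample data: the two products have different specifications
rows = [
    build_row("Tank A", "First product", ["Volume", "Weight"], ["1000 l", "40 kg"], "100"),
    build_row("Tank B", "Second product", ["Volume", "Material"], ["2000 l", "plastic"], "200"),
]
save_rows(rows, "products.csv")
```

With this approach all rows are gathered first and the file is written once at the end, because the full set of columns is only known after every product has been parsed.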

Finally, once only, somewhere in your code before any data is stored, add:

data = ["name", "description", "specifications_parameter", "meaning", "price"]
save_file_walid(data, path)

to create the file (if it has not been created yet) with the column names as its first row.
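
The "write the header only once" step can also be folded into the saving function itself. The following is a rough sketch under stated assumptions: append_row and COLUMNS are hypothetical names, and the column list simply mirrors the one above. It opens the file in append mode, so rows from earlier calls are kept, and writes the header only when the file is missing or still empty.

```python
import csv
import os

# Assumed column names, copied from the header list in the answer
COLUMNS = ["name", "description", "specifications_parameter", "meaning", "price"]

def append_row(item, path):
    # The header is needed only when the file does not exist or is empty
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # Mode 'a' appends; mode 'w' would truncate the file on every call
    with open(path, "a", newline="", encoding="utf-8") as file:
        writer = csv.writer(file, delimiter=";")
        if need_header:
            writer.writerow(COLUMNS)
        writer.writerow(item)
```

Calling append_row(data, path) once per product then produces one header row followed by one row per product.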

Hope this helps.
