Capture all csv files in all subfolders of a main directory (Python 3.x)

Posted 2024-04-26 08:04:51


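The code below splits csv files according to a given time value. The problem is that it does not pick up all of the csv files. For example, the TT1 folder contains several subfolders, those subfolders contain further folders, and the csv files sit inside those. When I set the path to path = '/root/Desktop/TT1/', the files in those nested folders are not processed. How can I fix this?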

Following @Serafeim's answer (https://stackoverflow.com/a/57110519/5025009), I tried this:

import pandas as pd
import numpy as np
import glob
import os

path = '/root/Desktop/TT1/'
mystep = 0.4

#define the function
def data_splitter(df, name):
    max_time = df['Time'].max() # get max value of Time for the current csv file (df)
    myrange= np.arange(0, max_time, mystep) # build the threshold range
    for k in range(len(myrange)):
        # build the upper values 
        temp = df[(df['Time'] >= myrange[k]) & (df['Time'] < myrange[k] + mystep)]
        temp.to_csv("/root/Desktop/T1/{}_{}.csv".format(name, k))

for filename in glob.glob(os.path.join(path, '*.csv')):
    df = pd.read_csv(filename)
    name = os.path.split(filename)[1] # get the name of the file
    data_splitter(df, name)
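
The behaviour described above follows from the pattern itself: '*.csv' joined to path only matches files that sit directly inside that folder. A minimal sketch of the difference, run on a throwaway temporary tree (the folder and file names top.csv, sub1, sub2 and deep.csv are made up for illustration and are not from the question):

```python
import glob
import os
import tempfile

# Build a throwaway tree: top.csv at the top level, deep.csv two levels down.
# All names here are hypothetical and only serve the illustration.
base = tempfile.mkdtemp()
nested = os.path.join(base, 'sub1', 'sub2')
os.makedirs(nested)
for target in (os.path.join(base, 'top.csv'), os.path.join(nested, 'deep.csv')):
    with open(target, 'w') as fh:
        fh.write('Time\n0.1\n')

# Pattern used in the question: only files directly inside base match.
flat = glob.glob(os.path.join(base, '*.csv'))

# '**' combined with recursive=True also descends into every subfolder.
deep = glob.glob(os.path.join(base, '**', '*.csv'), recursive=True)

flat_names = sorted(os.path.basename(p) for p in flat)
deep_names = sorted(os.path.basename(p) for p in deep)
print(flat_names)
print(deep_names)
```

With the recursive pattern, the loop in the question could stay as it is and only the glob call would change.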

1 answer
User
Answer 1 · posted 2024-04-26 08:04:51

You can collect all the subfolders automatically and change the path for each one, using os.walk on the main path:

import pandas as pd
import numpy as np
import glob
import os

path = '/root/Desktop/TT1/'
mystep = 0.4

#define the function
def data_splitter(df, name):
    max_time = df['Time'].max() # get max value of Time for the current csv file (df)
    myrange= np.arange(0, max_time, mystep) # build the threshold range
    for k in range(len(myrange)):
        # build the upper values 
        temp = df[(df['Time'] >= myrange[k]) & (df['Time'] < myrange[k] + mystep)]
        temp.to_csv("/root/Desktop/T1/{}_{}.csv".format(name, k))

# use os.walk(path) on the main path to get ALL subfolders inside path
for root,dirs,_ in os.walk(path):
    for d in dirs:
        path_sub = os.path.join(root,d) # this is the current subfolder
        for filename in glob.glob(os.path.join(path_sub, '*.csv')):
            df = pd.read_csv(filename)
            name = os.path.split(filename)[1] # get the name of the current csv file
            data_splitter(df, name)
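
Two things to watch with the loop above: it only globs inside the entries of dirs, so csv files sitting directly in path are not picked up, and two files with the same name in different subfolders are written to the same output name under /root/Desktop/T1/. One possible variant, offered as a sketch rather than as part of the original answer, uses pathlib's rglob to find csv files at every depth (top level included) and folds the relative folder path into the name. The helper find_csv_files and the '__' separator are hypothetical choices made for this example.

```python
from pathlib import Path


def find_csv_files(main_dir):
    """Yield (csv_path, unique_name) for every csv file under main_dir.

    Hypothetical helper: unique_name is the path relative to main_dir,
    without the .csv suffix, with its parts joined by '__'. Files that share
    a file name but live in different subfolders therefore get different names.
    """
    main_dir = Path(main_dir)
    for csv_path in sorted(main_dir.rglob('*.csv')):
        if not csv_path.is_file():
            continue  # skip directories that happen to end in .csv
        relative = csv_path.relative_to(main_dir).with_suffix('')
        yield csv_path, '__'.join(relative.parts)


# Possible use with data_splitter from the answer (requires pandas):
# for csv_path, name in find_csv_files(path):
#     data_splitter(pd.read_csv(csv_path), name)
```

Because the suffix is dropped here, the output files would be named like sub1__sub2__file_0.csv instead of file.csv_0.csv.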
