目标:创建一列:1。显示三个最常见的条目,或2:在源excel文件中显示列的三个随机离散条目。我知道那是一大堆话,让我来示范一下。在下面的示例中,我很难编写代码来生成下面第二个表(示例值列)最右边的列中的内容
给定此源excel文件(NBA.xlsx):
我想:
我将在下面提供代码。很长,但我想让你知道我在做什么。我想我只需要在最后两部分中添加一些代码片段,用#符号表示:
### Setting up
import pandas as pd
import os
import glob
###Setting working directory
path = os.getcws()
files = os.listdir(path)
### Prep to get all files
from os import listdir
from os.path import isfile, join
### Reading only excel files in folder
FileList_xlsx = [f for f in files if f[-4:] == "xlsx"]
# Initializing empty data frame
df = pd.DataFrame()
# Initializing Aggregate List for Column Names
Agg_ColumnNames = []
# Initializing Aggregate List for File Names associated with Column Names
FileNames = []
# Initializing empty dataframe for Data Type
datatype_df = pd.DataFrame()
# Read excel into Python and prepping the dataframe for DATA TYPE ONLY
n = 1
for f in FileList_xlsx:
test_df = pd.read_excel(f)
test_df = test_df.add_prefix(f'File{n}.')
datatype_df = datatype_df.append(test_df, ignore_index=True, sort = False)
n+=1
# Getting the data type of the column
dataTypeSeries = datatype_df.dtypes
# Loop over list of Excel files
for f in FileList_xlsx:
ReadXlsx = pd.read_excel(f)
ColumnNames = list(ReadXlsx.columns.values)
for a in ColumnNames:
Agg_ColumnNames.append(a)
for a in ColumnNames:
FileNames.append(f)
# Create final dataframe
final = {'Column_Name': Agg_ColumnNames, 'File_Name': FileNames, 'Data_Type': dataTypeSeries}
目前没有回答
相关问题 更多 >
编程相关推荐