在除第一个之外的所有XLS文件中删除表头后追加文件
我有多个扩展名为 .xls 的Excel文件,需要把它们垂直合并在一起。每个文件都有表头,但我只想保留第一个文件的表头,其他文件的表头都想去掉。这些文件的名字是 SWEdtd01MAY14NUM1.xls 到 SWEdtd01MAY14NUMn.xls,其中 n 是根据文件数量而变化的。
请问可以通过KSH、Python或SAS来实现吗?
2 个回答
0
你可以完全用Python来实现这个,使用pandas这个库。
1
下面是一个用Python和Pandas库来实现的例子
可能需要调整一些细节,以满足具体需求,但基本思路已经在这里了
# Script to concatenate a bunch of Excel files with
# Python and Pandas
#
# Remember that indexing starts with 0 in Python,
# whereas indexing starts with 1 in Excel
import pandas as pd
# Number of files to process
n = 10
# Excel sheetname
sheetname = 'sheet1'
# Number of row to skip in each file
skiprows=3
# Header line that will be kept for column name (index 5 in Excel)
header=4
# Column containing the index for each row. Leave it to None if no index
index_col=0
# First file to process
f = 'SWEdtd01MAY14NUM1.xls'
DF = pd.read_excel(f, sheetname, skiprows = skiprows,header = header, index_col = index_col)
# Concatenate the content of other file to this dataframe
for i in range(2,n+1)
f = 'SWEdtd01MAY14NUM'+str(i)+'.xls'
df = pd.read_excel(f, sheetname, skiprows = skiprows, header = header, index_col = index_col)
DF.append(df, ignore_index=True)
# Write the concatenated content to excel
DF.to_excel('SWEdtd01MAY14NUM.xls',sheet_name = sheetname)