Python与2个电子表格选项卡的连接

2024-06-11 23:06:38 发布

您现在位置:Python中文网/ 问答频道 /正文

我需要更新现有的Python连接脚本,该脚本当前使用列映射(字典)连接30+MS Excel电子表格。目前,脚本仅使用1个选项卡连接,该选项卡在pd.read\u excel语句中命名。我需要保持列映射的灵活性,并将另一个选项卡中的一个附加列引入到整体连接中,因此我想我需要解析数据帧中的表。我意识到这可能很简单,但我还是很紧张。现有Excel选项卡名为“贷款级别详细信息”,我需要从另一个选项卡(S214每日付款)中引入另一列(付款日期)。当前代码发布在下面。提前感谢您提供的任何帮助


import os, pandas as pd

os.chdir(r'M:\Operations\ABC Database\Accounting FMV Recon\Monthly Report Templates Input\Servicer Reports\201908\XYZ')
df_list = []

col_map = {'XYZ Msp Bank': 'XYZMspBank',
           'XYZ Loan No' : 'XYZLoanNo',
           'Prior Loan' : 'AOLoanNo',
           'Borrowers Name' : 'BorrName',
           'MBA Delinquency Status' : 'MBADelqStatus',
           'Note Prin' : 'NotePrincipal',
           'Interest Rate' : 'IntRate',
           'Current P&I' : 'CurrentPI',
           'Beginning UPB' : 'BegUPB',
           'Ending UPB' : 'EndUPB',
           'Ending Deferred Principal' : 'EndDefPrincipal',
           'Boarded Principal' : 'BoardedPrincipal',
           'UPB Transfer In' : 'UPBTransferIn',
           'UPB Transfer Out' : 'UPBTransferOut',
           'Non-Monetary Adjustments' : 'NonMonetaryAdj',
           'Next Payment' : 'NextPayment',
           'S215 Principal Collected' : 'S215PrinCollected',
           'S215 Interest Collected' : 'S215IntCollected',
           'S213 Curtailment Collected' : 'S213CurtCollected',
           'S214 PIF Principal Collected' : 'S214PIFPrinCollected',
           'S214 PIF Servicing Fee Collected' : 'S214PIFServicingFeeCollected',
           'S214 PIF Interest Collected' : 'S214PIFIntCollected',
           'S214 PIF Prepmt Penalty Collected' : 'S214PIFPrepmtPenaltyCollected',
           'S214 Daily Payoffs (Non Securitized)' : 'S214DailyPayoffsNonSec',
           'S214 Daily Payoffs (Securitized)' : 'S214DailyPayoffsSec',
           'Total Remittance' : 'TotalRemittance'
           }

for f in os.listdir():

    temp = pd.read_excel(f, sheet_name='Loan Level Detail')
    temp = temp[temp['ZYZ Loan No'].notnull()]
    temp = temp[col_map.keys()]
    temp.rename(columns=col_map, inplace=True)
    temp['SourceFile'] = f
    df_list.append(temp)

df = pd.concat(df_list, ignore_index=True)
df.drop_duplicates(inplace=True)
df['ReportDate'] = 201908
df.to_excel(r'M:\Operations\ABC Database\Accounting FMV Recon\Monthly Report Templates Output\Servicer Reports\XYZ Concatenated 201908.xlsx', index=False)

Tags: 脚本principaldfosexceltemp选项卡list
1条回答
网友
1楼 · 发布于 2024-06-11 23:06:38

大概是未经测试的伪代码(因为您没有要复制的数据)

循环内

# existing code
temp = pd.read_excel(f, sheet_name='Loan Level Detail')
temp = temp[temp['ZYZ Loan No'].notnull()]
temp = temp[col_map.keys()]
temp.rename(columns=col_map, inplace=True)
temp['SourceFile'] = f

# Following the same approach for a second sheet
col_map2 = {
       'XYZ Loan No' : 'XYZLoanNo',
       'Payoff Date' : 'PayoffDate'}
temp2 = pd.read_excel(f, sheet_name='S214 Daily Payoffs')
temp2 = temp2[temp2['ZYZ Loan No'].notnull()]
temp2 = temp2[col_map2.keys]
temp2.rename(columns=col_map2, inplace=True)


# assume that 'ZYZ Loan No' appears in both sheets, so join
temp3 = temp.merge(temp2, how='left', on='ZYZ Loan No')

df_list.append(temp3)

相关问题 更多 >