Optimizing code written with Python and BAPI

from pyrfc import Connection, ABAPApplicationError, ABAPRuntimeError, LogonError, CommunicationError
from configparser import ConfigParser
from pprint import PrettyPrinter
import openpyxl

ASHOST = '***'
CLIENT = '***'
SYSNR = '***'
USER = '***'
PASSWD = '***'
conn = Connection(ashost=ASHOST, sysnr=SYSNR, client=CLIENT, user=USER, passwd=PASSWD)

try:
    wb = openpyxl.load_workbook('new2.xlsx')
    ws = wb['Sheet1']
    for i in range(1, len(ws['A']) + 1):
        x = ws['A' + str(i)].value
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        pp = PrettyPrinter(indent=4)
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE',
                               QUERY_TABLE='USR02',
                               OPTIONS=options,
                               FIELDS=fields,
                               ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            a = data_result[line]["WA"].strip()
            wb = openpyxl.load_workbook('new2.xlsx')
            ws = wb['Sheet1']
            ws['B' + str(i)].value = a
            wb.save('new2.xlsx')
except CommunicationError:
    print("Could not connect to server.")
    raise
except LogonError:
    print("Could not log in. Wrong credentials?")
    raise
except (ABAPApplicationError, ABAPRuntimeError):
    print("An error occurred.")
    raise

try:
    output_list = []
    wb = openpyxl.load_workbook('new3.xlsx')
    ws = wb['Sheet1']
    col = ws['A']
    col_lis = [col[x].value for x in range(len(col))]
    length = len(col_lis)
    for i in range(length):
        print("--- %s seconds Start of the loop ---" % (time.time() - start_time))
        x = col_lis[i]
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE', QUERY_TABLE='USR02', OPTIONS=options, FIELDS=fields, ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        print("--- %s seconds in SAP ---" % (time.time() - start_time))
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            a = data_result[line]["WA"]
            output_list.append(a)
    print(output_list)

2 Answers

User

Reply #1 · edited 2024-05-01 21:32:48

First, I placed timing markers at different points in the code and split it into functional parts (SAP processing, Excel processing).
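Timing markers of this kind might look like the following minimal sketch. It assumes a small helper named timed (hypothetical, not part of the original code) that accumulates wall-clock time per section; the time.sleep calls only stand in for the SAP call and the Excel write.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named section.
totals = defaultdict(float)


@contextmanager
def timed(section):
    """Add the time spent inside the with-block to totals[section]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[section] += time.perf_counter() - start


for _ in range(3):
    with timed("sap"):
        time.sleep(0.01)  # placeholder for conn.call('RFC_READ_TABLE', ...)
    with timed("excel"):
        time.sleep(0.02)  # placeholder for the openpyxl write and save

for section, seconds in totals.items():
    print(f"{section}: {seconds:.3f} s")
```

Summing per section instead of printing a timestamp on every iteration keeps the output short when the loop runs over many rows.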

Analyzing the timings, I found that most of the runtime is consumed by the code that writes to Excel. Consider the intervals:

16:52:37.306272 
16:52:37.405006 moment it was fetched from SAP
16:52:37.552611 moment it was pushed to Excel
16:52:37.558631 
16:52:37.634395 moment it was fetched from SAP
16:52:37.796002 moment it was pushed to Excel
16:52:37.806930
16:52:37.883724 moment it was fetched from SAP
16:52:38.060254 moment it was pushed to Excel
16:52:38.067235 
16:52:38.148098 moment it was fetched from SAP
16:52:38.293669 moment it was pushed to Excel
16:52:38.304640 
16:52:38.374453 moment it was fetched from SAP
16:52:38.535054 moment it was pushed to Excel
16:52:38.542004 
16:52:38.618800 moment it was fetched from SAP
16:52:38.782363 moment it was pushed to Excel
16:52:38.792336 
16:52:38.873119 moment it was fetched from SAP
16:52:39.034687 moment it was pushed to Excel
16:52:39.040712
16:52:39.114517 moment it was fetched from SAP
16:52:39.264716 moment it was pushed to Excel
16:52:39.275649 
16:52:39.346005 moment it was fetched from SAP
16:52:39.523721 moment it was pushed to Excel
16:52:39.530741  
16:52:39.610487 moment it was fetched from SAP
16:52:39.760086 moment it was pushed to Excel
16:52:39.771057   
16:52:39.839873 moment it was fetched from SAP
16:52:40.024574 moment it was pushed to Excel

As you can see, the Excel-writing part takes about twice as long as the SAP query part.
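The factor of two can be checked directly from the log. The sketch below assumes that each unlabeled timestamp marks the start of a loop iteration, so that the gap up to "fetched from SAP" is the query time and the gap up to "pushed to Excel" is the write time; that reading is an interpretation of the log, not something the log states.

```python
from datetime import datetime

# (loop start, fetched from SAP, pushed to Excel), copied from the log above.
STAMPS = [
    ("16:52:37.306272", "16:52:37.405006", "16:52:37.552611"),
    ("16:52:37.558631", "16:52:37.634395", "16:52:37.796002"),
    ("16:52:37.806930", "16:52:37.883724", "16:52:38.060254"),
    ("16:52:38.067235", "16:52:38.148098", "16:52:38.293669"),
    ("16:52:38.304640", "16:52:38.374453", "16:52:38.535054"),
    ("16:52:38.542004", "16:52:38.618800", "16:52:38.782363"),
    ("16:52:38.792336", "16:52:38.873119", "16:52:39.034687"),
    ("16:52:39.040712", "16:52:39.114517", "16:52:39.264716"),
    ("16:52:39.275649", "16:52:39.346005", "16:52:39.523721"),
    ("16:52:39.530741", "16:52:39.610487", "16:52:39.760086"),
    ("16:52:39.771057", "16:52:39.839873", "16:52:40.024574"),
]


def seconds_between(earlier, later):
    """Return the gap in seconds between two HH:MM:SS.ffffff strings."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()


sap_total = sum(seconds_between(start, sap) for start, sap, _ in STAMPS)
excel_total = sum(seconds_between(sap, excel) for _, sap, excel in STAMPS)
ratio = excel_total / sap_total

print(f"SAP query:   {sap_total:.3f} s over {len(STAMPS)} iterations")
print(f"Excel write: {excel_total:.3f} s over {len(STAMPS)} iterations")
print(f"Excel / SAP: {ratio:.2f}")
```

On these eleven samples the Excel step averages roughly 0.16 s per iteration against roughly 0.08 s for the SAP call.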

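The mistake in the code is opening and initializing the workbook and worksheet on every loop iteration. This slows execution considerably and is redundant, because you can reuse the workbook variable from the top of the script.

Another redundant step is the .strip() call, which removes leading and trailing whitespace; it is unnecessary here because Excel does this automatically for string data.

Here is a variant of the code: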

from datetime import datetime

try:
    # Open the workbook and worksheet once, outside the loop, and reuse them.
    wb = openpyxl.load_workbook('new2.xlsx')
    ws = wb['Sheet1']
    print(datetime.now().time())  # start timestamp
    for i in range(1, len(ws['A']) + 1):
        x = ws['A' + str(i)].value
        options = [{'TEXT': "BNAME = '" + x + "'"}]
        fields = [{'FIELDNAME': 'CLASS'}, {'FIELDNAME': 'USTYP'}]
        ROWS_AT_A_TIME = 10
        rowskips = 0
        while True:
            result = conn.call('RFC_READ_TABLE', QUERY_TABLE='USR02', OPTIONS=options, FIELDS=fields, ROWSKIPS=rowskips, ROWCOUNT=ROWS_AT_A_TIME)
            rowskips += ROWS_AT_A_TIME
            if len(result['DATA']) < ROWS_AT_A_TIME:
                break
        data_result = result['DATA']
        length_result = len(data_result)
        for line in range(0, length_result):
            # No .strip() here; the raw WA string goes into column B.
            ws['B' + str(i)].value = data_result[line]["WA"]
        wb.save('new2.xlsx')  # still saved once per input row
    print(datetime.now().time())  # end timestamp
except ...

This gives me the following timestamps for the program run:

>>> exec(open('RFC_READ_TABLE.py').read())
18:14:03.003174
18:16:29.014373

1000 user records take about 2.5 minutes, which seems a fair price for this kind of processing.

User

Reply #2 · edited 2024-05-01 21:32:48

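In my opinion, the problem is the while-True loop. I think you need to optimize the query logic (or change it). It is hard to say more without knowing what you want from the DB; everything else looks simple and fast.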
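One way to change the query logic is to ask for many users per RFC call instead of one. The sketch below is only an illustration under several assumptions that should be verified on the target system: that BNAME identifies at most one USR02 row per client (so paging with ROWSKIPS is not needed), that each OPTIONS row may hold up to 72 characters, that further conditions can be supplied as separate rows starting with OR, and that the DELIMITER parameter separates the returned fields. The helper names (chunked, build_options, fetch_users) and fake_call are hypothetical; fake_call merely imitates conn.call so the sketch runs without an SAP connection.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_options(usernames):
    """Build OPTIONS rows of the form BNAME = 'A' / OR BNAME = 'B' / ..."""
    rows = []
    for index, name in enumerate(usernames):
        literal = name.replace("'", "''")  # assumption: quotes escape by doubling
        prefix = "OR " if index else ""
        text = f"{prefix}BNAME = '{literal}'"
        if len(text) > 72:  # assumed width of one OPTIONS row
            raise ValueError(f"condition too long for one OPTIONS row: {text}")
        rows.append({"TEXT": text})
    return rows


def fetch_users(call, usernames, chunk_size=50):
    """Return {BNAME: (CLASS, USTYP)} using one RFC call per chunk of names."""
    fields = [{"FIELDNAME": "BNAME"}, {"FIELDNAME": "CLASS"}, {"FIELDNAME": "USTYP"}]
    found = {}
    for chunk in chunked(usernames, chunk_size):
        result = call("RFC_READ_TABLE", QUERY_TABLE="USR02", DELIMITER="|",
                      OPTIONS=build_options(chunk), FIELDS=fields)
        for row in result["DATA"]:
            bname, user_class, user_type = (part.strip() for part in row["WA"].split("|"))
            found[bname] = (user_class, user_type)
    return found


def fake_call(function_name, **params):
    """Stand-in for conn.call: echoes every requested name with dummy values."""
    names = [row["TEXT"].split("'")[1] for row in params["OPTIONS"]]
    return {"DATA": [{"WA": f"{name:<12}|{'SUPER':<12}|A"} for name in names]}


users = fetch_users(fake_call, ["ALICE", "BOB", "CAROL"], chunk_size=2)
print(users)
```

With a real connection, fetch_users(conn.call, names) would replace the per-user loop; BNAME is added to FIELDS so that each returned row can be matched back to its input name.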

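What may help is to avoid opening and closing the file over and over: try computing the "B" column first, then open the xlsx file once and paste everything in. This might help (though I am fairly sure the query is the real problem).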
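A minimal sketch of that "compute first, write once" idea follows. The function names write_column and update_workbook are hypothetical, and results is assumed to be a dict that maps each user name from column A to the value destined for column B. FakeSheet is only a stand-in so the sketch can run without openpyxl or a real file.

```python
def write_column(ws, values, column=2, first_row=1):
    """Write values top to bottom into one column of an openpyxl-style sheet."""
    for offset, value in enumerate(values):
        ws.cell(row=first_row + offset, column=column, value=value)


def update_workbook(path, results, sheet="Sheet1"):
    """Load the workbook once, fill column B from `results`, save once."""
    import openpyxl  # third-party; imported here so write_column runs without it

    wb = openpyxl.load_workbook(path)  # opened a single time
    ws = wb[sheet]
    names = [cell.value for cell in ws["A"]]
    write_column(ws, [results.get(name, "") for name in names])
    wb.save(path)  # saved a single time, after all cells are set


class FakeSheet:
    """In-memory stand-in that records cell writes for demonstration."""

    def __init__(self):
        self.cells = {}

    def cell(self, row, column, value=None):
        self.cells[(row, column)] = value


sheet = FakeSheet()
write_column(sheet, ["SUPER", "BASIC", "ADMIN"])
print(sheet.cells)
```

Calling update_workbook('new2.xlsx', results) after all SAP lookups have finished would replace the per-row load and save.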

Also, maybe you can use a timing library to work out where most of your time is spent.
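As one possible choice, the standard library's cProfile and pstats modules report cumulative time per function. The sketch below profiles a stand-in workload; query_step and excel_step are hypothetical placeholders for the RFC call and the workbook write, not functions from the original script.

```python
import cProfile
import io
import pstats
import time


def query_step():
    time.sleep(0.01)  # placeholder for the RFC_READ_TABLE call


def excel_step():
    time.sleep(0.02)  # placeholder for the workbook write and save


def workload():
    for _ in range(3):
        query_step()
        excel_step()


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Wrapping the SAP part and the Excel part of the real script in separate functions makes each show up as its own line in the report.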
