Comparing items in a list against an Excel spreadsheet, then extracting the matching data from the spreadsheet

Posted 2024-05-19 06:25:33

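I'm using Python 3.7. Apologies in advance if my code is a bit messy. This is the first project I've worked on, so I'm learning a lot as I go.

I'm trying to write a program that scans a PDF and parses it for specific expressions (using regex), then compares those results against the data in an Excel spreadsheet and identifies the matches.

Currently, the program successfully extracts the correct information from the PDF and compares it against column B of the Excel sheet to confirm that the data exists and is correct.

What I want it to do is this: whenever a particular cell in column B matches, print the data from the cell next to it in column C.

Here is my current code: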

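import pprint
import re
import sys
import time
import tkinter as tk
from tkinter import filedialog

import openpyxl
import PyPDF2

# Open file dialog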
root = tk.Tk()
root.withdraw()

file_path = filedialog.askopenfilename()

# Open DOC and extract text
pdfFile = open(file_path, 'rb')
reader = PyPDF2.PdfFileReader(pdfFile)

pageNum = str(reader.numPages)
print('Your document has ' + pageNum + ' pages' + '\n')

# Collect the text of every page, not just the last one
decCon = ''
for pN in range(reader.numPages):
    decCon += reader.getPage(pN).extractText()

#print(decCon) #to test if extracting worked.


# find the harmonised standards
# EN 000 000-0 V0.0.0, EN 00000:0000, EN 00000-0:0000
# (raw string, with the dots escaped so they only match a literal '.')
docRegex = re.compile(r'''
EN\s\d\d\d\s\d\d\d-\d\sV\d\.\d\.\d|

EN\s\d\d\d\d\d:\d\d\d\d|

EN\s\d\d\d\d\d-\d:\d\d\d\d
''', re.VERBOSE)

# extract the harmonised standards
extractedHs = docRegex.findall(decCon)

# DEBUG - to ensure it is collecting correct data
print('It contains the following standards: ' + '\n')
pprint.pprint(extractedHs)
print('\n' + '\n')

# setup progress bar
print('Scanning all ETSI standards...') 
toolbar_width = 10
sys.stdout.write("-" * toolbar_width)

for i in range(toolbar_width):
    time.sleep(0.25)
    sys.stdout.write("-")
    sys.stdout.flush()

sys.stdout.write('\n' + '\n' + 'Printing results now...' + "\n" + '\n')


# extract from etsi spreadsheet
wb = openpyxl.load_workbook('All About Standards.xlsx')
sheet = wb["ETSI Catalog"]

etsi = []
for col in sheet['B']:
    etsi.append(col.value)

#print(etsi) # DEBUG PRINT
extractedEtsi = docRegex.findall(str(etsi))

# comparison code
for item1 in extractedHs:
    for item2 in extractedEtsi:
        if item1 == item2:
            print('Standard found: ' + item2)
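
One possible way to get at the column C value, sketched under some assumptions: instead of flattening column B into a single string, keep each column B value paired with its column C neighbour and build a dictionary keyed by the standard. The names `build_lookup` and `find_matches`, the sample rows, and the sample standards are all made up for illustration. With openpyxl, the (B, C) pairs could presumably come from `sheet.iter_rows(min_col=2, max_col=3, values_only=True)` rather than the hard-coded list used here.

```python
import re

# The same three shapes as the pattern in the question, with the dots escaped.
STANDARD_RE = re.compile(r'''
    EN\s\d{3}\s\d{3}-\d\sV\d\.\d\.\d |
    EN\s\d{5}:\d{4} |
    EN\s\d{5}-\d:\d{4}
''', re.VERBOSE)


def build_lookup(rows):
    """Map each standard found in a column B value to its column C value."""
    lookup = {}
    for b_value, c_value in rows:
        if b_value is None:
            continue  # skip empty cells
        for standard in STANDARD_RE.findall(str(b_value)):
            # If a standard appears in several rows, keep the first one.
            lookup.setdefault(standard, c_value)
    return lookup


def find_matches(extracted, lookup):
    """Return (standard, column C value) for every extracted standard in the lookup."""
    return [(s, lookup[s]) for s in extracted if s in lookup]


# Made-up rows standing in for columns B and C of the sheet.
# With openpyxl they could come from:
#   sheet.iter_rows(min_col=2, max_col=3, values_only=True)
sample_rows = [
    ('Standard', 'Description'),              # header row, no match
    ('EN 123 456-1 V1.2.3', 'Description A'),
    ('EN 12345:2020', 'Description B'),
    (None, None),                             # empty row
]

# Made-up stand-in for the list extracted from the PDF (extractedHs).
extracted_hs = ['EN 12345:2020', 'EN 54321-1:2006']

for standard, description in find_matches(extracted_hs, build_lookup(sample_rows)):
    print('Standard found: ' + standard + ' -> ' + str(description))
```

The dictionary replaces the nested loop with a single lookup per extracted standard, and because each key stays tied to its own row, the neighbouring column C value is available at the moment a match is found.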

Sorry if my explanation is a bit long-winded. I'm happy to explain further or simplify it if needed.

Thanks in advance.

0 answers

No answers yet.
