How can I use Python to remove duplicate text from a merged Word document file?

Posted 2024-04-20 06:23:54

As the title says, I have a merged Word document file that I created with Aspose. The code is as follows:

import os
import asposewordscloud
import asposewordscloud.models.requests
from shutil import copyfile


# Please get your Client ID and Secret from https://dashboard.aspose.cloud.
client_id='my_id_#'
client_secret='my_secret_#'

words_api = asposewordscloud.WordsApi(client_id,client_secret)
words_api.api_client.configuration.host='https://api.aspose.cloud'


# No trailing slash here: the paths below already insert '/' between folder and file name
remoteFolder = 'Documents'
localFolder = '/mnt/c/Users/%user%/Documents'
localFileName = 'new_merged_doc.docx'
remoteFileName = 'new_merged_doc.docx'
localFileName1 = 'rainer_docs.docx'
remoteFileName1 = 'rainer_docs.docx'

# Upload both documents to cloud storage; the with-blocks close the local files afterwards
with open(localFolder + '/' + localFileName, 'rb') as f:
    words_api.upload_file(asposewordscloud.models.requests.UploadFileRequest(f, remoteFolder + '/' + remoteFileName))
with open(localFolder + '/' + localFileName1, 'rb') as f:
    words_api.upload_file(asposewordscloud.models.requests.UploadFileRequest(f, remoteFolder + '/' + remoteFileName1))

# Append rainer_docs.docx to new_merged_doc.docx, keeping the appended document's formatting
requestDocumentListDocumentEntries0 = asposewordscloud.DocumentEntry(href=remoteFolder + '/' + remoteFileName1, import_format_mode='KeepSourceFormatting')

requestDocumentListDocumentEntries = [requestDocumentListDocumentEntries0]
requestDocumentList = asposewordscloud.DocumentEntryList(document_entries=requestDocumentListDocumentEntries)
request = asposewordscloud.models.requests.AppendDocumentRequest(name=remoteFileName, 
document_list=requestDocumentList, folder=remoteFolder, dest_file_name= remoteFolder + '/' + remoteFileName)

result = words_api.append_document(request)

# Download the merged file and copy it into the local folder under a new name
request_download=asposewordscloud.models.requests.DownloadFileRequest(remoteFolder + '/' + remoteFileName)
response_download = words_api.download_file(request_download)
copyfile(response_download, localFolder + '/' +"new_merged_doc2.docx")

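The source and destination documents contain exactly the same text; only the styling differs. I specified that the source formatting should be kept. However, I don't want the duplicated text. I want the text in the destination document to be overwritten, so that only the text with the source document's styling remains.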

So far I haven't found anything in Aspose that does this, but I may have missed something. Any help would be greatly appreciated.
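
To make the desired result concrete, here is a minimal sketch of the de-duplication step on its own. It is not Aspose API: paragraphs are modelled as hypothetical `(text, style)` tuples, and the function name `drop_earlier_duplicates` is made up for illustration. It assumes the appended (source) paragraphs come after the destination paragraphs in the merged document, so for each repeated text only the last occurrence is kept.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for a document paragraph: (text, style name).
Paragraph = Tuple[str, str]


def _key(text: str) -> str:
    """Normalise whitespace so that spacing differences do not hide duplicates."""
    return " ".join(text.split())


def drop_earlier_duplicates(paragraphs: Iterable[Paragraph]) -> List[Paragraph]:
    """Keep only the last occurrence of each paragraph text, in original order.

    Assumption: in the merged document the appended (source) content follows
    the destination content, so the last occurrence carries the source style.
    Empty paragraphs are always kept because they usually act as spacing.
    """
    items = list(paragraphs)

    # First pass: remember the index of the last occurrence of each text.
    last_index: Dict[str, int] = {}
    for i, (text, _style) in enumerate(items):
        key = _key(text)
        if key:
            last_index[key] = i

    # Second pass: keep a paragraph only if it is that last occurrence.
    result: List[Paragraph] = []
    for i, (text, style) in enumerate(items):
        key = _key(text)
        if not key or last_index[key] == i:
            result.append((text, style))
    return result


if __name__ == "__main__":
    merged = [
        ("Intro", "Normal"),        # from the destination document
        ("Body text", "Normal"),
        ("Intro", "Heading 1"),     # from the appended source document
        ("Body text", "Quote"),
    ]
    for text, style in drop_earlier_duplicates(merged):
        print(style, "|", text)
```

Applying this to a real .docx would require reading and removing paragraphs with a document library, which is outside this sketch. Note also that if the two documents really contain identical text, the result is equivalent to the source document alone, so skipping the append step may be simpler.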