Merging Word documents with python-docx

Published 2024-05-29 08:25:41


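I have several Word files, each with its own content. I would like a short snippet that shows, or helps me work out, how to combine Word files into a single file using the python-docx library.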

For example, with the pywin32 library I did the following:

rng = self.doc.Range(0, 0)
for d in data:
    time.sleep(0.05)

    docstart = d.wordDoc.Content.Start
    self.word.Visible = True
    docend = d.wordDoc.Content.End - 1
    location = d.wordDoc.Range(docstart, docend).Copy()
    rng.Paste()
    rng.Collapse(0)
    rng.InsertBreak(win32.constants.wdPageBreak)

But I need to do this with the python-docx library rather than win32.client.


3 answers

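I have adapted the example above to work with the latest version of python-docx (0.8.6 at the time of writing). Note that this only copies elements (merging the styles of those elements is considerably more complicated):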

from docx import Document

files = ['file1.docx', 'file2.docx']

def combine_word_documents(files):
    merged_document = Document()

    for index, file in enumerate(files):
        sub_doc = Document(file)

        # Don't add a page break if you've reached the last file.
        if index < len(files) - 1:
            sub_doc.add_page_break()

        for element in sub_doc.element.body:
            merged_document.element.body.append(element)

    merged_document.save('merged.docx')

combine_word_documents(files)

If your needs are simple, something like this might work:

from docx import Document

source_document = Document('source.docx')
target_document = Document()

# Copy the plain text of each paragraph; formatting is not carried over.
for paragraph in source_document.paragraphs:
    text = paragraph.text
    target_document.add_paragraph(text)

There is more you can do, but this should get you started.

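It turns out that, in the general case, copying content from one Word file to another is quite complex. It involves, for example, reconciling styles in the source document that may conflict with those in the target document. So this is not a feature we are likely to add within the next year.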

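Another way to merge two documents, styles included, is the Python library docxcompose (https://pypi.org/project/docxcompose/). There is no need to define styles explicitly, or to read the document paragraph by paragraph and append it to the master document. docxcompose is used as follows: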

# Import the required packages
from docx import Document as Document_compose
from docxcompose.composer import Composer

# filename_master is the file the other document is merged into
# (placeholder path; replace it with your own)
filename_master = "file1.docx"
# filename_second_docx is the second .docx file (placeholder path)
filename_second_docx = "file2.docx"

master = Document_compose(filename_master)
composer = Composer(master)
doc2 = Document_compose(filename_second_docx)
# Append doc2 to the master document
composer.append(doc2)
# Save the combined document under a new name
composer.save("combined.docx")

To merge several documents into one .docx file, you can use the following function:


# Uses Composer and Document_compose as imported above.
# filename_master is the file all other documents are merged into.
# files_list is a list of the .docx filenames to be merged.
def combine_all_docx(filename_master, files_list):
    master = Document_compose(filename_master)
    composer = Composer(master)
    for filename in files_list:
        composer.append(Document_compose(filename))
    composer.save("combined_file.docx")

# For example:
# filename_master = "file1.docx"
# files_list = ["file2.docx", "file3.docx", "file4.docx", "file5.docx"]
# combine_all_docx(filename_master, files_list)
# This appends every document in files_list to file1.docx and saves
# the merged result as combined_file.docx.
