python-pptx: extracting text from slide titles

from pptx import Presentation

prs = Presentation(filepath)  # load the presentation
slide_titles = []  # container for slide titles
for slide in prs.slides:  # iterate over each slide
    title_shape = slide.shapes[0]  # treat the shape at index 0 as the title
    if title_shape.has_text_frame:  # only shapes with a text frame carry text
        # strip surrounding punctuation and whitespace, then append '. '
        title = title_shape.text.strip(""" !@#$%^&*)(_-+=}{][:;<,>.?"'/<,""") + '. '
        # add the title only if it is not already in slide_titles
        if title not in slide_titles:
            slide_titles.append(title)

2 Answers

User

#1 · edited 2024-06-02 06:35:59

How to extract all text from the .pptx files in a directory (from this blog):

from pptx import Presentation
import glob

for eachfile in glob.glob("*.pptx"):
    prs = Presentation(eachfile)
    print(eachfile)
    print("           ")
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                print(shape.text)

User

#2 · edited 2024-06-02 06:35:59

Slide.shapes (a SlideShapes object) has a .title property that returns the title shape when there is one (which is usually the case), or None if the slide has no title.
http://python-pptx.readthedocs.io/en/latest/api/shapes.html#slideshapes-objects

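This is the preferred way to access the title shape.

Note that not all slides have a title shape, so you must test for a None result to avoid an error in that case.

Also note that users sometimes use a different shape for the title, for example by adding a separate new text box. So there is no guarantee that you get the text that "appears" to be the title on the slide. You will, however, get the text that PowerPoint itself considers the title, for example the text shown as that slide's title in outline view.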
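
As a minimal sketch of the approach described in this answer, the snippet below reads slide.shapes.title and skips slides where it is None. The helper names collect_slide_titles and titles_from_file are hypothetical, and plain whitespace stripping stands in here for the punctuation stripping used in the question, so adjust that part to taste.

```python
def collect_slide_titles(slides):
    """Return the unique, non-empty title texts of the given slides, in order."""
    titles = []
    for slide in slides:
        title_shape = slide.shapes.title  # None when the slide has no title shape
        if title_shape is None:
            continue  # nothing to read on this slide
        text = title_shape.text.strip()  # assumption: only whitespace is trimmed
        if text and text not in titles:
            titles.append(text)
    return titles


def titles_from_file(path):
    """Hypothetical convenience wrapper: open a .pptx file and collect its titles."""
    # Imported here so collect_slide_titles can be used without python-pptx installed.
    from pptx import Presentation

    return collect_slide_titles(Presentation(path).slides)
```

Calling titles_from_file("deck.pptx") (the file name is only a placeholder) would return a list of title strings. Slides whose title is a manually added text box are not picked up, for the reason given above.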
