pdfannot Python package - PyPI

PDF annotation utilities

pdfannot project description

pdfannot

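The purpose of this package is to create a two-way link between annotated PDFs and Excel-style dataframes.

It allows you to:

Create a dataframe that contains every string annotated on the PDF in the column "annot_text", the string it was annotated with in the column "label", and further information such as coordinates and page.
Annotate a PDF, given a dataframe of the form described above.

It is useful for automatically generating annotated PDF documents when an NLP model can infer annotations from the raw text held in a dataframe.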
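As an illustration of that workflow, here is a minimal sketch in which a regular expression stands in for the NLP model. The function name `predict_rows`, the "date" label, and the zero-based page index are assumptions made for the example; they are not part of pdfannot. The output rows use the columns "text", "label", "type", and "page".

```python
import re

# A toy stand-in for an NLP model: matches dates such as "12 March 2015".
DATE_PATTERN = re.compile(r"\b\d{1,2} [A-Z][a-z]+ \d{4}\b")


def predict_rows(raw_text, page):
    """Turn every date found in raw_text into one annotation row (hypothetical helper)."""
    return [
        {"text": match.group(0), "label": "date", "type": "Highlight", "page": page}
        for match in DATE_PATTERN.finditer(raw_text)
    ]


rows = predict_rows("Decision of 12 March 2015 on the appeal filed 3 June 2014.", page=0)
for row in rows:
    print(row)
```

A list of rows like this can be passed to `pd.DataFrame(rows)` to build the annotation dataframe.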

Prerequisites

pandas
fitz

(pip install pymupdf)

Installation

pip install pdfannot

Example

Your dataframe must contain information about what to annotate on the PDF:

import pdfannot
import pandas as pd

# adf stands for annotation dataframe
adf = pd.DataFrame([
    {'x': 40, 'y': 60, 'w': 300, 'h': 50},
    {'text': 'APPEAL relating to Cancellation Proceedings No 399', 'type': 'Highlight'},
    {'text': 'ication for a declaration of i', 'type': 'Highlight', 'label': 'label 1'},
    {'x': 100, 'y': 600, 'w': 300, 'h': 50, 'page': 1, 'label': 'label 2'},
])

# pdfannot.exple_pdf is a test PDF shipped with the pdfannot package; debug=True enables verbose output
pdfannot.annotate_pdf(adf, pdfannot.exple_pdf, '/tmp/test.pdf', debug=True)

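Your annotation dataframe must already have either the columns "x", "y", "h", "w" (the coordinates of a square annotation) or the column "text" (the text to annotate).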
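The requirement above can be checked before calling annotate_pdf. The following is a minimal sketch, not part of pdfannot: `is_missing` and `row_kind` are hypothetical helpers, and rows are plain dicts such as those returned by `adf.to_dict("records")`, where absent cells show up as NaN.

```python
import math

COORD_COLUMNS = ("x", "y", "w", "h")


def is_missing(value):
    """True for None, NaN, or an empty string (how absent cells usually appear)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return value == ""


def row_kind(row):
    """Return 'coords', 'text', or None when the row has neither full coordinates nor text."""
    if all(not is_missing(row.get(column)) for column in COORD_COLUMNS):
        return "coords"
    if not is_missing(row.get("text")):
        return "text"
    return None


rows = [
    {"x": 40, "y": 60, "w": 300, "h": 50},
    {"text": "APPEAL relating to Cancellation Proceedings No 399", "type": "Highlight"},
    {"x": 100, "y": 600, "label": "label 2"},  # incomplete: no w/h and no text
]
kinds = [row_kind(row) for row in rows]
print(kinds)  # ['coords', 'text', None]
```

Rows for which `row_kind` returns None can be reported or dropped before annotation.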

annotate_pdf(DataFrame, orig_pdfpath, dest_pdfpath)

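annotates the PDF at orig_pdfpath using the dataframe and stores the result at dest_pdfpath.

The function also takes into account the optional column "label", which labels the annotation, and the optional column "type", which specifies whether you want a "Square" or a "Highlight".

The default values are label: "" and type: "Square".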
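To make the defaults concrete, here is a small sketch of how missing "label" and "type" values could be filled in on plain row dicts. `with_defaults` is a hypothetical helper, not a pdfannot function, and the exact spelling "Square" is an assumption modelled on the "Highlight" value used in the example.

```python
# Assumed defaults: an empty label and a square annotation.
DEFAULTS = {"label": "", "type": "Square"}


def with_defaults(row):
    """Return a copy of row in which absent (or None) label/type take the default value."""
    filled = dict(DEFAULTS)
    filled.update({key: value for key, value in row.items() if value is not None})
    return filled


print(with_defaults({"x": 40, "y": 60, "w": 300, "h": 50}))
print(with_defaults({"text": "ication for a declaration of i", "type": "Highlight"}))
```

Explicit values in the row always win over the defaults, so a row with type "Highlight" keeps it.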

Finally, specifying the page of each annotation in the column "page" can speed up the algorithm. "page" is optional for one-page PDFs.
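The speed-up is easy to see in a toy model where each page is a plain string. This sketch is only an illustration of the idea, not pdfannot's algorithm: `find_text` is a hypothetical function, and the zero-based page index is an assumption.

```python
def find_text(pages, text, page=None):
    """Return (page_index, offset) for the first occurrence of text on each searched page.

    pages is a list of page strings; when page is given, only that page is searched.
    """
    candidates = range(len(pages)) if page is None else [page]
    hits = []
    for index in candidates:
        offset = pages[index].find(text)
        if offset != -1:
            hits.append((index, offset))
    return hits


pages = ["cover page", "APPEAL relating to Cancellation Proceedings", "annex"]
print(find_text(pages, "APPEAL"))          # scans all three pages
print(find_text(pages, "APPEAL", page=1))  # scans a single page
```

Both calls find the same match, but the second inspects one page instead of every page in the document.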

Internals

However, if your dataframe has one text column per annotation label (warning: each text column must be named annot_{label_name}), you can use:

annot_utils.dlf2adf(DataFrame)

to convert it into a form that annotate_pdf accepts. Then run:

annotate_pdf(DataFrame, orig_pdfpath, dest_pdfpath)

to annotate the PDF (this method only supports highlights).
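The reshaping step can be pictured as turning one wide row per document into one long row per annotation. The sketch below is an assumption about the general idea, not the implementation of dlf2adf: `wide_to_long` is a hypothetical function, the `annot_` column prefix is assumed, and rows are plain dicts such as those from `df.to_dict("records")`.

```python
import math

PREFIX = "annot_"  # assumed naming convention for the per-label text columns


def is_blank(value):
    """True for None, NaN, or an empty string."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def wide_to_long(rows):
    """Emit one {'text', 'label', 'type'} row for every non-blank annot_* cell."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(PREFIX) and not is_blank(value):
                label = column[len(PREFIX):]
                long_rows.append({"text": value, "label": label, "type": "Highlight"})
    return long_rows


wide = [
    {"doc": "a.pdf", "annot_title": "APPEAL relating to Cancellation Proceedings No 399", "annot_date": None},
    {"doc": "b.pdf", "annot_title": float("nan"), "annot_date": "12 March 2015"},
]
long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

Every emitted row carries type "Highlight", which mirrors the note that this route only supports highlights.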

Authors

Arthur Renaud, Antoine Marullaz --> Stackadoc

Any suggestions or issues? --> contact@stackadoc.com


pdfannot 2019.6.5.1