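humannotator Python package - PyPI

Customizable tool for easy manual annotation

Project description of the humannotator Python package

Humannotator

Library for conveniently creating simple customizable annotators for manual annotation of your data.
Jenia Kim, Lawrence Vriend

Works in Jupyter notebooks.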

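Use case

humannotator provides an easy way to set up custom annotators. This tool is for you if manual annotation is part of your workflow and you are looking for a solution that is:

Lightweight
Customizable
Easy to set up
Integrated with Jupyter/pandas/Python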

Quick start

Install humannotator

Install with conda:

    conda install -c lcvriend humannotator

Or with pip:

    pip install humannotator

Create a simple annotator

Load the data
Define the tasks
Instantiate the annotator

    import pandas as pd
    from humannotator import Annotator

    # load data
    df = pd.read_csv('examples/popcorn_classics.csv', sep=';', index_col=0)

    # set up the annotator
    ratings = [
        'One bag',
        'Two bags',
        'Three bags',
        'Four bags',
        'Five-bagger',
    ]
    annotator = Annotator(df, name='VFA | Rate my popcorn classics')
    annotator.tasks['Bags of popcorn'] = ratings

    # run annotator
    annotator(user='GT')

In Jupyter, this displays the annotator inline.

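Annotate your data

Use the annotator by calling it: annotator().
The annotator keeps track of where you left off.
Highlight phrases with the phrases argument.
The annotator stores the user (if provided) and a timestamp with each annotation.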
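The phrases argument accepts plain strings or regular expressions, and the HIGHLIGHTER options further down (escape, flags) control how they are matched. As a rough illustration of what that combination means in terms of Python's `re` module, the hypothetical `highlight` helper below wraps matches in `<mark>` tags. It is a sketch of the idea, not humannotator's own implementation.

```python
import re


def highlight(text, phrases, escape=False, flags=0):
    """Wrap every match of the given phrases in <mark> tags (illustrative only)."""
    if isinstance(phrases, str):
        phrases = [phrases]
    # With escape=True, regex metacharacters in the phrases are matched literally.
    patterns = [re.escape(p) if escape else p for p in phrases]
    combined = re.compile('|'.join(f'(?:{p})' for p in patterns), flags)
    return combined.sub(lambda m: f'<mark>{m.group(0)}</mark>', text)


# Case-insensitive matching through the re flags.
print(highlight('Popcorn and more popcorn.', 'popcorn', flags=re.IGNORECASE))

# Without escaping, '.' matches any character; with escaping, only a literal dot.
print(highlight('a.b axb', 'a.b'))
print(highlight('a.b axb', 'a.b', escape=True))
```

Escaping matters whenever a phrase contains characters such as `.`, `(` or `+` that should be read literally rather than as regex syntax.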

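Access your annotations

Annotations are conveniently stored in a pandas DataFrame.
Access the annotations with the annotated attribute.
Use unannotated to get the indexes of the records without an annotation.
Use the merged method to return the data merged with its annotations.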

Store your annotations

Use the save method to store the annotator.
Use the annotator's load method to load it again.
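The constructor documentation further down mentions a pickled object and a save_data flag that decides whether the data is stored with it. The sketch below shows that idea with the standard `pickle` module. The functions `dump_state` and `load_state` and the layout of the stored dict are assumptions made for this example; they are not humannotator's actual save and load code.

```python
import pickle


def dump_state(annotations, data, save_data=False):
    """Serialize annotations, including the data only when save_data is True."""
    state = {
        'annotations': annotations,
        'data': data if save_data else None,  # hypothetical layout
    }
    return pickle.dumps(state)


def load_state(blob):
    """Restore the dict written by dump_state."""
    return pickle.loads(blob)


data = ['Jaws', 'Alien']
annotations = {0: 'Two bags'}

without_data = load_state(dump_state(annotations, data))
with_data = load_state(dump_state(annotations, data, save_data=True))

print(without_data['data'])  # None: the data has to be loaded again separately
print(with_data['data'])     # ['Jaws', 'Alien']
```

Leaving the data out keeps the stored file small, at the cost of having to supply the data again after loading.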

Load data

The annotator accepts list, dict, Series and DataFrame objects as data.
The data is converted to a DataFrame internally.

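Dataframe

By default, the annotator uses the dataframe's index and all of its columns.
If you need more control, use load_data to create the data object:
1. id_col sets the column to use as the index.
2. item_cols sets the column or columns to display.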
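In plain pandas terms, id_col and item_cols amount to choosing an index column and a subset of columns to show. The sketch below performs that selection directly on a DataFrame. The `select` helper and the column names are made up for the example, and it does not reproduce what load_data does internally.

```python
import pandas as pd


def select(df, id_col=None, item_cols=None):
    """Illustrative only: index by id_col and keep item_cols."""
    if id_col is not None:
        df = df.set_index(id_col)
    if item_cols is not None:
        # Accept a single column name as well as a list of names.
        cols = [item_cols] if isinstance(item_cols, str) else list(item_cols)
        df = df[cols]
    return df


movies = pd.DataFrame({
    'movie_id': [101, 102],
    'title': ['Jaws', 'Alien'],
    'year': [1975, 1979],
})

items = select(movies, id_col='movie_id', item_cols='title')
print(items)
```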

Define tasks

Tasks can be set by subscripting or with task_factory.

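Set tasks with the task factory

Create a task by passing to task_factory:

The kind of task
The name of the task
(Optional) an instruction
(Optional) a list of dependencies
Whether it is nullable (default False)
Any necessary kwargs (depending on the kind of task)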

In general:

    task_factory(
        'kind',
        'name',
        instruction='instruction',
        dependencies=dependencies,
        nullable=True/False,
        **kwargs,
    )

Passing a dict or list as kind creates a categorical task.
In that case, the categories kwarg is ignored.

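Set tasks by subscripting

You can also instantiate the annotator and add tasks by subscripting:

    a = Annotator()
    a.tasks['topic'] = ['economy', 'politics', 'media', 'other']
    a.tasks['factual'] = bool, "Is the article factual?", False

To add a task this way, you need to provide at least the kind of task you want to create. Optionally, you can add an instruction, nullability, dependencies and any other kwargs (as a dict). Use the order attribute on tasks to change the order in which the tasks are prompted to the user.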

Available tasks

kind	kwargs	dtype	description
str		object	String
regex	regex	object	String validated by regex
int		Int64	Nullable integer
float		float64	Float
bool		bool	Boolean
category	categories	CategoricalDtype	Categorical variable
date		datetime64[ns]	Date
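The dtype column uses pandas names. As a small illustration of why a nullable integer dtype such as Int64 is useful for annotations that may be missing, the snippet below compares how a missing value is stored with and without it. It uses plain pandas and is independent of humannotator; the category labels are only example values.

```python
import pandas as pd

# A missing value turns a default integer column into float64 ...
plain = pd.Series([1, None])
# ... while the nullable Int64 extension dtype keeps integers and stores <NA>.
nullable = pd.Series([1, None], dtype='Int64')

print(plain.dtype)
print(nullable.dtype)

# A categorical column restricts values to a fixed set of categories.
ratings = pd.Series(
    ['One bag', 'Two bags'],
    dtype=pd.CategoricalDtype(['One bag', 'Two bags', 'Three bags']),
)
print(list(ratings.cat.categories))
```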

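Dependencies

A dependency consists of a condition and a value, which can be passed as a tuple:

    ("col1 == 'x'", False)

The condition is a pandas query statement. Before the user is prompted for input, the condition is evaluated on the current annotation. If the query evaluates to True, the value is assigned automatically.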
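To make the evaluation step concrete, here is a minimal sketch that checks (condition, value) tuples against a single annotation with `DataFrame.query`. The `resolve` helper and the dict representation of the current annotation are assumptions for illustration; the library's internal logic may differ.

```python
import pandas as pd


def resolve(annotation, dependencies):
    """Return (True, value) for the first dependency whose condition holds."""
    # Wrap the annotation in a one-row frame so DataFrame.query can be applied.
    row = pd.DataFrame([annotation])
    for condition, value in dependencies:
        if not row.query(condition).empty:
            return True, value
    return False, None


dependencies = [("col1 == 'x'", False)]

print(resolve({'col1': 'x'}, dependencies))  # (True, False): value assigned, no prompt
print(resolve({'col1': 'y'}, dependencies))  # (False, None): the user would be prompted
```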

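Annotator

Calling the annotator

The annotator detects whether it is run from Jupyter. If so, the annotator renders itself in html and css. Otherwise, it renders itself as text. You can annotate selected records by passing a list of ids to the annotator call. If you want to re-annotate ids that have already been annotated, set redo to True when calling the annotator.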
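How the detection works is not described here. One common heuristic, shown below as an assumption rather than as humannotator's actual check, is to ask IPython for the active shell and look at its class name. The function name `running_in_jupyter` is made up for this sketch.

```python
def running_in_jupyter():
    """Best-effort check for a Jupyter kernel (a common heuristic)."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed, so this cannot be a notebook
    shell = get_ipython()
    # Notebook and JupyterLab kernels use ZMQInteractiveShell;
    # a plain Python process has no active shell (None).
    return shell is not None and type(shell).__name__ == 'ZMQInteractiveShell'


# Pick a rendering mode the way the paragraph above describes.
mode = 'html' if running_in_jupyter() else 'text'
print(mode)
```

The text_display parameter documented below offers a way to force plain text regardless of the environment.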

Instantiating the annotator

arguments

    tasks : Task, list of Task objects, Tasks, Annotations or DataFrame
        Annotation task(s).
        If passed a DataFrame, then the tasks will be inferred from it.
        Annotation data in the dataframe will also be initialized.
    data : data, list-/dict-like, Series or DataFrame, default None
        Data to be annotated.
        If `data` is not already a data object,
        then it will be passed through `load_data`.
        The annotator can be instantiated without data,
        but will only work after data is loaded.
    user : str, default None
        Name of the user.
    name : str, default 'HUMANNOTATOR'
        Name of the annotator.
    save_data : boolean, default False
        Set flag to True if you want to store the data with the annotator.
        This ensures that the pickled object contains the data.

other parameters

    DISPLAY
    text_display : boolean, default None
        If True, displays the annotator in plain text instead of html.

    HTML
    markdown : boolean, default {markdown}
        If True, passes values through markdown before rendering.
    markdown_extensions : list, default {markdown_extensions}
        List of markdown extensions to apply.
    escape_html : boolean, default {escape_html}
        If True, escapes html content within items.
    maxheight : str, default '{maxheight_items}'
        Max height before an item gets a y-scroll bar.
        Set to None to have no maximum.

    DATA
    item_cols : str or list of str, default None
        Name(s) of dataframe column(s) to display when annotating.
        By default: display all columns.
    id_col : str, default None
        Name of dataframe column to use as index.
        By default: use the dataframe's index.

    HIGHLIGHTER
    phrases : str, list of str, default None
        Phrases to highlight in the display.
        The phrases can be regexes.
        It is also possible to pass in a dict where:
        - the keys are the phrases
        - the values are the css styling
    escape : boolean, default False
        Set escape to True in order to escape the phrases.
    flags : int, default 0 (no flags)
        Flags to pass through to the re module, e.g. re.IGNORECASE.

    TRUNCATER
    truncate : boolean, default {truncate}
        Set to False to not truncate items.
    trunc_limit : int, default {truncate_word_limit}
        The number of words beyond which an item will be truncated.
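As a rough picture of what a word-based limit like trunc_limit implies, the sketch below cuts a text after a given number of whitespace-separated words. The `truncate_words` helper and the trailing ' ...' marker are assumptions for illustration; how humannotator marks or expands truncated items is not specified here.

```python
def truncate_words(text, limit):
    """Keep at most `limit` whitespace-separated words (illustrative only)."""
    words = text.split()
    if len(words) <= limit:
        return text  # short enough, return unchanged
    return ' '.join(words[:limit]) + ' ...'


print(truncate_words('one two three four five', 3))  # one two three ...
print(truncate_words('one two', 3))                  # one two
```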

The module contains a configuration file in which some of humannotator's default behaviour can be configured.

humannotator 0.0.3
