Open-domain conversational dataset from the BYU PCC lab



chitchat-dataset


An open-domain conversational dataset from the BYU Perception, Control & Cognition lab's Chit-Chat Challenge.

Install

pip3 install chitchat_dataset

The raw dataset can also be downloaded on its own, without installing the package.

Usage

import chitchat_dataset as ccc

dataset = ccc.Dataset()

# Dataset is a subclass of dict()
for convo_id, convo in dataset.items():
    print(convo_id, convo)

For other languages, see the project repository.

Stats

  • 7,168 conversations
  • 258,145 utterances
  • 1,315 unique participants
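
Counts like these can be recomputed from any dict-shaped copy of the data. The following is a minimal sketch, not the project's own code: it assumes each conversation has a "messages" list of turns and that each turn is a list of message objects with a "sender" field. The helper name `summarize` and the tiny sample are hypothetical, and counting every individual message as one utterance is an assumption about how the published figure was derived.

```python
from typing import Any, Dict, Mapping


def summarize(dataset: Mapping[str, Dict[str, Any]]) -> Dict[str, int]:
    """Count conversations, individual messages, and distinct senders."""
    senders = set()
    utterances = 0
    for convo in dataset.values():
        for turn in convo["messages"]:      # a turn is a list of messages
            for message in turn:
                utterances += 1             # assumption: one message = one utterance
                senders.add(message["sender"])
    return {
        "conversations": len(dataset),
        "utterances": utterances,
        "participants": len(senders),
    }


# Hypothetical two-conversation sample; ccc.Dataset() could be passed instead.
sample = {
    "convo-1": {
        "messages": [
            [{"text": "Hello", "sender": "a"}],
            [{"text": "Hi", "sender": "b"}, {"text": "How are you?", "sender": "b"}],
        ]
    },
    "convo-2": {"messages": [[{"text": "Hey", "sender": "a"}]]},
}

stats = summarize(sample)
print(stats)
```

Because `Dataset` is described as a subclass of `dict`, the same function should accept it directly, although that has not been verified against the package here.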

Format

The dataset is a mapping from conversation UUID to a conversation:

{
  "prompt": "What's the most interesting thing you've learned recently?",
  "ratings": { "witty": "1", "int": 5, "upbeat": 5 },
  "start": "2018-04-20T01:57:41",
  "messages": [
    [
      {
        "text": "Hello",
        "timestamp": "2018-04-19T19:57:51",
        "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"
      }
    ],
    [
      {
        "text": "I learned that the Queen of England's last corgi died",
        "timestamp": "2018-04-19T19:58:14",
        "sender": "bebad07e-15df-48c3-a04f-67db828503e3"
      }
    ],
    [
      {
        "text": "Wow that sounds so sad",
        "timestamp": "2018-04-19T19:58:18",
        "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"
      },
      {
        "text": "was it a cardigan welsh corgi",
        "timestamp": "2018-04-19T19:58:22",
        "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"
      },
      {
        "text": "?",
        "timestamp": "2018-04-19T19:58:24",
        "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"
      }
    ]
  ]
}
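
In the example, "messages" appears to be a list of turns, where each turn groups consecutive messages from one sender. As a rough sketch under that assumption, a conversation can be flattened into one (sender, text) pair per turn. The helper name `flatten_turns` is hypothetical, joining messages with a space is an arbitrary choice, and the sample is a trimmed copy of the conversation shown above.

```python
from typing import Any, Dict, List, Tuple


def flatten_turns(convo: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Collapse each turn (a list of messages) into one (sender, text) pair."""
    turns = []
    for turn in convo["messages"]:
        if not turn:                        # skip empty turns defensively
            continue
        sender = turn[0]["sender"]          # assumption: one sender per turn
        text = " ".join(message["text"] for message in turn)
        turns.append((sender, text))
    return turns


ALICE = "22578ac2-6317-44d5-8052-0a59076e0b96"
BOB = "bebad07e-15df-48c3-a04f-67db828503e3"

# Trimmed copy of the example conversation (timestamps omitted).
convo = {
    "prompt": "What's the most interesting thing you've learned recently?",
    "messages": [
        [{"text": "Hello", "sender": ALICE}],
        [{"text": "I learned that the Queen of England's last corgi died", "sender": BOB}],
        [
            {"text": "Wow that sounds so sad", "sender": ALICE},
            {"text": "was it a cardigan welsh corgi", "sender": ALICE},
            {"text": "?", "sender": ALICE},
        ],
    ],
}

turns = flatten_turns(convo)
for sender, text in turns:
    print(sender[:8], text)
```

Note that the "ratings" values in the example mix strings and integers ("witty" is "1" while "int" is 5), so any numeric use of them would likely need an explicit conversion first.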

How to cite

If you extend or use this work, please cite the paper that introduced it:

@article{myers2020conversational,
  title={Conversational Scaffolding: An Analogy-Based Approach to Response Prioritization in Open-Domain Dialogs},
  author={Myers, Will and Etchart, Tyler and Fulda, Nancy},
  year={2020}
}
