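tensorlm Python package - PyPI

A TensorFlow wrapper for deep neural text generation at character or word level, based on RNNs/LSTMs.

tensorlm project description

Generate Shakespeare poems with 4 lines of code.

Installation

tensorlm is written in and for Python 3.4+ and TensorFlow 1.1+.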

pip3 install tensorlm

Basic Usage

Use the CharLM or WordLM class:

import tensorflow as tf
from tensorlm import CharLM

with tf.Session() as session:

    # Create a new model. You can also use WordLM
    model = CharLM(session, "datasets/sherlock/tinytrain.txt", max_vocab_size=96,
                   neurons_per_layer=100, num_layers=3, num_timesteps=15)

    # Train it
    model.train(session, max_epochs=5, max_steps=500)

    # Let it generate a text
    generated = model.sample(session, "The ", num_steps=100)
    print("The " + generated)

This should output something like:

The  ee e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e
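The max_vocab_size argument in the example above presumably caps the number of distinct tokens the model works with. One common way to build such a capped character vocabulary is to keep the most frequent characters and map everything else to a shared "unknown" id. The sketch below illustrates that idea only. Whether tensorlm builds its vocabulary this way is an assumption, and build_char_vocab, encode and the "<unk>" token are hypothetical names that are not part of tensorlm.

```python
from collections import Counter

UNKNOWN = "<unk>"  # hypothetical placeholder token, not taken from tensorlm


def build_char_vocab(text, max_vocab_size):
    """Map the most frequent characters to ids; id 0 is reserved for unknowns."""
    counts = Counter(text)
    # One slot is used by the unknown token, the rest go to frequent characters.
    most_common = [ch for ch, _ in counts.most_common(max_vocab_size - 1)]
    vocab = {UNKNOWN: 0}
    for ch in most_common:
        vocab[ch] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a string into a list of integer ids, using id 0 for unseen characters."""
    return [vocab.get(ch, vocab[UNKNOWN]) for ch in text]


vocab = build_char_vocab("aaaabbbccd", max_vocab_size=3)
print(vocab)                  # {'<unk>': 0, 'a': 1, 'b': 2}
print(encode("abcd", vocab))  # [1, 2, 0, 0]
```

With a cap of 3, only "a" and "b" get their own ids, so "c" and "d" both fall back to the unknown id.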

Command Line Usage

Train: python3 -m tensorlm.cli --train=True --level=char --train_text_path=datasets/sherlock/tinytrain.txt --max_vocab_size=96 --neurons_per_layer=100 --num_layers=2 --batch_size=10 --num_timesteps=15 --save_dir=out/model --max_epochs=300 --save_interval_hours=0.5

Sample: python3 -m tensorlm.cli --sample=True --level=char --neurons_per_layer=400 --num_layers=3 --num_timesteps=160 --save_dir=out/model

Evaluate: python3 -m tensorlm.cli --evaluate=True --level=char --evaluate_text_path=datasets/sherlock/tinyvalid.txt --neurons_per_layer=400 --num_layers=3 --batch_size=10 --num_timesteps=160 --save_dir=out/model

See python3 -m tensorlm.cli --help for all options.

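Advanced Usage

Custom Input Data

The inputs and targets don't have to be text. GeneratingLSTM only expects token ids, so you can use any data type for the sequences, as long as you can encode the data as integer ids.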

import numpy as np
import tensorflow as tf
from tensorlm import GeneratingLSTM

# We use integer ids from 0 to 19, so the vocab size is 20. The range of ids must always start
# at zero.
batch_inputs = np.array([[1, 2, 3, 4], [15, 16, 17, 18]])  # one batch of 2 sequences, 4 time steps each
batch_targets = np.array([[2, 3, 4, 5], [16, 17, 18, 19]])

with tf.Session() as session:
    # Create the model in a TensorFlow graph
    model = GeneratingLSTM(vocab_size=20, neurons_per_layer=10, num_layers=2, max_batch_size=2)

    # Initialize all defined TF Variables
    session.run(tf.global_variables_initializer())

    for _ in range(5000):
        model.train_step(session, batch_inputs, batch_targets)

    sampled = model.sample_ids(session, [15], num_steps=3)
    print("Sampled: " + str(sampled))

This should output something like:

Sampled: [16, 18, 19]
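To make the encoding step concrete, here is a minimal sketch of turning non-text sequences into zero-based integer ids, with targets that are the inputs shifted by one step. The note-name data and the helper names build_id_mapping and encode_inputs_and_targets are hypothetical and not part of tensorlm.

```python
def build_id_mapping(sequences):
    """Assign each distinct symbol an integer id, starting at zero."""
    symbols = sorted({symbol for seq in sequences for symbol in seq})
    return {symbol: index for index, symbol in enumerate(symbols)}


def encode_inputs_and_targets(sequences, symbol_to_id):
    """Encode sequences and shift them by one step to get next-symbol targets."""
    batch_inputs, batch_targets = [], []
    for seq in sequences:
        ids = [symbol_to_id[symbol] for symbol in seq]
        batch_inputs.append(ids[:-1])   # every symbol except the last
        batch_targets.append(ids[1:])   # the symbol that follows each input
    return batch_inputs, batch_targets


# Hypothetical data: two melodies written as note names.
melodies = [["C", "D", "E", "F", "G"], ["G", "F", "E", "D", "C"]]
symbol_to_id = build_id_mapping(melodies)
batch_inputs, batch_targets = encode_inputs_and_targets(melodies, symbol_to_id)
vocab_size = len(symbol_to_id)

print(symbol_to_id)   # {'C': 0, 'D': 1, 'E': 2, 'F': 3, 'G': 4}
print(batch_inputs)   # [[0, 1, 2, 3], [4, 3, 2, 1]]
print(batch_targets)  # [[1, 2, 3, 4], [3, 2, 1, 0]]
```

If all sequences have the same length, the two lists can be wrapped in np.array and used in place of batch_inputs and batch_targets in the example above, with vocab_size passed to GeneratingLSTM.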

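Custom Training, Dropout etc.

Use the GeneratingLSTM class directly. This class is agnostic to the dataset type. It expects integer ids and returns integer ids.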

import tensorflow as tf
from tensorlm import Vocabulary, Dataset, GeneratingLSTM

BATCH_SIZE = 20
NUM_TIMESTEPS = 15

with tf.Session() as session:
    # Generate a token -> id vocabulary based on the text
    vocab = Vocabulary.create_from_text("datasets/sherlock/tinytrain.txt", max_vocab_size=96,
                                        level="char")

    # Obtain input and target batches from the text file
    dataset = Dataset("datasets/sherlock/tinytrain.txt", vocab, BATCH_SIZE, NUM_TIMESTEPS)

    # Create the model in a TensorFlow graph
    model = GeneratingLSTM(vocab_size=vocab.get_size(), neurons_per_layer=100, num_layers=2,
                           max_batch_size=BATCH_SIZE, output_keep_prob=0.5)

    # Initialize all defined TF Variables
    session.run(tf.global_variables_initializer())

    # Do the training
    epoch = 1
    step = 1
    for epoch in range(20):
        for inputs, targets in dataset:
            loss = model.train_step(session, inputs, targets)

            if step % 100 == 0:
                # Evaluate from time to time
                dev_dataset = Dataset("datasets/sherlock/tinyvalid.txt", vocab,
                                      batch_size=BATCH_SIZE, num_timesteps=NUM_TIMESTEPS)
                dev_loss = model.evaluate(session, dev_dataset)
                print("Epoch: %d, Step: %d, Train Loss: %f, Dev Loss: %f" % (
                    epoch, step, loss, dev_loss))

                # Sample from the model from time to time
                print("Sampled: \"The " + model.sample_text(session, vocab, "The ") + "\"")

            step += 1

This should output something like:

Epoch: 3, Step: 100, Train Loss: 3.824941, Dev Loss: 3.778008
Sampled: "The                                                                                                     "
Epoch: 7, Step: 200, Train Loss: 2.832825, Dev Loss: 2.896187
Sampled: "The                                                                                                     "
Epoch: 11, Step: 300, Train Loss: 2.778579, Dev Loss: 2.830176
Sampled: "The         eee                                                                                         "
Epoch: 15, Step: 400, Train Loss: 2.655153, Dev Loss: 2.684828
Sampled: "The        ee    e  e   e  e  e  e  e  e  e   e  e  e   e  e  e   e  e  e   e  e  e   e  e  e   e  e  e "
Epoch: 19, Step: 500, Train Loss: 2.444502, Dev Loss: 2.479753
Sampled: "The    an  an  an  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  on  o"
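The training loop above iterates over (inputs, targets) batches of integer ids. As a rough illustration of what such batches can look like, the sketch below cuts one long id sequence into windows of num_timesteps ids and pairs each window with the same window shifted by one position. This is not tensorlm's Dataset implementation: iterate_batches is a hypothetical helper, and the windowing and the handling of leftover ids are assumptions.

```python
def iterate_batches(ids, batch_size, num_timesteps):
    """Yield (inputs, targets) batches cut from one long id sequence.

    Each row holds num_timesteps ids; the targets are the same window
    shifted one position to the right. Leftover ids that do not fill a
    complete batch are dropped.
    """
    window = num_timesteps + 1  # one extra id so the last input has a target
    rows = [ids[start:start + window]
            for start in range(0, len(ids) - window + 1, num_timesteps)]
    for first in range(0, len(rows) - batch_size + 1, batch_size):
        batch = rows[first:first + batch_size]
        inputs = [row[:-1] for row in batch]
        targets = [row[1:] for row in batch]
        yield inputs, targets


ids = list(range(13))  # stand-in for an encoded text
batches = list(iterate_batches(ids, batch_size=2, num_timesteps=3))
for inputs, targets in batches:
    print(inputs, targets)
# [[0, 1, 2], [3, 4, 5]] [[1, 2, 3], [4, 5, 6]]
# [[6, 7, 8], [9, 10, 11]] [[7, 8, 9], [10, 11, 12]]
```

Every target row is its input row moved forward by one id, which is the next-token objective the losses in the output above are measured on.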


tensorlm 0.4.2