Can TFLite Model Maker create a multi-class text classifier TensorFlow Lite model?

Posted 2024-06-16 10:47:31


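I am trying to build an Android app that predicts text classes, using the AverageWordVecModelSpec provided by TensorFlow Lite Model Maker.

I am using book content to test whether my app works. I prepared three books for this experiment. The code is as follows: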

!pip install git+https://github.com/tensorflow/examples.git#egg=tensorflow-examples[model_maker]

import numpy as np
import os

import tensorflow as tf
assert tf.__version__.startswith('2')

from tensorflow_examples.lite.model_maker.core.data_util.text_dataloader import TextClassifierDataLoader
from tensorflow_examples.lite.model_maker.core.task.model_spec import AverageWordVecModelSpec
from tensorflow_examples.lite.model_maker.core.task import text_classifier

data_path = '/content/drive/My Drive/datasetps'

model_spec = AverageWordVecModelSpec()

train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=model_spec, class_labels=['categorya', 'categoryb'])
test_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'test'), model_spec=model_spec, is_training=False, shuffle=False)

model = text_classifier.create(train_data, model_spec=model_spec)

loss, acc = model.evaluate(test_data)

model.export(export_dir='.')

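When I use only 2 classes/books (the same setup as the example provided by the TensorFlow team), it works. Training runs normally even though the accuracy is low, because I only use 20 sample pages per book as the dataset.

You can see that I get a reasonable loss value here. But I ran into a problem when I tried to add a third class: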

train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=model_spec, class_labels=['categorya', 'categoryb', 'categoryc'])
test_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'test'), model_spec=model_spec, is_training=False, shuffle=False)
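
Before training on a new class, it can help to confirm that every label passed in class_labels actually has data on disk. The following is a minimal sketch, assuming the dataset is laid out as one subfolder per class under the train and test directories (the layout the from_folder calls above appear to rely on). The helper name count_samples_per_class and the demo directory are hypothetical and not part of Model Maker.

```python
import os
import tempfile


def count_samples_per_class(split_dir, class_labels):
    """Return {label: number of files} for each expected class subfolder.

    A count of 0 means the subfolder is missing or empty.
    """
    counts = {}
    for label in class_labels:
        class_dir = os.path.join(split_dir, label)
        if not os.path.isdir(class_dir):
            counts[label] = 0
            continue
        counts[label] = sum(
            1
            for name in os.listdir(class_dir)
            if os.path.isfile(os.path.join(class_dir, name))
        )
    return counts


def _build_demo_counts():
    """Create a throwaway layout where 'categoryc' is missing, then count."""
    with tempfile.TemporaryDirectory() as root:
        for label in ("categorya", "categoryb"):
            class_dir = os.path.join(root, label)
            os.makedirs(class_dir)
            for i in range(2):
                with open(os.path.join(class_dir, f"page{i}.txt"), "w") as f:
                    f.write("sample text")
        return count_samples_per_class(
            root, ["categorya", "categoryb", "categoryc"]
        )


demo_counts = _build_demo_counts()
print(demo_counts)
```

On the real dataset the call would look like count_samples_per_class(os.path.join(data_path, 'train'), ['categorya', 'categoryb', 'categoryc']); any label with a count of 0 points to a folder naming or upload problem rather than a modelling problem.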

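Here is the training result with three classes (screenshot not included):

You can see that the loss value is greater than 1, which seems unreasonable. I tried to work out which line of code (in TensorFlow Model Maker) I should change to fix this, and in the end I decided to ask about it on this forum.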
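
As a point of reference for reading these loss values: if the reported loss is the usual categorical cross-entropy (an assumption, the post does not say which loss is used), then a model that guesses uniformly over K classes scores ln K. The sketch below computes that baseline; the function names are illustrative only.

```python
import math


def chance_level_loss(num_classes):
    """Cross-entropy of a uniform guess over num_classes classes."""
    return math.log(num_classes)


def cross_entropy(probs, true_index):
    """Cross-entropy of one prediction given the index of the true class."""
    return -math.log(probs[true_index])


two_class_baseline = chance_level_loss(2)
three_class_baseline = chance_level_loss(3)
uniform_three_class = cross_entropy([1 / 3, 1 / 3, 1 / 3], 0)

print(round(two_class_baseline, 4), round(three_class_baseline, 4))
```

Under that assumption the baseline is about 0.69 for two classes and about 1.10 for three, so a three-class loss slightly above 1 would be consistent with a model that has not yet learned much from a small dataset.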

So is it possible to build a multi-class text classifier with AverageWordVecModelSpec in TFLite Model Maker?


1 answer
User
#1 · Posted 2024-06-16 10:47:31

Yes, it is possible. I suggest encoding the labels first and then following this workflow:

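import tensorflow as tf  # needed for tf.lite.Optimize further down
from tflite_model_maker import configs  # provides QuantizationConfig; the import path may differ between releases
from tflite_model_maker import model_spec
from tflite_model_maker import text_classifier
from tflite_model_maker import TextClassifierDataLoader
from tflite_model_maker import ExportFormat

from sklearn.model_selection import train_test_split
import pandas as pd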

df = pd.read_excel('data_set.xls')
col = ['sentence', 'your_label']
df = df[col]

# Encoding happens here: map each distinct label string to an integer code
df.your_label = pd.Categorical(df.your_label)
df['label'] = df.your_label.cat.codes


train, test = train_test_split(df, test_size=0.2)
train.to_csv('train.csv', index=False)
test.to_csv('test.csv', index=False)


spec = model_spec.get('average_word_vec')

train_data = TextClassifierDataLoader.from_csv(
   filename='train.csv',
   text_column='sentence',
   label_column='label',
   model_spec=spec,
   delimiter=',',
   is_training=True)
test_data = TextClassifierDataLoader.from_csv(
  filename='test.csv',
  text_column='sentence',
  label_column='label',
  model_spec=spec,
  delimiter=',',
  is_training=False)

model = text_classifier.create(train_data, model_spec=spec, batch_size=5, epochs=4)

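# Note: `config` is created here but not used by the export call below.
# To apply it, pass quantization_config=config when exporting a TFLite file.
config = configs.QuantizationConfig.create_dynamic_range_quantization(optimizations=[tf.lite.Optimize.OPTIMIZE_FOR_LATENCY])
# This call exports only the label and vocabulary files.
model.export(export_dir='average_word_vec/', export_format=[ExportFormat.LABEL, ExportFormat.VOCAB])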
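
The encoding step in the workflow above turns label strings into integer codes, and the app later needs the reverse mapping to show a class name for a predicted index. The sketch below illustrates that round trip with the standard library only. It mirrors what pd.Categorical(...).cat.codes does for plain string labels with no missing values (categories sorted, codes assigned in that order). The helper name build_label_index and the sample labels are hypothetical.

```python
def build_label_index(labels):
    """Return (sorted class names, {name: integer code}) for string labels."""
    names = sorted(set(labels))
    name_to_code = {name: code for code, name in enumerate(names)}
    return names, name_to_code


# Illustrative labels for three classes, in arbitrary order.
labels = ["categoryb", "categorya", "categoryc", "categorya"]

names, name_to_code = build_label_index(labels)

# Encode: what goes into the 'label' column used for training.
codes = [name_to_code[label] for label in labels]

# Decode: map a predicted class index back to its name.
decoded = [names[code] for code in codes]

print(names)
print(codes)
```

Keeping the names list alongside the exported model makes it easier to check that index 0, 1 and 2 in the app correspond to the classes that were intended.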
