Scoring word combinations with Keras

Posted 2024-05-20 09:09:32


I have a file containing groups of "words", for example:

1a 9( 9j = 2453
3a 4( 6j 0s = 2309
1 7( 8ll = 4934

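It looks like random data, but it is not: each group of "words" has a score. My file has about 1 million lines, and there are clear patterns in it. There are about 3,600 distinct words.

The last column contains the score for that particular arrangement of words.

I encoded each line as ints, padded the lines with 0, and wrote them to a file named words.txt.

An example line from that file: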

475,12,2495,2934,105,0,0,0,9384 (last column being the output score)
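The encoding step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names `build_vocab` and `encode_line` are hypothetical, ids are handed out in order of first appearance starting at 1 (so they will not match the ids in the sample line), and 0 is reserved for padding.

```python
# A minimal sketch of the "encode as ints and pad with 0" step.
# build_vocab and encode_line are hypothetical helpers, not part of any library.

MAX_LEN = 8  # number of word slots per row, matching the 8 input columns
PAD = 0      # id reserved for padding, so real words start at 1


def build_vocab(lines):
    """Map each distinct word to an integer id, in order of first appearance."""
    vocab = {}
    for line in lines:
        words, _ = line.split("=")
        for word in words.split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode_line(line, vocab, max_len=MAX_LEN):
    """Turn 'w1 w2 ... = score' into max_len padded ids followed by the score."""
    words, score = line.split("=")
    ids = [vocab[word] for word in words.split()]
    if len(ids) > max_len:
        raise ValueError("line has more than %d words: %r" % (max_len, line))
    return ids + [PAD] * (max_len - len(ids)) + [int(score)]


# The three sample rows from the question.
raw = ["1a 9( 9j = 2453", "3a 4( 6j 0s = 2309", "1 7( 8ll = 4934"]
vocab = build_vocab(raw)
rows = [encode_line(line, vocab) for line in raw]
for row in rows:
    print(",".join(str(value) for value in row))
```

Each printed row has eight id columns plus the score, which is the layout that `loadtxt('words.txt', delimiter=',')` and the `dataset[:,0:8]` / `dataset[:,8]` slices in the code below expect.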

Now I have the code shown below.

When I run it, the loss and accuracy are very poor; the loss is around 7000000.

What am I doing wrong?

from numpy import loadtxt
from itertools import islice
from keras.models import Sequential
from keras.layers import Dense
# load the dataset
dataset = loadtxt('words.txt', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=150, batch_size=10000)

_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

My goal is to predict the score for random combinations of words that I generate.

Training log:

:\AI>python main.py
Using TensorFlow backend.
2021-02-14 08:52:48.350476: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
C:\Users\fordy\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\framework\indexed_slices.py:424: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Epoch 1/150
1047711/1047711 [==============================] - 7s 7us/step - loss: 72945595445.1503 - accuracy: 0.2351
Epoch 2/150
1047711/1047711 [==============================] - 3s 3us/step - loss: 72940365091.2725 - accuracy: 0.0016
Epoch 3/150
1047711/1047711 [==============================] - 3s 3us/step - loss: 72922327250.8712 - accuracy: 0.0016
Epoch 4/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72883151430.7776 - accuracy: 0.0030
Epoch 5/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72815216732.1170 - accuracy: 0.0041
Epoch 6/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72711719248.6156 - accuracy: 0.0012
Epoch 7/150
1047711/1047711 [==============================] - 2s 2us/step - loss: 72566884174.8089 - accuracy: 1.5271e-05

1 answer
User
#1 · Posted 2024-05-20 09:09:32

Your model is too small. Try adding an embedding layer and an LSTM:

import tensorflow as tf

model = tf.keras.Sequential()
# The first argument is the vocabulary size; it must be larger than the largest word id
model.add(tf.keras.layers.Embedding(3600, 12, input_length=8)) # <= adjust vocab size
model.add(tf.keras.layers.LSTM(8))
# model.add(tf.keras.layers.Dense(12, input_dim=8, activation='relu'))
# model.add(tf.keras.layers.Dense(8, activation='relu'))
model.add(tf.keras.layers.Dense(1))
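Separately from model size, two general properties of the metrics may make the training log hard to interpret. First, accuracy is a classification metric, so it says little about how close a regression prediction is to the target. Second, MSE is measured in squared units of the target, so raw scores in the thousands produce very large loss values even for moderate errors. The sketch below illustrates both points in plain Python; the predictions are made up for illustration, and the helper names are hypothetical.

```python
# A minimal sketch comparing regression metrics on raw and standardized scores.
# The predictions are invented for illustration; they are not model output.
from statistics import mean, pstdev

scores = [2453.0, 2309.0, 4934.0]  # targets from the sample rows in the question
preds = [2400.0, 2500.0, 4800.0]   # made-up predictions


def mse(y_true, y_pred):
    """Mean squared error, in squared units of the target."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def exact_match(y_true, y_pred):
    """Fraction of predictions that round to exactly the target value."""
    return mean([1.0 if round(p) == t else 0.0 for t, p in zip(y_true, y_pred)])


# Standardize with the mean and standard deviation of the targets.
mu, sigma = mean(scores), pstdev(scores)


def standardize(values):
    return [(v - mu) / sigma for v in values]


raw_mse = mse(scores, preds)
scaled_mse = mse(standardize(scores), standardize(preds))

print("raw MSE:", raw_mse)
print("raw MAE:", mae(scores, preds))
print("exact-match rate:", exact_match(scores, preds))
print("standardized MSE:", scaled_mse)
# Multiply a standardized error by sigma to read it in original score units.
print("standardized MAE in score units:",
      mae(standardize(scores), standardize(preds)) * sigma)
```

With these invented numbers every prediction is within about 200 of its target, yet the exact-match rate is zero and the raw MSE is in the tens of thousands. In Keras, a metric such as `'mae'` in `metrics=[...]` is usually a more readable choice for a regression target than `'accuracy'`, and training on standardized scores keeps the loss on a scale that is easier to compare across runs.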
