Posted 2024-04-19 13:05:17
User
Another option you would have with sklearn is:
sklearn.model_selection.train_test_split(*arrays, **options)
Usage example:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
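This splits the arrays or matrices X and y into random train and test subsets. Here test_size=0.33 holds out about a third of the samples for testing, and random_state=42 fixes the random seed so the split is reproducible.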
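To make the call above concrete, here is a minimal sketch on toy data. The arrays are made up for illustration (12 samples, 2 features); only train_test_split and its test_size and random_state arguments come from the answer.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): row i of X is [2*i, 2*i + 1],
# and its label is y[i] = i, so X[:, 0] == 2 * y for matching rows.
X = np.arange(24).reshape(12, 2)
y = np.arange(12)

# Same arguments as in the usage example above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

print("train shape:", X_train.shape, "test shape:", X_test.shape)

# Rows of X and y are shuffled together, so samples stay paired with labels.
print("still paired:", bool((X_train[:, 0] == 2 * y_train).all()))
```

Because random_state is fixed, running the split again with the same inputs should give the same partition.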
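As already discussed, tensorflow doesn't provide its own way to cross-validate a model. The suggested approach is to use KFold from scikit-learn. It's a bit tedious, but doable. Here's a complete example of cross-validating an MNIST model with tensorflow and KFold: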
from sklearn.model_selection import KFold
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Note: this uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).

# Parameters
learning_rate = 0.01
batch_size = 500

# TF graph: a single-layer softmax classifier
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
pred = tf.nn.softmax(tf.matmul(x, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()

mnist = input_data.read_data_sets("data/mnist-tf", one_hot=True)
train_x_all = mnist.train.images
train_y_all = mnist.train.labels
test_x = mnist.test.images
test_y = mnist.test.labels

def run_train(session, train_x, train_y):
    print("\nStart training")
    session.run(init)  # re-initialize the variables so every fold starts from scratch
    for epoch in range(10):
        total_batch = train_x.shape[0] // batch_size
        for i in range(total_batch):
            batch_x = train_x[i*batch_size:(i+1)*batch_size]
            batch_y = train_y[i*batch_size:(i+1)*batch_size]
            _, c = session.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
            if i % 50 == 0:
                print("Epoch #%d step=%d cost=%f" % (epoch, i, c))

def cross_validate(session, split_size=5):
    results = []
    kf = KFold(n_splits=split_size)
    for train_idx, val_idx in kf.split(train_x_all, train_y_all):
        train_x = train_x_all[train_idx]
        train_y = train_y_all[train_idx]
        val_x = train_x_all[val_idx]
        val_y = train_y_all[val_idx]
        run_train(session, train_x, train_y)
        # accuracy on the held-out fold
        results.append(session.run(accuracy, feed_dict={x: val_x, y: val_y}))
    return results

with tf.Session() as session:
    result = cross_validate(session)
    print("Cross-validation result: %s" % result)
    # evaluated with the weights left over from the last fold
    print("Test accuracy: %f" % session.run(accuracy, feed_dict={x: test_x, y: test_y}))
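To see what the loop over kf.split(...) in the example above actually iterates over, here is a small sketch on a toy array. The 10-element array is an assumption for illustration. KFold yields arrays of integer indices, which is why the example indexes train_x_all with train_idx and val_idx.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (an assumption for illustration): 10 samples.
data = np.arange(10)

kf = KFold(n_splits=5)

# Each iteration yields two index arrays: one for training, one for validation.
folds = list(kf.split(data))
for fold_no, (train_idx, val_idx) in enumerate(folds):
    print("fold %d: train=%s validation=%s" % (fold_no, train_idx, val_idx))

# Every sample lands in exactly one validation fold.
all_val = np.concatenate([val_idx for _, val_idx in folds])
print("validation indices cover the data once:", sorted(all_val) == list(range(10)))
```

By default KFold does not shuffle, so the validation folds are contiguous blocks. If the samples are ordered in some meaningful way, passing shuffle=True (optionally with random_state) may be a better choice.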
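Cross-validation becomes more and more expensive as the dataset gets larger. In deep learning we usually work with large datasets, so you should be fine with plain training (a single train/validation split). Tensorflow has no built-in mechanism for CV because it is not commonly used with neural networks, where the performance of the network depends mainly on the dataset, the number of epochs, and the learning rate.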
I have used CV in sklearn. You can check the link: https://github.com/hackmaster0110/Udacity-Data-Analyst-Nano-Degree-Projects/
There, go to poi_id.py under "Identify fraud from Enron data" (in the Project folder).
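For readers who just want the general shape of cross-validation in sklearn, here is a minimal sketch. It is not the code from the linked project: the synthetic dataset, the LogisticRegression model, and the choice of 5 folds are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (an assumption, not the Enron data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Any scikit-learn estimator could be used here; this one is just an example.
model = LogisticRegression(max_iter=1000)

# cv=5 fits the model five times, each time scoring on a different held-out fold.
scores = cross_val_score(model, X, y, cv=5)

print("per-fold accuracy:", scores)
print("mean accuracy: %.3f" % scores.mean())
```

cross_val_score clones and refits the estimator for every fold, so the per-fold scores come from independently trained models.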