<p>I agree with the comment that <code>train_test_split</code> is probably the way to go. However, since this is tagged <code>numpy</code>, here is a <code>numpy</code> way of doing it, which is pretty fast:</p>
<pre><code>import numpy as np

# recreate a random array:
x = np.random.random((46928, 28, 28))
# pick 41928 distinct row indices for sample 1:
s1 = np.random.choice(x.shape[0], 41928, replace=False)
# sample 2 gets every remaining index (setdiff1d returns them sorted):
s2 = np.setdiff1d(np.arange(x.shape[0]), s1)
# extract your samples:
sample1 = x[s1, :, :]
sample2 = x[s2, :, :]
</code></pre>
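<p>As a variation on the same idea, a single shuffled index array can be cut in two, which makes both samples disjoint by construction. This is only a minimal sketch: the helper name <code>split_by_permutation</code>, the <code>seed</code> argument and the smaller array size are my own assumptions for illustration, not part of the original question.</p>

```python
import numpy as np

def split_by_permutation(x, n_first, seed=None):
    """Split x along axis 0 into two disjoint random samples (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Shuffle all row indices once, then cut the shuffled order in two.
    idx = rng.permutation(x.shape[0])
    return x[idx[:n_first]], x[idx[n_first:]]

# A smaller array than above, just to keep the example light.
x = np.random.random((1000, 28, 28))
sample1, sample2 = split_by_permutation(x, 800, seed=0)
print(sample1.shape, sample2.shape)
```

<p>Unlike the set-difference version, the rows of the second sample come out in shuffled order rather than sorted by index.</p>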
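<p>This yields two disjoint samples: <code>sample1</code> with shape <code>(41928, 28, 28)</code> and <code>sample2</code> with shape <code>(5000, 28, 28)</code>.</p>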
<p><strong>Timings:</strong></p>
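<p>Just out of curiosity, I timed this <code>numpy</code> method against <code>sklearn.model_selection.train_test_split</code> and found little difference. <code>train_test_split</code> was faster, but only slightly. Either way, I still think <code>train_test_split</code> is the better option.</p>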
<p><strong><code>numpy</code> method:</strong> 0.26082248413999876 seconds</p>
<p><strong><code>train_test_split</code> method:</strong> 0.2221719217000092 seconds</p>
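<p>For anyone who wants to try a comparison like this themselves, here is one possible way to set it up with <code>timeit</code>. It is a sketch under my own assumptions, not the exact script behind the numbers above: the function names, the repeat count and the scaled-down sizes are illustrative (set <code>N_ROWS = 46928</code> and <code>N_TRAIN = 41928</code> to match the array in this answer), and it requires scikit-learn to be installed.</p>

```python
import timeit

import numpy as np
from sklearn.model_selection import train_test_split

# Scaled-down sizes (assumption) so the sketch runs quickly.
N_ROWS, N_TRAIN = 5000, 4000
x = np.random.random((N_ROWS, 28, 28))

def numpy_split():
    # Draw distinct indices for the first sample, use the rest for the second.
    s1 = np.random.choice(N_ROWS, N_TRAIN, replace=False)
    s2 = np.setdiff1d(np.arange(N_ROWS), s1)
    return x[s1], x[s2]

def sklearn_split():
    # An integer train_size is the absolute number of training rows.
    return train_test_split(x, train_size=N_TRAIN)

for name, fn in [("numpy", numpy_split), ("train_test_split", sklearn_split)]:
    # Take the best of a few single runs to reduce noise.
    seconds = min(timeit.repeat(fn, number=1, repeat=5))
    print(f"{name}: {seconds:.4f} s")
```

<p>Absolute timings depend on the machine and library versions, so only the relative difference is worth reading into.</p>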