How do I extract the compressed MNIST dataset into training and test sets?

Posted 2024-05-28 19:46:52


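I am trying to extract the compressed MNIST files from here. How can I do this from a Python script and split the data into training and test samples? The code I tried: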

def fetch(url):
  import requests, gzip, os, hashlib, numpy
  fp = os.path.join("/tmp", hashlib.md5(url.encode('utf-8')).hexdigest())
  if not os.path.isfile(fp):
    with open(fp, "rb") as f:
      dat = f.read()
  else:
    with open(fp, "wb") as f:
      dat = requests.get(url).content
      f.write(dat)
  return numpy.frombuffer(gzip.decompress(dat), dtype=np.uint8).copy()
X_train = fetch("http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
Y_train = fetch("http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz")[8:]
X_test = fetch("http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
Y_test = fetch("http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz")[8:]

I get the error FileNotFoundError: [Errno 2] No such file or directory: '/tmp/d8b415e67abd11881e156b8f111d3300'. When I change if not os.path.isfile(fp): to if os.path.isfile(fp):, I get a different error:

> TypeError                                 Traceback (most recent call last)
> <ipython-input-3-a98ee7ff45b8> in <module>
>      14 f.write(dat)
>      15 return numpy.frombuffer(gzip.decompress(dat), dtype=np.uint8).copy()
> ---> 16 X_train = fetch("http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
>      17 Y_train = fetch("http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz")[8:]
>      18 X_test = fetch("http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
>
> TypeError: 'NoneType' object is not subscriptable

How can I fetch them successfully?
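For reference, here is a minimal sketch of how the caching logic could be arranged, not a definitive fix. The FileNotFoundError is consistent with the `if not os.path.isfile(fp)` branch trying to read a cache file that does not exist yet, and the TypeError suggests that `fetch` returned None on one path (for example, if the `return` were indented inside one branch). The sketch reads from the cache only when the file exists and keeps the `return` outside the if/else. Several things here are assumptions rather than part of the original: `urllib.request` is swapped in for `requests` to keep the example dependency-light, the `cache_dir` parameter and the `load_mnist` helper are hypothetical names, and the URLs from the question are assumed to still serve the files.

```python
import gzip
import hashlib
import os
import tempfile
import urllib.request

import numpy as np


def fetch(url, cache_dir=tempfile.gettempdir()):
    """Return the decompressed bytes behind `url` as a uint8 array.

    `cache_dir` is a hypothetical parameter added for this sketch.
    """
    fp = os.path.join(cache_dir, hashlib.md5(url.encode("utf-8")).hexdigest())
    if os.path.isfile(fp):
        # Cache hit: read the previously downloaded .gz bytes.
        with open(fp, "rb") as f:
            dat = f.read()
    else:
        # Cache miss: download, then store the raw .gz bytes for next time.
        with urllib.request.urlopen(url) as resp:
            dat = resp.read()
        with open(fp, "wb") as f:
            f.write(dat)
    # The return sits outside the if/else so that both branches reach it.
    return np.frombuffer(gzip.decompress(dat), dtype=np.uint8).copy()


def load_mnist(base_url="http://yann.lecun.com/exdb/mnist/"):
    """Hypothetical helper: fetch and split the four MNIST files."""
    # IDX image files start with a 16-byte header, label files with 8 bytes.
    X_train = fetch(base_url + "train-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
    Y_train = fetch(base_url + "train-labels-idx1-ubyte.gz")[8:]
    X_test = fetch(base_url + "t10k-images-idx3-ubyte.gz")[0x10:].reshape((-1, 28, 28))
    Y_test = fetch(base_url + "t10k-labels-idx1-ubyte.gz")[8:]
    return X_train, Y_train, X_test, Y_test
```

Calling `X_train, Y_train, X_test, Y_test = load_mnist()` would then give the training and test splits directly, since the dataset ships as separate train and t10k files. If an earlier failed run left an empty or partial file in the cache directory, deleting it before retrying may be necessary.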

