What is the best way to compress JSON for storage in a memory-based store such as redis or memcache?

>>> my_dict
{'details': {'1': {'age': 13, 'name': 'dhruv'}, '2': {'age': 15, 'name': 'Matt'}}, 'members': ['1', '2']}
>>> json.dumps(my_dict)
'{"details": {"1": {"age": 13, "name": "dhruv"}, "2": {"age": 15, "name": "Matt"}}, "members": ["1", "2"]}'
### SOME BASIC COMPACTION ###
>>> json.dumps(my_dict, separators=(',',':'))
'{"details":{"1":{"age":13,"name":"dhruv"},"2":{"age":15,"name":"Matt"}},"members":["1","2"]}'

3 Answers

User

Answer 1 · edited 2024-05-13 19:41:16

We simply use gzip as the compressor.

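# Python 2 only: the cStringIO module was removed in Python 3
# (io.BytesIO is its replacement there).
import gzip
import cStringIO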

def decompressStringToFile(value, outputFile):
  """
  decompress the given string value (which must be valid compressed gzip
  data) and write the result in the given open file.
  """
  stream = cStringIO.StringIO(value)
  decompressor = gzip.GzipFile(fileobj=stream, mode='r')
  while True:  # until EOF
    chunk = decompressor.read(8192)
    if not chunk:
      decompressor.close()
      outputFile.close()
      return 
    outputFile.write(chunk)

def compressFileToString(inputFile):
  """
  read the given open file, compress the data and return it as string.
  """
  stream = cStringIO.StringIO()
  compressor = gzip.GzipFile(fileobj=stream, mode='w')
  while True:  # until EOF
    chunk = inputFile.read(8192)
    if not chunk:  # EOF?
      compressor.close()
      return stream.getvalue()
    compressor.write(chunk)

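In our use case we store the result in files, as you can imagine. To work with in-memory strings only, you can use a cStringIO.StringIO() object in place of the file.

User

Answer 2 · edited 2024-05-13 19:41:16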

If you want it fast, try lz4. If you want better compression, go for lzma.
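As a rough illustration of that speed versus ratio trade-off, here is a minimal sketch that uses only the standard library. lz4 is a third-party package, so zlib at its lowest compression level stands in for the "fast" end here; that substitution is an assumption of this sketch, and the sample records are made up for illustration.

```python
import json
import lzma
import zlib

# Hypothetical sample data: many similar records, so there is redundancy to exploit.
records = {str(i): {"age": 13 + i % 5, "name": "user%d" % i} for i in range(1000)}
raw = json.dumps(records, separators=(",", ":")).encode("utf-8")

fast = zlib.compress(raw, 1)   # level 1 favours speed over ratio
small = lzma.compress(raw)     # LZMA favours ratio over speed

# Both codecs must give back exactly the bytes that went in.
assert zlib.decompress(fast) == raw
assert lzma.decompress(small) == raw

print("raw:", len(raw), "zlib-1:", len(fast), "lzma:", len(small))
```

If the lz4 package is installed, its lz4.frame.compress and lz4.frame.decompress functions could be swapped in for the zlib calls. The sizes printed depend entirely on the data, so run it on a realistic payload.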

Are there any other better ways to compress json to save memory in redis (also ensuring lightweight decoding afterwards)?
How good a candidate would msgpack [http://msgpack.org/] be?

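Msgpack is relatively fast and has a small memory footprint. But ujson is usually faster for me. You should compare them on your own data, measuring compression and decompression speed as well as the compression ratio.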
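One way to run such a comparison is a small harness like the sketch below. The benchmark function and its result keys are hypothetical names chosen for this example, and only the standard-library json module is exercised so that it runs without extra packages.

```python
import json
import time

def benchmark(name, encode, decode, obj, repeat=200):
    """Time `repeat` encodes and decodes of obj and report the encoded size."""
    start = time.perf_counter()
    for _ in range(repeat):
        blob = encode(obj)
    encode_s = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeat):
        restored = decode(blob)
    decode_s = time.perf_counter() - start

    # A candidate that does not round-trip is not worth timing.
    assert restored == obj, name + " did not round-trip"
    return {"name": name, "bytes": len(blob),
            "encode_s": encode_s, "decode_s": decode_s}

sample = {"details": {"1": {"age": 13, "name": "dhruv"},
                      "2": {"age": 15, "name": "Matt"}},
          "members": ["1", "2"]}

# Each candidate is (label, encoder, decoder). Other serializers can be
# added as further tuples, e.g. msgpack.packb / msgpack.unpackb or
# ujson.dumps / ujson.loads, if those packages are installed.
candidates = [
    ("json-default", lambda o: json.dumps(o).encode("utf-8"), json.loads),
    ("json-compact",
     lambda o: json.dumps(o, separators=(",", ":")).encode("utf-8"),
     json.loads),
]

results = [benchmark(n, enc, dec, sample) for n, enc, dec in candidates]
for row in results:
    print(row)
```

Timings from a tiny dictionary like this one are dominated by call overhead, so the sample should be replaced with data that resembles what will actually be stored.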

Shall I consider options like pickle as well?

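Consider pickle (cPickle in particular) and marshal. They are fast. But remember that they are not secure or scalable, and you pay for the speed with added responsibility.

User

Answer 3 · edited 2024-05-13 19:41:16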
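For reference, a minimal Python 3 sketch of both modules on the dictionary from the question. In Python 3 there is no separate cPickle module; pickle uses its C implementation automatically when available. The variable names are chosen for this example only.

```python
import json
import marshal
import pickle

my_dict = {"details": {"1": {"age": 13, "name": "dhruv"},
                       "2": {"age": 15, "name": "Matt"}},
           "members": ["1", "2"]}

pickled = pickle.dumps(my_dict, protocol=pickle.HIGHEST_PROTOCOL)
marshalled = marshal.dumps(my_dict)
as_json = json.dumps(my_dict, separators=(",", ":")).encode("utf-8")

# Both binary formats round-trip plain dicts, lists, strings and ints.
assert pickle.loads(pickled) == my_dict
assert marshal.loads(marshalled) == my_dict

# Sizes vary with the data, so print them rather than assume a winner.
print("json:", len(as_json), "pickle:", len(pickled), "marshal:", len(marshalled))
```

The "added responsibility" is concrete: pickle.loads can execute arbitrary code when given untrusted bytes, and the marshal format is not guaranteed to stay compatible across Python versions, so neither suits data that other parties can write or that must outlive an interpreter upgrade.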

Based on @Alfe's answer, here is a version that keeps the contents in memory (for network I/O tasks). I also made a few changes to support Python 3.

import gzip
from io import BytesIO

def decompressBytesToString(inputBytes):
  """
  decompress the given byte array (which must be valid 
  compressed gzip data) and return the decoded text (utf-8).
  """
  bio = BytesIO()  # collects the decompressed chunks
  stream = BytesIO(inputBytes)
  decompressor = gzip.GzipFile(fileobj=stream, mode='r')
  while True:  # until EOF
    chunk = decompressor.read(8192)
    if not chunk:
      decompressor.close()
      return bio.getvalue().decode("utf-8")
    bio.write(chunk)

def compressStringToBytes(inputString):
  """
  read the given string, encode it in utf-8,
  compress the data and return it as a byte array.
  """
  bio = BytesIO()
  bio.write(inputString.encode("utf-8"))
  bio.seek(0)
  stream = BytesIO()
  compressor = gzip.GzipFile(fileobj=stream, mode='w')
  while True:  # until EOF
    chunk = bio.read(8192)
    if not chunk:  # EOF?
      compressor.close()
      return stream.getvalue()
    compressor.write(chunk)

To test the compression, try:

inputString="asdf" * 1000
len(inputString)
len(compressStringToBytes(inputString))
decompressBytesToString(compressStringToBytes(inputString))
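Tying this back to the original question, the same round trip can be written for a JSON-serializable object with the one-shot gzip.compress and gzip.decompress functions from the Python 3 standard library. This is a minimal sketch rather than the answer's own code; pack_json and unpack_json are hypothetical helper names.

```python
import gzip
import json

def pack_json(obj):
    """Serialize obj to compact JSON, encode as UTF-8 and gzip it."""
    text = json.dumps(obj, separators=(",", ":"))
    return gzip.compress(text.encode("utf-8"))

def unpack_json(blob):
    """Inverse of pack_json: gunzip the bytes and parse the JSON text."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))

my_dict = {"details": {"1": {"age": 13, "name": "dhruv"},
                       "2": {"age": 15, "name": "Matt"}},
           "members": ["1", "2"]}

blob = pack_json(my_dict)
restored = unpack_json(blob)
assert restored == my_dict
print(len(json.dumps(my_dict)), "->", len(blob), "bytes")
```

For a payload as small as this one, gzip's fixed header and trailer can outweigh the savings, so compression tends to pay off only for larger or more repetitive values. The returned bytes can be handed to a key-value client as the stored value, for example with redis-py's set and get.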
