Efficient Python data storage (abstract data types?)

-1 votes
3 answers
1343 views
Asked 2025-04-15 14:10

Sorry for the vague title; I wasn't sure how to phrase my question.

Say I have a string:

blah = "There are three cats in the hat"

And this "userInfo" (I'm not sure which data structure to use for it):

cats -> ("tim", "1 infinite loop")
three -> ("sally", "123 fake st")
three -> ("tim", "1 infinite loop")
three cats -> ("john", "123 fake st")
four cats -> ("albert", "345 real road")
dogs -> ("tim", "1 infinite loop")
cats hat -> ("janet", NULL)

The correct output would be:

tim (since 'cats' exists)
sally (since 'three' exists)
tim (since 'three' exists)
john (since both 'three' and 'cats' exist)
janet (since both 'cats' and 'hat' exist somewhere in the string blah)

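I'd like an efficient way to store this data. The same key string, such as "three", may appear many times (for example, 150 people could have it). Should I just put everything in one list and repeat the "key" for each entry?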

3 Answers

0

I'm not sure exactly what you're trying to do, but maybe you're looking for something like this:

userinfo = {
  "tim": "1 infinite loop",
  "sally": "123 fake st",
  "john": "123 fake st",
  "albert": "345 real road",
  "janet": None
}

conditions = {
  "cats": ["tim"],
  "three": ["sally", "tim"],
  "three cats": ["john"],
  "four cats": ["albert"],
  "dogs": ["tim"],
  "cats hat": ["janet"]
}

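blah = "There are three cats in the hat"
words = set(blah.split())

def all_words_are_in_the_sentence(condition):
  # True when every word of the condition occurs somewhere in the sentence
  return all(w in words for w in condition.split())

for c in conditions:
  if all_words_are_in_the_sentence(c):
    for p in conditions[c]:
      print(p, "because of", c)
      print("additional info:", userinfo[p])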
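If the data arrives as repeated (phrase, name) pairs, as in the question, a dict of lists like `conditions` above can be built so each phrase is stored only once. This is just a minimal sketch: `pairs` and `build_conditions` are names I made up for illustration, and the pairs are copied from the question.

```python
from collections import defaultdict

# (phrase, name) pairs taken from the question; the same phrase may repeat.
pairs = [
    ("cats", "tim"),
    ("three", "sally"),
    ("three", "tim"),
    ("three cats", "john"),
    ("four cats", "albert"),
    ("dogs", "tim"),
    ("cats hat", "janet"),
]

def build_conditions(pairs):
    """Group names under each distinct phrase (hypothetical helper)."""
    grouped = defaultdict(list)
    for phrase, name in pairs:
        grouped[phrase].append(name)
    return dict(grouped)

conditions = build_conditions(pairs)
print(conditions["three"])
```

With 150 people sharing "three", the key would still appear once, with a 150-item list behind it.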
1

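I have no idea what you're actually trying to do, but if you have a lot of data to store and need to search through it, some kind of database with indexing support seems like a good choice.

Whether it's ZODB, CouchDB, or SQL is a matter of taste. Honestly, I think you should care more about search and lookup speed than about storage-space efficiency.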
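As a rough illustration of the indexed-database idea, here is a sketch using the standard-library `sqlite3` module with an in-memory database. The table name `rules`, the index name, and the helper `names_for` are assumptions made for this example, and it only covers exact phrase lookup; checking that all words of a phrase occur in a sentence would still happen in Python.

```python
import sqlite3

# In-memory database; a file path could be used instead for persistence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rules (phrase TEXT, name TEXT, address TEXT)")
# The index lets lookups by phrase avoid a full table scan.
conn.execute("CREATE INDEX idx_rules_phrase ON rules (phrase)")

conn.executemany(
    "INSERT INTO rules VALUES (?, ?, ?)",
    [
        ("cats", "tim", "1 infinite loop"),
        ("three", "sally", "123 fake st"),
        ("three", "tim", "1 infinite loop"),
        ("three cats", "john", "123 fake st"),
        ("four cats", "albert", "345 real road"),
        ("dogs", "tim", "1 infinite loop"),
        ("cats hat", "janet", None),
    ],
)

def names_for(phrase):
    """Return the names stored under an exact phrase (hypothetical helper)."""
    rows = conn.execute(
        "SELECT name FROM rules WHERE phrase = ? ORDER BY name", (phrase,)
    )
    return [row[0] for row in rows]

print(names_for("three"))
```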

6

Something like this?

class Content(object):
    """A rule: every word of `content` must occur in the sentence."""
    def __init__(self, content, maps_to):
        self.content = content.split()
        self.maps_to = maps_to
    def matches(self, words):
        # True only if every word of the rule is among the sentence's words
        return all(c in words for c in self.content)
    def __str__(self):
        return "%s -> %r" % (" ".join(self.content), self.maps_to)

rules = [
    Content('cats',("tim", "1 infinite loop")),
    Content('three',("sally", "123 fake st")),
    Content('three',("tim", "1 infinite loop")),
    Content('three cats',("john", "123 fake st")),
    Content('four cats',("albert", "345 real road")),
    Content('dogs',("tim", "1 infinite loop")),
    Content('cats hat', ("janet", None)),
]

blah = "There are three cats in the hat"

words = blah.split()  # split the sentence once, not once per rule
for r in rules:
    if r.matches(words):
        print(r)

Output

cats -> ('tim', '1 infinite loop')
three -> ('sally', '123 fake st')
three -> ('tim', '1 infinite loop')
three cats -> ('john', '123 fake st')
cats hat -> ('janet', None)
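If the rule list gets large, checking every rule against every sentence may become the bottleneck. One possible variation, sketched here under my own assumptions, is to index the rules by their first word so that only rules sharing a word with the sentence are tested. The rules are plain tuples rather than `Content` objects, and `build_index` and `matching_rules` are hypothetical names.

```python
from collections import defaultdict

# Rules as (phrase, (name, address)) tuples, copied from the question.
rules = [
    ("cats", ("tim", "1 infinite loop")),
    ("three", ("sally", "123 fake st")),
    ("three", ("tim", "1 infinite loop")),
    ("three cats", ("john", "123 fake st")),
    ("four cats", ("albert", "345 real road")),
    ("dogs", ("tim", "1 infinite loop")),
    ("cats hat", ("janet", None)),
]

def build_index(rules):
    """Map the first word of each phrase to the positions of its rules."""
    index = defaultdict(list)
    for i, (phrase, _) in enumerate(rules):
        index[phrase.split()[0]].append(i)
    return index

def matching_rules(sentence, rules, index):
    """Return rules whose words all occur in the sentence, in rule order."""
    words = set(sentence.split())
    # Only rules whose first word occurs in the sentence can possibly match.
    candidates = sorted({i for w in words for i in index.get(w, [])})
    return [rules[i] for i in candidates if set(rules[i][0].split()) <= words]

index = build_index(rules)
blah = "There are three cats in the hat"
for phrase, (name, address) in matching_rules(blah, rules, index):
    print(name, "because of", phrase)
```

For seven rules this buys nothing, so the simple loop above is probably fine until the data grows.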
