How to deal with repeated column values in Sqlite? Through compression?

Posted 2024-04-19 08:22:48


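I know (from the answers to this question) that Sqlite does not support compression by default. Can it be enabled, or is another tool needed? Here is the situation:

I need to add millions of rows to a Sqlite database. The table contains a description column (about 500 characters on average), and each description is shared by 40 rows on average, like this: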

id    name    othercolumn    description 
1     azefds  ...            This description will be the same for probably 40 rows
2     tsdyug  ...            This description will be the same for probably 40 rows
...
40    wxcqds  ...            This description will be the same for probably 40 rows
41    azeyui  ...            This one is unique
42    uiuotr  ...            This one will be shared by 60 rows
43    poipud  ...            This one will be shared by 60 rows
...
101   iuotyp  ...            This one will be shared by 60 rows
102   blaxwx  ...            Same description for the next 10 rows
103   sdhfjk  ...            Same description for the next 10 rows
...

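Questions:

  • Can you insert the rows as they are and enable a compression algorithm in the DB? Pro: you don't have to deal with two tables, and querying is easier.

or

  • Should you use two tables?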

    id    name    othercolumn    descriptionid
    1     azefds  ...            1
    2     tsdyug  ...            1    
    ...
    40    wxcqds  ...            1
    41    azeyui  ...            2
    ...
    
    id    description
    1     This description will be the same for probably 40 rows
    2     This one is unique
    

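    Con: unlike the simple select id, name, description from mytable of solution 1, we would have to use a more complex way to retrieve the data, involving two tables and possibly multiple queries? Or can it be done without complicated queries, with a single clever query using union or merge or something similar?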


2 answers

Here is some sample code in Python to illustrate ScottHunter's answer:

import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE mytable (id integer, name text, descriptionid integer)")
c.execute("CREATE TABLE descriptiontable (id integer, description text)")

# Rows 1, 2 and 4 share description 1; SQL string literals use single quotes
c.execute("INSERT INTO mytable VALUES(1, 'abcdef', 1)")
c.execute("INSERT INTO mytable VALUES(2, 'ghijkl', 1)")
c.execute("INSERT INTO mytable VALUES(3, 'iovxcd', 2)")
c.execute("INSERT INTO mytable VALUES(4, 'zuirur', 1)")
c.execute("INSERT INTO descriptiontable VALUES(1, 'Description1')")
c.execute("INSERT INTO descriptiontable VALUES(2, 'Description2')")

# Join the two tables so every row comes back with its description text
c.execute("SELECT mytable.id, mytable.name, descriptiontable.description FROM mytable, descriptiontable WHERE mytable.descriptionid=descriptiontable.id ORDER BY mytable.id")

print(c.fetchall())

#[(1, 'abcdef', 'Description1'),
# (2, 'ghijkl', 'Description1'),
# (3, 'iovxcd', 'Description2'),
# (4, 'zuirur', 'Description1')]
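When inserting millions of rows, each description has to be stored only once and then referenced by its id. One possible way to do this is sketched below. It assumes a UNIQUE constraint on the description column and uses a hypothetical helper named add_row; neither is part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
# Assumption: UNIQUE on description lets SQLite reject duplicate texts
c.execute("CREATE TABLE descriptiontable (id INTEGER PRIMARY KEY, description TEXT UNIQUE)")
c.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, descriptionid INTEGER)")

def add_row(cur, name, description):
    """Hypothetical helper: store the description once, then reference it."""
    # Does nothing if this description text is already present
    cur.execute("INSERT OR IGNORE INTO descriptiontable (description) VALUES (?)",
                (description,))
    # Look up the id of the (new or existing) description
    cur.execute("SELECT id FROM descriptiontable WHERE description = ?",
                (description,))
    description_id = cur.fetchone()[0]
    cur.execute("INSERT INTO mytable (name, descriptionid) VALUES (?, ?)",
                (name, description_id))

sample_rows = [
    ("abcdef", "Description1"),
    ("ghijkl", "Description1"),
    ("iovxcd", "Description2"),
    ("zuirur", "Description1"),
]
for name, description in sample_rows:
    add_row(c, name, description)
conn.commit()

row_count = c.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
description_count = c.execute("SELECT COUNT(*) FROM descriptiontable").fetchone()[0]
print(row_count, description_count)
```

The UNIQUE constraint also creates an index on the description text, which keeps the lookup fast but takes some extra space; whether that trade-off is acceptable depends on the data.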

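Using multiple tables not only prevents inconsistencies and takes less space, it may also be faster, even when multiple or more complex queries are involved (precisely because less data has to be moved around). Which approach you should use depends on which of these characteristics matters most to you.

When you have two tables, the query to retrieve the results looks like this (it is really just a join between the two tables):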
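The space claim can be checked roughly with a sketch like the following. It assumes made-up data (100 descriptions of about 500 characters, each shared by 40 rows) and estimates the size of two in-memory databases from PRAGMA page_count and PRAGMA page_size. The helper db_size_bytes is hypothetical, and the actual numbers will vary with the data.

```python
import sqlite3

def db_size_bytes(conn):
    """Hypothetical helper: estimate database size as page count times page size."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size

# Made-up data: 100 long descriptions, each shared by 40 rows
descriptions = ["Description %d " % i + "x" * 500 for i in range(1, 101)]
rows = [("name%d" % n, n // 40 + 1) for n in range(4000)]  # (name, description id)

# Variant 1: one table, description text repeated in every row
flat = sqlite3.connect(':memory:')
flat.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, description TEXT)")
flat.executemany("INSERT INTO mytable (name, description) VALUES (?, ?)",
                 [(name, descriptions[did - 1]) for name, did in rows])
flat.commit()

# Variant 2: two tables, each description stored once
normalized = sqlite3.connect(':memory:')
normalized.execute("CREATE TABLE descriptiontable (id INTEGER PRIMARY KEY, description TEXT)")
normalized.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, descriptionid INTEGER)")
normalized.executemany("INSERT INTO descriptiontable (id, description) VALUES (?, ?)",
                       list(enumerate(descriptions, start=1)))
normalized.executemany("INSERT INTO mytable (name, descriptionid) VALUES (?, ?)", rows)
normalized.commit()

flat_bytes = db_size_bytes(flat)
normalized_bytes = db_size_bytes(normalized)
print("one table: ", flat_bytes, "bytes")
print("two tables:", normalized_bytes, "bytes")
```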

select table1.id, table1.name, table1.othercolumn, table2.description
from table1, table2
where table1.descriptionid=table2.id
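If the concern is that every query must repeat the join, one option is to wrap it in a view so that reads look like the single-table query from the question. This is a minimal sketch, assuming the table1 and table2 names from the query above; the view name mytable_view and the sample values in othercolumn are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, othercolumn TEXT, descriptionid INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, description TEXT)")

conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(1, 'Description1'), (2, 'Description2')])
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)",
                 [(1, 'azefds', 'a', 1), (2, 'tsdyug', 'b', 1), (41, 'azeyui', 'c', 2)])

# Hypothetical view name; it hides the join behind a single-table interface
conn.execute("""
    CREATE VIEW mytable_view AS
    SELECT table1.id AS id,
           table1.name AS name,
           table1.othercolumn AS othercolumn,
           table2.description AS description
    FROM table1
    JOIN table2 ON table1.descriptionid = table2.id
""")

# Reads now look like the simple query from solution 1
result = conn.execute("SELECT id, name, description FROM mytable_view ORDER BY id").fetchall()
print(result)
# [(1, 'azefds', 'Description1'), (2, 'tsdyug', 'Description1'), (41, 'azeyui', 'Description2')]
```

Views in SQLite are read-only, so inserts still have to go to the underlying tables.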
