Efficiently reading FoxPro DBF files with Python

2 votes

1 answer

1778 views

数据工程师

Asked 2025-04-18 09:20

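I have tried two modules that can read DBF files, dbf and dbfpy, and both work, but I have to walk through the database one record at a time to find what I am looking for. That is really slow for large databases. Is there a module that can query a table directly or make use of the CDX indexes?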

data-processing dbf-files database-queries cdx-indexes

1 Answer

I don't believe dbfpy supports index files, and I know that dbf does not.

In dbf, however, you can create a temporary index and then query it:

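import dbf

big_table = dbf.Table('/path/to/some/big_table')
big_table.open()  # current dbf releases expect a table to be opened before use
def criteria(record):
    "index the table using these fields"
    return record.income, record.age
# temporary in-memory index ordered by (income, age)
index = big_table.create_index(key=criteria)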

You can now iterate over index, or search it to return all matching records:

for record in index.search(match=(50000, 30)):
    print(record)
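
To see why searching an index beats scanning every record, here is a minimal pure-Python sketch of the general idea: sort the keys once, then binary-search them. It uses only the standard library (`bisect`) and makes no claim about how the dbf module works internally. The `build_index` and `search` helpers and the sample rows are hypothetical, made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical in-memory records standing in for the rows of a DBF table.
records = [
    {"name": "Ann", "income": 50000, "age": 30},
    {"name": "Ben", "income": 42000, "age": 45},
    {"name": "Cho", "income": 50000, "age": 30},
    {"name": "Dee", "income": 61000, "age": 30},
]

def build_index(rows, key):
    """Return (sorted keys, row positions in the same order)."""
    order = sorted(range(len(rows)), key=lambda i: key(rows[i]))
    keys = [key(rows[i]) for i in order]
    return keys, order

def search(rows, index, match):
    """Return every row whose key equals match, found by binary search."""
    keys, order = index
    lo = bisect_left(keys, match)    # first position with key >= match
    hi = bisect_right(keys, match)   # first position with key > match
    return [rows[i] for i in order[lo:hi]]

index = build_index(records, key=lambda r: (r["income"], r["age"]))
matches = search(records, index, (50000, 30))
print([r["name"] for r in matches])
```

Building the index still reads every record once, so it pays off when the same table is searched repeatedly: each lookup afterwards costs a binary search rather than a full scan.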

Here is a sample table:

table = dbf.Table('tempu', 'name C(25); age N(3,0); income N(7,0);')
table.open(mode=dbf.READ_WRITE)  # recent dbf releases open read-only by default; append() needs write access
for name, age, income in (
        ('Daniel', 33, 55000),
        ('Mike', 59, 125000),
        ('Sally', 33, 77000),
        ('Cathy', 41, 50000),
        ('Bob', 19, 22000),
        ('Lisa', 19, 25000),
        ('Nancy', 27, 50000),
        ('Oscar', 41, 50000),
        ('Peter', 41, 62000),
        ('Tanya', 33, 125000),
        ):
    table.append((name, age, income))

index = table.create_index(lambda rec: (rec.age, rec.income))

There are also methods for finding the start and end of a range:

# all the incomes of those who are 33
for rec in index.search(match=(33,), partial=True):
    print(repr(rec))
print()
# all the incomes of those between the ages of 40 - 59, inclusive
start = index.index_search(match=(40, ), nearest=True)
end = index.index_search(match=(60, ), nearest=True)
for rec in index[start:end]:
    print(repr(rec))

This prints:

Daniel                    33  55000
Sally                     33  77000
Tanya                     33 125000

Cathy                     41  50000
Oscar                     41  50000
Peter                     41  62000
Mike                      59 125000
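
The partial-match and range lookups can be mimicked with the standard library to show what is going on. This is only a sketch: it assumes that `nearest=True` behaves like a lower-bound lookup (the position of the first key not less than the match), which is consistent with the output printed above but is not taken from the dbf source. The `nearest` helper is a hypothetical name.

```python
from bisect import bisect_left

# Same sample rows as the table above: (name, age, income).
rows = [
    ("Daniel", 33, 55000), ("Mike", 59, 125000), ("Sally", 33, 77000),
    ("Cathy", 41, 50000), ("Bob", 19, 22000), ("Lisa", 19, 25000),
    ("Nancy", 27, 50000), ("Oscar", 41, 50000), ("Peter", 41, 62000),
    ("Tanya", 33, 125000),
]

# Sort once by (age, income); sorted() is stable, so ties keep input order.
ordered = sorted(rows, key=lambda r: (r[1], r[2]))
keys = [(r[1], r[2]) for r in ordered]

def nearest(match):
    """Position of the first key >= match (hypothetical helper)."""
    return bisect_left(keys, match)

# Everyone aged 33: a one-element tuple sorts before any longer tuple that
# starts with the same value, so (33,) and (34,) bracket all the 33-year-olds.
aged_33 = ordered[nearest((33,)):nearest((34,))]

# Ages 40 to 59 inclusive: stop at the first key >= (60,).
aged_40_59 = ordered[nearest((40,)):nearest((60,))]

print([r[0] for r in aged_33])
print([r[0] for r in aged_40_59])
```

Slicing between two lower bounds makes the end exclusive, which is why the inclusive range 40 to 59 is searched with an upper match of 60.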

Answered 2025-04-18 by Python大师
