如何在Python中处理完整性错误以继续插入数据库?

1 投票
3 回答
1440 浏览
提问于 2025-04-18 13:15

我有一个Postgres数据库,结构如下:

                                                        Table "public.available_dates"
   Column   |            Type             |                          Modifiers                           | Storage | Stats target | Description
------------+-----------------------------+--------------------------------------------------------------+---------+--------------+-------------
 id         | integer                     | not null default nextval('available_dates_id_seq'::regclass) | plain   |              |
 unix_day   | integer                     |                                                              | plain   |              |
 hour       | integer                     |                                                              | plain   |              |
 created_at | timestamp without time zone |                                                              | plain   |              |
 updated_at | timestamp without time zone |                                                              | plain   |              |
Indexes:
    "available_dates_pkey" PRIMARY KEY, btree (id)
    "index_available_dates_on_unix_day_and_hour" UNIQUE, btree (unix_day, hour)
    "index_available_dates_on_unix_day" btree (unix_day)
Has OIDs: no

这个数据库连接到我正在开发的Rails应用和一个作为服务运行的Python应用。我在Python中使用peewee作为ORM(对象关系映射)。Rails应用运行得很好,我需要这个数据库中的每一个索引。

Python应用会查看文件,并将日期添加到数据库中。现在我遇到的问题是,如果日期和小时已经存在,它就会失败并结束Python应用。我不想在数据库中出现重复数据,但我也希望应用在出现这种完整性错误时能够继续运行。以下是我目前的代码:

data_source = [{'unix_day': 1370044800, 'created_at': datetime.datetime(2014, 7, 14, 10, 12, 57, 488000), 'updated_at': datetime.datetime(2014, 7, 14, 10, 12, 57, 488000), 'hour': 1}, 
...
]

        try:
            with db.transaction():
                Available_Dates.insert_many(data_source).execute()
        except IntegrityError as e:
            print e
            db.rollback()
            pass
        else:
            db.commit()
        db.close()

这段代码会失败(这是应该的):

重复的键值违反了唯一约束 "index_available_dates_on_unix_day_and_hour" 详细信息:键 (unix_day, hour)=(1370044800, 10) 已经存在。

我该如何让我的代码说,“失败了?那没关系……我就试下一个。”

注意:我预计失败的次数会比成功的多。

编辑:根据第一个回答的选项1:

for data in data_source:
            try:
                av_date = Available_Dates()
                av_date.unix_day = data['unix_day']
                av_date.hour = data['hour']
                av_date.created_at = data['created_at']
                av_date.updated_at = data['updated_at']
                av_date.save()

            except Exception as e:
                pass

这似乎不太好,因为我捕获了所有异常……而且,我还得单独给每个成员赋值,这看起来效率不高。它还因为数据库上有锁而失败。

duplicate key value violates unique constraint "index_available_dates_on_unix_day_and_hour"
DETAIL:  Key (unix_day, hour)=(1370044800, 1) already exists.

current transaction is aborted, commands ignored until end of transaction block

current transaction is aborted, commands ignored until end of transaction block

current transaction is aborted, commands ignored until end of transaction block

current transaction is aborted, commands ignored until end of transaction block

current transaction is aborted, commands ignored until end of transaction block
 ...

3 个回答

1

我觉得如果你用一个函数,比如说insert_many,把所有东西都插入进去的话,一旦出现错误就会比较麻烦。我的建议是,逐个插入每个项目,使用一个for循环,然后把每次插入的操作放在try-except里,这样就能更好地处理可能出现的错误,而不是把所有操作都放在一起。

2

我这样做,是为了防止万一有用

data = [(a,b), (a,b), (a,b)]

with database.atomic():
    # try bulk
    try:
        MyModel.insert_many(data, fields=[MyModel.field1, MyModel.field2]).execute()
    # if it fails, switch to individually adding (slower)
    except IntegrityError:
        for d in data:
            try:
                entry = MyModel(
                    field1=d[0],
                    field2=d[1],)
                entry.save()
            except IntegrityError:
                pass
-1

这是我所做的(感谢ZJS的帮助):

for data in data_source:
            try:
                exists = Available_Dates.get(Available_Dates.unix_day == data['unix_day'],
                                             Available_Dates.hour == data['hour'])
            except Available_Dates.DoesNotExist:
                av_date = Available_Dates.create(**data)
                print av_date.id, "added to the database."
            except Exception as e:
                print e, type(e)
                pass

撰写回答