How do I use the bulkloader to populate a class that has a db.SelfReferenceProperty?

Asked 2025-04-16 04:06

I have a class that uses db.SelfReferenceProperty to build a tree-like structure.

When I try to upload data to the datastore with appcfg.py upload_data --config_file=bulkloader.yaml --kind=Group --filename=group.csv (...), I get an exception: BadValueError: name must not be empty (the full traceback is below).

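I tried reordering the data so that groups referenced by a foreign key come before the rows that reference them, but that did not fix the problem.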

If I comment out the transform line "import_transform: transform.create_foreign_key('Group')" in bulkloader.yaml, the data uploads, but the property is stored as a string, which breaks my application logic.

- kind: Group
  connector: csv
  connector_options:
  property_map:
    - property: __key__
      external_name: key
      export_transform: transform.key_id_or_name_as_string

    - property: name
      external_name: name
      # Type: String Stats: 9 properties of this type in this kind.

    - property: section
      external_name: section
      # Type: Key Stats: 6 properties of this type in this kind.
      import_transform: transform.create_foreign_key('Group')
      export_transform: transform.key_id_or_name_as_string

Is there a way to make the bulkloader handle self-references, or should I convert the bulk-loaded data on the server side, or write my own bulk loading routine?

----
Traceback (most recent call last):
  File "/home/username/src/google_appengine/google/appengine/tools/adaptive_thread_pool.py", line 150, in WorkOnItems
    status, instruction = item.PerformWork(self.__thread_pool)
  File "/home/username/src/google_appengine/google/appengine/tools/bulkloader.py", line 691, in PerformWork
    transfer_time = self._TransferItem(thread_pool)
  File "/home/username/src/google_appengine/google/appengine/tools/bulkloader.py", line 846, in _TransferItem
    self.content = self.request_manager.EncodeContent(self.rows)
  File "/home/username/src/google_appengine/google/appengine/tools/bulkloader.py", line 1267, in EncodeContent
    entity = loader.create_entity(values, key_name=key, parent=parent)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 382, in create_entity
    return self.dict_to_entity(input_dict, self.bulkload_state)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 133, in dict_to_entity
    self.__run_import_transforms(input_dict, instance, bulkload_state_copy)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 230, in __run_import_transforms
    value = self.__dict_to_prop(transform, input_dict, bulkload_state)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 188, in __dict_to_prop
    value = transform.import_transform(value)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/bulkloader_parser.py", line 93, in __call__
    return self.method(*args, **kwargs)
  File "/home/username/src/google_appengine/google/appengine/ext/bulkload/transform.py", line 114, in generate_foreign_key_lambda
        return datastore.Key.from_path(kind, value)
  File "/home/username/src/google_appengine/google/appengine/api/datastore_types.py", line 384, in from_path
    ValidateString(id_or_name, 'name')
  File "/home/username/src/google_appengine/google/appengine/api/datastore_types.py", line 109, in ValidateString
    raise exception('%s must not be empty.' % name)
BadValueError: name must not be empty.

2 Answers

2

transform.py has a decorator (possibly a recent addition) that solves this problem:

def none_if_empty(fn):
  """A decorator which returns None if its input is empty else fn(x).

  Useful on import.  Can be used in config files
  (e.g. "transform.none_if_empty(int)" or as a decorator.

  Args:
    fn: Single argument transform function.

  Returns:
    Wrapped function.
  """

  def wrapper(value):
    if value == '' or value is None or value == []:
      return None
    return fn(value)

  return wrapper

So the following import_transform also solves the problem, without needing a separate custom helpers.py file:

transform.none_if_empty(transform.create_foreign_key('Group'))
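
To see why the wrapper helps, here is a minimal, self-contained sketch. The none_if_empty function repeats the logic quoted above; fake_create_foreign_key is a hypothetical stand-in for transform.create_foreign_key (it is not SDK code) that rejects an empty name roughly the way the traceback shows Key.from_path doing. The sample rows are made up. The assumption is that root groups have an empty section cell, which the wrapper turns into None before any key is built.

```python
def none_if_empty(fn):
  """Return None for empty input, else fn(value) (same logic as quoted above)."""
  def wrapper(value):
    if value == '' or value is None or value == []:
      return None
    return fn(value)
  return wrapper


def fake_create_foreign_key(kind):
  """Hypothetical stand-in for transform.create_foreign_key; not SDK code."""
  def make_key(value):
    if not value:
      # Mimics the failure in the traceback for an empty key name.
      raise ValueError('name must not be empty.')
    return (kind, value)  # a real transform would return a datastore.Key
  return make_key


to_group_key = none_if_empty(fake_create_foreign_key('Group'))

# Made-up rows in the shape of group.csv: (key, name, section).
rows = [('root', 'Root group', ''), ('child', 'Child group', 'root')]

# The root row has no parent, so its section becomes None instead of raising.
sections = [to_group_key(section) for _, _, section in rows]
```

Calling the unwrapped stand-in on the root row's empty section would raise, which matches the symptom in the question.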
4

Following the answer to a similar question, I solved this by creating a small helpers.py file that wraps transform.create_foreign_key:

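from google.appengine.api import datastore

def create_foreign_key(kind, key_is_id=False):
  """Like transform.create_foreign_key, but returns None for unusable values."""
  def generate_foreign_key_lambda(value):
    # No value at all: leave the reference unset.
    if value is None:
      return None

    if key_is_id:
      value = int(value)
    try:
      return datastore.Key.from_path(kind, value)
    except:
      # Key.from_path rejects an empty name; treat that as "no reference".
      return None

  return generate_foreign_key_lambda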

Put this file in the same folder as your bulk upload YAML configuration (bulkloader.yaml), then add the following to that configuration file:

python_preamble:
- (...)
- import: helpers

transformers:

- kind: Group
  connector: csv
  connector_options:
  property_map:
    - property: __key__
      external_name: key
      export_transform: transform.key_id_or_name_as_string

    - property: name
      external_name: name

    - property: section
      external_name: section
      import_transform: helpers.create_foreign_key('Group')
                      # ^^^^^^^ we use the wrapper instead
      export_transform: transform.key_id_or_name_as_string

With these changes, the bulk upload works.

Before using this, be sure to replace the catch-all except with something narrower, such as except BadValueError.
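
One way that change might look, as a sketch rather than tested SDK code. It assumes BadValueError can be imported from google.appengine.api.datastore_errors. The ImportError fallback, with its stand-in BadValueError and key_from_path, is my own addition so the snippet also runs without the SDK; a real helpers.py would use the SDK imports directly.

```python
try:
  from google.appengine.api import datastore
  from google.appengine.api.datastore_errors import BadValueError
  key_from_path = datastore.Key.from_path
except ImportError:
  # Assumption: stand-ins so the sketch also runs without the SDK installed.
  class BadValueError(Exception):
    pass

  def key_from_path(kind, value):
    if value == '':
      raise BadValueError('name must not be empty.')
    return (kind, value)


def create_foreign_key(kind, key_is_id=False):
  def generate_foreign_key_lambda(value):
    if value is None:
      return None

    if key_is_id:
      value = int(value)
    try:
      return key_from_path(kind, value)
    except BadValueError:
      # Only the expected "empty name" failure becomes None;
      # any other exception still propagates.
      return None

  return generate_foreign_key_lambda
```

With the narrower handler, an empty section still maps to None, while unrelated failures are no longer silently hidden.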
