Need help creating a GAE datastore loader class?
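I need help creating a GAE datastore loader class for uploading data with appcfg.py. Is there a simpler way to do this? Is there an example more detailed than the one here?
When I try using bulkloader.yaml, I get: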
Uploading data records.
[INFO ] Logging to bulkloader-log-20100701.041515
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20100701.041515.sql3
[INFO ] Connecting to livelihoodproducer.appspot.com/remote_api
[INFO ] Starting import; maximum 10 entities per post
[ERROR ] [Thread-1] WorkerThread:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/adaptive_thread_pool.py", line 150, in WorkOnItems
status, instruction = item.PerformWork(self.__thread_pool)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 693, in PerformWork
transfer_time = self._TransferItem(thread_pool)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 848, in _TransferItem
self.content = self.request_manager.EncodeContent(self.rows)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 1269, in EncodeContent
entity = loader.create_entity(values, key_name=key, parent=parent)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 385, in create_entity
return self.dict_to_entity(input_dict, self.bulkload_state)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 133, in dict_to_entity
self.__run_import_transforms(input_dict, instance, bulkload_state_copy)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 233, in __run_import_transforms
value = self.__dict_to_prop(transform, input_dict, bulkload_state)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/bulkloader_config.py", line 188, in __dict_to_prop
value = transform.import_transform(value)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/bulkloader_parser.py", line 93, in __call__
return self.method(*args, **kwargs)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/transform.py", line 143, in generate_foreign_key_lambda
return datastore.Key.from_path(kind, value)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/datastore_types.py", line 387, in from_path
'received %r (a %s).' % (i + 2, id_or_name, typename(id_or_name)))
BadArgumentError: Expected an integer id or string name as argument 2; received None (a NoneType).
[INFO ] [Thread-3] Backing off due to errors: 1.0 seconds
[INFO ] Unexpected thread death: Thread-1
[INFO ] An error occurred. Shutting down...
[ERROR ] Error in Thread-1: Expected an integer id or string name as argument 2; received None (a NoneType).
[INFO ] 30 entites total, 0 previously transferred
[INFO ] 0 entities (733 bytes) transferred in 2.8 seconds
[INFO ] Some entities not successfully transferred
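As part of this process I am manually downloading the csv data and inserting it into appspot.com. When I upload my own csv data, do the columns need to match the csv downloaded from appspot.com exactly? And how should null values be handled?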
2 Answers
0
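It looks like some of your reference properties have the value None, i.e. they are empty. The bulkloader's helper function does not handle these empty values correctly.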
3
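I created a file named config.yaml to configure the bulkloader, and wrote a simple helper function that handles None references. I don't understand why the original helper doesn't already do this.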
The helper (file helpers.py) is very simple; just put it in the same folder as config.yaml:
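from google.appengine.api import datastore


def create_foreign_key(kind, key_is_id=False):
    # Returns an import transform that builds a key of the given kind.
    def generate_foreign_key_lambda(value):
        # A missing reference stays None instead of being passed to Key.from_path.
        if value is None:
            return None
        if key_is_id:
            value = int(value)
        return datastore.Key.from_path(kind, value)
    return generate_foreign_key_lambda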
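To see what the None guard does without the App Engine SDK installed, here is a minimal stand-alone sketch. Everything outside the original helper is an assumption made for illustration: `make_key` is a hypothetical stand-in for `datastore.Key.from_path` that just returns a tuple, the `key_factory` parameter exists only so the stand-in can be plugged in, and treating an empty string as "no reference" is an addition that may be useful for csv input but is not part of the helper above.

```python
def make_key(kind, id_or_name):
    """Hypothetical stand-in for datastore.Key.from_path; returns a tuple."""
    if id_or_name is None:
        # Mimics the failure mode from the traceback: None is not a valid id or name.
        raise ValueError("Expected an integer id or string name; received None")
    return (kind, id_or_name)


def create_foreign_key(kind, key_is_id=False, key_factory=make_key):
    """Build an import transform that maps a raw value to a key, or to None."""
    def generate_foreign_key_lambda(value):
        # Assumption: a blank value means "no reference", the same as None.
        if value is None or value == "":
            return None
        if key_is_id:
            value = int(value)
        return key_factory(kind, value)
    return generate_foreign_key_lambda


to_comment_key = create_foreign_key("ArticleComment", key_is_id=True)
print(to_comment_key("42"))   # a key-like tuple for id 42
print(to_comment_key(None))   # missing reference is passed through as None
print(to_comment_key(""))     # blank value is also treated as missing
```

Without the guard, the second and third calls would reach the key factory with an unusable value, which is the situation the BadArgumentError in the question describes.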
Here is part of my config.yaml file:
python_preamble:
- import: helpers  # this will import our helper
[other imports]

...

- kind: ArticleComment
  connector: simplexml
  connector_options:
    xpath_to_nodes: "/blog/Comments/Comment"
    style: element_centric
  property_map:
    - property: __key__
      external_name: key
      export_transform: transform.key_id_or_name_as_string

    - property: parent_comment
      external_name: parent-comment
      export_transform: transform.key_id_or_name_as_string
      import_transform: helpers.create_foreign_key('ArticleComment')
#                       ^^^^^^^ here it is
#                       use this instead of transform.create_foreign_key
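Before running an upload, it can help to know which records will produce an empty reference. The following is a rough sketch using only the Python standard library; the sample XML is invented to match the `xpath_to_nodes` and `external_name` values in the config above, and `comments_without_parent` is a hypothetical helper name. Whether the bulkloader hands an absent element and a blank element to the transform in the same way is not something this thread confirms, so the sketch simply reports both.

```python
import xml.etree.ElementTree as ET

# Invented sample shaped like /blog/Comments/Comment in element-centric style.
SAMPLE_XML = """\
<blog>
  <Comments>
    <Comment><key>1</key></Comment>
    <Comment><key>2</key><parent-comment>1</parent-comment></Comment>
    <Comment><key>3</key><parent-comment></parent-comment></Comment>
  </Comments>
</blog>
"""


def comments_without_parent(xml_text):
    """Return the keys of Comment nodes whose parent-comment is absent or blank."""
    root = ET.fromstring(xml_text)  # root is the <blog> element
    missing = []
    for comment in root.findall("./Comments/Comment"):
        # findtext returns None when the element is absent, '' when it is empty.
        parent = comment.findtext("parent-comment")
        if parent is None or not parent.strip():
            missing.append(comment.findtext("key"))
    return missing


print(comments_without_parent(SAMPLE_XML))
```

Records listed by this check are the ones for which the import transform needs to tolerate a missing value.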