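Google App Engine bulkloader "Unexpected thread death"
I'm trying to upload a moderately sized CSV file to Google App Engine with the bulkloader, but something goes wrong partway through the upload. This is the output: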
[INFO ] Logging to bulkloader-log-20110328.181531
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20110328.181531.sql3
[INFO ] Connecting to notmyrealappname.appspot.com/_ah/remote_api
[INFO ] Starting import; maximum 10 entities per post
...............................................................[INFO ] Unexpected thread death: WorkerThread-7
[INFO ] An error occurred. Shutting down...
.........[ERROR ] Error in WorkerThread-7: <urlopen error [Errno -2] Name or service not known>
[INFO ] 1740 entites total, 0 previously transferred
[INFO ] 720 entities (472133 bytes) transferred in 32.3 seconds
[INFO ] Some entities not successfully transferred
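Of the 19,000 records I wanted to upload, only about 700 made it. I'm wondering why it fails. I checked the CSV file for errors such as stray commas that could trip up Python's csv reader, and I also stripped out the non-ASCII characters.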
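The manual check described above (stray commas, leftover non-ASCII characters) can also be done with a short script. This is only a sketch of one way to do it, not part of the bulkloader: `scan_csv` is a hypothetical helper name, it assumes the first row defines the expected number of columns, and it is written for a current Python 3 run locally against the file.

```python
import csv
import io


def scan_csv(text, delimiter=","):
    """Scan CSV text and report rows that look suspicious.

    Returns (row_count, bad_width_rows, non_ascii_rows), where the two
    lists hold 1-based row numbers as yielded by csv.reader.
    Assumption: the first non-empty row has the correct column count.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = None      # column count taken from the first non-empty row
    bad_width = []       # rows whose column count differs from `expected`
    non_ascii = []       # rows containing any character above code point 127
    count = 0
    for rownum, row in enumerate(reader, start=1):
        if not row:
            continue     # csv.reader yields [] for blank lines; skip them
        count += 1
        if expected is None:
            expected = len(row)
        elif len(row) != expected:
            bad_width.append(rownum)
        if any(ord(ch) > 127 for field in row for ch in field):
            non_ascii.append(rownum)
    return count, bad_width, non_ascii


# Small inline sample: row 3 has an extra comma, row 4 has a non-ASCII letter.
sample = "name,city\nalice,Paris\nbob,Berlin,extra\nzo\u00eb,Oslo\n"
count, bad_width, non_ascii = scan_csv(sample)
print(count, bad_width, non_ascii)
```

For a real file, the text could be read with `open(path, newline="", encoding="utf-8").read()`, where `path` is whatever file is passed to `--filename`. If both lists come back empty, the CSV itself is unlikely to be the cause of the failure.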
1 Answer
Raising the batch size (batch_size) and the records-per-second limit (rps_limit) worked. I set the batch size to 1000 and the rps limit to 500:
appcfg.py upload_data --url= --application= --filename= --email= --batch_size=1000 --rps_limit=500
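One possible reason the larger batch size helps is that it cuts the number of HTTP posts, and the error in the log (`Name or service not known`) is a hostname lookup failure that can hit any individual request. The arithmetic below is a rough sketch under that assumption: it treats every batch as exactly one post, as the log line "maximum 10 entities per post" suggests, and `posts_needed` is a hypothetical helper name.

```python
import math


def posts_needed(total_entities, batch_size):
    """Number of HTTP posts if each post carries at most batch_size entities."""
    return math.ceil(total_entities / batch_size)


TOTAL = 19000  # record count mentioned in the question

# Compare the batch size from the log (10) with the one from the answer (1000).
for batch_size in (10, 1000):
    print(batch_size, posts_needed(TOTAL, batch_size))
```

Under these assumptions the upload drops from 1900 posts to 19, so there are far fewer chances for a transient lookup failure to kill a worker thread. This does not address the underlying name resolution problem itself.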