Emulating curl's --data-binary in a URL request

1 vote

2 answers

1764 views

Asked 2025-04-17 19:53

I want to send a URL request whose body is equivalent to several JSON objects separated by newlines. The goal is to bulk index two items in Elasticsearch.

This works:

curl -XPOST 'localhost:9200/myindex/mydoc?pretty=true' --data-binary @myfile.json

where myfile.json contains:

{"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}    
{"title": "hello"}
{"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}
{"title": "world"}

When I try:

req = urllib2.Request(url, data=
    json.dumps({"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}) + "\n" +
    json.dumps({"title": "hello"}) + "\n" +
    json.dumps({"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}) + "\n" +
    json.dumps({"title": "world"})
)

the result is:

HTTP Error 500: Internal Server Error

2 Answers

To me, this use case is really a bulk index request against Elasticsearch.

Using rawes makes this much simpler:

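import rawes
es = rawes.Elastic('localhost:9200')

# Read the file as one newline-delimited string; the bulk API expects
# the raw text rather than a Python list of lines.
with open('myfile.json') as f:
    bulk_body = f.read()

es.post('someindex/sometype/_bulk', data=bulk_body)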
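
For context, curl's --data-binary sends the file's bytes exactly as they are, newlines included, whereas plain -d/--data strips newlines when it reads from a file. A rough standard-library equivalent is sketched below, assuming Python 3; read_bulk_payload is a hypothetical helper name, not part of rawes or Elasticsearch. It reads the file in binary mode and only appends a final newline if one is missing, since the bulk API expects the body to end with one.

```python
def read_bulk_payload(path):
    """Return the file's bytes unchanged, adding a final newline if missing."""
    # Binary mode avoids any newline translation, like curl --data-binary.
    with open(path, "rb") as f:
        payload = f.read()
    # The bulk body must end with a newline after the last line.
    if not payload.endswith(b"\n"):
        payload += b"\n"
    return payload
```

The returned bytes can be passed as the request body to whichever HTTP client is in use.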

Answered 2025-04-17 by Python大师


The "HTTP Error 500" may be caused by leaving out the index name or the index type.
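
To find out what the server is actually objecting to, it can help to read the body of the error response, since Elasticsearch usually describes the failure there. This is a minimal sketch, assuming Python 3's urllib.request rather than urllib2; describe_http_error and post_bulk are hypothetical helper names introduced only for illustration.

```python
import urllib.error
import urllib.request


def describe_http_error(err):
    """Format an HTTPError together with the response body the server sent."""
    body = err.read().decode("utf-8", errors="replace")
    return "HTTP %d %s: %s" % (err.code, err.reason, body)


def post_bulk(url, payload):
    """POST payload (bytes) to url; return the response text or an error description."""
    req = urllib.request.Request(url, data=payload)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Instead of just "HTTP Error 500", surface the server's explanation.
        return describe_http_error(err)
```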

Also: for bulk inserts, Elasticsearch requires a newline character "\n" after the last record; otherwise that record is not inserted.

You can try:

import urllib2
import json

url = 'http://localhost:9200/myindex/mydoc/_bulk?pretty=true'

data = json.dumps({"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}) + "\n" + json.dumps({"title":"hello"}) + "\n" + json.dumps({"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}}) + "\n" + json.dumps({"title":"world"})

req = urllib2.Request(url,data=data+"\n")

f = urllib2.urlopen(req)
print f.read()

Or, with the code slightly restructured:

import urllib2
import json

url = 'http://localhost:9200/myindex/mydoc/_bulk?pretty=true'

data = [
    {"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}},
    {"title":"hello"},
    {"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}},
    {"title":"world"}
]

encoded_data = "\n".join(map(json.dumps,data)) + "\n"

req = urllib2.Request(url,data=encoded_data)

f = urllib2.urlopen(req)
print f.read()
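
The snippets above are Python 2 (urllib2 and the print statement). As a hedged sketch of the same idea in Python 3, the version below builds the newline-terminated body and the request object without sending it. build_bulk_body and build_bulk_request are hypothetical helper names, and the Content-Type header is an assumption that applies to newer Elasticsearch versions, which expect an explicit content type on bulk requests.

```python
import json
import urllib.request


def build_bulk_body(actions):
    """Serialize each dict on its own line, ending with a trailing newline."""
    return "".join(json.dumps(action) + "\n" for action in actions)


def build_bulk_request(url, actions):
    """Build (but do not send) a POST request carrying the bulk body."""
    body = build_bulk_body(actions).encode("utf-8")
    # Assumption: newer Elasticsearch versions expect an explicit content type.
    headers = {"Content-Type": "application/x-ndjson"}
    return urllib.request.Request(url, data=body, headers=headers)


actions = [
    {"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}},
    {"title": "hello"},
    {"index": {"_parent": "btaCovzjQhqrP4s3iPjZKQ"}},
    {"title": "world"},
]
req = build_bulk_request(
    "http://localhost:9200/myindex/mydoc/_bulk?pretty=true", actions
)
# Sending is left to the caller, for example:
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```

Because the request carries a data payload, urllib issues it as a POST, matching curl -XPOST.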

Answered 2025-04-17 by Python大师

