input {
file {
path => "/an/absolute/path" #The path has to be absolute
start_position => beginning
}
}
output {
elasticsearch{
hosts => ["host1:port1", "host2:port2"] #most of the time the host being the DNS name (localhost as the most basic one), the port is 9200
index => "my_crawler_urls"
workers => 4 #to define depending on your available resources/expected performance
}
}
有关Logstash的参考请参见:https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html
否则,举一个例子,把你的爬虫输出放入一个文件,每个url有一行,你可以有下面的logstash配置,在这个例子中,logstash将读取一行作为消息,并将其发送到host1和host2上的弹性服务器。在
当然,您可能需要对爬虫程序的输出进行一些过滤、后处理,因此Logstash可以使用codecs和/或{a3}
相关问题 更多 >
编程相关推荐