OpenStreetMap: chain requests in a loop over each ISO 3166 country code and parse the responses into a DataFrame with Python
I am currently working on a project that combines requests against the Overpass API (Overpass Turbo). My goal is to chain together requests like the following:
[out:csv(::id,::type,"name","addr:postcode","addr:city","addr:street","addr:housenumber","website"," contact:email=*")][timeout:600];
area["ISO3166-1"="NL"]->.a;
( node(area.a)[amenity=childcare];
way(area.a)[amenity=childcare];
rel(area.a)[amenity=childcare];);
out;
The request uses ISO 3166-1 country codes; the full list is available here: https://de.wikipedia.org/wiki/ISO-3166-1-Kodierliste. I want to run this request from Python, substituting different country codes for
Netherlands, Germany, Austria, Switzerland, and France.
Next, I would like to know how to write these requests so that we can run all of them from Python in one go and collect the results into a single data frame in comma-separated form.
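Before wiring up the full loop, it may help to inspect the raw response for a single country code. The sketch below (with a reduced column list for brevity, assuming the public endpoint https://overpass-api.de/api/interpreter) only prints the header line; it shows that the Overpass [out:csv] output is tab-separated by default, which matters when parsing it later:
import requests

# Minimal single-country request against the public Overpass endpoint
# (reduced column list for brevity; adjust as needed).
single_query = """
[out:csv(::id,::type,"name","addr:postcode","addr:city")][timeout:600];
area["ISO3166-1"="NL"]->.a;
( node(area.a)[amenity=childcare];
  way(area.a)[amenity=childcare];
  rel(area.a)[amenity=childcare];);
out;
"""

resp = requests.post("https://overpass-api.de/api/interpreter", data=single_query)
resp.raise_for_status()

# Overpass [out:csv] output is tab-separated; the first line is the header.
print(resp.text.splitlines()[0])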
My idea is that, in order to combine the requests for several different ISO 3166-1 country codes and run them from Python in one pass, we need to build a loop that iterates over the country codes, adapts the request accordingly, and then merges the results into one complete data frame.
Using the requests library for the HTTP calls and pandas for the data handling should be suitable for this task:
import requests
import pandas as pd
from io import StringIO

# List of ISO3166-1 country codes
country_codes = ["NL", "DE", "AT", "CH", "FR"]  # Add more country codes as needed

# Base request template
base_request = """
[out:csv(::id,::type,"name","addr:postcode","addr:city","addr:street","addr:housenumber","website"," contact:email=*")][timeout:600];
area["ISO3166-1"="{}"]->.a;
( node(area.a)[amenity=childcare];
way(area.a)[amenity=childcare];
rel(area.a)[amenity=childcare];);
out;
"""

# List to store individual DataFrames
dfs = []

# Loop through each country code
for code in country_codes:
    # Construct the request for the current country
    request = base_request.format(code)

    # Send the request to the Overpass API
    response = requests.post("https://overpass-api.de/api/interpreter", data=request)

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the response as CSV and convert it to DataFrame
        try:
            df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
        except pd.errors.ParserError as e:
            print(f"Error parsing CSV data for {code}: {e}")
            continue

        # Add country code as a new column
        df['country_code'] = code

        # Append the DataFrame to the list
        dfs.append(df)
    else:
        print(f"Error retrieving data for {code}")

# Merge all DataFrames into a single DataFrame
result_df = pd.concat(dfs, ignore_index=True)

# Save the DataFrame to a CSV file or perform further processing
result_df.to_csv("merged_childcare_data.csv", index=False)
I am running this code on Google Colab.
What I want to achieve is:
a. Get the country codes, i.e. the ISO 3166-1 codes of the countries we want to query.
b. Create a base request, i.e. the basic template of our Overpass API request, with a {} placeholder for the respective country code.
c. Loop: iterate over each country code, fill the base request with the current code, send the request, parse the response into a DataFrame, and append it to the dfs list.
The end goal is to merge all DataFrames in dfs into a single DataFrame result_df, which can then be saved as a CSV file or processed further as needed.
However, I am now getting errors on Google Colab; the output looks like this:
<ipython-input-3-67ee61d1e734>:33: FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version. Use on_bad_lines in the future.
df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
Skipping line 337: expected 1 fields, saw 2
Skipping line 827: expected 1 fields, saw 2
<ipython-input-3-67ee61d1e734>:33: FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version. Use on_bad_lines in the future.
df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
Skipping line 27: expected 1 fields, saw 2
Skipping line 132: expected 1 fields, saw 2
Skipping line 366: expected 1 fields, saw 2
Skipping line 539: expected 1 fields, saw 2
Skipping line 633: expected 1 fields, saw 2
Skipping line 881: expected 1 fields, saw 2
Skipping line 1394: expected 1 fields, saw 2
Skipping line 1472: expected 1 fields, saw 2
Skipping line 1555: expected 1 fields, saw 4
Skipping line 1580: expected 1 fields, saw 2
Skipping line 1630: expected 1 fields, saw 2
Skipping line 1649: expected 1 fields, saw 2
Skipping line 1766: expected 1 fields, saw 2
Skipping line 1843: expected 1 fields, saw 2
Skipping line 2067: expected 1 fields, saw 2
Skipping line 2208: expected 1 fields, saw 2
Skipping line 2349: expected 1 fields, saw 3
Skipping line 2414: expected 1 fields, saw 2
Skipping line 2419: expected 1 fields, saw 2
Skipping line 2423: expected 1 fields, saw 2
Skipping line 2464: expected 1 fields, saw 2
Skipping line 2515: expected 1 fields, saw 2
Skipping line 2581: expected 1 fields, saw 2
Skipping line 2855: expected 1 fields, saw 2
Skipping line 2899: expected 1 fields, saw 2
Skipping line 2950: expected 1 fields, saw 2
<ipython-input-3-67ee61d1e734>:33: FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version. Use on_bad_lines in the future.
df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
<ipython-input-3-67ee61d1e734>:33: FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version. Use on_bad_lines in the future.
df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
Skipping line 114: expected 1 fields, saw 2
Skipping line 212: expected 1 fields, saw 2
Skipping line 339: expected 1 fields, saw 2
Skipping line 340: expected 1 fields, saw 4
Skipping line 351: expected 1 fields, saw 3
Skipping line 357: expected 1 fields, saw 2
Skipping line 359: expected 1 fields, saw 3
Skipping line 510: expected 1 fields, saw 6
Skipping line 535: expected 1 fields, saw 2
Skipping line 546: expected 1 fields, saw 3
Skipping line 590: expected 1 fields, saw 4
Skipping line 596: expected 1 fields, saw 4
Skipping line 602: expected 1 fields, saw 3
Skipping line 659: expected 1 fields, saw 3
Skipping line 764: expected 1 fields, saw 2
Skipping line 836: expected 1 fields, saw 2
Skipping line 838: expected 1 fields, saw 2
<ipython-input-3-67ee61d1e734>:33: FutureWarning: The error_bad_lines argument has been deprecated and will be removed in a future version. Use on_bad_lines in the future.
df = pd.read_csv(StringIO(response.text), error_bad_lines=False)
Skipping line 50: expected 1 fields, saw 3
Skipping line 302: expected 1 fields, saw 2
Skipping line 303: expected 1 fields, saw 2
Skipping line 740: expected 1 fields, saw 2
Skipping line 758: expected 1 fields, saw 2
Skipping line 1440: expected 1 fields, saw 2
Skipping line 1476: expected 1 fields, saw 3
Skipping line 1680: expected 1 fields, saw 3
Skipping line 1687: expected 1 fields, saw 2
Skipping line 1954: expected 1 fields, saw 3
1 Answer
Hello dear ouroboros1,
thanks to your help I was able to move forward. The key change is reading the response with sep='\t', since the Overpass [out:csv] output is tab-separated by default:
import requests
import pandas as pd
from io import StringIO

# List of ISO3166-1 country codes
country_codes = ["NL", "DE", "AT", "CH", "FR"]  # Add more country codes as needed

# Base request template
base_request = """
[out:csv(::id,::type,"name","addr:postcode","addr:city","addr:street","addr:housenumber","website"," contact:email=*")][timeout:600];
area["ISO3166-1"="{}"]->.a;
( node(area.a)[amenity=childcare];
way(area.a)[amenity=childcare];
rel(area.a)[amenity=childcare];);
out;
"""

# List to store individual DataFrames
dfs = []

# Loop through each country code
for code in country_codes:
    # Construct the request for the current country
    request = base_request.format(code)

    # Send the request to the Overpass API
    response = requests.post("https://overpass-api.de/api/interpreter", data=request)

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the response as CSV (tab-separated) and convert it to DataFrame
        try:
            df = pd.read_csv(StringIO(response.text), sep='\t')
        except pd.errors.ParserError as e:
            print(f"Error parsing CSV data for {code}: {e}")
            continue

        # Add country code as a new column
        df['country_code'] = code

        # Append the DataFrame to the list
        dfs.append(df)
    else:
        print(f"Error retrieving data for {code}")

# Merge all DataFrames into a single DataFrame
result_df = pd.concat(dfs, ignore_index=True)

# Save the DataFrame to a CSV file or perform further processing
result_df.to_csv("merged_childcare_data.csv", index=False)
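If occasional malformed rows should still be skipped, the deprecated error_bad_lines argument can be replaced with on_bad_lines='skip' (available since pandas 1.3). A minimal sketch, where parse_overpass_csv is a hypothetical helper name:
import pandas as pd
from io import StringIO

def parse_overpass_csv(text: str) -> pd.DataFrame:
    # Overpass [out:csv] output is tab-separated; on_bad_lines='skip'
    # (pandas >= 1.3) replaces the deprecated error_bad_lines=False.
    return pd.read_csv(StringIO(text), sep='\t', on_bad_lines='skip')
Inside the loop above, df = parse_overpass_csv(response.text) would then take the place of the pd.read_csv call.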
That solved the problem; I retrieved 570 KB of data.
Thank you very much!
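As a quick sanity check of the merged output, one could inspect the saved file afterwards; a minimal sketch, assuming merged_childcare_data.csv was written by the code above:
import pandas as pd

result_df = pd.read_csv("merged_childcare_data.csv")
print(result_df.shape)                           # total rows and columns
print(result_df["country_code"].value_counts())  # rows retrieved per country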