我需要做以下工作:
使用第三方api抓取推文,并将推文内容和其他详细信息存储在mysql数据库中
在db表中添加新行之前,我需要获取tweet发源地的temp,并在表中添加temp值
为了完成第二个任务,我编写了两个python脚本。第一个python脚本将mysql触发器添加到数据库中。在mysql触发器内部,我在读取http://crazytechthoughts.blogspot.sg/2011/12/call-external-program-from-mysql.html后使用sys.eval()调用第二个python脚本,该脚本使用api获取温度数据
这两个脚本各自工作得非常好,但触发器无法从第二个脚本获取温度数据。我怎么修理它
Script to add trigger
def create_Triggers(self):
try:
mysql_trig = """
CREATE TRIGGER mysql_Trigger
BEFORE INSERT ON tweets FOR EACH ROW BEGIN
DECLARE loc CHAR(255); DECLARE result Double; DECLARE city CHAR(255);
DECLARE station CHAR(255); DECLARE dtime CHAR(255); SET city = NEW.city;
SET station = "OYSN:9:YE"; SET dtime = NEW.dtime;
SET loc = CONCAT('python D:test case scripts/Weather_enrichment_triggers/weather_enrichment.py',city,station,dtime);
SET result = sys_eval(loc);
SET NEW.Temperature = result;
END;
"""
self.curr.execute(mysql_trig)
print("trigger executed")
self.connection.commit()
#self.curr.execute(postgre_trig)
#self.connection.commit()
except Exception as e:
print(e)
def test_triggers(self, query):
self.curr.execute(query)
self.connection.commit()
Script to fetch temp data(called from inside the trigger)
city = sys.argv[1]
station = sys.argv[2]
dtime = sys.argv[3]
def weather_info(city,station,dtime):
#Get weather information for a given city and date
template_request = "https://api.weather.com/v1/location/{station}/observations/historical.json?apiKey=apikey&units=m&startDate={start_date}&endDate={end_date}"
df_header = ["City", "Year", "Month", "Day", "Hour", "Temperature(C)", "Condition"]
def get_weather_data(city, year, month, day, station):
start_date = "%d%02d%02d" % (year, month, day)
end_date = "%d%02d%02d" % (year, month, day)
request = template_request.format(station=station, start_date=start_date, end_date=end_date)
request_data = json.loads(requests.get(request).content)
weather_data = []
last_dt = None
for observation in request_data["observations"]:
dt = datetime.fromtimestamp(observation["valid_time_gmt"]+3600)
if last_dt and dt.hour > (last_dt.hour + 1):
last_row = deepcopy(weather_data[-1])
last_row[4] = last_row[4]+1
weather_data.append(last_row)
weather_data.append([city, year, month, dt.day, dt.hour, observation["temp"], observation["wx_phrase"]])
last_dt = dt
return weather_data
dtime = datetime.strptime(dtime, '%Y-%m-%d %H:%M:%S%z')
data = get_weather_data(city, dtime.year, dtime.month, dtime.day, station)
weather_df = pd.DataFrame(data, columns=df_header).drop_duplicates(subset=["City", "Year", "Month", "Day", "Hour"])
avg = (weather_df["Temperature(C)"].values).mean()
weather_df = pd.DataFrame()
return float(avg)
temp = weather_info(city, station, dtime)
temp = str(temp)
sys.stdout.write(temp)
sys_eval函数只返回它调用的脚本的exit code
它似乎没有提供一种从外部脚本将数据传递回MySql服务器的干净方法
另外,这是荒谬的不安全。只是说说而已
编辑
一种普遍接受的方法是轮询
让第一个将数据插入表的脚本将
tweets.Temperaure
设置为NULL摆脱触发器和系统评估扩展
把这个索引放在那列上
让您的第二个python脚本在循环中运行,每隔几秒钟就运行一次
然后,对于检索到的每一行,让脚本获取城市的温度并执行以下操作
索引使那些频繁使用的SELECT语句非常快,尤其是当它们不返回行时。权衡:你最新的一排还没有温度
另一种选择是最简单的:让您的第一个python程序执行插入操作,以查找每条tweet的温度,并将其与行的其余数据一起直接插入
值得一提的是,很少有应用程序使用像sys_eval这样的MySQL扩展,如果将应用程序放在生产MySQL服务器上,它就不太可能有扩展
相关问题 更多 >
编程相关推荐