如何从MYSQL触发器(嵌入在另一个py脚本中)内部调用的python脚本捕获输出?

2024-04-23 16:01:04 发布

您现在位置:Python中文网/ 问答频道 /正文

我需要做以下工作:

  1. 使用第三方api抓取推文,并将推文内容和其他详细信息存储在mysql数据库中

  2. 在db表中添加新行之前,我需要获取tweet发源地的temp,并在表中添加temp值

为了完成第二个任务,我编写了两个python脚本。第一个python脚本将mysql触发器添加到数据库中。在mysql触发器内部,我在读取http://crazytechthoughts.blogspot.sg/2011/12/call-external-program-from-mysql.html后使用sys.eval()调用第二个python脚本,该脚本使用api获取温度数据

这两个脚本各自工作得非常好,但触发器无法从第二个脚本获取温度数据。我怎么修理它

Script to add trigger

        def create_Triggers(self):
        
        try:
     
            mysql_trig = """
            
            CREATE TRIGGER mysql_Trigger 
            BEFORE INSERT ON tweets FOR EACH ROW BEGIN 
            DECLARE loc CHAR(255); DECLARE result Double; DECLARE city CHAR(255); 
            DECLARE station CHAR(255); DECLARE dtime CHAR(255); SET city = NEW.city; 
            SET station =  "OYSN:9:YE"; SET dtime = NEW.dtime; 
            SET loc = CONCAT('python D:test case scripts/Weather_enrichment_triggers/weather_enrichment.py',city,station,dtime); 
            SET result = sys_eval(loc); 
            SET NEW.Temperature = result; 
            END;
            """
           
            self.curr.execute(mysql_trig)
            print("trigger executed")
            self.connection.commit()
            
            #self.curr.execute(postgre_trig)
            #self.connection.commit()
        
        except Exception as e:
            print(e)

    def test_triggers(self, query):

            self.curr.execute(query)
            self.connection.commit()

Script to fetch temp data(called from inside the trigger)

city  = sys.argv[1]
station = sys.argv[2]
dtime = sys.argv[3]

def weather_info(city,station,dtime):
        
#Get weather information for a given city and date
   template_request = "https://api.weather.com/v1/location/{station}/observations/historical.json?apiKey=apikey&units=m&startDate={start_date}&endDate={end_date}"
   df_header = ["City", "Year", "Month", "Day", "Hour", "Temperature(C)", "Condition"]

   def get_weather_data(city, year, month, day, station):
       start_date = "%d%02d%02d" % (year, month, day)
       end_date = "%d%02d%02d" % (year, month, day)
       request = template_request.format(station=station, start_date=start_date, end_date=end_date)
       request_data = json.loads(requests.get(request).content)
       weather_data = []
       last_dt = None
       for observation in request_data["observations"]:
           dt = datetime.fromtimestamp(observation["valid_time_gmt"]+3600)
           if last_dt and dt.hour > (last_dt.hour + 1):
               last_row = deepcopy(weather_data[-1])
               last_row[4] = last_row[4]+1
               weather_data.append(last_row)
           weather_data.append([city, year, month, dt.day, dt.hour, observation["temp"], observation["wx_phrase"]])
           last_dt = dt
       return weather_data
    
   dtime = datetime.strptime(dtime, '%Y-%m-%d %H:%M:%S%z')
   data = get_weather_data(city, dtime.year, dtime.month, dtime.day, station)

   weather_df = pd.DataFrame(data, columns=df_header).drop_duplicates(subset=["City", "Year", "Month", "Day", "Hour"])
   avg = (weather_df["Temperature(C)"].values).mean()
   weather_df = pd.DataFrame()
   return float(avg)



temp = weather_info(city, station, dtime)
temp = str(temp)
sys.stdout.write(temp)

1条回答
网友
1楼 · 发布于 2024-04-23 16:01:04

sys_eval函数只返回它调用的脚本的exit code

它似乎没有提供一种从外部脚本将数据传递回MySql服务器的干净方法

另外,这是荒谬的不安全。只是说说而已

编辑

一种普遍接受的方法是轮询

  1. 让第一个将数据插入表的脚本将tweets.Temperaure设置为NULL

  2. 摆脱触发器和系统评估扩展

  3. 把这个索引放在那列上

    ALTER TABLE tweets ADD INDEX (Temperature, city)
    
  4. 让您的第二个python脚本在循环中运行,每隔几秒钟就运行一次

    SELECT id, city 
      FROM tweets
     WHERE temperature IS NULL
    

    然后,对于检索到的每一行,让脚本获取城市的温度并执行以下操作

    UPDATE tweets 
       SET Temperature = ###your_value###
     WHERE id = ###the_id_you_SELECTed###
    

索引使那些频繁使用的SELECT语句非常快,尤其是当它们不返回行时。权衡:你最新的一排还没有温度

另一种选择是最简单的:让您的第一个python程序执行插入操作,以查找每条tweet的温度,并将其与行的其余数据一起直接插入

值得一提的是,很少有应用程序使用像sys_eval这样的MySQL扩展,如果将应用程序放在生产MySQL服务器上,它就不太可能有扩展

相关问题 更多 >