Storing the output of a Hive query run through beeline in a string. Also tried running it with Popen, but no luck

Posted 2024-06-02 07:30:58


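I am running a Hive query from a Python script. The query runs when I use subprocess.getstatusoutput, but I cannot get just the query result into a variable. So I tried Popen instead, and that fails with an error.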

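import subprocess

dd1 = '10-Sep-12'
table = 'testing_table'
# <db_name> and <connection string> are placeholders, left as in the original post
query = ("select distinct(input__file__name) from <db_name>." + table +
         " where as_of_date = '" + dd1 + "' limit 2")
cmd = 'beeline -u "jdbc:hive2:<connection string>" -e "' + query + ';"'
# getstatusoutput runs cmd through the shell and returns (exit status, stdout and stderr combined)
stat, query_output = subprocess.getstatusoutput(cmd)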

This works, but when I print query_output it contains everything the command printed: the "INFO" lines for every stage as well as the actual query output.

When I use subprocess.Popen or subprocess.check_output instead, I get an error:

[error output not preserved]

1 answer
Answer 1 · posted 2024-06-02 07:30:58

Below is a Python snippet that reads a list of tables from a file, runs a Hive query for each table in the list, and appends the results to a file using subprocess.

The cmd variable holds the command to execute. It is passed to a subprocess function, and the output is stored in a variable that is then written to a file. The second part reads the file created in the first step, runs another query for each table, and appends the results to a second file.

import subprocess

# Step 1: list every table in the database and save the list to a file.
cmd = 'hive -e "use database; show tables;"'
# text=True makes check_output return str instead of bytes (Python 3.7+)
val = subprocess.check_output(cmd, shell=True, text=True)
with open('/home/ouput_all_table_list.txt', 'w') as fl:
    fl.write(val)

# Step 2: read the table list back and run a count query per table,
# appending each result to a second file.
with open('/home/ouput_all_table_list.txt', 'r') as fl:
    content = fl.read().splitlines()
for var in content:
    tbl_nm = "'" + var + "'"  # table name as a quoted literal column in the result
    cmd_ay = 'hive -e "use database; select collect_list(cast(file_dt as string)) as dt, collect_list(cast(cnt as string)) as cnt, '+ tbl_nm +' from (select count(1) cnt,file_dt from database.' + var + ' group by file_dt having count(1) > 0  order by file_dt desc) a;"'
    print(cmd_ay)
    cmd_out = subprocess.check_output(cmd_ay, shell=True, text=True)
    print(cmd_out)
    with open('/home/ouput_all_hive_count_data.txt', 'a') as fh:
        fh.write(cmd_out)
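
On the question itself: subprocess.getstatusoutput merges stderr into stdout, which is likely why the INFO lines end up in the same string as the rows. Here is a minimal sketch, not a tested beeline setup, that captures the two streams separately and passes the command as an argument list, so no shell quoting is involved. The helper names run_and_capture and beeline_argv are hypothetical. The flags --silent, --showHeader and --outputformat are assumed to be supported by your beeline version, and it is also an assumption that beeline writes its log lines to stderr. The demo at the bottom uses a stand-in child process rather than a real cluster.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv without a shell; return (returncode, stdout, stderr) as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,  # query rows are expected here
        stderr=subprocess.PIPE,  # log lines are kept apart from the rows
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def beeline_argv(jdbc_url, query):
    """Hypothetical helper: build a beeline argument list.

    The flags below are assumed to exist in your beeline version.
    """
    return [
        "beeline",
        "-u", jdbc_url,
        "--silent=true",         # assumed: suppresses most informational output
        "--showHeader=false",    # assumed: omits the column header row
        "--outputformat=tsv2",   # assumed: tab-separated rows, no table borders
        "-e", query,
    ]


if __name__ == "__main__":
    # Stand-in child process that prints one "row" to stdout and one
    # "INFO" line to stderr, mimicking the situation in the question.
    demo = [
        sys.executable, "-c",
        "import sys; print('row1'); print('INFO: stage 1', file=sys.stderr)",
    ]
    code, out, err = run_and_capture(demo)
    print("stdout only:", out.strip())
```

With a real connection, the call would look like run_and_capture(beeline_argv(jdbc_url, query)), where jdbc_url is your own connection string. If Popen or check_output fails where getstatusoutput worked, one common cause is passing the whole command line as a single string without shell=True. Whether that was the case here cannot be confirmed, since the error text is not shown.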
