Querying Hadoop from Python

Published 2024-04-24 08:10:20


Hopefully this question can be resolved. At the moment, this works:

import pyodbc, sys, os
import pandas as pd

def get_data(SQL_statement):  # insert HQL statement with the usual '''<QUERY>'''
    pyodbc.autocommit = True
    # Connection settings - DSN can be replaced with STG or DEV as required, depending on where you want to connect.
    conn = pyodbc.connect("DSN=HDP_PROD", autocommit=True)
    cursor = conn.cursor()
    # V1.1 -- config settings to limit the TEZ container size, preventing out-of-memory errors; the query takes slightly longer to run.
    cursor.execute("set hive.tez.container.size=8192")
    cursor.execute("set hive.auto.convert.join.noconditionaltask.size=6553")
    #cursor.execute("set hive.auto.convert.join=false")
    # Creates a df from the SQL/HQL statement
    df = pd.read_sql(SQL_statement, conn)
    # Returns the df to memory
    return df


HIVE = get_data('''SELECT *
                   FROM sp_commercial.INTERACTIONS_LAST6M''')

If I add a WHERE condition to the SELECT statement above, the function errors out.
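
For illustration, the failing call looks roughly like the one below (the column name interaction_date and the date value are placeholders, not the real schema):

HIVE = get_data('''SELECT *
                   FROM sp_commercial.INTERACTIONS_LAST6M
                   WHERE interaction_date >= '2024-01-01'
                ''')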

So, how can I query Hue/Hadoop from Python with a WHERE condition?
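
One possible sketch of this (a minimal example, assuming the same DSN=HDP_PROD setup as above, assuming the Hive ODBC driver accepts ? parameter markers, and using the placeholder column interaction_date) is to pass the filter value through the params argument of pd.read_sql instead of splicing it into the HQL string, so the driver handles quoting of the literal:

import pyodbc
import pandas as pd

def get_data_filtered(SQL_statement, params=None):
    # Same connection and TEZ settings as get_data above (DSN=HDP_PROD is assumed).
    conn = pyodbc.connect("DSN=HDP_PROD", autocommit=True)
    cursor = conn.cursor()
    cursor.execute("set hive.tez.container.size=8192")
    cursor.execute("set hive.auto.convert.join.noconditionaltask.size=6553")
    # params are bound by the driver via '?' markers, so literals in the
    # WHERE clause do not have to be quoted by hand.
    df = pd.read_sql(SQL_statement, conn, params=params)
    conn.close()
    return df

# Usage sketch -- 'interaction_date' and the cut-off value are placeholders.
HIVE = get_data_filtered('''SELECT *
                            FROM sp_commercial.INTERACTIONS_LAST6M
                            WHERE interaction_date >= ?''',
                         params=['2024-01-01'])

If the driver does not support parameter markers, another thing worth checking is that string literals inside the triple-quoted query are wrapped in single quotes, since an unquoted date or text value in the WHERE clause would keep Hive from parsing the statement.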


Tags: to, import, df, execute, sql, size, as, conn