I am connecting pyspark to Redshift following the guide at this URL:
I created a folder, downloaded
RedshiftJDBC42-1.2.12.1017.jar, and created a Python file sample.py with the following code:
from pyspark.conf import SparkConf
# HiveContext is needed below; without this import the script fails with a NameError
from pyspark.sql import SparkSession, HiveContext
aws_access_key = "xxxx"
aws_secret_key = "xxxxyyyy"
bucket = "redshiftbucketadrian"
spark = SparkSession.builder.master("yarn").appName("Connect to redshift").enableHiveSupport().getOrCreate()
sc = spark.sparkContext
sql_context = HiveContext(sc)
sc._jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", aws_access_key)
sc._jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", aws_secret_key)
df = sql_context.read\
.format("com.databricks.spark.redshift")\
.option("url", "jdbc:redshift://xxxxx")\
.option("dbtable", "dev")\
.option("tempdir", "s3n://xxxx/")\
.load()
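For context, a script like this is typically launched with spark-submit, putting the Redshift JDBC driver and the spark-redshift package on the classpath. This is only a hypothetical sketch; the jar path and the package coordinates are placeholders, not the exact command from the original post:

```shell
# Hypothetical spark-submit invocation (paths/versions are placeholders):
# --jars puts the downloaded Redshift JDBC driver on the classpath,
# --packages pulls the Databricks spark-redshift connector from Maven.
spark-submit \
  --master yarn \
  --jars /path/to/RedshiftJDBC42-1.2.12.1017.jar \
  --packages com.databricks:spark-redshift_2.11:2.0.1 \
  sample.py
```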
Then I ran the script with a command like the above, but it keeps showing me this:
.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-04-04 21:27:35 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-04-04 21:27:37 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-04-04 21:27:39 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
What am I missing?