Unable to submit a Spark job using a Python file

Posted 2024-05-14 04:13:30


I am trying to run a Spark job from a Python (.py) file with the command below:

$SPARK_HOME/bin/spark-submit ~/Project/SparkTest.py --py-files ~/Project/SparkTest.py

The job fails with the exception "Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>'".

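I did some debugging and found that when the job starts, spark.master is set to "<pyspark.conf.SparkConf object at 0x7fb6b70e3898>" instead of "spark://10.0.0.5:31016", which is the master IP and port I configured in spark-defaults.conf.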

Here is the complete output after submitting the Spark job:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/11/19 22:25:43 INFO SparkContext: Running Spark version 2.2.0
17/11/19 22:25:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/19 22:25:44 INFO SparkContext: Submitted application: SparkTest.py
17/11/19 22:25:44 INFO SparkContext: Spark configuration:
spark.app.name=SparkTest.py
spark.driver.cores=2
spark.driver.memory=3g
spark.eventLog.dir=hdfs://10.0.0.5:31001/spark_log
spark.eventLog.enabled=true
spark.executor.memory=3g
spark.files=file:/home/admin/Project/SparkTest.py
spark.kryoserializer.buffer.max=1536m
spark.logConf=true
spark.master=<pyspark.conf.SparkConf object at 0x7fb6b70e3898>
spark.rdd.compress=True
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.serializer.objectStreamReset=100
spark.submit.deployMode=client
17/11/19 22:25:44 INFO SecurityManager: Changing view acls to: admin
17/11/19 22:25:44 INFO SecurityManager: Changing modify acls to: admin
17/11/19 22:25:44 INFO SecurityManager: Changing view acls groups to:
17/11/19 22:25:44 INFO SecurityManager: Changing modify acls groups to:
17/11/19 22:25:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(admin); groups with view permissions: Set(); users  with modify permissions: Set(admin); groups with modify permissions: Set()
17/11/19 22:25:44 INFO Utils: Successfully started service 'sparkDriver' on port 41829.
17/11/19 22:25:44 INFO SparkEnv: Registering MapOutputTracker
17/11/19 22:25:44 INFO SparkEnv: Registering BlockManagerMaster
17/11/19 22:25:44 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/11/19 22:25:44 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/11/19 22:25:44 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-4007fc95-6531-4447-a095-0730713d7758
17/11/19 22:25:44 INFO MemoryStore: MemoryStore started with capacity 1458.6 MB
17/11/19 22:25:44 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/19 22:25:44 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/19 22:25:44 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.5:4040
17/11/19 22:25:44 INFO SparkContext: Added file file:/home/admin/Project/SparkTest.py at spark://10.0.0.5:41829/files/SparkTest.py with timestamp 1511130344827
17/11/19 22:25:44 INFO Utils: Copying /home/admin/Project/SparkTest.py to /tmp/spark-940a6faa-cf59-4d47-87c6-b3f39296c19d/userFiles-d3c17550-6141-496d-aacd-0f83f813a3a0/SparkTest.py
17/11/19 22:25:44 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>'
        at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2760)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
17/11/19 22:25:44 INFO SparkUI: Stopped Spark web UI at http://10.0.0.5:4040
17/11/19 22:25:44 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/11/19 22:25:44 INFO MemoryStore: MemoryStore cleared
17/11/19 22:25:44 INFO BlockManager: BlockManager stopped
17/11/19 22:25:44 INFO BlockManagerMaster: BlockManagerMaster stopped
17/11/19 22:25:44 WARN MetricsSystem: Stopping a MetricsSystem that is not running
17/11/19 22:25:44 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/11/19 22:25:44 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "/home/admin/Project/SparkTest.py", line 21, in <module>
    sc = SparkContext(conf)
  File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__
  File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 180, in _do_init
  File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/context.py", line 273, in _initialize_context
  File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__
  File "/home/admin/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb6b70e3898>'
        at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2760)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)

17/11/19 22:25:44 INFO ShutdownHookManager: Shutdown hook called
17/11/19 22:25:44 INFO ShutdownHookManager: Deleting directory /tmp/spark-940a6faa-cf59-4d47-87c6-b3f39296c19d
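
One possible reading of the traceback above: line 21 of SparkTest.py calls SparkContext(conf) with the SparkConf as a positional argument. In PySpark the first positional parameter of SparkContext is master, and conf is a separate keyword parameter, so the object's string form could end up as spark.master. The sketch below reproduces that binding without Spark. FakeSparkConf and fake_spark_context are hypothetical stand-ins written for illustration, not the real pyspark classes, and the rest of SparkTest.py is not shown in the question, so this is an assumption about the cause rather than a confirmed diagnosis.

```python
# Stand-ins only: these mimic the parameter order of pyspark.SparkContext
# (master first, conf as a later keyword) so the example runs without Spark.
class FakeSparkConf:
    """Hypothetical placeholder for pyspark.SparkConf."""


def fake_spark_context(master=None, appName=None, conf=None):
    """Return the string that would be stored as spark.master, or None."""
    # A master that was passed in is stored in string form.
    return None if master is None else str(master)


conf = FakeSparkConf()

# Positional call, as in the traceback: conf is bound to `master`.
wrong = fake_spark_context(conf)

# Keyword call: `master` stays unset, so a value coming from
# spark-defaults.conf or spark-submit --master would not be overridden.
right = fake_spark_context(conf=conf)

print(wrong)  # an object repr such as <...FakeSparkConf object at 0x...>
print(right)
```

If this is what happens in SparkTest.py, changing line 21 to sc = SparkContext(conf=conf) would be the first thing to try.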
