spark-submit to minikube: krb5.conf-related error



I am following this gist to spark-submit to minikube: https://gist.github.com/jjstill/8099669931cdfbb90ce6f4c307971514

This is my modified version of spark-minikube.sh:

minikube --memory 8192 --cpus 3 start

kubectl create namespace spark

kubectl create serviceaccount spark-serviceaccount --namespace spark
kubectl create clusterrolebinding spark-rolebinding --clusterrole=edit --serviceaccount=spark:spark-serviceaccount --namespace=spark

cd $SPARK_HOME

# Asking local environment to use Docker daemon inside the Minikube
eval $(minikube docker-env)

# docker build -t spark:latest -f /path/to/Dockerfile .
IMG_NAME=asia.gcr.io/project-id/my-image:latest

# Submitting SparkPi example job
# $KUBERNETES_MASTER can be taken from output of kubectl cluster-info
KUBERNETES_MASTER=https://127.0.0.1:<port_number>

spark-submit --master k8s://$KUBERNETES_MASTER \
                 --deploy-mode cluster \
                 --name spark-pi \
                 --jars jars/gcs-connector-hadoop2-2.0.1-shaded.jar,jars/spark-bigquery-latest_2.12.jar \
                 --conf spark.executor.instances=2 \
                 --conf spark.kubernetes.namespace=spark \
                 --conf spark.kubernetes.container.image=${IMG_NAME} \
                 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-serviceaccount \
                 local:///app/main.py
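
As an aside, instead of copying the API server address out of kubectl cluster-info by hand, it can be read straight from the active kubeconfig. This one-liner is my own addition, not part of the original script:

# Read the API server URL of the current context from kubeconfig
KUBERNETES_MASTER=$(kubectl config view --minify --output 'jsonpath={.clusters[0].cluster.server}')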

I get this error:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/Cellar/apache-spark/3.1.1/libexec/jars/spark-unsafe_2.12-3.1.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
21/05/23 17:33:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/05/23 17:33:45 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
21/05/23 17:33:45 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
Exception in thread "main" org.apache.spark.SparkException: Please specify spark.kubernetes.file.upload.path property.
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:299)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:248)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:247)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:173)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:164)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$3(KubernetesDriverBuilder.scala:60)
        at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
        at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
        at scala.collection.immutable.List.foldLeft(List.scala:89)
        at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:58)
        at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:106)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3(KubernetesClientApplication.scala:213)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3$adapted(KubernetesClientApplication.scala:207)
        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2611)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/05/23 17:33:45 INFO ShutdownHookManager: Shutdown hook called
21/05/23 17:33:45 INFO ShutdownHookManager: Deleting directory /private/var/folders/t2/psknqk615q7chtsr41qymznm0000gp/T/spark-100c4448-32bb-4fac-b5b5-d7a1b20d8525

It may be related to this error message:

INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.

How can I fix this? I can't find krb5.conf anywhere.


1 Answer

The krb5.conf line is only an INFO message and is not what is failing here; the real problem is the SparkException below it. Cryptic as that message is, what it is actually trying to tell you is that you have not specified where your dependency jars should be taken from. From the official documentation:

If your application’s dependencies are all hosted in remote locations like HDFS or HTTP servers, they may be referred to by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images. Those dependencies can be added to the classpath by referencing them with local:// URIs and/or setting the SPARK_EXTRA_CLASSPATH environment variable in your Dockerfiles. The local:// scheme is also required when referring to dependencies in custom-built Docker images in spark-submit. We support dependencies from the submission client’s local file system using the file:// scheme or without a scheme (using a full path), where the destination should be a Hadoop compatible filesystem.

So to fix your problem, you just need to prefix the paths in the jars argument with local://:

--jars local:///full/path/to/jars/gcs-connector-hadoop2-2.0.1-shaded.jar,local:///full/path/to/jars/spark-bigquery-latest_2.12.jar
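
Put together, the submit command would look roughly like this. This is only a sketch: /full/path/to/jars stands for whatever directory the jars actually occupy inside your custom image.

spark-submit --master k8s://$KUBERNETES_MASTER \
                 --deploy-mode cluster \
                 --name spark-pi \
                 --jars local:///full/path/to/jars/gcs-connector-hadoop2-2.0.1-shaded.jar,local:///full/path/to/jars/spark-bigquery-latest_2.12.jar \
                 --conf spark.executor.instances=2 \
                 --conf spark.kubernetes.namespace=spark \
                 --conf spark.kubernetes.container.image=${IMG_NAME} \
                 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-serviceaccount \
                 local:///app/main.py

Alternatively, if you want Spark to upload the jars from the submission machine instead of baking them into the image, the exception itself points at the other route: set spark.kubernetes.file.upload.path to a Hadoop-compatible location (for example a GCS bucket) and keep plain file paths in --jars.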
