JNI error when running PySpark 2.4.4 with IPython 3.7


I downloaded the latest Spark and only made a few changes to the configuration.

Change in spark-env.sh:

PYSPARK_PYTHON=/data/software/miniconda3/bin/ipython
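
(Side note: Spark distinguishes PYSPARK_PYTHON, the interpreter used by executors, from PYSPARK_DRIVER_PYTHON, the interpreter used for the interactive shell on the driver. A minimal spark-env.sh sketch for driving the shell with IPython, assuming the miniconda3 paths from this post, might look like:

# conf/spark-env.sh -- sketch; paths assumed from the post
export PYSPARK_PYTHON=/data/software/miniconda3/bin/python        # executors: plain interpreter
export PYSPARK_DRIVER_PYTHON=/data/software/miniconda3/bin/ipython  # driver shell: IPython

This is not required to fix the error below, but it avoids pointing executors at IPython.)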

When I run pyspark, it raises the error below. Error log:

Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
[TerminalIPythonApp] WARNING | Unknown error in handling PYTHONSTARTUP file /data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/shell.py:
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/data/software/miniconda3/lib/python3.7/site-packages/IPython/core/shellapp.py in _exec_file(self, fname, shell_futures)
    338                                                  self.shell.user_ns,
    339                                                  shell_futures=shell_futures,
--> 340                                                  raise_exceptions=True)
    341         finally:
    342             sys.argv = save_argv

/data/software/miniconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py in safe_execfile(self, fname, exit_ignore, raise_exceptions, shell_futures, *where)
   2716                 py3compat.execfile(
   2717                     fname, glob, loc,
-> 2718                     self.compile if shell_futures else None)
   2719             except SystemExit as status:
   2720                 # If the call was made with 0 or None exit status (sys.exit(0)

/data/software/miniconda3/lib/python3.7/site-packages/IPython/utils/py3compat.py in execfile(fname, glob, loc, compiler)
    186     with open(fname, 'rb') as f:
    187         compiler = compiler or compile
--> 188         exec(compiler(f.read(), fname, 'exec'), glob, loc)
    189 
    190 # Refactor print statements in doctests.

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/shell.py in <module>
     36     SparkContext.setSystemProperty("spark.executor.uri", os.environ["SPARK_EXECUTOR_URI"])
     37 
---> 38 SparkContext._ensure_initialized()
     39 
     40 try:

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    314         with SparkContext._lock:
    315             if not SparkContext._gateway:
--> 316                 SparkContext._gateway = gateway or launch_gateway(conf)
    317                 SparkContext._jvm = SparkContext._gateway.jvm
    318 

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/java_gateway.py in launch_gateway(conf)
     44     :return: a JVM gateway
     45     """
---> 46     return _launch_gateway(conf)
     47 
     48 

/data/software/spark-2.4.4-bin-without-hadoop/python/pyspark/java_gateway.py in _launch_gateway(conf, insecure)
    106 
    107             if not os.path.isfile(conn_info_file):
--> 108                 raise Exception("Java gateway process exited before sending its port number")
    109 
    110             with open(conn_info_file, "rb") as info:

Exception: Java gateway process exited before sending its port number

In [1]: exit                                                                                                                                                                                    
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
dennis@device2:/data/software/spark-2.4.4-bin-without-hadoop/conf$ pyspark 
[... same IPython banner, JNI error, and "Java gateway process exited before sending its port number" traceback as above ...]

Environment:

Java:

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Spark version: spark-2.4.4-bin-without-hadoop

Hadoop: 3.0.0 (CDH 6.2.0)


1 Answer

First of all, the exception is not caused by IPython 3.7. It occurs because Spark cannot find the class org.slf4j.Logger on the classpath while initializing the SparkContext (here, during pyspark startup).

From your description, you are using the "without hadoop" build of Spark. Spark depends on Hadoop, so you have to explicitly tell Spark where to find the Hadoop package JARs, as described in Spark's documentation: https://spark.apache.org/docs/latest/hadoop-provided.html. The slf4j class mentioned above ships with those JARs, which is why Spark cannot find it.
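
As a quick check (a sketch, assuming the hadoop launcher is on your PATH), you can confirm whether an slf4j JAR is visible on the Hadoop classpath at all:

# list Hadoop's classpath entries one per line and look for slf4j
hadoop classpath | tr ':' '\n' | grep -i slf4j

If this prints nothing, the JVM that pyspark launches has no way to load org.slf4j.Logger, which matches the NoClassDefFoundError above.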

There are two solutions you can try:

  1. Update SPARK_DIST_CLASSPATH in spark-env.sh to tell Spark explicitly where to find the Hadoop-related JARs (if Hadoop is installed on your machine; see the sketch after this list)

  2. If Hadoop is not installed on your machine, use the spark-2.4.4-bin-hadoop2.7.tgz build instead. In that build the Hadoop-related JARs are already bundled with Spark, so you don't have to worry about this
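
For solution 1, the hadoop-provided page linked above shows the spark-env.sh entry; with a distribution like CDH where the hadoop command is on the PATH, it is:

# conf/spark-env.sh
# Ask the local Hadoop installation for its classpath and hand it to Spark
export SPARK_DIST_CLASSPATH=$(hadoop classpath)

After adding this line, start pyspark again; the launched JVM should now find org.slf4j.Logger and the Java gateway should come up instead of exiting.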
