ClassNotFoundException for Spark job on Yarn-cluster mode
While trying to run a Spark job in yarn-cluster mode, kicked off via an Oozie workflow, I have been encountering the following error (relevant stack trace below):
    java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:296)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:179)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1917)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1896)
        at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1896)
        at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
        at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
        at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:208)
        ...
    Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
        at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:414)
        at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:323)
        at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
        at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:294)
        ... 28 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
        ... 33 more
    Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
        at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)
        at org.apache.hadoop.hbase.ipc.RpcControllerFactory.instantiate(RpcControllerFactory.java:58)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.createAsyncProcess(ConnectionManager.java:2317)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:688)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:630)
        ... 38 more
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)
        ... 42 more

Some background information:
- The job runs on Spark 1.4.1 (I specified the correct spark.yarn.jar field in the spark.conf file).
- oozie.libpath is set to the HDFS directory in which the jar of my program resides.
- org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory, the class that is not found, exists in phoenix-4.5.1-HBase-1.0-client.jar. I've specified that jar in spark.driver.extraClassPath and spark.executor.extraClassPath in the spark.conf file. I've also added the phoenix-core dependency to my pom file, so the class exists in the shaded project jar as well. (A rough programmatic equivalent of these classpath settings is sketched right after this list.)
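For reference, this is roughly what those settings would look like if set on a SparkConf instead of in the spark.conf file that Oozie hands to the Spark action. The paths below are placeholders, not my actual locations:

    import org.apache.spark.SparkConf

    // Sketch only: in my setup these values live in the spark.conf file, and
    // the jar/assembly paths here are placeholders.
    val phoenixJar = "/path/to/phoenix-4.5.1-HBase-1.0-client.jar" // placeholder

    val conf = new SparkConf()
      .set("spark.yarn.jar", "hdfs:///path/to/spark-assembly-1.4.1.jar") // placeholder
      .set("spark.driver.extraClassPath", phoenixJar)
      .set("spark.executor.extraClassPath", phoenixJar)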
Observations so far:
- Adding the field spark.driver.userClassPathFirst to the spark.conf file and setting it to true gets rid of the ClassNotFoundException. However, it also prevents me from initializing a Spark context (null pointer exception). From googling around, it seems that including this field messes up classpaths, so it may not be the way to go, since I cannot even initialize a Spark context this way.
- I noticed that in the Oozie stdout log, I do not see the Phoenix jar on the classpath. Could the reason be that spark.driver.extraClassPath and spark.executor.extraClassPath aren't picking the jar up as an extra classpath entry? I know I'm specifying the correct jar file path, since other jobs have spark.conf files with the same parameters. (A quick way to check this from inside the job is sketched after this list.)
- I found a hacky way to make the Phoenix jar show up in the classpath (in the Oozie stdout log): copying the jar into the same directory where my program's jar resides. This works whether or not spark.executor.extraClassPath is changed to point to the new jar location. However, the ClassNotFoundException persists, even though I clearly see the ClientRpcControllerFactory class when I unzip the jar.
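A possible way to double-check what the driver and executors actually see on their classpaths, from inside the job itself. This is just a debugging sketch; the label strings and the dummy RDD are arbitrary:

    import java.net.URLClassLoader

    // Debugging sketch: print the JVM classpath and the context classloader's
    // URLs to see whether the Phoenix jar is actually present.
    def dumpClasspath(label: String): Unit = {
      println(s"--- $label java.class.path ---")
      System.getProperty("java.class.path")
        .split(java.io.File.pathSeparator)
        .foreach(println)
      Thread.currentThread().getContextClassLoader match {
        case ucl: URLClassLoader =>
          println(s"--- $label context classloader URLs ---")
          ucl.getURLs.foreach(println)
        case other =>
          println(s"--- $label context classloader: ${other.getClass.getName} ---")
      }
    }

    dumpClasspath("driver")
    // With an existing SparkContext sc, the same can be run on an executor:
    // sc.parallelize(1 to 1).foreach(_ => dumpClasspath("executor"))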
Other things I've tried:
- I tried using the sparkConf.setJars() and sparkContext.addJar() methods, but still encountered the same error (a sketch of what this looked like is below this list).
- I added the jar in the spark.driver.extraClassPath field in my job properties file, but that hasn't seemed to help (the Spark docs indicate that this field is necessary when running in client mode, so it may not be relevant to my case).
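For completeness, roughly what the setJars()/addJar() attempts looked like; the jar path and app name are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    val phoenixJar = "hdfs:///path/to/phoenix-4.5.1-HBase-1.0-client.jar" // placeholder

    // Attempt 1: register the jar on the SparkConf before creating the context
    val conf = new SparkConf()
      .setAppName("MyPhoenixJob") // placeholder app name
      .setJars(Seq(phoenixJar))
    val sc = new SparkContext(conf)

    // Attempt 2: add the jar after the context has been created
    sc.addJar(phoenixJar)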
Any help/ideas/suggestions would be appreciated.
I use CDH 5.5.1 + Phoenix 4.5.2 (both installed via parcels) and faced the same problem. I think the problem disappeared after I switched to client mode, but I can't verify this because I am getting a different error in cluster mode now.
I tried to trace the Phoenix source code and found some interesting things. Hopefully a Java / Scala expert can identify the root cause.
- The PhoenixDriver class is loaded. This shows the jar is found initially; after some layers of class loader / context switching (?), the jar is lost from the classpath.
- If I Class.forName() a non-existing class in my program, there is no call to sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331). The stack looks like this instead:

        java.lang.ClassNotFoundException: NonExistingClass
            at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
            at java.lang.Class.forName0(Native Method)
            at java.lang.Class.forName(Class.java:264)

- I copied the Phoenix code into my program for testing. I still get a ClassNotFoundException if I call ConnectionQueryServicesImpl.init (ConnectionQueryServicesImpl.java:1896). However, calling ConnectionQueryServicesImpl.openConnection (ConnectionQueryServicesImpl.java:296) returned a usable HBase connection. It seems PhoenixContextExecutor is causing the loss of the jar, but I don't know how. (A small classloader check along these lines is sketched below.)
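A minimal sketch of such a check; it only asks which classloader can see the missing class. The class names are the real ones from the stack trace, everything else is illustrative:

    // Try to load the missing class with (a) the loader that loaded PhoenixDriver
    // and (b) the current thread's context classloader, to see which of the two
    // has "lost" the Phoenix jar.
    val missing = "org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory"

    val driverLoader  = Class.forName("org.apache.phoenix.jdbc.PhoenixDriver").getClassLoader
    val contextLoader = Thread.currentThread().getContextClassLoader

    for ((label, loader) <- Seq("PhoenixDriver's loader" -> driverLoader,
                                "context classloader"    -> contextLoader)) {
      try {
        Class.forName(missing, false, loader)
        println(s"$label CAN load $missing")
      } catch {
        case _: ClassNotFoundException => println(s"$label can NOT load $missing")
      }
    }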
Source code of Cloudera Phoenix 4.5.2: https://github.com/cloudera-labs/phoenix/blob/phoenix1-4.5.2_1.2.0/phoenix-core/src/main/java/org/apache/
(Not sure whether I should post this as a comment instead... I don't have the reputation to do so anyway.)