scala - How do I use Spark's Feature Importance on Random Forest?


The documentation for random forests does not include feature importances. However, it is listed as resolved on JIRA and is in the source code. Here it says "the main differences between this API and the original MLlib ensembles API are:

  • support for DataFrames and ML Pipelines
  • separation of classification vs. regression
  • use of DataFrame metadata to distinguish continuous and categorical features
  • more functionality for random forests: estimates of feature importance, as well as the predicted probability of each class (a.k.a. class conditional probabilities) for classification."

However, I cannot figure out the syntax that works to call this new feature.

scala> model
res13: org.apache.spark.mllib.tree.model.RandomForestModel =
TreeEnsembleModel classifier with 10 trees

scala> model.featureImportances
<console>:60: error: value featureImportances is not a member of org.apache.spark.mllib.tree.model.RandomForestModel
       model.featureImportances

You have to use the new random forests. Check your imports. The old imports:

import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.tree.model.RandomForestModel

The new random forests use:

import org.apache.spark.ml.classification.RandomForestClassificationModel
import org.apache.spark.ml.classification.RandomForestClassifier
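To make the difference concrete, here is a minimal sketch of training with the DataFrame-based `ml` API and reading `featureImportances` from the resulting model. The input path, app name, and the default `label`/`features` column names are assumptions for illustration; the point is that `featureImportances` exists on `RandomForestClassificationModel`, not on the old `mllib` `RandomForestModel`.

```scala
import org.apache.spark.ml.classification.{RandomForestClassificationModel, RandomForestClassifier}
import org.apache.spark.sql.SparkSession

// Hypothetical local session and sample libsvm data path, for illustration only
val spark = SparkSession.builder()
  .appName("rf-feature-importances")
  .master("local[*]")
  .getOrCreate()

val training = spark.read.format("libsvm").load("data/sample_libsvm_data.txt")

// The ml-package estimator, not mllib's RandomForest
val rf = new RandomForestClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setNumTrees(10)

val model: RandomForestClassificationModel = rf.fit(training)

// featureImportances is a Vector with one weight per feature, summing to 1
println(model.featureImportances)
```

Note that `fit` takes a DataFrame rather than an `RDD[LabeledPoint]`, so a model trained through the old `mllib` entry point can never expose this field; you have to retrain through the `ml` pipeline API.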
