scala - How do I use Spark's Feature Importance on Random Forest?
The documentation for random forests does not include feature importances. However, it is listed on JIRA as resolved and it is in the source code. Here it says: "The main differences between this API and the original MLlib ensembles API are:
- support for DataFrames and ML Pipelines
- separation of classification vs. regression
- use of DataFrame metadata to distinguish continuous and categorical features
- more functionality for random forests: estimates of feature importance, as well as the predicted probability of each class (a.k.a. class conditional probabilities) for classification."
However, I cannot figure out a syntax that works to call this new feature:
scala> model
res13: org.apache.spark.mllib.tree.model.RandomForestModel = TreeEnsembleModel classifier with 10 trees

scala> model.featureImportances
<console>:60: error: value featureImportances is not a member of org.apache.spark.mllib.tree.model.RandomForestModel
       model.featureImportances
You have to use the new random forests. Check your imports. The old API uses:

import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.tree.model.RandomForestModel

The new random forests use:

import org.apache.spark.ml.classification.RandomForestClassificationModel
import org.apache.spark.ml.classification.RandomForestClassifier
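As a minimal sketch of how the new spark.ml API is typically used (assuming a DataFrame named trainingData with "label" and "features" columns; these names are illustrative, not from the original post):

import org.apache.spark.ml.classification.{RandomForestClassificationModel, RandomForestClassifier}

// Build a classifier against the (assumed) label/features columns.
val rf = new RandomForestClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setNumTrees(10)

// fit returns a RandomForestClassificationModel, which exposes
// featureImportances as a Vector whose i-th element is the importance
// of feature i, normalized to sum to 1.
val model: RandomForestClassificationModel = rf.fit(trainingData)
println(model.featureImportances)

Note that featureImportances lives on the spark.ml model class, not on the old mllib RandomForestModel, which is why the REPL call in the question fails.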