scala - MLlib: Calculating Precision and Recall for multiple threshold values


I set the threshold value of a logistic regression model to 0.5 before using it for scoring. I now want the precision, recall, and F1 score at that value. Unfortunately, when I try this, the only threshold values I see are 1.0 and 0.0. How do I get metrics for threshold values other than 0 and 1?

For example, here is the output:

threshold is: 1.0, precision is: 0.85

threshold is: 0.0, precision is: 0.312641

I don't get the precision for threshold 0.5. Here is the relevant code.

    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
    import org.apache.spark.mllib.regression.LabeledPoint

    // Set the threshold value of the logistic regression model.
    model.setThreshold(0.5)

    // Compute the score and generate an RDD of (prediction, label) pairs.
    val predictionAndLabels = data.map {
      case LabeledPoint(label, features) => (model.predict(features), label)
    }

    // I want to compute precision, recall, and other metrics.
    // Since the model threshold is set to 0.5, I want the precision at that value.
    val metrics = new BinaryClassificationMetrics(predictionAndLabels)
    val precision = metrics.precisionByThreshold()

    precision.foreach {
      case (t, p) =>
        println(s"threshold is: $t, precision is: $p")
        if (t == 0.5) {
          println(s"desired: threshold is: $t, precision is: $p")
        }
    }

The precisionByThreshold() method is actually trying different thresholds and giving the corresponding precision values. Since you already thresholded your data, you only have 0s and 1s.

Let's say you have [0 0 0 1 1 1] after thresholding, and the real labels are [F F F F T T].

Then thresholding at 0 gives [T T T T T T], which yields 4 false positives and 2 true positives, hence a precision of 2 / (2 + 4) = 1/3.

Now thresholding at 1 gives [F F F T T T], which yields 1 false positive and 2 true positives, hence a precision of 2 / (2 + 1) = 2/3.

You can see that using a threshold of .5 would give [F F F T T T], the same as thresholding at 1, so the precision for threshold 1 is the one you are looking for.
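The arithmetic above can be checked with a few lines of plain Scala. This is only a sketch of the worked example, not MLlib code: the object `ThresholdedPrecision` and its `precisionAt` helper are hypothetical names, and the sketch assumes that a score greater than or equal to the threshold counts as a positive prediction.

```scala
object ThresholdedPrecision {
  // Already-thresholded predictions and true labels from the example above.
  val scores: Seq[Double] = Seq(0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
  val labels: Seq[Boolean] = Seq(false, false, false, false, true, true)

  // Precision = TP / (TP + FP), treating every score >= threshold as positive.
  // Returns None when nothing is predicted positive (precision is undefined).
  def precisionAt(scores: Seq[Double], labels: Seq[Boolean], threshold: Double): Option[Double] = {
    val predictedPositive = scores.zip(labels).filter { case (s, _) => s >= threshold }
    val truePositives = predictedPositive.count { case (_, isPositive) => isPositive }
    if (predictedPositive.isEmpty) None
    else Some(truePositives.toDouble / predictedPositive.size)
  }

  def main(args: Array[String]): Unit = {
    // Threshold 0 keeps all six predictions: 2 / (2 + 4).
    println(precisionAt(scores, labels, 0.0))
    // Thresholds 0.5 and 1 keep the same three predictions: 2 / (2 + 1).
    println(precisionAt(scores, labels, 0.5))
    println(precisionAt(scores, labels, 1.0))
  }
}
```

Because the scores are only 0s and 1s, every threshold in (0, 1] selects the same three predictions, which is why 0.5 and 1 give the same precision.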

This is a bit confusing because you already thresholded your predictions. If you do not threshold your predictions, let's say you had [.3 .4 .4 .6 .8 .9] (to stay consistent with the [0 0 0 1 1 1] I have been using).

Then precisionByThreshold() would give you precision values for thresholds 0, .3, .4, .6, .8, .9, because these are the only thresholds giving different results and hence different precisions. To get the value for threshold .5 you would still take the value of the next larger threshold (.6) because, again, it gives the same predictions and hence the same precision.
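The "take the next larger threshold" lookup can be sketched in plain Scala as follows. The object `PrecisionLookup` and both helpers are hypothetical names for illustration, and the sketch assumes that only the distinct score values are listed as thresholds. With MLlib, the raw scores would presumably come from calling `model.clearThreshold()` before `predict`, and the (threshold, precision) pairs from `metrics.precisionByThreshold().collect()`.

```scala
object PrecisionLookup {
  // Raw (unthresholded) scores from the example, with the same true labels.
  val scores: Seq[Double] = Seq(0.3, 0.4, 0.4, 0.6, 0.8, 0.9)
  val labels: Seq[Boolean] = Seq(false, false, false, false, true, true)

  // One (threshold, precision) pair per distinct score value.
  // A score >= threshold is treated as a positive prediction.
  def precisionByThreshold(scores: Seq[Double], labels: Seq[Boolean]): Seq[(Double, Double)] =
    scores.distinct.sorted.map { t =>
      val predictedPositive = scores.zip(labels).filter { case (s, _) => s >= t }
      val truePositives = predictedPositive.count { case (_, isPositive) => isPositive }
      // Never empty: t is itself one of the scores.
      (t, truePositives.toDouble / predictedPositive.size)
    }

  // Pair with the smallest listed threshold that is >= target, if any.
  // That threshold yields the same predictions as target itself would.
  def precisionAtOrAbove(pairs: Seq[(Double, Double)], target: Double): Option[(Double, Double)] =
    pairs.filter { case (t, _) => t >= target }.sortBy(_._1).headOption

  def main(args: Array[String]): Unit = {
    val pairs = precisionByThreshold(scores, labels)
    pairs.foreach { case (t, p) => println(s"threshold is: $t, precision is: $p") }
    // For target 0.5 this selects the pair for threshold 0.6.
    println(precisionAtOrAbove(pairs, 0.5))
  }
}
```

Filtering for thresholds at or above the target avoids the exact comparison `t == 0.5`, which only matches when some score happens to equal 0.5.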

