| Title: | Creates the MCC-F1 Curve and Calculates the MCC-F1 Metric and the Best Threshold |
|---|---|
| Description: | The MCC-F1 analysis is a method for evaluating the performance of binary classifications. The MCC-F1 curve is more reliable than the Receiver Operating Characteristic (ROC) curve and the Precision-Recall (PR) curve under imbalanced ground truth. The MCC-F1 analysis also provides the MCC-F1 metric, which integrates classifier performance over varying thresholds, and the best threshold for the binary classification (the underlying quantities are defined below). |
| Authors: | Chang Cao [aut, cre], Michael Hoffman [aut], Davide Chicco [aut] |
| Maintainer: | Chang Cao <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.1 |
| Built: | 2025-03-16 04:34:22 UTC |
| Source: | https://github.com/cran/mccf1 |
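For reference, the quantities that define the MCC-F1 curve are the standard F1 score and the Matthews correlation coefficient (MCC), with the MCC rescaled from [-1, 1] to [0, 1]. A sketch of the definitions in terms of the confusion-matrix counts TP, TN, FP, and FN; the package's 'normalized_mcc' is assumed to follow this unit normalization:

$$
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},
\qquad
\mathrm{MCC}_{\mathrm{norm}} = \frac{\mathrm{MCC} + 1}{2}
$$

$$
F_1 = \frac{2\,TP}{2\,TP + FP + FN}
$$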
'autoplot.mccf1()' plots the MCC-F1 curve using ggplot2.
```r
## S3 method for class 'mccf1'
autoplot(object, xlab = "F1 score", ylab = "normalized MCC", ...)
```
| Argument | Description |
|---|---|
| object | S3 object of class "mccf1" returned by 'mccf1()' |
| xlab, ylab | x- and y-axis annotations (defaults: "F1 score", "normalized MCC") |
| ... | further arguments passed to and from the 'ggplot()' method |
the ggplot object
```r
response <- c(rep(1, 1000), rep(0, 10000))
predictor <- c(rbeta(300, 12, 2), rbeta(700, 3, 4), rbeta(10000, 2, 3))
autoplot(mccf1(response, predictor))
```
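Since 'autoplot.mccf1()' returns a ggplot object, the plot can be adjusted afterwards with ordinary ggplot2 layers. A minimal sketch; the seed and title text are illustrative additions, not part of the example above:

```r
library(mccf1)
library(ggplot2)

response <- c(rep(1, 1000), rep(0, 10000))
set.seed(2017)  # assumed seed, added only for reproducibility
predictor <- c(rbeta(300, 12, 2), rbeta(700, 3, 4), rbeta(10000, 2, 3))

# Build the curve, then add layers to the returned ggplot object
p <- autoplot(mccf1(response, predictor))
p + ggtitle("MCC-F1 curve for a simulated imbalanced classifier")
```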
'mccf1()' performs MCC (Matthews correlation coefficient)-F1 analysis on paired vectors of binary response classes and fractional prediction scores from a binary classification task.
```r
mccf1(response, predictor)
```
| Argument | Description |
|---|---|
| response | numeric vector representing the ground truth classes (0 or 1) |
| predictor | numeric vector representing the prediction scores (in the range [0, 1]) |
S3 object of class "mccf1", a list with the following members:

- 'thresholds': vector of doubles describing the thresholds
- 'normalized_mcc': vector of doubles representing the normalized MCC for each threshold
- 'f1': vector of doubles representing the F1 score for each threshold
```r
response <- c(rep(1L, 1000L), rep(0L, 10000L))
set.seed(2017)
predictor <- c(rbeta(300L, 12, 2), rbeta(700L, 3, 4), rbeta(10000L, 2, 3))
x <- mccf1(response, predictor)
head(x$thresholds)
# [1] Inf 0.9935354 0.9931493 0.9930786 0.9925507 0.9900520
head(x$normalized_mcc)
# [1] NaN 0.5150763 0.5213220 0.5261152 0.5301566 0.5337177
head(x$f1)
# [1] NaN 0.001998002 0.003992016 0.005982054 0.007968127 0.009950249
```
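The three members of the returned list are parallel vectors, so they can be combined for further inspection. A small sketch of assembling them into a data frame and locating the threshold with the highest normalized MCC; the column names below are local choices for illustration, not part of the package API:

```r
library(mccf1)

response <- c(rep(1L, 1000L), rep(0L, 10000L))
set.seed(2017)
predictor <- c(rbeta(300L, 12, 2), rbeta(700L, 3, 4), rbeta(10000L, 2, 3))
x <- mccf1(response, predictor)

# Combine the per-threshold results into one data frame
curve <- data.frame(
  threshold = x$thresholds,
  normalized_mcc = x$normalized_mcc,
  f1 = x$f1
)

# Row with the highest normalized MCC (NaN entries are ignored by which.max())
curve[which.max(curve$normalized_mcc), ]
```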
'summary.mccf1()' calculates the MCC-F1 metric and the best threshold for a binary classification.
```r
## S3 method for class 'mccf1'
summary(object, digits, bins = 100, ...)
```
| Argument | Description |
|---|---|
| object | S3 object of class "mccf1" resulting from the function 'mccf1()' |
| digits | integer used for number formatting of the reported values (see the 'digits = 3' example below) |
| bins | integer, the number of bins used to divide up the range of normalized MCC when calculating the MCC-F1 metric (default: 100L) |
| ... | other arguments, ignored (for compatibility with the generic) |
data.frame that shows the MCC-F1 metric (in the range [0,1]) and the best threshold (in the range [0,1])
```r
response <- c(rep(1L, 1000L), rep(0L, 10000L))
set.seed(2017)
predictor <- c(rbeta(300L, 12, 2), rbeta(700L, 3, 4), rbeta(10000L, 2, 3))
## Not run:
summary(mccf1(response, predictor))
# mccf1_metric best_threshold
# 0.3508904 0.786905
summary(mccf1(response, predictor), bins = 50)
# mccf1_metric best_threshold
# 0.3432971 0.786905
## Not run:
summary(mccf1(response, predictor), digits = 3)
# mccf1_metric best_threshold
# 0.351 0.787
```
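As a follow-up sketch, the reported best threshold can be used to binarize the prediction scores. This assumes the returned data.frame exposes the 'mccf1_metric' and 'best_threshold' columns shown in the printed output above:

```r
library(mccf1)

response <- c(rep(1L, 1000L), rep(0L, 10000L))
set.seed(2017)
predictor <- c(rbeta(300L, 12, 2), rbeta(700L, 3, 4), rbeta(10000L, 2, 3))

s <- summary(mccf1(response, predictor))

# Assumed column name taken from the printed output; check str(s) if it differs
best <- s$best_threshold

# Binarize the scores at the best threshold and tabulate against the ground truth
predicted_class <- as.integer(predictor >= best)
table(truth = response, predicted = predicted_class)
```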