metadata – Burk et al. (2024)

Measures

The following table provides a brief overview of the measures used in this benchmark. Unfortunately, the identifiers used under the hood do not directly correspond to measure names and abbreviations as used in the paper.

ID refers to the shorthand used in the result files listed above.
mlr3 ID refers to the measure as it is implemented in mlr3proba.
Label refers to the measure as it is named consistently throughout the paper and resulting plots.

ID	mlr3 ID	Label
`harrell_c`	`surv.cindex`	Harrell’s C
`uno_c`	`surv.cindex`	Uno’s C
`isll`	`surv.isll`	Integrated Survival Log-Likelihood (ISLL)
`isll_erv`	`surv.isll`	Integrated Survival Log-Likelihood (ISLL) [ERV]
`isbs`	`surv.brier`	Integrated Survival Brier Score (ISBS)
`isbs_erv`	`surv.brier`	Integrated Survival Brier Score (ISBS) [ERV]
`dcalib`	`surv.dcalib`	D-Calibration
`alpha_calib`	`surv.calib_alpha`	Van Houwelingen’s Alpha
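
Two result-file IDs can share one mlr3 ID: both C-indices map to `surv.cindex`, and each ERV variant shares the ID of its base measure. The mlr3 ID alone therefore does not identify a measure. The following is a minimal sketch of a lookup from result-file ID to paper label, transcribed from the table above. The names `measures` and `measure_label()` are hypothetical and not part of the benchmark code.

```r
# Lookup table transcribed from the measures table (base R only).
measures <- data.frame(
  id = c("harrell_c", "uno_c", "isll", "isll_erv",
         "isbs", "isbs_erv", "dcalib", "alpha_calib"),
  mlr3_id = c("surv.cindex", "surv.cindex", "surv.isll", "surv.isll",
              "surv.brier", "surv.brier", "surv.dcalib", "surv.calib_alpha"),
  label = c("Harrell's C", "Uno's C",
            "Integrated Survival Log-Likelihood (ISLL)",
            "Integrated Survival Log-Likelihood (ISLL) [ERV]",
            "Integrated Survival Brier Score (ISBS)",
            "Integrated Survival Brier Score (ISBS) [ERV]",
            "D-Calibration", "Van Houwelingen's Alpha"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate result-file IDs into paper labels.
# Unknown IDs yield NA rather than an error.
measure_label <- function(ids) {
  measures$label[match(ids, measures$id)]
}

measure_label(c("isbs", "uno_c"))

# mlr3 IDs that are shared by more than one result-file ID
unique(measures$mlr3_id[duplicated(measures$mlr3_id)])
```

Keying any joins on `id` rather than `mlr3_id` avoids accidentally merging the rows that share an mlr3 ID.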

Tasks

The following table gives a summary of the included datasets (tasks) in the benchmark.

# tasks = load_task_data()
tasktab = load_tasktab()

ℹ Loading '/home/burk/projects/paper_2023_survival_benchmark/tables/tasktab.csv'

tasktab |>
  dplyr::select(task_id, n, p, events, censprop, n_uniq_t) |>
  # Largest datasets first
  dplyr::arrange(dplyr::desc(n)) |>
  dplyr::mutate(
    # Show the censoring proportion as a percentage with one decimal
    censprop = round(100 * censprop, 1),
    # Number of CV repeats, derived from the number of events
    repeats = assign_repeats(events)
  ) |>
  setNames(c("Dataset", "N", "p", "Events", "Censoring %", "# Unique Time Points", "# CV Repeats")) |>
  reactable::reactable(sortable = TRUE, filterable = TRUE, pagination = FALSE)
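
The summary columns can be illustrated on a toy dataset. This is only a sketch: `summarise_task()` and the toy data are hypothetical, and it assumes that `censprop` is the share of censored observations and that `n_uniq_t` counts distinct observed times (events and censorings alike). The source does not spell out either definition.

```r
# Toy survival data: `time` is the follow-up time,
# `status` is 1 for an observed event and 0 for a censored observation.
toy <- data.frame(
  time   = c(5, 8, 8, 12, 15, 20),
  status = c(1, 0, 1, 1, 0, 0),
  x1     = c(0.3, 1.2, -0.5, 0.8, 0.0, 2.1),
  x2     = c(1, 0, 1, 1, 0, 0)
)

# Hypothetical helper computing one row of a task summary table.
summarise_task <- function(data, time = "time", status = "status") {
  n <- nrow(data)
  events <- sum(data[[status]] == 1)
  data.frame(
    n        = n,
    p        = ncol(data) - 2,                  # features, excluding time and status
    events   = events,
    censprop = 1 - events / n,                  # assumed: share of censored rows
    n_uniq_t = length(unique(data[[time]]))     # assumed: distinct observed times
  )
}

summarise_task(toy)
```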

Learners

This table shows the models (learners) used in the benchmark with their mlr3 IDs and additional metadata.

“Parameters” is 0 for learners such as KM, NEL, and CPH, which do not have any hyperparameters. It is also 0 for CoxB (CoxBoost), which uses its own tuning method.
“Survival Prediction” indicates whether the learner provides a survival probability prediction (a distr object), which allows evaluation with measures like the ISBS.
“Internal CV” indicates whether the learner performs cross-validation internally. This applies to GLMN (via cv.glmnet) and CoxB (via optimCoxBoostPenalty()).
“Exhaustive Search” indicates whether the tuning space was small enough to perform an exhaustive grid search with fewer than 50 evaluations.
“Scale” indicates whether features are scaled to zero mean and unit variance as part of the pre-processing pipeline before the learner sees the data.
“Encode” indicates whether treatment encoding is performed as part of the pre-processing pipeline before the learner sees the data.
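
To make the “Scale” and “Encode” columns concrete, here is a minimal base-R sketch of both transformations on a toy data frame. It illustrates the two operations only and is not the benchmark's actual pre-processing pipeline, which is not shown in this section. The data and variable names are hypothetical.

```r
# Toy data with one numeric and one categorical feature.
toy <- data.frame(
  age   = c(40, 50, 60, 70),
  stage = factor(c("I", "II", "III", "II"))
)

# Scale: centre to mean 0 and divide by the standard deviation.
age_scaled <- as.numeric(scale(toy$age))

# Encode: treatment (dummy) encoding. The first factor level ("I") is the
# reference and gets no column of its own; the intercept column is dropped.
stage_encoded <- model.matrix(~ stage, data = toy)[, -1, drop = FALSE]

colnames(stage_encoded)   # "stageII" "stageIII"
round(mean(age_scaled), 10)
sd(age_scaled)
```

With treatment encoding, a factor with k levels yields k - 1 indicator columns, so a row belonging to the reference level is encoded as all zeros.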

lrntab = load_lrntab()

ℹ Loading '/home/burk/projects/paper_2023_survival_benchmark/tables/learners.csv'

lrntab |>
  dplyr::select(id, base_lrn, surv_pred, params, internal_cv, grid, scale, encode) |>
  dplyr::mutate(dplyr::across(dplyr::where(is.logical), \(x) ifelse(x, "\u2705", ""))) |>
  kableExtra::kbl(
    align = "llcccccc",
    caption = "Learner IDs in benchmark with associated mlr3 identifiers",
    col.names = c(
      "Learner",
      "mlr3 ID",
      "Survival Prediction",
      "Parameters",
      "Internal CV",
      "Exhaustive Search",
      "Scale",
      "Encode"
    )
  ) |>
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover")) |>
  kableExtra::column_spec(2, width = "5%") |>
  kableExtra::column_spec(3, width = "5%") |>
  kableExtra::column_spec(4, width = "20%") |>
  kableExtra::column_spec(7, width = "25%")

Learner IDs in benchmark with associated mlr3 identifiers
Learner	mlr3 ID	Survival Prediction	Parameters	Internal CV	Exhaustive Search	Scale	Encode
KM	surv.kaplan	✅	0
NEL	surv.nelson	✅	0
AK	surv.akritas	✅	1
CPH	surv.coxph	✅	0
GLMN	surv.cv_glmnet	✅	1	✅
Pen	surv.penalized	✅	2
AFT	surv.parametric	✅	1		✅
Flex	surv.flexible	✅	1		✅
RFSRC	surv.rfsrc	✅	5
RAN	surv.ranger	✅	5
CIF	surv.cforest	✅	5
ORSF	surv.aorsf	✅	2
RRT	surv.rpart		1		✅
MBSTCox	surv.mboost	✅	4
MBSTAFT	surv.mboost		4
CoxB	surv.cv_coxboost	✅	0	✅			✅
XGBCox	surv.xgboost.cox	✅	5				✅
XGBAFT	surv.xgboost.aft		7				✅
SSVM	surv.svm		4			✅	✅