===========
Performance
===========

--------
Overview
--------

* Both models perform better (on average) than the NBA's model, with a particular
  advantage in early game prediction.
* The Lifelines model outperforms the XGBoost model. For generating player ratings,
  we will use the Lifelines model.

-----------
Performance
-----------

.. important::

    At this time the models have been trained on data from the 2005-06 through the
    2019-20 seasons (pre-bubble). The final build dataset had 10 765 games with
    1 188 655 rows; the tuning/stopping datasets had 3 589 games with 396 863 rows;
    the holdout dataset had 3 589 games with 396 189 rows.

Figure 1 shows the AUROC over game-time for each model.

.. image:: ../_static/auroc.png
    :align: center
    :alt: Figure 1

Figure 2 directly shows the AUROC lift of each survival model against the NBA win
probability model.

.. image:: ../_static/auroc_lift.png
    :align: center
    :alt: Figure 2

Overall, the average AUROC lift for each model is summarized below:

+-----------+---------------+--------------------------------+
| Model     | Average AUROC | Percentage lift over NBA model |
+===========+===============+================================+
| XGBoost   | 0.808         | 3.3%                           |
+-----------+---------------+--------------------------------+
| Lifelines | 0.831         | 6.315%                         |
+-----------+---------------+--------------------------------+

---------------------
Model Characteristics
---------------------

~~~~~~~~~
Lifelines
~~~~~~~~~

Figure 3 shows the hyperparameter tuning results for the ``lifelines`` model. The
tuning was done using 1 000 evaluations.

.. image:: ../_static/lifelines-tuning.png
    :align: center
    :alt: Figure 3

Tuning led to the following final hyperparameters:

+----------------+----------------------+
| Hyperparameter | Value                |
+================+======================+
| ``l1_ratio``   | 0.007994777879269076 |
+----------------+----------------------+
| ``penalizer``  | 0.09127606625097757  |
+----------------+----------------------+

Isotonic regression produced the following calibration plot:

.. image:: ../_static/lifelines-calibration.png
    :align: center
    :alt: Figure 4

~~~~~~~
XGBoost
~~~~~~~

Figure 5 shows the hyperparameter tuning results for the ``xgboost`` model. The
tuning was done using 1 000 evaluations.

.. image:: ../_static/xgboost-tuning.png
    :align: center
    :alt: Figure 5

Tuning led to the following hyperparameters:

+--------------------------+--------------------------------------------+
| Hyperparameter           | Value                                      |
+==========================+============================================+
| ``colsample_bylevel``    | 1                                          |
+--------------------------+--------------------------------------------+
| ``colsample_bynode``     | 0.15221691031911938                        |
+--------------------------+--------------------------------------------+
| ``colsample_bytree``     | 0.6308916896893483                         |
+--------------------------+--------------------------------------------+
| ``gamma``                | 0.8083332824721229                         |
+--------------------------+--------------------------------------------+
| ``learning_rate``        | 0.0006959999989275942                      |
+--------------------------+--------------------------------------------+
| ``max_delta_step``       | 1                                          |
+--------------------------+--------------------------------------------+
| ``max_depth``            | 4                                          |
+--------------------------+--------------------------------------------+
| ``min_child_weight``     | 518                                        |
+--------------------------+--------------------------------------------+
| ``monotone_constraints`` | (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0) |
+--------------------------+--------------------------------------------+
| ``reg_alpha``            | 0.8166483270728037                         |
+--------------------------+--------------------------------------------+
| ``reg_lambda``           | 0.2533343088849453                         |
+--------------------------+--------------------------------------------+
| ``subsample``            | 0.5043820990853263                         |
+--------------------------+--------------------------------------------+

Isotonic regression produced the following calibration plot:

.. image:: ../_static/xgboost-calibration.png
    :align: center
    :alt: Figure 6

Since XGBoost doesn't produce directly interpretable coefficients like a linear model,
we will use SHAP to produce feature importances:

.. image:: ../_static/xgboost-shap.png
    :align: center
    :alt: Figure 7
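One common way to turn per-row SHAP values into a global feature ranking is to average their absolute values per feature. The sketch below illustrates that aggregation only. It assumes the SHAP values have already been computed, and the function name ``mean_abs_shap``, the feature names, and the toy values are hypothetical. Whether Figure 7 was built with exactly this summary is an assumption.

```python
from typing import Dict, Sequence


def mean_abs_shap(
    shap_values: Sequence[Sequence[float]],
    feature_names: Sequence[str],
) -> Dict[str, float]:
    """Rank features by mean absolute SHAP value (largest first).

    ``shap_values`` holds one row per observation and one column per feature.
    """
    n_rows = len(shap_values)
    if n_rows == 0:
        raise ValueError("shap_values must contain at least one row")

    totals = [0.0] * len(feature_names)
    for row in shap_values:
        if len(row) != len(feature_names):
            raise ValueError("each row needs one SHAP value per feature")
        for j, value in enumerate(row):
            totals[j] += abs(value)

    importance = {name: total / n_rows for name, total in zip(feature_names, totals)}
    # Sort so the most influential feature comes first.
    return dict(sorted(importance.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical SHAP values for three rows and two made-up features.
toy_shap = [[0.5, -0.1], [-0.25, 0.1], [0.75, -0.1]]
toy_features = ["feature_a", "feature_b"]
ranking = mean_abs_shap(toy_shap, toy_features)
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.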
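The calibration plots for both models come from isotonic regression, which fits a non-decreasing step function from raw scores to observed outcomes. As a rough illustration of what that step does, the sketch below implements the pool-adjacent-violators algorithm in plain Python. It is not the project's calibration code: the function name ``isotonic_fit`` and the toy outcomes are hypothetical, and the input is assumed to be already sorted by the model's raw predicted probability.

```python
from typing import List, Sequence


def isotonic_fit(outcomes: Sequence[float]) -> List[float]:
    """Pool-adjacent-violators fit for outcomes sorted by raw score.

    Returns one non-decreasing fitted value per input outcome.
    """
    blocks: List[List[float]] = []  # each block is [sum of outcomes, count]
    for value in outcomes:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


# Hypothetical win/loss outcomes, ordered by increasing raw predicted probability.
toy_outcomes = [0, 0, 1, 0, 1, 1]
calibrated = isotonic_fit(toy_outcomes)
```

In this toy input the third and fourth outcomes are out of order (a win followed by a loss), so they are pooled into a single calibrated value of 0.5, while the remaining outcomes are already monotone and stay unchanged.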