Gradient boosting exhibits a 'first-mover bias' where correlated features selected early in the tree sequence gain an artificial, self-reinforcing importance in SHAP rankings.
March 25, 2026
Original Paper
First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution
arXiv · 2603.22346
The Takeaway
This mechanism explains why feature importances from popular gradient-boosting models such as XGBoost are often unstable and misleading under multicollinearity. The paper's proposed DASH method lets researchers generate markedly more stable and reliable feature rankings.
From the abstract
We isolate and empirically characterize first-mover bias -- a path-dependent concentration of feature importance caused by sequential residual fitting in gradient boosting -- as a specific mechanistic cause of the well-known instability of SHAP-based feature rankings under multicollinearity. When correlated features compete for early splits, gradient boosting creates a self-reinforcing advantage for whichever feature is selected first: subsequent trees inherit modified residuals that favor the i
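The mechanism can be illustrated with a toy boosting loop. The sketch below is not the paper's method: it implements a minimal gradient-boosted stump ensemble (squared loss, numpy only) on two perfectly correlated copies of the same feature, with deterministic lower-index tie-breaking standing in for whatever tips the first split in a real library. Because the first-chosen column wins every subsequent tie as well, it absorbs 100% of the splits while its identical twin gets none, which is the importance concentration the abstract describes. All names here (`fit_stump`, `boost`) are illustrative, not from the paper.

```python
import numpy as np

def fit_stump(X, r):
    """Greedy depth-1 regression tree on residuals r: pick the
    (feature, threshold) pair with the largest SSE reduction.
    Strict '>' means ties go to the lower feature index, a stand-in
    for the deterministic tie-breaking of real implementations."""
    n, d = X.shape
    base = ((r - r.mean()) ** 2).sum()
    best = (0.0, 0, None, r.mean(), r.mean())  # gain, feat, thr, left, right
    for j in range(d):
        order = np.argsort(X[:, j])
        xs, rs = X[order, j], r[order]
        csum = np.cumsum(rs)
        total = csum[-1]
        for i in range(1, n):
            if xs[i] == xs[i - 1]:
                continue
            left_mean = csum[i - 1] / i
            right_mean = (total - csum[i - 1]) / (n - i)
            sse = (((rs[:i] - left_mean) ** 2).sum()
                   + ((rs[i:] - right_mean) ** 2).sum())
            gain = base - sse
            if gain > best[0]:
                best = (gain, j, (xs[i] + xs[i - 1]) / 2, left_mean, right_mean)
    return best

def boost(X, y, rounds=20, lr=0.3):
    """Minimal sequential residual fitting; returns per-feature split counts
    (a crude proxy for feature importance)."""
    r = y.astype(float).copy()
    counts = np.zeros(X.shape[1], dtype=int)
    for _ in range(rounds):
        gain, j, thr, lm, rm = fit_stump(X, r)
        if thr is None:
            break
        counts[j] += 1
        r -= lr * np.where(X[:, j] <= thr, lm, rm)
    return counts

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
X = np.column_stack([x, x])   # two perfectly correlated features
counts = boost(X, y)
print(counts)  # column 0 absorbs every split; its identical twin gets zero
```

Even though the two columns carry exactly the same information, the split-count "importance" is 100%/0% rather than 50%/50%: once a feature wins the first split, every later round fits residuals shaped by that choice, so the ranking reflects selection order, not information content.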