Gradient boosting exhibits a 'first-mover bias' where correlated features selected early in the tree sequence gain an artificial, self-reinforcing importance in SHAP rankings.
March 25, 2026
Original Paper
First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution
arXiv · 2603.22346
The Takeaway
This mechanism explains why feature importances from popular gradient-boosting models such as XGBoost are often unstable and misleading under multicollinearity. The paper's proposed DASH method lets researchers generate markedly more stable and reliable feature rankings.
From the abstract
We isolate and empirically characterize first-mover bias -- a path-dependent concentration of feature importance caused by sequential residual fitting in gradient boosting -- as a specific mechanistic cause of the well-known instability of SHAP-based feature rankings under multicollinearity. When correlated features compete for early splits, gradient boosting creates a self-reinforcing advantage for whichever feature is selected first: subsequent trees inherit modified residuals that favor the i
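The mechanism can be illustrated with a toy boosting loop. The sketch below is not the paper's method: it implements a minimal gradient-boosted stump ensemble (squared loss, numpy only) on two perfectly correlated copies of the same feature, with deterministic lower-index tie-breaking standing in for whatever tips the first split in a real library. Because the first-chosen column wins every subsequent tie as well, it absorbs 100% of the splits while its identical twin gets none, which is the importance concentration the abstract describes. All names here (`fit_stump`, `boost`) are illustrative, not from the paper.

```python
import numpy as np

def fit_stump(X, r):
    """Greedy depth-1 regression tree on residuals r: pick the
    (feature, threshold) pair with the largest SSE reduction.
    Strict '>' means ties go to the lower feature index, a stand-in
    for the deterministic tie-breaking of real implementations."""
    n, d = X.shape
    base = ((r - r.mean()) ** 2).sum()
    best = (0.0, 0, None, r.mean(), r.mean())  # gain, feat, thr, left, right
    for j in range(d):
        order = np.argsort(X[:, j])
        xs, rs = X[order, j], r[order]
        csum = np.cumsum(rs)
        total = csum[-1]
        for i in range(1, n):
            if xs[i] == xs[i - 1]:
                continue
            left_mean = csum[i - 1] / i
            right_mean = (total - csum[i - 1]) / (n - i)
            sse = (((rs[:i] - left_mean) ** 2).sum()
                   + ((rs[i:] - right_mean) ** 2).sum())
            gain = base - sse
            if gain > best[0]:
                best = (gain, j, (xs[i] + xs[i - 1]) / 2, left_mean, right_mean)
    return best

def boost(X, y, rounds=20, lr=0.3):
    """Minimal sequential residual fitting; returns per-feature split counts
    (a crude proxy for feature importance)."""
    r = y.astype(float).copy()
    counts = np.zeros(X.shape[1], dtype=int)
    for _ in range(rounds):
        gain, j, thr, lm, rm = fit_stump(X, r)
        if thr is None:
            break
        counts[j] += 1
        r -= lr * np.where(X[:, j] <= thr, lm, rm)
    return counts

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
X = np.column_stack([x, x])   # two perfectly correlated features
counts = boost(X, y)
print(counts)  # column 0 absorbs every split; its identical twin gets zero
```

Even though the two columns carry exactly the same information, the split-count "importance" is 100%/0% rather than 50%/50%: once a feature wins the first split, every later round fits residuals shaped by that choice, so the ranking reflects selection order, not information content.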