What is it about?

This research paper presents a novel approach to optimizing non-differentiable machine learning models by using differentiable surrogate models. The authors address a fundamental challenge in optimization: while modern machine learning algorithms such as XGBoost and random forests excel at prediction accuracy, their outputs are non-differentiable or discontinuous, making traditional gradient-based optimization methods inapplicable. The proposed method combines two models: XGBoost for accurate predictions and a neural network as a differentiable surrogate that supplies gradient information. During optimization, the neural network's gradients guide the SLSQP (Sequential Least Squares Programming) optimizer toward optimal solutions, while the final predictions are obtained from the more accurate XGBoost model. The researchers tested their approach on three classical benchmark functions (Rosenbrock, Levy, and Rastrigin) with varying dimensions and constraint conditions, as well as a real-world steel alloy optimization problem. The experiments consistently showed that the differentiable surrogate approach achieved solutions up to 40% better than traditional methods while reducing computation time by orders of magnitude, and it maintained near-zero constraint violations across all test cases even as problem complexity increased. The framework offers a practical solution for optimizing non-differentiable machine learning models and can be extended to other tree-based ensemble algorithms such as LightGBM and random forests, providing flexibility while preserving its optimization advantages.
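
To make the workflow concrete, here is a minimal sketch of the surrogate-gradient idea described above. It is not the authors' code: the toy 2-D Rosenbrock setup, network size, and hyperparameters are illustrative assumptions. It trains an XGBoost regressor and a small PyTorch network on the same data, feeds the network's value and gradient to SciPy's SLSQP optimizer, and scores the resulting candidate with the more accurate XGBoost model.

```python
import numpy as np
import torch
import torch.nn as nn
import xgboost as xgb
from scipy.optimize import minimize

# Toy objective: 2-D Rosenbrock, one of the benchmark functions used in the paper.
def rosenbrock(x):
    return (1 - x[:, 0]) ** 2 + 100 * (x[:, 1] - x[:, 0] ** 2) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(2000, 2))
y = rosenbrock(X)

# 1) Accurate but non-differentiable model.
booster = xgb.XGBRegressor(n_estimators=300, max_depth=6).fit(X, y)

# 2) Differentiable surrogate trained on the same data (architecture is an assumption).
surrogate = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                          nn.Linear(64, 64), nn.Tanh(),
                          nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
Xt = torch.tensor(X, dtype=torch.float32)
yt = torch.tensor(y, dtype=torch.float32).unsqueeze(1)
for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(Xt), yt)
    loss.backward()
    opt.step()

# Objective value and gradient for SLSQP come from the surrogate;
# constraints could be passed to `minimize` the same way via `constraints=`.
def f(x):
    xt = torch.tensor(x, dtype=torch.float32, requires_grad=True)
    out = surrogate(xt.unsqueeze(0)).squeeze()
    out.backward()
    return out.item(), xt.grad.numpy().astype(np.float64)

res = minimize(f, x0=np.array([-1.5, 1.5]), jac=True, method="SLSQP",
               bounds=[(-2, 2), (-2, 2)])

# 3) Report the candidate's quality with the more accurate XGBoost model.
print("Candidate:", res.x)
print("XGBoost prediction at candidate:", booster.predict(res.x.reshape(1, -1))[0])
```

The design choice this illustrates is the division of labor: the tree ensemble is queried only to evaluate candidate solutions, while every gradient the optimizer needs comes from the smooth neural surrogate.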


Why is it important?

Bridges a critical gap: It resolves the fundamental tension between model accuracy (non-differentiable models like XGBoost) and optimization efficiency (requiring gradients).
Practical applications: Enables efficient optimization in real-world engineering and industrial applications where accurate predictions and optimal solutions are both crucial.
Computational efficiency: Dramatically reduces optimization time from minutes to seconds while improving solution quality.
Constraint handling: Maintains excellent constraint satisfaction, critical for real-world applications with strict physical or regulatory requirements.
Broad applicability: The methodology can be extended to various tree-based ensemble algorithms, making it widely useful across different domains.

Perspectives

Scalability challenges: The method requires training two models, increasing initial computational costs, which may limit applications in time-constrained scenarios.
Architecture sensitivity: Performance depends on neural network architecture and training parameters, requiring careful tuning.
Extension opportunities: Multi-objective optimization problems; adaptive training strategies to reduce computational burden; applications in diverse domains like process optimization and structural design.
Industrial impact: The successful application to steel alloy optimization demonstrates practical value, suggesting potential for broader industrial adoption.
Theoretical development: Further research could explore theoretical convergence guarantees and optimal surrogate model selection strategies.

Shikun Chen
Ningbo University of Finance and Economics

Read the Original

This page is a summary of: Optimization of non-smooth functions via differentiable surrogates, PLOS One, May 2025, PLOS,
DOI: 10.1371/journal.pone.0321862.
You can read the full text open access via the DOI above.
