AI-SearchPlanner: Modular Agentic Search via Pareto-Optimal Multi-Objective Reinforcement Learning
Paper • 2508.20368 • Published
Artificial Intelligence
ROOT: Robust Orthogonalized Optimizer for Neural Network Training
Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation