Sigmoid Scaling Lets Teams Predict RL Post-Training Returns for LLMs
New research shows that RL post-training progress follows sigmoid compute-performance curves, and presents ScaleRL, a training recipe validated at up to 100,000 GPU-hours for predictable scaling of LLMs.