[2503.24235] What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
[Submitted on 31 Mar 2025 (v1), last revised 16 Apr 2025 (this version, v2)] Authors:Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, […]
arXiv:2504.11792v1 Announce Type: new Abstract: The ability to predict drug overdose risk from a patient’s medical records is crucial for
Large Language Models for Drug Overdose Prediction from Longitudinal Medical Records
[Submitted on 16 Feb 2025 (v1), last revised 16 Apr 2025 (this version, v2)]
arXiv:2504.11765v1 Announce Type: new Abstract: Recent large language models (LLMs) face increasing inference latency as input context length and model
[Submitted on 21 Nov 2024 (v1), last revised 15 Apr 2025 (this version, v2)]
arXiv:2504.11741v1 Announce Type: new Abstract: Recent supervised fine-tuning (SFT) approaches have significantly improved language models’ performance on mathematical reasoning tasks,
Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?
[Submitted on 21 Oct 2024 (v1), last revised 16 Apr 2025 (this version, v3)] Authors:Jiaxu Li, Kejia Fan, Songning Lai,
Closed-Form Solution for Time Series-oriented Continual Learning
arXiv:2504.11524v1 Announce Type: new Abstract: There is growing interest in hypothesis generation with large language models (LLMs). However, fundamental questions
HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation
arXiv:2504.11543v1 Announce Type: new Abstract: We introduce REAL, a benchmark and framework for multi-turn agent evaluations on deterministic simulations of
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
Structuring Graph-based RAG with Heterogeneous Nodes