Preference-Based Optimization with Human Feedback
[Submitted on 24 Jul 2024 (v1), last revised 16 Apr 2025 (this version, v2)] View a PDF of the paper […]
Preference-Based Optimization with Human Feedback Read Post »
[Submitted on 4 Apr 2025 (v1), last revised 15 Apr 2025 (this version, v3)] View a PDF of the paper
Scaling Deep Research via Reinforcement Learning in Real-world Environments Read Post »
arXiv:2504.11453v1. Abstract: Progress in offline reinforcement learning (RL) has been impeded by ambiguous problem definitions and entangled […]
A Clean Slate for Offline Reinforcement Learning Read Post »
[Submitted on 22 Nov 2024 (v1), last revised 15 Apr 2025 (this version, v2)] View a PDF of the paper
[2411.15244] Adversarial Prompt Distillation for Vision-Language Models Read Post »
[Submitted on 10 Jul 2024 (v1), last revised 15 Apr 2025 (this version, v2)] View a PDF of the paper
[2407.07612] Teaching Transformers Causal Reasoning through Axiomatic Training Read Post »
[Submitted on 7 Apr 2025 (v1), last revised 14 Apr 2025 (this version, v2)] View a PDF of the paper
[2504.05408] Frontier AI’s Impact on the Cybersecurity Landscape Read Post »
[Submitted on 9 Feb 2025 (v1), last revised 15 Apr 2025 (this version, v3)] View a PDF of the paper
Efficient Evolutionary Merging on Consumer-grade GPUs Read Post »
[Submitted on 15 Apr 2025] View a PDF of the paper titled Single-Input Multi-Output Model Merging: Leveraging Foundation Models for
Leveraging Foundation Models for Dense Multi-Task Learning Read Post »
[Submitted on 15 Dec 2024 (v1), last revised 14 Apr 2025 (this version, v2)] Authors:Elham Kiyani (1), Manav Manav (2),
[Submitted on 15 Aug 2024 (v1), last revised 12 Apr 2025 (this version, v4)] Authors:Andy K. Zhang, Neil Perry, Riya
A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models Read Post »