Direct Preference Optimization Python

FPO: Fine-Grained Preference Optimization Improves Zero-Shot Text-to-Speech

Abstract: Integrating reinforcement learning to align generated speech with human preferences has proven effective in improving the robustness of modern text-to-speech (TTS) systems. Current ...

IEEE

MDPO: Multi-Granularity Direct Preference Optimization for Mathematical Reasoning

Abstract: Mathematical reasoning presents a significant challenge for Large Language Models (LLMs) as it requires ensuring the correctness of each reasoning step. Researchers have been strengthening ...
