Audio Models
Audio AI models on LORY
10 audio models available. Explore each model's strengths, then try it in your own story project.
ACE-Step (text-to-audio)
Song-focused text-to-audio model that works well with structured style tags and lyric guidance.
Chatterbox Speech-to-Speech
Voice conversion model for transforming source speech while preserving delivery rhythm.
Chatterbox Turbo (TTS)
Fast, lightweight TTS model for rapid voiceover drafts and iterative narration passes.
ElevenLabs Multilingual v2 (TTS)
High-quality multilingual TTS suited for polished narration and character voice lines.
ElevenLabs Music
ElevenLabs Eleven Music (music_v1) — full-track music generation direct from ElevenLabs with vocal or instrumental output, multilingual singing, and 44.1…
ElevenLabs v3 (TTS)
ElevenLabs v3 direct — the most expressive TTS model for cinematic trailers, dramatic narration, and emotional dialogue.
Minimax Music 2.6
Full-track music generator with structured lyrics support — vocal or instrumental, configurable sample rate/format/bitrate, and lyric structure tags…
Sonauto v2 (Text-to-Music)
Music generation model optimized for prompt-driven song ideas with strong style control.
Stable Audio 2.5
General-purpose text-to-audio model for longer-form ambient, score, and sound-design outputs.
Stable Audio 2.5 (Audio-to-Audio)
Audio-to-audio transformation model — restyle a source clip with a target-sound prompt and a strength slider to balance source preservation vs. prompt…