ACE-Step – Text to Audio
Text-to-audio model by ACE Studio (China).
Song-focused text-to-audio model that works well with structured style tags and lyric guidance.
Practical specs for planning a generation in LORY. These details come from the model contract we use when routing a request.
Try ACE-Step – Text to Audio on LORY
Start with free welcome credits — no subscription, and we never ask for payment info during your trial. Pay only when you decide to top up.
MiniMax Music 2.6
Full-track music generation with optional structured lyrics, vocal or instrumental output, and configurable audio settings.
Stable Audio 2.5
Text-to-audio generation for full-length music and sound effects (up to ~3 minutes).
ElevenLabs Music
ElevenLabs Eleven Music (music_v1) — full-track music generation with vocals or instrumental, multilingual singing, and 44.1 kHz studio-quality output.
Stable Audio 2.5 – Audio to Audio
Audio-to-audio transformation with prompt-driven restyling and a strength control to preserve or replace the source.
Eleven Multilingual v2 – Text to Speech
High-quality multilingual text-to-speech by ElevenLabs with 21 preset voices, style control, and speed adjustment.
Chatterbox – Speech to Speech
Voice conversion from a source clip with an optional target voice reference.