
Grok Imagine – Reference to Video

Image-to-Video model by xAI (US).

4/5 LORY rating
About this model

Grok Imagine reference-to-video is useful when one image is not enough. You can build a single shot from several reference images, combining subject, setting, and style cues in one clip, with synced audio included.

Capability: Image-to-Video
Provider: xAI
Origin: US
Output: Video
Modes: Generate
Model details

Practical specs for planning a generation in LORY. These details come from the model contract we use when routing a request.

Inputs: Prompt required · Up to 7 source images · Up to 7 reference images
Output: MP4 · Native audio
Resolution: 480P, 720P
Aspect ratios: 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16
Duration: 1s, 2s, 3s, 4s, 5s, 6s, 7s, 8s +2 more
Creative controls: Prompt up to 3,500 characters
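
The limits above can be checked before a request is submitted. The following is a minimal sketch in Python, not LORY's or xAI's actual API: the `VideoRequest` class, its field names, its default values, and `check_request` are all hypothetical names introduced for illustration. Only the numeric limits and option lists come from the specs on this page. Because the duration list ends in "+2 more", the two unlisted values are not guessed; any duration outside 1s to 8s is reported as not listed rather than as invalid.

```python
from dataclasses import dataclass, field

# Limits copied from the spec list on this page.
MAX_PROMPT_CHARS = 3500
MAX_SOURCE_IMAGES = 7
MAX_REFERENCE_IMAGES = 7
RESOLUTIONS = {"480P", "720P"}
ASPECT_RATIOS = {"16:9", "4:3", "3:2", "1:1", "2:3", "3:4", "9:16"}
# The page lists 1s-8s "+2 more"; the two unlisted values are unknown here.
LISTED_DURATIONS_S = set(range(1, 9))


@dataclass
class VideoRequest:
    """Hypothetical request shape; field names and defaults are assumptions."""
    prompt: str
    source_images: list = field(default_factory=list)
    reference_images: list = field(default_factory=list)
    resolution: str = "720P"
    aspect_ratio: str = "16:9"
    duration_s: int = 5


def check_request(req: VideoRequest) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not req.prompt.strip():
        problems.append("prompt is required")
    elif len(req.prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if len(req.source_images) > MAX_SOURCE_IMAGES:
        problems.append(f"more than {MAX_SOURCE_IMAGES} source images")
    if len(req.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"more than {MAX_REFERENCE_IMAGES} reference images")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"resolution {req.resolution} is not listed")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"aspect ratio {req.aspect_ratio} is not listed")
    if req.duration_s not in LISTED_DURATIONS_S:
        problems.append(f"duration {req.duration_s}s is not among the listed values")
    return problems
```

A request with a short prompt, two reference images, and the default settings returns an empty list. Collecting every problem instead of stopping at the first lets a caller fix the prompt, image count, and settings in one pass.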

Try Grok Imagine – Reference to Video on LORY

Start with free welcome credits. No subscription is required, and we never ask for payment info during your trial. Pay only when you decide to top up.

Similar models

More video models you might like

Video · Image-to-Video

Grok Imagine – Image to Video

Image-to-video via xAI's Grok Imagine with native audio (1-15s, up to 720p).

xAI · US

Video · Image-to-Video

Veo 3.1 Lite – Image to Video

Animate a single image into a short Veo 3.1 Lite clip with optional native audio.

Google · US

Video · Image-to-Video

LTX-2.3 Fast – Image to Video

LTX-2.3 Fast image-to-video with native audio, up to 20s, and up to 4K resolution.

Lightricks · Israel

Video · Image-to-Video

LTX-2.3 Pro – Image to Video

LTX-2.3 Pro image-to-video: higher fidelity with better motion stability and visual detail. Up to 10s, up to 4K, native audio. Best for final renders.

Lightricks · Israel

Video · Image-to-Video

Vidu Q3 – Image to Video

Vidu Q3 image-to-video with optional end-frame transitions and native audio.

ShengShu Technology · China

Video · Image-to-Video

Seedance 1.5 Pro – Image to Video

Seedance image-to-video with start/end frame conditioning, camera control, and native audio.

ByteDance · China