AI Models
186 models · 3 new in 60d
- Seedance 2.0 Pro
ByteDance · N/A · credit-based → per-second
Best for: Cost-sensitive Chinese-market video, fast iteration on social shorts, longer narrative clips.
How: Two tiers — Seedance 2.0 Pro for quality and Seedance 2.0 Lite for fast/cheap drafts. Both expose text-to-video and image-to-video; v2 adds longer shot length and stronger prompt adherence over v1.
Example: POST volcengineapi.com/seedance/v2/videos { prompt: 'a hummingbird in flight, slow motion', mode: 'pro' }
Features: high-fidelity 1080p (with 4K up-res) · longer coherent shots · improved motion physics · Lite tier for cheap iteration
API: Volcengine / ByteDance API
Successor to Seedance 1.0 (2025-06). ByteDance's competitor to Sora / Veo / Kling. The Lite tier remains notably cheaper than competitors at comparable quality.
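The Pro/Lite split for Seedance 2.0 can be sketched as a small helper that builds the request body. This is only an illustration: the `prompt` and `mode` fields come from the example request in the entry, while the `'lite'` mode value and the `build_seedance_payload` helper are assumptions, not documented parameters.

```python
import json

# Tier names come from the entry; 'lite' as a mode value is an assumption.
VALID_MODES = {"pro", "lite"}


def build_seedance_payload(prompt: str, draft: bool = False) -> str:
    """Return a JSON request body, choosing the Lite tier for drafts (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    mode = "lite" if draft else "pro"
    assert mode in VALID_MODES
    return json.dumps({"prompt": prompt, "mode": mode})


# Iterate cheaply on the Lite tier, then render the final clip on Pro.
draft_body = build_seedance_payload("a hummingbird in flight, slow motion", draft=True)
final_body = build_seedance_payload("a hummingbird in flight, slow motion")
```

Keeping the tier choice in one place makes it easy to switch a whole batch of prompts from drafts to final renders.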
- Sora 2
OpenAI · N/A · see openai.com/pricing → per-second tiered
Best for: Marketing reels, b-roll, storyboard previz, social-media shorts.
How: Generate up to 60s clips from a text prompt or seed image. Audio and lip-sync included.
Example: client.videos.create(model='sora-2', prompt='aerial shot of a coastal city at sunrise, 1080p, 10s')
Features: high-fidelity 1080p video · realistic motion physics · long-shot consistency · audio + dialogue generation
API: api.openai.com, client.videos.create(model='sora-2')
Successor to Sora 1 — adds native audio and longer coherent shots.
- Wan 2.2 · Open
Alibaba · N/A · self-host
Best for: Teams that want best-in-class open-source video. The 5B variant runs on a single 24GB consumer card.
How: ComfyUI ships official nodes. Alternatively, run `python generate.py --task t2v-A14B --prompt '...'` from the Wan2.2 repo.
Example: python generate.py --task t2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-T2V-A14B --prompt 'a corgi running on the beach at sunset'
Features: MoE video architecture · open weights · T2V + I2V · 5B small variant for consumer GPUs · 720p output
Hardware to self-host:
VRAM: 24GB (5B variant) / 80GB (A14B)
GPU: RTX 4090 for 5B · H100 for A14B
RAM: 32–64GB system RAM
The 5B model fits a single 4090. The A14B MoE delivers Sora-class quality but needs an H100 or 2× 4090 with offload.
API: huggingface.co/Wan-AI/Wan2.2-T2V-A14B · Wan-AI/Wan2.2-TI2V-5B
- Veo 3
Google · N/A · Vertex AI pricing → per-second tiered
Best for: Photoreal cinematic clips, ad creative, talking-head shorts with audio.
How: Call generate_videos(model='veo-3.0-generate-preview', prompt='...') through Vertex AI. The Gemini API exposes the same model.
Example: client.models.generate_videos(model='veo-3.0-generate-preview', prompt='timelapse of a city under heavy rain')
Features: 1080p / 4K up-res · synchronized audio · strong prompt adherence · 8s native, longer with stitching
API: Vertex AI / Gemini API (model: veo-3.0-generate-preview)
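The "8s native, longer with stitching" note implies a planning step: split the target duration into segments of at most 8 seconds, generate each, then join them. A minimal sketch of that split follows. `plan_segments` is a hypothetical helper, and the actual stitching workflow is not described in this entry.

```python
NATIVE_CLIP_SECONDS = 8  # native clip length listed for Veo 3


def plan_segments(target_seconds: float, clip_seconds: int = NATIVE_CLIP_SECONDS) -> list[float]:
    """Split a target duration into clip lengths, each no longer than clip_seconds."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    full_clips, remainder = divmod(target_seconds, clip_seconds)
    segments = [float(clip_seconds)] * int(full_clips)
    if remainder:
        # The last segment covers whatever is left over.
        segments.append(float(remainder))
    return segments


segments_for_20s = plan_segments(20)  # two full clips plus a shorter final clip
```

Each planned segment would correspond to one generation request, so the segment count also gives a rough multiplier for per-second cost.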
- Kling 2.1
Kuaishou · N/A · credit-based → per-second
Best for: Cost-sensitive video generation, dance / sports content.
How: Cheaper alternative to Sora/Veo with strong human-motion fidelity.
Example: POST klingai.com/v1/videos/text2video { prompt: '...', duration: 10 }
Features: realistic human motion · long shot generation · competitive quality at lower price
API: klingai.com (REST API)
- Runway Gen-4
Runway · N/A · credit-based → per-second
Best for: Short narrative content where the same character appears in multiple scenes.
How: Pass a reference image and a prompt; returns a 5–10s clip with consistent characters across shots.
Example: POST runwayml.com/v1/image_to_video { promptImage: ..., promptText: '...' }
Features: character & object consistency across shots · image-to-video · lip-sync · creator workflow
API: runwayml.com (REST API + web app)
- Step-Video-T2V · Open
StepFun · N/A · self-host
Best for: Highest-quality open-source video model when you have the hardware to run it.
How: Clone the stepfun-ai/Step-Video-T2V repo, install the requirements, and run the sampling script with your prompt.
Example: python sample_video.py --prompt 'underwater coral reef, schools of fish' --num-frames 204 --resolution 544x992
Features: 30B params · MIT license · competitive with Sora-class quality · deep compression video VAE
Hardware to self-host:
VRAM: 80GB (FP16) / 40GB (FP8)
GPU: H100 80GB · or 2× A100 40GB · or A100 40GB with FP8
RAM: 128GB system RAM
At 30B parameters the model is heavy to run, but few open-source video models pair this level of quality with a license as permissive as MIT.
API: huggingface.co/stepfun-ai/stepvideo-t2v
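The FP16 and FP8 VRAM figures for Step-Video-T2V can be sanity-checked with parameter-count arithmetic. The sketch below counts only the weights (2 bytes per parameter for FP16, 1 byte for FP8, with 1 GB taken as 10^9 bytes). Activations, the video VAE, and text encoders add to that, which is presumably why the listed requirements sit above the raw weight size.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9


fp16_gb = weight_memory_gb(30, "fp16")  # 30B params at 2 bytes each
fp8_gb = weight_memory_gb(30, "fp8")    # 30B params at 1 byte each
print(f"FP16 weights: {fp16_gb:.0f} GB, FP8 weights: {fp8_gb:.0f} GB")
```

The weights alone come to roughly 60 GB at FP16 and 30 GB at FP8, which leaves headroom within the 80GB and 40GB figures listed for this model.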
- Pika 2.2
Pika Labs · N/A · credit-based → per-clip
Best for: Social shorts, music videos, rapid creative iteration.
How: Strong for short, stylized clips and quick iteration. Pikaframes lets you set start/end frames.
Example: Use Pikaframes: upload start + end image, prompt the in-between motion.
Features: fast iteration · Pikaframes (keyframe interpolation) · lip-sync
API: pika.art (web app + API)
- HunyuanVideo · Open
Tencent · N/A · self-host
Best for: Self-hosted video generation, research, building custom pipelines.
How: Clone the repo, install the requirements, and run sample_video.py with your prompt. Or use ComfyUI workflows.
Example: python sample_video.py --prompt 'a cat surfing at sunset' --video-length 129 --infer-steps 50
Features: fully open weights · 13B params · competitive quality · fine-tunable
Hardware to self-host:
VRAM: 60GB
GPU: H100 80GB or 2× RTX 4090 (with offload)
RAM: 64GB system RAM
Quantized to 8-bit, it fits on a single A100 40GB. ComfyUI workflows offload the diffusion transformer to CPU at the cost of speed.
API: huggingface.co/tencent/HunyuanVideo
- LTX-Video · Open
Lightricks · N/A · self-host
Best for: Quick-turnaround work such as prototypes, drafts, and dataset generation. A speed-first open-source video model.
How: diffusers LTXPipeline. Generates a 5-second 768×512 clip in seconds on an H100; a few minutes on a 4090.
Example: from diffusers import LTXPipeline; pipe = LTXPipeline.from_pretrained('Lightricks/LTX-Video'); pipe(prompt='falling autumn leaves').frames
Features: real-time generation on H100 · 2B params (small footprint) · fast iteration loop · 13B variant for higher quality
Hardware to self-host:
VRAM: 12GB (2B) / 24GB (13B)
GPU: RTX 3090 / 4090 for 2B · H100 for 13B
RAM: 32GB system RAM
Smallest and fastest of the open-source video models listed here, and well suited to iterating on prompts before committing GPU time to bigger models.
API: huggingface.co/Lightricks/LTX-Video · Lightricks/LTX-Video-13B
- Mochi 1 · Open
Genmo · N/A · self-host
Best for: Open-source video where commercial use matters; the Apache 2.0 license permits it.
How: Diffusers pipeline or the official genmoai/models repo. ComfyUI workflows are well-documented.
Example: python -m mochi_preview.cli --prompt 'time lapse of a city street at golden hour' --num-frames 84
Features: Apache 2.0 (commercial-friendly) · 10B params · high motion fidelity · active community fine-tunes
Hardware to self-host:
VRAM: 60GB (full precision) / 24GB with quantization
GPU: H100 80GB · or RTX 4090 with FP8 quant + offload
RAM: 64GB system RAM
Memory-hungry at full precision, but FP8 / GGUF quants from the community fit a single 4090.
API: huggingface.co/genmo/mochi-1-preview
- CogVideoX-5B · Open
THUDM / Zhipu AI · N/A · self-host
Best for: Self-hosted experimentation and fine-tuning. An early open-source T2V model that remains a solid baseline.
How: pip install diffusers; CogVideoXPipeline.from_pretrained('THUDM/CogVideoX-5b') and run with a text prompt.
Example: from diffusers import CogVideoXPipeline; pipe = CogVideoXPipeline.from_pretrained('THUDM/CogVideoX-5b'); pipe(prompt='a panda playing piano').frames
Features: fully open weights · 5B params · fits on a single 24GB card · diffusers integration
Hardware to self-host:
VRAM: 18GB (with CPU offload) / 24GB native
GPU: RTX 4090 24GB
RAM: 32GB system RAM
Quantization and offload tricks let it run on 12GB. It is slower than newer entries but is one of the most widely fine-tuned open video models.
API: huggingface.co/THUDM/CogVideoX-5b
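The "Hardware to self-host" figures for the open models in this list can be turned into a quick lookup: given a GPU's VRAM, which configurations fit? The sketch below copies the VRAM numbers from those entries. `configs_that_fit` is a hypothetical helper, and real usage also depends on resolution, clip length, and offload settings that this table ignores.

```python
# (model configuration, minimum VRAM in GB), copied from the self-host entries.
VRAM_REQUIREMENTS_GB = [
    ("LTX-Video 2B", 12),
    ("CogVideoX-5B (CPU offload)", 18),
    ("Wan 2.2 5B", 24),
    ("LTX-Video 13B", 24),
    ("Mochi 1 (quantized)", 24),
    ("CogVideoX-5B (native)", 24),
    ("Step-Video-T2V (FP8)", 40),
    ("HunyuanVideo", 60),
    ("Mochi 1 (full precision)", 60),
    ("Wan 2.2 A14B", 80),
    ("Step-Video-T2V (FP16)", 80),
]


def configs_that_fit(vram_gb: float) -> list[str]:
    """Return the configurations whose listed VRAM requirement is within vram_gb."""
    return [name for name, needed in VRAM_REQUIREMENTS_GB if needed <= vram_gb]


# A 24GB consumer card such as an RTX 4090.
for name in configs_that_fit(24):
    print(name)
```

On a 24GB card this prints the six smallest configurations, from LTX-Video 2B through CogVideoX-5B (native).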
- SCAIL 2 · New · Open
zai-org · self-host
Best for: Trending on Hugging Face (191 likes this week)
How: Available on Hugging Face.
Example: from huggingface_hub import snapshot_download; local_dir = snapshot_download("zai-org/SCAIL-2")
Features: diffusers · character-animation · video-generation · pose-driven · diffusion
API: huggingface.co/zai-org/SCAIL-2
Auto-discovered from Hugging Face trending. 191 likes, 0 downloads.
- SANA WM_bidirectional · New · Open
Efficient-Large-Model · self-host
Best for: Trending on Hugging Face (86 likes this week)
How: Available on Hugging Face.
Example: from huggingface_hub import snapshot_download; local_dir = snapshot_download("Efficient-Large-Model/SANA-WM_bidirectional")
Features: diffusers · safetensors · text-to-video · image-to-video · camera-control
API: huggingface.co/Efficient-Large-Model/SANA-WM_bidirectional
Auto-discovered from Hugging Face trending. 86 likes, 0 downloads.
- Sulphur 2 Base · New · Open
SulphurAI · self-host
Best for: Trending on Hugging Face (1537 likes this week)
How: Available on Hugging Face. 1.7M downloads.
Example: from huggingface_hub import snapshot_download; local_dir = snapshot_download("SulphurAI/Sulphur-2-base")
Features: diffusers · gguf · text-to-video · base_model:Lightricks/LTX-2.3 · base_model:quantized:Lightricks/LTX-2.3
API: huggingface.co/SulphurAI/Sulphur-2-base
Auto-discovered from Hugging Face trending. 1537 likes, 1.7M downloads.
- LTX 2.3 Workflows · Open
RuneXX · self-host
Best for: Trending on Hugging Face (564 likes this week)
How: Available on Hugging Face.
Example: from huggingface_hub import snapshot_download; local_dir = snapshot_download("RuneXX/LTX-2.3-Workflows")
Features: ltx · ltx-2 · comfyui · comfy · gguf
API: huggingface.co/RuneXX/LTX-2.3-Workflows
Auto-discovered from Hugging Face trending. 564 likes, 0 downloads.
- LTX2.3 10Eros · Open
TenStrip · self-host
Best for: Trending on Hugging Face (281 likes this week)
How: Available on Hugging Face. 136K downloads.
Example: from huggingface_hub import snapshot_download; local_dir = snapshot_download("TenStrip/LTX2.3-10Eros")
Features: diffusers · image-to-video · region:us
API: huggingface.co/TenStrip/LTX2.3-10Eros
Auto-discovered from Hugging Face trending. 281 likes, 136K downloads.
- NVIDIA DLSS 4.5
NVIDIA · N/A · api
Best for: AI-powered game development
How: Integrate NVIDIA DLSS 4.5 with Unreal Engine 5
Example: Game developers can enhance game performance and visuals
Features: Dynamic Multi Frame Generation · Multi Frame Generation 6X · second-generation RTX
Auto-discovered from news articles.
- Seedance 1.0 Pro
ByteDance · N/A · credit-based → per-second
Best for: Cost-sensitive Chinese-market video, fast iteration on social shorts.
How: Two tiers: Seedance 1.0 Pro for top quality and Seedance 1.0 Lite for fast/cheap drafts. Both expose text-to-video and image-to-video.
Example: POST volcengineapi.com/seedance/v1/videos { prompt: 'a hummingbird in flight, slow motion', mode: 'pro' }
Features: fast generation · 1080p output · strong prompt adherence · Lite variant for cheap iteration
API: Volcengine / ByteDance API
ByteDance's competitor to Sora / Veo / Kling. Lite tier is notably cheaper than competitors at similar quality.