Stable Video Diffusion

Video Lecture

  • All videos are Copyright © 2019-2025 Sean Bradley, all rights reserved.
Video Timings

00:00 Introduces Stable Video Diffusion (SVD) for converting still images to video
00:15 Explains downloading SVD models from Civitai
00:40 Demonstrates setting up the img2vid checkpoint loader in ComfyUI
01:30 Details workflow configuration including VAE decode and WebP output
02:00 Notes optimal SVD output is 1024x576 at 14 frames, 6 frames/sec
02:50 Shows examples; SVD is effective at separating foreground and background
04:00 Explains using "motion bucket ID" (127 often optimal) to control movement
04:45 Discusses augmentation level for removing noise, potentially affecting quality
05:30 Demonstrates cropping images to guide SVD generation focus
07:00 Highlights model naming differences between Civitai and Hugging Face downloads

Description

Stable Video Diffusion (SVD) Image-to-Video is a diffusion model that takes in a still image as a conditioning frame, and generates a video from it.

Platform Links
Civitai img2vid | img2vid-xt | img2vid-xt-1.1
Hugging Face img2vid | img2vid-xt | img2vid-xt-1.1

The base img2vid model was trained to generate 14 frames at 1024x576.

img2vid-xt was trained to generate 25 frames at 1024x576.

img2vid-xt-1.1 is a more finely tuned version of img2vid-xt.
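To put those frame counts in terms of clip length, the short sketch below divides frames by the playback rate. It assumes the 6 frames/sec mentioned in the video timings applies to both the 14 frame and 25 frame models; the function name is only illustrative.

```python
def clip_duration_seconds(frames: int, fps: float) -> float:
    """Length of a clip in seconds for a given frame count and playback rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


# Base img2vid: 14 frames at 6 frames/sec
print(round(clip_duration_seconds(14, 6), 2))  # 2.33
# img2vid-xt / img2vid-xt-1.1: 25 frames, assuming the same 6 frames/sec
print(round(clip_duration_seconds(25, 6), 2))  # 4.17
```

So a single generation is only a few seconds long at this frame rate.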

Important

Civitai and Hugging Face use different file names for all the img2vid files. When you load them in ComfyUI, you will need to remember which file name was saved, depending on whether you got the model from Civitai or Hugging Face.

SVD_img2vid_Conditioning

The SVD_img2vid_Conditioning node controls the motion behavior during image-to-video generation.

For best quality, the width and height should be 1024x576. You can also get results using the portrait orientation, 576x1024.

Frames should be 14 when using img2vid, and 25 when using img2vid-xt or img2vid-xt-1.1.
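The size and frame recommendations above can be collected into a small lookup. This is a minimal sketch, not part of ComfyUI: the dictionary keys are labels for the three models rather than real checkpoint file names, and the setting names `width`, `height` and `video_frames` are assumed to match the fields shown on the node.

```python
# Frame counts per model, as stated above. The keys are illustrative
# labels, not the checkpoint file names from Civitai or Hugging Face.
FRAMES_BY_MODEL = {
    "img2vid": 14,
    "img2vid-xt": 25,
    "img2vid-xt-1.1": 25,
}


def conditioning_settings(model: str, portrait: bool = False) -> dict:
    """Return suggested width, height and frame count for a model label."""
    if model not in FRAMES_BY_MODEL:
        raise ValueError(f"unknown model label: {model!r}")
    # 1024x576 for best quality, or 576x1024 for portrait output
    width, height = (576, 1024) if portrait else (1024, 576)
    return {
        "width": width,
        "height": height,
        "video_frames": FRAMES_BY_MODEL[model],
    }


print(conditioning_settings("img2vid"))
print(conditioning_settings("img2vid-xt", portrait=True))
```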

The Motion Bucket ID defaults to 127, which normally produces adequate results. You can change the value to anything from 0 to 255. The value refers to a preselected set of discrete "motion buckets" that the model was trained on, and it controls the intensity and complexity of motion in the generated video. Lower numbers make the movement appear more static, whereas higher numbers make it more dramatic. Numbers higher than 127 tend to produce more unstable results.
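When generating several clips with different Motion Bucket IDs, it can help to keep the values inside the 0 to 255 range and flag the ones above the default. The helpers below are a hypothetical sketch for that bookkeeping; they are not ComfyUI functions, and the 127 threshold simply reflects the observation above.

```python
DEFAULT_MOTION_BUCKET_ID = 127


def clamp_motion_bucket_id(value: int, low: int = 0, high: int = 255) -> int:
    """Force a requested Motion Bucket ID into the allowed 0..255 range."""
    return max(low, min(high, int(value)))


def may_be_unstable(value: int) -> bool:
    """True when the value is above the default, where results tend to be less stable."""
    return value > DEFAULT_MOTION_BUCKET_ID


for requested in (-10, 40, 127, 200, 300):
    bucket = clamp_motion_bucket_id(requested)
    print(requested, "->", bucket, "(may be unstable)" if may_be_unstable(bucket) else "")
```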

The Augmentation Level is another factor that can affect the final result, including camera shifting, cropping, colours, contrast, Gaussian noise injection and texture distortion. Higher numbers tend to filter out more noise, so details such as skin texture can appear smoother.

Sample Input Images

"a car on a dusty road", SD1.5, 512x512

a car on a dusty road

"coral reef", 515-inpainting, 768x768

fish coral reef

"A traditional english village, tilt-shift photography", Flux Schnell, 1024x1024

English Village Tilt Focus

"a person with freckels reading a newspaper", Flux Schnell, 1024x1024

Girl with freckles reading newspaper

"photo of a person wearing a high tech scifi armor", SD3.5, 1024x1024

Cyborg

"a fighter pilots view of flying at street level between city buildings in the sunset", Flux Schnell, 1024x1024

Jet Flying Thru Street

"photograph beautiful scenery nature mountains alps river rapids snow sky cumulus clouds", Flux Schnell, 1024x1024

Rapids
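All of the sample images above are square, while the recommended SVD size is 1024x576. One way to prepare them is to crop to the same aspect ratio before generating. The sketch below computes a crop box, assuming a simple centered crop; in practice you may prefer to shift the box toward the subject you want SVD to focus on. The function name is illustrative, and the returned tuple uses the (left, top, right, bottom) order that common image libraries accept.

```python
def center_crop_box(width: int, height: int,
                    target_w: int = 1024, target_h: int = 576) -> tuple:
    """Largest centered box inside (width x height) with the target aspect ratio."""
    # Compare aspect ratios with integer cross-multiplication to avoid float error
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is taller than (or equal to) the target: keep full width, trim top and bottom
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


print(center_crop_box(1024, 1024))  # (0, 224, 1024, 800)
print(center_crop_box(512, 512))    # (0, 112, 512, 400)
print(center_crop_box(768, 768))    # (0, 168, 768, 600)
```

The 512x512 and 768x768 crops are smaller than 1024x576, so they would still need to be scaled up to reach the recommended size.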