Text to Image

Description

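We will recreate the basic Text to Image workflow using the v1-5-pruned-emaonly-fp16.safetensors model, an optimised version of the Stable Diffusion v1.5 model. The fp16 in the filename indicates that the weights are stored as 16-bit floating-point numbers, which roughly halves the file size compared with 32-bit weights.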

pruned means that parameters and other data not needed for generating images have been removed from the file. This reduces its size on disk and the memory needed to load it.

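emaonly means that the checkpoint file contains only the "Exponential Moving Average" (EMA) weights, a smoothed running average of the model's weights kept during training. Averaging the weights this way is often used to improve generalization, and the raw (non-EMA) weights, which are mainly useful for further training, are left out.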
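
As a rough illustration of what an exponential moving average of weights looks like, here is a minimal sketch in plain Python. The two-value "model", the sequence of weights, and the decay of 0.9 are all made up for the example; they are not taken from how Stable Diffusion v1.5 was actually trained.

```python
def ema_update(ema_weights, current_weights, decay):
    """Blend the current weights into the running average, element by element."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, current_weights)]


# Hypothetical raw weights after three training steps (two parameters only).
raw_weights_per_step = [[1.0, -2.0], [1.0, -2.0], [3.0, 0.0]]

# Illustrative decay; real training runs typically use a value much closer to 1.
decay = 0.9

ema_weights = [0.0, 0.0]
for step_weights in raw_weights_per_step:
    ema_weights = ema_update(ema_weights, step_weights, decay)

print("raw weights at last step:", raw_weights_per_step[-1])
print("EMA weights at last step:", ema_weights)
```

The EMA values move more slowly than the raw ones, which is why they act as a smoothed copy of the model. An emaonly checkpoint ships just that smoothed copy.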

The .safetensors extension refers to a model serialization format that is faster to load and safer than earlier formats. The earlier .ckpt format relies on Python's pickle module, which can execute arbitrary code when a file is loaded, so a malicious checkpoint could compromise your machine.
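
To see why a .safetensors file is safe to inspect, it helps to look at its layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below builds a tiny file of that shape in memory and reads the header back using only the standard library. The tensor name demo.weight and the helper function names are hypothetical, and this is not the official safetensors library, just an illustration of the layout.

```python
import json
import struct


def build_safetensors_bytes(tensors):
    """Pack {name: (dtype, shape, raw_bytes)} into the safetensors layout."""
    header = {}
    payload = b""
    for name, (dtype, shape, raw) in tensors.items():
        start = len(payload)
        payload += raw
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [start, len(payload)],
        }
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte unsigned little-endian length, then the JSON header, then the data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_safetensors_header(blob):
    """Return the JSON header without touching (or executing) anything else."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len].decode("utf-8"))


# Two half-precision (fp16) values take 2 bytes each.
raw_fp16 = struct.pack("<2e", 1.0, 2.0)
blob = build_safetensors_bytes({"demo.weight": ("F16", [2], raw_fp16)})

header = read_safetensors_header(blob)
print(header)
```

Reading the file only involves parsing JSON and slicing bytes, so there is no step at which code stored in the file could run.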

Recreate the basic text to image workflow from scratch

  • Load Checkpoint : Loads a checkpoint model (e.g., SD 1.5).
  • KSampler : The denoising engine. Uses the prompt, noise, and model to iteratively generate an image in latent space.
  • VAE Decode : Uses the Variational Autoencoder (VAE) to convert the latent image into a visible RGB image.
  • Save Image : Saves the final generated image to disk.
  • CLIP Text Encode (Positive Prompt) : Encodes your main text prompt into a format the model can use.
  • CLIP Text Encode (Negative Prompt) : Encodes undesired elements (e.g., "blurry, distorted") to help the model avoid them.
  • Empty Latent Image : Creates a blank latent image of the desired resolution. The KSampler adds noise to it and uses it as the starting point.
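
The way these nodes connect can be sketched as a small graph in Python, loosely modelled on the JSON that ComfyUI exports in its API format. The node class names, input names, and numeric settings below (seed, steps, cfg, sampler, resolution) are assumptions for illustration and may differ between ComfyUI versions. A link is written as [source node id, output index].

```python
import json

# Assumed API-style description of the seven nodes listed above.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly-fp16.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a cat on a skateboard", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry, distorted", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 20, "cfg": 8.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"filename_prefix": "ComfyUI", "images": ["6", 0]}},
}


def is_link(value):
    """A link is a two-item list: [source node id, output index]."""
    return isinstance(value, list) and len(value) == 2 and isinstance(value[0], str)


def upstream(graph, node_id, seen=None):
    """Collect the ids of every node that feeds, directly or not, into node_id."""
    seen = set() if seen is None else seen
    for value in graph[node_id]["inputs"].values():
        if is_link(value) and value[0] not in seen:
            seen.add(value[0])
            upstream(graph, value[0], seen)
    return seen


feeding_save = sorted(upstream(workflow, "7"))
for node_id in feeding_save:
    print(node_id, workflow[node_id]["class_type"])

workflow_json = json.dumps(workflow, indent=2)
```

Tracing back from Save Image reaches every other node, which is a quick check that nothing in the graph is left unconnected.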

Some Example Prompts

  • a breathtaking alpine valley at sunrise
  • a car on a dusty road
  • a cat on a skateboard
  • a bicycle in amsterdam
  • speeding through a city with bright lights. strobe effect
  • a person reading a newspaper
  • a portrait of a person, in the style of picasso
  • modern architectural buildings with clean lines, beautiful gardens with water features, situated on the edge of a cliff, overlooking the fjords

Workflow embedded in image

ComfyUI stores the workflow in the metadata of the images it saves. Using a compatible browser, you can drag this image into ComfyUI to load and run the same workflow that generated it.

ComfyUI_00001_.png
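
For the curious, the embedded workflow can also be read outside ComfyUI. The sketch below assumes the workflow is stored as JSON in a PNG text (tEXt) chunk under the keyword "workflow"; that keyword, the tiny stand-in image, and its contents are assumptions for illustration. It builds a 1x1 PNG in memory rather than reading ComfyUI_00001_.png, so it runs without any files or extra libraries.

```python
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """A PNG chunk is: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def build_demo_png(text_items):
    """Build a 1x1 greyscale PNG carrying the given keyword/text pairs."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    chunks = [make_chunk(b"IHDR", ihdr)]
    for key, value in text_items.items():
        body = key.encode("latin-1") + b"\x00" + value.encode("latin-1")
        chunks.append(make_chunk(b"tEXt", body))
    chunks.append(make_chunk(b"IDAT", zlib.compress(b"\x00\x00")))
    chunks.append(make_chunk(b"IEND", b""))
    return PNG_SIGNATURE + b"".join(chunks)


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in the PNG."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    pos = 8
    while pos < len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = data.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found


# Hypothetical, heavily shortened stand-in for a real embedded workflow.
demo_workflow = {"nodes": ["Load Checkpoint", "KSampler", "Save Image"]}
png_bytes = build_demo_png({"workflow": json.dumps(demo_workflow)})

text_chunks = read_text_chunks(png_bytes)
recovered = json.loads(text_chunks["workflow"])
print(recovered)
```

To try this on a real image, read the file's bytes and pass them to read_text_chunks, then check which keywords are actually present before assuming "workflow" is one of them.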