Addendum
Which Nvidia GPU supports which Data Type
Nvidia GPU | Data Types |
---|---|
50 series (Blackwell) | fp16, bf16, fp8, fp4 |
40 series (Ada) | fp16, bf16, fp8 |
30 series (Ampere) | fp16, bf16 |
20 series (Turing) | fp16 |
10 series (Pascal) and below | fp32 only (slow full precision) |
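If you are unsure which generation your card belongs to, PyTorch can report its compute capability. Below is a minimal sketch, assuming a CUDA build of PyTorch; the architecture mapping in the comments is a rough guide, not an official API.

```python
import torch

# Minimal sketch: report the GPU's compute capability and which
# reduced-precision data types it should support.
# Assumes a CUDA build of PyTorch is installed.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    print(f"{name}: compute capability {major}.{minor}")

    # Rough mapping: Pascal = 6.x, Turing = 7.5, Ampere = 8.0/8.6,
    # Ada = 8.9, Blackwell = 10.x and above.
    print("fp16 tensor cores:", (major, minor) >= (7, 0))
    print("bf16 supported:   ", torch.cuda.is_bf16_supported())
    print("fp8 (sm_89+):     ", (major, minor) >= (8, 9))
else:
    print("No CUDA device found; only CPU fp32 is available.")
```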
FP32 Version of Stable Diffusion 1.5 Pruned EMAOnly
Video: Using ComfyUI with a 10 Series Nvidia GPU
If you are using a 10 Series Nvidia GPU, modern generative AI will not be an enjoyable experience.
Many of the earlier lessons in this course use the Stable Diffusion 1.5 Pruned EMAOnly FP16 model.
If you have a 10 Series Nvidia card, your choice of models will be very limited, since many AI models are released only in FP16 format.
There is an FP32 version of SD1.5 Pruned EMAOnly that you can try instead.
It is twice as large as the FP16 version to download and load into memory, but may be faster to run if your 10 Series card has enough VRAM.
SD1.5 Pruned EMAOnly Version | File Size | Link |
---|---|---|
FP16 | 2.13 GB | Files and versions (huggingface) |
FP32 | 4.27 GB | Files and versions (huggingface) |
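If you only have the FP16 file, you can also up-cast it to FP32 yourself. Below is a minimal sketch, assuming the `safetensors` and `torch` packages; the output file name is hypothetical.

```python
import torch
from safetensors.torch import load_file, save_file

# Minimal sketch: up-cast an FP16 checkpoint to FP32.
# The result is roughly twice the size on disk and in memory.
src = "v1-5-pruned-emaonly-fp16.safetensors"   # file name from Useful Links below
dst = "v1-5-pruned-emaonly-fp32.safetensors"   # hypothetical output name

tensors = load_file(src)
tensors = {k: v.to(torch.float32) for k, v in tensors.items()}
save_file(tensors, dst)
```

Note that up-casting cannot restore precision that was lost when the weights were originally rounded to FP16, so downloading the official FP32 file is preferable when available.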
Why is FP16 not advisable on 10 Series GPUs
10 Series GPUs don't have native FP16 acceleration, so FP16 work has to be emulated, which adds conversion overhead on the GPU.
10 Series GPUs also lack Tensor Cores, the specialized hardware units designed to accelerate FP16 operations.
On 10 Series GPUs, FP16 operations therefore fall back to the standard CUDA cores, which are optimized for FP32.
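You can observe this fallback directly with a quick timing test. Below is a minimal sketch, assuming a CUDA build of PyTorch; on a 10 Series card the FP16 matmul will typically be no faster (and often slower) than FP32, whereas on cards with Tensor Cores it should be clearly faster.

```python
import time
import torch

def time_matmul(dtype, n=2048, iters=20):
    # Minimal timing sketch: average the wall time of an n x n matrix
    # multiply in the given dtype, after one warm-up call.
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    a @ b                                  # warm-up
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

print(f"fp32: {time_matmul(torch.float32) * 1000:.2f} ms per matmul")
print(f"fp16: {time_matmul(torch.float16) * 1000:.2f} ms per matmul")
```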
Which 10 Series GPUs can I use
GPU | VRAM | Notes |
---|---|---|
GTX 1050 / Ti | 2–4 GB | Might crash with FP32 due to VRAM limits. |
GTX 1060 3GB | 3 GB | Might crash with FP32 due to VRAM limits. |
GTX 1060 6GB | 6 GB | May just barely run FP32 SD1.5; FP16 helps memory, not speed. |
GTX 1070 | 8 GB | Can run FP32 model more comfortably. FP16 model saves memory but doesn’t improve speed. |
GTX 1080 | 8 GB | Can run FP32 model more comfortably. FP16 model saves memory but doesn’t improve speed. |
GTX 1080 Ti | 11 GB | Runs FP32 fine; FP16 likely slightly slower due to conversion overhead. |
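Before committing to the 4.27 GB FP32 download, you can confirm how much VRAM is actually free on your card. A minimal sketch, assuming a CUDA build of PyTorch:

```python
import torch

# Minimal sketch: report free and total VRAM on the first GPU, in GB.
free, total = torch.cuda.mem_get_info(0)
print(f"free : {free  / 1024**3:.1f} GB")
print(f"total: {total / 1024**3:.1f} GB")
```

Free VRAM is usually less than the total, since the desktop, browser, and other running applications also consume GPU memory.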
Recommended KSampler Latent Image Input Sizes
To get the best results from the KSampler with a particular checkpoint, it is important to match the input latent-image
dimensions to the resolution the checkpoint was trained on.
Below is a table of recommended latent-image input dimensions for some popular checkpoints; a short sketch of the latent tensor the KSampler actually receives follows the table.
Model | Training Image Resolution | Ideal KSampler Input Size | Approx. VRAM Required |
---|---|---|---|
SD 1.5 | 512×512 | 512×512 | ~4 GB |
SD 2.1 | 512×512 or 768×768 | 512×512 or 768×768 | ~6 GB |
SDXL | 1024×1024 | 1024×1024 (Other SDXL Sizes) | ~8–12 GB |
SD 3.5 | ~1024×1024 | 1024×1024 (dynamic sizes) | ~12–16 GB |
FLUX.1 Schnell | ~1024×1024 | 1024×1024 (dynamic sizes) | ~13–33 GB |
FLUX.1 Dev | ~1024×1024 | 1024×1024 (dynamic sizes) | ~23–24 GB (FP16) |
DreamShaper 8 | 512×512 (SD1.5 base) | 1024×1024 | ~4–5 GB |
AbsoluteReality | 512×512 (SD1.5 base) | 1024×1024 | ~8–12 GB |
DreamShaper XL | 1024×1024 (SDXL base) | 1024×1024 (Other SDXL Sizes) | ~10–14 GB |
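The sizes in the table above are pixel dimensions. Internally, ComfyUI's Empty Latent Image node hands the KSampler a tensor at one eighth of that resolution. A minimal sketch of the shape involved, assuming PyTorch; the 4-channel, 8× downscale layout applies to SD1.5, SD2.x, and SDXL, while SD 3.5 and FLUX use 16-channel latents:

```python
import torch

def empty_latent(width, height, batch=1):
    # Minimal sketch: SD1.5 / SD2.x / SDXL latents have 4 channels at
    # 1/8 of the pixel resolution (SD 3.5 and FLUX use 16 channels).
    return torch.zeros(batch, 4, height // 8, width // 8)

print(empty_latent(512, 512).shape)    # torch.Size([1, 4, 64, 64])
print(empty_latent(1216, 832).shape)   # torch.Size([1, 4, 104, 152])
```

This is also why image dimensions should divide cleanly by 8: the width and height must map onto a whole-numbered latent grid.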
Other SDXL Sizes
Dimension | Ratio |
---|---|
1024×1024 | 1:1 |
1152×896 | 9:7 |
896×1152 | 7:9 |
1216×832 | 19:13 |
832×1216 | 13:19 |
1344×768 | 7:4 |
768×1344 | 4:7 |
1536×640 | 12:5 |
640×1536 | 5:12 |
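All of the sizes above share two properties: each dimension is a multiple of 64, and the total pixel count stays close to SDXL's 1024×1024 training budget (about 1.05 megapixels). A minimal sketch that checks a candidate size against both rules; the 10% area tolerance is an assumption for illustration:

```python
from math import gcd

def check_sdxl_size(width, height):
    # Minimal sketch: SDXL sizes should be multiples of 64 and keep the
    # pixel count close to the 1024x1024 training budget.
    ok_multiple = width % 64 == 0 and height % 64 == 0
    ok_area = abs(width * height - 1024 * 1024) / (1024 * 1024) < 0.1
    g = gcd(width, height)
    return ok_multiple, ok_area, f"{width // g}:{height // g}"

print(check_sdxl_size(1216, 832))  # (True, True, '19:13')
```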
Useful Links
v1-5-pruned-emaonly-fp16.safetensors (huggingface)
v1-5-pruned-emaonly.safetensors FP32 (huggingface)
FLUX Schnell FP8 (huggingface)
What’s the Difference Between Single-, Double-, Multi- and Mixed-Precision Computing? (Nvidia Blog)