DiffusionBlocks title animation
Left: Standard Training updates all blocks together, with one backpropagation pass through every layer. Right: DiffusionBlocks samples one block per step; that block stays highlighted while the others dim, making clear that only one block is trained at a time. A continuous noise band runs alongside the network, with overlapping windows showing each block's noise range. The inset shows that the sampled block keeps the same input-to-output structure as in standard training, with noise added to its input.
[Figure labels. Left panel, "Standard Training": input, backprop, output; trains all blocks together. Right panel, "Block-wise Training" (our method, DiffusionBlocks; B× memory reduction): input, output; trains one block at a time; noise band running from small noise to large noise; "sample one block". Inset, "one training step, this block only": noisy input, output.]
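The block-wise step described in the figure (sample one block, draw a noise level from that block's window, add noise to the block's input) can be illustrated with a minimal sketch. It is not the authors' implementation. The names `noise_windows` and `sample_training_step`, the log-uniform spacing of the windows, the `overlap` fraction, and the choice to give block 0 the smallest noise are all assumptions made for illustration; the figure only states that the windows overlap and that one block is trained per step.

```python
import math
import random


def noise_windows(num_blocks, sigma_min, sigma_max, overlap=0.1):
    """Split [sigma_min, sigma_max] into overlapping per-block noise windows.

    Assumption: windows are equal-width in log-noise space and each is
    widened by `overlap` * width on both sides (clipped to the full range).
    """
    lo, hi = math.log(sigma_min), math.log(sigma_max)
    width = (hi - lo) / num_blocks
    windows = []
    for b in range(num_blocks):
        start = max(lo, lo + b * width - overlap * width)
        end = min(hi, lo + (b + 1) * width + overlap * width)
        windows.append((math.exp(start), math.exp(end)))
    return windows


def sample_training_step(x, windows, rng):
    """Pick one block and build its noisy input for a single training step.

    Returns the block index, the sampled noise level, and the noisy input.
    In a real setup, only the sampled block would be updated on this step.
    """
    b = rng.randrange(len(windows))  # sample one block uniformly
    s_lo, s_hi = windows[b]
    # Draw the noise level log-uniformly inside the block's window.
    sigma = math.exp(rng.uniform(math.log(s_lo), math.log(s_hi)))
    noisy = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return b, sigma, noisy


if __name__ == "__main__":
    rng = random.Random(0)
    windows = noise_windows(num_blocks=4, sigma_min=0.01, sigma_max=10.0)
    block, sigma, noisy = sample_training_step([0.0, 1.0, -1.0], windows, rng)
    print(f"train block {block} at noise level {sigma:.4f}: {noisy}")
```

Each call to `sample_training_step` corresponds to the "one training step, this block only" inset: the block index says which block would be trained, and the noisy vector stands in for that block's input.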