Figure: DiffusionBlocks title animation. Left: Standard Training updates all blocks together, with one backpropagation pass through every layer. Right: DiffusionBlocks samples one block per step; that block stays highlighted while the others dim, making clear that only one block is trained at a time. A continuous noise band, running from small noise to large noise, sits alongside the network, with overlapping windows showing each block's noise range. The inset shows that the sampled block keeps the same input-to-output structure as standard training, with noise added to its input and one training step applied to this block only.
Panel labels: Standard Training (trains all blocks together); Block-wise Training, our method, DiffusionBlocks (trains one block at a time, B× memory reduction).
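The sampling step the figure describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the log-uniform split of the noise band, the `sigma_min`/`sigma_max` defaults, and the `overlap` fraction are all hypothetical, and the figure does not say which end of the network receives the large-noise window.

```python
import math
import random

def block_noise_windows(num_blocks, sigma_min=0.002, sigma_max=80.0, overlap=0.1):
    """Split [sigma_min, sigma_max] into num_blocks windows, evenly in log space.

    Each window is widened on both sides by `overlap` (a fraction of one
    window's log-width), clipped to the ends of the band. All defaults are
    assumptions made for illustration.
    """
    lo, hi = math.log(sigma_min), math.log(sigma_max)
    width = (hi - lo) / num_blocks
    windows = []
    for b in range(num_blocks):
        start = max(lo, lo + b * width - overlap * width)
        end = min(hi, lo + (b + 1) * width + overlap * width)
        windows.append((math.exp(start), math.exp(end)))
    return windows

def sample_training_step(windows, rng):
    """Pick one block uniformly, then a noise level inside that block's window."""
    b = rng.randrange(len(windows))
    s_lo, s_hi = windows[b]
    sigma = math.exp(rng.uniform(math.log(s_lo), math.log(s_hi)))
    return b, sigma

def add_noise(x, sigma, rng):
    """Add Gaussian noise of standard deviation sigma to each element of x."""
    return [v + sigma * rng.gauss(0.0, 1.0) for v in x]

# One training step touches a single block: only block `b` would receive the
# noisy input and a gradient update; the other blocks are left untouched.
rng = random.Random(0)
windows = block_noise_windows(num_blocks=4)
b, sigma = sample_training_step(windows, rng)
noisy_input = add_noise([0.5, -1.0, 2.0], sigma, rng)
```

Because each step involves only the sampled block, activations and gradients for the other blocks never need to be held in memory, which is the intuition behind the B× memory reduction the figure mentions.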