In the companion letter in this issue (“Bayesian Spiking Neurons I: Inference”), we showed that the dynamics of spiking neurons can be interpreted as a form of Bayesian integration, accumulating evidence over time about events in the external world or the body. We proceed to develop a theory of Bayesian learning in spiking neural networks, where the neurons learn to recognize temporal dynamics of their synaptic inputs. Meanwhile, successive layers of neurons learn hierarchical causal models for the sensory input. The corresponding learning rule is local, spike-time dependent, and highly nonlinear. This approach provides a principled description of spiking and plasticity rules maximizing information transfer, while limiting the number of costly spikes, between successive layers of neurons.