Consider a self-motivated artificial agent who is exploring a complex environment. Part of the complexity is due to the raw high-dimensional sensory input streams, which the agent needs to make sense of. Such inputs can be compactly encoded through a variety of means; one of these is slow feature analysis (SFA). Slow features encode spatiotemporal regularities, which are information-rich explanatory factors (latent variables) underlying the high-dimensional input streams. In our previous work, we have shown how slow features can be learned incrementally, while the agent explores its world, and modularly, such that different sets of features are learned for different parts of the environment (since a single set of regularities does not explain everything). In what order should the agent explore the different parts of the environment? Following Schmidhuber’s theory of artificial curiosity, the agent should always concentrate on the area where it can learn the easiest-to-learn set of features that it has not already learned. We formalize this learning problem and theoretically show that, using our model, called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions. We provide experimental results to support the theoretical analysis.