Many cognitive processes rely on the ability of the brain to hold sequences of events in short-term memory. Recent studies have revealed that such memory can be read out from the transient dynamics of a network of neurons. However, the memory performance of such a network in buffering past information has been rigorously estimated only in networks of linear neurons. When signal gain is kept low, so that neurons operate primarily in the linear part of their response nonlinearity, the memory lifetime is bounded by the square root of the network size. In this work, I demonstrate that it is possible to achieve a memory lifetime almost proportional to the network size, “an extensive memory lifetime,” when the nonlinearity of neurons is appropriately used. The analysis of neural activity revealed that nonlinear dynamics prevented the accumulation of noise by partially removing noise in each time step. With this error-correcting mechanism, I demonstrate that a memory lifetime of order can be achieved.