Spiking neural networks (SNNs) with the event-driven manner of transmitting spikes consume ultra-low power on neuromorphic chips. However, training deep SNNs is still challenging compared to convolutional neural networks (CNNs). The SNN training algorithms have not achieved the same performance as CNNs. In this letter, we aim to understand the intrinsic limitations of SNN training to design better algorithms. First, the pros and cons of typical SNN training algorithms are analyzed. Then it is found that the spatiotemporal backpropagation algorithm (STBP) has potential in training deep SNNs due to its simplicity and fast convergence. Later, the main bottlenecks of the STBP algorithm are analyzed, and three conditions for training deep SNNs with the STBP algorithm are derived. By analyzing the connection between CNNs and SNNs, we propose a weight initialization algorithm to satisfy the three conditions. Moreover, we propose an error minimization method and a modified loss function to further improve the training performance. Experimental results show that the proposed method achieves 91.53% accuracy on the CIFAR10 data set with 1% accuracy increase over the STBP algorithm and decreases the training epochs on the MNIST data set to 15 epochs (over 13 times speed-up compared to the STBP algorithm). The proposed method also decreases classification latency by over 25 times compared to the CNN-SNN conversion algorithms. In addition, the proposed method works robustly for very deep SNNs, while the STBP algorithm fails in a 19-layer SNN.