The majority of recently demonstrated Deep-Learning Side-Channel Attacks (DLSCAs) use neural networks trained on a segment of traces containing operations only related to the target subkey. However, when the number of training traces are restricted such as in this paper only 5K power traces, deep-learning models always suffer from underfitting since the insufficient training data. One data-level solution is called data augmentation, which is to use the additional synthetically modified traces to act as a regularizer to provide a better generalization capacity for deep-learning models. In this paper, we propose a cross-subkey training approach which acts as a trace augmentation. We train deep-learning models not only on a segment of traces containing the SBox operation of the target subkey of AES-128, but also on segments for other 15 subkeys. Experimental results show that the accuracy of the subkey combination training model is 28.20% higher than that of the individual subkey training model on trajectories captured in the microcontroller implementation of the STM32F3 with AES-128. At the same time, the number of traces that need to be captured when the model is trained is greatly reduced, demonstrating the effectiveness and practicality of the method.