Discriminative feature extraction and fault diagnosis of rolling element bearings are essential to ensuring the reliability of rotating machines. In this paper, we propose multi-scale wavelet Shannon entropy as a discriminative fault feature to improve the accuracy of bearing fault diagnosis under variable operating conditions. To compute the multi-scale wavelet entropy, we integrate the stationary wavelet packet transform with dispersion entropy (SWPDE) and with permutation entropy (SWPPE). The resulting multi-scale entropy features are passed to a kernel extreme learning machine (KELM) classifier to diagnose bearing fault types at different severity levels. We evaluate both the SWPDE-KELM and SWPPE-KELM methods on two bearing vibration signal databases and compare them with a recently proposed method, stationary wavelet packet singular value entropy (SWPSVE). The results show that the diagnosis accuracy of SWPDE-KELM is slightly higher than that of SWPPE-KELM, and that both significantly outperform SWPSVE-KELM.
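To make the entropy step concrete, the following is a minimal sketch of normalized permutation entropy applied to a set of sub-band signals. It is an illustration under stated assumptions, not the implementation used in this paper. The function names `permutation_entropy` and `entropy_features`, the defaults `order=3` and `delay=1`, and the tie-breaking rule are all hypothetical choices. The sub-bands are assumed to come from a stationary wavelet packet decomposition computed elsewhere, which is not shown here.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in `signal`.

    `order` (embedding dimension) and `delay` are illustrative defaults,
    not values taken from the paper.
    """
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")

    counts = Counter()
    for i in range(n_windows):
        window = [signal[i + j * delay] for j in range(order)]
        # Ordinal pattern: the index order that sorts the window.
        # sorted() is stable, so ties are broken by position (an assumption).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1

    # Shannon entropy of the relative pattern frequencies.
    entropy = sum(-(c / n_windows) * math.log(c / n_windows)
                  for c in counts.values())
    if normalize:
        # Divide by the maximum possible entropy, log(order!).
        entropy /= math.log(math.factorial(order))
    return entropy


def entropy_features(subbands, order=3, delay=1):
    """One entropy value per sub-band, forming a multi-scale feature vector.

    `subbands` is assumed to be a list of equal-length coefficient sequences
    produced by a stationary wavelet packet transform.
    """
    return [permutation_entropy(band, order, delay) for band in subbands]
```

A feature vector built this way has one entry per sub-band and could then be passed to any classifier. A monotonic sub-band yields an entropy of 0, while a noise-like sub-band yields a value close to 1.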