
Multi-Layer Neural Network Auto Encoders Learning Method, using Regularization for Invariant Image Recognition


  • PAWLIN Technologies Ltd., Dubna, Moscow Region, Russian Federation


Background/Objectives: This paper proposes a new type of regularization for deep neural networks that explicitly separates the lower-dimensional hidden-layer representation of an input pattern into two components: a class information component and a transform component.

Methods: Researchers working on pattern recognition are actively seeking to replace deterministic feature extraction algorithms with unsupervised methods that generate optimal domain-specific image features while training auto-associative multilayer neural networks. Training a deep neural network with a “bottleneck” hidden layer yields a task-oriented encoder capable of efficient dimensionality reduction of the input signal.

Findings: Many useful properties of the encoder, including the degree to which feature extraction is invariant to input signal transformations (perturbations), depend strongly on the particular form of regularization applied. In addition to the usual weight-decay smoothing component, the suggested regularization has two further components: the first minimizes the spread of the class-describing features under different pattern transforms, and the second minimizes the spread of the transform-describing features for objects that have the same perturbation but belong to different classes. Class-membership information from the training sequence is used, along with an introduced estimator of pattern-transform similarity, to compute the regularization terms. The research shows that a special case of the suggested regularization corresponds to the well-known Frobenius norm of the Jacobian matrix of the encoder activations; the contribution of this paper can therefore be seen as a non-local extension of the encoder Jacobian-based family of deep neural network regularizers, one that embeds invariance to non-local input pattern transformations into the feature extraction pipeline. Experiments on synthetic and real pattern datasets show promising results and encourage further investigation of the proposed approach.

Improvements/Applications: This method can be used for aerial image recognition that is invariant to lighting, weather and orientation, for example for recognizing vehicles and other landmarks in images obtained by unmanned aerial vehicles (UAVs).
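The three-part regularization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single sigmoid encoder layer, a hidden code whose first `n_class_units` units carry class information and whose remaining units carry transform information, and discrete transform identifiers standing in for the paper's transform-similarity estimator. The function names, the pairwise squared-distance measure of "spread", and the weights `lam_w`, `lam_c`, `lam_t` are all hypothetical.

```python
import numpy as np

def encode(x, W, b):
    """Sigmoid encoder layer: maps inputs (n, d) to hidden codes (n, k)."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def mean_pair_spread(h, mask):
    """Mean squared distance between code rows i and j over pairs where mask[i, j] is True."""
    diff = h[:, None, :] - h[None, :, :]      # (n, n, k) pairwise differences
    sq = np.sum(diff ** 2, axis=-1)           # (n, n) squared distances
    n_pairs = mask.sum()
    return float(sq[mask].sum() / n_pairs) if n_pairs > 0 else 0.0

def regularizer(x, class_ids, transform_ids, W, b, n_class_units,
                lam_w=1e-4, lam_c=1.0, lam_t=1.0):
    """Weight decay + class-feature spread + transform-feature spread (illustrative)."""
    h = encode(x, W, b)
    # Assumed split of the bottleneck code into class and transform parts.
    h_class, h_transform = h[:, :n_class_units], h[:, n_class_units:]
    same_class = class_ids[:, None] == class_ids[None, :]
    same_transform = transform_ids[:, None] == transform_ids[None, :]
    # Class features should agree for the same class under different transforms.
    class_term = mean_pair_spread(h_class, same_class & ~same_transform)
    # Transform features should agree for the same transform across different classes.
    transform_term = mean_pair_spread(h_transform, same_transform & ~same_class)
    decay = float(np.sum(W ** 2))
    return lam_w * decay + lam_c * class_term + lam_t * transform_term

# Example: four samples, two classes, two transforms, a 4-unit code split 2 + 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
class_ids = np.array([0, 0, 1, 1])
transform_ids = np.array([0, 1, 0, 1])
W, b = rng.normal(size=(3, 4)), np.zeros(4)
print(regularizer(x, class_ids, transform_ids, W, b, n_class_units=2))
```

In training, a term like this would be added to the auto-encoder's reconstruction loss. Because the pairs compared need not be close to each other in input space, the penalty is non-local, in contrast to a Jacobian-norm penalty, which only constrains the encoder's response to infinitesimal perturbations around each sample.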


Keywords: Auto Encoder, Deep Neural Networks, Invariant Image Recognition, Regularization.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.