Improved semi-supervised autoencoder for deception detection

Authors: Hongliang Fu aff001; Peizhi Lei aff001; Huawei Tao aff001; Li Zhao aff002; Jing Yang aff001
Author affiliations: School of Information Science and Engineering, Henan University of Technology, Zhengzhou, China aff001; Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, China aff002
Published in: PLoS ONE 14(10)
Category: Research Article
DOI: 10.1371/journal.pone.0223361


Existing speech-based deception detection algorithms are severely restricted by the lack of sufficient labelled data, while the large amount of easily available unlabelled data has so far gone unused. To solve this problem, this paper proposes a semi-supervised additive noise autoencoder model for deception detection. The model updates and optimizes the semi-supervised autoencoder and consists of two layers of encoder and decoder, together with a classifier. First, it changes the activation function of the network's hidden layers according to the characteristics of deceptive speech. Second, to prevent over-fitting during training, dropout with a specific ratio is carefully added at each layer. Finally, the supervised classification task is connected directly to the output of the encoder, which makes the network more concise and efficient. Using the feature set specified by the INTERSPEECH 2009 Emotion Challenge, experimental results on the Columbia-SRI-Colorado (CSC) corpus and our own deception corpus show that the proposed model outperforms the alternative methods while using only a small amount of labelled data.
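
The architecture described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It is not the authors' implementation. The abstract does not state the activation function, the noise level, the dropout ratio, the layer sizes, or how the two losses are weighted, so the ELU activation, `sigma`, `rate`, `alpha`, the toy dimensions, and every class and function name below are assumptions chosen only for illustration. The sketch also reads "two layers of encoder and decoder" as two encoder layers and two decoder layers, and it omits the training loop.

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    w = [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    w, b = layer
    return [activation(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def elu(z):
    # Assumed hidden activation; the abstract does not name the one used.
    return z if z > 0 else math.exp(z) - 1.0

def identity(z):
    return z

def add_noise(x, sigma):
    """Additive Gaussian noise on the input features."""
    return [xi + random.gauss(0.0, sigma) for xi in x]

def dropout(h, rate, training):
    """Inverted dropout: active only during training."""
    if not training or rate == 0.0:
        return h
    keep = 1.0 - rate
    return [hi / keep if random.random() < keep else 0.0 for hi in h]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class SemiSupervisedAE:
    """Hypothetical two-layer encoder/decoder with a classifier on the code."""

    def __init__(self, n_in, n_h1, n_h2, n_classes, sigma=0.3, rate=0.2):
        self.enc1 = init_layer(n_in, n_h1)
        self.enc2 = init_layer(n_h1, n_h2)
        self.dec1 = init_layer(n_h2, n_h1)
        self.dec2 = init_layer(n_h1, n_in)
        self.clf = init_layer(n_h2, n_classes)  # attached to encoder output
        self.sigma, self.rate = sigma, rate

    def forward(self, x, training=True):
        noisy = add_noise(x, self.sigma) if training else x
        h1 = dropout(dense(noisy, self.enc1, elu), self.rate, training)
        code = dropout(dense(h1, self.enc2, elu), self.rate, training)
        recon = dense(dense(code, self.dec1, elu), self.dec2, identity)
        probs = softmax(dense(code, self.clf, identity))
        return recon, probs

    def loss(self, x, label=None, alpha=1.0, training=True):
        """Reconstruction loss for every sample; cross-entropy added if labelled."""
        recon, probs = self.forward(x, training)
        rec = sum((r - xi) ** 2 for r, xi in zip(recon, x)) / len(x)
        if label is None:
            return rec
        return rec - alpha * math.log(probs[label])
```

In this sketch, an unlabelled utterance contributes only the reconstruction term (`model.loss(x)`), while a labelled one also contributes the classification term (`model.loss(x, label=1)`), which is how both kinds of data could drive the same encoder.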

Keywords:

Deception – Emotions – Games – Human learning – Learning – Neural networks – Speech – Support vector machines



Article published in the journal


2019, Issue 10