User abnormal behavior recommendation via multilayer network

Authors: Chengyun Song ^aff001; Weiyi Liu ^aff002; Zhining Liu ^aff003; Xiaoyang Liu ^aff001
Author affiliations: School of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China ^aff001; JD Urban Computing Business Unit, Chengdu, China ^aff002; School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China ^aff003
Published in: PLoS ONE 14(12)
Category: Research Article
DOI: https://doi.org/10.1371/journal.pone.0224684

Abstract

With the growing popularity of online services such as online banking and online shopping, building a privacy-preserving recommendation system for abnormal user behavior has become an essential research topic. However, a machine-learning-based system faces a dilemma. On the one hand, such a system requires a large volume of features to pre-train the model; on the other hand, it is challenging to design usable features without looking at plaintext private data. In this paper, we propose an unorthodox approach involving graph analysis to resolve this dilemma and build a novel privacy-preserving recommendation system under a multilayer network framework. In experiments, we use a large, state-of-the-art dataset (containing more than 40,000 nodes and 43 million encrypted features) to evaluate how well our system recommends abnormal user behavior, yielding an overall precision of around 0.9, a recall of 1.0, and an F1-score of around 0.94. We also report a linear time complexity for our system. Finally, we deploy our system on the “Wenjuanxing” crowd-sourced system and “Amazon Mechanical Turk” for other users to evaluate in all aspects. The results show that almost all feedback reached up to 85% satisfaction.
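
The three evaluation figures quoted above are linked by the standard definition of the F1-score as the harmonic mean of precision and recall. The following is a minimal sketch of that relationship only, not the authors' evaluation code; the function name `f1_score` is hypothetical, and the inputs are the rounded values from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Convention assumed here: F1 is 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures from the abstract: precision of around 0.9, recall of 1.0.
f1 = f1_score(0.9, 1.0)
print(round(f1, 3))
```

With these rounded inputs the result is roughly 0.95, which is in the same range as the reported F1-score of around 0.94; the small gap is expected because the abstract gives precision only approximately.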

Keywords:

Network analysis – Algorithms – Internet – Machine learning algorithms – Machine learning – Computer architecture – Random walk – Vector spaces
