Open Education

Building an Associative Classification Data Model Based on the Apriori Method

https://doi.org/10.21686/1818-4243-2020-4-4-12

Abstract

The purpose of the work is to examine the current problems and prospects of mining big web data in real time, and to assess whether Web Mining technology can be applied to big web data in practice, using a worked example.
Materials and methods. The study included a review of the literature on big data mining. Web Mining technology was used for associative analysis of big web data, and the practical task of transaction analysis was modeled on a computer in a general-purpose scripting language (PHP).
Results. The work describes the specifics of Data Mining technology and analyzes Web Mining as a modern approach to the analysis of big web data. A brief classification of the tasks solved with Web Mining technology is given. The problems of mining big web data in a general-purpose scripting language (PHP) were addressed: the lack of data mining libraries, the difficulty of normalizing data into the form required for associative analysis, and interaction with the database management system. An example demonstrating an approach to mining big web data was implemented. Based on this understanding of Web Mining technology and the described difficulties of analyzing web data in PHP, methods were proposed for effectively solving the practical problem of analyzing web data from transactions made in a dynamic web application. A module for associative analysis of customer transactions was developed in PHP. The module includes an intelligent data processing class. The structural scheme of the module and the system architecture were developed. The module solves the main part of the problem of associative analysis of big web data with Web Mining technology, namely identifying patterns in a large array of web data. Combining a general-purpose scripting language with an object-oriented approach makes associative analysis of web data much faster.
Conclusion. The results of the study suggest that the current state of technology for analyzing big web data makes it possible to process data objects efficiently, identify patterns, extract hidden data, and obtain complete statistics in real time. The results can be used both for an initial study of technologies for analyzing big web data and as an add-on to a content management system for intelligent analysis of web data. The use of associative analysis and the universal handler class makes the module flexible, while the option of manual integration makes it universal. With manual integration, the choice of database management system does not matter, because the algorithm's methods operate on data that has already been selected. This greatly simplifies further development of the code.
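
The associative analysis of transactions mentioned in the abstract can be illustrated with a minimal sketch of the Apriori method named in the title. The sketch is written in Python rather than PHP and is not the authors' module. The function names `apriori` and `association_rules`, the sample baskets, and both thresholds are assumptions chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Share of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return a list of (antecedent, consequent, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical customer baskets, not data from the study.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.9)
```

With these four baskets and the thresholds shown, five itemsets are frequent and the only rule that passes the confidence threshold is butter → bread, with confidence 1.0. The sketch rescans all transactions for each candidate, so it demonstrates the method rather than a design suited to large web data.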

About the Authors

K. V. Mulyukova
Engineering and Technological Academy of the Southern Federal University
Russian Federation

Ksenia V. Mulyukova - Postgraduate student

Taganrog



V. M. Kureichik
Engineering and Technological Academy of the Southern Federal University

Victor M. Kureichik - Dr. Sci. (Technical sciences), Professor

Taganrog




For citations:


Mulyukova K.V., Kureichik V.M. Building an Associative Classification Data Model Based on the Apriori Method. Open Education. 2020;24(4):4-12. (In Russ.) https://doi.org/10.21686/1818-4243-2020-4-4-12


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-4243 (Print)
ISSN 2079-5939 (Online)