Open Education

Software Implementation of the Epps-Pulley Criterion in Matlab Modeling Environment

https://doi.org/10.21686/1818-4243-2024-2-59-72

Abstract

Purpose. Modeling systems and programming platforms provide ample opportunities for using statistical tools in research. Because the normal distribution is one of the most common distribution laws, tests of sample normality are in high demand among statistical assessment tools, and the Epps-Pulley test is regarded as one of the most powerful tests for detecting departure from normality. Several implementations of this test exist in R and Python. However, the test is not implemented in Matlab, one of the most popular modeling environments. The purpose of this study is therefore to develop a software implementation of the Epps-Pulley criterion in the Matlab environment and to verify the correctness of its calculations.

Materials and Methods. We implemented the calculation of the Epps-Pulley statistic by two methods: a classical one, using loops, and a matrix-vector one, using linear algebra operations. The classical method computes the intermediate values needed for the criterion statistic in two independent loops, the second of which is a double loop with one loop nested inside the other. The matrix-vector method requires less code because it performs the calculations with linear algebra operations on matrices and vectors. We obtained critical values of the statistic for sample sizes from 8 to 1000 observations by two-dimensional linear interpolation of tabulated values. For samples of more than 1000 elements, we used an approximation by a beta distribution of the third kind.
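The two calculation styles described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' Matlab code: it assumes the form of the statistic given in ISO 5479, T = 1 + n/sqrt(3) + B - A, and the function names are hypothetical. The second function stands in for the matrix-vector idea by summing over the full n-by-n matrix of pairwise terms instead of only the pairs below the diagonal.

```python
import math


def epps_pulley_loops(x):
    """Epps-Pulley statistic via explicit loops (form assumed from ISO 5479)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # biased second central moment
    if m2 == 0.0:
        raise ValueError("sample has zero variance")

    # First loop: a single sum over the observations.
    a = 0.0
    for v in x:
        a += math.exp(-((v - mean) ** 2) / (4.0 * m2))
    a *= math.sqrt(2.0)

    # Second (double) loop: sum over the pairs j < k only.
    b = 0.0
    for k in range(1, n):
        for j in range(k):
            b += math.exp(-((x[j] - x[k]) ** 2) / (2.0 * m2))
    b *= 2.0 / n

    return 1.0 + n / math.sqrt(3.0) + b - a


def epps_pulley_full_matrix(x):
    """Same statistic, summing every element of the pairwise matrix."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    if m2 == 0.0:
        raise ValueError("sample has zero variance")

    a = math.sqrt(2.0) * sum(
        math.exp(-((v - mean) ** 2) / (4.0 * m2)) for v in x
    )
    # All n*n pairwise terms, including the diagonal and both triangles.
    full = sum(math.exp(-((u - v) ** 2) / (2.0 * m2)) for u in x for v in x)
    # The matrix is symmetric with a unit diagonal, so the sum over j < k
    # equals (full - n) / 2, and B = (2 / n) * (full - n) / 2.
    b = (full - n) / n

    return 1.0 + n / math.sqrt(3.0) + b - a
```

Both functions return the same value up to rounding. The full-matrix variant evaluates roughly twice as many exponentials as the pairwise loop, which illustrates (but does not by itself quantify) the redundant work a matrix formulation can incur.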

Results. An assessment of computational efficiency showed that the loop-based approach is about three times faster than the matrix-vector approach, presumably because the matrix-vector approach processes redundant elements of triangular matrices when performing element-wise operations. The correctness of the software implementation of the Epps-Pulley criterion was tested on several examples, which confirmed that the calculated values of the criterion statistic and the critical values agree with known data. We evaluated the criterion statistically using empirical type I error rates, which corresponded to the specified significance levels. We compared the Epps-Pulley test with the Anderson-Darling and Shapiro-Wilk tests in terms of empirical power and tabulated the results. We published the software implementation of the Epps-Pulley test on MATLAB Central for free use.
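An empirical type I error check of the kind mentioned above can be sketched as a small Monte Carlo experiment. The sketch below is in Python and rests on several assumptions: the statistic follows the ISO 5479 form, the critical value is a simulated quantile standing in for the tabulated values used in the study, and the function names, sample size, replication count, and seed are all hypothetical.

```python
import math
import random


def epps_pulley(x):
    """Epps-Pulley statistic (form assumed from ISO 5479)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    a = math.sqrt(2.0) * sum(
        math.exp(-((v - mean) ** 2) / (4.0 * m2)) for v in x
    )
    b = (2.0 / n) * sum(
        math.exp(-((x[j] - x[k]) ** 2) / (2.0 * m2))
        for k in range(1, n)
        for j in range(k)
    )
    return 1.0 + n / math.sqrt(3.0) + b - a


def simulate_statistics(n, reps, rng):
    """Statistic values for `reps` standard normal samples of size n."""
    return [
        epps_pulley([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def empirical_type1_error(n=20, alpha=0.05, reps=2000, seed=1):
    """Return (critical value, share of false rejections) under normality."""
    rng = random.Random(seed)
    # Stand-in critical value: upper (1 - alpha) quantile of simulated
    # null statistics, NOT the tabulated value from the standard.
    null_stats = sorted(simulate_statistics(n, reps, rng))
    crit = null_stats[int(math.ceil((1.0 - alpha) * reps)) - 1]
    # Independent normal samples: normality is rejected when T > crit.
    fresh = simulate_statistics(n, reps, rng)
    rate = sum(t > crit for t in fresh) / reps
    return crit, rate
```

If the implementation is consistent, the returned rejection rate should lie close to the chosen significance level, within Monte Carlo error. Replacing the simulated quantile with a tabulated critical value would turn the same loop into a check of the table itself.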

Conclusion. We developed a software implementation of the Epps-Pulley criterion as a new research tool that was previously unavailable in the Matlab modeling environment. We used measured computation time to make an informed choice of the algorithm for calculating the criterion statistic. We confirmed the correctness of the calculation algorithms through a set of spot checks and statistical estimates that agreed with well-known theoretical results.

About the Authors

A. A. Tipikin
Military Training and Scientific Center of the Navy «Naval Academy named after Admiral of the Fleet of the Soviet Union N.G. Kuznetsov»
Russian Federation

Alexey A. Tipikin - Head of Department

St. Petersburg



A. A. Prusakov
Military Training and Scientific Center of the Navy «Naval Academy named after Admiral of the Fleet of the Soviet Union N.G. Kuznetsov»
Russian Federation

Alexander A. Prusakov - Senior Researcher

St. Petersburg



N. A. Timoshenko
Military Training and Scientific Center of the Navy «Naval Academy named after Admiral of the Fleet of the Soviet Union N.G. Kuznetsov»
Russian Federation

Nikolay A. Timoshenko - Junior research assistant

St. Petersburg




For citations:


Tipikin A.A., Prusakov A.A., Timoshenko N.A. Software Implementation of the Epps-Pulley Criterion in Matlab Modeling Environment. Open Education. 2024;28(2):59-72. (In Russ.) https://doi.org/10.21686/1818-4243-2024-2-59-72


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-4243 (Print)
ISSN 2079-5939 (Online)