A COMPARATIVE STUDY OF DIMENSIONALITY REDUCTION TECHNIQUES FOR HIGH-DIMENSIONAL STATISTICAL DATA
- Authors
Dhuha Salim Waheed
AL-Furat AL-Awsat Technical University, Al-Qadisiyah Polytechnic College, Iraq
Zahraa Saad Jasim
AL-Furat AL-Awsat Technical University, Al-Qadisiyah Polytechnic College, Iraq
Mohammed Guraibawi
AL-Furat AL-Awsat Technical University, Al-Qadisiyah Polytechnic College, Iraq
- Keywords:
- Dimensionality Reduction, PCA, t-SNE, UMAP, High-Dimensional Data, Visualization, Clustering.
- Abstract
Dimensionality reduction is a necessary preprocessing step for analyzing large data sets with many variables: it makes data structure easier to visualize, reduces computational complexity, and mitigates the curse of dimensionality. This study compares three popular techniques for reducing the dimensionality of high-dimensional data: Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). The data are derived from the classic Iris dataset, augmented with 50 additional random features. PCA yields a linear projection that retains maximum variance, whereas t-SNE and UMAP yield non-linear embeddings that can capture both local and global structure. Our experiments show that all three methods preserve the underlying class structure, with t-SNE and UMAP producing more sharply separated clusters. Silhouette analysis confirms the quality of the clusters. These results indicate a trade-off between linear and non-linear methods for reducing the dimensionality of high-dimensional data.
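The silhouette analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a small synthetic three-class dataset as a stand-in for Iris, and every name in it (`silhouette_score`, `make_toy_data`, the cluster centers, the noise scale) is hypothetical. It runs none of PCA, t-SNE, or UMAP; it only shows how the mean silhouette coefficient s(i) = (b - a) / max(a, b) is computed and how appending 50 random features blurs cluster separation, which is the situation dimensionality reduction is meant to address.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient, s(i) = (b - a) / max(a, b).

    a: mean distance from point i to the other points in its own cluster.
    b: smallest mean distance from point i to the points of any other cluster.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def make_toy_data(n_per_class=20, n_noise=50, seed=0):
    """Hypothetical stand-in for the augmented Iris data: three Gaussian
    classes in 2 informative dimensions plus n_noise pure-noise features."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]  # assumed class centers
    informative, augmented, labels = [], [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            point = [rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            informative.append(point)
            augmented.append(point + noise)
            labels.append(label)
    return informative, augmented, labels


informative, augmented, labels = make_toy_data()
score_clean = silhouette_score(informative, labels)
score_noisy = silhouette_score(augmented, labels)
print(f"silhouette, 2 informative features: {score_clean:.3f}")
print(f"silhouette, plus 50 noise features: {score_noisy:.3f}")
```

In this sketch the score is computed against the known class labels. Applying the same function to a low-dimensional embedding of the augmented data would be one way to compare how well different reduction methods recover the class structure.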
- Published
- 2026-02-18
- Issue
- Vol. 2 No. 2 (2026)
- Section
- Articles
- License
This work is licensed under a Creative Commons Attribution 4.0 International License.








