1. Clustering Disjoint HJ-Biplot: A new tool for identifying pollution patterns in geochemical studies
- Author
M.P. Vicente-Galindo, A.B. Nieto-Librero, M.P. Galindo-Villardón, Omar Ruiz-Barzola, and C. Sierra
- Subjects
Factorial; Geologic Sediments; Environmental Engineering; Biplot; Computer science; Health, Toxicology and Mutagenesis; Environmental pollution; Context (language use); Disjoint sets; 010501 environmental sciences; computer.software_genre; 01 natural sciences; Mining; 010104 statistics & probability; Rivers; Environmental Chemistry; Cluster Analysis; 0101 mathematics; Cluster analysis; 0105 earth and related environmental sciences; Principal Component Analysis; Public Health, Environmental and Occupational Health; Environmental engineering; General Medicine; General Chemistry; Models, Theoretical; Data structure; Pollution; Principal component analysis; Data mining; Ecuador; computer; Algorithms; Water Pollutants, Chemical; Environmental Monitoring
- Abstract
This paper introduces a new mathematical algorithm termed Clustering Disjoint HJ-Biplot (CDBiplot), which searches for the underlying data structure in order to find the best classification of the object groups in a reduced space. To this end, disjoint factorial axes are generated, in which each variable only contributes to the formation of one factorial axis. A graphical representation of the individuals and variables is performed using the HJ-Biplot method. In order to facilitate the use of this new method within any practical context, a function in the language R has been developed. This work applies the CDBiplot to study an environmental geochemistry case involving environmental pollution in river surface sediments. The study focuses on an area close to an important mining and metallurgical site, where sediments share a similar geological origin and chemical composition. The algorithm permitted a detailed study of the geochemical interactions and performed an excellent separation of the samples. Thus, the groups obtained were formed according to a similar geological origin, location, and nature of the anthropogenic inputs based only on chemical composition data. These results allowed clear identification of the sources of pollution and the delimitation of the polluted zones. All things considered, we conclude that the proposed algorithm is a powerful tool for studying environmental geochemistry data sets, and suggest that the application of this methodology be extended to other research fields.
- Published
2016
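- Illustrative sketch
The abstract states that individuals and variables are displayed together using the HJ-Biplot method. As a rough illustration of that representation step only, the Python sketch below computes HJ-Biplot markers from a singular value decomposition, placing both rows and columns in principal coordinates. It is not the authors' R function and does not implement the disjoint-axis or clustering steps of CDBiplot. The function name `hj_biplot`, the choice to standardize the columns, and the random example matrix are all assumptions made for illustration.

```python
import numpy as np


def hj_biplot(X, n_axes=2):
    """Return HJ-Biplot row markers, column markers, and explained inertia.

    Hypothetical helper for illustration; not the CDBiplot algorithm.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: variables are column-centred and scaled to unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Thin SVD: Z = U diag(d) Vt
    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    total_inertia = np.sum(d ** 2)
    U, d, Vt = U[:, :n_axes], d[:n_axes], Vt[:n_axes, :]
    rows = U * d        # individuals in principal coordinates (U D)
    cols = Vt.T * d     # variables in principal coordinates (V D)
    explained = d ** 2 / total_inertia
    return rows, cols, explained


# Made-up data: 12 samples measured on 5 variables (not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
rows, cols, explained = hj_biplot(X)
print(rows.shape, cols.shape)
```

Because both sets of markers are scaled by the singular values, the rows and columns have the same spread along each retained axis, which is what allows them to be read on a common plot.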