4 results on '"Nascimento, Márcia H.C."'
Search Results
2. Different strategies for the use of random forest in NMR spectra.
- Author
-
Lovatti, Betina P.O., Nascimento, Márcia H.C., Rainha, Karla P., Oliveira, Emanuele C.S., Neto, Álvaro C., Castro, Eustáquio V.R., and Filgueiras, Paulo R.
- Subjects
-
*RANDOM forest algorithms, *PRINCIPAL components analysis, *DISCRIMINANT analysis, *NUCLEAR magnetic resonance, *PETROLEUM, *PETROLEUM chemicals
- Abstract
Nuclear magnetic resonance (NMR) can provide a large amount of information about an analyzed sample; however, its spectra contain more than 6000 variables, which makes random forest (RF) applications difficult. Reducing the size of the original dataset can minimize this problem. In this paper, we compared RF classification models obtained with the full NMR spectral range and with a reduced set of NMR variables, using principal component analysis (PCA) and the Fisher discriminant (FD). Then, the variables used in the construction of RF trees were analyzed and identified. Here, we used 1H and 13C NMR spectra obtained from 126 petroleum samples and values of their total acid number (TAN), as measured by ASTM D664, ranging from 0.03 to 4.96 mg KOH·g−1, to distinguish the oil samples by their TAN values. Of the two resulting classes, the first contained 78 samples with TAN values less than or equal to 0.3 mg KOH·g−1, while the second contained 48 samples with TAN values higher than 0.3 mg KOH·g−1. The 1H NMR results showed that the combination of FD and RF techniques provided the best accuracy (88%). For 13C NMR data, the most accurate model was obtained by the association of PCA and RF (84%). The identification of variables used in RF allowed a better understanding of the important chemical data contained in the spectra and the relationship to TAN in petroleum. The use of NMR combined with the random forest method is effective for discriminating crude oil samples by total acid number. When combined with methods of variable selection, such as principal component analysis and the Fisher discriminant in the present paper, results reached 88% accuracy for 1H NMR data and 84% for 13C NMR data in less time. The variables indicated as most important show a chemical relationship with the modeled variable. The aromatic region made a somewhat greater contribution compared with the paraffinic region. [ABSTRACT FROM AUTHOR]
- Published
- 2020
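The class split and Fisher-discriminant variable reduction described in the abstract above can be sketched roughly as follows. This is only an illustration: the abstract does not give the exact FD formulation, so the per-variable Fisher ratio used here (squared difference of class means over the sum of class variances) is an assumption, and the function names and toy "spectra" are hypothetical. Only the 0.3 mg KOH·g−1 threshold comes from the abstract.

```python
from statistics import mean, pvariance

def fisher_scores(spectra, labels):
    """Return one Fisher ratio per variable for a two-class problem (assumed form)."""
    class_a = [row for row, y in zip(spectra, labels) if y == 0]
    class_b = [row for row, y in zip(spectra, labels) if y == 1]
    scores = []
    for j in range(len(spectra[0])):
        a = [row[j] for row in class_a]
        b = [row[j] for row in class_b]
        spread = pvariance(a) + pvariance(b)   # within-class scatter
        gap = (mean(a) - mean(b)) ** 2         # between-class separation
        scores.append(gap / spread if spread > 0 else 0.0)
    return scores

def top_k_variables(scores, k):
    """Indices of the k variables with the highest Fisher ratio."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Invented TAN values (mg KOH/g) and three-variable "spectra", not data from the paper.
tan = [0.05, 0.20, 0.30, 0.90, 2.10, 4.50]
labels = [0 if t <= 0.3 else 1 for t in tan]   # threshold quoted in the abstract
spectra = [
    [1.0, 5.0, 0.2],
    [1.1, 4.0, 0.9],
    [0.9, 6.0, 0.4],
    [3.0, 5.5, 0.7],
    [3.2, 4.5, 0.3],
    [2.9, 5.0, 0.8],
]

scores = fisher_scores(spectra, labels)
selected = top_k_variables(scores, k=1)
print(selected)
```

In a workflow like the one the abstract describes, the selected columns would then be passed to a random forest classifier instead of the full spectrum.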
3. Variable selection in support vector regression using angular search algorithm and variance inflation factor.
- Author
-
Folli, Gabriely S., Nascimento, Márcia H.C., Paulo, Ellisson H., Cunha, Pedro H.P., Romão, Wanderson, and Filgueiras, Paulo R.
- Subjects
-
*SEARCH algorithms, *STANDARD deviations, *PRICE inflation, *NUCLEAR magnetic resonance, *PETROLEUM
- Abstract
Here, we combine the angular search algorithm and variance inflation factor (ASA‐VIF) with support vector regression (SVR) (ASA‐VIF‐SVR) to estimate total acid number (TAN), basic nitrogen content (BNC), and sulfur content (SC) in Brazilian crude oils. To prevent the interference of outliers, we further developed a strategy for outlier identification and applied it to nonlinear models based on RMSE (root mean square error). ASA‐VIF‐SVR was applied to near‐ and mid‐infrared spectroscopy (NIR and MIR) and hydrogen nuclear magnetic resonance (1H NMR) spectroscopy data available for 93–194 samples. The models were evaluated for accuracy (root mean square error of calibration [RMSEC] and root mean square error of prediction [RMSEP]) and linearity (coefficient of determination, R2). The removal of outliers increased the accuracy and linearity of our models. The ASA‐VIF model for TAN, BNC, and SC selected 0.37%, 0.93%, and 0.30% of variables from full NIR spectra; 0.21%, 0.27%, and 0.21% from full MIR; and 0.20%, 0.42%, and 0.15% from full 1H NMR. In most cases, the best results were obtained with variable selection compared with the full dataset. Also, 1H NMR generated more accurate and linear models with RMSEP and R2p of 0.0071 wt% and 0.86 for BNC and 0.0623 wt% and 0.79 for SC. TAN showed a better result with MIR, with RMSEP of 0.1426 mg KOH g–1 and R2p of 0.47. The most important region for 1H NMR and MIR was the one with the largest quantity of unpaired electrons (aromatic region). A new variable selection methodology, ASA‐VIF‐SVR, was used to predict the physicochemical properties of crude oil. This methodology was applied to NIR, MIR, and 1H NMR spectroscopy data, reducing the number of original variables by more than 99%. Models with variable selection have better accuracy than models with full data. [ABSTRACT FROM AUTHOR]
- Published
- 2020
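The two figures of merit quoted in the abstract above, RMSEP and the coefficient of determination for the prediction set, can be computed as in the following minimal sketch. The abstract does not state which R2 definition the authors used; the form 1 − SS_res/SS_tot below is an assumption, and the reference and predicted values are invented for illustration.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented prediction-set values, e.g. TAN in mg KOH/g; not data from the paper.
reference = [0.10, 0.50, 1.00, 2.00]
predicted = [0.20, 0.40, 1.10, 1.90]

rmsep = rmse(reference, predicted)
r2p = r_squared(reference, predicted)
print(rmsep, r2p)
```

Applying the same `rmse` function to the calibration samples instead of the prediction samples would give RMSEC.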
4. Particle swarm optimization and ordered predictors selection applied in NMR to predict crude oil properties.
- Author
-
de Paulo, Ellisson H., Folli, Gabriely S., Nascimento, Márcia H.C., Moro, Mariana K., da Cunha, Pedro H.P., Castro, Eustáquio V.R., Neto, Alvaro Cunha, and Filgueiras, Paulo R.
- Subjects
-
*PARTICLE swarm optimization, *PETROLEUM, *STANDARD deviations, *PARTIAL least squares regression, *HEAT of combustion
- Abstract
We used spectral information from nuclear magnetic resonance (1H NMR and 13C NMR) of nearly 150 Brazilian crude oil samples to predict some physicochemical properties. We built models to estimate API gravity (API), standardized kinematic viscosity at 50 °C (VIS st), heat combustion value (HCV), total acid number (TAN), and saturates (SAT), aromatics (ARO), resins (RES) and asphaltenes (ASF) content. To obtain accurate models, particle swarm optimization (PSO) and ordered predictors selection (OPS) were applied as variable selection techniques coupled to partial least squares (PLS) regression. PSO-PLS and OPS-PLS hybrid models presented higher predictive capacity than PLS regression models. We were able to find the most relevant signal areas of the NMR spectra for each property. The best results for SAT, ARO and RES content were obtained with PSO-PLS of the 13C NMR dataset, with root mean squared error of prediction (RMSEP) of 4.54, 2.85 and 4.08 (wt%), respectively. For API, VIS st and TAN, the RMSEP values were 0.74 (API), 0.02 and 0.16 (mg KOH·g−1), respectively, using the OPS-PLS method and the 1H NMR dataset. The most accurate models for ASF content and HCV were built with PSO-PLS of 1H NMR, with RMSEP of 0.59 (wt%), and PSO-PLS of 13C NMR, with RMSEP of 0.64 (MJ·kg−1), respectively. However, these two properties presented a coefficient of determination for the prediction set (R2 p) lower than 0.75, which means they are not fitted very well by the regression model. Furthermore, using the 13C NMR dataset, OPS-PLS models have shown results similar to those of PSO-PLS models for some properties. [ABSTRACT FROM AUTHOR]
- Published
- 2020
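For orientation, the particle swarm optimization named in the abstract above is usually built on the following textbook update rule. The abstract does not say which PSO variant was used or how particle positions were mapped to selected NMR variables, so this is the generic form only; the symbols (inertia weight w, acceleration coefficients c_1 and c_2, uniform random numbers r_1 and r_2, personal best p_i, swarm best g) are the conventional ones, not taken from the paper.

```latex
\begin{aligned}
v_i^{(t+1)} &= w\,v_i^{(t)} + c_1 r_1 \bigl(p_i - x_i^{(t)}\bigr) + c_2 r_2 \bigl(g - x_i^{(t)}\bigr),\\
x_i^{(t+1)} &= x_i^{(t)} + v_i^{(t+1)}
\end{aligned}
```

In a variable-selection setting, each position x_i would typically encode which spectral variables enter the PLS model, with a prediction error such as the cross-validated RMSE serving as the objective to minimize.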