Author: "Koronaki, Eleni D." / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Koronaki, Eleni D."' showing total 9 results

1. Implementing LLMs in industrial process modeling: Addressing Categorical Variables

Author: Koronaki, Eleni D., Suntaxi, Geremy Loachamin, Papavasileiou, Paris, Giovanis, Dimitrios G., Kathrein, Martin, Boudouvis, Andreas G., and Bordas, Stéphane P. A.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Important variables of processes are, on many occasions, categorical, i.e. names or labels representing, e.g. categories of inputs, or types of reactors or a sequence of steps. In this work, we use Large Language Models (LLMs) to derive embeddings of such inputs that represent their actual meaning, or reflect the "distances" between categories, i.e. how similar or dissimilar they are. This is a marked difference from the current standard practice of using binary, or one-hot encoding to replace categorical variables with sequences of ones and zeros. Combined with dimensionality reduction techniques, either linear such as Principal Components Analysis (PCA), or nonlinear such as Uniform Manifold Approximation and Projection (UMAP), the proposed approach leads to a meaningful, low-dimensional feature space. The significance of obtaining meaningful embeddings is illustrated in the context of an industrial coating process for cutting tools that includes both numerical and categorical inputs. The proposed approach enables feature importance, which is a marked improvement compared to the current state-of-the-art (SotA) in the encoding of categorical variables.
Published: 2024
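
To illustrate the contrast the abstract draws, here is a minimal sketch, not the authors' implementation. The category labels and the 4-dimensional vectors standing in for LLM embeddings are hypothetical values made up for this example, and the helper names are assumptions. It shows that one-hot codes place every pair of categories at the same distance, whereas dense embeddings can place similar categories closer together before PCA reduces the dimension.

```python
import numpy as np

def one_hot(labels):
    """Encode each distinct label as a unit vector (one column per category)."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1.0
    return encoded

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pca(X, n_components):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Hypothetical category labels and hypothetical stand-ins for LLM embeddings.
labels = ["reactor_A", "reactor_A_variant", "reactor_B"]
embeddings = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.9, 0.2, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.3],
])

d_onehot = pairwise_distances(one_hot(labels))   # all off-diagonal entries equal
d_embed = pairwise_distances(embeddings)         # similar labels sit closer
reduced = pca(embeddings, n_components=2)        # low-dimensional feature space
```

In a real workflow the `embeddings` array would come from an embedding model of the user's choice; that step is deliberately left out here.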

2. Conformal Disentanglement: A Neural Framework for Perspective Synthesis and Differentiation

Author: Kevrekidis, George A., Koronaki, Eleni D., and Kevrekidis, Yannis G.
Subjects: Computer Science - Machine Learning, Mathematics - Dynamical Systems
Abstract: For multiple scientific endeavors it is common to measure a phenomenon of interest in more than one way. We make observations of objects from several different perspectives in space, at different points in time; we may also measure different properties of a mixture using different types of instruments. After collecting this heterogeneous information, it is necessary to be able to synthesize a complete picture of what is 'common' across its sources: the subject we ultimately want to study. However, isolated ('clean') observations of a system are not always possible: observations often contain information about other systems in its environment, or about the measuring instruments themselves. In that sense, each observation may contain information that 'does not matter' to the original object of study; this 'uncommon' information between sensors observing the same object may still be important, and decoupling it from the main signal(s) useful. We introduce a neural network autoencoder framework capable of both tasks: it is structured to identify 'common' variables, and, making use of orthogonality constraints to define geometric independence, to also identify disentangled 'uncommon' information originating from the heterogeneous sensors. We demonstrate applications in several computational examples.
Published: 2024

3. On Learning what to Learn: heterogeneous observations of dynamics and establishing (possibly causal) relations among them

Author: Sroczynski, David W., Dietrich, Felix, Koronaki, Eleni D., Talmon, Ronen, Coifman, Ronald R., Bollt, Erik, and Kevrekidis, Ioannis G.
Subjects: Computer Science - Machine Learning, Mathematics - Dynamical Systems, Mathematics - Numerical Analysis
Abstract: Before we attempt to learn a function between two (sets of) observables of a physical process, we must first decide what the inputs and what the outputs of the desired function are going to be. Here we demonstrate two distinct, data-driven ways of initially deciding "the right quantities" to relate through such a function, and then proceed to learn it. This is accomplished by processing multiple simultaneous heterogeneous data streams (ensembles of time series) from observations of a physical system: multiple observation processes of the system. We thus determine (a) what subsets of observables are common between the observation processes (and therefore observable from each other, relatable through a function); and (b) what information is unrelated to these common observables, and therefore particular to each observation process, and not contributing to the desired function. Any data-driven function approximation technique can subsequently be used to learn the input-output relation, from k-nearest neighbors and Geometric Harmonics to Gaussian Processes and Neural Networks. Two particular "twists" of the approach are discussed. The first has to do with the identifiability of particular quantities of interest from the measurements. We now construct mappings from a single set of observations of one process to entire level sets of measurements of the process, consistent with this single set. The second attempts to relate our framework to a form of causality: if one of the observation processes measures "now", while the second observation process measures "in the future", the function to be learned among what is common across observation processes constitutes a dynamical model for the system evolution.
Published: 2024

4. Discovering deposition process regimes: leveraging unsupervised learning for process insights, surrogate modeling, and sensitivity analysis

Author: Suntaxi, Geremy Loachamín, Papavasileiou, Paris, Koronaki, Eleni D., Giovanis, Dimitrios G., Gakis, Georgios, Aviziotis, Ioannis G., Kathrein, Martin, Pozzetti, Gabriele, Czettl, Christoph, Bordas, Stéphane P. A., and Boudouvis, Andreas G.
Subjects: Physics - Chemical Physics, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control, Statistics - Applications
Abstract: This work introduces a comprehensive approach utilizing data-driven methods to elucidate the deposition process regimes in Chemical Vapor Deposition (CVD) reactors and the interplay of physical mechanisms that dominate in each one of them. Through this work, we address three key objectives. Firstly, our methodology relies on process outcomes, derived by a detailed CFD model, to identify clusters of "outcomes" corresponding to distinct process regimes, wherein the relative influence of input variables undergoes notable shifts. This phenomenon is experimentally validated through Arrhenius plot analysis, affirming the efficacy of our approach. Secondly, we demonstrate the development of an efficient surrogate model, based on Polynomial Chaos Expansion (PCE), that maintains accuracy, facilitating streamlined computational analyses. Finally, as a result of PCE, sensitivity analysis is made possible by means of Sobol' indices, which quantify the impact of process inputs across identified regimes. The insights gained from our analysis contribute to the formulation of hypotheses regarding phenomena occurring beyond the transition regime. Notably, the significance of temperature even in the diffusion-limited regime, as evidenced by the Arrhenius plot, suggests activation of gas phase reactions at elevated temperatures. Importantly, our proposed methods yield insights that align with experimental observations and theoretical principles, aiding decision-making in process design and optimization. By circumventing the need for costly and time-consuming experiments, our approach offers a pragmatic pathway towards enhanced process efficiency. Moreover, this study underscores the potential of data-driven computational methods for innovating reactor design paradigms.
Published: 2024

5. Integrating supervised and unsupervised learning approaches to unveil critical process inputs

Author: Papavasileiou, Paris, Giovanis, Dimitrios G., Pozzetti, Gabriele, Kathrein, Martin, Czettl, Christoph, Kevrekidis, Ioannis G., Boudouvis, Andreas G., Bordas, Stéphane P. A., and Koronaki, Eleni D.
Subjects: Computer Science - Machine Learning
Abstract: This study introduces a machine learning framework tailored to large-scale industrial processes characterized by a plethora of numerical and categorical inputs. The framework aims to (i) discern critical parameters influencing the output and (ii) generate accurate out-of-sample qualitative and quantitative predictions of production outcomes. Specifically, we address the pivotal question of the significance of each input in shaping the process outcome, using an industrial Chemical Vapor Deposition (CVD) process as an example. The initial objective involves merging subject matter expertise and clustering techniques exclusively on the process output, here, coating thickness measurements at various positions in the reactor. This approach identifies groups of production runs that share similar qualitative characteristics, such as film mean thickness and standard deviation. In particular, the differences of the outcomes represented by the different clusters can be attributed to differences in specific inputs, indicating that these inputs are critical for the production outcome. Leveraging this insight, we subsequently implement supervised classification and regression methods using the identified critical process inputs. The proposed methodology proves to be valuable in scenarios with a multitude of inputs and insufficient data for the direct application of deep learning techniques, providing meaningful insights into the underlying processes.
Published: 2024

6. Nonlinear Manifold Learning Determines Microgel Size from Raman Spectroscopy

Author: Koronaki, Eleni D., Kaven, Luise F., Faust, Johannes M. M., Kevrekidis, Ioannis G., and Mitsos, Alexander
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
Abstract: Polymer particle size constitutes a crucial characteristic of product quality in polymerization. Raman spectroscopy is an established and reliable process analytical technology for in-line concentration monitoring. Recent approaches and some theoretical considerations show a correlation between Raman signals and particle sizes but do not determine polymer size from Raman spectroscopic measurements accurately and reliably. With this in mind, we propose three alternative machine learning workflows to perform this task, all involving diffusion maps, a nonlinear manifold learning technique for dimensionality reduction: (i) directly from diffusion maps, (ii) alternating diffusion maps, and (iii) conformal autoencoder neural networks. We apply the workflows to a data set of Raman spectra with associated size measured via dynamic light scattering of 47 microgel (cross-linked polymer) samples in a diameter range of 208 nm to 483 nm. The conformal autoencoders substantially outperform state-of-the-art methods and result, for the first time, in a promising prediction of polymer size from Raman spectra. Comment: 51 pages, 12 figures, 4 tables
Published: 2024

7. Nonlinear dimensionality reduction then and now: AIMs for dissipative PDEs in the ML era

Author: Koronaki, Eleni D., Evangelou, Nikolaos, Martin-Linares, Cristina P., Titi, Edriss S., and Kevrekidis, Ioannis G.
Subjects: Mathematics - Dynamical Systems, Computer Science - Machine Learning
Abstract: This study presents a collection of purely data-driven workflows for constructing reduced-order models (ROMs) for distributed dynamical systems. The ROMs we focus on are data-assisted models inspired by, and templated upon, the theory of Approximate Inertial Manifolds (AIMs); the particular motivation is the so-called post-processing Galerkin method of Garcia-Archilla, Novo and Titi. Its applicability can be extended: the need for accurate truncated Galerkin projections and for deriving closed-form corrections can be circumvented using machine learning tools. When the right latent variables are not a priori known, we illustrate how autoencoders as well as Diffusion Maps (a manifold learning scheme) can be used to discover good sets of latent variables and test their explainability. The proposed methodology can express the ROMs in terms of (a) theoretical (Fourier coefficients), (b) linear data-driven (POD modes) and/or (c) nonlinear data-driven (Diffusion Maps) coordinates. Both Black-Box and (theoretically-informed and data-corrected) Gray-Box models are described; the necessity for the latter arises when truncated Galerkin projections are so inaccurate as to not be amenable to post-processing. We use the Chafee-Infante reaction-diffusion and the Kuramoto-Sivashinsky dissipative partial differential equations to illustrate and successfully test the overall framework. Comment: 27 pages, 22 figures
Published: 2023

8. Physics-agnostic and Physics-infused machine learning for thin films flows: modeling, and predictions from small data

Author: Martin-Linares, Cristina P., Psarellis, Yorgos M., Karapetsas, Georgios, Koronaki, Eleni D., and Kevrekidis, Ioannis G.
Subjects: Physics - Fluid Dynamics, Mathematics - Numerical Analysis
Abstract: Numerical simulations of multiphase flows are crucial in numerous engineering applications, but are often limited by the computationally demanding solution of the Navier-Stokes (NS) equations. Here, we present a data-driven workflow where a handful of detailed NS simulation data are leveraged into a reduced-order model for a prototypical vertically falling liquid film. We develop a physics-agnostic model for the film thickness, achieving a far better agreement with the NS solutions than the asymptotic Kuramoto-Sivashinsky (KS) equation. We also develop two variants of physics-infused models providing a form of calibration of a low-fidelity model (i.e. the KS) against a few high-fidelity NS data. Finally, predictive models for missing data are developed, for either the amplitude, or the full-field velocity and even the flow parameter from partial information. This is achieved with the so-called "Gappy Diffusion Maps", which we compare favorably to its linear counterpart, Gappy POD.
Published: 2023

9. From partial data to out-of-sample parameter and observation estimation with Diffusion Maps and Geometric Harmonics

Author: Koronaki, Eleni D., Evangelou, Nikolaos, Psarellis, Yorgos M., Boudouvis, Andreas G., and Kevrekidis, Ioannis G.
Subjects: Mathematics - Numerical Analysis
Abstract: A data-driven framework is presented that enables the prediction of quantities, either observations or parameters, given sufficient partial data. The framework is illustrated via a computational model of the deposition of Cu in a Chemical Vapor Deposition (CVD) reactor, where the reactor pressure, the deposition temperature and feed mass flow rate are important process parameters that determine the outcome of the process. The sampled observations are high-dimensional vectors containing the outputs of a detailed CFD steady-state model of the process, i.e. the values of velocity, pressure, temperature, and species mass fractions at each point in the discretization. A machine learning workflow is presented, able to predict out-of-sample (a) observations (e.g. mass fraction in the reactor) given process parameters (e.g. inlet temperature); (b) process parameters given observation data; and (c) partial observations (e.g. temperature in the reactor) given other partial observations (e.g. mass fraction in the reactor). The proposed workflow relies on the manifold learning schemes Diffusion Maps and the associated Geometric Harmonics. Diffusion Maps is used for discovering a reduced representation of the available data, and Geometric Harmonics for extending functions defined on the manifold. In our work a special use case of Geometric Harmonics is formulated and implemented, which we call Double Diffusion Maps, to map from the reduced representation back to (partial) observations and process parameters. A comparison of our manifold learning scheme to the traditional Gappy-POD approach is provided: ours can be thought of as a "Gappy DMAP" approach. The presented methodology is easily transferable to application domains beyond reactor engineering.
Published: 2023
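
As a rough illustration of the Diffusion Maps step named in the abstract, the following is a minimal sketch of the basic algorithm (Gaussian kernel, row-normalised Markov matrix, leading non-trivial eigenvectors). It is not the authors' code and does not cover Geometric Harmonics or Double Diffusion Maps. The function name, the kernel scale `epsilon`, and the synthetic half-circle data are all assumptions chosen for this example.

```python
import numpy as np

def diffusion_maps(X, epsilon, n_coords=2):
    """Return the leading non-trivial eigenvalues and diffusion coordinates of X."""
    # Gaussian kernel on squared Euclidean distances between samples.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / epsilon)
    d = K.sum(axis=1)
    # S = D^{-1/2} K D^{-1/2} is symmetric and shares its eigenvalues
    # with the Markov matrix D^{-1} K, so a symmetric solver can be used.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Convert to right eigenvectors of D^{-1} K.
    psi = evecs / np.sqrt(d)[:, None]
    # Skip the trivial constant eigenvector (eigenvalue 1).
    keep = slice(1, n_coords + 1)
    return evals[keep], psi[:, keep] * evals[keep]

# Synthetic data (an assumption for this sketch): points on a half circle,
# a one-dimensional manifold embedded in two dimensions.
t = np.linspace(0.0, np.pi, 80)
X = np.column_stack([np.cos(t), np.sin(t)])

evals, coords = diffusion_maps(X, epsilon=0.05, n_coords=2)
# The first diffusion coordinate should vary monotonically along the arc,
# i.e. it recovers a one-dimensional parametrisation of the data.
corr = np.corrcoef(coords[:, 0], t)[0, 1]
```

The choice of `epsilon` controls the neighbourhood size and typically has to be tuned to the data; the value above is only suited to this toy example.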
