Author: "Jauhiainen A" / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. Online optimisation for dynamic electrical impedance tomography

Author: Dizon, Neil, Jauhiainen, Jyrki, and Valkonen, Tuomo
Subjects: Mathematics - Optimization and Control, Computer Science - Computer Vision and Pattern Recognition
Abstract: Online optimisation studies the convergence of optimisation methods as the data embedded in the problem changes. Based on this idea, we propose a primal dual online method for nonlinear time-discrete inverse problems. We analyse the method through regret theory and demonstrate its performance in real-time monitoring of moving bodies in a fluid with Electrical Impedance Tomography (EIT). To do so, we also prove the second-order differentiability of the Complete Electrode Model (CEM) solution operator on $L^\infty$.
Published: 2024

2. Evaluating Students' Open-ended Written Responses with LLMs: Using the RAG Framework for GPT-3.5, GPT-4, Claude-3, and Mistral-Large

Author: Jauhiainen, Jussi S. and Guerra, Agustín Garagorry
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Evaluating open-ended written examination responses from students is an essential yet time-intensive task for educators, requiring a high degree of effort, consistency, and precision. Recent developments in Large Language Models (LLMs) present a promising opportunity to balance the need for thorough evaluation with efficient use of educators' time. In our study, we explore the effectiveness of LLMs ChatGPT-3.5, ChatGPT-4, Claude-3, and Mistral-Large in assessing university students' open-ended answers to questions made about reference material they have studied. Each model was instructed to evaluate 54 answers repeatedly under two conditions: 10 times (10-shot) with a temperature setting of 0.0 and 10 times with a temperature of 0.5, expecting a total of 1,080 evaluations per model and 4,320 evaluations across all models. The RAG (Retrieval Augmented Generation) framework was used to make the LLMs process the evaluation of the answers. As of spring 2024, our analysis revealed notable variations in consistency and the grading outcomes provided by studied LLMs. There is a need to comprehend strengths and weaknesses of LLMs in educational settings for evaluating open-ended written responses. Further comparative research is essential to determine the accuracy and cost-effectiveness of using LLMs for educational assessments.
Comment: 18 pages, 6 tables, 1 figure
Published: 2024

3. Prediction techniques for dynamic imaging with online primal-dual methods

Author: Dizon, Neil, Jauhiainen, Jyrki, and Valkonen, Tuomo
Subjects: Mathematics - Optimization and Control, Computer Science - Computer Vision and Pattern Recognition
Abstract: Online optimisation facilitates the solution of dynamic inverse problems, such as image stabilisation, fluid flow monitoring, and dynamic medical imaging. In this paper, we improve upon previous work on predictive online primal-dual methods on two fronts. Firstly, we provide a more concise analysis that symmetrises previously unsymmetric regret bounds, and relaxes previous restrictive conditions on the dual predictor. Secondly, based on the latter, we develop several improved dual predictors. We numerically demonstrate their efficacy in image stabilisation and dynamic positron emission tomography.
Published: 2024

4. UQSA -- An R-Package for Uncertainty Quantification and Sensitivity Analysis for Biochemical Reaction Network Models

Author: Kramer, Andrei, Milinanni, Federica, Nyquist, Pierre, Jauhiainen, Alexandra, and Eriksson, Olivia
Subjects: Quantitative Biology - Quantitative Methods, Quantitative Biology - Subcellular Processes
Abstract: We present an R-package developed for modeling of biochemical reaction networks, uncertainty quantification (UQ) and sensitivity analysis (SA). Estimating parameters and quantifying their uncertainty (and resulting prediction uncertainty), is required for data-driven systems biology modeling. Sampling methods need to be efficient when confronted with high-dimensional, correlated parameter distributions. We have developed the UQSA package to be fast for this problem class and work well with other tools for modelling. We aim for simplicity, and part of that is our use of the SBtab format for the unified storage of model and data. Our tool-set is modular enough, that parts can be replaced. We use intermediate formats that are not hidden from the user to make this feasible. UQ is performed through Markov chain Monte Carlo (MCMC) sampling in an Approximate Bayesian Computation (ABC) setting. This can be followed by a variance-decomposition based global sensitivity analysis. If needed, complex parameter distributions can be described, evaluated, and sampled from, with the help of Vine-copulas that are available in R. This approach is especially useful when new experimental data become available, and a previously calibrated model needs to be updated. Implementation: R is a high-level language and allows the use of sophisticated statistical methods. The ODE solver we used is written in C (gsl_odeiv2, interface to R is ours). We use the SBtab tabular format for the model description, as well as the data and an event system to be able to model inputs frequently encountered in systems biology and neuroscience. The code has been tested on one node with 256 cores of a computing cluster, but smaller examples are included in the repository that can be run on a laptop. Source code: https://github.com/icpm-kth/uqsa
Comment: 6 pages, 1 figure, application note
Published: 2023

5. Findings of the VarDial Evaluation Campaign 2023

Author: Aepli, Noëmi, Çöltekin, Çağrı, Van Der Goot, Rob, Jauhiainen, Tommi, Kazzaz, Mourhaf, Ljubešić, Nikola, North, Kai, Plank, Barbara, Scherrer, Yves, and Zampieri, Marcos
Subjects: Computer Science - Computation and Language
Abstract: This report presents the results of the shared tasks organized as part of the VarDial Evaluation Campaign 2023. The campaign is part of the tenth workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with EACL 2023. Three separate shared tasks were included this year: Slot and intent detection for low-resource language varieties (SID4LR), Discriminating Between Similar Languages -- True Labels (DSL-TL), and Discriminating Between Similar Languages -- Speech (DSL-S). All three tasks were organized for the first time this year.
Published: 2023

6. Language Variety Identification with True Labels

Author: Zampieri, Marcos, North, Kai, Jauhiainen, Tommi, Felice, Mariano, Kumari, Neha, Nair, Nishant, and Bangera, Yash
Subjects: Computer Science - Computation and Language
Abstract: Language identification is an important first step in many IR and NLP applications. Most publicly available language identification datasets, however, are compiled under the assumption that the gold label of each instance is determined by where texts are retrieved from. Research has shown that this is a problematic assumption, particularly in the case of very similar languages (e.g., Croatian and Serbian) and national language varieties (e.g., Brazilian and European Portuguese), where texts may contain no distinctive marker of the particular language or variety. To overcome this important limitation, this paper presents DSL True Labels (DSL-TL), the first human-annotated multilingual dataset for language variety identification. DSL-TL contains a total of 12,900 instances in Portuguese, split between European Portuguese and Brazilian Portuguese; Spanish, split between Argentine Spanish and Castilian Spanish; and English, split between American English and British English. We trained multiple models to discriminate between these language varieties, and we present the results in detail. The data and models presented in this paper provide a reliable benchmark toward the development of robust and fairer language variety identification systems. We make DSL-TL freely available to the research community.
Published: 2023

7. Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpus

Author: Jauhiainen, Tommi, Jauhiainen, Heidi, Partanen, Niko, and Lindén, Krister
Subjects: Computer Science - Computation and Language
Abstract: This article introduces the Wanca 2017 corpus of texts crawled from the internet from which the sentences in rare Uralic languages for the use of the Uralic Language Identification (ULI) 2020 shared task were collected. We describe the ULI dataset and how it was constructed using the Wanca 2017 corpus and texts in different languages from the Leipzig corpora collection. We also provide baseline language identification experiments conducted using the ULI 2020 dataset.
Published: 2020

8. Mumford-Shah regularization in electrical impedance tomography with complete electrode model

Author: Jauhiainen, Jyrki, Seppänen, Aku, and Valkonen, Tuomo
Subjects: Mathematics - Optimization and Control, Mathematics - Analysis of PDEs, 65K10 (Primary), 35R30, 68U10, 35Q93 (Secondary)
Abstract: In electrical impedance tomography, we aim to solve the conductivity within a target body through electrical measurements made on the surface of the target. This inverse conductivity problem is severely ill-posed, especially in real applications with only partial boundary data available. Thus regularization has to be introduced. Conventionally, regularization promoting smooth features is used; however, the Mumford--Shah regularizer familiar for image segmentation is more appropriate for targets consisting of several distinct objects or materials. It is, however, numerically challenging. We show theoretically through $\Gamma$-convergence that a modification of the Ambrosio--Tortorelli approximation of the Mumford--Shah regularizer is applicable to electrical impedance tomography, in particular the complete electrode model of boundary measurements. With numerical and experimental studies, we confirm that this functional works in practice and produces higher quality results than typical regularizations employed in electrical impedance tomography when the conductivity of the target consists of distinct smoothly-varying regions.
Comment: 28 pages, 7 figures
Published: 2021
Full Text: View/download PDF

9. Comparing Approaches to Dravidian Language Identification

Author: Jauhiainen, Tommi, Ranasinghe, Tharindu, and Zampieri, Marcos
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: This paper describes the submissions by team HWR to the Dravidian Language Identification (DLI) shared task organized at VarDial 2021 workshop. The DLI training set includes 16,674 YouTube comments written in Roman script containing code-mixed text with English and one of the three South Dravidian languages: Kannada, Malayalam, and Tamil. We submitted results generated using two models, a Naive Bayes classifier with adaptive language models, which has been shown to obtain competitive performance in many language and dialect identification tasks, and a transformer-based model which is widely regarded as the state-of-the-art in a number of NLP tasks. Our first submission was sent in the closed submission track using only the training set provided by the shared task organisers, whereas the second submission is considered to be open as it used a pretrained model trained with external data. Our team attained shared second position in the shared task with the submission based on Naive Bayes. Our results reinforce the idea that deep learning methods are not as competitive in language identification related tasks as they are in many other text classification tasks.
Comment: Accepted to VarDial 2021 @ EACL 2021
Published: 2021

10. Non-planar sensing skins for structural health monitoring based on electrical resistance tomography

Author: Jauhiainen, Jyrki, Pour-Ghaz, Mohammad, Valkonen, Tuomo, and Seppänen, Aku
Subjects: Physics - Computational Physics, Mathematics - Differential Geometry, Mathematics - Numerical Analysis
Abstract: Electrical resistance tomography (ERT) -based distributed surface sensing systems, or sensing skins, offer alternative sensing techniques for structural health monitoring, providing capabilities for distributed sensing of, for example, damage, strain and temperature. Currently, however, the computational techniques utilized for sensing skins are limited to planar surfaces. In this paper, to overcome this limitation, we generalize the ERT-based surface sensing to non-planar surfaces covering arbitrarily shaped three-dimensional structures. We construct a framework in which we reformulate the image reconstruction problem of ERT using techniques of Riemannian geometry, and solve the resulting problem numerically. We test this framework in a series of numerical and experimental studies. The results demonstrate the feasibility of the proposed formulation and the applicability of ERT-based sensing skins for non-planar geometries.
Comment: 22 pages, 7 figures
Published: 2020
Full Text: View/download PDF

11. FinnSentiment -- A Finnish Social Media Corpus for Sentiment Polarity Annotation

Author: Lindén, Krister, Jauhiainen, Tommi, and Hardwick, Sam
Subjects: Computer Science - Computation and Language
Abstract: Sentiment analysis and opinion mining is an important task with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000 sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for the whole data set, which provides a unique opportunity for further studies of annotator behaviour over time. We analyse their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.
Published: 2020

12. Language Model Adaptation for Language and Dialect Identification of Text

Author: Jauhiainen, Tommi, Lindén, Krister, and Jauhiainen, Heidi
Subjects: Computer Science - Computation and Language
Abstract: This article describes an unsupervised language model adaptation approach that can be used to enhance the performance of language identification methods. The approach is applied to a current version of the HeLI language identification method, which is now called HeLI 2.0. We describe the HeLI 2.0 method in detail. The resulting system is evaluated using the datasets from the German dialect identification and Indo-Aryan language identification shared tasks of the VarDial workshops 2017 and 2018. The new approach with language identification provides considerably higher F1-scores than the previous HeLI method or the other systems which participated in the shared tasks. The results indicate that unsupervised language model adaptation should be considered as an option in all language identification tasks, especially in those where encountering out-of-domain data is likely.
Published: 2019

13. Language and Dialect Identification of Cuneiform Texts

Author: Jauhiainen, Tommi, Jauhiainen, Heidi, Alstola, Tero, and Lindén, Krister
Subjects: Computer Science - Computation and Language
Abstract: This article introduces a corpus of cuneiform texts from which the dataset for the use of the Cuneiform Language Identification (CLI) 2019 shared task was derived as well as some preliminary language identification experiments conducted using that corpus. We also describe the CLI dataset and how it was derived from the corpus. In addition, we provide some baseline language identification results using the CLI dataset. To the best of our knowledge, the experiments detailed here are the first time automatic language identification methods have been used on cuneiform data.
Published: 2019

14. Relaxed Gauss-Newton methods with applications to electrical impedance tomography

Author: Jauhiainen, Jyrki, Kuusela, Petri, Seppänen, Aku, and Valkonen, Tuomo
Subjects: Mathematics - Optimization and Control, Mathematics - Numerical Analysis, 90C26 (Primary), 49M15, 35R30, 68U10 (Secondary)
Abstract: As second-order methods, Gauss--Newton-type methods can be more effective than first-order methods for the solution of nonsmooth optimization problems with expensive-to-evaluate smooth components. Such methods, however, often do not converge. Motivated by nonlinear inverse problems with nonsmooth regularization, we propose a new Gauss--Newton-type method with inexact relaxed steps. We prove that the method converges to a set of disjoint critical points given that the linearisation of the forward operator for the inverse problem is sufficiently precise. We extensively evaluate the performance of the method on electrical impedance tomography (EIT).
Comment: 43 pages, 29 figures
Published: 2020
Full Text: View/download PDF

15. Primal-dual block-proximal splitting for a class of non-convex problems

Author: Mazurenko, Stanislav, Jauhiainen, Jyrki, and Valkonen, Tuomo
Subjects: Mathematics - Optimization and Control, Mathematics - Numerical Analysis
Abstract: We develop block structure adapted primal-dual algorithms for non-convex non-smooth optimisation problems whose objectives can be written as compositions $G(x)+F(K(x))$ of non-smooth block-separable convex functions $G$ and $F$ with a non-linear Lipschitz-differentiable operator $K$. Our methods are refinements of the non-linear primal-dual proximal splitting method for such problems without the block structure, which itself is based on the primal-dual proximal splitting method of Chambolle and Pock for convex problems. We propose individual step length parameters and acceleration rules for each of the primal and dual blocks of the problem. This allows them to converge faster by adapting to the structure of the problem. For the squared distance of the iterates to a critical point, we show local $O(1/N)$, $O(1/N^2)$ and linear rates under varying conditions and choices of the step length parameters. Finally, we demonstrate the performance of the methods on practical inverse problems: diffusion tensor imaging and electrical impedance tomography.
Published: 2019
Full Text: View/download PDF

16. Automatic Language Identification in Texts: A Survey

Author: Jauhiainen, Tommi, Lui, Marco, Zampieri, Marcos, Baldwin, Timothy, and Lindén, Krister
Subjects: Computer Science - Computation and Language
Abstract: Language identification (LI) is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used so far in the LI literature. For describing the features and methods we introduce a unified notation. We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.
Comment: Under review at JAIR - Journal of Artificial Intelligence Research
Published: 2018

17. ROPE: high-dimensional network modeling with robust control of edge FDR

Author: Kallus, Jonatan, Sanchez, Jose, Jauhiainen, Alexandra, Nelander, Sven, and Jörnsten, Rebecka
Subjects: Statistics - Computation
Abstract: Network modeling has become increasingly popular for analyzing genomic data, to aid in the interpretation and discovery of possible mechanistic components and therapeutic targets. However, genomic-scale networks are high-dimensional models and are usually estimated from a relatively small number of samples. Therefore, their usefulness is hampered by estimation instability. In addition, the complexity of the models is controlled by one or more penalization (tuning) parameters where small changes to these can lead to vastly different networks, thus making interpretation of models difficult. This necessitates the development of techniques to produce robust network models accompanied by estimation quality assessments. We introduce Resampling of Penalized Estimates (ROPE): a novel statistical method for robust network modeling. The method utilizes resampling-based network estimation and integrates results from several levels of penalization through a constrained, over-dispersed beta-binomial mixture model. ROPE provides robust False Discovery Rate (FDR) control of network estimates and each edge is assigned a measure of validity, the q-value, corresponding to the FDR-level for which the edge would be included in the network model. We apply ROPE to several simulated data sets as well as genomic data from The Cancer Genome Atlas. We show that ROPE outperforms state-of-the-art methods in terms of FDR control and robust performance across data sets. We illustrate how to use ROPE to make a principled model selection for which genomic associations to study further. ROPE is available as an R package on CRAN.
Comment: Main paper contains 6 figures
Published: 2017

18. Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

Author: Shojaie, Ali, Jauhiainen, Alexandra, Kallitsis, Michael, and Michailidis, George
Subjects: Statistics - Machine Learning, Quantitative Biology - Molecular Networks
Abstract: Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g. wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network.
Comment: 24 pages, 4 figures, 6 tables
Published: 2013
Full Text: View/download PDF

19. The Home-School Connection: Parental Influences on a Child's ESL Acquisition

Author: Jauhiainen, Catharine
Published: 2000
Full Text: View/download PDF

19 results on '"Jauhiainen A"'
