Author: "Ergen A" / Search Limiters: Available in Library Collection - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Ergen A"' showing total 3,536 results

1. Investigation of GHRL (rs4684677), FTO (rs8044769) and PGC1Α (rs8192678) polymorphisms in type 2 diabetic Turkish population

Author: Oğuz Osman, Gheybi Arezoo, Doğan Zeliha, Akbaş Feray, Zeybek Ümit, and Ergen Arzu
Subjects: genetic polymorphism, ghrelin, single nucleotide polymorphism, type 2 diabetes mellitus, Biochemistry, QD415-436
Abstract: Diabetes is a group of chronic metabolic disorders that generally present with hyperglycemia arising from insulin synthesis defects, which have multifactorial causes, in the beta cells of the islets of Langerhans in the pancreas. In the development of diabetes, genetic predisposition is as important as environmental factors. As a result of polymorphism studies in diabetic patients, many genes have been associated with the development of diabetes. In our study, we aimed to examine the relationship between diabetes and certain variants of the ghrelin (GHRL), fat mass and obesity-associated protein (FTO) and peroxisome proliferator-activated receptor-gamma coactivator (PGC-1α) genes, which are generally associated with diabetes and obesity.
Published: 2022
Full Text: View/download PDF

2. Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning

Author: Diwan, Nirav, Ergen, Tolga, Shim, Dongsub, and Lee, Honglak
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Direct Preference Optimization (DPO) has emerged as a de-facto approach for aligning language models with human preferences. Recent work has shown DPO's effectiveness relies on training data quality. In particular, clear quality differences between preferred and rejected responses enhance learning performance. Current methods for identifying and obtaining such high-quality samples demand additional resources or external models. We discover that reference model probability space naturally detects high-quality training samples. Using this insight, we present a sampling strategy that achieves consistent improvements (+0.1 to +0.4) on MT-Bench while using less than half (30-50%) of the training data. We observe substantial improvements (+0.4 to +0.98) for technical tasks (coding, math, and reasoning) across multiple models and hyperparameter settings.
Published: 2025

3. Unveiling potential candidates for rare-earth-free permanent magnet and magnetocaloric effect applications: a high throughput screening in Fe-N alloys

Author: Gao, Qiang, Bao, Ergen, Shahid, Ijaz, Ma, Hui, and Chen, Xing-Qiu
Subjects: Condensed Matter - Materials Science
Abstract: Based on high-throughput density functional theory calculations, we have found 49 ferromagnetic cases in FexN1-x (0
Published: 2025

4. Investigation of RASSF4 gene in head and neck cancers

Author: Karagedik Emine H., Pamuk Saim, Ataş Merve N., Ulusan Murat, Aydemir Levent, and Ergen Arzu
Subjects: elisa, gene expression, head and neck, polymorphism, rassf4, Biochemistry, QD415-436
Abstract: The RASSF gene family can inhibit the growth of RAS oncogene. This gene family is suggested to have a role in cell cycle control, apoptosis, cell migration, and mitosis control. This study evaluated RASSF4 gene expression levels, SNPs and serum levels in tissues dissected from both healthy individuals and patients diagnosed with head and neck cancer.
Published: 2021
Full Text: View/download PDF

5. Map2Text: New Content Generation from Low-Dimensional Visualizations

Author: Zhang, Xingjian, Xiong, Ziyang, Liu, Shixuan, Xie, Yutong, Ergen, Tolga, Shim, Dongsub, Xu, Hua, Lee, Honglak, and Mei, Qiaozhu
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Low-dimensional visualizations, or "projection maps" of datasets, are widely used across scientific research and creative industries as effective tools for interpreting large-scale and complex information. These visualizations not only support understanding existing knowledge spaces but are often used implicitly to guide exploration into unknown areas. While powerful methods like TSNE or UMAP can create such visual maps, there is currently no systematic way to leverage them for generating new content. To bridge this gap, we introduce Map2Text, a novel task that translates spatial coordinates within low-dimensional visualizations into new, coherent, and accurately aligned textual content. This allows users to explore and navigate undiscovered information embedded in these spatial layouts interactively and intuitively. To evaluate the performance of Map2Text methods, we propose Atometric, an evaluation metric that provides a granular assessment of logical coherence and alignment of the atomic statements in the generated texts. Experiments conducted across various datasets demonstrate the versatility of Map2Text in generating scientific research hypotheses, crafting synthetic personas, and devising strategies for testing large language models. Our findings highlight the potential of Map2Text to unlock new pathways for interacting with and navigating large-scale textual datasets, offering a novel framework for spatially guided content generation and discovery.
Published: 2024

6. The Fe-N system: crystal structure prediction, phase stability, and mechanical properties

Author: Bao, Ergen, Zhao, Jinbin, Gao, Qiang, Shahid, Ijaz, Ma, Hui, Luo, Yixiu, Liu, Peitao, Sun, Yan, and Chen, Xing-Qiu
Subjects: Condensed Matter - Materials Science
Abstract: Nitriding introduces nitrides into the surface of steels, significantly enhancing the surface mechanical properties. By combining the variable composition evolutionary algorithm and first-principles calculations based on density functional theory, 50 thermodynamically stable or metastable Fe-N compounds with various stoichiometric ratios were identified, exhibiting also dynamic and mechanical stability. The mechanical properties of these structures were systematically studied, including the bulk modulus, shear modulus, Young's modulus, Poisson's ratio, Pugh's ratio, Cauchy pressure, Klemen parameters, universal elastic anisotropy, Debye temperature, and Vickers hardness. All identified stable and metastable Fe-N compounds were found in the ductile region, with most exhibiting homogeneous elastic properties and isotropic metallic bonding. As the nitrogen concentration increases, their bulk moduli generally increase as well. The Vickers hardness values of Fe-N compounds range from 3.5 to 10.5 GPa, which are significantly higher than that of pure Fe (2.0 GPa), due to the stronger Fe-N bonds strength. This study provides insights into optimizing and designing Fe-N alloys with tailored mechanical properties., Comment: 19 pages, 19 figures
Published: 2024

7. SPRIG: Improving Large Language Model Performance by System Prompt Optimization

Author: Zhang, Lechen, Ergen, Tolga, Logeswaran, Lajanugen, Lee, Moontae, and Jurgens, David
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performance depends, in part, on the choice of prompt. Past research has focused on optimizing prompts specific to a task. However, much less attention has been given to optimizing the general instructions included in a prompt, known as a system prompt. To address this gap, we propose SPRIG, an edit-based genetic algorithm that iteratively constructs prompts from prespecified components to maximize the model's performance in general scenarios. We evaluate the performance of system prompts on a collection of 47 different types of tasks to ensure generalizability. Our study finds that a single optimized system prompt performs on par with task prompts optimized for each individual task. Moreover, combining system and task-level optimizations leads to further improvement, which showcases their complementary nature. Experiments also reveal that the optimized system prompts generalize effectively across model families, parameter sizes, and languages. This study provides insights into the role of system-level instructions in maximizing LLM potential.
Published: 2024

8. A Generally Covariant Model of Spacetime as a 4-Brane in 4+1 Flat Dimensions

Author: Ergen, Mert and Arık, Metin
Subjects: General Relativity and Quantum Cosmology, High Energy Physics - Theory
Abstract: We construct a simple toy model of a closed FLRW spacetime by starting from a flat five dimensional scalar field space with a quartic potential. The action contains no curvature terms. An SO(4) invariant ansatz for the scalar fields is shown to lead to a linearly expanding universe. The model is fully covariant and the spacetime metric is uniquely determined from the dynamical equations of the metric tensor provided that the cosmological constant is negative.
Published: 2024

9. Consensus prediction of cell type labels in single-cell data with popV.

Author: Ergen, Can, Xing, Galen, Xu, Chenling, Kim, Martin, Jayasuriya, Michael, McGeever, Erin, Oliveira Pisco, Angela, Streets, Aaron, and Yosef, Nir
Subjects: Single-Cell Analysis, Humans, Algorithms, Molecular Sequence Annotation, Computational Biology, Consensus, Software
Abstract: Cell-type classification is a crucial step in single-cell sequencing analysis. Various methods have been proposed for transferring a cell-type label from an annotated reference atlas to unannotated query datasets. Existing methods for transferring cell-type labels lack proper uncertainty estimation for the resulting annotations, limiting interpretability and usefulness. To address this, we propose popular Vote (popV), an ensemble of prediction models with an ontology-based voting scheme. PopV achieves accurate cell-type labeling and provides uncertainty scores. In multiple case studies, popV confidently annotates the majority of cells while highlighting cell populations that are challenging to annotate by label transfer. This additional step helps to reduce the load of manual inspection, which is often a necessary component of the annotation process, and enables one to focus on the most problematic parts of the annotation, streamlining the overall annotation process.
Published: 2024

10. VI-VS: calibrated identification of feature dependencies in single-cell multiomics.

Author: Boyeau, Pierre, Bates, Stephen, Ergen, Can, Jordan, Michael, and Yosef, Nir
Subjects: Single-Cell Analysis, Software, Machine Learning, Humans, Genomics, Multiomics
Abstract: Unveiling functional relationships between various molecular cell phenotypes from data using machine learning models is a key promise of multiomics. Existing methods either use flexible but hard-to-interpret models or simpler, misspecified models. VI-VS (Variational Inference for Variable Selection) balances flexibility and interpretability to identify relevant feature relationships in multiomic data. It uses deep generative models to identify conditionally dependent features, with false discovery rate control. VI-VS is available as an open-source Python package, providing a robust solution to identify features more likely representing genuine causal relationships.
Published: 2024

11. Hearing Your Blood Sugar: Non-Invasive Glucose Measurement Through Simple Vocal Signals, Transforming any Speech into a Sensor with Machine Learning

Author: Ahmadli, Nihat, Sarsil, Mehmet Ali, and Ergen, Onur
Subjects: Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Effective diabetes management relies heavily on the continuous monitoring of blood glucose levels, traditionally achieved through invasive and uncomfortable methods. While various non-invasive techniques have been explored, such as optical, microwave, and electrochemical approaches, none have effectively supplanted these invasive technologies due to issues related to complexity, accuracy, and cost. In this study, we present a transformative and straightforward method that utilizes voice analysis to predict blood glucose levels. Our research investigates the relationship between fluctuations in blood glucose and vocal characteristics, highlighting the influence of blood vessel dynamics during voice production. By applying advanced machine learning algorithms, we analyzed vocal signal variations and established a significant correlation with blood glucose levels. We developed a predictive model using artificial intelligence, based on voice recordings and corresponding glucose measurements from participants, utilizing logistic regression and Ridge regularization. Our findings indicate that voice analysis may serve as a viable non-invasive alternative for glucose monitoring. This innovative approach not only has the potential to streamline and reduce the costs associated with diabetes management but also aims to enhance the quality of life for individuals living with diabetes by providing a painless and user-friendly method for monitoring blood sugar levels., Comment: 5 figure and 5 tables. This manuscript is a pre-print to be submitted to a journal or/and a conference. arXiv admin note: substantial text overlap with arXiv:2402.13812
Published: 2024

12. MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows

Author: Zhang, Xingjian, Xie, Yutong, Huang, Jin, Ma, Jinge, Pan, Zhaoying, Liu, Qijia, Xiong, Ziyang, Ergen, Tolga, Shim, Dongsub, Lee, Honglak, and Mei, Qiaozhu
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Scientific innovation relies on detailed workflows, which include critical steps such as analyzing literature, generating ideas, validating these ideas, interpreting results, and inspiring follow-up research. However, scientific publications that document these workflows are extensive and unstructured. This makes it difficult for both human researchers and AI systems to effectively navigate and explore the space of scientific innovation. To address this issue, we introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows. MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years. Using Large Language Models (LLMs), we automatically extract five core aspects from these publications -- context, key idea, method, outcome, and projected impact -- which correspond to five key steps in the research workflow. These structured summaries facilitate a variety of downstream tasks and analyses. The quality of the LLM-extracted summaries is validated by comparing them with human annotations. We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset, which make various types of predictions and recommendations along the scientific workflow. MASSW holds significant potential for researchers to create and benchmark new AI methods for optimizing scientific workflows and fostering scientific innovation in the field. Our dataset is openly available at https://github.com/xingjian-zhang/massw., Comment: arXiv admin note: text overlap with arXiv:1706.03762 by other authors
Published: 2024

13. A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features

Author: Zeger, Emi, Wang, Yifei, Mishkin, Aaron, Ergen, Tolga, Candès, Emmanuel, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: We prove that training neural networks on 1-D data is equivalent to solving convex Lasso problems with discrete, explicitly defined dictionary matrices. We consider neural networks with piecewise linear activations and depths ranging from 2 to an arbitrary but finite number of layers. We first show that two-layer networks with piecewise linear activations are equivalent to Lasso models using a discrete dictionary of ramp functions, with breakpoints corresponding to the training data points. In certain general architectures with absolute value or ReLU activations, a third layer surprisingly creates features that reflect the training data about themselves. Additional layers progressively generate reflections of these reflections. The Lasso representation provides valuable insights into the analysis of globally optimal networks, elucidating their solution landscapes and enabling closed-form solutions in certain special cases. Numerical results show that reflections also occur when optimizing standard deep networks using standard non-convex optimizers. Additionally, we demonstrate our theory with autoregressive time series models.
Published: 2024

14. Total Completion Time Scheduling Under Scenarios

Author: Bosman, Thomas, van Ee, Martijn, Ergen, Ekin, Imreh, Csanad, Marchetti-Spaccamela, Alberto, Skutella, Martin, and Stougie, Leen
Subjects: Computer Science - Data Structures and Algorithms
Abstract: Scheduling jobs with given processing times on identical parallel machines so as to minimize their total completion time is one of the most basic scheduling problems. We study interesting generalizations of this classical problem involving scenarios. In our model, a scenario is defined as a subset of a predefined and fully specified set of jobs. The aim is to find an assignment of the whole set of jobs to identical parallel machines such that the schedule, obtained for the given scenarios by simply skipping the jobs not in the scenario, optimizes a function of the total completion times over all scenarios. While the underlying scheduling problem without scenarios can be solved efficiently by a simple greedy procedure (SPT rule), scenarios, in general, make the problem NP-hard. We paint an almost complete picture of the evolving complexity landscape, drawing the line between easy and hard. One of our main algorithmic contributions relies on a deep structural result on the maximum imbalance of an optimal schedule, based on a subtle connection to Hilbert bases of a related convex cone.
Published: 2024

15. Voice-Driven Mortality Prediction in Hospitalized Heart Failure Patients: A Machine Learning Approach Enhanced with Diagnostic Biomarkers

Author: Ahmadli, Nihat, Sarsil, Mehmet Ali, Mizrak, Berk, Karauzum, Kurtulus, Shaker, Ata, Tulumen, Erol, Mirzamidinov, Didar, Ural, Dilek, and Ergen, Onur
Subjects: Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Addressing heart failure (HF) as a prevalent global health concern poses difficulties in implementing innovative approaches for enhanced patient care. Predicting mortality rates in HF patients, in particular, is difficult yet critical, necessitating individualized care, proactive management, and enabling educated decision-making to enhance outcomes. Recently, the significance of voice biomarkers coupled with Machine Learning (ML) has surged, demonstrating remarkable efficacy, particularly in predicting heart failure. The synergy of voice analysis and ML algorithms provides a non-invasive and easily accessible means to evaluate patients' health. However, there is a lack of voice biomarkers for predicting mortality rates among heart failure patients with standardized speech protocols. Here, we demonstrate a powerful and effective ML model for predicting mortality rates in hospitalized HF patients through the utilization of voice biomarkers. By seamlessly integrating voice biomarkers into routine patient monitoring, this strategy has the potential to improve patient outcomes, optimize resource allocation, and advance patient-centered HF management. In this study, a Machine Learning system, specifically a logistic regression model, is trained to predict patients' 5-year mortality rates using their speech as input. The model performs admirably and consistently, as demonstrated by cross-validation and statistical approaches (p-value < 0.001). Furthermore, integrating NT-proBNP, a diagnostic biomarker in HF, improves the model's predictive accuracy substantially., Comment: 11 pages, 6 figures, 5 tables. The first 2 authors have contributed equally
Published: 2024

16. The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models

Author: Ergen, Tolga and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Due to the non-convex nature of training Deep Neural Network (DNN) models, their effectiveness relies on the use of non-convex optimization heuristics. Traditional methods for training DNNs often require costly empirical methods to produce successful models and do not have a clear theoretical foundation. In this study, we examine the use of convex optimization theory and sparse recovery models to refine the training process of neural networks and provide a better interpretation of their optimal weights. We focus on training two-layer neural networks with piecewise linear activations and demonstrate that they can be formulated as a finite-dimensional convex program. These programs include a regularization term that promotes sparsity, which constitutes a variant of group Lasso. We first utilize semi-infinite programming theory to prove strong duality for finite width neural networks and then we express these architectures equivalently as high dimensional convex sparse recovery models. Remarkably, the worst-case complexity to solve the convex program is polynomial in the number of samples and number of neurons when the rank of the data matrix is bounded, which is the case in convolutional networks. To extend our method to training data of arbitrary rank, we develop a novel polynomial-time approximation scheme based on zonotope subsampling that comes with a guaranteed approximation ratio. We also show that all the stationary points of the nonconvex training objective can be characterized as the global optimum of a subsampled convex program. Our convex models can be trained using standard convex solvers without resorting to heuristics or extensive hyper-parameter tuning unlike non-convex methods. Through extensive numerical experiments, we show that convex models can outperform traditional non-convex methods and are not sensitive to optimizer hyperparameters., Comment: A preliminary version of part of this work was published at ICML 2020 with the title "Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks"
Published: 2023

17. The Energy Prediction Smart-Meter Dataset: Analysis of Previous Competitions and Beyond

Author: Pekaslan, Direnc, Alonso-Moral, Jose Maria, Bandara, Kasun, Bergmeir, Christoph, Bernabe-Moreno, Juan, Eigenmann, Robert, Einecke, Nils, Ergen, Selvi, Godahewa, Rakshitha, Hewamalage, Hansika, Lago, Jesus, Limmer, Steffen, Rebhan, Sven, Rabinovich, Boris, Rajapasksha, Dilini, Song, Heda, Wagner, Christian, Wu, Wenlong, Magdalena, Luis, and Triguero, Isaac
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: This paper presents the real-world smart-meter dataset and offers an analysis of solutions derived from the Energy Prediction Technical Challenges, focusing primarily on two key competitions: the IEEE Computational Intelligence Society (IEEE-CIS) Technical Challenge on Energy Prediction from Smart Meter data in 2020 (named EP) and its follow-up challenge at the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) in 2021 (named as XEP). These competitions focus on accurate energy consumption forecasting and the importance of interpretability in understanding the underlying factors. The challenge aims to predict monthly and yearly estimated consumption for households, addressing the accurate billing problem with limited historical smart meter data. The dataset comprises 3,248 smart meters, with varying data availability ranging from a minimum of one month to a year. This paper delves into the challenges, solutions and analysing issues related to the provided real-world smart meter data, developing accurate predictions at the household level, and introducing evaluation criteria for assessing interpretability. Additionally, this paper discusses aspects beyond the competitions: opportunities for energy disaggregation and pattern detection applications at the household level, significance of communicating energy-driven factors for optimised billing, and emphasising the importance of responsible AI and data privacy considerations. These aspects provide insights into the broader implications and potential advancements in energy consumption prediction. Overall, these competitions provide a dataset for residential energy research and serve as a catalyst for exploring accurate forecasting, enhancing interpretability, and driving progress towards the discussion of various aspects such as energy disaggregation, demand response programs or behavioural interventions.
Published: 2023

18. Topological Expressivity of ReLU Neural Networks

Author: Ergen, Ekin and Grillo, Moritz
Subjects: Computer Science - Machine Learning, Computer Science - Discrete Mathematics, Mathematics - Algebraic Topology
Abstract: We study the expressivity of ReLU neural networks in the setting of a binary classification problem from a topological perspective. Recently, empirical studies showed that neural networks operate by changing topology, transforming a topologically complicated data set into a topologically simpler one as it passes through the layers. This topological simplification has been measured by Betti numbers, which are algebraic invariants of a topological space. We use the same measure to establish lower and upper bounds on the topological simplification a ReLU neural network can achieve with a given architecture. We therefore contribute to a better understanding of the expressivity of ReLU neural networks in the context of binary classification problems by shedding light on their ability to capture the underlying topological structure of the data. In particular the results show that deep ReLU neural networks are exponentially more powerful than shallow ones in terms of topological simplification. This provides a mathematically rigorous explanation why deeper networks are better equipped to handle complex and topologically rich data sets., Comment: 44 pages, to appear in COLT 2024
Published: 2023

19. Effects of Different Synthetic and Organic Fertilizer Applications on the Micromorphological Characteristics of Maize (Zea mays L.) Leaves and Some Silage Quality Traits

Author: Yildirim, Gözde Hafize, Yilmaz, Nuri, Soysal, Ayşe Özge Şimşek, Öztürk, Şükran, Akçin, Öznur Ergen, and Ay, Ebru Bati
Published: 2024
Full Text: View/download PDF

20. Successful Management of a Pediatric Patient with Humeral Lateral Condyle Non-union, Elbow Valgus Deformity and Ulnar Neuropathy

Author: Çoban, İdris, Karakaplan, Mustafa, Ergen, Emre, Aslantürk, Okan, Köroğlu, Muhammed, and Ertem, Kadir
Published: 2024
Full Text: View/download PDF

21. Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs

Author: Dwaraknath, Rajat Vadiraj, Ergen, Tolga, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Recently, theoretical analyses of deep neural networks have broadly focused on two directions: 1) Providing insight into neural network training by SGD in the limit of infinite hidden-layer width and infinitesimally small learning rate (also known as gradient flow) via the Neural Tangent Kernel (NTK), and 2) Globally optimizing the regularized training objective via cone-constrained convex reformulations of ReLU networks. The latter research direction also yielded an alternative formulation of the ReLU network, called a gated ReLU network, that is globally optimizable via efficient unconstrained convex programs. In this work, we interpret the convex program for this gated ReLU network as a Multiple Kernel Learning (MKL) model with a weighted data masking feature map and establish a connection to the NTK. Specifically, we show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data. A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set. By using iterative reweighting, we improve the weights induced by the NTK to obtain the optimal MKL kernel which is equivalent to the solution of the exact convex reformulation of the gated ReLU network. We also provide several numerical simulations corroborating our theory. Additionally, we provide an analysis of the prediction error of the resulting optimal kernel via consistency results for the group lasso., Comment: Accepted to Neurips 2023
Published: 2023

22. CalibFPA: A Focal Plane Array Imaging System based on Online Deep-Learning Calibration

Author: Güngör, Alper, Bahceci, M. Umut, Ergen, Yasin, Sözak, Ahmet, Ekiz, O. Oner, Yelboga, Tolga, and Çukur, Tolga
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Compressive focal plane arrays (FPA) enable cost-effective high-resolution (HR) imaging by acquisition of several multiplexed measurements on a low-resolution (LR) sensor. Multiplexed encoding of the visual scene is typically performed via electronically controllable spatial light modulators (SLM). An HR image is then reconstructed from the encoded measurements by solving an inverse problem that involves the forward model of the imaging system. To capture system non-idealities such as optical aberrations, a mainstream approach is to conduct an offline calibration scan to measure the system response for a point source at each spatial location on the imaging grid. However, it is challenging to run calibration scans when using structured SLMs as they cannot encode individual grid locations. In this study, we propose a novel compressive FPA system based on online deep-learning calibration of multiplexed LR measurements (CalibFPA). We introduce a piezo-stage that locomotes a pre-printed fixed coded aperture. A deep neural network is then leveraged to correct for the influences of system non-idealities in multiplexed measurements without the need for offline calibration scans. Finally, a deep plug-and-play algorithm is used to reconstruct images from corrected measurements. On simulated and experimental datasets, we demonstrate that CalibFPA outperforms state-of-the-art compressive FPA methods. We also report analyses to validate the design elements in CalibFPA and assess computational complexity.
Published: 2023

23. Cerebrotendinous Xanthomatosis patients with late diagnosed in single orthopedic clinic: two novel variants in the CYP27A1 gene

Author: Köroğlu, Muhammed, Karakaplan, Mustafa, Gündüz, Enes, Kesriklioğlu, Betül, Ergen, Emre, Aslantürk, Okan, and Özdemir, Zeynep Maraş
Published: 2024
Full Text: View/download PDF

24. Edge computing in future wireless networks: A comprehensive evaluation and vision for 6G and beyond

Author: Ergen, Mustafa, Saoud, Bilal, Shayea, Ibraheem, El-Saleh, Ayman A., Ergen, Onur, Inan, Feride, and Tuysuz, Mehmet Fatih
Published: 2024
Full Text: View/download PDF

25. Globally Optimal Training of Neural Networks with Threshold Activation Functions

Author: Ergen, Tolga, Gulluk, Halil Ibrahim, Lacotte, Jonathan, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Threshold activation functions are highly preferable in neural networks due to their efficiency in hardware implementations. Moreover, their mode of operation is more interpretable and resembles that of biological neurons. However, traditional gradient based algorithms such as Gradient Descent cannot be used to train the parameters of neural networks with threshold activations since the activation function has zero gradient except at a single non-differentiable point. To this end, we study weight decay regularized training problems of deep neural networks with threshold activations. We first show that regularized deep threshold network training problems can be equivalently formulated as a standard convex optimization problem, which parallels the LASSO method, provided that the last hidden layer width exceeds a certain threshold. We also derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network. We corroborate our theoretical results with various numerical experiments. Comment: Accepted to ICLR 2023
Published: 2023

26. Single-cell multiomic analysis of thymocyte development reveals drivers of CD4+ T cell and CD8+ T cell lineage commitment.

Author: Steier, Zoë, Aylard, Dominik, McIntyre, Laura, Baldwin, Isabel, Kim, Esther, Lutes, Lydia, Ergen, Can, Huang, Tse-Shun, Yosef, Nir, Robey, Ellen, and Streets, Aaron
Subjects: Mice, Animals, CD8-Positive T-Lymphocytes, Cell Lineage, CD4-Positive T-Lymphocytes, Thymocytes, Multiomics, Mice, Transgenic, Cell Differentiation, Receptors, Antigen, T-Cell, Thymus Gland, Histocompatibility Antigens Class I, CD4 Antigens
Abstract: The development of CD4+ T cells and CD8+ T cells in the thymus is critical to adaptive immunity and is widely studied as a model of lineage commitment. Recognition of self-peptide major histocompatibility complex (MHC) class I or II by the T cell antigen receptor (TCR) determines the CD8+ or CD4+ T cell lineage choice, respectively, but how distinct TCR signals drive transcriptional programs of lineage commitment remains largely unknown. Here we applied CITE-seq to measure RNA and surface proteins in thymocytes from wild-type and T cell lineage-restricted mice to generate a comprehensive timeline of cell states for each T cell lineage. These analyses identified a sequential process whereby all thymocytes initiate CD4+ T cell lineage differentiation during a first wave of TCR signaling, followed by a second TCR signaling wave that coincides with CD8+ T cell lineage specification. CITE-seq and pharmaceutical inhibition experiments implicated a TCR-calcineurin-NFAT-GATA3 axis in driving the CD4+ T cell fate. Our data provide a resource for understanding cell fate decisions and implicate a sequential selection process in guiding lineage choice.
Published: 2023

27. Effect of propofol induction on antioxidant defense system, cytokines, and cd4+ and cd8+ T cells in cats

Author: Didar AYDIN KAYA, Özlem GÜZEL, Duygu SEZER, Gülşen SEVİM, Erdal MATUR, Ezgi ERGEN, Feraye Esen GÜRSEL, and Gizem ATMACA
Subjects: antioxidant, cat, cytokine, propofol, t cells, Veterinary medicine, SF600-1100
Abstract: We investigated the effects of propofol on the antioxidant defense mechanisms, pro and anti-inflammatory cytokine expressions and specific defense processes in the study since these parameters play a significant role in postoperative complications, regulation of immune reactions, and wound healing. Twenty male cats were included in the study; anesthesia was induced by IV administration of 6 mg/kg of propofol. Blood samples were harvested right before (T0) and fifteen minutes after (T1) propofol injection. Serum malondialdehyde (MDA), catalase (CAT), superoxide dismutase (SOD), glutathione peroxidase (GSH-Px), IL-4, IL-8, TNF-α, IL-1β, and IFN-γ levels; the number of CD4+, CD8+ T cells and CD4/CD8 ratio in peripheral blood were determined. Propofol reduced the serum MDA and GSH-Px, while CAT and SOD levels remained unchanged. Furthermore, propofol did not impact serum IL-8, TNF-α, and IL-1β levels. Contrastingly, IFN-γ level tended to elevate, and serum IL-4 level was significantly increased. On the other hand, the CD8+ T cell population was significantly decreased, while the number of CD4+ T cells and the CD4/CD8 ratio were unaffected. Briefly, propofol did not adversely affect oxidative defense mechanisms, proinflammatory and anti-inflammatory cytokine cascade, and cell mediated immunity. Considering the insufficiency of cats' hepatic drug metabolism, we may conclude that propofol is a safe product regarding the investigated parameters.
Published: 2024
Full Text: View/download PDF

28. Convexifying Transformers: Improving optimization and understanding of transformer networks

Author: Ergen, Tolga, Neyshabur, Behnam, and Mehta, Harsh
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Statistics - Machine Learning
Abstract: Understanding the fundamental mechanism behind the success of transformer networks is still an open problem in the deep learning literature. Although their remarkable performance has been mostly attributed to the self-attention mechanism, the literature still lacks a solid analysis of these networks and interpretation of the functions learned by them. To this end, we study the training problem of attention/transformer networks and introduce a novel convex analytic approach to improve the understanding and optimization of these networks. Particularly, we first introduce a convex alternative to the self-attention mechanism and reformulate the regularized training problem of transformer networks with our alternative convex attention. Then, we cast the reformulation as a convex optimization problem that is interpretable and easier to optimize. Moreover, as a byproduct of our convex analysis, we reveal an implicit regularization mechanism, which promotes sparsity across tokens. Therefore, we not only improve the optimization of attention/transformer networks but also provide a solid theoretical understanding of the functions learned by them. We also demonstrate the effectiveness of our theory through several numerical experiments.
Published: 2022

29. The Fe-N system: crystal structure prediction, phase stability, and mechanical properties

Author: Bao, Ergen, Zhao, Jinbin, Gao, Qiang, Shahid, Ijaz, Ma, Hui, Luo, Yixiu, Liu, Peitao, Sun, Yan, and Chen, Xing-Qiu
Published: 2025
Full Text: View/download PDF

30. Handover decision with multi-access edge computing in 6G networks: A survey

Author: Jahandar, Saeid, Shayea, Ibraheem, Gures, Emre, El-Saleh, Ayman A., Ergen, Mustafa, and Alnakhli, Mohammad
Published: 2025
Full Text: View/download PDF

31. Evaluation of the Diversities in the Inflammatory Responses in Cats With Bacterial and Viral Infections

Author: Songul Erhan, Bengu Bilgic, Ezgi Ergen, Mert Erek, Elif Ergul Ekiz, Mukaddes Ozcan, Mehmet Erman Or, Banu Dokuzeylul, and Erdal Matur
Subjects: cat, CRP, IL‐6, inflammation, TGF‐β, Veterinary medicine, SF600-1100
Abstract: Background: Understanding the nature of inflammatory responses in cats with bacterial and viral infections is essential for accurately managing the infection. This study aimed to investigate the diversities of inflammatory responses between bacterial and viral infections in cats to figure out their role in the pathophysiology of these infections. Methods: Seventy‐five owned cats were included in the study. The evaluations were performed based on three groups: healthy control, bacterial infection group (those with bronchopneumonia and gastrointestinal tract and urinary tract infections) and viral infection group (21 with feline coronavirus [FCoV], 3 with feline leukaemia virus [FeLV] and 1 with feline calicivirus), each containing 25 individuals. Total and differential leukocyte counts, C‐reactive protein (CRP), transforming growth factor beta (TGF‐β), interleukin‐6 (IL‐6), tumour necrosis factor‐alpha (TNF‐α), interleukin‐1 beta (IL‐1β) and interleukin‐10 (IL‐10) concentrations were assessed in the blood samples collected from sick and healthy animals. Results: No statistically significant difference was noted in serum TNF‐α, IL‐1β and IL‐10 concentrations of the infected cats (p = 0.996, p = 0.160 and p = 0.930, respectively). Serum TGF‐β concentration in the viral infection group was reduced compared to the healthy control (p = 0.001). In contrast, WBC count and IL‐6 and CRP concentrations were increased in the cats with bronchopneumonia, gastrointestinal tract infections and urinary tract infections compared to the healthy control and viral infection groups (p = 0.001, p = 0.001 and p = 0.001, respectively). Conclusion: This study revealed significant differences between bacterial and viral infections regarding the fashion of inflammatory responses in cats, and the relevant data will undoubtedly contribute to the management and control of feline infectious diseases, rendering the development of novel therapeutic strategies.
Published: 2024
Full Text: View/download PDF

32. GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction

Author: Ozturkler, Batu, Sahiner, Arda, Ergen, Tolga, Desai, Arjun D, Sandino, Christopher M, Vasanawala, Shreyas, Pauly, John M, Mardani, Morteza, and Pilanci, Mert
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction. These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization. However, they require several iterations of a large neural network to handle high-dimensional imaging tasks such as 3D MRI. This limits traditional training algorithms based on backpropagation due to prohibitively large memory and compute requirements for calculating gradients and storing intermediate activations. To address this challenge, we propose Greedy LEarning for Accelerated MRI (GLEAM) reconstruction, an efficient training strategy for high-dimensional imaging settings. GLEAM splits the end-to-end network into decoupled network modules. Each module is optimized in a greedy manner with decoupled gradient updates, reducing the memory footprint during training. We show that the decoupled gradient updates can be performed in parallel on multiple graphical processing units (GPUs) to further reduce training time. We present experiments with 2D and 3D datasets including multi-coil knee, brain, and dynamic cardiac cine MRI. We observe that: i) GLEAM generalizes as well as state-of-the-art memory-efficient baselines such as gradient checkpointing and invertible networks with the same memory footprint, but with 1.3x faster training; ii) for the same memory footprint, GLEAM yields 1.1dB PSNR gain in 2D and 1.8 dB in 3D over end-to-end baselines.
Published: 2022

33. AI-enabled routing in next generation networks: A survey

Author: Aktas, Fatma, Shayea, Ibraheem, Ergen, Mustafa, Saoud, Bilal, Yahya, Abdulsamad Ebrahim, and Laura, Aldasheva
Published: 2025
Full Text: View/download PDF

34. Latent Tuberculosis Infection Management in Solid Organ Transplantation Recipients: A National Snapshot

Author: Aylin Özgen Alpaydın, Tuba Yeter Turunç, Vildan Avkan-Oğuz, Füsun Öner-Eyüboğlu, Elif Tükenmez-Tigen, İmran Hasanoğlu, Güle Aydın, Yasemin Tezer-Tekçe, Seniha Şenbayrak, Filiz Kızılateş, Adalet Altunsoy Aypak, Sibel Altunışık-Toplu, Pınar Ergen, Behice Kurtaran, Meltem Işıkgöz Taşbakan, Ayşegül Yıldırım, Serkan Yıldız, Kenan Çalışkan, Ebru Ayvazoğlu, Ender Dulundu, Ebru Şengül Şeref Parlak, İrem Akdemir, Melih Kara, Sinan Türkkan, Kübra Demir-Önder, Ezgi Yenigün, Aslı Turgut, Sabahat Alışır Ecder, Saime Paydaş, Tansu Yamazhan, Tufan Egeli, Rüya Özelsancak, Arzu Velioğlu, Mehmet Kılıç, Alpay Azap, Erdal Yekeler, Tuğrul Çakır, Yaşar Bayındır, Asiye Kanbay, Ferit Kuşcu, Kemal Osman Memikoğlu, Nazan Şen, Erhan Kabasakal, and Gülden Ersöz
Subjects: Medicine
Abstract: OBJECTIVE: Latent tuberculosis infection (LTBI) screening is strongly recommended in the pre-transplant evaluation of solid organ transplant (SOT) recipients, although it remains inadequate in many transplant centers. We decided to investigate pre-transplant TB risk assessment, LTBI treatment, and registry rates in Turkey. MATERIAL AND METHODS: Adult SOT recipients who underwent tuberculin skin test (TST) and/or interferon-gamma release test (IGRA) from 14 centers between 2015 and 2019 were included in the study. An induration of ≥5 mm on TST and/or probable/positive IGRA (QuantiFERON-TB) was considered positive for LTBI. Demographic features, LTBI screening and treatment, and pre-/post-transplant TB history were recorded from the electronic database of transplantation units across the country and pooled at a single center for a unified database. RESULTS: TST and/or IGRA were performed in 766 (33.8%) of 2266 screened patients most of whom were kidney transplant recipients (n = 485, 63.4%). LTBI screening test was positive in 359 (46.9%) patients, and isoniazid was given to 203 (56.5%) patients. Of the patients treated for LTBI, 112 (55.2%) were registered in the national registry, and 82 (73.2%) completed the treatment. Tuberculosis developed in 6 (1.06%) of 563 patients who were not offered LTBI treatment. CONCLUSION: We determined that overall, only one-third of SOT recipients in our country were evaluated in terms of TB risk, only 1 of the 2 SOT recipients with LTBI received treatment, and half were registered. Therefore, we want to emphasize the critical importance of pretransplant TB risk stratification and registration, guided by revised national guidelines.
Published: 2024
Full Text: View/download PDF

35. Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers

Author: Sahiner, Arda, Ergen, Tolga, Ozturkler, Batu, Pauly, John, Mardani, Morteza, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Mathematics - Optimization and Control
Abstract: Vision transformers using self-attention or its proposed alternatives have demonstrated promising results in many image related tasks. However, the underpinning inductive bias of attention is not well understood. To address this issue, this paper analyzes attention through the lens of convex duality. For the non-linear dot-product self-attention, and alternative mechanisms such as MLP-mixer and Fourier Neural Operator (FNO), we derive equivalent finite-dimensional convex problems that are interpretable and solvable to global optimality. The convex programs lead to block nuclear-norm regularization that promotes low rank in the latent feature and token dimensions. In particular, we show how self-attention networks implicitly cluster the tokens, based on their latent similarity. We conduct experiments for transferring a pre-trained transformer backbone for CIFAR-100 classification by fine-tuning a variety of convex attention heads. The results indicate the merits of the bias induced by attention compared with the existing MLP or linear heads. Comment: 38 pages, 2 figures. To appear in ICML 2022
Published: 2022

36. The Experiences of Classroom Teachers on the Homework Process in Teaching Mathematics: An Interpretative Phenomenological Analysis

Author: Ergen, Yusuf and Durmus, Mehmet Emin
Abstract: This phenomenological study aimed to explore a group of classroom teachers' experiences with homework assignment in teaching mathematics. The participants of the study were 27 classroom teachers who were selected using the criterion-based sampling technique. The research data were collected with a semi-structured interview form developed by the researchers and subjected to interpretive phenomenological analysis. The results showed that the teachers plan the homework they would assign the evening before the class and use resources available on the internet while planning it. It was also revealed that they assign mathematics homework for various purposes such as ensuring comprehension of the subjects, knowledge retention and use of the learned subjects in daily life. They reported that they check and provide feedback on the assigned homework during the classes, that the assigned homework is sometimes done by the family members of the students, and that both preparing and checking the homework take an extensive amount of time. As a solution to these problems, they suggested communicating and negotiating with the parents, getting support from school counselors and reducing the number of the themes in the primary mathematics curriculum.
Published: 2021

37. Experience of Primary School Teachers with Inclusion Students in the Context of Teaching Mathematics: A Case Study

Author: Durmus, Mehmet Emin and Ergen, Yusuf
Abstract: This study investigated experiences of primary school teachers with inclusion students in the context of teaching mathematics. In the study the phenomenology design, which is one of the qualitative research designs, was used. The participants were determined by criterion-based sampling method. The participants of the study consisted of 21 primary school teachers with inclusion students. Research data were collected by a semi-structured interview form developed by the researchers. Content analysis method was used for data analysis. The study found that the participants mostly used rough evaluation forms in order to determine the gains when preparing IEP. In addition, most of the participants stated that they needed help in preparing IEP and that they received most of the help from school counselors. The participants stated that they mostly used demonstration, teaching with play and drama methods and that they could practice with inclusion students only during breaks, during social activities times or in the hours in the support training room apart from mathematics class. Moreover, it was found that most of the participants verbally measured the gains of the inclusion students by the question and answer method. It was also concluded that inadequacy of time was the most common problem they encountered in the process of learning-teaching and assessment for the mathematics class.
Published: 2021

38. Role of VDR gene polymorphisms and vitamin D levels in normal and overweight patients with PCOS

Author: Sağlam, Zümrüt Mine Işik, Bakir, Vuslat Lale, Ataş, Merve Nur, and Ergen, H. Arzu
Published: 2024
Full Text: View/download PDF

39. Integration of 5G, 6G and IoT with Low Earth Orbit (LEO) networks: Opportunity, challenges and future trends

Author: Ibraheem Shayea, Ayman A. El-Saleh, Mustafa Ergen, Bilal Saoud, Riad Hartani, Derya Turan, and Adnan Kabbani
Subjects: Fifth generation (5G), Future mobile broadband networks, Integration, Land mobile satellite system, Satellite, Satellite challenges, Technology
Abstract: The rapid growth of the massive smart Internet of Things (IoT) with mobile connections, the enhanced Mobile Broadband (eMBB) and the high demand for building a connected and intelligent world increase the probability that mobile satellite systems will be a major network in providing internet communication services in the future. Currently, mobile satellite systems are envisioned as a significant solution for providing mobile services in different settings and for various vital objectives. These satellite systems have special qualities in each of these situations, including extensive coverage area, robustness, and ability to broadcast/multicast. Among the different types of satellite systems, Low Earth Orbit (LEO) systems are the most promising technology for offering internet services. However, LEO systems still face certain restrictions with respect to connectivity, stability, and mobility support, which make communication unreliable. Therefore, the aim of this paper is to explain LEO systems and services in a comprehensive manner from a variety of perspectives. The paper focuses on key aspects of mobile internet based on satellite systems. It illustrates the integration of LEO systems with fifth and sixth generations of mobile cellular networks as well as with IoT networks. It discusses the problems arising from the integration of cellular, IoT, and satellite systems, on the basis of which future research plans are outlined.
Published: 2024
Full Text: View/download PDF

40. Is the brightness-contrast level of virtual reality videos significant for visually induced motion sickness? Experimental real-time biosensor and self-report analysis

Author: Emel Ugur, Bahriye Ozlem Konukseven, Mehmet Ergen, Mehmet Emin Aksoy, and Serhat Ilgaz Yoner
Subjects: virtual reality, visually induced motion sickness, youtube VR, electrodermal activity, simulator sickness questionnaire, Electronic computers. Computer science, QA75.5-76.95
Abstract: Background: Virtual reality is no longer created solely with design graphics. Real life 360° videos created with special shooting techniques are now offered as open access to users’ experience. As a result, this widespread use of VR systems has increased the incidence of visually induced motion sickness. Objective: In the present study, we aimed to investigate impact of brightness-contrast levels of real-life 360° videos on susceptibility to visually induced motion sickness during immersive virtual reality headset viewing. Methods: In this study, 360° real-world day and night driving videos publicly available on YouTube VR were used as stimuli. Stimuli were presented in 2-min segments. Electrodermal activity was recorded throughout the stimulus presentation, and SSQ was administered immediately afterward. Results: No significant difference was found between the experiments in terms of dermal activity. There is a statistically significant difference in total SSQ scores and in symptoms of fatigue, eye strain, head fullness, blurred vision, and dizziness (p < 0.005; p < 0.01) after the night video. Conclusion: The present study examined the likely impact of brightness and contrast levels in VR environments on VIMS provocation.
Published: 2024
Full Text: View/download PDF

41. DestVI identifies continuums of cell types in spatial transcriptomics data

Author: Lopez, Romain, Li, Baoguo, Keren-Shaul, Hadas, Boyeau, Pierre, Kedmi, Merav, Pilzer, David, Jelinski, Adam, Yofe, Ido, David, Eyal, Wagner, Allon, Ergen, Can, Addadi, Yoseph, Golani, Ofra, Ronchese, Franca, Jordan, Michael I, Amit, Ido, and Yosef, Nir
Subjects: Biological Sciences, Bioinformatics and Computational Biology, Genetics, Biotechnology, Bioengineering, Human Genome, 1.1 Normal biological development and functioning, Underpinning research, Generic health relevance, Animals, Gene Expression Profiling, Mice, Neoplasms, Single-Cell Analysis, Software, Transcriptome, Exome Sequencing
Abstract: Most spatial transcriptomics technologies are limited by their resolution, with spot sizes larger than that of a single cell. Although joint analysis with single-cell RNA sequencing can alleviate this problem, current methods are limited to assessing discrete cell types, revealing the proportion of cell types inside each spot. To identify continuous variation of the transcriptome within cells of the same type, we developed Deconvolution of Spatial Transcriptomics profiles using Variational Inference (DestVI). Using simulations, we demonstrate that DestVI outperforms existing methods for estimating gene expression for every cell type inside every spot. Applied to a study of infected lymph nodes and of a mouse tumor model, DestVI provides high-resolution, accurate spatial characterization of the cellular organization of these tissues and identifies cell-type-specific changes in gene expression between different tissue regions or between conditions. DestVI is available as part of the open-source software package scvi-tools ( https://scvi-tools.org ).
Published: 2022

42. Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks

Author: Ergen, Tolga and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Understanding the fundamental principles behind the success of deep neural networks is one of the most important open questions in the current literature. To this end, we study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape. We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases. We then show that pathwise regularized training problems can be represented as an exact convex optimization problem. We further prove that the equivalent convex problem is regularized via a group sparsity inducing norm. Thus, a path regularized parallel ReLU network can be viewed as a parsimonious convex model in high dimensions. More importantly, since the original training problem may not be trainable in polynomial-time, we propose an approximate algorithm with a fully polynomial-time complexity in all data dimensions. Then, we prove strong global optimality guarantees for this algorithm. We also provide experiments corroborating our theory. Comment: Accepted to NeurIPS 2023
Published: 2021

43. Parallel Deep Neural Networks Have Zero Duality Gap

Author: Wang, Yifei, Ergen, Tolga, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: Training deep neural networks is a challenging non-convex optimization problem. Recent work has proven that the strong duality holds (which means zero duality gap) for regularized finite-width two-layer ReLU networks and consequently provided an equivalent convex training problem. However, extending this result to deeper networks remains to be an open problem. In this paper, we prove that the duality gap for deeper linear networks with vector outputs is non-zero. In contrast, we show that the zero duality gap can be obtained by stacking standard deep networks in parallel, which we call a parallel architecture, and modifying the regularization. Therefore, we prove the strong duality and existence of equivalent convex problems that enable globally optimal training of deep networks. As a by-product of our analysis, we demonstrate that the weight decay regularization on the network parameters explicitly encourages low-rank solutions via closed-form expressions. In addition, we show that strong duality holds for three-layer standard ReLU networks given rank-1 data matrices.
Published: 2021

44. Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

Author: Ergen, Tolga and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computational Complexity, Statistics - Machine Learning
Abstract: Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis is yet to be developed. In this paper, we develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. We first show that the training of multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher dimensional space, where sparsity is enforced via a group $\ell_1$-norm regularization. Consequently, ReLU networks can be interpreted as high dimensional feature selection methods. More importantly, we then prove that the equivalent convex problem can be globally optimized by a standard convex optimization solver with a polynomial-time complexity with respect to the number of samples and data dimension when the width of the network is fixed. Finally, we numerically validate our theoretical results via experiments involving both synthetic and real datasets. Comment: Accepted to ICML 2021
Published: 2021

45. Processing 2D barcode data with metaheuristic based CNN models and detection of malicious PDF files

Author: Toğaçar, Mesut and Ergen, Burhan
Published: 2024
Full Text: View/download PDF

46. Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions

Author: Sahiner, Arda, Ergen, Tolga, Ozturkler, Batu, Bartan, Burak, Pauly, John, Mardani, Morteza, and Pilanci, Mert
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Generative Adversarial Networks (GANs) are commonly used for modeling complex distributions of data. Both the generators and discriminators of GANs are often modeled by neural networks, posing a non-transparent optimization problem which is non-convex and non-concave over the generator and discriminator, respectively. Such networks are often heuristically optimized with gradient descent-ascent (GDA), but it is unclear whether the optimization problem contains any saddle points, or whether heuristic methods can find them in practice. In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games. Using this convex duality interpretation, we further demonstrate the impact of different activation functions of the discriminator. Our observations are verified with numerical results demonstrating the power of the convex interpretation, with applications in progressive training of convex architectures corresponding to linear generators and quadratic-activation discriminators for CelebA image generation. The code for our experiments is available at https://github.com/ardasahiner/ProCoGAN. Comment: Published as paper in ICLR 2022. First two authors contributed equally to this work; 34 pages, 11 figures
Published: 2021

47. Retinal vascular and structural recovery analysis by optical coherence tomography angiography after endoscopic decompression in sellar/parasellar tumors

Author: Anil Ergen, Sebnem Kaya Ergen, Busra Gunduz, Sevgi Subasi, Melih Caklili, Burak Cabuk, Ihsan Anik, and Savas Ceylan
Subjects: Medicine, Science
Abstract: We assessed the potential retinal microcirculation alterations for postoperative visual recovery in sellar/parasellar tumor patients with Optical Coherence Tomography Angiography (OCT-A). Two hundred ten eyes with sellar/parasellar tumor for which preoperative and postoperative (3 months) MRI Scans, Visual Acuity Test, Optical Coherence Tomography (OCT), OCT-A, and Visual Field Test data were available, along with 92 healthy eyes, were evaluated. In the preoperative phase, significant reductions were observed in retinal vascular densities in various regions, including the Superficial Retinal Capillary Plexus (SRCP) (whole: p
Published: 2023
Full Text: View/download PDF

48. The anatomy, micromorphology, and essential oils of the Turkish endemic and endangered species Alchemilla orduensis

Author: Ergen Akçin, Öznur, Özbucak, Tuğba, Öztürk, Şükran, and Ümit Uzunömeroğlu, Hüseyin
Abstract: In this study, the anatomical and micromorphological characteristics of the vegetative organs and the essential oil constituents of the aerial and underground parts of the local and endangered endemic species A. orduensis Pawł. were evaluated. For the anatomical study, sections of root, rhizome, stem, leaves and petiole were excised and stained with a safranin/fast green mixture. Leaf and petiole structures were examined micromorphologically. Essential oil contents were determined by headspace solid-phase microextraction coupled with gas chromatography-mass spectrometry (HS-SPME/GC-MS) analysis. The results showed that rectangular meristematic cells were present in the root. The leaf is of the bifacial and amphistomatic type. Stomata cells are of the anomocytic type. The stomatal index for the upper surface of the leaves is 0.04, while the stomatal index for the lower surface is 0.17. Druse crystals were found in the rhizome, stem and leaves. Among the various compounds identified, the most abundant groups in the aboveground parts are alcohols (39.81%) and ketones (14.99%), with 1-Octen-3-ol, 1-octan-3-one and borane-methyl sulfide complex as the main compounds. Terpenes (23.44%) and alcohols (11.82%), in which myrtenol was the main compound, were most abundant in the underground parts.
Published: 2025

49. Resilience in service firms: the impact of social capital on firm performance during turmoil

Author: Ergen Keleş, Fatma Hilal and Keleş, Emrah
Published: 2023
Full Text: View/download PDF

50. Value Literacy -- A New Model for Education of Character and Values

Author: Ergen, Gürkan
Abstract: Life cannot exist without interaction and communication, nor without choices. In communication, firstly every source and then every message has a value; likewise, every choice humans are to make each and every second is a result of an e'value'tion. Because there exists no moment or field without 'value' and e'value'tion, they end up in numerous value exchanges and e'value'tion processes occurring in numerous ways. Under all disagreements and conflicts lies the failure to perform a proper analysis of the value to be passed across in the exchange of value before us in particular and then this limitless value exchange process and the values governing our choices and the consequences thereof for ourselves and people around us. The primary concern of the present paper is to discuss the conceptualization of 'value literacy' as a learning model allowing for an analysis of this kind. This is an analytical study based on an exhaustive review of the literature related to literacy, values, and character education. By allowing for the construction of relationships with an elevated awareness and sensitivity concerning the values underlying interpersonal relationships, 'value literacy' has a notable potential for the resolution of conflicts in these relationships and reasoning thereof worthy of human dignity and for its capacity to complete literacy types such as values education, character education, personality development, emotion management, ethics literacy.
Published: 2019
