Author: "Ebrahimi A." - Searchworks@Jio Institute Digital Library Search Results

1. Adaptive Group Robust Ensemble Knowledge Distillation

Author: Kenfack, Patrik, Aïvodji, Ulrich, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Neural networks can learn spurious correlations in the data, often leading to performance disparity for underrepresented subgroups. Studies have demonstrated that the disparity is amplified when knowledge is distilled from a complex teacher model to a relatively "simple" student model. Prior work has shown that ensemble deep learning methods can improve the performance of the worst-case subgroups; however, it is unclear if this advantage carries over when distilling knowledge from an ensemble of teachers, especially when the teacher models are debiased. This study demonstrates that traditional ensemble knowledge distillation can significantly drop the performance of the worst-case subgroups in the distilled student model even when the teacher models are debiased. To overcome this, we propose Adaptive Group Robust Ensemble Knowledge Distillation (AGRE-KD), a simple ensembling strategy to ensure that the student model receives knowledge beneficial for unknown underrepresented subgroups. Leveraging an additional biased model, our method selectively chooses teachers whose knowledge would better improve the worst-performing subgroups by upweighting the teachers with gradient directions deviating from the biased model. Our experiments on several datasets demonstrate the superiority of the proposed ensemble distillation technique and show that it can even outperform classic model ensembles based on majority voting.
Comment: Workshop Algorithmic Fairness through the Lens of Metrics and Evaluation at NeurIPS 2024
Published: 2024

2. An All-in-one Approach for Accelerated Cardiac MRI Reconstruction

Author: Hamedani, Kian Anvari, Razizadeh, Narges, Nabavi, Shahabedin, and Moghaddam, Mohsen Ebrahimi
Subjects: Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Cardiovascular magnetic resonance (CMR) imaging is the gold standard for diagnosing several heart diseases due to its non-invasive nature and proper contrast. MR imaging is time-consuming because of signal acquisition and image formation issues. Prolonging the imaging process can result in the appearance of artefacts in the final image, which can affect the diagnosis. It is possible to speed up CMR imaging using image reconstruction based on deep learning. For this purpose, high-quality, clinically interpretable images can be reconstructed by acquiring highly undersampled k-space data, which is only partially filled, and using a deep learning model. In this study, we propose a stepwise reconstruction approach based on the Patch-GAN structure for highly undersampled k-space data, compatible with the multi-contrast nature, various anatomical views and trajectories of CMR imaging. The proposed approach was validated using the CMRxRecon2024 challenge dataset and outperformed previous studies. The structural similarity index measure (SSIM) values for the first and second tasks of the challenge are 99.07 and 97.99, respectively. This approach can accelerate CMR imaging to obtain high-quality images, more accurate diagnosis and a pleasant patient experience.
Published: 2024

3. Safety Filter Design for Articulated Frame Steering Vehicles In the Presence of Actuator Dynamics Using High-Order Control Barrier Functions

Author: Toulkani, Naeim Ebrahimi and Ghabcheloo, Reza
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: Articulated Frame Steering (AFS) vehicles are widely used in heavy-duty industries, where they often operate near operators and laborers. Therefore, designing safe controllers for AFS vehicles is essential. In this paper, we develop a Quadratic Program (QP)-based safety filter that ensures feasibility for AFS vehicles with affine actuator dynamics. To achieve this, we first derive the general equations of motion for AFS vehicles, incorporating affine actuator dynamics. We then introduce a novel High-Order Control Barrier Function (HOCBF) candidate with equal relative degrees for both system controls. Finally, we design a Parametric Adaptive HOCBF (PACBF) and an always-feasible, QP-based safety filter. Numerical simulations of AFS vehicle kinematics demonstrate the effectiveness of our approach.
Published: 2024
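
Illustrative note (not part of the catalog record): the idea of a QP-based safety filter can be sketched on a far simpler system than the AFS vehicle in the abstract. The sketch below assumes a scalar integrator x' = u with a first-order barrier h(x) = x_max - x, not the high-order or parametric adaptive barrier of the paper. The names cbf_filter, simulate, x_max and alpha are hypothetical. For a single affine constraint the QP has a closed-form solution, so no solver is needed.

```python
def cbf_filter(x, u_nom, x_max=1.0, alpha=2.0):
    """Return the input closest to u_nom that satisfies the barrier condition.

    Assumed toy system: x' = u, with barrier h(x) = x_max - x.
    Barrier condition: dh/dt + alpha * h >= 0, i.e. -u + alpha * (x_max - x) >= 0.
    """
    u_bound = alpha * (x_max - x)
    # The QP  min (u - u_nom)^2  s.t.  u <= u_bound  is solved by clipping.
    return min(u_nom, u_bound)


def simulate(x0=0.0, u_nom=5.0, dt=0.01, steps=500):
    """Forward-Euler rollout of the filtered system from x0."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nom)
        trajectory.append(x)
    return trajectory
```

In this toy setting the nominal input pushes the state toward the boundary x_max, and the filter only intervenes when the nominal input would violate the barrier condition.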

4. Reducing Conservativeness of Controlled-Invariant Safe Sets by Introducing a Novel Synthesis of Control Barrier Certificates

Author: Toulkani, Naeim Ebrahimi and Ghabcheloo, Reza
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: Finding a controlled-invariant safe set for a given system with state and control constraints plays an important role in safety-critical systems. Current methods typically produce conservative solutions. In this paper, we introduce a method to generate controlled-invariant safe sets for nonlinear polynomial control-affine dynamical systems by using the notion of Control Barrier Certificates (CBCs). To this end, we relax CBC conditions into Sum of Squares (SOS) constraints, to be solved by an SOS program. We first assume a controlled-invariant safe set (although small) exists for the system. We then propose a method to iteratively enlarge the safe set. We theoretically prove that our method enlarges the safe set in each iteration. We also demonstrate the efficacy of our method through simulated numerical examples in 2D and 3D for single and multi-input dynamical systems and empirically show that our method produces a larger controlled-invariant safe set in these examples, compared to a state-of-the-art technique using Control Barrier Function (CBF).
Published: 2024

5. The Exponential Lie Series and a Chen-Strichartz Formula for Levy Processes

Author: Ebrahimi-Fard, Kurusch, Patras, Frederic, and Wiese, Anke
Subjects: Mathematics - Probability, Mathematics - Combinatorics
Abstract: In this paper, we derive a Chen-Strichartz formula for stochastic differential equations driven by Levy processes, that is, we derive a series expansion of the logarithm of the flowmap of the stochastic differential equation in terms of commutators of vector fields with stochastic coefficients, and we provide an explicit formula for the components in this series. The stochastic components are generated by the Levy processes that drive the stochastic differential equation and their quadratic variation and power jumps; the vector fields are given as linear combinations of commutators of elements in the pre-Lie Magnus expansion generated by the original vector fields governing our stochastic differential equation. In particular, we show the logarithm of the flowmap is a Lie series. These results extend previous results for deterministic differential equations and continuous stochastic differential equations. For these, the Chen-Strichartz series has been shown to play a pivotal role in the design of numerical integration schemes that preserve qualitative properties of the solution such as the construction of geometric numerical schemes and in the context of efficient numerical schemes.
Comment: 25 pages
Published: 2024

6. On definable subcategories

Author: Ebrahimi, Ramin
Subjects: Mathematics - Category Theory, Mathematics - Rings and Algebras, Mathematics - Representation Theory
Abstract: Let $\mathcal{X}$ be a skeletally small additive category. Using the canonical equivalence between two different presentations of the free abelian category over $\mathcal{X}$, we give a new and simple characterization of definable subcategories of $\rm Mod\text{-}\mathcal{X}$, and in particular definable subcategories of modules over rings. In the end, we give a conceptual proof of Auslander-Gruson-Jensen duality, which makes the duality between definable subcategories of left and right modules more transparent.
Published: 2024

7. Fluid flow inside slit-shaped nanopores: the role of surface morphology at the molecular scale

Author: Marcelli, Giorgia, Montandon, Tecla Bottinelli, Viand, Roya Ebrahimi, and Höfling, Felix
Subjects: Condensed Matter - Soft Condensed Matter, Physics - Chemical Physics, Physics - Computational Physics
Abstract: Non-equilibrium molecular dynamics (NEMD) simulations of fluid flow have highlighted the peculiarities of nanoscale flows compared to classical fluid mechanics. In particular, boundary conditions can deviate from the no-slip behavior at macroscopic scales due to various factors. In this context, we investigate the influence of surface morphology in slit-shaped nanopores on the fluid flow. We demonstrate that the surface morphology effectively controls the slip length, which approaches zero when the molecular structures of the pore wall and the fluid are matched. Using boundary-driven, energy-conserving NEMD simulations with a pump-like driving mechanism, we examine two types of pore walls--mimicking a crystalline and an amorphous material--that exhibit markedly different surface resistances to flow. The resulting flow velocity profiles are consistent with Hagen-Poiseuille theory for incompressible, Newtonian fluids when adjusted for surface slip. For the two pores, we observe partial slip and no-slip behavior, respectively, which correlate with fluid layering and depletion near the surfaces. However, the confinement of the fluid gives rise to an effective viscosity that varies substantially with the pore width. Analysis of the hydrodynamic permeability shows that the simulated flows are in the Darcy regime. Additionally, the thermal isolation of the flow causes a linear increase in fluid temperature along the flow, which we relate to strong viscous dissipation and heat convection, utilizing conservation laws of fluid mechanics. Our findings underscore the need for molecular-scale modeling to accurately capture the fluid dynamics near boundaries and in nanoporous materials, where macroscopic models may not be applicable.
Published: 2024

8. Observation of nonaxisymmetric standard magnetorotational instability induced by a free-shear layer

Author: Wang, Yin, Ebrahimi, Fatima, Lu, Hongke, Goodman, Jeremy, Gilson, Erik P., and Ji, Hantao
Subjects: Astrophysics - High Energy Astrophysical Phenomena, Physics - Plasma Physics
Abstract: The standard magnetorotational instability (SMRI) is widely believed to be responsible for the observed accretion rates in astronomical disks. It is a linear instability triggered in the differentially rotating ionized disk flow by a magnetic field component parallel to the rotation axis. Most studies focus on axisymmetric SMRI in conventional base flows with a Keplerian profile for accretion disks or an ideal Couette profile for Taylor-Couette flows, since excitation of nonaxisymmetric SMRI in such flows requires a magnetic Reynolds number Rm more than an order of magnitude larger. Here, we report that in a magnetized Taylor-Couette flow, nonaxisymmetric SMRI can be destabilized in a free-shear layer in the base flow at Rm $\gtrsim$ 1, the same threshold as for axisymmetric SMRI. Global linear analysis reveals that the free-shear layer reduces the required Rm, possibly by introducing an extremum in the vorticity of the base flow. Nonlinear simulations validate the results from linear analysis and confirm that a novel instability recently discovered experimentally (Nat. Commun. 13, 4679 (2022)) is the nonaxisymmetric SMRI. Our finding has astronomical implications since free-shear layers are ubiquitous in celestial systems.
Published: 2024

9. Does GenAI Make Usability Testing Obsolete?

Author: Pourasad, Ali Ebrahimi and Maalej, Walid
Subjects: Computer Science - Software Engineering, Computer Science - Human-Computer Interaction
Abstract: Ensuring usability is crucial for the success of mobile apps. Usability issues can compromise user experience and negatively impact the perceived app quality. This paper presents UX-LLM, a novel tool powered by a Large Vision-Language Model that predicts usability issues in iOS apps. To evaluate the performance of UX-LLM we predicted usability issues in two open-source apps of medium complexity and asked usability experts to assess the predictions. We also performed traditional usability testing and expert review for both apps and compared the results to those of UX-LLM. UX-LLM demonstrated precision ranging from 0.61 to 0.66 and recall between 0.35 and 0.38, indicating its ability to identify valid usability issues, yet failing to capture the majority of issues. Finally, we conducted a focus group with an app development team of a capstone project developing a transit app for visually impaired persons. The focus group expressed positive perceptions of UX-LLM as it identified unknown usability issues in their app. However, they also raised concerns about its integration into the development workflow, suggesting potential improvements. Our results show that UX-LLM cannot fully replace traditional usability evaluation methods but serves as a valuable supplement, particularly for small teams with limited resources, to identify issues in less common user paths, due to its ability to inspect the source code.
Comment: Accepted for publication at The 47th IEEE/ACM International Conference on Software Engineering ICSE 2025
Published: 2024

10. KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation

Author: Azimi, Rambod, Rishav, Rishav, Teichmann, Marek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Large language models (LLMs) have demonstrated remarkable performance across various downstream tasks. However, the high computational and memory requirements of LLMs are a major bottleneck. To address this, parameter-efficient fine-tuning (PEFT) methods such as low-rank adaptation (LoRA) have been proposed to reduce computational costs while ensuring minimal loss in performance. Additionally, knowledge distillation (KD) has been a popular choice for obtaining compact student models from teacher models. In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. Our results demonstrate that KD-LoRA achieves performance comparable to full fine-tuning (FFT) and LoRA while significantly reducing resource requirements. Specifically, KD-LoRA retains 98% of LoRA's performance on the GLUE benchmark, while being 40% more compact. Additionally, KD-LoRA reduces GPU memory usage by 30% compared to LoRA, while decreasing inference time by 30% compared to both FFT and LoRA. We evaluate KD-LoRA across three encoder-only models: BERT, RoBERTa, and DeBERTaV3. Code is available at https://github.com/rambodazimi/KD-LoRA.
Comment: Accepted at 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (ENLSP-IV 2024)
Published: 2024

11. Minimum Entropy Coupling with Bottleneck

Author: Ebrahimi, M. Reza, Chen, Jun, and Khisti, Ashish
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory
Abstract: This paper investigates a novel lossy compression framework operating under logarithmic loss, designed to handle situations where the reconstruction distribution diverges from the source distribution. This framework is especially relevant for applications that require joint compression and retrieval, and in scenarios involving distributional shifts due to processing. We show that the proposed formulation extends the classical minimum entropy coupling framework by integrating a bottleneck, allowing for a controlled degree of stochasticity in the coupling. We explore the decomposition of the Minimum Entropy Coupling with Bottleneck (MEC-B) into two distinct optimization problems: Entropy-Bounded Information Maximization (EBIM) for the encoder, and Minimum Entropy Coupling (MEC) for the decoder. Through extensive analysis, we provide a greedy algorithm for EBIM with guaranteed performance, and characterize the optimal solution near functional mappings, yielding significant theoretical insights into the structural complexity of this problem. Furthermore, we illustrate the practical application of MEC-B through experiments in Markov Coding Games (MCGs) under rate limits. These games simulate a communication scenario within a Markov Decision Process, where an agent must transmit a compressed message from a sender to a receiver through its actions. Our experiments highlight the trade-offs between MDP rewards and receiver accuracy across various compression rates, showcasing the efficacy of our method compared to a conventional compression baseline.
Comment: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) - Spotlight
Published: 2024

12. Prediction of Final Phosphorus Content of Steel in a Scrap-Based Electric Arc Furnace Using Artificial Neural Networks

Author: Azzaz, Riadh, Hurel, Valentin, Menard, Patrice, Jahazi, Mohammad, Kahou, Samira Ebrahimi, and Moosavi-Khoonsari, Elmira
Subjects: Computer Science - Machine Learning, Condensed Matter - Materials Science
Abstract: The scrap-based electric arc furnace process is expected to capture a significant share of the steel market in the future due to its potential for reducing environmental impacts through steel recycling. However, managing impurities, particularly phosphorus, remains a challenge. This study aims to develop a machine learning model to estimate the steel phosphorus content at the end of the process based on input parameters. Data were collected over two years from a steel plant, focusing on the chemical composition and weight of the scrap, the volume of oxygen injected, and process duration. After preprocessing the data, several machine learning models were evaluated, with the artificial neural network (ANN) emerging as the most effective. The best ANN model included four hidden layers. The model was trained for 500 epochs with a batch size of 50. The best model achieves a mean square error (MSE) of 0.000016, a root-mean-square error (RMSE) of 0.0049998, a coefficient of determination (R²) of 99.96%, and a correlation coefficient (r) of 99.98%. Notably, the model achieved a 100% hit rate for predicting phosphorus content within ±0.001 wt% (±10 ppm). These results demonstrate that the optimized ANN model offers accurate predictions for the steel final phosphorus content.
Comment: 53 pages, 8 figures
Published: 2024

13. Multi-Draft Speculative Sampling: Canonical Architectures and Theoretical Limits

Author: Khisti, Ashish, Ebrahimi, M. Reza, Dbouk, Hassan, Behboodi, Arash, Memisevic, Roland, and Louizos, Christos
Subjects: Computer Science - Computation and Language, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: We consider multi-draft speculative sampling, where the proposal sequences are sampled independently from different draft models. At each step, a token-level draft selection scheme takes a list of valid tokens as input and produces an output token whose distribution matches that of the target model. Previous works have demonstrated that the optimal scheme (which maximizes the probability of accepting one of the input tokens) can be cast as a solution to a linear program. In this work we show that the optimal scheme can be decomposed into a two-step solution: in the first step an importance sampling (IS) type scheme is used to select one intermediate token; in the second step (single-draft) speculative sampling is applied to generate the output token. For the case of two identical draft models we further 1) establish a necessary and sufficient condition on the distributions of the target and draft models for the acceptance probability to equal one and 2) provide an explicit expression for the optimal acceptance probability. Our theoretical analysis also motivates a new class of token-level selection schemes based on weighted importance sampling. Our experimental results demonstrate consistent improvements in the achievable block efficiency and token rates over baseline schemes in a number of scenarios.
Published: 2024
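
Illustrative note (not part of the catalog record): the single-draft speculative sampling step mentioned in the abstract is commonly described as "accept the draft token with probability min(1, p/q), otherwise resample from the normalized residual max(p - q, 0)". The sketch below follows that standard description, not the paper's multi-draft scheme. The function names and the toy distributions used with it are assumptions made here for illustration. The helper output_distribution computes the exact law of the returned token, so the matching property can be checked without sampling noise.

```python
import random


def speculative_sample(p, q, rng=random):
    """Draw one token distributed as target p, using a draft token sampled from q."""
    tokens = range(len(p))
    draft = rng.choices(tokens, weights=q)[0]
    # Accept the draft token with probability min(1, p/q).
    if rng.random() < min(1.0, p[draft] / q[draft]):
        return draft
    # On rejection, resample from the normalized residual max(p - q, 0).
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    return rng.choices(tokens, weights=residual)[0]


def output_distribution(p, q):
    """Exact distribution of the token returned by speculative_sample."""
    accept = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_mass = 1.0 - sum(accept)
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    total = sum(residual)
    if total == 0.0:  # p == q, so the draft is always accepted
        return accept
    return [a + reject_mass * r / total for a, r in zip(accept, residual)]
```

For toy distributions such as p = [0.5, 0.3, 0.2] and q = [0.2, 0.2, 0.6], output_distribution(p, q) reproduces p up to floating-point error, and the acceptance probability is the sum of min(p, q) over tokens.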

14. The Double-Edged Sword of Behavioral Responses in Strategic Classification: Theory and User Studies

Author: Ebrahimi, Raman, Vaccaro, Kristen, and Naghizadeh, Parinaz
Subjects: Computer Science - Machine Learning, Computer Science - Computer Science and Game Theory, Computer Science - Human-Computer Interaction
Abstract: When humans are subject to an algorithmic decision system, they can strategically adjust their behavior accordingly ("game" the system). While a growing line of literature on strategic classification has used game-theoretic modeling to understand and mitigate such gaming, these existing works consider standard models of fully rational agents. In this paper, we propose a strategic classification model that considers behavioral biases in human responses to algorithms. We show how misperceptions of a classifier (specifically, of its feature weights) can lead to different types of discrepancies between biased and rational agents' responses, and identify when behavioral agents over- or under-invest in different features. We also show that strategic agents with behavioral biases can benefit or (perhaps, unexpectedly) harm the firm compared to fully rational strategic agents. We complement our analytical results with user studies, which support our hypothesis of behavioral biases in human responses to the algorithm. Together, our findings highlight the need to account for human (cognitive) biases when designing AI systems, and providing explanations of them, to strategic humans in the loop.
Published: 2024

15. Non-vanishing elements and complex group algebras

Author: Ebrahimi, Mahdi
Subjects: Mathematics - Group Theory, 20C15, 05C50, 05C92
Abstract: Let $G$ be a finite group, and let $\mathrm{Irr}(G)$ denote the set of irreducible complex characters of $G$. An element $x$ of $G$ is said to be vanishing, if for some $\chi$ in $\mathrm{Irr}(G)$, we have $\chi(x)=0$. Also the element $x$ is called rational if $x$ is conjugate to $x^i$ for every integer $i$ co-prime to the order of $x$. We define the weight of $G$ as $\omega(G):=(\sum_{\chi\in \mathrm{Irr}(G)}\chi(1))^2/|G|$. In this paper, we show that for every rational non-vanishing element $x\in G$, the order of $C_G(x)$ is at least $\omega(G)$.
Published: 2024
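
Illustrative note (not part of the catalog record): the inequality in the abstract can be checked by hand on a small group. The sketch below assumes the group Sym(3) and hard-codes its well-known character table. The variable and function names are hypothetical. Every element of a symmetric group is rational in the sense of the abstract, so only the non-vanishing condition needs to be tested.

```python
from fractions import Fraction

# Character table of Sym(3), hard-coded for illustration.
# Columns: conjugacy classes of the identity, the transpositions, the 3-cycles.
CLASS_NAMES = ["identity", "transposition", "3-cycle"]
CLASS_SIZES = [1, 3, 2]
CHARACTERS = [
    [1, 1, 1],    # trivial character
    [1, -1, 1],   # sign character
    [2, 0, -1],   # 2-dimensional standard character
]
GROUP_ORDER = sum(CLASS_SIZES)  # |Sym(3)| = 6


def weight():
    """omega(G) = (sum of character degrees)^2 / |G|."""
    degree_sum = sum(chi[0] for chi in CHARACTERS)
    return Fraction(degree_sum ** 2, GROUP_ORDER)


def non_vanishing_classes():
    """Indices of classes on which no irreducible character is zero."""
    return [j for j in range(len(CLASS_SIZES))
            if all(chi[j] != 0 for chi in CHARACTERS)]


def centralizer_order(j):
    """|C_G(x)| = |G| / (size of the conjugacy class of x)."""
    return GROUP_ORDER // CLASS_SIZES[j]
```

Here omega(Sym(3)) = 16/6 = 8/3. The non-vanishing classes are the identity and the 3-cycles, with centralizer orders 6 and 3, both at least 8/3. The transpositions are vanishing because the standard character is zero on them.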

16. Cayley graphs on symmetric groups generated by $n$-cycles are hyperenergetic

Author: Ebrahimi, Mahdi
Subjects: Mathematics - Combinatorics, 05C92, 20C30, 05C50
Abstract: Let $\Gamma$ be a simple graph with $n$ vertices. The energy of $\Gamma$, denoted by $\mathcal{E}(\Gamma)$, is defined as the sum of the absolute values of the eigenvalues of $\Gamma$. The graph $\Gamma$ is said to be hyperenergetic if $\mathcal{E}(\Gamma)>2n-2$. For the graph $\Gamma$, the multiplicity of the eigenvalue $0$, denoted by $\eta(\Gamma)$, is called the nullity of $\Gamma$. In this paper, we show that for every positive integer $n\geq 4$, the Cayley graph $\Gamma_n$ on the symmetric group $\mathrm{Sym}(n)$ generated by $n$-cycles is an integral hyperenergetic graph with $\mathcal{E}(\Gamma_n)=2^{n-1}(n-1)!$ and $\eta(\Gamma_n)=n!-\binom{2n-2}{n-1}$.
Comment: arXiv admin note: text overlap with arXiv:1808.06097 by other authors
Published: 2024
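
Illustrative note (not part of the catalog record): the stated formulas can be checked for n = 4. The sketch relies on a standard fact rather than on the paper's argument: when the generating set is a full conjugacy class C, the eigenvalues of the Cayley graph are |C| chi(c) / chi(1), one for each irreducible character chi, each with multiplicity chi(1)^2. The character values of Sym(4) on a 4-cycle are hard-coded, and all names below are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

N = 4
NUM_VERTICES = factorial(N)        # |Sym(4)| = 24
NUM_N_CYCLES = factorial(N - 1)    # Sym(4) contains 6 four-cycles

# (degree chi(1), value chi(c) on a 4-cycle c) for the five
# irreducible characters of Sym(4), hard-coded for illustration.
IRREDUCIBLES = [(1, 1), (1, -1), (2, 0), (3, -1), (3, 1)]


def spectrum():
    """Return {eigenvalue: multiplicity} for the Cayley graph on Sym(4)."""
    spec = {}
    for degree, value in IRREDUCIBLES:
        eigenvalue = Fraction(NUM_N_CYCLES * value, degree)
        spec[eigenvalue] = spec.get(eigenvalue, 0) + degree ** 2
    return spec


def energy():
    """Sum of absolute values of the eigenvalues, counted with multiplicity."""
    return sum(abs(eig) * mult for eig, mult in spectrum().items())


def nullity():
    """Multiplicity of the eigenvalue 0."""
    return spectrum().get(Fraction(0), 0)
```

With these values the spectrum is 6, -6, 2 (nine times), -2 (nine times) and 0 (four times), so the energy is 48 = 2^3 * 3! and the nullity is 4 = 4! - C(6, 3). Since 48 > 2 * 24 - 2, the graph is hyperenergetic, in agreement with the abstract for n = 4.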

17. Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

Author: Feng, Shangbin, Wang, Zifeng, Wang, Yike, Ebrahimi, Sayna, Palangi, Hamid, Miculicich, Lesly, Kulshrestha, Achin, Rauschmayr, Nathalie, Choi, Yejin, Tsvetkov, Yulia, Lee, Chen-Yu, and Pfister, Tomas
Subjects: Computer Science - Computation and Language
Abstract: We propose Model Swarms, a collaborative search algorithm to adapt LLMs via swarm intelligence, the collective behavior guiding individual systems. Specifically, Model Swarms starts with a pool of LLM experts and a utility function. Guided by the best-found checkpoints across models, diverse LLM experts collaboratively move in the weight space and optimize a utility function representing model adaptation objectives. Compared to existing model composition approaches, Model Swarms offers tuning-free model adaptation, works in low-data regimes with as few as 200 examples, and does not require assumptions about specific experts in the swarm or how they should be composed. Extensive experiments demonstrate that Model Swarms could flexibly adapt LLM experts to a single task, multi-task domains, reward models, as well as diverse human interests, improving over 12 model composition baselines by up to 21.0% across tasks and contexts. Further analysis reveals that LLM experts discover previously unseen capabilities in initial checkpoints and that Model Swarms enable the weak-to-strong transition of experts through the collaborative search process.
Published: 2024

18. Uncovering Attacks and Defenses in Secure Aggregation for Federated Deep Learning

Author: Zhang, Yiwei, Behnia, Rouzbeh, Yavuz, Attila A., Ebrahimi, Reza, and Bertino, Elisa
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Federated learning enables the collaborative learning of a global model on diverse data, preserving data locality and eliminating the need to transfer user data to a central server. However, data privacy remains vulnerable, as attacks can target user training data by exploiting the updates sent by users during each learning iteration. Secure aggregation protocols are designed to mask/encrypt user updates and enable a central server to aggregate the masked information. MicroSecAgg (PoPETS 2024) proposes a single-server secure aggregation protocol that aims to mitigate the high communication complexity of the existing approaches by enabling a one-time setup of the secret to be re-used in multiple training iterations. In this paper, we identify a security flaw in MicroSecAgg that undermines its privacy guarantees. We detail the security flaw and our attack, demonstrating how an adversary can exploit predictable masking values to compromise user privacy. Our findings highlight the critical need for enhanced security measures in secure aggregation protocols, particularly the implementation of dynamic and unpredictable masking strategies. We propose potential countermeasures to mitigate these vulnerabilities and ensure robust privacy protection in secure aggregation frameworks.
Published: 2024

19. Fine-grained subjective visual quality assessment for high-fidelity compressed images

Author: Testolina, Michela, Jenadeleh, Mohsen, Mohammadi, Shima, Su, Shaolin, Ascenso, Joao, Ebrahimi, Touradj, Sneyers, Jon, and Saupe, Dietmar
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Advances in image compression, storage, and display technologies have made high-quality images and videos widely accessible. At this level of quality, distinguishing between compressed and original content becomes difficult, highlighting the need for assessment methodologies that are sensitive to even the smallest visual quality differences. Conventional subjective visual quality assessments often use absolute category rating scales, ranging from "excellent" to "bad". While suitable for evaluating more pronounced distortions, these scales are inadequate for detecting subtle visual differences. The JPEG standardization project AIC is currently developing a subjective image quality assessment methodology for high-fidelity images. This paper presents the proposed assessment methods, a dataset of high-quality compressed images, and their corresponding crowdsourced visual quality ratings. It also outlines a data analysis approach that reconstructs quality scale values in just noticeable difference (JND) units. The assessment method uses boosting techniques on visual stimuli to help observers detect compression artifacts more clearly. This is followed by a rescaling process that adjusts the boosted quality values back to the original perceptual scale. This reconstruction yields a fine-grained, high-precision quality scale in JND units, providing more informative results for practical applications. The dataset and code to reproduce the results will be available at https://github.com/jpeg-aic/dataset-BTC-PTC-24.
Comment: Michela Testolina, Mohsen Jenadeleh contributed equally to this work, submitted to the Data Compression Conference (DCC) 2025
Published: 2024

20. Bounds on the Complete Forcing Number of Graphs

Author: Ebrahimi, Javad B., Nemayande, Aref, and Tohidi, Elahe
Subjects: Mathematics - Combinatorics, Computer Science - Discrete Mathematics
Abstract: A forcing set for a perfect matching of a graph is defined as a subset of the edges of that perfect matching such that there exists a unique perfect matching containing it. A complete forcing set for a graph is a subset of its edges, such that it intersects the edges of every perfect matching in a forcing set of that perfect matching. The size of a smallest complete forcing set of a graph is called the complete forcing number of the graph. In this paper, we derive new upper bounds for the complete forcing number of graphs in terms of other graph theoretical parameters such as the degeneracy or the spectral radius of the graph. We show that for graphs with the number of edges more than some constant times the number of vertices, our result outperforms the best known upper bound for the complete forcing number. For the set of edge-transitive graphs, we present a lower bound for the complete forcing number in terms of maximum forcing number. This result in particular is applied to the hypercube graphs and Cartesian powers of even cycles.
Comment: 18 pages
Published: 2024

21. Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC

Author: Mäki-Penttilä, Aleksi, Toulkani, Naeim Ebrahimi, and Ghabcheloo, Reza
Subjects: Computer Science - Robotics, Electrical Engineering and Systems Science - Systems and Control
Abstract: This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses.
Comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Published: 2024

22. Ramanujan graphs with diameter at most three

Author: Ebrahimi, Mahdi
Subjects: Mathematics - Combinatorics, 05C50
Abstract: For a simple graph $G$, the complement and the line graph of $G$ are denoted by $G^c$ and $L(G)$, respectively. In this paper, we show that for every simple connected regular graph $G$ with at least $5$ vertices, the graph $\mathcal{R}(G):=L(L(G)^c)^c$ is a Ramanujan graph with diameter at most three.
Published: 2024

23. Orthogonally additive polynomials on the bidual of Banach algebras

Author: Khosravi, Aminallah, Vishki, Hamid Reza Ebrahimi, and Faal, Ramin
Subjects: Mathematics - Functional Analysis
Abstract: We say that a Banach algebra $\mathcal{A}$ has the $k$-orthogonally additive property ($k$-OA property, for short) if every orthogonally additive $k$-homogeneous polynomial $P:\mathcal{A}\to \mathbb{C}$ can be expressed in the standard form $P(x)=\langle \gamma,x^k\rangle$, $(x\in \mathcal{A})$, for some $\gamma\in \mathcal{A}^*$. In this paper we first investigate the extensions of a $k$-homogeneous polynomial from $\mathcal{A}$ to the bidual $\mathcal{A}^{**}$, equipped with the first Arens product. We then study the relationship between the $k$-OA properties of $\mathcal{A}$ and $\mathcal{A}^{**}$. This relation is investigated in particular for a dual Banach algebra. Finally we examine our results for the dual Banach algebra $\ell^{1}$, with pointwise product, and we show that the Banach algebra $(\ell^{1})^{**}$ enjoys the $k$-OA property.
Published: 2024

24. Optimal Design of Vehicle Dynamics Using Gradient-Based, Mixed-Fidelity Multidisciplinary Optimization

Author: Cheong, Hyunmin, Ebrahimi, Mehran, Salehipour, Hesam, Butscher, Adrian, and Tessier, Alex
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: In automotive engineering, designing for optimal vehicle dynamics is challenging due to the complexities involved in analysing the behaviour of a multibody system. Typically, a simplified set of dynamics equations for only the key bodies of the vehicle such as the chassis and wheels are formulated while reducing their degrees of freedom. In contrast, one could employ high-fidelity multibody dynamics simulation and include more intricate details such as the individual suspension components while considering full degrees of freedom for all bodies; however, this is more computationally demanding. Also, for gradient-based design optimization, computing adjoints for different objective functions can be more challenging for the latter approach, and often not feasible if an existing multibody dynamics solver is used. We propose a mixed-fidelity multidisciplinary approach, in which a simplified set of dynamics equations are used to model the whole vehicle while incorporating a high-fidelity multibody suspension module as an additional coupled discipline. We then employ MAUD (modular analysis and unified derivatives) to combine analytical derivatives based on the dynamics equations and finite differences obtained using an existing multibody solver. Also, we use a collocation method for time integration, which solves for both the system trajectory and optimal design variables simultaneously. The benefits of our approach are shown in an experiment conducted to find optimal vehicle parameters that optimize ride comfort and driving performance considering vertical vehicle dynamics., Comment: ECCOMAS Congress 2024
Published: 2024

25. ChatGPT's Potential in Cryptography Misuse Detection: A Comparative Analysis with Static Analysis Tools

Author: Firouzi, Ehsan, Ghafari, Mohammad, and Ebrahimi, Mike
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: The correct adoption of cryptography APIs is challenging for mainstream developers, often resulting in widespread API misuse. Meanwhile, cryptography misuse detectors have demonstrated inconsistent performance and remain largely inaccessible to most developers. We investigated the extent to which ChatGPT can detect cryptography misuses and compared its performance with that of the state-of-the-art static analysis tools. Our investigation, mainly based on the CryptoAPI-Bench benchmark, demonstrated that ChatGPT is effective in identifying cryptography API misuses, and with the use of prompt engineering, it can even outperform leading static cryptography misuse detectors., Comment: ESEM 2024
Published: 2024

26. Distributed Robust Continuous-Time Optimization Algorithms for Time-Varying Constrained Cost

Author: Ebrahimi, Zeinab and Deghat, Mohammad
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: This paper presents a distributed continuous-time optimization framework aimed at overcoming the challenges posed by time-varying cost functions and constraints in multi-agent systems, particularly those subject to disturbances. By incorporating tools such as log-barrier penalty functions to address inequality constraints, an integral sliding mode control for disturbance mitigation is proposed. The algorithm ensures asymptotic tracking of the optimal solution, achieving a tracking error of zero. The convergence of the introduced algorithms is demonstrated through Lyapunov analysis and nonsmooth techniques. Furthermore, the framework's effectiveness is validated through numerical simulations considering two scenarios for the communication networks., Comment: 7 pages, 3 figures, Accepted for publication in the 12th International Conference on Control, Mechatronics and Automation (ICCMA 2024)
Published: 2024

27. Statistical Distance-Guided Unsupervised Domain Adaptation for Automated Multi-Class Cardiovascular Magnetic Resonance Image Quality Assessment

Author: Nabavi, Shahabedin, Hamedani, Kian Anvari, Moghaddam, Mohsen Ebrahimi, Abin, Ahmad Ali, and Frangi, Alejandro F.
Subjects: Electrical Engineering and Systems Science - Image and Video Processing
Abstract: This study proposes an attention-based statistical distance-guided unsupervised domain adaptation model for multi-class cardiovascular magnetic resonance (CMR) image quality assessment. The proposed model consists of a feature extractor, a label predictor and a statistical distance estimator. An annotated dataset as the source set and an unlabeled dataset as the target set with different statistical distributions are considered inputs. The statistical distance estimator approximates the Wasserstein distance between the extracted feature vectors from the source and target data in a mini-batch. The label predictor predicts data labels of source data and uses a combinational loss function for training, which includes cross entropy and centre loss functions plus the estimated value of the distance estimator. Four datasets, including imaging and k-space data, were used to evaluate the proposed model in identifying four common CMR imaging artefacts: respiratory and cardiac motions, Gibbs ringing and Aliasing. The results of the extensive experiments showed that the proposed model, both in image and k-space analysis, has an acceptable performance in covering the domain shift between the source and target sets. The model explainability evaluations and the ablation studies confirmed the proper functioning and effectiveness of all the model's modules. The proposed model outperformed the previous studies regarding performance and the number of examined artefacts. The proposed model can be used for CMR post-imaging quality control or in large-scale cohort studies for image and k-space quality assessment due to the appropriate performance in domain shift coverage without a tedious data-labelling process.
Published: 2024

28. Learning Multi-agent Multi-machine Tending by Mobile Robots

Author: Abdalwhab, Abdalwhab, Beltrame, Giovanni, Kahou, Samira Ebrahimi, and St-Onge, David
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resources utilization. Furthermore, we provided an extensive ablation study to support our various design decisions., Comment: 7 pages, 4 figures
Published: 2024

29. A New Method for Cross-Lingual-based Semantic Role Labeling

Author: Ebrahimi, Mohammad, Bidgoli, Behrouz Minaei, and Khozouei, Nasim
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Semantic role labeling is a crucial task in natural language processing, enabling better comprehension of natural language. However, the lack of annotated data in multiple languages has posed a challenge for researchers. To address this, a deep learning algorithm based on model transfer has been proposed. The algorithm utilizes a dataset consisting of the English portion of CoNLL2009 and a corpus of semantic roles in Persian. To optimize the efficiency of training, only ten percent of the training data from each language is used. The results of the proposed model demonstrate significant improvements compared to Niksirt et al.'s model. In monolingual mode, the proposed model achieved a 2.05 percent improvement on F1-score, while in cross-lingual mode, the improvement was even more substantial, reaching 6.23 percent. It is worth noting that the compared model only trained two of the four stages of semantic role labeling and employed golden data for the remaining two stages. This suggests that the actual superiority of the proposed model surpasses the reported numbers by a significant margin. The development of cross-lingual methods for semantic role labeling holds promise, particularly in addressing the scarcity of annotated data for various languages. These advancements pave the way for further research in understanding and processing natural language across different linguistic contexts.
Published: 2024

30. Differentially Private Stochastic Gradient Descent with Fixed-Size Minibatches: Tighter RDP Guarantees with or without Replacement

Author: Birrell, Jeremiah, Ebrahimi, Reza, Behnia, Rouzbeh, and Pacheco, Jason
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Differentially private stochastic gradient descent (DP-SGD) has been instrumental in privately training deep learning models by providing a framework to control and track the privacy loss incurred during training. At the core of this computation lies a subsampling method that uses a privacy amplification lemma to enhance the privacy guarantees provided by the additive noise. Fixed-size subsampling is appealing for its constant memory usage, unlike the variable-sized minibatches in Poisson subsampling. It is also of interest in addressing class imbalance and federated learning. However, the current computable guarantees for fixed-size subsampling are not tight and do not consider both add/remove and replace-one adjacency relationships. We present a new and holistic Rényi differential privacy (RDP) accountant for DP-SGD with fixed-size subsampling without replacement (FSwoR) and with replacement (FSwR). For FSwoR we consider both add/remove and replace-one adjacency. Our FSwoR results improve on the best current computable bound by a factor of $4$. We also show for the first time that the widely-used Poisson subsampling and FSwoR with replace-one adjacency have the same privacy to leading order in the sampling probability. Accordingly, our work suggests that FSwoR is often preferable to Poisson subsampling due to constant memory usage. Our FSwR accountant includes explicit non-asymptotic upper and lower bounds and, to the authors' knowledge, is the first such analysis of fixed-size RDP with replacement for DP-SGD. We analytically and empirically compare fixed-size and Poisson subsampling, and show that DP-SGD gradients in a fixed-size subsampling regime exhibit lower variance in practice in addition to memory usage benefits., Comment: 39 pages, 10 figures
Published: 2024

31. Boosting Unconstrained Face Recognition with Targeted Style Adversary

Author: Saadabadi, Mohammad Saeed Ebrahimi, Malakshan, Sahar Rahimi, Hosseini, Seyed Rasoul, and Nasrabadi, Nasser M.
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: While deep face recognition models have demonstrated remarkable performance, they often struggle on inputs from domains beyond their training data. Recent attempts aim to expand the training set by relying on computationally expensive and inherently challenging image-space augmentation of image generation modules. In an orthogonal direction, we present a simple yet effective method to expand the training data by interpolating between instance-level feature statistics across labeled and unlabeled sets. Our method, dubbed Targeted Style Adversary (TSA), is motivated by two observations: (i) the input domain is reflected in feature statistics, and (ii) face recognition model performance is influenced by style information. Shifting towards an unlabeled style implicitly synthesizes challenging training instances. We devise a recognizability metric to constrain our framework to preserve the inherent identity-related information of labeled instances. The efficacy of our method is demonstrated through evaluations on unconstrained benchmarks, outperforming or being on par with its competitors while offering nearly a 70% improvement in training speed and 40% less memory consumption.
Published: 2024

32. An Introduction to Reinforcement Learning: Fundamental Concepts and Practical Applications

Author: Ghasemi, Majid, Moosavi, Amir Hossein, Sorkhoh, Ibrahim, Agrawal, Anjali, Alzhouri, Fadi, and Ebrahimi, Dariush
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Reinforcement Learning (RL) is a branch of Artificial Intelligence (AI) which focuses on training agents to make decisions by interacting with their environment to maximize cumulative rewards. An overview of RL is provided in this paper, which discusses its core concepts, methodologies, recent trends, and resources for learning. We provide a detailed explanation of key components of RL such as states, actions, policies, and reward signals so that the reader can build a foundational understanding. The paper also provides examples of various RL algorithms, including model-free and model-based methods. In addition, RL algorithms are introduced and resources for learning and implementing them are provided, such as books, courses, and online communities. This paper demystifies a comprehensive yet simple introduction for beginners by offering a structured and clear pathway for acquiring and implementing real-time techniques.
Published: 2024

33. CROME: Cross-Modal Adapters for Efficient Multimodal LLM

Author: Ebrahimi, Sayna, Arik, Sercan O., Nama, Tejas, and Pfister, Tomas
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Multimodal Large Language Models (MLLMs) demonstrate remarkable image-language capabilities, but their widespread use faces challenges in cost-effective training and adaptation. Existing approaches often necessitate expensive language model retraining and limited adaptability. Additionally, the current focus on zero-shot performance improvements offers insufficient guidance for task-specific tuning. We propose CROME, an efficient vision-language instruction tuning framework. It features a novel gated cross-modal adapter that effectively combines visual and textual representations prior to input into a frozen LLM. This lightweight adapter, trained with minimal parameters, enables efficient cross-modal understanding. Notably, CROME demonstrates superior zero-shot performance on standard visual question answering and instruction-following benchmarks. Moreover, it yields fine-tuning with exceptional parameter efficiency, competing with task-specific specialist state-of-the-art methods. CROME demonstrates the potential of pre-LM alignment for building scalable, adaptable, and parameter-efficient multimodal models.
Published: 2024

34. Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers

Author: Ebrahimi, MohammadReza, Panchal, Sunny, and Memisevic, Roland
Subjects: Computer Science - Computation and Language
Abstract: Despite their recent successes, Transformer-based large language models show surprising failure modes. A well-known example of such failure modes is their inability to length-generalize: solving problem instances at inference time that are longer than those seen during training. In this work, we further explore the root cause of this failure by performing a detailed analysis of model behaviors on the simple parity task. Our analysis suggests that length generalization failures are intricately related to a model's inability to perform random memory accesses within its context window. We present supporting evidence for this hypothesis by demonstrating the effectiveness of methodologies that circumvent the need for indexing or that enable random token access indirectly, through content-based addressing. We further show where and how the failure to perform random memory access manifests through attention map visualizations., Comment: Published as a conference paper at COLM 2024
Published: 2024

35. Holographic dark energy in Barrow cosmology with Granda-Oliveros IR cutoff

Author: Motaghi, M., Sheykhi, A., and Ebrahimi, E.
Subjects: General Relativity and Quantum Cosmology, High Energy Physics - Theory
Abstract: Applying the modified Barrow entropy, inspired by quantum fluctuation effects, to the cosmological background, and using the thermodynamics-gravity conjecture, the Friedmann equations get modified as well. In this paper, we explore holographic dark energy with the Granda-Oliveros (GO) IR cutoff in the context of the modified Barrow cosmology. First, we assume that the two dark components of the universe evolve independently, obtain the cosmological parameters, and explore the cosmic evolution. Second, we consider an interaction term between dark energy (DE) and dark matter (DM). We observe that the Barrow parameter $\delta$ crucially affects the cosmic dynamics, causing the transition from the decelerating phase to the accelerating phase to occur later. We find that the equation of state parameter is in the quintessence region in the past and crosses the phantom divide at the present time. Finally, we examine the squared speed of sound for this model. According to the squared sound speed diagrams, the presence of an interaction between DM and DE, as well as an increase in the value of $\delta$, leads to signs of instability in the past $(v_s^2<0)$. Furthermore, by examining the statefinder, we find that the presence of $\delta$ also distinguishes holographic dark energy in Barrow cosmology with the GO-IR cutoff from the $\Lambda$CDM model. In fact, increasing $\delta$ causes the statefinder diagram to move away from the point $\left\lbrace r,s\right\rbrace= \left\lbrace 1,0\right\rbrace$ at $z=0$., Comment: 20 pages, 11 figures
Published: 2024

36. Empowering Clinicians with Medical Decision Transformers: A Framework for Sepsis Treatment

Author: Rahman, Aamer Abdul, Agarwal, Pranav, Noumeir, Rita, Jouvet, Philippe, Michalski, Vincent, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Offline reinforcement learning has shown promise for solving tasks in safety-critical settings, such as clinical decision support. Its application, however, has been limited by the lack of interpretability and interactivity for clinicians. To address these challenges, we propose the medical decision transformer (MeDT), a novel and versatile framework based on the goal-conditioned reinforcement learning paradigm for sepsis treatment recommendation. MeDT uses the decision transformer architecture to learn a policy for drug dosage recommendation. During offline training, MeDT utilizes collected treatment trajectories to predict administered treatments for each time step, incorporating known treatment outcomes, target acuity scores, past treatment decisions, and current and past medical states. This analysis enables MeDT to capture complex dependencies among a patient's medical history, treatment decisions, outcomes, and short-term effects on stability. Our proposed conditioning uses acuity scores to address sparse reward issues and to facilitate clinician-model interactions, enhancing decision-making. Following training, MeDT can generate tailored treatment recommendations by conditioning on the desired positive outcome (survival) and user-specified short-term stability improvements. We carry out rigorous experiments on data from the MIMIC-III dataset and use off-policy evaluation to demonstrate that MeDT recommends interventions that outperform or are competitive with existing offline reinforcement learning methods while enabling a more interpretable, personalized and clinician-directed approach.
Published: 2024

37. Enhancing Diversity in Multi-objective Feature Selection

Author: Miyandoab, Sevil Zanjani, Rahnamayan, Shahryar, Bidgoli, Azam Asilian, Ebrahimi, Sevda, and Makrehchi, Masoud
Subjects: Computer Science - Machine Learning, 68T05 (Primary), 68T20, 68W20, 68W40, 90C29, 90C27, 62H30, 62H25 (Secondary), I.2.6, I.2.8, I.5, H.2.8
Abstract: Feature selection plays a pivotal role in the data preprocessing and model-building pipeline, significantly enhancing model performance, interpretability, and resource efficiency across diverse domains. In population-based optimization methods, the generation of diverse individuals holds utmost importance for adequately exploring the problem landscape, particularly in highly multi-modal multi-objective optimization problems. Our study reveals that, in line with findings from several prior research papers, commonly employed crossover and mutation operations lack the capability to generate high-quality diverse individuals and tend to become confined to limited areas around various local optima. This paper introduces an augmentation to the diversity of the population in the well-established multi-objective scheme of the genetic algorithm, NSGA-II. This enhancement is achieved through two key components: the genuine initialization method and the substitution of the worst individuals with new randomly generated individuals as a re-initialization approach in each generation. The proposed multi-objective feature selection method undergoes testing on twelve real-world classification problems, with the number of features ranging from 2,400 to nearly 50,000. The results demonstrate that replacing the last front of the population with an equivalent number of new random individuals generated using the genuine initialization method and featuring a limited number of features substantially improves the population's quality and, consequently, enhances the performance of the multi-objective algorithm., Comment: 8 pages, 3 figures, published in IEEE WCCI 2024 conference, DOI added
Published: 2024

38. Noncrossing arithmetics

Author: Ebrahimi-Fard, Kurusch, Foissy, Loïc, Kock, Joachim, and Patras, Frédéric
Subjects: Mathematics - Combinatorics, 05A18, 46L54, 11A05, 06A11, 18N50
Abstract: Higher-order notions of Kreweras complementation have appeared in the literature in the works of Krawczyk, Speicher, Mastnak, Nica, Arizmendi, Vargas, and others. While the theory has been developed primarily for specific applications in free probability, it also possesses an elegant, purely combinatorial core that is of independent interest. The present article aims at offering a simple account of various aspects of higher-order Kreweras complementation on the basis of elementary arithmetic, (co)algebraic, categorical and simplicial properties of noncrossing partitions. The main idea is to see noncrossing partitions as providing an interesting noncommutative analogue of the interplay between the divisibility poset and the multiplicative monoid of positive integers. Just as the divisibility poset can be regarded as the decalage of the multiplicative monoid, we exhibit the lattice of noncrossing partitions as the decalage of a partial monoid structure on noncrossing partitions encoding higher-order Kreweras complements. While our results may be considered known, some of the viewpoints can be regarded as novel, providing an efficient approach both conceptually and computationally., Comment: For the CATMI proceedings volume. 19pp
Published: 2024

39. ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition

Author: Saadabadi, Mohammad Saeed Ebrahimi, Malakshan, Sahar Rahimi, Dabouei, Ali, and Nasrabadi, Nasser M.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Aiming to enhance Face Recognition (FR) on Low-Quality (LQ) inputs, recent studies suggest incorporating synthetic LQ samples into training. Although promising, the quality factors that are considered in these works are general rather than FR-specific, e.g., atmospheric turbulence, resolution, etc. Motivated by the observation of the vulnerability of current FR models to even small Face Alignment Errors (FAE) in LQ images, we present a simple yet effective method that considers FAE as another quality factor that is tailored to FR. We seek to improve LQ FR by enhancing FR models' robustness to FAE. To this aim, we formalize the problem as a combination of differentiable spatial transformations and adversarial data augmentation in FR. We perturb the alignment of the training samples using a controllable spatial transformation and enrich the training with samples expressing FAE. We demonstrate the benefits of the proposed method by conducting evaluations on IJB-B, IJB-C, IJB-S (+4.3% Rank1), and TinyFace (+2.63%). https://github.com/msed-Ebrahimi/ARoFace, Comment: European Conference on Computer Vision (ECCV 2024)
Published: 2024

40. Sharif-STR at SemEval-2024 Task 1: Transformer as a Regression Model for Fine-Grained Scoring of Textual Semantic Relations

Author: Ebrahimi, Seyedeh Fatemeh, Azari, Karim Akhavan, Iravani, Amirmasoud, Alizadeh, Hadi, Taghavi, Zeinab Sadat, and Sameti, Hossein
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Semantic Textual Relatedness holds significant relevance in Natural Language Processing, finding applications across various domains. Traditionally, approaches to STR have relied on knowledge-based and statistical methods. However, with the emergence of Large Language Models, there has been a paradigm shift, ushering in new methodologies. In this paper, we delve into the investigation of sentence-level STR within Track A (Supervised) by leveraging fine-tuning techniques on the RoBERTa transformer. Our study focuses on assessing the efficacy of this approach across different languages. Notably, our findings indicate promising advancements in STR performance, particularly in Latin languages. Specifically, our results demonstrate notable improvements in English, achieving a correlation of 0.82 and securing a commendable 19th rank. Similarly, in Spanish, we achieved a correlation of 0.67, securing the 15th position. However, our approach encounters challenges in languages like Arabic, where we observed a correlation of only 0.38, resulting in a 20th rank., Comment: 10 pages, 9 figures, 4 tables
Published: 2024

41. Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

Author: Ebrahimi, Seyedeh Fatemeh, Azari, Karim Akhavan, Iravani, Amirmasoud, Qazvini, Arian, Sadeghi, Pouya, Taghavi, Zeinab Sadat, and Sameti, Hossein
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately discerning MGTs., Comment: 8 pages, 3 figures, 2 tables. Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Published: 2024

42. Academic Article Recommendation Using Multiple Perspectives

Author: Church, Kenneth, Alonso, Omar, Vickers, Peter, Sun, Jiameng, Ebrahimi, Abteen, and Chandrasekar, Raman
Subjects: Computer Science - Information Retrieval
Abstract: We argue that Content-based filtering (CBF) and Graph-based methods (GB) complement one another in Academic Search recommendations. The scientific literature can be viewed as a conversation between authors and the audience. CBF uses abstracts to infer authors' positions, and GB uses citations to infer responses from the audience. In this paper, we describe nine differences between CBF and GB, as well as synergistic opportunities for hybrid combinations. Two embeddings will be used to illustrate these opportunities: (1) Specter, a CBF method based on BERT-like deepnet encodings of abstracts, and (2) ProNE, a GB method based on spectral clustering of more than 200M papers and 2B citations from Semantic Scholar.
Published: 2024

43. Towards A Comprehensive Visual Saliency Explanation Framework for AI-based Face Recognition Systems

Author: Lu, Yuhang, Xu, Zewei, and Ebrahimi, Touradj
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Over recent years, deep convolutional neural networks have significantly advanced the field of face recognition techniques for both verification and identification purposes. Despite the impressive accuracy, these neural networks are often criticized for lacking explainability. There is a growing demand for understanding the decision-making process of AI-based face recognition systems. Some studies have investigated the use of visual saliency maps as explanations, but they have predominantly focused on the specific face verification case. The discussion on more general face recognition scenarios and the corresponding evaluation methodology for these explanations have long been absent in current research. Therefore, this manuscript conceives a comprehensive explanation framework for face recognition tasks. Firstly, an exhaustive definition of visual saliency map-based explanations for AI-based face recognition systems is provided, taking into account the two most common recognition situations individually, i.e., face verification and identification. Secondly, a new model-agnostic explanation method named CorrRISE is proposed to produce saliency maps, which reveal both the similar and dissimilar regions between any given face images. Subsequently, the explanation framework conceives a new evaluation methodology that offers quantitative measurement and comparison of the performance of general visual saliency explanation methods in face recognition. Consequently, extensive experiments are carried out on multiple verification and identification scenarios. The results showcase that CorrRISE generates insightful saliency maps and demonstrates superior performance, particularly in similarity maps in comparison with the state-of-the-art explanation approaches., Comment: arXiv admin note: text overlap with arXiv:2305.08546
Published: 2024

44. Reinforcement Learning for Sequence Design Leveraging Protein Language Models

Author: Subramanian, Jithendaraa, Sujit, Shivakanth, Irtisam, Niloy, Sain, Umong, Islam, Riashat, Nowrouzezahrai, Derek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Biomolecules
Abstract: Protein sequence design, determined by amino acid sequences, is essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space and to generalize to unseen sequences. In the context of discrete black box optimization over large search spaces, learning a mutation policy to generate novel sequences with reinforcement learning is appealing. Recent advances in protein language models (PLMs) trained on large corpora of protein sequences offer a potential solution to this problem by scoring proteins according to their biological plausibility (such as the TM-score). In this work, we propose to use PLMs as a reward function to generate new sequences. Yet the PLM can be computationally expensive to query due to its large size. To this end, we propose an alternative paradigm where optimization can be performed on scores from a smaller proxy model that is periodically finetuned jointly while learning the mutation policy. We perform extensive experiments on various sequence lengths to benchmark RL-based approaches, and provide comprehensive evaluations along biological plausibility and diversity of the protein. Our experimental results include favorable evaluations of the proposed sequences, along with high diversity scores, demonstrating that RL is a strong candidate for biological sequence design. Finally, we provide a modular open source implementation that can be easily integrated in most RL training loops, with support for replacing the reward model with other PLMs, to spur further research in this domain. The code for all experiments is provided in the supplementary material., Comment: 22 pages, 7 figures, 4 tables
Published: 2024

45. Shower Separation in Five Dimensions for Highly Granular Calorimeters using Machine Learning

Author: Lai, S., Utehs, J., Wilhahn, A., Fouz, M. C., Bach, O., Brianne, E., Ebrahimi, A., Gadow, K., Göttlicher, P., Hartbrich, O., Heuchel, D., Irles, A., Krüger, K., Kvasnicka, J., Lu, S., Neubüser, C., Provenza, A., Reinecke, M., Sefkow, F., Schuwalow, S., De Silva, M., Sudo, Y., Tran, H. L., Liu, L., Masuda, R., Murata, T., Ootani, W., Seino, T., Takatsu, T., Tsuji, N., Pöschl, R., Richard, F., Zerwas, D., Hummer, F., Simon, F., Boudry, V., Brient, J-C., Nanni, J., Videau, H., Buhmann, E., Garutti, E., Huck, S., Kasieczka, G., Martens, S., Rolph, J., Wellhausen, J., Bilki, B., Northacker, D., Onel, Y., Emberger, L., and Graf, C.
Subjects: Physics - Instrumentation and Detectors
Abstract: To achieve state-of-the-art jet energy resolution for Particle Flow, sophisticated energy clustering algorithms must be developed that can fully exploit available information to separate energy deposits from charged and neutral particles. Three published neural network-based shower separation models were applied to simulation and experimental data to measure the performance of the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL) technological prototype in distinguishing the energy deposited by a single charged and single neutral hadron for Particle Flow. The performance of models trained using only standard spatial and energy and charged track position information from an event was compared to models trained using timing information available from AHCAL, which is expected to improve sensitivity to shower development and, therefore, aid in clustering. Both simulation and experimental data were used to train and test the models and their performances were compared. The best-performing neural network achieved significantly superior event reconstruction when timing information was utilised in training for the case where the charged hadron had more energy than the neutral one, motivating temporally sensitive calorimeters. All models under test were observed to tend to allocate energy deposited by the more energetic of the two showers to the less energetic one. Similar shower reconstruction performance was observed for a model trained on simulation and applied to data and a model trained and applied to data.
Published: 2024

46. A multiplicative surface signature through its Magnus expansion

Author: Chevyrev, Ilya, Diehl, Joscha, Ebrahimi-Fard, Kurusch, and Tapia, Nikolas
Subjects: Mathematics - Rings and Algebras, Mathematics - Classical Analysis and ODEs
Abstract: In the last decade, the concept of path signature has found great success in data science applications, where it provides features describing the path. This is partly explained by the fact that it is possible to compute the signature of a path in linear time, owing to a dynamic programming principle based on Chen's identity. The path signature can be regarded as a specific example of a product or time-/path-ordered integral. In other words, it can be seen as a 1-parameter object built on iterated integrals over a path. Increasing the number of parameters by one, which amounts to considering iterated integrals over surfaces, is more complicated, an observation that is familiar in the context of higher gauge theory, where multiparameter iterated integrals play an important role. The 2-parameter case is naturally related to a non-commutative version of Stokes' theorem, which is understood to be fundamentally linked to the concept of crossed modules of groups. Indeed, crossed modules with non-trivial kernel of the feedback map make it possible to compute features of a surface that go beyond what can be expressed by computing line integrals along the boundary of a surface. A good candidate for the crossed analog of the free Lie algebra then seems to be a certain free crossed module over it. Building on work by Kapranov, we study the analog to the classical path signature taking values in such a free crossed module of Lie algebras. In particular, we provide a Magnus-type expression for the logarithm of the surface signature as well as a sewing lemma for the crossed module setting.
Published: 2024
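
Note on record 46: the abstract's remark that a path signature can be computed in linear time through Chen's identity can be illustrated with a small sketch. This is not the authors' code; it covers only the classical 1-parameter path signature, truncated at level 2, for a piecewise-linear path. The function name `signature_level2` and the sample path are assumptions made for illustration.

```python
def signature_level2(points):
    """Level-1 and level-2 signature of a piecewise-linear path.

    Each segment is folded in with Chen's identity, so the cost is
    linear in the number of segments (hypothetical helper, not from the paper).
    """
    d = len(points[0])
    s1 = [0.0] * d                         # level 1: total increment
    s2 = [[0.0] * d for _ in range(d)]     # level 2: iterated integrals
    for a, b in zip(points, points[1:]):
        delta = [b[i] - a[i] for i in range(d)]
        # Chen's identity for appending a straight segment:
        # S2 <- S2 + S1 (x) delta + (delta (x) delta) / 2
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2


# Sample path: right one unit, then up one unit.
s1, s2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
levy_area = 0.5 * (s2[0][1] - s2[1][0])
```

The antisymmetric part of the level-2 term is the signed (Lévy) area between the path and its chord, while the symmetric part is fixed by the level-1 term through the shuffle relation.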

47. Improving Computational Efficiency in DSMC Simulations of Vacuum Gas Dynamics with a Fixed Number of Particles per Cell

Author: Sabouri, Moslem, Zakeri, Ramin, and Ebrahimi, Amin
Subjects: Physics - Fluid Dynamics
Abstract: The present study addresses the challenge of enhancing computational efficiency without compromising accuracy in numerical simulations of vacuum gas dynamics using the direct simulation Monte Carlo (DSMC) method. A technique termed "fixed particle per cell (FPPC)" was employed, which enforces a fixed number of simulator particles across all computational cells. The proposed technique eliminates the need for real-time adjustment of particle weights during simulation, reducing calculation time. Using the SPARTA solver, simulations of rarefied gas flow in a micromixer and rarefied supersonic airflow around a cylinder were conducted to validate the proposed technique. Results demonstrate that applying the FPPC technique effectively reduces computational costs while yielding results comparable to conventional DSMC implementations. Additionally, the application of local grid refinement coupled with the FPPC technique was investigated. The results show that integrating local grid refinement with the FPPC technique enables accurate prediction of flow behaviour in regions with significant gradients. These findings highlight the efficacy of the proposed technique in improving the accuracy and efficiency of numerical simulations of complex vacuum gas dynamics at a reduced computational cost.
Published: 2024
Full Text: View/download PDF

48. Cayley graphs on $p$-solvable groups generated by $p$-singular elements

Author: Ebrahimi, Mahdi
Subjects: Mathematics - Combinatorics, 05C50, 20C15, 20C20, 05C92
Abstract: For a graph $\Gamma$, the multiplicity of the eigenvalue $0$, denoted by $\eta(\Gamma)$, is called the nullity of $\Gamma$. Also the energy of $\Gamma$, denoted by $\mathcal{E}(\Gamma)$, is defined as the sum of the absolute values of the eigenvalues of $\Gamma$. The index of a subgroup $H$ in a group $G$ is denoted by $[G:H]$. For a prime $p$, let $G$ be a finite $p$-solvable group whose order is divisible by $p$. Also let $\Omega_p(G)$ be the set of all $p$-singular elements of $G$. In this paper, we apply block theory of finite groups to show that the Cayley graph $\Gamma_p(G):=\mathrm{Cay}(G,\Omega_p(G))$ is an integral graph with $\eta(\Gamma_p(G))=|G|-[G:O_{p^\prime}(G)]$, where $O_{p^\prime}(G)$ is the largest normal subgroup of $G$ whose order is co-prime to $p$. We also find a lower bound for $\mathcal{E}(\Gamma_p(G))$. Finally, we prove that the diameter of $\Gamma_p(G)$ is at most $ |G|_p$.
Published: 2024
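
Note on record 48: the definitions of nullity and energy, and the stated nullity formula, can be checked numerically on a small case. The sketch below is an illustration only, restricted to cyclic groups $\mathbb{Z}_n$ (which are $p$-solvable), where the Cayley graph is circulant and its eigenvalues are character sums. For cyclic groups the index $[G:O_{p^\prime}(G)]$ equals the $p$-part of $n$. All function names are hypothetical and not taken from the paper.

```python
import cmath
from math import gcd


def p_singular_elements(n, p):
    """Elements of Z_n whose order is divisible by p (order of x is n / gcd(n, x))."""
    return [x for x in range(n) if (n // gcd(n, x)) % p == 0]


def cayley_spectrum(n, connection_set):
    """Eigenvalues of Cay(Z_n, S); the k-th one is the sum of exp(2*pi*i*k*s/n) over s in S."""
    eigs = []
    for k in range(n):
        lam = sum(cmath.exp(2j * cmath.pi * k * s / n) for s in connection_set)
        eigs.append(lam.real)  # real because S is closed under negation
    return eigs


def nullity(eigs, tol=1e-9):
    """Multiplicity of the eigenvalue 0, up to floating-point tolerance."""
    return sum(1 for lam in eigs if abs(lam) < tol)


def energy(eigs):
    """Sum of the absolute values of the eigenvalues."""
    return sum(abs(lam) for lam in eigs)


def p_part(n, p):
    """Largest power of p dividing n."""
    q = 1
    while n % p == 0:
        n //= p
        q *= p
    return q


n, p = 6, 2
S = p_singular_elements(n, p)   # the 2-singular elements of Z_6
eigs = cayley_spectrum(n, S)    # spectrum of the resulting Cayley graph
```

For $\mathbb{Z}_6$ with $p=2$ the connection set is $\{1,3,5\}$, so the graph is the complete bipartite graph $K_{3,3}$, and the computed nullity agrees with $|G|-[G:O_{p^\prime}(G)] = 6-2$.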

49. Towards Neural Architecture Search for Transfer Learning in 6G Networks

Author: Orucu, Adam, Moradi, Farnaz, Ebrahimi, Masoumeh, and Johnsson, Andreas
Subjects: Computer Science - Networking and Internet Architecture, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The future 6G network is envisioned to be AI-native, and as such, ML models will be pervasive in support of optimizing performance, reducing energy consumption, and in coping with increasing complexity and heterogeneity. A key challenge is automating the process of finding optimal model architectures satisfying stringent requirements stemming from varying tasks, dynamicity and available resources in the infrastructure and deployment positions. In this paper, we describe and review the state-of-the-art in Neural Architecture Search and Transfer Learning and their applicability in networking. Further, we identify open research challenges and set directions with a specific focus on three main requirements with elements unique to the future network, namely combining NAS and TL, multi-objective search, and tabular data. Finally, we outline and discuss both near-term and long-term work ahead.
Published: 2024

50. Learning to Play Atari in a World of Tokens

Author: Agarwal, Pranav, Andrews, Sheldon, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning tasks, these methods primarily rely on continuous representations. This complicates modeling of discrete properties of the real world such as disjoint object classes between which interpolation is not plausible. In this work, we introduce discrete abstract representations for transformer-based learning (DART), a sample-efficient method utilizing discrete representations for modeling both the world and learning behavior. We incorporate a transformer-decoder for auto-regressive world modeling and a transformer-encoder for learning behavior by attending to task-relevant cues in the discrete representation of the world model. For handling partial observability, we aggregate information from past time steps as memory tokens. DART outperforms previous state-of-the-art methods that do not use look-ahead search on the Atari 100k sample efficiency benchmark with a median human-normalized score of 0.790 and beats humans in 9 out of 26 games. We release our code at https://pranaval.github.io/DART/., Comment: Accepted at ICML 2024
Published: 2024

79,961 results on '"Ebrahimi A."'
