62,739 results on '"Reda, A"'
Search Results
2. Nash equilibria in scalar discrete-time linear quadratic games
- Author
- Salizzoni, Giulio, Ouhamma, Reda, and Kamgarpour, Maryam
- Subjects
Computer Science - Computer Science and Game Theory, Computer Science - Multiagent Systems
- Abstract
An open problem in linear quadratic (LQ) games has been characterizing the Nash equilibria. This problem has renewed relevance given the surge of work on understanding the convergence of learning algorithms in dynamic games. This paper investigates scalar discrete-time infinite-horizon LQ games with two agents. Even in this arguably simple setting, there are no results for finding $\textit{all}$ Nash equilibria. By analyzing the best response map, we formulate a polynomial system of equations characterizing the linear feedback Nash equilibria. This enables us to bring in tools from algebraic geometry, particularly the Gröbner basis, to study the roots of this polynomial system. Consequently, we can not only compute all Nash equilibria numerically, but we can also characterize their number with explicit conditions. For instance, we prove that the LQ games under consideration admit at most three Nash equilibria. We further provide sufficient conditions for the existence of at most two Nash equilibria and sufficient conditions for the uniqueness of the Nash equilibrium. Our numerical experiments demonstrate the tightness of our bounds and showcase the increased complexity in settings with more than two agents.
- Published
- 2024
3. Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
- Author
- Firdoussi, Aymane El, Seddik, Mohamed El Amine, Hayou, Soufiane, Alami, Reda, Alzubaidi, Ahmed, and Hacid, Hakim
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Statistics Theory
- Abstract
Synthetic data has gained attention for training large language models, but poor-quality data can harm performance (see, e.g., Shumailov et al. (2023); Seddik et al. (2024)). A potential solution is data pruning, which retains only high-quality data based on a score function (human or machine feedback). Previous work (Feng et al., 2024) analyzed models trained on synthetic data as sample size increases. We extend this by using random matrix theory to derive the performance of a binary classifier trained on a mix of real and pruned synthetic data in a high-dimensional setting. Our findings identify conditions where synthetic data could improve performance, focusing on the quality of the generative model and verification strategy. We also show a smooth phase transition in synthetic label noise, contrasting with prior sharp behavior in infinite sample limits. Experiments with toy models and large language models validate our theoretical results.
- Published
- 2024
4. CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
- Author
- Tevet, Guy, Raab, Sigal, Cohan, Setareh, Reda, Daniele, Luo, Zhengyi, Peng, Xue Bin, Bermano, Amit H., and van de Panne, Michiel
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Motion diffusion models and Reinforcement Learning (RL) based control for physics-based simulations have complementary strengths for human motion generation. The former is capable of generating a wide variety of motions, adhering to intuitive control such as text, while the latter offers physically plausible motion and direct interaction with the environment. In this work, we present a method that combines their respective strengths. CLoSD is a text-driven RL physics-based controller, guided by diffusion generation for various tasks. Our key insight is that motion diffusion can serve as an on-the-fly universal planner for a robust RL controller. To this end, CLoSD maintains a closed-loop interaction between two modules -- a Diffusion Planner (DiP), and a tracking controller. DiP is a fast-responding autoregressive diffusion model, controlled by textual prompts and target locations, and the controller is a simple and robust motion imitator that continuously receives motion plans from DiP and provides feedback from the environment. CLoSD is capable of seamlessly performing a sequence of different tasks, including navigation to a goal location, striking an object with a hand or foot as specified in a text prompt, sitting down, and getting up. https://guytevet.github.io/CLoSD-page/
- Published
- 2024
5. Advancing Spatio-temporal Storm Surge Prediction with Hierarchical Deep Neural Networks
- Author
- Naeini, Saeed Saviz, Snaiki, Reda, and Wu, Teng
- Subjects
Physics - Atmospheric and Oceanic Physics, Computer Science - Machine Learning
- Abstract
Coastal regions in North America face major threats from storm surges caused by hurricanes and nor'easters. Traditional numerical models, while accurate, are computationally expensive, limiting their practicality for real-time predictions. Recently, deep learning techniques have been developed for efficient simulation of time-dependent storm surge. To resolve the small scales of storm surge in both time and space over a long duration and a large area, these simulations typically need to employ oversized neural networks that struggle with the accumulation of prediction errors over successive time steps. To address these challenges, this study introduces a hierarchical deep neural network (HDNN) combined with a convolutional autoencoder (CAE) to accurately and efficiently predict storm surge time series. The CAE reduces the dimensionality of storm surge data, streamlining the learning process. HDNNs then map storm parameters to the low-dimensional representation of storm surge, allowing for sequential predictions across different time scales. Specifically, the current-level neural network is utilized to predict future states with a relatively large time step, which are passed as inputs to the next-level neural network for smaller time-step predictions. This process continues sequentially for all time steps. The results from different-level neural networks across various time steps are then stacked to acquire the entire time series of storm surge. The simulated low-dimensional representations are finally decoded back into storm surge time series. The proposed model was trained and tested using synthetic data from the North Atlantic Comprehensive Coastal Study. Results demonstrate its excellent performance to effectively handle high-dimensional surge data while mitigating the accumulation of prediction errors over time, making it a promising tool for advancing storm surge prediction.
- Published
- 2024
6. Margin-bounded Confidence Scores for Out-of-Distribution Detection
- Author
- Tamang, Lakpa D., Bouadjenek, Mohamed Reda, Dazeley, Richard, and Aryal, Sunil
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In many critical Machine Learning applications, such as autonomous driving and medical image diagnosis, the detection of out-of-distribution (OOD) samples is as crucial as accurately classifying in-distribution (ID) inputs. Recently, Outlier Exposure (OE) based methods have shown promising results in detecting OOD inputs via model fine-tuning with auxiliary outlier data. However, most previous OE-based approaches focus more on synthesizing extra outlier samples or introducing regularization to diversify the OOD sample space, which is rather unquantifiable in practice. In this work, we propose a novel and straightforward method called Margin bounded Confidence Scores (MaCS) to address the nontrivial OOD detection problem by enlarging the disparity between ID and OOD scores, which in turn makes the decision boundary more compact, facilitating effective segregation with a simple threshold. Specifically, we augment the learning objective of an OE regularized classifier with a supplementary constraint, which penalizes high confidence scores for OOD inputs compared to those of ID inputs and significantly enhances the OOD detection performance while maintaining the ID classification accuracy. Extensive experiments on various benchmark datasets for image classification tasks demonstrate the effectiveness of the proposed method by significantly outperforming state-of-the-art (SOTA) methods on various benchmarking metrics. The code is publicly available at https://github.com/lakpa-tamang9/margin_ood, Comment: 10 pages, 5 figures, IEEE Conference in Data Mining 2024
- Published
- 2024
7. Phase transitions in chromatin: mesoscopic and mean-field approaches
- Author
- Tiani, Reda, Jardat, Marie, and Dahirel, Vincent
- Subjects
Physics - Biological Physics, Condensed Matter - Soft Condensed Matter, Condensed Matter - Statistical Mechanics, Physics - Chemical Physics
- Abstract
By means of a minimal physical model, we investigate the interplay of two phase transitions at play in chromatin organization: (1) liquid-liquid phase separation (LLPS) within the fluid solvating chromatin, resulting in the formation of biocondensates, and (2) the coil-globule crossover of the chromatin fiber, which drives the condensation or extension of the chain. In our model, a species representing a domain of chromatin is embedded in a binary fluid. This fluid phase separates to form a droplet rich in a macromolecule (B). Chromatin particles are trapped in a harmonic potential to reproduce the coil and globular phases of an isolated polymer chain. We investigate the role of the droplet material B on the radius of gyration of this polymer and find that this radius varies nonmonotonically with respect to the volume fraction of B. This behavior is reminiscent of a phenomenon known as $\textit{co-non-solvency}$: a polymer chain in good solvent (S) may collapse when a second good solvent (here B) is added in low quantity, and expand at higher B concentration. Additionally, the presence of finite-size effects on the coil-globule transition results in a qualitatively different impact of the droplet material on polymers of various sizes. In the context of genetic regulation, our results suggest that the size of chromatin domains and the quantity of condensate proteins are key parameters to control whether chromatin may respond to an increase of the quantity of chromatin-binding proteins by condensing or expanding. Comment: 15 pages, 10 figures
- Published
- 2024
8. Reconstruction of the Total Solar Irradiance during the last Millenium
- Author
- Penza, Valentina, Bertello, Luca, Cantoresi, Matteo, Criscuoli, Serena, Lucaferri, Lorenza, Reda, Raffaele, Ulzega, Simone, and Berrilli, Francesco
- Subjects
Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Earth and Planetary Astrophysics, Physics - Atmospheric and Oceanic Physics, Physics - Space Physics
- Abstract
Solar irradiance variations across various timescales, from minutes to centuries, represent a potential natural driver of past regional and global climate cold phases. To accurately assess the Sun's effect on climate, particularly during periods of exceptionally low solar activity known as grand minima, an accurate reconstruction of solar forcing is essential. While direct measurements of Total Solar Irradiance (TSI) only began in the late 1970s with the advent of space radiometers, indirect evidence from various historical proxies suggests that the Sun's magnetic activity has undergone possible significant fluctuations over much longer timescales. Employing diverse and independent methods for TSI reconstruction is essential to gaining a comprehensive understanding of this issue. This study employs a semi-empirical model to reconstruct TSI over the past millennium. Our approach uses an estimated open solar magnetic field ($F_{o}$), derived from cosmogenic isotope data, as a proxy for solar activity. We reconstruct the cyclic variations of TSI, due to the solar surface magnetic features, by correlating $F_{o}$ with the parameter of active region functional form. Instead, we obtain the long-term TSI trend by applying the Empirical Mode Decomposition (EMD) algorithm to the reconstructed $F_{o}$ to filter out the 11-year and 22-year solar variability. We prepare a reconstructed TSI record, spanning 971 to 2020 CE. The estimated departure from modern TSI values occurred during the Spörer Minimum (around 1400 CE), with a decrease of approximately 2.3 $W m^{-2}$. A slightly smaller decline of 2.2 $W m^{-2}$ is reported during the Maunder Minimum, between 1645 and 1715 CE.
- Published
- 2024
9. Alignment with Preference Optimization Is All You Need for LLM Safety
- Author
- Alami, Reda, Almansoori, Ali Khalifa, Alzubaidi, Ahmed, Seddik, Mohamed El Amine, Farooq, Mugariya, and Hacid, Hakim
- Subjects
Computer Science - Machine Learning
- Abstract
We demonstrate that preference optimization methods can effectively enhance LLM safety. Applying various alignment techniques to the Falcon 11B model using safety datasets, we achieve a significant boost in global safety score (from $57.64\%$ to $99.90\%$) as measured by LlamaGuard 3 8B, competing with state-of-the-art models. On toxicity benchmarks, average scores in adversarial settings dropped from over $0.6$ to less than $0.07$. However, this safety improvement comes at the cost of reduced general capabilities, particularly in math, suggesting a trade-off. We identify noise contrastive alignment (Safe-NCA) as an optimal method for balancing safety and performance. Our study ultimately shows that alignment techniques can be sufficient for building safe and robust models.
- Published
- 2024
10. Exploring WavLM Back-ends for Speech Spoofing and Deepfake Detection
- Author
- Stourbe, Theophile, Miara, Victor, Lepage, Theo, and Dehak, Reda
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
- Abstract
This paper describes our submitted systems to the ASVspoof 5 Challenge Track 1: Speech Deepfake Detection - Open Condition, which consists of a stand-alone speech deepfake (bonafide vs spoof) detection task. Recently, large-scale self-supervised models have become a standard in Automatic Speech Recognition (ASR) and other speech processing tasks. Thus, we leverage a pre-trained WavLM as a front-end model and pool its representations with different back-end techniques. The complete framework is fine-tuned using only the training dataset of the challenge, similar to the closed condition. In addition, we adopt data augmentation by adding noise and reverberation using MUSAN noise and RIR datasets. We also experiment with codec augmentations to increase the performance of our method. Ultimately, we use the Bosaris toolkit for score calibration and system fusion to get better Cllr scores. Our fused system achieves 0.0937 minDCF, 3.42% EER, 0.1927 Cllr, and 0.1375 actDCF.
- Published
- 2024
11. Demonstration of hybrid foreground removal on CHIME data
- Author
- Wang, Haochen, Masui, Kiyoshi, Bandura, Kevin, Chakraborty, Arnab, Dobbs, Matt, Foreman, Simon, Gray, Liam, Halpern, Mark, Joseph, Albin, MacEachern, Joshua, Mena-Parra, Juan, Miller, Kyle, Newburgh, Laura, Paul, Sourabh, Reda, Alex, Sanghavi, Pranav, Siegel, Seth, and Wulf, Dallas
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, Astrophysics - Instrumentation and Methods for Astrophysics, General Relativity and Quantum Cosmology
- Abstract
The main challenge of 21 cm cosmology experiments is astrophysical foregrounds, which are difficult to separate from the signal due to telescope systematics. An earlier study has shown that foreground residuals induced by antenna gain errors can be estimated and subtracted using the hybrid foreground residual subtraction (HyFoReS) technique, which relies on cross-correlating linearly filtered data. In this paper, we apply a similar technique to the CHIME stacking analysis to subtract beam-induced foreground contamination. Using a linear high-pass delay filter for foreground suppression, the CHIME collaboration reported an $11.1\sigma$ detection in the 21 cm signal stacked on eBOSS quasar locations, despite foreground residual contamination mostly due to the instrument chromatic transfer function. We cross-correlate the foreground-dominated data at low delay with the contaminated signal at high delay to estimate residual foregrounds and subtract them from the signal. We find foreground residual subtraction can improve the signal-to-noise ratio of the stacked 21 cm signal by $10 - 20\%$ after the delay foreground filter, although some of the improvement can also be achieved with an alternative flagging technique. We have shown that it is possible to use HyFoReS to reduce beam-induced foreground contamination, benefiting the analysis of the HI auto power spectrum with CHIME and enabling the recovery of large scale modes.
- Published
- 2024
12. EFL Learners' Attitudes toward Reading in Both English and Arabic
- Author
- Waleed Ahmed Nureldeen, Hala Alsabatinb, Abdalmuttaleb Al-Sartawi, and Reda S.M. Al-Mawadieh
- Abstract
The English reading ability of university students is crucial for their academic achievement, regardless of whether they are studying English or any other subject in English. Examining the affective factors that influence this ability is critical for understanding its development. Research has been conducted on the correlation between first language (L1) and English as a second language (ESL/EFL) reading attitudes, but there is limited research on the correlation between L2 and L1. This research seeks to investigate the similarities in reading perspectives between Arabic-speaking international students and English-speaking university students. The study employed surveys and semi-structured interviews to evaluate individuals' viewpoints on bilingual reading. The objective was to assess individuals' attitudes towards reading in two languages. Both quantitative and qualitative analyses were conducted. The study revealed that students' reading preferences remained consistent across different languages. However, the relationship between second-language reading attitudes and first-language reading attitudes was found to be minimal. The implications of the current study suggest that enhancing reading attitudes in both languages may enhance reading ability. The study utilised subjective data and targeted a specific demographic, limiting its generalizability to more diverse student populations.
- Published
- 2024
13. Holographic Beam Measurements of the Canadian Hydrogen Intensity Mapping Experiment (CHIME)
- Author
- Amiri, Mandana, Chakraborty, Arnab, Foreman, Simon, Halpern, Mark, Hill, Alex S, Hinshaw, Gary, Landecker, T. L., MacEachern, Joshua, Masui, Kiyoshi W., Mena-Parra, Juan, Milutinovic, Nikola, Newburgh, Laura, Ordog, Anna, Pen, Ue-Li, Pinsonneault-Marotte, Tristan, Reda, Alex, Siegel, Seth R., Singh, Saurabh, Wang, Haochen, and Wulf, Dallas
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
We present the first results of the holographic beam mapping program for the Canadian Hydrogen Intensity Mapping Experiment (CHIME). We describe the implementation of the holographic technique as adapted for CHIME, and introduce the processing pipeline which prepares the raw holographic timestreams for analysis of beam features. We use data from six bright sources across the full 400-800 MHz observing band of CHIME to provide measurements of the co-polar and cross-polar beam response of CHIME in both amplitude and phase for the 1024 dual-polarized feeds instrumented on CHIME. In addition, we present comparisons with independent probes of the CHIME beam which indicate the presence of polarized beam leakage in CHIME. Holographic measurements of the CHIME beam have already been applied in science with CHIME, e.g. in estimating detection significance of far sidelobe FRBs, and in validating the beam models used for CHIME's first detections of 21 cm emission (in cross-correlation with measurements of large-scale structure from galaxy surveys and the Lyman-$\alpha$ forest). Measurements presented in this paper, and future holographic results, will provide a unique data set to characterize the CHIME beam and improve the experiment's prospects for a detection of BAO. Comment: submitted to ApJ
- Published
- 2024
14. Trust Your Gut: Comparing Human and Machine Inference from Noisy Visualizations
- Author
- Koonchanok, Ratanond, Papka, Michael E., and Reda, Khairi
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
- Abstract
People commonly utilize visualizations not only to examine a given dataset, but also to draw generalizable conclusions about the underlying models or phenomena. Prior research has compared human visual inference to that of an optimal Bayesian agent, with deviations from rational analysis viewed as problematic. However, human reliance on non-normative heuristics may prove advantageous in certain circumstances. We investigate scenarios where human intuition might surpass idealized statistical rationality. In two experiments, we examine individuals' accuracy in characterizing the parameters of known data-generating models from bivariate visualizations. Our findings indicate that, although participants generally exhibited lower accuracy compared to statistical models, they frequently outperformed Bayesian agents, particularly when faced with extreme samples. Participants appeared to rely on their internal models to filter out noisy visualizations, thus improving their resilience against spurious data. However, participants displayed overconfidence and struggled with uncertainty estimation. They also exhibited higher variance than statistical machines. Our findings suggest that analyst gut reactions to visualizations may provide an advantage, even when departing from rationality. These results carry implications for designing visual analytics tools, offering new perspectives on how to integrate statistical models and analyst intuition for improved inference and decision-making. The data and materials for this paper are available at https://osf.io/qmfv6, Comment: To appear in IEEE Transactions on Visualization and Computer Graphics (Proceedings of IEEE VIS'24)
- Published
- 2024
15. Falcon2-11B Technical Report
- Author
- Malartic, Quentin, Chowdhury, Nilabhra Roy, Cojocaru, Ruxandra, Farooq, Mugariya, Campesan, Giulia, Djilali, Yasser Abdelaziz Dahou, Narayan, Sanath, Singh, Ankit, Velikanov, Maksim, Boussaha, Basma El Amel, Al-Yafeai, Mohammed, Alobeidli, Hamza, Qadi, Leen Al, Seddik, Mohamed El Amine, Fedyanin, Kirill, Alami, Reda, and Hacid, Hakim
- Subjects
Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
We introduce Falcon2-11B, a foundation model trained on over five trillion tokens, and its multimodal counterpart, Falcon2-11B-vlm, which is a vision-to-text model. We report our findings during the training of the Falcon2-11B which follows a multi-stage approach where the early stages are distinguished by their context length and a final stage where we use a curated, high-quality dataset. Additionally, we report the effect of doubling the batch size mid-training and how training loss spikes are affected by the learning rate. The downstream performance of the foundation model is evaluated on established benchmarks, including multilingual and code datasets. The foundation model shows strong generalization across all the tasks which makes it suitable for downstream finetuning use cases. For the vision language model, we report the performance on several benchmarks and show that our model achieves a higher average score compared to open-source models of similar size. The model weights and code of both Falcon2-11B and Falcon2-11B-vlm are made available under a permissive license.
- Published
- 2024
16. HDLCopilot: Hardware Design Library Querying with Natural Language
- Author
- Abdelatty, Manar and Reda, Sherief
- Subjects
Computer Science - Computation and Language
- Abstract
Hardware design engineers routinely work with multiple Process Design Kits (PDKs) from various fabrication labs, each containing several standard cell libraries optimized for a specific metric such as speed, power, or density. These libraries include multiple views such as liberty files for timing information, LEF files for abstract layout details, and technology LEF for process design rules. Navigating this complex landscape to retrieve specific information about gates or design rules is often time-consuming and error-prone. To address this, we present HDLCopilot, an LLM-powered PDK query system that allows engineers to streamline interactions with PDKs in natural language format, making information retrieval accurate and more efficient. HDLCopilot achieves an accuracy of 94.23% on an evaluation set composed of diverse and complex natural language queries. HDLCopilot positions itself as a powerful assistant in the hardware design process, enhancing productivity and reducing potential human errors. Comment: 7 pages, 8 figures
- Published
- 2024
17. Beam Maps of the Canadian Hydrogen Intensity Mapping Experiment (CHIME) Measured with a Drone
- Author
- Tyndall, Will, Reda, Alex, Shaw, J. Richard, Bandura, Kevin, Chakraborty, Arnab, Kuhn, Emily, MacEachern, Joshua, Mena-Parra, Juan, Newburgh, Laura, Ordog, Anna, Pinsonneault-Marotte, Tristan, Polish, Anna Rose, Saliwanchik, Ben, Sanghavi, Pranav, Siegel, Seth R., Whitmer, Audrey, and Wulf, Dallas
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
We present beam measurements of the CHIME telescope using a radio calibration source deployed on a drone payload. During test flights, the pulsing calibration source and the telescope were synchronized to GPS time, enabling in-situ background subtraction for the full $N^{2}$ visibility matrix for one CHIME cylindrical reflector. We use the autocorrelation products to estimate the primary beam width and centroid location, and compare these quantities to solar transit measurements and holographic measurements where they overlap on the sky. We find that the drone, solar, and holography data have similar beam parameter evolution across frequency and both spatial coordinates. This paper presents the first drone-based beam measurement of a large cylindrical radio interferometer. Furthermore, the unique analysis and instrumentation described in this paper lay the foundation for near-field measurements of experiments like CHIME. Comment: Submitted to IEEE OJAP June 30, 2024
- Published
- 2024
18. Universal piecewise polynomiality for counting curves in toric surfaces
- Author
- Hahn, Marvin Anas and Reda, Vincenzo
- Subjects
Mathematics - Algebraic Geometry, Mathematics - Combinatorics, 14N10, 14T90, 14N35
- Abstract
Inspired by piecewise polynomiality results of double Hurwitz numbers, Ardila and Brugallé introduced an enumerative problem which they call double Gromov--Witten invariants of Hirzebruch surfaces. These invariants serve as a two-dimensional analogue and satisfy a similar piecewise polynomial structure. More precisely, they introduced the enumeration of curves in Hirzebruch surfaces satisfying point conditions and tangency conditions on the two parallel toric boundaries. These conditions are stored in four partitions and the resulting invariants are piecewise polynomial in their entries. Moreover, they found that these expressions also behave polynomially with respect to the parameter determining the underlying Hirzebruch surfaces. Based on work of Ardila and Block, they proposed that such a polynomiality could also hold while changing between more general toric surfaces corresponding to $h$-transverse polygons. In this work, we answer this question affirmatively. Moreover, we express the resulting invariants for $h$-transverse polygons as matrix elements in the two-dimensional bosonic Fock space. Comment: 24 pages, 8 figures, 3 tables. arXiv admin note: text overlap with arXiv:1412.4563 by other authors
- Published
- 2024
19. Enhancing Travel Decision-Making: A Contrastive Learning Approach for Personalized Review Rankings in Accommodations
- Author
- Igebaria, Reda, Fainman, Eran, Mizrachi, Sarai, Beladev, Moran, and Wang, Fengjun
- Subjects
Computer Science - Information Retrieval, Computer Science - Machine Learning
- Abstract
User-generated reviews significantly influence consumer decisions, particularly in the travel domain when selecting accommodations. This paper's contribution comprises two main elements. Firstly, we present a novel dataset of authentic guest reviews sourced from a prominent online travel platform, totaling over two million reviews from 50,000 distinct accommodations. Secondly, we propose an innovative approach for personalized review ranking. Our method employs contrastive learning to intricately capture the relationship between a review and the contextual information of its respective reviewer. Through a comprehensive experimental study, we demonstrate that our approach surpasses several baselines across all reported metrics. Augmented by a comparative analysis, we showcase the efficacy of our method in elevating personalized review ranking. The implications of our research extend beyond the travel domain, with potential applications in other sectors where personalized review ranking is paramount, such as online e-commerce platforms.
- Published
- 2024
20. Diffusion-based Adversarial Purification for Intrusion Detection
- Author
- Merzouk, Mohamed Amine, Beurier, Erwan, Yaich, Reda, Boulahia-Cuppens, Nora, and Cuppens, Frédéric
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
The escalating sophistication of cyberattacks has encouraged the integration of machine learning techniques in intrusion detection systems, but the rise of adversarial examples presents a significant challenge. These crafted perturbations mislead ML models, enabling attackers to evade detection or trigger false alerts. As a reaction, adversarial purification has emerged as a compelling solution, particularly with diffusion models showing promising results. However, their purification potential remains unexplored in the context of intrusion detection. This paper demonstrates the effectiveness of diffusion models in purifying adversarial examples in network intrusion detection. Through a comprehensive analysis of the diffusion parameters, we identify optimal configurations maximizing adversarial robustness with minimal impact on normal performance. Importantly, this study reveals insights into the relationship between diffusion noise and diffusion steps, representing a novel contribution to the field. Our experiments are carried out on two datasets and against 5 adversarial attacks. The implementation code is publicly available.
- Published
- 2024
21. Science in a Blink: Supporting Ensemble Perception in Scalar Fields
- Author
- Mateevitsi, Victor A., Papka, Michael E., and Reda, Khairi
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
Visualizations support rapid analysis of scientific datasets, allowing viewers to glean aggregate information (e.g., the mean) within split-seconds. While prior research has explored this ability in conventional charts, it is unclear if spatial visualizations used by computational scientists afford a similar ensemble perception capacity. We investigate people's ability to estimate two summary statistics, mean and variance, from pseudocolor scalar fields. In a crowdsourced experiment, we find that participants can reliably characterize both statistics, although variance discrimination requires a much stronger signal. Multi-hue and diverging colormaps outperformed monochromatic luminance ramps in aiding this extraction. Analysis of qualitative responses suggests that participants often estimate the distribution of hotspots and valleys as visual proxies for data statistics. These findings suggest that people's summary interpretation of spatial datasets is likely driven by the appearance of discrete color segments, rather than assessments of overall luminance. Implicit color segmentation in quantitative displays could thus prove more useful than previously assumed by facilitating quick, gist-level judgments about color-coded visualizations. Comment: To appear in Proceedings of the 2024 IEEE Visualization Conference (VIS'24)
- Published
- 2024
22. InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation
- Author
-
Doukhan, David, Maertens, Christine, Personnic, William Le, Speroni, Ludovic, and Dehak, Reda
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Digital Libraries ,Computer Science - Multimedia ,Computer Science - Sound - Abstract
InaGVAD is an audio corpus collected from 10 French radio and 18 TV channels categorized into 4 groups: generalist radio, music radio, news TV, and generalist TV. It contains 277 1-minute-long annotated recordings aimed at representing the acoustic diversity of French audiovisual programs and was primarily designed to build systems able to monitor men's and women's speaking time in media. inaGVAD is provided with Voice Activity Detection (VAD) and Speaker Gender Segmentation (SGS) annotations extended with overlap, speaker traits (gender, age, voice quality), and 10 non-speech event categories. Annotation distributions are detailed for each channel category. This dataset is partitioned into a 1h development and a 3h37 test subset, allowing fair and reproducible system evaluation. A benchmark of 6 freely available VAD software packages is presented, showing diverse abilities based on channel and non-speech event categories. Two existing SGS systems are evaluated on the corpus and compared against a baseline X-vector transfer learning strategy, trained on the development subset. Results demonstrate that our proposal, trained on a single - but diverse - hour of data, achieved competitive SGS results. The entire inaGVAD package, including corpus, annotations, evaluation scripts, and baseline training code, is made freely accessible, fostering future advancement in the domain., Comment: Voice Activity Detection (VAD), Speaker Gender Segmentation, Audiovisual Speech Resource, Speaker Traits, Speech Overlap, Benchmark, X-vector, Gender Representation in the Media, Dataset
- Published
- 2024
23. On Affine Homotopy between Language Encoders
- Author
-
Chan, Robin SM, Boumasmoud, Reda, Svete, Anej, Ren, Yuxin, Guo, Qipeng, Jin, Zhijing, Ravfogel, Shauli, Sachan, Mrinmaya, Schölkopf, Bernhard, El-Assady, Mennatallah, and Cotterell, Ryan
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Pre-trained language encoders -- functions that represent text as vectors -- are an integral component of many NLP tasks. We tackle a natural question in language encoder analysis: What does it mean for two encoders to be similar? We contend that a faithful measure of similarity needs to be \emph{intrinsic}, that is, task-independent, yet still be informative of \emph{extrinsic} similarity -- the performance on downstream tasks. It is common to consider two encoders similar if they are \emph{homotopic}, i.e., if they can be aligned through some transformation. In this spirit, we study the properties of \emph{affine} alignment of language encoders and its implications on extrinsic similarity. We find that while affine alignment is fundamentally an asymmetric notion of similarity, it is still informative of extrinsic similarity. We confirm this on datasets of natural language representations. Beyond providing useful bounds on extrinsic similarity, affine intrinsic similarity also allows us to begin uncovering the structure of the space of pre-trained encoders by defining an order over them., Comment: 10 pages
- Published
- 2024
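Note on entry 23: the abstract's observation that affine alignment is an asymmetric notion of similarity can be illustrated with a minimal sketch. This is not the authors' code. It assumes each encoder is available as a matrix of representations of the same texts (one row per text) and fits the affine map by ordinary least squares. The function name `affine_alignment_error` and the synthetic matrices are hypothetical.

```python
import numpy as np

def affine_alignment_error(H_src, H_tgt):
    """Fit H_tgt ~ H_src @ A + b by least squares; return the mean squared residual."""
    n = H_src.shape[0]
    X = np.hstack([H_src, np.ones((n, 1))])      # extra column of ones models the bias b
    coef, *_ = np.linalg.lstsq(X, H_tgt, rcond=None)
    residual = H_tgt - X @ coef
    return float(np.mean(residual ** 2))

# Synthetic stand-ins for two encoders evaluated on the same 200 texts.
rng = np.random.default_rng(0)
H_rich = rng.normal(size=(200, 8))                        # hypothetical 8-dimensional encoder
H_poor = H_rich[:, :3] @ rng.normal(size=(3, 3)) + 1.0    # uses only 3 of those dimensions

err_rich_to_poor = affine_alignment_error(H_rich, H_poor)
err_poor_to_rich = affine_alignment_error(H_poor, H_rich)
```

On this synthetic data the richer encoder maps onto the poorer one almost exactly, while the reverse fit leaves a large residual, because no affine map can recover the dimensions that were discarded.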
24. Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
- Author
-
Miara, Victor, Lepage, Theo, and Dehak, Reda
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Machine Learning ,Computer Science - Sound - Abstract
Recent advancements in Self-Supervised Learning (SSL) have shown promising results in Speaker Verification (SV). However, narrowing the performance gap with supervised systems remains an ongoing challenge. Several studies have observed that speech representations from large-scale ASR models contain valuable speaker information. This work explores the limitations of fine-tuning these models for SV using an SSL contrastive objective in an end-to-end approach. Then, we propose a framework to learn speaker representations in an SSL context by fine-tuning a pre-trained WavLM with a supervised loss using pseudo-labels. Initial pseudo-labels are derived from an SSL DINO-based model and are iteratively refined by clustering the model embeddings. Our method achieves 0.99% EER on VoxCeleb1-O, establishing the new state-of-the-art on self-supervised SV. As this performance is close to our supervised baseline of 0.94% EER, this contribution is a step towards supervised performance on SV with SSL., Comment: accepted at INTERSPEECH 2024
- Published
- 2024
25. Data Quality in Edge Machine Learning: A State-of-the-Art Survey
- Author
-
Belgoumri, Mohammed Djameleddine, Bouadjenek, Mohamed Reda, Aryal, Sunil, and Hacid, Hakim
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
Data-driven Artificial Intelligence (AI) systems trained using Machine Learning (ML) are shaping an ever-increasing (in size and importance) portion of our lives, including, but not limited to, recommendation systems, autonomous driving technologies, healthcare diagnostics, financial services, and personalized marketing. On the one hand, the outsized influence of these systems imposes a high standard of quality, particularly in the data used to train them. On the other hand, establishing and maintaining standards of Data Quality (DQ) becomes more challenging due to the proliferation of Edge Computing and Internet of Things devices, along with their increasing adoption for training and deploying ML models. The nature of the edge environment -- characterized by limited resources, decentralized data storage, and processing -- exacerbates data-related issues, making them more frequent, severe, and difficult to detect and mitigate. From these observations, it follows that DQ research for edge ML is a critical and urgent exploration track for the safety and robust usefulness of present and future AI systems. Despite this fact, DQ research for edge ML is still in its infancy. The literature on this subject remains fragmented and scattered across different research communities, with no comprehensive survey to date. Hence, this paper aims to fill this gap by providing a global view of the existing literature from multiple disciplines that can be grouped under the umbrella of DQ for edge ML. Specifically, we present a tentative definition of data quality in Edge computing, which we use to establish a set of DQ dimensions. We explore each dimension in detail, including existing solutions for mitigation., Comment: 31 pages, 5 figures
- Published
- 2024
26. Data-driven Machinery Fault Detection: A Comprehensive Review
- Author
-
Neupane, Dhiraj, Bouadjenek, Mohamed Reda, Dazeley, Richard, and Aryal, Sunil
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
In this era of advanced manufacturing, it's now more crucial than ever to diagnose machine faults as early as possible to guarantee their safe and efficient operation. With the massive surge in industrial big data and advancement in sensing and computational technologies, data-driven Machinery Fault Diagnosis (MFD) solutions based on machine/deep learning approaches have been used ubiquitously in manufacturing. Timely and accurately identifying faulty machine signals is vital in industrial applications for which many relevant solutions have been proposed and are reviewed in many articles. Despite the availability of numerous solutions and reviews on MFD, existing works often lack several aspects. Most of the available literature has limited applicability in a wide range of manufacturing settings due to their concentration on a particular type of equipment or method of analysis. Additionally, discussions regarding the challenges associated with implementing data-driven approaches, such as dealing with noisy data, selecting appropriate features, and adapting models to accommodate new or unforeseen faults, are often superficial or completely overlooked. Thus, this survey provides a comprehensive review of the articles using different types of machine learning approaches for the detection and diagnosis of various types of machinery faults, highlights their strengths and limitations, provides a review of the methods used for condition-based analyses, comprehensively discusses the available machinery fault datasets, introduces future researchers to the possible challenges they have to encounter while using these approaches for MFD and recommends the probable solutions to mitigate those problems. The future research prospects are also pointed out for a better understanding of the field. We believe this article will help researchers and contribute to the further development of the field.
- Published
- 2024
27. Large Margin Discriminative Loss for Classification
- Author
-
Nguyen, Hai-Vy, Gamboa, Fabrice, Zhang, Sixin, Chhaibi, Reda, Gratton, Serge, and Giaccone, Thierry
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
In this paper, we introduce a novel discriminative loss function with large margin in the context of Deep Learning. This loss boosts the discriminative power of neural nets, represented by intra-class compactness and inter-class separability. On the one hand, the class compactness is ensured by close distance of samples of the same class to each other. On the other hand, the inter-class separability is boosted by a margin loss that ensures the minimum distance of each class to its closest boundary. All the terms in our loss have an explicit meaning, giving a direct view of the feature space obtained. We analyze mathematically the relation between compactness and margin term, giving a guideline about the impact of the hyper-parameters on the learned features. Moreover, we also analyze properties of the gradient of the loss with respect to the parameters of the neural net. Based on this, we design a strategy called partial momentum updating that enjoys simultaneously stability and consistency in training. Furthermore, we also investigate generalization errors to have better theoretical insights. Our loss function systematically boosts the test accuracy of models compared to the standard softmax loss in our experiments.
- Published
- 2024
28. The Ca II K index-Mg II index relation: A Hilbert-Huang Transform approach
- Author
-
Reda, Raffaele and Penza, Valentina
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
Solar activity, which is driven by a variable magnetic field, varies on several time scales, the 11-year cycle being the best known. In addition to the SunSpot Number, the Ca II K index and the Mg II index are among the indices most widely used to quantify solar activity, partly because of their ability to trace the solar UV emission. In this work, we compare the Ca II K 0.1nm emission index to the Mg II index over the time interval 1978-2017, which covers almost four solar cycles. We show that they are strongly correlated across each solar cycle (r$\geq$0.94), providing the corresponding linear regression fit parameters. The Hilbert-Huang Transform is then used to decompose these indices into their intrinsic modes of oscillation. By studying how their components are correlated over the different time scales, it is found that the maximum correlation is observed at the 11-year scale, while the correlation weakens toward shorter time scales.
- Published
- 2024
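Note on entry 28: the per-cycle comparison described in the abstract (a correlation coefficient plus linear regression fit parameters) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the helper name `correlate_and_fit` is hypothetical, and the two series are synthetic numbers generated here, not Ca II K or Mg II measurements.

```python
import numpy as np

def correlate_and_fit(x, y):
    """Return the Pearson correlation of x and y and the least-squares line y ~ slope*x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope, intercept = np.polyfit(x, y, deg=1)
    return r, slope, intercept

# Synthetic stand-ins for two activity indices over one cycle (values are made up).
rng = np.random.default_rng(1)
index_a = np.linspace(0.15, 0.17, 120)
index_b = 2.0 * index_a + 0.05 + rng.normal(scale=1e-4, size=index_a.size)

r, slope, intercept = correlate_and_fit(index_a, index_b)
```

Applying the same function to each solar cycle separately would give one correlation coefficient and one pair of fit parameters per cycle.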
29. Finite-time convergence to an $\epsilon$-efficient Nash equilibrium in potential games
- Author
-
Maddux, Anna, Ouhamma, Reda, and Kamgarpour, Maryam
- Subjects
Computer Science - Multiagent Systems - Abstract
This paper investigates the convergence time of log-linear learning to an $\epsilon$-efficient Nash equilibrium (NE) in potential games. In such games, an efficient NE is defined as the maximizer of the potential function. Previous literature provides asymptotic convergence rates to efficient Nash equilibria, and existing finite-time rates are limited to potential games with further assumptions such as the interchangeability of players. In this paper, we prove the first finite-time convergence to an $\epsilon$-efficient NE in general potential games. Our bounds depend polynomially on $1/\epsilon$, an improvement over previous bounds that are exponential in $1/\epsilon$ and only hold for subclasses of potential games. We then strengthen our convergence result in two directions: first, we show that a variant of log-linear learning that requires a factor $A$ less feedback on the utility per round enjoys a similar convergence time; second, we demonstrate the robustness of our convergence guarantee if log-linear learning is subject to small perturbations such as alterations in the learning rule or noise-corrupted utilities., Comment: 12 main pages, 33 pages, 2 figures, 1 Table
- Published
- 2024
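Note on entry 29: for readers unfamiliar with log-linear learning, the sketch below shows the standard update rule, in which one randomly chosen player resamples its action with probability proportional to the exponential of its utility. It is a minimal illustration, not the variant or the analysis from the paper. The inverse temperature `beta`, the toy identical-interest game, and all function names are assumptions made for this example.

```python
import numpy as np

def log_linear_choice_probs(utilities, beta):
    """Softmax over one player's utilities, one entry per candidate action."""
    z = beta * np.asarray(utilities, dtype=float)
    z -= z.max()                       # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def log_linear_learning(utility, n_players, n_actions, beta, steps, rng):
    """Run log-linear learning; utility(i, profile) returns player i's payoff."""
    profile = [0] * n_players
    for _ in range(steps):
        i = int(rng.integers(n_players))            # pick one player uniformly at random
        utils = []
        for a in range(n_actions):
            trial = list(profile)
            trial[i] = a                             # unilateral deviation of player i
            utils.append(utility(i, trial))
        probs = log_linear_choice_probs(utils, beta)
        profile[i] = int(rng.choice(n_actions, p=probs))
    return profile

def common_utility(i, profile):
    """Toy identical-interest game: the shared payoff is also the potential."""
    if all(a == 1 for a in profile):
        return 1.0                                   # efficient equilibrium
    if all(a == 0 for a in profile):
        return 0.5                                   # inefficient equilibrium
    return 0.0

rng = np.random.default_rng(0)
final_profile = log_linear_learning(common_utility, n_players=3, n_actions=2,
                                    beta=5.0, steps=500, rng=rng)
probs_example = log_linear_choice_probs([0.0, 1.0], beta=2.0)
```

Larger values of `beta` concentrate the stationary behavior on the potential maximizer but make it slower to leave an inefficient equilibrium, which is the trade-off that finite-time bounds of this kind quantify.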
30. Flexible Motion In-betweening with Diffusion Models
- Author
-
Cohan, Setareh, Tevet, Guy, Reda, Daniele, Peng, Xue Bin, and van de Panne, Michiel
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Graphics ,Computer Science - Machine Learning - Abstract
Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging process. We investigate the potential of diffusion models in generating diverse human motions guided by keyframes. Unlike previous inbetweening methods, we propose a simple unified model capable of generating precise and diverse motions that conform to a flexible range of user-specified spatial constraints, as well as text conditioning. To this end, we propose Conditional Motion Diffusion In-betweening (CondMDI) which allows for arbitrary dense-or-sparse keyframe placement and partial keyframe constraints while generating high-quality motions that are diverse and coherent with the given keyframes. We evaluate the performance of CondMDI on the text-conditioned HumanML3D dataset and demonstrate the versatility and efficacy of diffusion models for keyframe in-betweening. We further explore the use of guidance and imputation-based approaches for inference-time keyframing and compare CondMDI against these methods., Comment: SIGGRAPH 2024. For project page and code, see https://setarehc.github.io/CondMDI/
- Published
- 2024
31. Sensitivity Analysis for Active Sampling, with Applications to the Simulation of Analog Circuits
- Author
-
Chhaibi, Reda, Gamboa, Fabrice, Oger, Christophe, Oliveira, Vinicius, Pellegrini, Clément, and Remot, Damien
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning ,Statistics - Applications ,Statistics - Methodology - Abstract
We propose an active sampling flow, with the use-case of simulating the impact of combined variations on analog circuits. In such a context, given the large number of parameters, it is difficult to fit a surrogate model and to efficiently explore the space of design features. By combining a drastic dimension reduction using sensitivity analysis and Bayesian surrogate modeling, we obtain a flexible active sampling flow. On synthetic and real datasets, this flow outperforms the usual Monte-Carlo sampling which often forms the foundation of design space exploration., Comment: 7 pages
- Published
- 2024
32. Scattering of the Toda system and the Gaussian $\beta$-ensemble
- Author
-
Chhaibi, Reda
- Subjects
Nonlinear Sciences - Exactly Solvable and Integrable Systems ,Mathematical Physics ,Mathematics - Probability - Abstract
The classical Toda flow is a well-known integrable Hamiltonian system that diagonalizes matrices. By keeping track of the distribution of entries and precise scattering asymptotics, one can exhibit matrix models for log-gases on the real line. These types of scattering asymptotics date back to fundamental work of Moser. More precisely, using the classical Toda flow acting on symmetric real tridiagonal matrices, we give a "symplectic" proof of the fact that the Dumitriu-Edelman tridiagonal model has a spectrum following the Gaussian $\beta$-ensemble., Comment: 13 pages, v1: Submitted
- Published
- 2024
33. Statistical Edge Detection And UDF Learning For Shape Representation
- Author
-
Foy, Virgile, Gamboa, Fabrice, and Chhaibi, Reda
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning ,Statistics - Applications - Abstract
In the field of computer vision, the numerical encoding of 3D surfaces is crucial. It is classical to represent surfaces with their Signed Distance Functions (SDFs) or Unsigned Distance Functions (UDFs). For tasks like representation learning, surface classification, or surface reconstruction, this function can be learned by a neural network, called Neural Distance Function. This network, and in particular its weights, may serve as a parametric and implicit representation for the surface. The network must represent the surface as accurately as possible. In this paper, we propose a method for learning UDFs that improves the fidelity of the obtained Neural UDF to the original 3D surface. The key idea of our method is to concentrate the learning effort of the Neural UDF on surface edges. More precisely, we show that sampling more training points around surface edges allows better local accuracy of the trained Neural UDF, and thus improves the global expressiveness of the Neural UDF in terms of Hausdorff distance. To detect surface edges, we propose a new statistical method based on the calculation of a $p$-value at each point on the surface. Our method is shown to detect surface edges more accurately than a commonly used local geometric descriptor.
- Published
- 2024
34. On the Tractability of SHAP Explanations under Markovian Distributions
- Author
-
Marzouk, Reda and de La Higuera, Colin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Thanks to its solid theoretical foundation, the SHAP framework is arguably one of the most widely utilized frameworks for local explainability of ML models. Despite its popularity, its exact computation is known to be very challenging, proven to be NP-Hard in various configurations. Recent works have unveiled positive complexity results regarding the computation of the SHAP score for specific model families, encompassing decision trees, random forests, and some classes of boolean circuits. Yet, all these positive results hinge on the assumption of feature independence, often simplistic in real-world scenarios. In this article, we investigate the computational complexity of the SHAP score by relaxing this assumption and introducing a Markovian perspective. We show that, under the Markovian assumption, computing the SHAP score for the class of Weighted automata, Disjoint DNFs and Decision Trees can be performed in polynomial time, offering a first positive complexity result for the problem of SHAP score computation that transcends the limitations of the feature independence assumption., Comment: Accepted at ICML 2024
- Published
- 2024
35. Additive Margin in Contrastive Self-Supervised Frameworks to Learn Discriminative Speaker Representations
- Author
-
Lepage, Theo and Dehak, Reda
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Machine Learning ,Computer Science - Sound - Abstract
Self-Supervised Learning (SSL) frameworks became the standard for learning robust class representations by benefiting from large unlabeled datasets. For Speaker Verification (SV), most SSL systems rely on contrastive-based loss functions. We explore different ways to improve the performance of these techniques by revisiting the NT-Xent contrastive loss. Our main contribution is the definition of the NT-Xent-AM loss and the study of the importance of Additive Margin (AM) in SimCLR and MoCo SSL methods to further separate positive from negative pairs. Despite class collisions, we show that AM enhances the compactness of same-speaker embeddings and reduces the number of false negatives and false positives on SV. Additionally, we demonstrate the effectiveness of the symmetric contrastive loss, which provides more supervision for the SSL task. Implementing these two modifications to SimCLR improves performance and results in 7.85% EER on VoxCeleb1-O, outperforming other equivalent methods., Comment: accepted at Odyssey 2024: The Speaker and Language Recognition Workshop. arXiv admin note: text overlap with arXiv:2306.03664
- Published
- 2024
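Note on entry 35: the idea of adding an additive margin to a contrastive loss can be sketched as below. This is a simplified, one-directional InfoNCE-style loss written for illustration, not the authors' exact NT-Xent-AM definition: here the margin is subtracted from the cosine similarity of each positive pair before temperature scaling, and the negatives are the other utterances in the batch. The function name and the default `tau` and `margin` values are assumptions.

```python
import numpy as np

def nt_xent_am(z_a, z_b, tau=0.1, margin=0.0):
    """Contrastive loss where z_a[i] and z_b[i] embed two views of the same utterance."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T                               # cosine similarities, shape (N, N)
    logits[np.diag_indices_from(logits)] -= margin     # margin applied to positive pairs only
    logits /= tau
    logits -= logits.max(axis=1, keepdims=True)        # shift for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))          # average over positive pairs

rng = np.random.default_rng(0)
z_a = rng.normal(size=(16, 32))
z_b = z_a + 0.1 * rng.normal(size=(16, 32))            # second "view": a noisy copy

loss_plain = nt_xent_am(z_a, z_b, margin=0.0)
loss_margin = nt_xent_am(z_a, z_b, margin=0.2)
```

For the same embeddings the margin makes the loss strictly larger, so training has to pull positive pairs further above the negatives to reach the same loss value. A symmetric variant, in the spirit of the symmetric contrastive loss mentioned in the abstract, could average `nt_xent_am(z_a, z_b)` and `nt_xent_am(z_b, z_a)`.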
36. A Framework for Managing Multifaceted Privacy Leakage While Optimizing Utility in Continuous LBS Interactions
- Author
-
Bkakria, Anis and Yaich, Reda
- Subjects
Computer Science - Cryptography and Security - Abstract
Privacy in Location-Based Services (LBS) has become a paramount concern with the ubiquity of mobile devices and the increasing integration of location data into various applications. This paper presents several novel contributions to advancing the understanding and management of privacy leakage in LBS. Our contributions provide a more comprehensive framework for analyzing privacy concerns across different facets of location-based interactions. Specifically, we introduce $(\epsilon, \delta)$-location privacy, $(\epsilon, \delta, \theta)$-trajectory privacy, and $(\epsilon, \delta, \theta)$-POI privacy, which offer refined mechanisms for quantifying privacy risks associated with location, trajectory, and points of interest (POI) when continuously interacting with LBS. Furthermore, we establish fundamental connections between these privacy notions, facilitating a holistic approach to privacy preservation in LBS. Additionally, we present a lower bound analysis to evaluate the utility of the proposed privacy-preserving mechanisms, offering insights into the trade-offs between privacy protection and data utility. Finally, we instantiate our framework with the Planar Isotropic Mechanism to demonstrate its practical applicability while ensuring optimal utility and quantifying privacy leakages across various dimensions. The evaluations provide comprehensive insight into the efficacy of our framework in capturing privacy loss on location, trajectory, and points of interest while enabling quantification of the ensured accuracy.
- Published
- 2024
37. Unsupervised Microscopy Video Denoising
- Author
-
Aiyetigbo, Mary, Korte, Alexander, Anderson, Ethan, Chalhoub, Reda, Kalivas, Peter, Luo, Feng, and Li, Nianyi
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
In this paper, we introduce a novel unsupervised network to denoise microscopy videos characterized by image sequences captured by a fixed-location microscopy camera. Specifically, we propose a DeepTemporal Interpolation method, leveraging a temporal signal filter integrated into the bottom CNN layers, to restore microscopy videos corrupted by unknown noise types. Our unsupervised denoising architecture is distinguished by its ability to adapt to multiple noise conditions without the need for pre-existing noise distribution knowledge, addressing a significant challenge in real-world medical applications. Furthermore, we evaluate our denoising framework using both real microscopy recordings and simulated data, validating its superior video denoising performance across a broad spectrum of noise scenarios. Extensive experiments demonstrate that our unsupervised model consistently outperforms state-of-the-art supervised and unsupervised video denoising techniques, proving especially effective for microscopy videos., Comment: Accepted at CVPRW 2024
- Published
- 2024
38. Combining Statistical Depth and Fermat Distance for Uncertainty Quantification
- Author
-
Nguyen, Hai-Vy, Gamboa, Fabrice, Chhaibi, Reda, Zhang, Sixin, Gratton, Serge, and Giaccone, Thierry
- Subjects
Statistics - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Mathematics - Probability ,Statistics - Applications - Abstract
We measure the Out-of-domain uncertainty in the prediction of Neural Networks using a statistical notion called ``Lens Depth'' (LD) combined with Fermat Distance, which is able to capture precisely the ``depth'' of a point with respect to a distribution in feature space, without any assumption about the form of the distribution. Our method has no trainable parameter. The method is applicable to any classification model as it is applied directly in feature space at test time and does not intervene in the training process. As such, it does not impact the performance of the original model. The proposed method gives excellent qualitative results on toy datasets and can give competitive or better uncertainty estimation on standard deep learning datasets compared to strong baseline methods., Comment: 12 pages
- Published
- 2024
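Note on entry 38: the lens depth of a point with respect to a sample is the fraction of sample pairs whose "lens" (the intersection of the two balls centered at each sample point, with radius equal to their mutual distance) contains the point. The sketch below computes this empirical quantity with the plain Euclidean distance, which is a deliberate simplification: the paper combines lens depth with the Fermat distance, and that part is not reproduced here. The function name and the toy data are hypothetical.

```python
import numpy as np
from itertools import combinations

def lens_depth(x, sample):
    """Empirical lens depth of x: share of pairs (i, j) with max(d(x,Xi), d(x,Xj)) <= d(Xi,Xj)."""
    sample = np.asarray(sample, dtype=float)
    d_x = np.linalg.norm(sample - x, axis=1)          # distance from x to every sample point
    inside = 0
    pairs = 0
    for i, j in combinations(range(len(sample)), 2):
        d_ij = np.linalg.norm(sample[i] - sample[j])
        inside += int(max(d_x[i], d_x[j]) <= d_ij)    # x lies in the lens of (Xi, Xj)
        pairs += 1
    return inside / pairs

rng = np.random.default_rng(0)
cloud = rng.normal(size=(60, 2))                      # toy feature-space sample

depth_center = lens_depth(np.zeros(2), cloud)
depth_far = lens_depth(np.array([100.0, 100.0]), cloud)
```

A point in the middle of the cloud receives a high depth, while a point far from it receives a depth of zero, which is what makes the quantity usable as an out-of-domain score. The pairwise loop is quadratic in the sample size and is kept only for readability.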
39. PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models
- Author
-
Agiza, Ahmed, Mostagir, Mohamed, and Reda, Sherief
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
In an era where language models are increasingly integrated into decision-making and communication, understanding the biases within Large Language Models (LLMs) becomes imperative, especially when these models are applied in the economic and political domains. This work investigates the impact of fine-tuning and data selection on economic and political biases in LLMs. In this context, we introduce PoliTune, a fine-tuning methodology to explore the systematic aspects of aligning LLMs with specific ideologies, mindful of the biases that arise from their extensive training on diverse datasets. Distinct from earlier efforts that either focus on smaller models or entail resource-intensive pre-training, PoliTune employs Parameter-Efficient Fine-Tuning (PEFT) techniques, which allow for the alignment of LLMs with targeted ideologies by modifying a small subset of parameters. We introduce a systematic method for using the open-source LLM Llama3-70B for dataset selection, annotation, and synthesizing a preferences dataset for Direct Preference Optimization (DPO) to align the model with a given political ideology. We assess the effectiveness of PoliTune through both quantitative and qualitative evaluations of aligning open-source LLMs (Llama3-8B and Mistral-7B) to different ideologies. Our work analyzes the potential of embedding specific biases into LLMs and contributes to the dialogue on the ethical application of AI, highlighting the importance of deploying AI in a manner that aligns with societal values., Comment: AIES '24: Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, and Society
- Published
- 2024
40. Review for Handling Missing Data with special missing mechanism
- Author
-
Zhou, Youran, Aryal, Sunil, and Bouadjenek, Mohamed Reda
- Subjects
Statistics - Methodology ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Missing data poses a significant challenge in data science, affecting decision-making processes and outcomes. Understanding what missing data is, how it occurs, and why it is crucial to handle it appropriately is paramount when working with real-world data, especially in tabular data, one of the most commonly used data types in the real world. Three missing mechanisms are defined in the literature: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), each presenting unique challenges in imputation. Most existing work focuses on MCAR, which is relatively easy to handle. The special missing mechanisms of MNAR and MAR are less explored and understood. This article reviews existing literature on handling missing values. It compares and contrasts existing methods in terms of their ability to handle different missing mechanisms and data types. It identifies research gaps in the existing literature and lays out potential directions for future research in the field. The information in this review will help data analysts and researchers to adopt and promote good practices for handling missing data in real-world problems.
- Published
- 2024
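Note on entry 40: the three missing mechanisms named in the abstract differ in what the probability of a value being missing is allowed to depend on. The sketch below generates one mask per mechanism for a single target column. It is an illustrative construction, not taken from the review: the threshold-at-the-median rule, the function name `missing_masks`, and the synthetic columns are assumptions, and `rate` must not exceed 0.5.

```python
import numpy as np

def missing_masks(x_target, x_observed, rate, rng):
    """Return boolean masks (True = value of x_target is missing) for MCAR, MAR and MNAR."""
    u = rng.random(x_target.size)
    # MCAR: missingness ignores the data entirely.
    mcar = u < rate
    # MAR: missingness depends only on another, fully observed column.
    mar = u < np.where(x_observed > np.median(x_observed), 2 * rate, 0.0)
    # MNAR: missingness depends on the unobserved value itself.
    mnar = u < np.where(x_target > np.median(x_target), 2 * rate, 0.0)
    return mcar, mar, mnar

rng = np.random.default_rng(0)
x_target = rng.normal(size=20000)
x_observed = rng.normal(size=20000)

mcar, mar, mnar = missing_masks(x_target, x_observed, rate=0.3, rng=rng)
mean_all = x_target.mean()
mean_kept_mnar = x_target[~mnar].mean()
```

Under the MNAR mask the mean of the values that remain is biased downward, because large values are the ones that go missing, and nothing in the observed data reveals this. That is the reason MNAR is the hardest of the three cases for imputation.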
41. Investigating Regularization of Self-Play Language Models
- Author
-
Alami, Reda, Abubaker, Abdalgader, Achab, Mastane, Seddik, Mohamed El Amine, and Lahlou, Salem
- Subjects
Computer Science - Machine Learning - Abstract
This paper explores the effects of various forms of regularization in the context of language model alignment via self-play. While both reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO) require collecting costly human-annotated pairwise preferences, the self-play fine-tuning (SPIN) approach replaces the rejected answers by data generated from the previous iterate. However, the SPIN method presents a performance instability issue in the learning phase, which can be mitigated by playing against a mixture of the two previous iterates. In the same vein, we propose in this work to address this issue from two perspectives: first, by incorporating an additional Kullback-Leibler (KL) regularization to stay in the proximity of the reference policy; second, by using the idea of fictitious play which smooths the opponent policy across all previous iterations. In particular, we show that the KL-based regularizer boils down to replacing the previous policy by its geometric mixture with the base policy inside of the SPIN loss function. We finally discuss empirical results on MT-Bench as well as on the Hugging Face Open LLM Leaderboard.
- Published
- 2024
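Note on entry 41: a geometric mixture of two policies multiplies their probabilities raised to complementary powers and renormalizes. The sketch below shows this for a discrete distribution over a handful of candidate responses. It illustrates the general construction only: the mixing weight `alpha` is a hypothetical parameter (the abstract does not state how the weight relates to the KL coefficient), and both input distributions are assumed to be strictly positive.

```python
import numpy as np

def geometric_mixture(p_prev, p_base, alpha):
    """Return the distribution proportional to p_prev**alpha * p_base**(1 - alpha)."""
    log_mix = alpha * np.log(p_prev) + (1.0 - alpha) * np.log(p_base)
    log_mix -= log_mix.max()            # shift for numerical stability
    mix = np.exp(log_mix)
    return mix / mix.sum()

# Toy policies over three candidate responses (made-up numbers).
p_prev = np.array([0.7, 0.2, 0.1])      # previous iterate
p_base = np.array([0.2, 0.3, 0.5])      # base (reference) policy

mix_half = geometric_mixture(p_prev, p_base, alpha=0.5)
```

Setting `alpha` to 1 recovers the previous iterate and setting it to 0 recovers the base policy, so the weight interpolates between unregularized self-play and staying at the reference.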
42. PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks
- Author
-
Neseem, Marina, McCullough, Conor, Hsin, Randy, Leichner, Chas, Li, Shan, Chong, In Suk, Howard, Andrew G., Lew, Lukasz, Reda, Sherief, Rautio, Ville-Mikko, and Moro, Daniele
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Low-precision quantization is recognized for its efficacy in neural network optimization. Our analysis reveals that non-quantized elementwise operations, which are prevalent in layers such as parameterized activation functions, batch normalization, and quantization scaling, dominate the inference cost of low-precision models. These non-quantized elementwise operations are commonly overlooked in SOTA efficiency metrics such as Arithmetic Computation Effort (ACE). In this paper, we propose ACEv2 - an extended version of ACE which offers a better alignment with the inference cost of quantized models and their energy consumption on ML hardware. Moreover, we introduce PikeLPN, a model that addresses these efficiency issues by applying quantization to both elementwise operations and multiply-accumulate operations. In particular, we present a novel quantization technique for batch normalization layers named QuantNorm which allows for quantizing the batch normalization parameters without compromising the model performance. Additionally, we propose applying Double Quantization where the quantization scaling parameters are quantized. Furthermore, we recognize and resolve the issue of distribution mismatch in Separable Convolution layers by introducing Distribution-Heterogeneous Quantization which enables quantizing them to low precision. PikeLPN achieves Pareto-optimality in efficiency-accuracy trade-off with up to 3X efficiency improvement compared to SOTA low-precision models., Comment: Accepted in CVPR 2024. 10 Figures, 9 Tables
- Published
- 2024
43. MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning
- Author
-
Agiza, Ahmed, Neseem, Marina, and Reda, Sherief
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Adapting models pre-trained on large-scale datasets to a variety of downstream tasks is a common strategy in deep learning. Consequently, parameter-efficient fine-tuning methods have emerged as a promising way to adapt pre-trained models to different tasks while training only a minimal number of parameters. While most of these methods are designed for single-task adaptation, parameter-efficient training in Multi-Task Learning (MTL) architectures is still unexplored. In this paper, we introduce MTLoRA, a novel framework for parameter-efficient training of MTL models. MTLoRA employs Task-Agnostic and Task-Specific Low-Rank Adaptation modules, which effectively disentangle the parameter space in MTL fine-tuning, thereby enabling the model to adeptly handle both task specialization and interaction within MTL contexts. We applied MTLoRA to hierarchical-transformer-based MTL architectures, adapting them to multiple downstream dense prediction tasks. Our extensive experiments on the PASCAL dataset show that MTLoRA achieves higher accuracy on downstream tasks compared to fully fine-tuning the MTL model while reducing the number of trainable parameters by 3.6x. Furthermore, MTLoRA establishes a Pareto-optimal trade-off between the number of trainable parameters and the accuracy of the downstream tasks, outperforming current state-of-the-art parameter-efficient training methods in both accuracy and efficiency. Our code is publicly available., Comment: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
- Published
- 2024
44. The Pursuit of Fairness in Artificial Intelligence Models: A Survey
- Author
-
Kheya, Tahsin Alamgir, Bouadjenek, Mohamed Reda, and Aryal, Sunil
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computers and Society ,Computer Science - Machine Learning - Abstract
Artificial Intelligence (AI) models are now being utilized in all facets of our lives such as healthcare, education and employment. Since they are used in numerous sensitive environments and make decisions that can be life altering, potential biased outcomes are a pressing matter. Developers should ensure that such models don't manifest any unexpected discriminatory practices like partiality for certain genders, ethnicities or disabled people. With the ubiquitous dissemination of AI systems, researchers and practitioners are becoming more aware of unfair models and are bound to mitigate bias in them. Significant research has been conducted in addressing such issues to ensure models don't intentionally or unintentionally perpetuate bias. This survey offers a synopsis of the different ways researchers have promoted fairness in AI systems. We explore the different definitions of fairness existing in the current literature. We create a comprehensive taxonomy by categorizing different types of bias and investigate cases of biased AI in different application domains. A thorough study is conducted of the approaches and techniques employed by researchers to mitigate bias in AI models. Moreover, we also delve into the impact of biased models on user experience and the ethical considerations to contemplate when developing and deploying such models. We hope this survey helps researchers and practitioners understand the intricate details of fairness and bias in AI systems. By sharing this thorough survey, we aim to promote additional discourse in the domain of equitable and responsible AI., Comment: 37 pages, 6 figures
- Published
- 2024
45. usfAD Based Effective Unknown Attack Detection Focused IDS Framework
- Author
-
Uddin, Md. Ashraf, Aryal, Sunil, Bouadjenek, Mohamed Reda, Al-Hawawreh, Muna, and Talukder, Md. Alamin
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
The rapid expansion of varied network systems, including the Internet of Things (IoT) and Industrial Internet of Things (IIoT), has led to an increasing range of cyber threats. Ensuring robust protection against these threats necessitates the implementation of an effective Intrusion Detection System (IDS). For more than a decade, researchers have delved into supervised machine learning techniques to develop IDS to classify normal and attack traffic. However, building effective IDS models using supervised learning requires a substantial number of benign and attack samples. Collecting a sufficient number of attack samples from real-life scenarios is not possible since cyber attacks occur only occasionally. Further, IDSs trained and tested on known datasets fail to detect zero-day or unknown attacks due to the swift evolution of attack patterns. To address this challenge, we put forth two strategies for semi-supervised learning-based IDS where training samples of attacks are not required: 1) training a supervised machine learning model using randomly and uniformly dispersed synthetic attack samples; 2) building a One Class Classification (OCC) model that is trained exclusively on benign network traffic. We have implemented both approaches and compared their performances using 10 recent benchmark IDS datasets. Our findings demonstrate that the OCC model based on the state-of-the-art anomaly detection technique called usfAD significantly outperforms conventional supervised classification and other OCC-based techniques when trained and tested considering real-life scenarios, particularly in detecting previously unseen attacks., Comment: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003
- Published
- 2024
46. Hierarchical Classification for Intrusion Detection System: Effective Design and Empirical Analysis
- Author
-
Uddin, Md. Ashraf, Aryal, Sunil, Bouadjenek, Mohamed Reda, Al-Hawawreh, Muna, and Talukder, Md. Alamin
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
With the increased use of network technologies like the Internet of Things (IoT) in many real-world applications, new types of cyberattacks have been emerging. To safeguard critical infrastructures from these emerging threats, it is crucial to deploy an Intrusion Detection System (IDS) that can detect different types of attacks accurately while minimizing false alarms. Machine learning approaches have been used extensively in IDS, and they mainly rely on flat multi-class classification to differentiate normal traffic and different types of attacks. Though cyberattack types exhibit a hierarchical structure where similar granular attack subtypes can be grouped into more high-level attack types, the hierarchical classification approach has not been explored well. In this paper, we investigate the effectiveness of the hierarchical classification approach in IDS. We use a three-level hierarchical classification model to classify various network attacks, where the first level classifies benign or attack, the second level classifies coarse high-level attack types, and the third level classifies granular attack types. Our empirical results of using 10 different classification algorithms on 10 different datasets show that there is no significant difference in terms of overall classification performance (i.e., detecting normal and different types of attack correctly) of hierarchical and flat classification approaches. However, the flat classification approach misclassifies attacks as normal, whereas the hierarchical approach misclassifies one type of attack as another attack type. In other words, the hierarchical classification approach significantly reduces the number of attacks misclassified as normal traffic, which is more important in critical systems., Comment: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003
- Published
- 2024
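The three-level scheme described in the abstract above (benign vs. attack, then coarse attack type, then granular attack type) can be pictured as a simple routing function. The sketch below is only an illustration of that control flow: the function name, the feature names, the thresholds, and the rule-based toy classifiers are all hypothetical stand-ins, not the models or datasets used in the paper.

```python
def hierarchical_predict(sample, is_attack, coarse_classifier, fine_classifiers):
    """Route a sample through three levels of classification.

    Level 1 decides benign vs. attack, level 2 picks a coarse attack type,
    and level 3 uses the classifier registered for that coarse type to pick
    a granular type. All three classifiers are supplied by the caller.
    """
    if not is_attack(sample):
        return ("benign", None, None)
    coarse = coarse_classifier(sample)
    fine = fine_classifiers[coarse](sample)
    return ("attack", coarse, fine)


# Hypothetical rule-based stand-ins for trained classifiers.
def toy_is_attack(sample):
    return sample["packets_per_second"] > 1000 or sample["failed_logins"] > 5


def toy_coarse(sample):
    return "dos" if sample["packets_per_second"] > 1000 else "intrusion"


toy_fine = {
    "dos": lambda s: "syn_flood" if s["syn_ratio"] > 0.8 else "udp_flood",
    "intrusion": lambda s: "brute_force",
}

flood = {"packets_per_second": 5000, "failed_logins": 0, "syn_ratio": 0.9}
label = hierarchical_predict(flood, toy_is_attack, toy_coarse, toy_fine)
```

One consequence of this structure is visible in the code: once level 1 flags a sample as an attack, later mistakes can only confuse one attack type with another, which matches the misclassification pattern the abstract reports for the hierarchical approach.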
47. A Dual-Tier Adaptive One-Class Classification IDS for Emerging Cyberthreats
- Author
-
Uddin, Md. Ashraf, Aryal, Sunil, Bouadjenek, Mohamed Reda, Al-Hawawreh, Muna, and Talukder, Md. Alamin
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
In today's digital age, our dependence on IoT (Internet of Things) and IIoT (Industrial IoT) systems has grown immensely, which facilitates sensitive activities such as banking transactions and exchanges of personal data, enterprise data, and legal documents. Cyberattackers consistently exploit weak security measures and tools. The Network Intrusion Detection System (IDS) acts as a primary tool against such cyber threats. However, machine learning-based IDSs, when trained on specific attack patterns, often misclassify new emerging cyberattacks. Moreover, the limited availability of attack instances for training a supervised learner and the ever-evolving nature of cyber threats further complicate the matter. This emphasizes the need for an adaptable IDS framework capable of recognizing and learning from unfamiliar/unseen attacks over time. In this research, we propose a one-class classification-driven IDS system structured on two tiers. The first tier distinguishes between normal activities and attacks/threats, while the second tier determines if the detected attack is known or unknown. Within this second tier, we also embed a multi-classification mechanism coupled with a clustering algorithm. This model not only identifies unseen attacks but also clusters them and uses them for retraining. This enables our model to be future-proofed, capable of evolving with emerging threat patterns. Leveraging one-class classifiers (OCC) at the first level, our approach bypasses the need for attack samples, addressing data imbalance and zero-day attack concerns, while OCC at the second level can effectively separate unknown attacks from the known attacks. Our methodology and evaluations indicate that the presented framework exhibits promising potential for real-world deployments., Comment: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003
- Published
- 2024
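The two-tier idea in the abstract above (a one-class model for normal traffic first, then a second one-class model that separates known from unknown attacks) can be sketched with a deliberately simple stand-in. The `CentroidOCC` class below is a hypothetical toy that accepts points near the centroid of its training data; it is not the one-class classifier used in the paper, and the `radius_scale` parameter and the two-dimensional sample points are made up for illustration. The multi-class and clustering components mentioned in the abstract are omitted.

```python
import math


class CentroidOCC:
    """Toy one-class model: accepts points close to the training centroid."""

    def __init__(self, radius_scale=1.5):
        # radius_scale is an assumed slack factor, not a value from the paper.
        self.radius_scale = radius_scale
        self.centroid = None
        self.radius = None

    def fit(self, points):
        n = len(points)
        dim = len(points[0])
        self.centroid = [sum(p[i] for p in points) / n for i in range(dim)]
        farthest = max(math.dist(p, self.centroid) for p in points)
        self.radius = self.radius_scale * farthest
        return self

    def accepts(self, point):
        return math.dist(point, self.centroid) <= self.radius


def two_tier_label(point, benign_model, known_attack_model):
    """Tier 1: benign vs. attack. Tier 2: known vs. unknown attack."""
    if benign_model.accepts(point):
        return "benign"
    if known_attack_model.accepts(point):
        return "known attack"
    return "unknown attack"


# Made-up 2-D feature vectors for illustration only.
benign_model = CentroidOCC().fit([(0, 0), (1, 0), (0, 1), (1, 1)])
known_model = CentroidOCC().fit([(10, 10), (11, 10), (10, 11), (11, 11)])
labels = [
    two_tier_label(p, benign_model, known_model)
    for p in [(0.4, 0.6), (10.2, 10.4), (-8, 9)]
]
```

Note that the first tier is fitted on benign samples only, which is the property the abstract highlights: no attack samples are needed to decide whether traffic looks normal.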
48. Training Machine Learning models at the Edge: A Survey
- Author
-
Khouas, Aymen Rayane, Bouadjenek, Mohamed Reda, Hacid, Hakim, and Aryal, Sunil
- Subjects
Computer Science - Machine Learning ,Computer Science - Distributed, Parallel, and Cluster Computing - Abstract
Edge computing has gained significant traction in recent years, promising enhanced efficiency by integrating artificial intelligence capabilities at the edge. While the focus has primarily been on the deployment and inference of Machine Learning (ML) models at the edge, the training aspect remains less explored. This survey explores the concept of edge learning, specifically the optimization of ML model training at the edge. The objective is to comprehensively explore diverse approaches and methodologies in edge learning, synthesize existing knowledge, identify challenges, and highlight future trends. Utilizing Scopus and Web of Science advanced search, relevant literature on edge learning was identified, revealing a concentration of research efforts in distributed learning methods, particularly federated learning. This survey further provides a guideline for comparing techniques used to optimize ML for edge learning, along with an exploration of the different frameworks, libraries, and simulation tools available. In doing so, the paper contributes to a holistic understanding of the current landscape and future directions in the intersection of edge computing and machine learning, paving the way for informed comparisons between optimization methods and techniques designed for training on the edge., Comment: 30 pages, 7 figures
- Published
- 2024
49. Robust Deep Reinforcement Learning Through Adversarial Attacks and Training : A Survey
- Author
-
Schott, Lucas, Delas, Josephine, Hajri, Hatem, Gherbi, Elies, Yaich, Reda, Boulahia-Cuppens, Nora, Cuppens, Frederic, and Lamprier, Sylvain
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Deep Reinforcement Learning (DRL) is an approach for training autonomous agents across various complex environments. Despite its significant performance in well-known environments, it remains susceptible to minor variations in conditions, raising concerns about its reliability in real-world applications. To improve usability, DRL must demonstrate trustworthiness and robustness. A way to improve robustness of DRL to unknown changes in the conditions is through Adversarial Training, by training the agent against well-suited adversarial attacks on the dynamics of the environment. Addressing this critical issue, our work presents an in-depth analysis of contemporary adversarial attack methodologies, systematically categorizing them and comparing their objectives and operational mechanisms. This classification offers a detailed insight into how adversarial attacks effectively act for evaluating the resilience of DRL agents, thereby paving the way for enhancing their robustness., Comment: 57 pages, 16 figures, 2 tables
- Published
- 2024
50. Exploring gene content with pangene graphs
- Author
-
Li, Heng, Marin, Maximillian, and Farhat, Maha Reda
- Subjects
Quantitative Biology - Genomics - Abstract
Motivation: The gene content regulates the biology of an organism. It varies between species and between individuals of the same species. Although tools have been developed to identify gene content changes in bacterial genomes, none is applicable to collections of large eukaryotic genomes such as the human pangenome. Results: We developed pangene, a computational tool to identify gene orientation, gene order and gene copy-number changes in a collection of genomes. Pangene aligns a set of input protein sequences to the genomes, resolves redundancies between protein sequences and constructs a gene graph with each genome represented as a walk in the graph. It additionally finds subgraphs, which we call bibubbles, that capture gene content changes. Applied to the human pangenome, pangene identifies known gene-level variations and reveals complex haplotypes that have not been well studied before. Pangene also works with high-quality bacterial pangenomes and reports similar numbers of core and accessory genes in comparison to existing tools. Availability and implementation: Source code at https://github.com/lh3/pangene; pre-built pangene graphs can be downloaded from https://zenodo.org/records/8118576 and visualized at https://pangene.bioinweb.org, Comment: 9 pages, 7 figures and 2 tables
- Published
- 2024