60,068 results on '"Federico, P"'
Search Results
2. Extra cost of erasure due to quantum lifetime broadening
- Author
-
Dunlop, Joe, Cerisola, Federico, Monsel, Juliette, Sevitz, Sofia, Tabanera-Bravo, Jorge, Dexter, Jonathan, Fedele, Federico, Ares, Natalia, and Anders, Janet
- Subjects
Quantum Physics
- Abstract
The energy cost of erasing a bit of information was fundamentally lower bounded by Landauer, in terms of the temperature of its environment: $W\geq k_\mathrm{B} T \ln 2$. However, in real electronic devices, the information-bearing system is usually in contact with two or more electrodes, with different temperatures and chemical potentials. It is not clear what sets the cost of erasure in such nonequilibrium situations. One promising technology for testing the thermodynamic limits of information processing is quantum dots, in which a bit is encoded in the presence or absence of a single electron. We here develop a thermodynamic description of devices of this type and find that, in addition to the electrode temperatures, the potential difference across the quantum dot and lifetime broadening of its energy level contribute to the minimum work cost of erasure. In practical contexts, these contributions may significantly outweigh the cost due to temperature alone.
Comment: 6 pages plus 8-page supplemental material. 3 figures
- Published
- 2024
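As a worked illustration of the Landauer bound $W\geq k_\mathrm{B} T \ln 2$ quoted in the abstract above, the following minimal sketch evaluates the bound numerically. The function name `landauer_bound` and the 300 K example temperature are assumptions made for illustration; the sketch covers only the equilibrium temperature term, not the additional potential-difference and lifetime-broadening contributions the paper reports.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
K_B = 1.380649e-23


def landauer_bound(temperature_kelvin: float) -> float:
    """Return k_B * T * ln 2, the minimum work in joules to erase one bit.

    Hypothetical helper: it evaluates only the equilibrium bound from the
    abstract, for a single environment at the given temperature.
    """
    if temperature_kelvin < 0:
        raise ValueError("temperature must be non-negative")
    return K_B * temperature_kelvin * math.log(2)


# Example: an environment at 300 K (an assumed, illustrative temperature).
bound_300k = landauer_bound(300.0)
print(f"Landauer bound at 300 K: {bound_300k:.3e} J")
```

At room temperature the bound is on the order of $10^{-21}$ J per bit, which is the baseline against which the extra nonequilibrium costs would be compared.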
3. School Choice Strategies at the Intersections of Disability, Race, Class, and Geography
- Author
-
Federico R. Waitoller and Christopher Lubienski
- Abstract
While parents of students with disabilities (SWD) select schools according to various factors, schools also choose students through different sorting mechanisms. Thus, parents of SWD may need to employ different strategies to enroll their child in their preferred school. We employed an intersectional approach for studying school choice, integrating ethnographic interviews and descriptive GIS to answer the following questions: (a) What strategies do parents of SWD utilize to secure placement in the school of their choice? and (b) How is the engagement with such strategies shaped by their social and geographical locations? We found that parents engaged in five strategies: Accepting an IEP Team's school recommendations, securing placement through a sibling, testing into selective enrollments, changing IEP provisions, and engaging in due process. Moreover, these strategies were afforded and constrained by their intersecting social positions (i.e., race, class, and disability), their geographical locations, and the developmental school stage of their child (i.e., transitioning to kindergarten or high school).
- Published
- 2024
4. Cross-Correlating the Universe: The Gravitational Wave Background and Large-Scale Structure
- Author
-
Semenzato, Federico, Casey-Clyde, J. Andrew, Mingarelli, Chiara M. F., Raccanelli, Alvise, Bellomo, Nicola, Bartolo, Nicola, and Bertacca, Daniele
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics
- Abstract
The nature of the gravitational wave background (GWB) is a key question in modern astrophysics and cosmology, with significant implications for understanding of the structure and evolution of the Universe. We demonstrate how cross-correlating large-scale structure (LSS) tracers with the GWB spatial anisotropies can extract a clear astrophysical imprint from the GWB signal. Focusing on the unresolved population of supermassive black hole binaries (SMBHBs) as the primary source for the GWB at nanohertz frequencies, we construct full-sky maps of galaxy distributions and characteristic strain of the GWB to explore the relationship between GWB anisotropies and the LSS. We find that at current pulsar timing array (PTA) sensitivities, very few loud SMBHBs act as Poisson-like noise. This results in anisotropies dominated by a small number of sources, making GWB maps where SMBHBs trace the LSS indistinguishable from a GWB from a uniform distribution of SMBHBs. In contrast, we find that the bulk of the unresolved SMBHBs produce anisotropies which mirror the spatial distribution of galaxies, and thus trace the LSS. Importantly, we show that cross-correlations are required to retrieve a clear LSS imprint in the GWB. Specifically, we find this LSS signature can be measured at a $3\sigma$ level in near-future PTA experiments that probe angular scales of $\ell_{\text{max}} \geq 42$, and $5\sigma$ for $\ell_{\text{max}} \geq 72$. Our approach opens new avenues to employ the GWB as an LSS tracer, providing unique insights into SMBHB population models and the nature of the GWB itself. Our results motivate further exploration of potential synergies between next-generation PTA experiments and cosmological tracers of the LSS.
Comment: 9 pages, 7 figures. Submitted
- Published
- 2024
5. SelfCodeAlign: Self-Alignment for Code Generation
- Author
-
Wei, Yuxiang, Cassano, Federico, Liu, Jiawei, Ding, Yifeng, Jain, Naman, Mueller, Zachary, de Vries, Harm, von Werra, Leandro, Guha, Arjun, and Zhang, Lingming
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Software Engineering
- Abstract
Instruction tuning is a supervised fine-tuning approach that significantly improves the ability of large language models (LLMs) to follow human instructions. We propose SelfCodeAlign, the first fully transparent and permissive pipeline for self-aligning code LLMs without extensive human annotations or distillation. SelfCodeAlign employs the same base model for inference throughout the data generation process. It first extracts diverse coding concepts from high-quality seed snippets to generate new tasks. It then samples multiple responses per task, pairs each with test cases, and validates them in a sandbox environment. Finally, passing examples are selected for instruction tuning. In our primary experiments, we use SelfCodeAlign with CodeQwen1.5-7B to generate a dataset of 74k instruction-response pairs. Finetuning on this dataset leads to a model that achieves a 67.1 pass@1 on HumanEval+, surpassing CodeLlama-70B-Instruct despite being ten times smaller. Across all benchmarks, this finetuned model consistently outperforms the original version trained with OctoPack, the previous state-of-the-art method for instruction tuning without human annotations or distillation. Additionally, we show that SelfCodeAlign is effective across LLMs of various sizes, from 3B to 33B, and that the base models can benefit more from alignment with their own data distribution. We further validate each component's effectiveness in our pipeline, showing that SelfCodeAlign outperforms both direct distillation from GPT-4o and leading GPT-3.5-based distillation methods, such as OSS-Instruct and Evol-Instruct. SelfCodeAlign has also led to the creation of StarCoder2-Instruct, the first fully transparent, permissively licensed, and self-aligned code LLM that achieves state-of-the-art coding performance.
Comment: Accepted to NeurIPS 2024
- Published
- 2024
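The selection step described in the SelfCodeAlign abstract above (sample several responses, pair each with test cases, keep only those that pass) can be sketched as a simple execution-based filter. This is a toy illustration, not the authors' pipeline: the names `passes_all` and `select_passing` are hypothetical, responses are modelled as plain Python callables rather than generated source code, and a `try`/`except` stands in for the sandbox, which in practice would isolate execution in a separate process or container.

```python
def passes_all(response, test_cases):
    """Return True if `response` gives the expected output on every test case.

    Assumption: each test case is an (args, expected) pair. Any exception
    raised by the response counts as a failure.
    """
    for args, expected in test_cases:
        try:
            if response(*args) != expected:
                return False
        except Exception:
            return False
    return True


def select_passing(samples):
    """Keep only the responses that pass their own paired test cases.

    `samples` is a list of (response, test_cases) pairs for one task.
    """
    return [response for response, tests in samples if passes_all(response, tests)]


# Toy task: "double the input". Three sampled responses, each paired with tests.
def good(x):
    return x * 2

def wrong(x):
    return x + 2

def crashing(x):
    raise ValueError("unexpected failure")

tests = [((1,), 2), ((3,), 6)]
kept = select_passing([(good, tests), (wrong, tests), (crashing, tests)])
print([f.__name__ for f in kept])
```

Only the response that satisfies every paired test survives the filter; the surviving task-response pairs are what would then be used for instruction tuning.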
6. Approaches to human activity recognition via passive radar
- Author
-
Bresciani, Christian, Cerutti, Federico, and Cominelli, Marco
- Subjects
Computer Science - Machine Learning
- Abstract
The thesis explores novel methods for Human Activity Recognition (HAR) using passive radar with a focus on non-intrusive Wi-Fi Channel State Information (CSI) data. Traditional HAR approaches often use invasive sensors like cameras or wearables, raising privacy issues. This study leverages the non-intrusive nature of CSI, using Spiking Neural Networks (SNN) to interpret signal variations caused by human movements. These networks, integrated with symbolic reasoning frameworks such as DeepProbLog, enhance the adaptability and interpretability of HAR systems. SNNs offer reduced power consumption, ideal for privacy-sensitive applications. Experimental results demonstrate that SNN-based neurosymbolic models achieve high accuracy, making them a promising alternative for HAR across various domains.
- Published
- 2024
7. Darkness in interlayer and charge density wave states of 2H-TaS2
- Author
-
Camerano, Luigi, Mastrippolito, Dario, Pierucci, Debora, Dai, Ji, Tallarida, Massimo, Ottaviano, Luca, Profeta, Gianni, and Bisti, Federico
- Subjects
Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Materials Science
- Abstract
The wave-like nature of electrons is evident from quantum interference effects observed during the photoemission process. When there are different nuclei in the unit cell of a crystal and/or structural distortions, photo-electron wavefunctions can interfere, giving rise to peculiar intensity modulation of the spectrum, which can also hide energy states in a photoemission experiment. The 2H phase of transition metal dichalcogenides, with two nonequivalent layers per unit cell and charge density wave distortion, is an optimal platform for such effects to be observed. Here, we discover undetectable states in 2H-TaS2, interpreting high-resolution angle-resolved photoemission spectroscopy considering interference effects of the correlated electron wave functions. In addition, phase mismatching induced by the charge density wave distortion results in an evident signature of the phase transition in the photoemission spectrum. Our results highlight the importance of quantum interference, electronic correlations and structural distortion to understand the physics of layered materials.
Comment: 7 pages, 3 figures
- Published
- 2024
8. Performance tests and hardware qualification of the FEBs for the Super-FGD of T2K Phase II
- Author
-
Giannessi, Lorenzo, Cadoux, Franck, Cap, Sebastien, Chakrani, Jaafar, Drapier, Olivier, Favre, Yannick, Gastaldi, Franck, Jakkapu, Mahesh, Nanni, Jerome, Sakashita, Ken, and Sánchez, Federico
- Subjects
Physics - Instrumentation and Detectors, High Energy Physics - Experiment
- Abstract
T2K is a long baseline neutrino experiment, entering Phase II with a Near Detector upgrade. The T2K near detector (ND280) upgrade consists of the installation of three new detector systems: a plastic scintillator neutrino active target (Super-FGD), two time projection chambers (HA-TPC) and a time of flight detector (TOF). The Super-FGD is composed of 2 million 1 cm-cube scintillating cubes read by almost 60 thousand wavelength-shifting (WLS) fibers coupled to an MPPC on one end. Given the large number of channels, the limited space inside the magnetic environment, and the limited time from production to installation, the development and testing of the Front-end electronics boards (FEB) for the read-out of the Super-FGD channels represented a challenging task for the success of the upgrade. This work presents the performance tests confirming that the FEB aligns with detector requirements, and the hardware qualification of 240 FEBs through a custom QC test bench designed to detect and locate hardware failures to speed up the repair process. Installation of the electronics in the detector took place in March 2024, one year after the beginning of the FEB mass production, and the first successful neutrino beam run took place in June of the same year.
Comment: Topical Workshop on Electronics for Particle Physics - TWEPP 2024 - 30 September - 4 October, 2024 - Glasgow, Scotland
- Published
- 2024
9. Graph Neural Networks Uncover Geometric Neural Representations in Reinforcement-Based Motor Learning
- Author
-
Nardi, Federico, Han, Jinpei, Haar, Shlomi, and Faisal, A. Aldo
- Subjects
Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
- Abstract
Graph Neural Networks (GNN) can capture the geometric properties of neural representations in EEG data. Here we utilise those to study how reinforcement-based motor learning affects neural activity patterns during motor planning, leveraging the inherent graph structure of EEG channels to capture the spatial relationships in brain activity. By exploiting task-specific symmetries, we define different pretraining strategies that not only improve model performance across all participant groups but also validate the robustness of the geometric representations. Explainability analysis based on the graph structures reveals consistent group-specific neural signatures that persist across pretraining conditions, suggesting stable geometric structures in the neural representations associated with motor learning and feedback processing. These geometric patterns exhibit partial invariance to certain task space transformations, indicating symmetries that enable generalisation across conditions while maintaining specificity to individual learning strategies. This work demonstrates how GNNs can uncover the effects of previous outcomes on motor planning, in a complex real-world task, providing insights into the geometric principles governing neural representations. Our experimental design bridges the gap between controlled experiments and ecologically valid scenarios, offering new insights into the organisation of neural representations during naturalistic motor learning, which may open avenues for exploring fundamental principles governing brain activity in complex tasks.
Comment: 19 pages, 7 figures, accepted at the NeurIPS 2024 workshop on Symmetry and Geometry in Neural Representations (NeurReps 2024)
- Published
- 2024
10. Characterization of symmetries of contact Hamiltonian systems
- Author
-
Zadra, Federico and Seri, Marcello
- Subjects
Mathematical Physics, Mathematics - Dynamical Systems, 0G45, 70G65, 70H33, 34A26, 53D10
- Abstract
In this paper, we explore the relationship between dynamical symmetries, Cartan symmetries, and dynamical similarities in contact mechanics. Using an alternative decomposition of vector fields, we provide a characterization of those symmetries and a new description in terms of tensor densities. Additionally, we show that, in certain cases, this construction enables the recovery of integrals of motion.
- Published
- 2024
11. Emergence of meta-stable clustering in mean-field transformer models
- Author
-
Bruno, Giuseppe, Pasqualotto, Federico, and Agazzi, Andrea
- Subjects
Computer Science - Machine Learning, Mathematics - Analysis of PDEs, 34D05, 34D06, 35Q83
- Abstract
We model the evolution of tokens within a deep stack of Transformer layers as a continuous-time flow on the unit sphere, governed by a mean-field interacting particle system, building on the framework introduced in (Geshkovski et al., 2023). Studying the corresponding mean-field Partial Differential Equation (PDE), which can be interpreted as a Wasserstein gradient flow, in this paper we provide a mathematical investigation of the long-term behavior of this system, with a particular focus on the emergence and persistence of meta-stable phases and clustering phenomena, key elements in applications like next-token prediction. More specifically, we perform a perturbative analysis of the mean-field PDE around the iid uniform initialization and prove that, in the limit of large number of tokens, the model remains close to a meta-stable manifold of solutions with a given structure (e.g., periodicity). Further, the structure characterizing the meta-stable manifold is explicitly identified, as a function of the inverse temperature parameter of the model, by the index maximizing a certain rescaling of Gegenbauer polynomials.
Comment: 37 pages, 6 figures
- Published
- 2024
12. TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
- Author
-
Wang, Haiyang, Fan, Yue, Naeem, Muhammad Ferjad, Xian, Yongqin, Lenssen, Jan Eric, Wang, Liwei, Tombari, Federico, and Schiele, Bernt
- Subjects
Computer Science - Machine Learning
- Abstract
Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises primarily from their dependence on a fixed number of parameters within linear projections. When architectural modifications (e.g., channel dimensions) are introduced, the entire model typically requires retraining from scratch. As model sizes continue growing, this strategy results in increasingly high computational costs and becomes unsustainable. To overcome this problem, we introduce TokenFormer, a natively scalable architecture that leverages the attention mechanism not only for computations among input tokens but also for interactions between tokens and model parameters, thereby enhancing architectural flexibility. By treating model parameters as tokens, we replace all the linear projections in Transformers with our token-parameter attention layer, where input tokens act as queries and model parameters as keys and values. This reformulation allows for progressive and efficient scaling without necessitating retraining from scratch. Our model scales from 124M to 1.4B parameters by incrementally adding new key-value parameter pairs, achieving performance comparable to Transformers trained from scratch while greatly reducing training costs. Code and models are available at https://github.com/Haiyang-W/TokenFormer.
- Published
- 2024
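The token-parameter attention idea in the TokenFormer abstract above (input tokens act as queries, learnable parameters act as keys and values) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the released implementation: the function name `token_parameter_attention` is hypothetical, and an ordinary softmax without scaling is assumed because the abstract does not specify the normalization used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_parameter_attention(tokens, key_params, value_params):
    """Replace a linear projection with attention over parameter tokens.

    tokens:       list of input vectors, each of length d_in (the queries)
    key_params:   n parameter vectors of length d_in (the keys)
    value_params: n parameter vectors of length d_out (the values)
    Returns one output vector of length d_out per input token.
    Assumption: plain softmax normalization, chosen only for illustration.
    """
    d_out = len(value_params[0])
    outputs = []
    for x in tokens:
        # Similarity between the input token and every key parameter.
        scores = [sum(xi * ki for xi, ki in zip(x, k)) for k in key_params]
        weights = softmax(scores)
        # Weighted sum of value parameters gives the projected token.
        out = [sum(w * v[j] for w, v in zip(weights, value_params))
               for j in range(d_out)]
        outputs.append(out)
    return outputs


tokens = [[1.0, 0.0]]
keys = [[0.0, 0.0], [0.0, 0.0]]
values = [[1.0, 3.0], [3.0, 5.0]]
out = token_parameter_attention(tokens, keys, values)

# Growing the layer: append a new key-value pair; output width is unchanged.
grown = token_parameter_attention(tokens, keys + [[0.0, 0.0]], values + [[0.0, 0.0]])
```

The point of the design is visible in the last line: the number of key-value parameter pairs can grow while the input and output dimensions of the layer stay fixed, which is what allows the incremental scaling the abstract describes.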
13. Pomeranchuk instability from electronic correlations in CsTi$_3$Bi$_5$ kagome metal
- Author
-
Bigi, Chiara, Dürrnagel, Matteo, Klebl, Lennart, Consiglio, Armando, Pokharel, Ganesh, Bertran, Francois, Févre, Patrick Le, Jaouen, Thomas, Tchouekem, Hulerich C., Turban, Pascal, De Vita, Alessandro, Miwa, Jill A., Wells, Justin W., Oh, Dongjin, Comin, Riccardo, Thomale, Ronny, Zeljkovic, Ilija, Ortiz, Brenden R., Wilson, Stephen D., Sangiovanni, Giorgio, Mazzola, Federico, and Di Sante, Domenico
- Subjects
Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Materials Science
- Abstract
Among many-body instabilities in correlated quantum systems, electronic nematicity, defined by the spontaneous breaking of rotational symmetry, has emerged as a critical phenomenon, particularly within high-temperature superconductors. Recently, this behavior has been identified in CsTi$_3$Bi$_5$, a member of the AV$_3$Sb$_5$ (A = K, Rb, Cs) kagome family, recognized for its intricate and unconventional quantum phases. Despite accumulating indirect evidence, the fundamental mechanisms driving nematicity in CsTi$_3$Bi$_5$ remain inadequately understood, sparking ongoing debates. In this study, we employ polarization-dependent angle-resolved photoemission spectroscopy to reveal definitive signatures of an orbital-selective nematic deformation in the electronic structure of CsTi$_3$Bi$_5$. This direct experimental evidence underscores the pivotal role of orbital degrees of freedom in symmetry breaking, providing new insights into the complex electronic environment. By applying the functional renormalization group technique to a fully interacting ab initio model, we demonstrate the emergence of a finite angular momentum ($d$-wave) Pomeranchuk instability in CsTi$_3$Bi$_5$, driven by the concomitant action of electronic correlations within specific orbital channels and chemical potential detuning away from Van Hove singularities. By elucidating the connection between orbital correlations and symmetry-breaking instabilities, this work lays a crucial foundation for future investigations into the broader role of orbital selectivity in quantum materials, with far-reaching implications for the design and manipulation of novel electronic phases.
- Published
- 2024
14. Generalized random processes related to Hadamard operators and Le Roy measures
- Author
-
Beghin, Luisa, Cristofaro, Lorenzo, and Polito, Federico
- Subjects
Mathematics - Probability, Primary: 60G22, 35S10. Secondary: 26A33, 33C15
- Abstract
The definition of generalized random processes in the Gel'fand sense allows one to extend well-known stochastic models, such as the fractional Brownian motion, and study the related fractional PDEs, as well as stochastic differential equations in the distributional sense. By analogy with the construction (in the infinite-dimensional white-noise space) of the latter, we introduce two processes defined by means of Hadamard-type fractional operators. When used to replace the time derivative in the governing PDEs, the Hadamard-type derivatives are usually associated with ultra-slow diffusions. On the other hand, in our construction, they directly determine the memory properties of the so-called Hadamard fractional Brownian motion (H-fBm) and its long-time behaviour. Still, for any finite time horizon, the H-fBm displays a standard diffusing feature. We then extend the definition of the H-fBm from the white noise space to an infinite dimensional grey-noise space built on the Le Roy measure, so that our model represents an alternative to the generalized grey Brownian motion. In this case, we prove that the one-dimensional distribution of the process satisfies a heat equation with non-constant coefficients and fractional Hadamard time-derivative. Finally, having proved the existence of the distributional derivative of the above defined processes and derived an integral formula for it, we construct an Ornstein-Uhlenbeck type process and evaluate its distribution.
Comment: 30 pages
- Published
- 2024
15. Exposing Cross-Platform Coordinated Inauthentic Activity in the Run-Up to the 2024 U.S. Election
- Author
-
Cinus, Federico, Minici, Marco, Luceri, Luca, and Ferrara, Emilio
- Subjects
Computer Science - Social and Information Networks
- Abstract
Coordinated information operations remain a persistent challenge on social media, despite platform efforts to curb them. While previous research has primarily focused on identifying these operations within individual platforms, this study shows that coordination frequently transcends platform boundaries. Leveraging newly collected data of online conversations related to the 2024 U.S. Election across $\mathbb{X}$ (formerly, Twitter), Facebook, and Telegram, we construct similarity networks to detect coordinated communities exhibiting suspicious sharing behaviors within and across platforms. Proposing an advanced coordination detection model, we reveal evidence of potential foreign interference, with Russian-affiliated media being systematically promoted across Telegram and $\mathbb{X}$. Our analysis also uncovers substantial intra- and cross-platform coordinated inauthentic activity, driving the spread of highly partisan, low-credibility, and conspiratorial content. These findings highlight the urgent need for regulatory measures that extend beyond individual platforms to effectively address the growing challenge of cross-platform coordinated influence campaigns.
Comment: HUMANS Lab -- Working Paper No. 2024.7 -- The 2024 Election Integrity Initiative -- University of Southern California
- Published
- 2024
16. Spatio-temporal Transformers for Action Unit Classification with Event Cameras
- Author
-
Cultrera, Luca, Becattini, Federico, Berlincioni, Lorenzo, Ferrari, Claudio, and Del Bimbo, Alberto
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Face analysis has been studied from different angles to infer emotion, poses, shapes, and landmarks. Traditionally RGB cameras are used, yet for fine-grained tasks standard sensors might not be up to the task due to their latency, making it impossible to record and detect micro-movements that carry a highly informative signal, which is necessary for inferring the true emotions of a subject. Event cameras have been increasingly gaining interest as a possible solution to this and similar high-frame-rate tasks. We propose a novel spatiotemporal Vision Transformer model that uses Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA) to enhance the accuracy of Action Unit classification from event streams. We also address the lack of labeled event data in the literature, which can be considered one of the main causes of an existing gap between the maturity of RGB and neuromorphic vision models. Gathering data is harder in the event domain since it cannot be crawled from the web and labeling frames should take into account event aggregation rates and the fact that static parts might not be visible in certain frames. To this end, we present FACEMORPHIC, a temporally synchronized multimodal face dataset composed of RGB videos and event streams. The dataset is annotated at a video level with facial Action Units and contains streams collected with various possible applications, ranging from 3D shape estimation to lip-reading. We then show how temporal synchronization can allow effective neuromorphic face analysis without the need to manually annotate videos: we instead leverage cross-modal supervision bridging the domain gap by representing face shapes in a 3D space. Our proposed model outperforms baseline methods by effectively capturing spatial and temporal information, crucial for recognizing subtle facial micro-expressions.
Comment: Under review at CVIU. arXiv admin note: substantial text overlap with arXiv:2409.10213
- Published
- 2024
17. On a fractional magnetic pseudorelativistic operator: properties and applications
- Author
-
Bernini, Federico and d'Avenia, Pietro
- Subjects
Mathematics - Analysis of PDEs, 35S05, 35J61, 35Q60
- Abstract
We introduce a fractional magnetic pseudorelativistic operator for a general fractional order $s\in(0,1)$. First we define a suitable functional setting and we prove some fundamental properties. Then we show the behavior of the operator as $s \nearrow 1$ obtaining some results à la Bourgain-Brezis-Mironescu and removing the singularity from the integral definition. Finally we get existence of weak solutions for some semilinear equations involving a power type nonlinearity or a nonlocal (Choquard type) term.
Comment: 35 pages
- Published
- 2024
18. Gradient Distance Function
- Author
-
Le, Hieu, Stella, Federico, Guillard, Benoit, and Fua, Pascal
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Unsigned Distance Functions (UDFs) can be used to represent non-watertight surfaces in a deep learning framework. However, UDFs tend to be brittle and difficult to learn, in part because the surface is located exactly where the UDF is non-differentiable. In this work, we show that Gradient Distance Functions (GDFs) can remedy this by being differentiable at the surface while still being able to represent open surfaces. This is done by associating to each 3D point a 3D vector whose norm is taken to be the unsigned distance to the surface and whose orientation is taken to be the direction towards the closest surface point. We demonstrate the effectiveness of GDFs on ShapeNet Car, Multi-Garment, and 3D-Scene datasets with both single-shape reconstruction networks and categorical auto-decoders.
Comment: We developed this concurrently with 'Neural Vector Field,' and there are similarities between the two works so please pay them a visit as well. Here, we demonstrate how directly learning the gradient vector is much easier than learning the UDF
- Published
- 2024
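The definition in the Gradient Distance Function abstract above (a 3D vector whose norm is the unsigned distance and whose direction points to the closest surface point) can be illustrated with a small sketch. It assumes the surface is given as a finite set of sample points, which is an illustrative simplification; the paper learns the field with a network, and the names `gradient_distance` and `unsigned_distance` are hypothetical.

```python
import math


def gradient_distance(point, surface_samples):
    """Return the vector from `point` to its nearest surface sample.

    Assumption: the surface is approximated by a list of 3D sample points.
    The norm of the result is the unsigned distance, and its direction
    points towards the closest surface point.
    """
    nearest = min(surface_samples, key=lambda s: math.dist(point, s))
    return tuple(s - p for s, p in zip(nearest, point))


def unsigned_distance(point, surface_samples):
    """Recover the unsigned distance as the norm of the gradient-distance vector."""
    return math.hypot(*gradient_distance(point, surface_samples))


# Two samples of an open surface patch lying in the plane z = 0.
samples = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

above = gradient_distance((0.0, 0.0, 2.0), samples)
below = gradient_distance((0.0, 0.0, -2.0), samples)
print(above, below, unsigned_distance((0.0, 0.0, 2.0), samples))
```

On opposite sides of the patch the vector flips sign while passing through zero at the surface, whereas the unsigned distance is the same on both sides and has a kink there; this is the differentiability property the abstract contrasts with UDFs.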
19. Blowing up Chern-Ricci flat balanced metrics
- Author
-
Fusi, Elia and Giusti, Federico
- Subjects
Mathematics - Differential Geometry, 53C07, 53C25, 53C55
- Abstract
Given a compact Chern-Ricci flat balanced orbifold, we show that its blow-up at a finite family of smooth points admits constant Chern scalar curvature balanced metrics, extending Arezzo-Pacard's construction to the balanced setting. Moreover, if the orbifold has isolated singularities and admits crepant resolutions, we show that they always carry Chern-Ricci flat balanced metrics, without any further hypothesis. In addition, we discuss the general constant Chern scalar curvature balanced case and discuss another version of the main theorem assuming the existence of a special (n-2, n-2)-form. We also present several classes of examples in which our results can be applied.
Comment: 40 pages, comments are welcome!
- Published
- 2024
20. Major symmetry of the induced tangent stiffness tensor for the Zaremba-Jaumann rate and Kirchhoff stress in hyperelasticity: two different approaches
- Author
-
Federico, Salvatore, Holthausen, Sebastian, Husemann, Nina J., and Neff, Patrizio
- Subjects
Mathematics - Analysis of PDEs, 74B20
- Abstract
We recall in this note that the induced tangent stiffness tensor $\mathbb{H}^{\text{ZJ}}_{\tau}(\tau)$ appearing in a hypoelastic formulation based on the Zaremba-Jaumann corotational derivative and the rate constitutive equation for the Kirchhoff stress tensor $\tau$ is minor and major symmetric if the Kirchhoff stress $\tau$ is derived from an elastic potential $\mathrm{W}(F)$. This result is vaguely known in the literature. Here, we expose two different notational approaches which highlight the full symmetry of the tangent stiffness tensor $\mathbb{H}^{\text{ZJ}}_{\tau}(\tau)$. The first approach is based on the direct use of the definition of each symmetry (minor and major), i.e., via contractions of the tensor with the deformation rate tensor $D$. The second approach aims at finding an absolute expression of the tensor $\mathbb{H}^{\text{ZJ}}_{\tau}(\tau)$, by means of special tensor products and their symmetrisations. In some past works, the major symmetry of $\mathbb{H}^{\text{ZJ}}_{\tau}(\tau)$ has been missed because not all necessary symmetrisations were applied. The approach is exemplified for the isotropic Hencky energy. Corresponding stability checks of software packages are briefly discussed.
Comment: arXiv admin note: text overlap with arXiv:2409.20051
- Published
- 2024
21. How linear can a non-linear hyperbolic IFS be?
- Author
-
Algom, Amir, Ovadia, Snir Ben, Hertz, Federico Rodriguez, and Shannon, Mario
- Subjects
Mathematics - Dynamical Systems
- Abstract
Motivated by a question of M. Hochman, we construct examples of hyperbolic IFSs $\Phi$ on $[0,1]$ where linear and non-linear behaviour coexist. Namely, for every $2\leq r \leq \infty$ we exhibit the existence of a $C^r$-smooth IFS such that $f'\equiv c(\Phi)$ on the attractor and $f''\equiv 0$ for every $f \in \Phi$, yet $\Phi$ is not $C^t$-smooth for any $t>r$, nor $C^r$-conjugate to self-similar. We provide a complete classification of these systems. Furthermore, when $r>1$, we give a necessary and sufficient Livsic-like matching condition for a self-conformal $C^r$-smooth IFS to be conjugated to one of these systems having $f''=0$ on the attractor, for every $f\in \Phi$. We also show that this condition fails to ensure the existence of a $C^1$-conjugacy in mere $C^1$-regularity.
- Published
- 2024
22. Projection-based Reduced Order Modelling for Unsteady Parametrized Optimal Control Problems in 3D Cardiovascular Flows
- Author
-
Rathore, Surabhi, Africa, Pasquale Claudio, Ballarin, Francesco, Pichi, Federico, Girfoglio, Michele, and Rozza, Gianluigi
- Subjects
Mathematics - Numerical Analysis, Mathematics - Optimization and Control, Physics - Computational Physics, Physics - Fluid Dynamics, Physics - Medical Physics, 49M41 (Primary), 49K20, 65M60, 76-10, 92C50 (Secondary)
- Abstract
This paper presents a projection-based reduced order modelling (ROM) framework for unsteady parametrized optimal control problems (OCP$_{(\mu)}$s) arising from cardiovascular (CV) applications. In real-life scenarios, accurately defining outflow boundary conditions in patient-specific models poses significant challenges due to complex vascular morphologies, physiological conditions, and high computational demands. These challenges make it difficult to compute realistic and reliable CV hemodynamics by incorporating clinical data such as 4D magnetic resonance imaging. To address these challenges, we focus on controlling the outflow boundary conditions to optimize CV flow dynamics and minimize the discrepancy between target and computed flow velocity profiles. The fluid flow is governed by unsteady Navier--Stokes equations with physical parametric dependence, i.e. the Reynolds number. Numerical solutions of OCP$_{(\mu)}$s require substantial computational resources, highlighting the need for robust and efficient ROMs to perform real-time and many-query simulations. Here, we aim at investigating the performance of a projection-based reduction technique that relies on the offline-online paradigm, enabling significant computational cost savings. The Galerkin finite element method is used to compute the high-fidelity solutions in the offline phase. We implemented a nested-proper orthogonal decomposition (nested-POD) for fast simulation of OCP$_{(\mu)}$s that encompasses two stages: temporal compression for reducing dimensionality in time, followed by parametric-space compression on the precomputed POD modes. We tested the efficacy of the methodology on vascular models, namely an idealized bifurcation geometry and a patient-specific coronary artery bypass graft, incorporating stress control at the outflow boundary, observing consistent speed-up with respect to high-fidelity strategies.
- Published
- 2024
23. CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation
- Author
-
Krause, Claudius, Giannelli, Michele Faucci, Kasieczka, Gregor, Nachman, Benjamin, Salamani, Dalila, Shih, David, Zaborowska, Anna, Amram, Oz, Borras, Kerstin, Buckley, Matthew R., Buhmann, Erik, Buss, Thorsten, Cardoso, Renato Paulo Da Costa, Caterini, Anthony L., Chernyavskaya, Nadezda, Corchia, Federico A. G., Cresswell, Jesse C., Diefenbacher, Sascha, Dreyer, Etienne, Ekambaram, Vijay, Eren, Engin, Ernst, Florian, Favaro, Luigi, Franchini, Matteo, Gaede, Frank, Gross, Eilam, Hsu, Shih-Chieh, Jaruskova, Kristina, Käch, Benno, Kalagnanam, Jayant, Kansal, Raghav, Kim, Taewoo, Kobylianskii, Dmitrii, Korol, Anatolii, Korcari, William, Krücker, Dirk, Krüger, Katja, Letizia, Marco, Li, Shu, Liu, Qibin, Liu, Xiulong, Loaiza-Ganem, Gabriel, Madula, Thandikire, McKeown, Peter, Melzer-Pellmann, Isabell-A., Mikuni, Vinicius, Nguyen, Nam, Ore, Ayodele, Schweitzer, Sofia Palacios, Pang, Ian, Pedro, Kevin, Plehn, Tilman, Pokorski, Witold, Qu, Huilin, Raikwar, Piyush, Raine, John A., Reyes-Gonzalez, Humberto, Rinaldi, Lorenzo, Ross, Brendan Leigh, Scham, Moritz A. W., Schnake, Simon, Shimmin, Chase, Shlizerman, Eli, Soybelman, Nathalie, Srivatsa, Mudhakar, Tsolaki, Kalliopi, Vallecorsa, Sofia, Yeo, Kyongmin, and Zhang, Rui
- Subjects
Computer Science - Machine Learning ,High Energy Physics - Experiment ,High Energy Physics - Phenomenology ,Physics - Instrumentation and Detectors - Abstract
We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousand voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, Diffusion models, and models based on Conditional Flow Matching. We compare all submissions in terms of quality of generated calorimeter showers, as well as shower generation time and model size. To assess the quality we use a broad range of different metrics including differences in 1-dimensional histograms of observables, KPD/FPD scores, AUCs of binary classifiers, and the log-posterior of a multiclass classifier. The results of the CaloChallenge provide the most complete and comprehensive survey of cutting-edge approaches to calorimeter fast simulation to date. In addition, our work provides a uniquely detailed perspective on the important problem of how to evaluate generative models. As such, the results presented here should be applicable for other domains that use generative AI and require fast and faithful generation of samples in a large phase space., Comment: 204 pages, 100+ figures, 30+ tables
- Published
- 2024
24. Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
- Author
-
Ghilardi, Davide, Belotti, Federico, and Molinari, Marco
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Sparse AutoEncoders (SAEs) have recently been employed as an unsupervised approach for understanding the inner workings of Large Language Models (LLMs). They reconstruct the model's activations with a sparse linear combination of interpretable features. However, training SAEs is computationally intensive, especially as models grow in size and complexity. To address this challenge, we propose a novel training strategy that reduces the number of trained SAEs from one per layer to one for a given group of contiguous layers. Our experimental results on Pythia 160M highlight a speedup of up to 6x without compromising the reconstruction quality and performance on downstream tasks. Therefore, layer clustering presents an efficient approach to train SAEs in modern LLMs.
- Published
- 2024
25. Denoising Diffusion Planner: Learning Complex Paths from Low-Quality Demonstrations
- Author
-
Nikken, Michiel, Botteghi, Nicolò, Roozing, Weasley, and Califano, Federico
- Subjects
Computer Science - Robotics - Abstract
Denoising Diffusion Probabilistic Models (DDPMs) are powerful generative deep learning models that have been very successful at image generation and, very recently, in path planning and control. In this paper, we investigate how to leverage the generalization and conditional-sampling capabilities of DDPMs to generate complex paths for a robotic end effector. We show that training a DDPM with synthetic and low-quality demonstrations is sufficient for generating nontrivial paths reaching arbitrary targets and avoiding obstacles. Additionally, we investigate different strategies for conditional sampling combining classifier-free and classifier-guided approaches. Finally, we deploy the DDPM in a receding-horizon control scheme to enhance its planning capabilities. The Denoising Diffusion Planner is experimentally validated through various experiments on a Franka Emika Panda robot.
- Published
- 2024
26. A revisited Correction to the Halo Mass Function for local-type Primordial non-Gaussianity
- Author
-
Fiorino, Luca, Contarini, Sofia, Marulli, Federico, Sanchez, Ariel G., Baldi, Marco, Fiorilli, Andrea, and Moscardini, Lauro
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
We investigate the effect of primordial non-Gaussianities on halo number counts using N-body simulations with different values of $f_{\rm NL}^{\rm loc}$. We show how current theoretical models fail to adequately describe the non-Gaussian mass function of halos identified with different overdensity thresholds, $\Delta_{\rm b}$. We explain how these discrepancies are related to a variation in the density profile of dark matter halos, finding that the internal steepness (i.e. the compactness) of halos depends on the value of $f_{\rm NL}^{\rm loc}$. We then parametrize these deviations in halo number counts with a factor $\kappa(\Delta_{\rm b})$ that modifies the linear density threshold for collapse according to the halo identification threshold used, defined with respect to the Universe background density. We rely on a second-degree polynomial to describe $\kappa$ and employ a Bayesian analysis to determine the coefficients of this polynomial. In addition, we verify the independence of the latter on the sign and absolute value of $f_{\rm NL}^{\rm loc}$. Finally, we show how this re-parametrization prevents the extraction of biased constraints on $f_{\rm NL}^{\rm loc}$, correcting for large systematic errors especially in the case of halos identified with high density thresholds. This improvement is crucial in the perspective of deriving cosmological constraints with the non-Gaussian mass function from real data, as different mass definitions can be employed depending on the properties of the survey., Comment: 28 pages, 8 figures, to be submitted to JCAP
- Published
- 2024
27. Exploring reinforcement learning for incident response in autonomous military vehicles
- Author
-
Madsen, Henrik, Grov, Gudmund, Mancini, Federico, Baksaas, Magnus, and Sommervoll, Åvald Åslaugson
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Computer Science - Robotics - Abstract
Unmanned vehicles able to conduct advanced operations without human intervention are being developed at a fast pace for many purposes. Not surprisingly, they are also expected to significantly change how military operations can be conducted. To leverage the potential of this new technology in a physically and logically contested environment, security risks are to be assessed and managed accordingly. Research on this topic points to autonomous cyber defence as one of the capabilities that may be needed to accelerate the adoption of these vehicles for military purposes. Here, we pursue this line of investigation by exploring reinforcement learning to train an agent that can autonomously respond to cyber attacks on unmanned vehicles in the context of a military operation. We first developed a simple simulation environment to quickly prototype and test some proof-of-concept agents for an initial evaluation. This agent was then applied to a more realistic simulation environment and finally deployed on an actual unmanned ground vehicle for even more realism. A key contribution of our work is demonstrating that reinforcement learning is a viable approach to train an agent that can be used for autonomous cyber defence on a real unmanned ground vehicle, even when trained in a simple simulation environment., Comment: DIGILIENCE 2024
- Published
- 2024
28. Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
- Author
-
Suzgun, Mirac, Gur, Tayfun, Bianchi, Federico, Ho, Daniel E., Icard, Thomas, Jurafsky, Dan, and Zou, James
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Computers and Society - Abstract
As language models (LMs) become integral to fields like healthcare, law, and journalism, their ability to differentiate between fact, belief, and knowledge is essential for reliable decision-making. Failure to grasp these distinctions can lead to significant consequences in areas such as medical diagnosis, legal judgments, and dissemination of fake news. Despite this, current literature has largely focused on more complex issues such as theory of mind, overlooking more fundamental epistemic challenges. This study systematically evaluates the epistemic reasoning capabilities of modern LMs, including GPT-4, Claude-3, and Llama-3, using a new dataset, KaBLE, consisting of 13,000 questions across 13 tasks. Our results reveal key limitations. First, while LMs achieve 86% accuracy on factual scenarios, their performance drops significantly with false scenarios, particularly in belief-related tasks. Second, LMs struggle with recognizing and affirming personal beliefs, especially when those beliefs contradict factual data, which raises concerns for applications in healthcare and counseling, where engaging with a person's beliefs is critical. Third, we identify a salient bias in how LMs process first-person versus third-person beliefs, performing better on third-person tasks (80.7%) compared to first-person tasks (54.4%). Fourth, LMs lack a robust understanding of the factive nature of knowledge, namely, that knowledge inherently requires truth. Fifth, LMs rely on linguistic cues for fact-checking and sometimes bypass the deeper reasoning. These findings highlight significant concerns about current LMs' ability to reason about truth, belief, and knowledge while emphasizing the need for advancements in these areas before broad deployment in critical sectors., Comment: https://github.com/suzgunmirac/belief-in-the-machine
- Published
- 2024
29. Unstable minimal spheres with degree-1 Gauss lift in hyperk\"ahler 4-manifolds
- Author
-
Foscolo, Lorenzo and Trinca, Federico
- Subjects
Mathematics - Differential Geometry ,53C40, 53C42 - Abstract
We exhibit new minimal 2-spheres in hyperk\"ahler 4-manifolds arising from the Gibbons--Hawking ansatz and in the K3 manifold endowed with a hyperk\"ahler metric. These minimal surfaces are obtained via a gluing construction using the Scherk surface in flat space and the holomorphic cigar in the Taub-NUT space as building blocks. As for the stable minimal 2-sphere in the Atiyah--Hitchin manifold, the minimal surfaces we construct are not holomorphic with respect to any complex structure compatible with the metric, have degree-1 positive Gauss lift so they can be parametrised by a harmonic map that satisfies a first-order Fueter-type PDE, and yet are unstable. This shows that there is no characterisation of stable minimal surfaces in hyperk\"ahler 4-manifolds in terms of topological data., Comment: 27 pages, 1 figure
- Published
- 2024
30. Testing Refracted Gravity with kinematics of galaxy clusters
- Author
-
Pizzuti, Lorenzo, Fantoccoli, Federico, Broccolato, Valeria, Biviano, Andrea, and Diaferio, Antonaldo
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,General Relativity and Quantum Cosmology - Abstract
Refracted Gravity (RG) is a classical theory of gravity where a gravitational permittivity $\epsilon(\rho)$, a monotonically-increasing function of the local density $\rho$, is introduced in the Poisson equation to mimic the effect of dark matter at astrophysical scales. We use high precision spectroscopic data of two massive galaxy clusters, MACS J1206.2-0847 at redshift z=0.44, and Abell S1063 (RXC J2248.7-4431) at z=0.35, to determine the total gravitational potential in the context of RG and to constrain the three, supposedly universal, free parameters of this model. Using an upgraded version of the MG-MAMPOSSt algorithm, we perform a kinematic analysis which combines the velocity distribution of the cluster galaxies and the velocity dispersion profile of the stars within the Brightest Cluster Galaxy (BCG). The unprecedented dataset used has been obtained by an extensive spectroscopic campaign carried out with the VIMOS and MUSE spectrographs at the ESO VLT. We found that RG describes the kinematics of these two clusters as well as Newtonian gravity, although the latter is slightly preferred. However, (i) each cluster requires a different set of the three free RG parameters, and (ii) the two sets are inconsistent with other results in the literature at different scales. We discuss the limitation of the method used to constrain the RG parameters as well as possible systematic effects which can give rise to the observed tension, notably deviations from the spherical symmetry and from the dynamical equilibrium of the clusters., Comment: 10 pages, 6 figures. Submitted to A&A
- Published
- 2024
31. Automated generation of photonic circuits for Bell tests with homodyne measurements
- Author
-
Lanore, Corentin, Grasselli, Federico, Valcarce, Xavier, Bancal, Jean-Daniel, and Sangouard, Nicolas
- Subjects
Quantum Physics - Abstract
Nonlocal quantum realizations, certified by the violation of a Bell inequality, are core resources for device-independent quantum information processing. Although proof-of-principle experiments demonstrating device-independent quantum information processing have already been reported, identifying physical platforms that are realistically closer to practical, viable devices remains a significant challenge. In this work, we present an automated framework for designing photonic implementations of nonlocal realizations using homodyne detections and quantum state heralding. Combining deep reinforcement learning and efficient simulations of quantum optical processes, our method generates photonic circuits that achieve significant violations of the Clauser-Horne-Shimony-Holt inequality. In particular, we find an experimental setup, robust to losses, that yields a CHSH violation of $2.068$ with $3.9$ dB and $0.008$ dB squeezed light sources and two beam splitters., Comment: 11+10 pages, 3+2 figures
- Published
- 2024
32. Impact of Leakage on Data Harmonization in Machine Learning Pipelines in Class Imbalance Across Sites
- Author
-
Nieto, Nicolás, Eickhoff, Simon B., Jung, Christian, Reuter, Martin, Diers, Kersten, Kelm, Malte, Lichtenberg, Artur, Raimondo, Federico, and Patil, Kaustubh R.
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Machine learning (ML) models benefit from large datasets. Collecting data in biomedical domains is costly and challenging, hence, combining datasets has become a common practice. However, datasets obtained under different conditions could present undesired site-specific variability. Data harmonization methods aim to remove site-specific variance while retaining biologically relevant information. This study evaluates the effectiveness of popularly used ComBat-based methods for harmonizing data in scenarios where the class balance is not equal across sites. We find that these methods struggle with data leakage issues. To overcome this problem, we propose a novel approach PrettYharmonize, designed to harmonize data by pretending the target labels. We validate our approach using controlled datasets designed to benchmark the utility of harmonization. Finally, using real-world MRI and clinical data, we compare leakage-prone methods with PrettYharmonize and show that it achieves comparable performance while avoiding data leakage, particularly in site-target-dependence scenarios.
- Published
- 2024
33. Divisorial properties and special metrics on hypercomplex twistor spaces
- Author
-
Federico, Alberto Pipitone
- Subjects
Mathematics - Algebraic Geometry ,Mathematics - Complex Variables ,Mathematics - Differential Geometry - Abstract
We prove that the general fiber of a compact hypercomplex twistor space with a K\"{a}hler fiber has neither divisors nor curves. This is first used to prove that, under the same assumption, the transcendental degree of the field of meromorphic functions is one. The same result allows us to prove that these spaces admit no K\"{a}hler metrics, and not even pluriclosed ones.
- Published
- 2024
34. Doppler correlation-driven vetoes for the Frequency Hough analysis in continuous gravitational-wave searches
- Author
-
Di Giovanni, Matteo, Leaci, Paola, Astone, Pia, Pra, Stefano Dal, D'Atonio, Sabrina, D'Onofrio, Luca, Frasca, Sergio, Muciaccia, Federico, Palomba, Cristiano, Pierini, Lorenzo, and Tehrani, Francesco Safai
- Subjects
General Relativity and Quantum Cosmology ,Astrophysics - Instrumentation and Methods for Astrophysics ,Physics - Data Analysis, Statistics and Probability - Abstract
We present an improved method for vetoing candidates of continuous gravitational-wave sources during all-sky searches utilizing the Frequency Hough pipeline. This approach leverages linear correlations between source parameters induced by the Earth Doppler effect, which can be effectively identified through the Hough Transform. Candidates that do not align with these patterns are considered spurious and can thus be vetoed, enhancing the depth and statistical significance of follow-up analyses. Additionally, we provide a comprehensive explanation of the method calibration, which is intrinsically linked to the total duration of the observing run. On average, the procedure successfully vetoes $56\%$ of candidates. To assess the method performance, we conducted a Monte-Carlo simulation injecting fake continuous-wave signals into data from the third observing run of the LIGO detectors. This analysis allowed us to infer strain amplitude upper limits at a $90\%$ confidence level. We found that the optimal sensitivity is $h_0^{90\%} = 3.62^{+0.23}_{-0.22}\times 10^{-26}$ in the [128, 200] Hz band, which is within the most sensitive frequency band of the LIGO detectors., Comment: 13 pages, 9 figures, 5 tables
- Published
- 2024
35. Monopole excitations in the $U(1)$ Dirac spin liquid on the triangular lattice
- Author
-
Budaraju, Sasank, Parola, Alberto, Iqbal, Yasir, Becca, Federico, and Poilblanc, Didier
- Subjects
Condensed Matter - Strongly Correlated Electrons - Abstract
The $U(1)$ Dirac spin liquid might realize an exotic phase of matter whose low-energy properties are described by quantum electrodynamics in $2+1$ dimensions, where gapless modes exist but spinons and gauge fields are strongly coupled. Its existence has been proposed in frustrated Heisenberg models in the presence of frustrating super-exchange interactions, by the (Abrikosov) fermionic representation of the spin operators [X.-G. Wen, Phys. Rev. B 65, 165113 (2002)], supplemented by the Gutzwiller projection. Here, we construct charge-$Q$ monopole excitations in the Heisenberg model on the triangular lattice with nearest- ($J_1$) and next-neighbor ($J_2$) couplings. In the highly frustrated regime, singlet and triplet monopoles with $Q=1$ become gapless in the thermodynamic limit; in addition, the energies for generic $Q$ agree with field-theoretical predictions, obtained for a large number of gapless fermion modes. Finally, we consider localized gauge excitations, in which magnetic $\pi$-fluxes are concentrated in the triangular plaquettes (in analogy with $\mathbb{Z}_2$ visons), showing that these kinds of states do not play a relevant role at low energies. All our findings lend support to a stable $U(1)$ Dirac spin liquid in the $J_1-J_2$ Heisenberg model on the triangular lattice., Comment: 9 pages, 7 figures
- Published
- 2024
36. SENSEI at SNOLAB: Single-Electron Event Rate and Implications for Dark Matter
- Author
-
Bloch, Itay M., Botti, Ana M., Cababie, Mariano, Cancelo, Gustavo, Cervantes-Vergara, Brenda A., Daal, Miguel, Desai, Ansh, Drlica-Wagner, Alex, Essig, Rouven, Estrada, Juan, Etzion, Erez, Moroni, Guillermo Fernandez, Holland, Stephen E., Kehat, Jonathan, Lawson, Ian, Luoma, Steffon, Orly, Aviv, Perez, Santiago E., Rodrigues, Dario, Saffold, Nathan A., Scorza, Silvia, Sofo-Haro, Miguel, Stifter, Kelly, Tiffenberg, Javier, Uemura, Sho, Villalpando, Edgar Marrufo, Volansky, Tomer, Winkel, Federico, Wu, Yikai, and Yu, Tien-Tien
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,High Energy Physics - Experiment - Abstract
We present results from data acquired by the SENSEI experiment at SNOLAB after a major upgrade in May 2023, which includes deploying 16 new sensors and replacing the copper trays that house the CCDs with a new light-tight design. We observe a single-electron event rate of $(1.39 \pm 0.11) \times 10^{-5}$ e$^-$/pix/day, corresponding to $(39.8 \pm 3.1)$ e$^-$/gram/day. This is an order-of-magnitude improvement compared to the previous lowest single-electron rate in a silicon detector and the lowest for any photon detector in the near-infrared-ultraviolet range. We use these data to obtain a 90% confidence level upper bound of $1.53 \times 10^{-5}$ e$^-$/pix/day and to set constraints on sub-GeV dark matter candidates that produce single-electron events. We hypothesize that the data taken at SNOLAB in the previous run, with an older tray design for the sensors, contained a larger rate of single-electron events due to light leaks. We test this hypothesis using data from the SENSEI detector located in the MINOS cavern at Fermilab.
- Published
- 2024
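As a rough consistency check on the two rate units quoted in the SENSEI abstract above, the sketch below derives the effective silicon mass per pixel implied by the pair of central values and applies the same factor to the 90% upper bound. The variable names are illustrative, and the implied mass is an inference from the rounded figures in the abstract, not a value stated by the authors.

```python
# Central values quoted in the abstract.
rate_per_pixel = 1.39e-5   # e-/pix/day
rate_per_gram = 39.8       # e-/gram/day

# Implied mass per pixel in grams, assuming both numbers describe
# the same measurement expressed in two different units.
implied_pixel_mass_g = rate_per_pixel / rate_per_gram

# Convert the quoted 90% CL upper bound to per-gram units with the same factor.
upper_bound_per_pixel = 1.53e-5  # e-/pix/day
upper_bound_per_gram = upper_bound_per_pixel / implied_pixel_mass_g

print(f"implied pixel mass: {implied_pixel_mass_g:.3e} g")
print(f"upper bound: {upper_bound_per_gram:.1f} e-/gram/day")
```

The per-gram upper bound obtained this way is only as precise as the rounded inputs; the paper itself should be consulted for the exact exposure and mass figures.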
37. Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks
- Author
-
Manduzio, Graziano A., Galatolo, Federico A., Cimino, Mario G. C. A., Scilingo, Enzo Pasquale, and Cominelli, Lorenzo
- Subjects
Computer Science - Artificial Intelligence - Abstract
Recent advancements in Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. While these models excel in general complex reasoning tasks, they still face challenges in mathematical problem-solving and logical reasoning. To address these limitations, researchers have explored function calling abilities, allowing LLMs to execute provided functions and utilize their outputs for task completion. However, using large-scale LLMs for specific tasks can be very inefficient, because of the computational resources their training and inference stages require. This study introduces a novel framework for training smaller language models in function calling, focusing on specific logical and mathematical reasoning tasks. The approach aims to improve the performance of small-scale models on these tasks using function calling, ensuring a high level of accuracy. Our framework employs an agent that, given a problem and a set of callable functions, queries the LLM by injecting a description and examples of the usable functions into the prompt and managing their calls in a step-by-step reasoning chain. This process is used to create a dataset of correct and incorrect reasoning chain chat completions from a large-scale LLM. This dataset is used to train a smaller LLM using Reinforcement Learning from Human Feedback (RLHF), specifically employing the Direct Preference Optimization (DPO) technique. Experimental results demonstrate how the proposed approach balances the trade-off between model size and performance, improving the function calling ability of smaller models on reasoning tasks.
- Published
- 2024
38. Partially Identified Rankings from Pairwise Interactions
- Author
-
Crippa, Federico and Fedchenko, Danil
- Subjects
Economics - Econometrics - Abstract
This paper considers the problem of ranking objects based on their latent merits using data from pairwise interactions. Existing approaches rely on the restrictive assumption that all the interactions are either observed or missed randomly. We investigate what can be inferred about rankings when this assumption is relaxed. First, we demonstrate that in parametric models, such as the popular Bradley-Terry-Luce model, rankings are point-identified if and only if the tournament graph is connected. Second, we show that in nonparametric models based on strong stochastic transitivity, rankings in a connected tournament are only partially identified. Finally, we propose two statistical tests to determine whether a ranking belongs to the identified set. One test is valid in finite samples but computationally intensive, while the other is easy to implement and valid asymptotically. We illustrate our procedure using Brazilian employer-employee data to test whether male and female workers rank firms differently when making job transitions.
- Published
- 2024
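The identification condition stated in the abstract above (rankings in parametric models are point-identified if and only if the tournament graph is connected) can be illustrated with a simple connectivity check. This is a minimal sketch, not the authors' procedure: the function name `is_connected` and the input format are hypothetical, and the graph is treated as undirected, which is an assumption because the abstract does not say which notion of connectivity the paper uses.

```python
from collections import deque

def is_connected(num_objects, interactions):
    """Return True if every object is linked to every other through observed pairs.

    interactions is a list of (a, b) index pairs, one per observed
    pairwise interaction; direction is ignored in this sketch.
    """
    if num_objects == 0:
        return True
    neighbours = [[] for _ in range(num_objects)]
    for a, b in interactions:
        neighbours[a].append(b)
        neighbours[b].append(a)
    # Breadth-first search from object 0.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == num_objects

print(is_connected(3, [(0, 1), (1, 2)]))   # chain linking all three objects
print(is_connected(4, [(0, 1), (2, 3)]))   # two separate components
```

In the second call, objects {0, 1} never interact with {2, 3}, so no comparison data link the two groups and their relative ranking cannot be pinned down from the observed interactions alone.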
39. Bilayer one-dimensional Convection-Diffusion-Reaction-Source problem. Analytical and numerical solution
- Author
-
Umbricht, Guillermo Federico, Rubio, Diana, and Tarzia, Domingo Alberto
- Subjects
Physics - Fluid Dynamics ,Mathematical Physics ,Mathematics - Analysis of PDEs - Abstract
This article presents a theoretical analysis of a one-dimensional heat transfer problem in two layers involving diffusion, advection, internal heat generation or loss linearly dependent on temperature in each layer, and heat generation due to external sources. Additionally, the thermal resistance at the interface between the materials is considered. The situation of interest is modeled mathematically, explicit analytical solutions are found using Fourier techniques, and a convergent finite difference scheme is formulated to simulate specific cases. The solution is consistent with previous results. A numerical example is included that shows coherence between the obtained results and the physics of the problem. The conclusions drawn in this work expand the theoretical understanding of two-layer heat transfer and may also contribute to improving the thermal design of multilayer engineering systems., Comment: 22 Pages, 7 Figures, 3 Tables
- Published
- 2024
40. Long and short term variability of the possible nascent planetary nebula IRAS 22568+6141: A late thermal pulse?
- Author
-
Cala, Roldán A., Miranda, Luis F., Gómez, José F., Morisset, Christophe, Soto, Federico, Guillén, Pedro F., and Vázquez, Roberto
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
IRAS 22568+6141 has been classified as a low-ionisation planetary nebula (PN) and presents non-thermal radio continuum emission, which could be a signature of nascent PNe. We present intermediate-resolution long-slit spectra obtained in 2021 and 2023, high-resolution long-slit spectra taken in 2023, and a light curve at the $r$-filter between 1953 and 2019, that reveal changes in IRAS 22568+6141 with timescales of decades and a few years. The object underwent an energetic event around 1990 that suddenly increased its brightness, which has been fading since then. A comparison with a published spectrum from 1988 shows an increase of the H$\beta$ flux in 2021 by a factor of $\simeq$6 and the appearance of [O III] emission lines that were absent in 1988. Between 2021 and 2023 the H$\beta$ flux decreased by a factor of $\simeq$1.7, and the [O III] emission lines almost vanished. These results and the variability observed in other emission lines indicate that IRAS 22568+6141 is recombining and cooling down between 2021 and 2023, and probably since 2005, as suggested by archival radio continuum and mid-IR observations. The intermediate- and high-resolution spectra show that the excitation of the emission lines is dominated by shocks in 2021 and 2023, and, probably, also in 1988, which may be related to the non-thermal radio continuum emission from the object. Although the variability might be due to changes in the physical conditions in the shocks or in a nova-like eruption, it is more consistent with that expected from a late thermal pulse, which is further suggested by a comparison with other similar objects. New observations and monitoring in the coming years are crucial to corroborate the origin of the variability., Comment: 11 pages, 8 figures, 6 tables. Accepted for publication in A&A
- Published
- 2024
41. A class of modular and flexible covariate-based covariance functions for nonstationary spatial modeling
- Author
-
Blasi, Federico and Furrer, Reinhard
- Subjects
Statistics - Methodology ,Statistics - Applications ,Statistics - Machine Learning - Abstract
The assumptions of stationarity and isotropy often stated over spatial processes have not aged well during the last two decades, partly explained by the combination of computational developments and the increasing availability of high-resolution spatial data. While a plethora of approaches have been developed to relax these assumptions, it is often a costly tradeoff between flexibility and a diversity of computational challenges. In this paper, we present a class of covariance functions that relies on fixed, observable spatial information that provides a convenient tradeoff while offering an extra layer of numerical and visual representation of the flexible spatial dependencies. This model allows for separate parametric structures for different sources of nonstationarity, such as marginal standard deviation, geometric anisotropy, and smoothness. It simplifies to a Mat\'ern covariance function in its basic form and is adaptable for large datasets, enhancing flexibility and computational efficiency. We analyze the capabilities of the presented model through simulation studies and an application to Swiss precipitation data., Comment: 24 pages, 12 figures
- Published
- 2024
42. Properties of black hole-star binaries formed in $N$-body simulations of massive star clusters: implications for Gaia black holes
- Author
-
Fantoccoli, Federico, Barber, Jordan, Dosopoulou, Fani, Chattopadhyay, Debatri, and Antonini, Fabio
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
We investigate black hole-star binaries formed in $N$-body simulations of massive, dense star clusters. We simulate 32 clusters with varying initial masses ($10^{4}~\rm M_{\odot}$ to $10^{6}~\rm M_{\odot}$), densities ($1200~\rm M_{\odot}~pc^{-3}$ to $10^{5}~\rm M_{\odot}~pc^{-3}$), and metallicities $(Z = 0.01,~0.001,~0.0001)$. Our results reveal that star clusters produce a diverse range of BH-star binaries, with dynamical interactions leading to extreme systems characterised by large orbital separations and high black hole masses. Of the ejected BH-main sequence (BH-MS) binaries, $20\%$ form dynamically, while the rest originate from the primordial binary population initially present in the cluster. Ejected BH-MS binaries that are dynamically formed have more massive black holes, lower-mass stellar companions, and over half are in a hierarchical triple system. All unbound BH-giant star (BH-GS) binaries were ejected as BH-MS binaries and evolved into the BH-GS phase outside the cluster. Due to their lower-mass companions, most dynamically formed binaries do not evolve into BH-GS systems within a Hubble time. Consequently, only 2 of the 35 ejected BH-GS binaries are dynamically formed. We explore the formation pathways of Gaia-like systems, identifying two Gaia BH1-like binaries that formed through dynamical interactions, and two Gaia BH2-like systems with a primordial origin. We did not find any system resembling Gaia BH3, which may however be attributed to the limited sample size of our simulations.
- Published
- 2024
43. AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
- Author
-
Ma, Xiaoxuan, Lin, Yutang, Xu, Yuan, Kaufhold, Stephan P., Terwilliger, Jack, Meza, Andres, Zhu, Yixin, Rossano, Federico, and Wang, Yizhou
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Understanding non-human primate behavior is crucial for improving animal welfare, modeling social behavior, and gaining insights into both distinctly human and shared behaviors. Despite recent advances in computer vision, automated analysis of primate behavior remains challenging due to the complexity of their social interactions and the lack of specialized algorithms. Existing methods often struggle with the nuanced behaviors and frequent occlusions characteristic of primate social dynamics. This study aims to develop an effective method for automated detection, tracking, and recognition of chimpanzee behaviors in video footage. Here we show that our proposed method, AlphaChimp, an end-to-end approach that simultaneously detects chimpanzee positions and estimates behavior categories from videos, significantly outperforms existing methods in behavior recognition. AlphaChimp achieves approximately 10% higher tracking accuracy and a 20% improvement in behavior recognition compared to state-of-the-art methods, particularly excelling in the recognition of social behaviors. This superior performance stems from AlphaChimp's innovative architecture, which integrates temporal feature fusion with a Transformer-based self-attention mechanism, enabling more effective capture and interpretation of complex social interactions among chimpanzees. Our approach bridges the gap between computer vision and primatology, enhancing technical capabilities and deepening our understanding of primate communication and sociality. We release our code and models and hope this will facilitate future research in animal social dynamics. This work contributes to ethology, cognitive science, and artificial intelligence, offering new perspectives on social intelligence., Comment: An eXpressive extension of ChimpACT [arXiv:2310.16447], proposes AlphaChimp for tracking and behavior recognition of chimpanzees. arXiv admin note: substantial text overlap with arXiv:2310.16447
- Published
- 2024
44. PEtra: A Flexible and Open-Source PE Loop Tracer for Polymer Thin-Film Transducers
- Author
-
Wessner, Marc-Andre, Villani, Federico, Papa, Sofia, Keller, Kirill, Ferrari, Laura, Greco, Francesco, Benini, Luca, and Leitner, Christoph
- Subjects
Electrical Engineering and Systems Science - Systems and Control, Physics - Instrumentation and Detectors - Abstract
Accurate characterization of ferroelectric properties in polymer piezoelectrics is critical for optimizing the performance of flexible and wearable ultrasound transducers, such as screen-printed PVDF devices. Standard charge measurement techniques, like the Sawyer-Tower circuit, often fall short when applied to ferroelectric polymers due to low-frequency leakage. In this work, we present PEtra, an open-source and versatile piezoelectric loop tracer. PEtra employs a transimpedance amplifier (LMP7721, TI) to convert picoampere-level currents into measurable voltages, covering a frequency range of 0.1 Hz to 5 Hz for a gain setting of 10^7 V/A, and 0.1 Hz to 200 Hz for gain settings from 10^3 V/A to 10^6 V/A (10-fold increments). We demonstrate through simulations and experimental validations that PEtra achieves a sensitivity down to 2 pA, effectively addressing the limitations of traditional charge measurement methods. Compared to the Sawyer-Tower circuit, PEtra directly amplifies currents without the need for a reference capacitor. As a result, it is less susceptible to leakage and can operate at lower frequencies, improving measurement accuracy and reliability. PEtra's design is fully open source, offering researchers and engineers a versatile tool to drive advancements in flexible PVDF transducer technology.
- Published
- 2024
45. On conditional diffusion models for PDE simulations
- Author
-
Shysheya, Aliaksandra, Diaconu, Cristiana, Bergamin, Federico, Perdikaris, Paris, Hernández-Lobato, José Miguel, Turner, Richard E., and Mathieu, Emile
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Modelling partial differential equations (PDEs) is of crucial importance in science and engineering, and it includes tasks ranging from forecasting to inverse problems, such as data assimilation. However, most previous numerical and machine learning approaches that target forecasting cannot be applied out-of-the-box for data assimilation. Recently, diffusion models have emerged as a powerful tool for conditional generation, being able to flexibly incorporate observations without retraining. In this work, we perform a comparative study of score-based diffusion models for forecasting and assimilation of sparse observations. In particular, we focus on diffusion models that are either trained in a conditional manner, or conditioned after unconditional training. We address the shortcomings of existing models by proposing 1) an autoregressive sampling approach that significantly improves performance in forecasting, 2) a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths, and 3) a hybrid model which employs flexible pre-training conditioning on initial conditions and flexible post-training conditioning to handle data assimilation. We empirically show that these modifications are crucial for successfully tackling the combination of forecasting and data assimilation, a task commonly encountered in real-world scenarios., Comment: Accepted at NeurIPS 2024
- Published
- 2024
46. Runtime Reduction in Linear Quantum Charge-Coupled Devices using the Parity Flow Formalism
- Author
-
Domínguez, Federico, Fellner, Michael, Klaver, Berend, Rombouts, Stefan, Ertler, Christian, and Lechner, Wolfgang
- Subjects
Quantum Physics - Abstract
Using the Parity Flow formalism, we show that physical SWAP gates can be eliminated in linear hardware architectures, without increasing the total number of two-qubit operations. This has a significant impact on the execution time of quantum circuits in linear Quantum Charge-Coupled Devices (QCCDs), where SWAP gates are implemented by physically changing the position of the ions. Because SWAP gates are one of the most time-consuming operations in QCCDs, our scheme considerably reduces the runtime of the quantum Fourier transform and the quantum approximate optimization algorithm on all-to-all spin models, compared to circuits generated with standard compilers (TKET and Qiskit). While increasing the problem size (and therefore the number of qubits) typically demands longer runtimes, which are constrained by coherence time, our runtime reduction enables a significant increase in the number of qubits at a given coherence time., Comment: 9 pages, 4 figures. Typos corrected
- Published
- 2024
47. The microscale organization of directed hypergraphs
- Author
-
Lotito, Quintino Francesco, Vendramini, Alberto, Montresor, Alberto, and Battiston, Federico
- Subjects
Physics - Physics and Society, Computer Science - Social and Information Networks - Abstract
Many real-world complex systems are characterized by non-pairwise -- higher-order -- interactions among the system's units, and can be effectively modeled as hypergraphs. Directed hypergraphs distinguish between source and target sets within each hyperedge, and make it possible to account for the directional flow of information between nodes. Here, we provide a framework to characterize the structural organization of directed higher-order networks at their microscale. First, we extract the fingerprint of a directed hypergraph, capturing the frequency of hyperedges with given source and target sizes, and use this information to compute differences in higher-order connectivity patterns among real-world systems. Then, we formulate reciprocity in hypergraphs, including exact, strong, and weak definitions, to measure the extent to which hyperedges are reciprocated. Finally, we extend motif analysis to identify recurring interaction patterns and extract the building blocks of directed hypergraphs. We validate our framework on empirical datasets, including Bitcoin transactions, metabolic networks, and citation data, revealing structural principles behind the organization of real-world systems.
- Published
- 2024
48. Lossless optimal transient control for rigid bodies in 3D space
- Author
-
Zanella, Riccardo, Califano, Federico, Franchi, Antonio, and Stramigioli, Stefano
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
In this letter, we propose a control scheme for rigid bodies designed to optimise transient behaviors. The search space for the optimal control input is parameterized to yield a passive, specifically lossless, nonlinear feedback controller. As a result, it can be combined with other stabilizing controllers without compromising the stability of the closed-loop system. The controller commands torques generating fictitious gyroscopic effects characteristic of 3D rotational rigid body motions, and as such neither injects kinetic energy into the system nor extracts it. We validate the controller in simulation using a model predictive control (MPC) scheme, successfully combining stability and performance in a stabilization task with obstacle avoidance constraints.
- Published
- 2024
49. On the Design and Performance of Machine Learning Based Error Correcting Decoders
- Author
-
Yuan, Yuncheng, Scheepers, Péter, Tasiou, Lydia, Gültekin, Yunus Can, Corradi, Federico, and Alvarado, Alex
- Subjects
Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning - Abstract
This paper analyzes the design and competitiveness of four neural network (NN) architectures recently proposed as decoders for forward error correction (FEC) codes. We first consider the so-called single-label neural network (SLNN) and the multi-label neural network (MLNN) decoders which have been reported to achieve near maximum likelihood (ML) performance. Here, we show analytically that SLNN and MLNN decoders can always achieve ML performance, regardless of the code dimensions -- although at the cost of computational complexity -- and no training is in fact required. We then turn our attention to two transformer-based decoders: the error correction code transformer (ECCT) and the cross-attention message passing transformer (CrossMPT). We compare their performance against traditional decoders, and show that ordered statistics decoding outperforms these transformer-based decoders. The results in this paper cast serious doubts on the application of NN-based FEC decoders in the short and medium block length regime., Comment: 6 pages, 4 figures, submitted for possible presentation in a conference (v2: Pre-FEC BER curves are corrected)
- Published
- 2024
50. Taming Mambas for Voxel Level 3D Medical Image Segmentation
- Author
-
Lumetti, Luca, Pipoli, Vittorio, Marchesini, Kevin, Ficarra, Elisa, Grana, Costantino, and Bolelli, Federico
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recently, the field of 3D medical segmentation has been dominated by deep learning models employing Convolutional Neural Networks (CNNs) and Transformer-based architectures, each with their distinctive strengths and limitations. CNNs are constrained by a local receptive field, whereas transformers are hindered by their substantial memory requirements as well as their data hungriness, making them not ideal for processing 3D medical volumes at a fine-grained level. For these reasons, fully convolutional neural networks, such as nnUNet, still dominate the scene when segmenting medical structures in large 3D medical volumes. Despite numerous advancements towards developing transformer variants with subquadratic time and memory complexity, these models still fall short in content-based reasoning. A recent breakthrough is Mamba, a Recurrent Neural Network (RNN) based on State Space Models (SSMs) outperforming Transformers in many long-context tasks (million-length sequences) on famous natural language processing and genomic benchmarks while keeping a linear complexity.
- Published
- 2024