Author: "Hei, P" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Hei, P" returned 3,679 results


1. Unified Kernel-Segregated Transpose Convolution Operation

Author: Tida, Vijay Srinivas, Hossen, Md Imran, Shan, Liqun, Chilukoti, Sai Venkatesh, Hsu, Sonya, and Hei, Xiali
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The optimization of the transpose convolution layer for deep learning applications is achieved with the kernel segregation mechanism. However, kernel segregation has disadvantages, such as computing extra elements to obtain the output feature map with odd dimensions while launching a thread. To mitigate this problem, we introduce a unified kernel segregation approach that limits the usage of memory and computational resources by employing one unified kernel to execute four sub-kernels. The findings reveal that the suggested approach achieves an average computational speedup of 2.03x (3.89x) when tested on specific datasets with an RTX 2070 GPU (Intel Xeon CPU). The ablation study shows an average computational speedup of 3.5x when evaluating the transpose convolution layers from well-known Generative Adversarial Networks (GANs). The implementation of the proposed method for the transpose convolution layers in the EB-GAN model demonstrates significant memory savings of up to 35 MB.
Published: 2025

2. Stochastic trace estimation for parameter-dependent matrices applied to spectral density approximation

Author: Matti, Fabio, He, Haoze, Kressner, Daniel, and Lam, Hei Yin
Subjects: Mathematics - Numerical Analysis, 65C05, 65F15, 65Y20, 68W20, 68W25, 68W40
Abstract: Stochastic trace estimation is a well-established tool for approximating the trace of a large symmetric matrix $\mathbf{B}$. Several applications involve a matrix that depends continuously on a parameter $t \in [a,b]$, and require trace estimates of $\mathbf{B}(t)$ for many values of $t$. This is, for example, the case when approximating the spectral density of a matrix. Approximating the trace separately for each matrix $\mathbf{B}(t_1), \dots, \mathbf{B}(t_m)$ clearly incurs redundancies and a cost that scales linearly with $m$. To address this issue, we propose and analyze modifications for three stochastic trace estimators, the Girard-Hutchinson, Nyström, and Nyström++ estimators. Our modification uses constant randomization across different values of $t$, that is, every matrix $\mathbf{B}(t_1), \dots, \mathbf{B}(t_m)$ is multiplied with the same set of random vectors. When combined with Chebyshev approximation in $t$, the use of such constant random matrices allows one to reuse matrix-vector products across different values of $t$, leading to significant cost reduction. Our analysis shows that the loss of stochastic independence across different $t$ does not lead to deterioration. In particular, we show that $\mathcal{O}(\varepsilon^{-1})$ random matrix-vector products suffice to ensure an error of $\varepsilon > 0$ for Nyström++, independent of low-rank properties of $\mathbf{B}(t)$. We discuss in detail how the combination of Nyström++ with Chebyshev approximation applies to spectral density estimation and provide an analysis of the resulting method. This improves various aspects of an existing stochastic estimator for spectral density estimation. Several numerical experiments from electronic structure interaction, statistical thermodynamics, and neural network optimization validate our findings.
Published: 2025
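
Illustration (not part of the catalog record): the abstract of result 2 describes multiplying every matrix $\mathbf{B}(t_1), \dots, \mathbf{B}(t_m)$ with the same set of random vectors. A minimal Python sketch of that idea for the plain Girard-Hutchinson estimator might look as follows. The function name, the `matrix_at` callback, and the choice of Rademacher probes are assumptions made for illustration; the sketch does not include the Chebyshev approximation or the Nyström variants analyzed in the paper.

```python
import numpy as np

def shared_probe_trace_estimates(matrix_at, ts, dim, num_probes=64, seed=0):
    """Girard-Hutchinson estimates of tr(B(t)) for each t, reusing one probe set.

    matrix_at is a hypothetical callback returning the dim x dim matrix B(t).
    """
    rng = np.random.default_rng(seed)
    # One fixed set of Rademacher probe vectors, shared by every value of t
    # (the "constant randomization" described in the abstract).
    probes = rng.choice([-1.0, 1.0], size=(dim, num_probes))
    estimates = []
    for t in ts:
        b = matrix_at(t)
        # Average of the quadratic forms w_i^T B(t) w_i over all probes.
        estimates.append(np.einsum("ij,ij->", probes, b @ probes) / num_probes)
    return np.array(estimates)
```

Because the probes are fixed, the estimates vary smoothly with t whenever B(t) does, which is what makes reuse across parameter values possible in the first place.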

3. Not-So-Optimal Transport Flows for 3D Point Cloud Generation

Author: Hui, Ka-Hei, Liu, Chao, Zeng, Xiaohui, Fu, Chi-Wing, and Vahdat, Arash
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Learning generative models of 3D point clouds is one of the fundamental problems in 3D generative learning. One of the key properties of point clouds is their permutation invariance, i.e., changing the order of points in a point cloud does not change the shape they represent. In this paper, we analyze the recently proposed equivariant OT flows that learn permutation invariant generative models for point-based molecular data and we show that these models scale poorly on large point clouds. We also observe that learning (equivariant) OT flows is generally challenging since straightening flow trajectories makes the learned flow model complex at the beginning of the trajectory. To remedy these, we propose not-so-optimal transport flow models that obtain an approximate OT by an offline OT precomputation, enabling an efficient construction of OT pairs for training. During training, we can additionally construct a hybrid coupling by combining our approximate OT and independent coupling to make the target flow models easier to learn. In an extensive empirical study, we show that our proposed model outperforms prior diffusion- and flow-based approaches across a wide range of unconditional generation and shape completion tasks on the ShapeNet benchmark.
Published: 2025

4. Differentially private fine-tuned NF-Net to predict GI cancer type

Author: Chilukoti, Sai Venkatesh, Hossen, Md Imran, Shan, Liqun, Tida, Vijay Srinivas, and Hei, Xiali
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Based on global genomic status, a cancer tumor is classified as either Microsatellite Instable (MSI) or Microsatellite Stable (MSS). Immunotherapy is used to treat MSI, whereas radiation and chemotherapy are used for MSS. Therefore, it is important to classify a gastro-intestinal (GI) cancer tumor as MSI or MSS to provide appropriate treatment. The existing literature showed that deep learning could directly predict the class of GI cancer tumors from histological images. However, deep learning (DL) models are susceptible to various threats, including membership inference attacks, model extraction attacks, etc. These attacks render the use of DL models impractical in real-world scenarios. To make the DL models useful and maintain privacy, we integrate differential privacy (DP) with DL. In particular, this paper aims to predict the state of GI cancer while preserving the privacy of sensitive data. We fine-tuned the Normalizer Free Net (NF-Net) model. Without DP, we obtained an accuracy of 88.98% in predicting GI cancer status. When we fine-tuned the NF-Net using DP-AdamW and adaptive DP-AdamW, we got accuracies of 74.58% and 76.48%, respectively. Moreover, we investigate the Weighted Random Sampler (WRS) and Class weighting (CW) to solve the data imbalance. We also evaluated and analyzed the DP algorithms in different settings.
Comment: 10 pages, 8 tables, 2 figures
Published: 2025

5. Exploring Test Time Adaptation for Subcortical Segmentation of the Fetal Brain in 3D Ultrasound

Author: Omolegan, Joshua, Yeung, Pak Hei, Wyburd, Madeleine K., Hesse, Linde, Haak, Monique, Intergrowth-21st Consortium, Namburete, Ana I. L., and Dinsdale, Nicola K.
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Monitoring the growth of subcortical regions of the fetal brain in ultrasound (US) images can help identify the presence of abnormal development. Manually segmenting these regions is a challenging task, but recent work has shown that it can be automated using deep learning. However, applying pretrained models to unseen freehand US volumes often leads to a degradation of performance due to the vast differences in acquisition and alignment. In this work, we first demonstrate that test time adaptation (TTA) can be used to improve model performance in the presence of both real and simulated domain shifts. We further propose a novel TTA method by incorporating a normative atlas as a prior for anatomy. In the presence of various types of domain shifts, we benchmark the performance of different TTA methods and demonstrate the improvements brought by our proposed approach, which may further facilitate automated monitoring of fetal brain development. Our code is available at https://github.com/joshuaomolegan/TTA-for-3D-Fetal-Subcortical-Segmentation.
Comment: 5 pages, 5 figures
Published: 2025

6. Community detection for directed networks revisited using bimodularity

Author: Cionca, Alexandre, Chan, Chun Hei Michael, and Van De Ville, Dimitri
Subjects: Computer Science - Social and Information Networks
Abstract: Community structure is a key feature omnipresent in real-world network data. A plethora of methods have been proposed to reveal subsets of densely interconnected nodes using criteria such as the modularity index. These approaches have been successful for undirected graphs, but directed edge information has not yet been dealt with in a satisfactory way. Here, we revisit the concept of directed communities as a mapping between sending and receiving communities. This translates into a new definition that we term bimodularity. Using convex relaxation, bimodularity can be optimized with the singular value decomposition of the directed modularity matrix. Subsequently, we propose an edge-based clustering approach to reveal the directed communities including their mappings. The feasibility of the new framework is illustrated on a synthetic model and further applied to the neuronal wiring diagram of C. elegans, for which it yields meaningful feedforward loops of the head and body motion systems. This framework sets the ground for the understanding and detection of community structures in directed networks.
Comment: 7 pages, 4 figures, 12 supplementary pages with 10 figures. Code and data are available at https://github.com/MIPLabCH/Bimodularity
Published: 2025

7. Decoding Human Attentive States from Spatial-temporal EEG Patches Using Transformers

Author: Ding, Yi, Lee, Joon Hei, Zhang, Shuailei, Luo, Tianze, and Guan, Cuntai
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: Learning the spatial topology of electroencephalogram (EEG) channels and their temporal dynamics is crucial for decoding attention states. This paper introduces EEG-PatchFormer, a transformer-based deep learning framework designed specifically for EEG attention classification in Brain-Computer Interface (BCI) applications. By integrating a Temporal CNN for frequency-based EEG feature extraction, a pointwise CNN for feature enhancement, and Spatial and Temporal Patching modules for organizing features into spatial-temporal patches, EEG-PatchFormer jointly learns spatial-temporal information from EEG data. Leveraging the global learning capabilities of the self-attention mechanism, it captures essential features across brain regions over time, thereby enhancing EEG data decoding performance. Demonstrating superior performance, EEG-PatchFormer surpasses existing benchmarks in accuracy, area under the ROC curve (AUC), and macro-F1 score on a public cognitive attention dataset. The code can be found via: https://github.com/yi-ding-cs/EEG-PatchFormer
Comment: Implementation details are updated
Published: 2025

8. Classic4Children: Adapting Chinese Literary Classics for Children with Large Language Model

Author: Chen, Jiali, Hei, Xusen, Xue, Yuqi, Wu, Zihan, Xie, Jiayuan, and Cai, Yi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Chinese literary classics hold significant cultural and educational value, offering deep insights into morality, history, and human nature. These works often include classical Chinese and complex narratives, making them difficult for children to read. To bridge this gap, we introduce a child-friendly literary adaptation (CLA) task to adapt the Chinese literary classic into engaging and accessible text for children. However, recent large language models (LLMs) overlook children's reading preferences (i.e., vivid character portrayals, concise narrative structures, and appropriate readability), which poses challenges in CLA. In this paper, we propose a method called InstructChild, which augments the LLM with these preferences for adaptation. Specifically, we first obtain the characters' personalities and narrative structure as additional information for fine-grained instruction tuning. Then, we devise a readability metric as the reward to align the LLM with the children's reading level. Finally, a lookahead decoding strategy is applied to improve the readability of the generated text during inference. To support the evaluation of the CLA task, we construct the Classic4Children dataset, which comprises both the original and child-friendly versions of the Four Great Classical Novels of Chinese literature. Experimental results show that our InstructChild significantly improves automatic and human evaluation performance.
Comment: Accepted at NAACL 2025 Findings
Published: 2025

9. MAIA: A new detector concept for a 10 TeV muon collider

Author: Bell, Charles, Calzolari, Daniele, Carli, Christian, Di Petrillo, Karri Folan, Hillman, Micah, Holmes, Tova R., Jindariani, Sergo, Kennedy, Kiley E., Kwok, Ka Hei Martin, Lechner, Anton, Lee, Lawrence, Madlener, Thomas, Meloni, Federico, Ojalvo, Isobel, Pani, Priscilla, Powers, Rose, Rosser, Benjamin, Rozanov, Leo, Skoufaris, Kyriacos, Sledge, Elise, Tuna, Alexander, and Zhang, Junjia
Subjects: Physics - Instrumentation and Detectors, High Energy Physics - Experiment
Abstract: Muon colliders offer a compelling opportunity to explore the TeV scale and conduct precision tests of the Standard Model, all within a relatively compact geographical footprint. This paper introduces a new detector concept, MAIA (Muon Accelerator Instrumented Apparatus), optimized for $\sqrt{s}=10$ TeV $\mu\mu$ collisions. The detector features an all-silicon tracker immersed in a 5T solenoid field. High-granularity silicon-tungsten and iron-scintillator calorimeters surrounding the solenoid capture high-energy electronic and hadronic showers, respectively, and support particle-flow reconstruction. The outermost subsystem comprises an air-gap muon spectrometer, which enables standalone track reconstruction for high-momentum muons. The performance of the MAIA detector is evaluated in terms of differential particle reconstruction efficiencies and resolutions. Beam-induced background (BIB) simulations generated in FLUKA are overlaid with single particle gun samples to assess detector reconstruction capabilities under realistic experimental conditions. Even with BIB, reconstruction efficiencies exceed 95% for energetic tracks, photons, and neutrons in the central region of the detector. This paper outlines promising avenues of future work, including forward region optimization and opportunities for enhanced flavor/boosted object tagging, and addresses the technological assumptions needed to achieve the desired detector performance.
Comment: 41 pages, 24 figures
Published: 2025

10. Aspects of Spatially-Correlated Random Fields: Extreme-Value Statistics and Clustering Properties

Author: Choi, Ka Hei, Creswell, James, Kuhnel, Florian, and Schwarz, Dominik J.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics, General Relativity and Quantum Cosmology, High Energy Physics - Phenomenology, Physics - Data Analysis, Statistics and Probability
Abstract: Rare events of large-scale spatially-correlated exponential random fields are studied. The influence of spatial correlations on clustering and non-sphericity is investigated. The size of the performed simulations permits the study of beyond-$7.5$-sigma events ($1$ in $10^{13}$). As an application, this makes it possible to resolve individual Hubble patches which fulfill the condition for primordial black hole formation. It is argued that their mass spectrum is drastically altered due to co-collapse of clustered overdensities as well as the mutual threshold-lowering through the latter. Furthermore, the corresponding non-sphericities imply possibly large changes in the initial black hole spin distribution.
Comment: 7 pages, 6 figures
Published: 2025

11. One-Bit Sigma-Delta DFRC Waveform Design: Using Quantization Noise for Radar Probing

Author: Keung, Wai-Yiu, Cheng, Hei Victor, and Ma, Wing-Kin
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: Dual-functional radar-communication (DFRC) signal design has received much attention lately. We consider the scenario of one-bit massive multi-input multi-output (MIMO) wherein one-bit DACs are employed to save hardware costs. Specifically, a spatial Sigma-Delta $(\Sigma\Delta)$ modulation scheme is proposed for one-bit MIMO-DFRC waveform design. Unlike the existing approaches which require large-scale binary optimization, the proposed scheme performs $\Sigma\Delta$ modulation on a continuous-valued DFRC signal. The subsequent waveform design is formulated as a constrained least-squares problem, which can be efficiently solved. Moreover, we leverage quantization noise for radar probing purposes, rather than treating it as unwanted noise. Numerical results demonstrate that the proposed scheme performs well in both radar probing and downlink precoding.
Published: 2025

12. Convergence Analysis of Real-time Recurrent Learning (RTRL) for a class of Recurrent Neural Networks

Author: Lam, Samuel Chun-Hei, Sirignano, Justin, and Spiliopoulos, Konstantinos
Subjects: Computer Science - Machine Learning, Mathematics - Probability, Statistics - Machine Learning, 68T07 (Primary), 68T05, 60J20 (Secondary)
Abstract: Recurrent neural networks (RNNs) are commonly trained with the truncated backpropagation-through-time (TBPTT) algorithm. For the purposes of computational tractability, the TBPTT algorithm truncates the chain rule and calculates the gradient on a finite block of the overall data sequence. Such approximation could lead to significant inaccuracies, as the block length for the truncated backpropagation is typically limited to be much smaller than the overall sequence length. In contrast, Real-time recurrent learning (RTRL) is an online optimization algorithm which asymptotically follows the true gradient of the loss on the data sequence as the number of sequence time steps $t \rightarrow \infty$. RTRL forward propagates the derivatives of the RNN hidden/memory units with respect to the parameters and, using the forward derivatives, performs online updates of the parameters at each time step in the data sequence. RTRL's online forward propagation allows for exact optimization over extremely long data sequences, although it can be computationally costly for models with large numbers of parameters. We prove convergence of the RTRL algorithm for a class of RNNs. The convergence analysis establishes a fixed point for the joint distribution of the data sequence, RNN hidden layer, and the RNN hidden layer forward derivatives as the number of data samples from the sequence and the number of training steps tend to infinity. We prove convergence of the RTRL algorithm to a stationary point of the loss. Numerical studies illustrate our theoretical results. One potential application area for RTRL is the analysis of financial data, which typically involve long time series and models with small to medium numbers of parameters. This makes RTRL computationally tractable and a potentially appealing optimization method for training models. Thus, we include an example of RTRL applied to limit order book data.
Published: 2025

13. TrustRAG: Enhancing Robustness and Trustworthiness in RAG

Author: Zhou, Huichi, Lee, Kin-Hei, Zhan, Zhonghao, Chen, Yue, Li, Zhenhao, Wang, Zhaoyang, Haddadi, Hamed, and Yilmaz, Emine
Subjects: Computer Science - Computation and Language
Abstract: Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user queries. However, these systems remain vulnerable to corpus poisoning attacks that can significantly degrade LLM performance through the injection of malicious content. To address these challenges, we propose TrustRAG, a robust framework that systematically filters compromised and irrelevant contents before they are retrieved for generation. Our approach implements a two-stage defense mechanism. In the first stage, it employs K-means clustering to identify potential attack patterns in retrieved documents using cosine similarity and ROUGE metrics as guidance, effectively isolating suspicious content. In the second stage, it performs a self-assessment which detects malicious documents and resolves discrepancies between the model's internal knowledge and external information. TrustRAG functions as a plug-and-play, training-free module that integrates seamlessly with any language model, whether open or closed-source. In addition, TrustRAG maintains high contextual relevance while strengthening defenses against corpus poisoning attacks. Through extensive experimental validation, we demonstrate that TrustRAG delivers substantial improvements in retrieval accuracy, efficiency, and attack resistance compared to existing approaches across multiple model architectures and datasets. We have made TrustRAG available as open-source software at https://github.com/HuichiZhou/TrustRAG.
Published: 2025

14. Early Childhood Visual Arts Education: Teachers' Content Knowledge, Pedagogical Content Knowledge, and Challenges

Author: Suzannie K. Y. Leung, Joseph Wu, and Tung Hei Ho
Abstract: In the past, visual arts education in Hong Kong was not considered an important area of early childhood education. While the Hong Kong kindergarten curriculum has recently been updated to encourage creativity, there remains a lack of adequate visual arts education for young children. This deficiency stems from the fact that the visual arts receive minimal attention within Hong Kong teacher education programs. Little research has been conducted on how visual arts education is actually delivered in local kindergarten classrooms in Hong Kong and what kinds of artistic knowledge and skills kindergarten teachers need. Therefore, this study aimed to investigate kindergarten teachers' content knowledge and pedagogical content knowledge in early visual arts education (EVAE) and to identify the challenges they faced in teaching visual arts to children. The study surveyed 342 in-service kindergarten teachers in Hong Kong and conducted individual interviews with 12 participants. The findings revealed that Hong Kong kindergarten teachers generally performed well in terms of their pedagogical content knowledge, but they lacked content knowledge in various forms of early visual arts (EVA) and faced challenges in teaching visual arts effectively. This study has the potential to change how early childhood visual arts teaching is conceptualized and taught in Hong Kong and other Asian regions.
Published: 2025

15. Hilbert Transform on Graphs: Let There Be Phase

Author: Chan, Chun Hei Michael, Cionca, Alexandre, and Van De Ville, Dimitri
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: In the past years, many signal processing operations have been successfully adapted to the graph setting. One elegant and effective approach is to exploit the eigendecomposition of a graph shift operator (GSO), such as the adjacency or Laplacian operator, to define a graph Fourier transform when projecting graph signals on the corresponding basis. However, the extension of this scheme to directed graphs is challenging since the associated GSO is non-symmetric and, in general, not diagonalizable. Here, we build upon a recent framework that adds a minimal number of edges to allow diagonalization of the GSO and thus provide a proper graph Fourier transform. Furthermore, we show that such minimal addition of edges creates a cycle cover and that it is essential for the phase analysis of a signal throughout the graph. Concurrently, we propose a generalization of the Hilbert transform interpreted over the newfound cycle cover, which re-establishes intuitions from traditional Hilbert Transform, equivalent to the generalized Hilbert Transform on a single cycle. This generalization leads to a number of simple and elegant recipes to effectively exploit the phase information of graph signals provided by the graph Fourier transform. The feasibility of the approach is demonstrated on several examples.
Comment: Submitted to IEEE Signal Processing Letters (4 pages of contents and 1 possible extra reference page)
Published: 2024

16. A versatile method for nano-fabrication on diamond film: flexible diamond metasurfaces as a demonstration

Author: Wang, Yicheng, Jing, Jixiang, Luo, Yumeng, Ma, Linjie, Wang, Zhongqiang, Wang, Qi, Li, Kwai Hei, and Chu, Zhiqin
Subjects: Physics - Optics
Abstract: Diamond exhibits superb performance across a wide range of applications due to its numerous outstanding properties in electronic, photonic and quantum fields. Yet heterogeneous integration of diamond for on-chip functionalities, like 2D materials, remains challenging due to the difficulty of acquiring scalable, transferable and ultrathin diamond samples. Recently, edge-exposed exfoliation has been demonstrated as an effective way to produce wafer-scale, freestanding and ultrathin diamond films. However, the incompatibility of the newly developed diamond film with conventional nano-fabrication methods makes it difficult to fabricate diamond film into practical devices. Herein, we demonstrate mask-transferring by sugar as a versatile method for pattern-definition on diamond films, which shows excellent geometrical resolution and accuracy compared to conventional approaches. Additionally, based on this method, flexible all-diamond metasurfaces functioning as structural colors have been achieved, which indicates its huge potential for fabricating more diamond-related devices.
Published: 2024

17. Gamma-Convergence and Asymptotic Analysis for a Diffuse Domain Problem with Transmission Boundary Conditions

Author: Luong, Toai, Mengesha, Tadele, Wise, Steven M., and Wong, Ming Hei
Subjects: Mathematics - Analysis of PDEs, Mathematics - Numerical Analysis
Abstract: Diffuse domain methods (DDMs) have garnered significant attention for approximating solutions to partial differential equations on complex geometries. These methods implicitly represent the geometry by replacing the sharp boundary interface with a diffuse layer of thickness $\varepsilon$, which scales with the minimum grid size. This approach reformulates the original equations on an extended regular domain, incorporating boundary conditions through singular source terms. In this work, we conduct a matched asymptotic analysis of a DDM for a two-sided problem with transmission Robin boundary conditions. Our results show that, in one dimension, the solution of the diffuse domain approximation asymptotically converges to the solution of the original problem, with exactly first-order accuracy in $\varepsilon$. We provide numerical simulations that validate and illustrate the analytical result. Furthermore, for the Neumann boundary condition case, we show that the associated energy functional of the diffuse domain approximation $\Gamma$-converges to the energy functional of the original problem, and the solution of the diffuse domain approximation strongly converges, up to a subsequence, to the solution of the original problem in $H^1(\Omega)$, as $\varepsilon \to 0$.
Published: 2024

18. Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor

Author: Chen, Jiali, Hei, Xusen, Xue, Yuqi, Wei, Yuancheng, Xie, Jiayuan, Cai, Yi, and Li, Qing
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large multimodal models (LMMs) have shown remarkable performance in the visual commonsense reasoning (VCR) task, which aims to answer a multiple-choice question based on visual commonsense within an image. However, the ability of LMMs to correct potential visual commonsense errors in the distractor upon their occurrence is yet under-explored. Drawing inspiration from how a human teacher crafts challenging distractors to test students' comprehension of the concepts or skills and assists them in identifying and correcting errors toward the answer, we present pioneering research on having LMMs simulate this error correction process. To this end, we employ GPT-4 as a "teacher" to collect the explainable feedback dataset VCR-DF for error correction, which serves as a benchmark to evaluate the ability of LMMs to identify misconceptions and clarify reasons behind the error in VCR distractors toward final answers. In addition, we propose an LMM-based Pedagogical Expert Instructed Feedback Generation (PEIFG) model to incorporate the learnable expert prompts and multimodal instruction as guidance for feedback generation. Experimental results show that our PEIFG significantly outperforms existing LMMs. We believe that our benchmark provides a new direction for evaluating the capabilities of LMMs.
Comment: Accepted by ACM MM 2024
Published: 2024

19. Modular addition without black-boxes: Compressing explanations of MLPs that compute numerical integration

Author: Yip, Chun Hei, Agrawal, Rajashree, Chan, Lawrence, and Gross, Jason
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The goal of mechanistic interpretability is discovering simpler, low-rank algorithms implemented by models. While we can compress activations into features, compressing nonlinear feature-maps -- like MLP layers -- is an open problem. In this work, we present the first case study in rigorously compressing nonlinear feature-maps, which are the leading asymptotic bottleneck to compressing small transformer models. We work in the classic setting of the modular addition models, and target a non-vacuous bound on the behaviour of the ReLU MLP in time linear in the parameter-count of the circuit. To study the ReLU MLP analytically, we use the infinite-width lens, which turns post-activation matrix multiplications into approximate integrals. We discover a novel interpretation of the MLP layer in one-layer transformers implementing the "pizza" algorithm: the MLP can be understood as evaluating a quadrature scheme, where each neuron computes the area of a rectangle under the curve of a trigonometric integral identity. Our code is available at https://tinyurl.com/mod-add-integration.
Published: 2024

20. Revisiting Atomic Patterns for Elliptic Curve Scalar Multiplication Revealing Inherent Vulnerability to Simple SCA

Author: Sigourou, Alkistis Aikaterini, Dyka, Zoya, Li, Sze Hei, Langendoerfer, Peter, and Kabin, Ievgen
Subjects: Computer Science - Cryptography and Security
Abstract: Elliptic Curve Scalar Multiplication denoted as kP operation is the basic operation in all Elliptic Curve based cryptographic protocols. The atomicity principle and different atomic patterns for kP algorithms were proposed in the past as countermeasures against simple side-channel analysis. In this work, we investigated the resistance of a kP algorithm implemented in hardware using Longa's atomic patterns. We analysed its simulated power trace. We show in the example of our kP implementation for the NIST EC P-256 that the field squaring operations are distinguishable from the field multiplications even if they are performed by the same field multiplier, due to the addressing of the second multiplicand. This inherent vulnerability of atomic patterns can be successfully exploited for revealing the scalar k.
Published: 2024

21. Wall-Proximity Matters: Understanding the Effect of Device Placement with Respect to the Wall for Indoor Wireless Sensing

Author: Wang, He, Ge, Yunpeng, and Ho, Ivan Wang-Hei
Subjects: Computer Science - Networking and Internet Architecture
Abstract: Wi-Fi sensing has been extensively explored for various applications, including vital sign monitoring, human activity recognition, indoor localization, and tracking. However, practical implementation in real-world scenarios is hindered by unstable sensing performance and limited knowledge of wireless sensing coverage. While previous works have aimed to address these challenges, they have overlooked the impact of walls on sensing capabilities in indoor environments. To fill this gap, we present a theoretical model that accounts for the effect of wall-device distance on sensing coverage. By incorporating both the wall-reflected path and the line-of-sight (LoS) path, we develop a comprehensive sensing coverage model tailored for indoor environments. This model demonstrates that strategically deploying the transmitter and receiver in proximity to the wall within a specific range can significantly expand sensing coverage. We assess the properties of our model through experiments in respiratory monitoring and stationary crowd counting applications, showcasing a notable 11.2% improvement in counting accuracy. These findings pave the way for optimized deployment strategies in Wi-Fi sensing, facilitating more effective and accurate sensing solutions across various applications., Comment: This work has been submitted to the IEEE for possible publication
Published: 2024

22. BiCSI: A Binary Encoding and Fingerprint-Based Matching Algorithm for Wi-Fi Indoor Positioning

Author: Tang, Pei, Guo, Jingtao, and Ho, Ivan Wang-Hei
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Information Theory
Abstract: Traditional global positioning systems often underperform indoors, whereas Wi-Fi has become an effective medium for various radio sensing services. Specifically, utilizing channel state information (CSI) from Wi-Fi networks provides a non-contact method for precise indoor positioning; yet, accurately interpreting the complex CSI matrix to develop a reliable strategy for physical similarity measurement remains challenging. This paper presents BiCSI, which merges binary encoding with fingerprint-based techniques to improve position matching for detecting semi-stationary targets. Inspired by gene sequencing processes, BiCSI initially converts CSI matrices into binary sequences and employs Hamming distances to evaluate signal similarity. The results show that BiCSI achieves an average accuracy above 98% and a mean absolute error (MAE) of less than three centimeters, outperforming algorithms directly dependent on physical measurements by at least two-fold. Moreover, the proposed method for extracting feature vectors from CSI matrices as fingerprints significantly reduces data storage requirements to the kilobyte range, far below the megabytes typically required by conventional machine learning models. Additionally, the results demonstrate that the proposed algorithm adapts well to multiple physical similarity metrics, and remains robust over different time periods, enhancing its utility and versatility in various scenarios., Comment: 10 pages, 14 figures, this article was submitted to IEEE for possible publication
Published: 2024

23. Generative LiDAR Editing with Controllable Novel Object Layouts

Author: Ho, Shing-Hei, Thach, Bao, and Zhu, Minghan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: We propose a framework to edit real-world Lidar scans with novel object layouts while preserving a realistic background environment. Compared to the synthetic data generation frameworks where Lidar point clouds are generated from scratch, our framework focuses on new scenario generation in a given background environment, and our method also provides labels for the generated data. This approach ensures the generated data remains relevant to the specific environment, aiding both the development and the evaluation of algorithms in real-world scenarios. Compared with novel view synthesis, our framework allows the creation of counterfactual scenarios with significant changes in the object layout and does not rely on multi-frame optimization. In our framework, the object removal and insertion are supported by generative background inpainting and object point cloud completion, and the entire pipeline is built upon spherical voxelization, which realizes the correct Lidar projective geometry by construction. Experiments show that our framework generates realistic Lidar scans with object layout changes and benefits the development of Lidar-based self-driving systems., Comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA). 6 pages, 7 figures
Published: 2024

24. On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code

Author: Hossen, Md Imran and Hei, Xiali
Subjects: Computer Science - Software Engineering, Computer Science - Cryptography and Security
Abstract: The advent of instruction-tuned Large Language Models designed for coding tasks (Code LLMs) has transformed software engineering practices. However, their robustness against various input challenges remains a critical concern. This study introduces DegradePrompter, a novel method designed to systematically evaluate the robustness of instruction-tuned Code LLMs. We assess the impact of diverse input challenges on the functionality and correctness of generated code using rigorous metrics and established benchmarks. Our comprehensive evaluation includes five state-of-the-art open-source models and three production-grade closed-source models, revealing varying degrees of robustness. Open-source models demonstrate an increased susceptibility to input perturbations, resulting in declines in functional correctness ranging from 12% to 34%. In contrast, commercial models demonstrate relatively greater resilience, with performance degradation ranging from 3% to 24%. To enhance the robustness of the models against these vulnerabilities, we investigate a straightforward yet effective mitigation strategy. Our findings highlight the need for robust defense mechanisms and comprehensive evaluations during both the development and deployment phases to ensure the resilience and reliability of automated code generation systems.
Published: 2024

25. Personalized Generative AI in VR for Enhanced Engagement: Eye-Tracking Insights into Cultural Heritage Learning through Neapolitan Pizza Making

Author: Lau, Ka Hei Carrie, Sen, Sema, Stark, Philipp, Bozkir, Efe, and Kasneci, Enkelejda
Subjects: Computer Science - Human-Computer Interaction
Abstract: Virtual Reality (VR) and Generative Artificial Intelligence (Gen-AI) are transforming personalized learning, particularly in intangible cultural heritage (ICH) education. However, designing immersive experiences that enhance engagement without overwhelming learners presents a challenge. This study examines the impact of personalized AI narration on user engagement and attention in a VR environment through eye-tracking metrics. In a controlled experiment with 54 participants, we explored three levels of personalization (high, moderate, none) in a Neapolitan pizza-making task, measuring attention and cognitive load through fixation duration, saccade duration, and pupil diameter. Results indicate that high personalization increased engagement by 64.1% over no personalization (p < 0.001). Furthermore, regression analysis reveals specific eye-tracking metrics significantly predict gameplay duration, underscoring eye-tracking's potential to capture real-time engagement. These findings support the use of eye-tracking to inform the development of adaptive VR learning experiences. Future work may integrate subjective assessments to better understand users' underlying motivations.
Published: 2024

26. Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection

Author: Lyu, Pengfei, Yeung, Pak-Hei, Yu, Xiaosheng, Wu, Chengdong, and Rajapakse, Jagath C.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The rapid development of deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images. However, existing deep learning-based RGB-T SOD models suffer from two major limitations. First, Transformer-based models with quadratic complexity are computationally expensive and memory-intensive, limiting their application in high-resolution bi-modal feature fusion. Second, even when these models converge to an optimal solution, there remains a frequency gap between the prediction and ground-truth. To overcome these limitations, we propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet), for accurate RGB-T SOD. To address the computational complexity when dealing with high-resolution images, we leverage the efficiency of fast Fourier transform with linear complexity to design three key components: (1) the Modal-coordinated Perception Attention, which fuses RGB and thermal modalities with enhanced multi-dimensional representation; (2) the Frequency-decomposed Edge-aware Block, which clarifies object edges by deeply decomposing and enhancing frequency components of low-level features; and (3) the Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. To mitigate the frequency gap, we propose Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bi-modal edge information in the Fourier domain. Extensive experiments on four RGB-T SOD benchmark datasets demonstrate that DFENet outperforms fifteen existing state-of-the-art RGB-T SOD models. Comprehensive ablation studies further validate the value and effectiveness of our newly proposed components. The code is available at https://github.com/JoshuaLPF/DFENet., Comment: 12 pages, 13 figures. Submitted to Journal on April 29, 2024
Published: 2024

27. A Hybrid Scheme for Fuzzy Dark Matter Simulations Combining the Schrödinger and Hamilton-Jacobi-Madelung Equations

Author: Kunkel, Alexander, Chan, Hei Yin Jowett, Schive, Hsi-Yu, Huang, Hsinhao, and Liao, Pin-Yu
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics
Abstract: This paper introduces a hybrid numerical scheme for the fuzzy dark matter model: it combines a wave-based approach to solve the Schrödinger equation using Fourier continuations with Gram polynomials and a fluid-based approach to solve the Hamilton-Jacobi-Madelung equations. This hybrid scheme facilitates zoom-in simulations for cosmological volumes beyond the capabilities of wave-based solvers alone and accurately simulates the full nonlinear dynamics of fuzzy dark matter. We detail the implementation of a Hamilton-Jacobi-Madelung solver, the methodology for phase matching at fluid-wave boundaries, the development of a local pseudospectral wave solver based on Fourier continuations, new grid refinement criteria for both fluid and wave solvers, an interpolation algorithm based on Fourier continuations, and the integration of these building blocks into the adaptive mesh refinement code GAMER. The superiority of the scheme is demonstrated through various performance and accuracy tests, tracking the linear power spectrum evolution in a 10 Mpc/h box, and a hybrid cosmological simulation in a 5.6 Mpc/h box. The corresponding code is published as part of the GAMER project on https://github.com/gamer-project/gamer., Comment: 25 pages, 18 figures, submitted to ApJS
Published: 2024

28. Practitioner Paper: Decoding Intellectual Property: Acoustic and Magnetic Side-channel Attack on a 3D Printer

Author: Jamarani, Amirhossein, Tu, Yazhou, and Hei, Xiali
Subjects: Computer Science - Cryptography and Security
Abstract: The widespread accessibility and ease of use of additive manufacturing (AM), widely recognized as 3D printing, has put Intellectual Property (IP) at great risk of theft. As 3D printers emit acoustic and magnetic signals while printing, the signals can be captured and analyzed using a smartphone for the purpose of IP attack. This is an instance of physical-to-cyber exploitation, as there is no direct contact with the 3D printer. Although cyber vulnerabilities in 3D printers are becoming more apparent, the methods for protecting IPs are yet to be fully investigated. The threat scenarios in previous works have mainly rested on advanced recording devices for data collection and entailed placing the device very close to the 3D printer. However, our work demonstrates the feasibility of reconstructing G-codes by performing side-channel attacks on a 3D printer using a smartphone from greater distances. By training models using Gradient Boosted Decision Trees, our prediction results for each axial movement, stepper, nozzle, and rotor speed achieve high accuracy, with a mean of 98.80%, without any intrusiveness. We effectively deploy the model in a real-world examination, achieving a Mean Tendency Error (MTE) of 4.47% on a plain G-code design., Comment: 22 pages, 14 figures, EAI SmartSP 2024 - 2nd EAI International Conference on Security and Privacy in Cyber-Physical Systems and Smart Vehicles
Published: 2024

29. Nonreciprocal interaction and entanglement between two superconducting qubits

Author: Ren, Yu-Meng, Pan, Xue-Feng, Yao, Xiao-Yu, Huo, Xiao-Wen, Zheng, Jun-Cong, Hei, Xin-Lei, Qiao, Yi-Fan, and Li, Peng-Bo
Subjects: Quantum Physics
Abstract: Nonreciprocal interaction between two spatially separated subsystems plays a crucial role in signal processing and quantum networks. Here, we propose an efficient scheme to achieve nonreciprocal interaction and entanglement between two qubits by combining coherent and dissipative couplings in a superconducting platform, where two coherently coupled transmon qubits simultaneously interact with a transmission line waveguide. The coherent interaction between the transmon qubits can be achieved via capacitive coupling or via an intermediary cavity mode, while the dissipative interaction is induced by the transmission line via reservoir engineering. With high tunability of superconducting qubits, their positions along the transmission line can be adjusted to tune the dissipative coupling, making it possible to tailor reciprocal and nonreciprocal interactions between the qubits. A fully nonreciprocal interaction can be achieved when the separation between the two qubits is $(4n+3)\lambda_{0} /4$, where $n$ is an integer and $\lambda_{0}$ is the photon wavelength. This nonreciprocal interaction enables the generation of nonreciprocal entanglement between the two transmon qubits. Furthermore, applying a drive field to one of the qubits can stabilize the system into a nonreciprocal steady-state entangled state. Remarkably, the nonreciprocal interaction in this work does not rely on the presence of nonlinearity or complex configurations, which gives it more potential applications in designing nonreciprocal quantum devices, processing quantum information, and building quantum networks., Comment: 11 pages, 7 figures
Published: 2024

30. CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR

Author: Buldu, Kadir Burak, Özdel, Süleyman, Lau, Ka Hei Carrie, Wang, Mengdi, Saad, Daniel, Schönborn, Sofie, Boch, Auxane, Kasneci, Enkelejda, and Bozkir, Efe
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: Recent developments in computer graphics, machine learning, and sensor technologies enable numerous opportunities for extended reality (XR) setups for everyday life, from skills training to entertainment. With large corporations offering affordable consumer-grade head-mounted displays (HMDs), XR will likely become pervasive, and HMDs will develop as personal devices like smartphones and tablets. However, having intelligent spaces and naturalistic interactions in XR is as important as technological advances so that users grow their engagement in virtual and augmented spaces. To this end, large language model (LLM)--powered non-player characters (NPCs) with speech-to-text (STT) and text-to-speech (TTS) models bring significant advantages over conventional or pre-scripted NPCs for facilitating more natural conversational user interfaces (CUIs) in XR. This paper provides the community with an open-source, customizable, extendable, and privacy-aware Unity package, CUIfy, that facilitates speech-based NPC-user interaction with widely used LLMs, STT, and TTS models. Our package also supports multiple LLM-powered NPCs per environment and minimizes latency between different computational models through streaming to achieve usable interactions between users and NPCs. We publish our source code in the following repository: https://gitlab.lrz.de/hctl/cuify, Comment: 7th IEEE International Conference on Artificial Intelligence & eXtended and Virtual Reality (IEEE AIxVR 2025)
Published: 2024

31. Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection

Author: Lyu, Pengfei, Yeung, Pak-Hei, Cheng, Xiufei, Yu, Xiaosheng, Wu, Chengdong, and Rajapakse, Jagath C.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Unmanned aerial vehicle (UAV)-based bi-modal salient object detection (BSOD) aims to segment salient objects in a scene utilizing complementary cues in unaligned RGB and thermal image pairs. However, the high computational expense of existing UAV-based BSOD models limits their applicability to real-world UAV devices. To address this problem, we propose an efficient Fourier filter network with contrastive learning that achieves both real-time and accurate performance. Specifically, we first design a semantic contrastive alignment loss to align the two modalities at the semantic level, which facilitates mutual refinement in a parameter-free way. Second, inspired by the fast Fourier transform that obtains global relevance in linear complexity, we propose synchronized alignment fusion, which aligns and fuses bi-modal features in the channel and spatial dimensions by a hierarchical filtering mechanism. Our proposed model, AlignSal, reduces the number of parameters by 70.0%, decreases the floating point operations by 49.4%, and increases the inference speed by 152.5% compared to the cutting-edge BSOD model (i.e., MROS). Extensive experiments on the UAV RGB-T 2400 and three weakly aligned datasets demonstrate that AlignSal achieves both real-time inference speed and better performance and generalizability compared to sixteen state-of-the-art BSOD models across most evaluation metrics. In addition, our ablation studies further verify AlignSal's potential in boosting the performance of existing aligned BSOD models on UAV-based unaligned data. The code is available at: https://github.com/JoshuaLPF/AlignSal., Comment: 11 pages, 7 figures
Published: 2024

32. RSSI-Assisted CSI-Based Passenger Counting with Multiple Wi-Fi Receivers

Author: Guo, Jingtao, Zhuang, Wenhao, Mao, Yuyi, and Ho, Ivan Wang-Hei
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning
Abstract: Passenger counting is crucial for public transport vehicle scheduling and traffic capacity evaluation. However, most existing methods are either costly or have low counting accuracy, leading to the recent use of Wi-Fi signals for this purpose. In this paper, we develop an efficient edge computing-based passenger counting system consisting of multiple Wi-Fi receivers and an edge server. It leverages channel state information (CSI) and received signal strength indicator (RSSI) to facilitate the collaboration among multiple receivers. Specifically, we design a novel CSI feature fusion module called Adaptive RSSI-weighted CSI Feature Concatenation, which integrates locally extracted CSI and RSSI features from multiple receivers for information fusion at the edge server. Performance of our proposed system is evaluated using a real-world dataset collected from a double-decker bus in Hong Kong, with up to 20 passengers. The experimental results reveal that our system achieves an average accuracy and F1-score of over 94%, surpassing other cooperative sensing baselines by at least 2.27% in accuracy and 2.34% in F1-score., Comment: 6 pages, 9 figures, this article was submitted to IEEE for possible publication
Published: 2024

33. PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion

Author: Zhu, Runsong, Qiu, Shi, Wu, Qianyi, Hui, Ka-Hei, Heng, Pheng-Ann, and Fu, Chi-Wing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Panoptic lifting is an effective technique to address the 3D panoptic segmentation task by unprojecting 2D panoptic segmentations from multi-views to a 3D scene. However, the quality of its results largely depends on the 2D segmentations, which could be noisy and error-prone, so its performance often drops significantly for complex scenes. In this work, we design a new pipeline coined PCF-Lift based on our Probabilistic Contrastive Fusion (PCF) to learn and embed probabilistic features throughout our pipeline to actively consider inaccurate segmentations and inconsistent instance IDs. Technically, we first model the probabilistic feature embeddings through multivariate Gaussian distributions. To fuse the probabilistic features, we incorporate the probability product kernel into the contrastive loss formulation and design a cross-view constraint to enhance the feature consistency across different views. For the inference, we introduce a new probabilistic clustering method to effectively associate prototype features with the underlying 3D object instances for the generation of consistent panoptic segmentation results. Further, we provide a theoretical analysis to justify the superiority of the proposed probabilistic solution. By conducting extensive experiments, our PCF-Lift not only significantly outperforms the state-of-the-art methods on widely used benchmarks including the ScanNet dataset and the challenging Messy Room dataset (4.4% improvement of scene-level PQ), but also demonstrates strong robustness when incorporating various 2D segmentation models or different levels of hand-crafted noise., Comment: ECCV 2024. The code is publicly available at https://github.com/Runsong123/PCF-Lift
Published: 2024

34. Records from the S-Matrix Marathon: A Timeless History of Time

Author: Lee, Mang Hei Gordon, Pajer, Enrico, Giroux, Mathieu, Hannesdottir, Holmfridur S., Mizera, Sebastian, and Pasiecznik, Celina
Subjects: High Energy Physics - Theory
Abstract: By directly probing the initial conditions of our universe, cosmological surveys offer us a unique observational handle on quantum field theory in curved spacetime with dynamical gravity and might even allow us to glean information about a full theory of quantum gravity. Here we report on recent progress to study the natural observables in the problem, namely cosmological correlators. After setting the stage, we review results from three different approaches. First, we present the in-out formalism as an interesting alternative to the well-known in-in formalism and stress some of its advantages, such as the derivation of recursion relations, correlators cutting rules and a proposal for a de Sitter scattering matrix. Second, we tackle the important open problem of constructing effective theories in curved spacetime, which generally requires an open quantum system approach. Third, we provide an executive summary of general properties of the field-theoretic wavefunction that follow from symmetries, unitarity, causality and locality. We describe how these properties can be leveraged to bootstrap all tree-level results and we discuss loop contributions. These notes are based on a series of lectures held during the S-Matrix Marathon workshop at the Institute for Advanced Study on 11-22 March 2024., Comment: A chapter to be published by Springer Lecture Notes in Physics
Published: 2024

35. Internalizing ASR with Implicit Chain of Thought for Efficient Speech-to-Speech Conversational LLM

Author: Yuen, Robin Shing-Hei, Tse, Timothy Tin-Long, and Zhu, Jian
Subjects: Computer Science - Computation and Language
Abstract: Current speech-based LLMs are predominantly trained on extensive ASR and TTS datasets, excelling in tasks related to these domains. However, their ability to handle direct speech-to-speech conversations remains notably constrained. These models often rely on an ASR-to-TTS chain-of-thought pipeline, converting speech into text for processing before generating audio responses, which introduces latency and loses audio features. We propose a method that implicitly internalizes ASR chain of thought into a speech LLM, enhancing its native speech understanding capabilities. Our approach reduces latency and improves the model's native understanding of speech, paving the way for more efficient and natural real-time audio interactions. We also release a large-scale synthetic conversational dataset to facilitate further research., Comment: Updated for reviewer comments
Published: 2024

36. Wrapped in Anansi's Web: Unweaving the Impacts of Generative-AI Personalization and VR Immersion in Oral Storytelling

Author: Lau, Ka Hei Carrie, Yun, Bhada, Saruba, Samuel, Bozkir, Efe, and Kasneci, Enkelejda
Subjects: Computer Science - Human-Computer Interaction
Abstract: Oral traditions, vital to cultural identity, are losing relevance among youth due to the dominance of modern media. This study addresses the revitalization of these traditions by reconnecting young people with folklore. We introduce Anansi the Spider VR, a novel virtual space that combines first-person virtual reality (VR) with generative artificial intelligence (Gen-AI)-driven narrative personalization. This space immerses users in the Anansi Spider story, empowering them to influence the narrative as they envision themselves as the `protagonists,' thereby enhancing personal reflection. In a 2 by 2 between-subjects study with 48 participants, we employed a mixed-method approach to measure user engagement and changes in interest, complemented by semi-structured interviews providing qualitative insights into personalization and immersion. Our results indicate that personalization in VR significantly boosts engagement and cultural learning interest. We recommend that future studies using VR and Gen-AI to revitalize oral storytelling prioritize respecting cultural integrity and honoring original storytellers and communities.
Published: 2024

37. Practical Investigation on the Distinguishability of Longa's Atomic Patterns

Author: Li, Sze Hei, Dyka, Zoya, Sigourou, Alkistis Aikaterini, Langendoerfer, Peter, and Kabin, Ievgen
Subjects: Computer Science - Cryptography and Security
Abstract: This paper investigates the distinguishability of the atomic patterns for elliptic curve point doubling and addition operations proposed by Longa. We implemented a binary elliptic curve scalar multiplication kP algorithm with Longa's atomic patterns for the NIST elliptic curve P-256 using the open-source cryptographic library FLECC in C. We measured and analysed an electromagnetic trace of a single kP execution on a microcontroller (TI Launchpad F28379 board). Due to various technical limitations, significant differences in the execution time and the shapes of the atomic blocks could not be determined. Further investigations of the side channel analysis-resistance can be performed based on this work. Last but not least, we examined and corrected Longa's atomic patterns corresponding to formulae proposed by Longa.
Published: 2024

38. The central limit theorem for entries of random matrices with specific rank over finite fields

Author: Chan, Chin Hei and Xiong, Maosheng
Subjects: Mathematics - Number Theory, Mathematics - Combinatorics, 15B52, 11T99, 05C50, 60F05
Abstract: Let $\mathbb{F}_q$ be the finite field of order $q$, and $\mathcal{A}$ a non-empty proper subset of $\mathbb{F}_q$. Let $\mathbf{M}$ be a random $m \times n$ matrix of rank $r$ over $\mathbb{F}_q$ taken with uniform distribution. It was proved recently by Sanna that as $m,n \to \infty$ and $r,q,\mathcal{A}$ are fixed, the number of entries of $\mathbf{M}$ in $\mathcal{A}$ approaches a normal distribution. The question was raised as to whether or not one can still obtain a central limit theorem of some sort when $r$ goes to infinity in a way controlled by $m$ and $n$. In this paper we answer this question affirmatively.
Published: 2024

39. Securing the Future: Exploring Privacy Risks and Security Questions in Robotic Systems

Author: Afroze, Diba, Tu, Yazhou, and Hei, Xiali
Subjects: Computer Science - Robotics, Computer Science - Human-Computer Interaction
Abstract: The integration of artificial intelligence, especially large language models in robotics, has led to rapid advancements in the field. We are now observing an unprecedented surge in the use of robots in our daily lives. The development and continual improvements of robots are moving at an astonishing pace. Although these remarkable improvements facilitate and enhance our lives, several security and privacy concerns have not been resolved yet. Therefore, it has become crucial to address the privacy and security threats of robotic systems while improving our experiences. In this paper, we aim to present existing applications and threats of robotics, anticipated future evolution, and the security and privacy issues they may imply. We present a series of open questions for researchers and practitioners to explore further., Comment: 11 pages, Conference Paper
Published: 2024

40. Scalable Reshaping of Diamond Particles via Programmable Nanosculpting

Author: Zhang, Tongtong, Sun, Fuqiang, Wang, Yaorong, Li, Yingchi, Wang, Jing, Wang, Zhongqiang, Li, Kwai Hei, Zhu, Ye, Wang, Qi, Shao, Lei, Wong, Ngai, Lei, Dangyuan, Lin, Yuan, and Chu, Zhiqin
Subjects: Condensed Matter - Materials Science, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Diamond particles have many interesting properties and possible applications. However, producing diamond particles with well-defined shapes at scale is challenging because diamonds are chemically inert and extremely hard. Here, we show air oxidation, a routine method for purifying diamonds, can be used to precisely shape diamond particles at scale. By exploiting the distinct reactivities of different crystal facets and defects inside the diamond, layer-by-layer outward-to-inward and inward-to-outward oxidation produced diverse diamond shapes including sphere, twisted surface, pyramidal islands, inverted pyramids, nano-flowers, and hollow polygons. The nanosculpted diamonds had more and finer features that enabled them to outperform the original raw diamonds in various applications. Using experimental observations and Monte Carlo simulations, we built a shape library that guides the design and fabrication of diamond particles with well-defined shapes and functional value. Our study presents a simple, economical and scalable way to produce shape-customized diamonds for various photonics, catalysis, quantum and information technology applications.
Published: 2024

41. Exploring code portability solutions for HEP with a particle tracking test code

Author: Ather, Hammad, Berkman, Sophie, Cerati, Giuseppe, Kortelainen, Matti, Kwok, Ka Hei Martin, Lantz, Steven, Lee, Seyong, Norris, Boyana, Reid, Michael, Hall, Allison Reinsvold, Riley, Daniel, Strelchenko, Alexei, and Wang, Cong
Subjects: High Energy Physics - Experiment
Abstract: Traditionally, high energy physics (HEP) experiments have relied on x86 CPUs for the majority of their significant computing needs. As the field looks ahead to the next generation of experiments such as DUNE and the High-Luminosity LHC, the computing demands are expected to increase dramatically. To cope with this increase, it will be necessary to take advantage of all available computing resources, including GPUs from different vendors. A broad landscape of code portability tools -- including compiler pragma-based approaches, abstraction libraries, and other tools -- allow the same source code to run efficiently on multiple architectures. In this paper, we use a test code taken from a HEP tracking algorithm to compare the performance and experience of implementing different portability solutions.
Published: 2024

42. Distinguishability Investigation on Longa's Atomic Patterns when used as a Basis for Implementing Elliptic Curve Scalar Multiplication Algorithms

Author: Li, Sze Hei
Subjects: Computer Science - Cryptography and Security
Abstract: In the evolving landscape of cryptographic security, the robustness of Elliptic Curve Cryptography (ECC) against side-channel analysis (SCA) attacks is of paramount importance due to the widespread use of ECC and the growing sophistication of SCAs. This thesis delves into the investigation of Longa's atomic patterns applied within Elliptic Curve scalar multiplication algorithms, assessing their resistance to horizontal SCAs. The research employs these atomic patterns in practical implementation on a microcontroller (Texas Instruments Launchpad F28379 board) using the open-source cryptographic library FLECC in C. In our analysis, we only focused on the distinguishability of the first atomic block in the Elliptic Curve point doubling and point addition patterns. Due to various technical limitations, we were unable to determine significant differences in the execution time and the shapes of the atomic blocks. Further investigations of the SCA-resistance can be performed based on this work. A significant contribution of this work is the identification and correction of several discrepancies in Longa's original atomic patterns. This thesis marks the first practical implementation of Longa's patterns, extending the theoretical research into empirical analysis.
Comment: MSc thesis
Published: 2024

43. Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm

Author: Zhao, Jinwei, Gori, Marco, Betti, Alessandro, Melacci, Stefano, Zhang, Hongtao, Liu, Jiedong, and Hei, Xinhong
Subjects: Computer Science - Machine Learning
Abstract: Gradient descent (GD) and stochastic gradient descent (SGD) have been widely used in a large number of application domains. Therefore, understanding the dynamics of GD and improving its convergence speed is still of great importance. This paper carefully analyzes the dynamics of GD based on the terminal attractor at different stages of its gradient flow. On the basis of the terminal sliding mode theory and the terminal attractor theory, four adaptive learning rates are designed. Their performances are investigated in light of a detailed theoretical investigation, and the running times of the learning procedures are evaluated and compared. The total times of their learning processes are also studied in detail. To evaluate their effectiveness, various simulation results are investigated on a function approximation problem and an image classification problem.
Comment: 8 pages, 4 figures
Published: 2024

44. Randomized low-rank Runge-Kutta methods

Author: Lam, Hei Yin, Ceruti, Gianluca, and Kressner, Daniel
Subjects: Mathematics - Numerical Analysis, 65F30, 68W20
Abstract: This work proposes and analyzes a new class of numerical integrators for computing low-rank approximations to solutions of matrix differential equations. We combine an explicit Runge-Kutta method with repeated randomized low-rank approximation to keep the rank of the stages limited. The so-called generalized Nystr\"om method is particularly well suited for this purpose; it builds low-rank approximations from random sketches of the discretized dynamics. In contrast, all existing dynamical low-rank approximation methods are deterministic and usually perform tangent space projections to limit rank growth. Using such tangential projections can result in larger error compared to approximating the dynamics directly. Moreover, sketching allows for increased flexibility and efficiency by choosing structured random matrices adapted to the structure of the matrix differential equation. Under suitable assumptions, we establish moment and tail bounds on the error of our randomized low-rank Runge-Kutta methods. When combining the classical Runge-Kutta method with generalized Nystr\"om, we obtain a method called Rand RK4, which exhibits fourth-order convergence numerically -- up to the low-rank approximation error. For a modified variant of Rand RK4, we also establish fourth-order convergence theoretically. Numerical experiments for a range of examples from the literature demonstrate that randomized low-rank Runge-Kutta methods compare favorably with two popular dynamical low-rank approximation methods, in terms of robustness and speed of convergence.
Comment: 27 pages
Published: 2024

45. ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models

Author: Ju, Qi, Hei, Falin, Fang, Zhemei, and Luo, Yunfeng
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement Learning (RL) is highly dependent on the meticulous design of the reward function. However, accurately assigning rewards to each state-action pair in Long-Term RL (LTRL) problems is a formidable challenge. Consequently, RL agents are predominantly trained with expert guidance. Drawing on the principles of ordinal utility theory from economics, we propose a novel reward estimation algorithm: ELO-Rating based RL (ERRL). This approach is distinguished by two main features. Firstly, it leverages expert preferences over trajectories instead of cardinal rewards (utilities) to compute the ELO rating of each trajectory as its reward. Secondly, a new reward redistribution algorithm is introduced to mitigate training volatility in the absence of a fixed anchor reward. Our method demonstrates superior performance over several leading baselines in long-term scenarios (extending up to 5000 steps), where conventional RL algorithms falter. Furthermore, we conduct a thorough analysis of how expert preferences affect the outcomes.
Published: 2024

46. Mirror contrastive loss based sliding window transformer for subject-independent motor imagery based EEG signal recognition

Author: Luo, Jing, Mao, Qi, Shi, Weiwei, Shi, Zhenghao, Wang, Xiaofan, Lu, Xiaofeng, and Hei, Xinhong
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: While deep learning models have been extensively utilized in motor imagery based EEG signal recognition, they often operate as black boxes. Motivated by neurological findings indicating that the mental imagery of left or right-hand movement induces event-related desynchronization (ERD) in the contralateral sensorimotor area of the brain, we propose a Mirror Contrastive Loss based Sliding Window Transformer (MCL-SWT) to enhance subject-independent motor imagery-based EEG signal recognition. Specifically, our proposed mirror contrastive loss enhances sensitivity to the spatial location of ERD by contrasting the original EEG signals with their mirror counterparts, that is, mirror EEG signals generated by interchanging the channels of the left and right hemispheres of the EEG signals. Moreover, we introduce a temporal sliding window transformer that computes self-attention scores from high temporal resolution features, thereby improving model performance with manageable computational complexity. We evaluate the performance of MCL-SWT on subject-independent motor imagery EEG signal recognition tasks, and our experimental results demonstrate that MCL-SWT achieved accuracies of 66.48% and 75.62%, surpassing the state-of-the-art (SOTA) model by 2.82% and 2.17%, respectively. Furthermore, ablation experiments confirm the effectiveness of the proposed mirror contrastive loss. A code demo of MCL-SWT is available at https://github.com/roniusLuo/MCL_SWT.
Comment: This paper has been accepted by the Fourth International Workshop on Human Brain and Artificial Intelligence, joint workshop of the 33rd International Joint Conference on Artificial Intelligence, Jeju Island, South Korea, from August 3rd to August 9th, 2024
Published: 2024

47. Evaluating Usability and Engagement of Large Language Models in Virtual Reality for Traditional Scottish Curling

Author: Lau, Ka Hei Carrie, Bozkir, Efe, Gao, Hong, and Kasneci, Enkelejda
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: This paper explores the innovative application of Large Language Models (LLMs) in Virtual Reality (VR) environments to promote heritage education, focusing on traditional Scottish curling presented in the game ``Scottish Bonspiel VR''. Our study compares the effectiveness of LLM-based chatbots with pre-defined scripted chatbots, evaluating key criteria such as usability, user engagement, and learning outcomes. The results show that LLM-based chatbots significantly improve interactivity and engagement, creating a more dynamic and immersive learning environment. This integration helps document and preserve cultural heritage and enhances dissemination processes, which are crucial for safeguarding intangible cultural heritage (ICH) amid environmental changes. Furthermore, the study highlights the potential of novel technologies in education to provide immersive experiences that foster a deeper appreciation of cultural heritage. These findings support the wider application of LLMs and VR in cultural education to address global challenges and promote sustainable practices to preserve and enhance cultural heritage.
Published: 2024

48. Multimodal Relational Triple Extraction with Query-based Entity Object Transformer

Author: Hei, Lei, An, Ning, Liao, Tingjing, Ma, Qi, Wang, Jiaqi, and Ren, Feiliang
Subjects: Computer Science - Information Retrieval
Abstract: Multimodal Relation Extraction is crucial for constructing flexible and realistic knowledge graphs. Recent studies focus on extracting the relation type with entity pairs present in different modalities, such as one entity in the text and another in the image. However, existing approaches require entities and objects to be given beforehand, which is costly and impractical. To address this limitation, we propose a novel task, Multimodal Entity-Object Relational Triple Extraction, which aims to extract all triples (entity span, relation, object region) from image-text pairs. To facilitate this study, we modified a multimodal relation extraction dataset MORE, which includes 21 relation types, to create a new dataset containing 20,264 triples, averaging 5.75 triples per image-text pair. Moreover, we propose QEOT, a query-based model with a selective attention mechanism, to dynamically explore the interaction and fusion of textual and visual information. In particular, the proposed method can simultaneously accomplish entity extraction, relation classification, and object detection with a set of queries. Our method is suitable for downstream applications and reduces the error accumulation caused by pipeline-style approaches. Extensive experimental results demonstrate that our proposed method outperforms the existing baselines by 8.06% and achieves state-of-the-art performance.
Comment: 15 pages, 7 figures, preprint
Published: 2024

49. A local diagnostic program for unitary evolution in general space-times

Author: Choi, Ka Hei, Hofmann, Stefan, and Schneider, Marc
Subjects: High Energy Physics - Theory, General Relativity and Quantum Cosmology
Abstract: We present a local framework for investigating non-unitary evolution groups pertinent to effective field theories in general semi-classical spacetimes. Our approach is based on a rigorous local stability analysis of the algebra of observables and solely employs geometric concepts in the functional representation of quantum field theory. In this representation, it is possible to construct infinitely many self-adjoint extensions of the canonical momentum field at the kinematic level, and by the usual functional calculus arguments this holds for the Hamiltonian as well. However, these self-adjoint domains have only the trivial wave functional in common with the solution space of the functional Schr\"odinger equation. This is related to the existence of boundaries in configuration field space which can be penetrated by the probability flux, causing probability to leak into regions in configuration field space that require a more fundamental description. As a consequence, the evolution admits no unitary representation. Instead, in the absence of ghosts, the evolution is represented by contractive semi-groups in the semiclassical approximation. This allows one to quantify the unitarity loss and, in turn, to assess the quality of the semi-classical approximation. We perform numerical experiments based on our formal investigations to determine regions in cosmological spacetimes where the semiclassical approximation breaks down for free quantum fields.
Comment: 20 pages, 7 figures
Published: 2024

50. Approximate Relational Reasoning for Higher-Order Probabilistic Programs

Author: Haselwarter, Philipp G., Li, Kwing Hei, Aguirre, Alejandro, Gregersen, Simon Oddershede, Tassarotti, Joseph, and Birkedal, Lars
Subjects: Computer Science - Logic in Computer Science, Computer Science - Programming Languages
Abstract: Properties such as provable security and correctness for randomized programs are naturally expressed relationally as approximate equivalences. As a result, a number of relational program logics have been developed to reason about such approximate equivalences of probabilistic programs. However, existing approximate relational logics are mostly restricted to first-order programs without general state. In this paper we develop Approxis, a higher-order approximate relational separation logic for reasoning about approximate equivalence of programs written in an expressive ML-like language with discrete probabilistic sampling, higher-order functions, and higher-order state. The Approxis logic recasts the concept of error credits in the relational setting to reason about relational approximation, which allows for expressive notions of modularity and composition, a range of new approximate relational rules, and an internalization of a standard limiting argument for showing exact probabilistic equivalences by approximation. We also use Approxis to develop a logical relation model that quantifies over error credits, which can be used to prove exact contextual equivalence. We demonstrate the flexibility of our approach on a range of examples, including the PRP/PRF switching lemma, IND\$-CPA security of an encryption scheme, and a collection of rejection samplers. All of the results have been mechanized in the Coq proof assistant and the Iris separation logic framework.
Comment: Camera-ready POPL submission including additional appendix
Published: 2024
