3,467 results for "Joyce, A"
Search Results
2. TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
- Author
Irvin, Jeremy Andrew, Liu, Emily Ruoyu, Chen, Joyce Chuyi, Dormoy, Ines, Kim, Jinyoung, Khanna, Samar, Zheng, Zhuo, and Ermon, Stefano
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Large vision and language assistants have enabled new capabilities for interpreting natural images. These approaches have recently been adapted to earth observation data, but they are only able to handle single image inputs, limiting their use for many real-world tasks. In this work, we develop a new vision and language assistant called TEOChat that can engage in conversations about temporal sequences of earth observation data. To train TEOChat, we curate an instruction-following dataset composed of many single image and temporal tasks including building change and damage assessment, semantic change detection, and temporal scene classification. We show that TEOChat can perform a wide variety of spatial and temporal reasoning tasks, substantially outperforming previous vision and language assistants, and even achieving comparable or better performance than specialist models trained to perform these specific tasks. Furthermore, TEOChat achieves impressive zero-shot performance on a change detection and change question answering dataset, outperforms GPT-4o and Gemini 1.5 Pro on multiple temporal tasks, and exhibits stronger single image capabilities than a comparable single EO image instruction-following model. We publicly release our data, models, and code at https://github.com/ermongroup/TEOChat .
- Published
- 2024
3. SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
- Author
Yang, John, Jimenez, Carlos E., Zhang, Alex L., Lieret, Kilian, Yang, Joyce, Wu, Xindi, Press, Ori, Muennighoff, Niklas, Synnaeve, Gabriel, Narasimhan, Karthik R., Yang, Diyi, Wang, Sida I., and Press, Ofir
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Software Engineering
- Abstract
Autonomous systems for software engineering are now capable of fixing bugs and developing features. These systems are commonly evaluated on SWE-bench (Jimenez et al., 2024a), which assesses their ability to solve software issues from GitHub repositories. However, SWE-bench uses only Python repositories, with problem statements presented predominantly as text and lacking visual elements such as images. This limited coverage motivates our inquiry into how existing systems might perform on unrepresented software engineering domains (e.g., front-end, game development, DevOps), which use different programming languages and paradigms. Therefore, we propose SWE-bench Multimodal (SWE-bench M) to evaluate systems on their ability to fix bugs in visual, user-facing JavaScript software. SWE-bench M features 617 task instances collected from 17 JavaScript libraries used for web interface design, diagramming, data visualization, syntax highlighting, and interactive mapping. Each SWE-bench M task instance contains at least one image in its problem statement or unit tests. Our analysis finds that top-performing SWE-bench systems struggle with SWE-bench M, revealing limitations in visual problem-solving and cross-language generalization. Lastly, we show that SWE-agent's flexible, language-agnostic features enable it to substantially outperform alternatives on SWE-bench M, resolving 12% of task instances compared to 6% for the next best system.
- Published
- 2024
4. Reconstructing Galaxy Cluster Mass Maps using Score-based Generative Modeling
- Author
Hsu, Alan, Ho, Matthew, Lin, Joyce, Markey, Carleen, Ntampaka, Michelle, Trac, Hy, and Póczos, Barnabás
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, Computer Science - Machine Learning
- Abstract
We present a novel approach to reconstruct gas and dark matter projected density maps of galaxy clusters using score-based generative modeling. Our diffusion model takes in mock SZ and X-ray images as conditional observations, and generates realizations of corresponding gas and dark matter maps by sampling from a learned data posterior. We train and validate the performance of our model using mock data from a hydrodynamical cosmological simulation. The model accurately reconstructs both the mean and spread of the radial density profiles in the spatial domain to within 5%, indicating that the model is able to distinguish between clusters of different sizes. In the spectral domain, the model achieves close-to-unity values for the bias and cross-correlation coefficients, indicating that the model can accurately probe cluster structures on both large and small scales. Our experiments demonstrate the ability of score models to learn a strong, nonlinear, and unbiased mapping between input observables and fundamental density distributions of galaxy clusters. These diffusion models can be further fine-tuned to take in additional observables as inputs, generalized to real observations, and used to predict unknown density distributions of galaxy clusters., Comment: 15 pages, 9 figures, submitted to The Open Journal of Astrophysics
- Published
- 2024
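The sampling procedure described in the abstract above — conditioning on observations and drawing realizations by reversing a diffusion process — can be illustrated with a minimal score-based sketch. This is not the paper's model: the learned, image-conditioned score network is replaced here by the analytic score of a 1-D Gaussian "data" distribution (the mean 2.0 and variance 0.25 are arbitrary stand-ins), so the reverse-time SDE should simply recover that distribution's mean and spread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "data" distribution the sampler should recover: N(mu, s2).
mu, s2 = 2.0, 0.25

# Variance-exploding forward SDE with sigma^2(t) = t: the perturbed
# marginal at time t is N(mu, s2 + t), so its score is analytic.
def score(x, t):
    return -(x - mu) / (s2 + t)

T, n_steps, n_samples = 25.0, 500, 20000
dt = T / n_steps

# Start from the wide prior at t = T (mean unknown to the sampler) and
# integrate the reverse-time SDE with Euler-Maruyama:
#   x_{t-dt} = x_t + g^2 * score(x_t, t) * dt + g * sqrt(dt) * z,  g = 1.
x = rng.normal(0.0, np.sqrt(s2 + T), size=n_samples)
t = T
for _ in range(n_steps):
    x = x + score(x, t) * dt + np.sqrt(dt) * rng.normal(size=n_samples)
    t -= dt

# The sample mean and variance should approach mu = 2.0 and s2 = 0.25.
print(x.mean(), x.var())
```

In the paper, the analytic `score` above is replaced by a trained network that also receives the SZ/X-ray observations, which is what turns this unconditional sketch into posterior sampling.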
5. Large-scale, Longitudinal, Hybrid Participatory Design Program to Create Navigation Technology for the Blind
- Author
Chung, Daeun Joyce, Guoji, Muya, Mindel, Nina, Malkin, Alexis, Alberotrio, Fernando, Lowe, Shane, McNally, Chris, Xavier, Casandra, and Ruvolo, Paul
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
Empowering people who are blind or visually impaired (BVI) to enhance their orientation and mobility skills is critical to equalizing their access to social and economic opportunities. To address this crucial challenge, we employed a novel design process based on a large-scale, longitudinal, community-based structure. Across three annual programs we engaged with the BVI community in online and in-person modes. In total, our team included 67 BVI participatory design participants online, 11 BVI co-designers in person, and 4 BVI program coordinators. Through this design process we built a mobile application that enables users to generate, share, and navigate maps of indoor and outdoor environments without the need to instrument each environment with beacons or fiducial markers. We evaluated this app at a healthcare facility, and participants in the evaluation rated the app highly with respect to its design, features, and potential for positive impact on quality of life.
- Published
- 2024
6. Thematic Analysis with Open-Source Generative AI and Machine Learning: A New Method for Inductive Qualitative Codebook Development
- Author
Katz, Andrew, Fleming, Gabriella Coloyan, and Main, Joyce
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
- Abstract
This paper aims to answer one central question: to what extent can open-source generative text models be used in a workflow to approximate thematic analysis in social science research? To answer this question, we present the Generative AI-enabled Theme Organization and Structuring (GATOS) workflow, which uses open-source machine learning techniques, natural language processing tools, and generative text models to facilitate thematic analysis. To establish the validity of the method, we present three case studies applying the GATOS workflow, leveraging these models and techniques to inductively create codebooks similar to traditional procedures using thematic analysis. Specifically, we investigate the extent to which a workflow comprising open-source models and tools can inductively produce codebooks that approach the known space of themes and sub-themes. To address the challenge of gleaning insights from these texts, we combine open-source generative text models, retrieval-augmented generation, and prompt engineering to identify codes and themes in large volumes of text, i.e., generate a qualitative codebook. The process mimics an inductive coding process that researchers might use in traditional thematic analysis by reading text one unit of analysis at a time, considering existing codes already in the codebook, and then deciding whether to generate a new code based on whether the extant codebook provides adequate thematic coverage. We demonstrate this workflow using three synthetic datasets from hypothetical organizational research settings: a study of teammate feedback in teamwork settings, a study of organizational cultures of ethical behavior, and a study of employee perspectives about returning to their offices after the pandemic. We show that the GATOS workflow is able to identify themes in the text that were used to generate the original synthetic datasets.
- Published
- 2024
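The unit-by-unit inductive coding loop the GATOS abstract describes can be sketched in outline. Everything below is a schematic stand-in: `generate_code` plays the role of the generative model plus prompt, and the coverage test is reduced to keyword overlap, whereas the actual workflow uses retrieval-augmented generation over an open-source LLM:

```python
# Sketch of an inductive codebook-building loop (hypothetical helpers;
# the real GATOS workflow uses an open-source LLM with RAG in place of
# the keyword heuristics below).

def covers(codebook, unit):
    """Crude stand-in for 'does the extant codebook provide adequate
    thematic coverage for this unit of analysis?'"""
    words = set(unit.lower().split())
    return any(words & set(code.lower().split()) for code in codebook)

def generate_code(unit):
    """Stand-in for prompting a generative model to propose a new code."""
    return " ".join(unit.lower().split()[:2])  # e.g. first two words

def build_codebook(units):
    codebook = []
    for unit in units:                        # one unit of analysis at a time
        if not covers(codebook, unit):        # consider existing codes first
            codebook.append(generate_code(unit))  # else add a new code
    return codebook

units = [
    "feedback from teammates was vague",
    "teammates rarely gave feedback",
    "returning to the office felt stressful",
]
print(build_codebook(units))
```

The second unit is absorbed by the first code, so only genuinely new themes grow the codebook — the same stopping rule the abstract attributes to a human inductive coder.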
7. The Einstein Probe transient EP240414a: Linking Fast X-ray Transients, Gamma-ray Bursts and Luminous Fast Blue Optical Transients
- Author
van Dalen, Joyce N. D., Levan, Andrew J., Jonker, Peter G., Malesani, Daniele B., Izzo, Luca, Sarin, Nikhil, Quirola-Vásquez, Jonathan, Sánchez, Daniel Mata, Postigo, Antonio de Ugarte, van Hoof, Agnes P. C., Torres, Manuel A. P., Schulze, Steve, Littlefair, Stuart P., Chrimes, Ashley, Ravasio, Maria E., Bauer, Franz E., Martin-Carrillo, Antonio, Fraser, Morgan, van der Horst, Alexander J., Jakobsson, Pall, O'Brien, Paul, De Pasquale, Massimiliano, Pugliese, Giovanna, Sollerman, Jesper, Tanvir, Nial R., Zafar, Tayyaba, Anderson, Joseph P., Galbany, Lluís, Gal-Yam, Avishay, Gromadzki, Mariusz, Muller-Bravo, Tomas E., Ragosta, Fabio, and Terwel, Jacco H.
- Subjects
Astrophysics - High Energy Astrophysical Phenomena
- Abstract
Detections of fast X-ray transients (FXTs) have accrued over the last few decades, but their origin has remained mysterious. Rapid progress is now being made thanks to timely discoveries and localisations with the Einstein Probe mission. Early results indicate that FXTs may frequently, but not always, be associated with gamma-ray bursts (GRBs). Here, we report on the multi-wavelength counterpart of FXT EP240414a, which has no reported gamma-ray counterpart. The transient is located 25.7 kpc in projection from a massive galaxy at $z=0.40$. We perform comprehensive photometric and spectroscopic follow-up. The optical light curve shows at least three distinct emission episodes with timescales of $\sim 1, 4$ and 15 days and peak absolute magnitudes of $M_R \sim -20$, $-21$, and $-19.5$, respectively. The optical spectrum at early times is extremely blue, inconsistent with afterglow emission. It may arise from the interaction of both jet and supernova shock waves with the stellar envelope and a dense circumstellar medium, as has been suggested for some luminous fast blue optical transients (LFBOTs). At late times, the spectrum evolves to a broad-lined Type Ic supernova, similar to those seen in collapsar long-GRBs. This implies that the progenitor of EP240414a is a massive star creating a jet-forming supernova inside a dense envelope, resulting in an X-ray outburst with a luminosity of $\sim 10^{48}$ erg s$^{-1}$ and the complex observed optical/IR light curves. If correct, this argues for a causal link between the progenitors of long-GRBs, FXTs and LFBOTs., Comment: 36 pages, 13 figures, submitted to ApJ
- Published
- 2024
8. RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
- Author
Dai, Yinpei, Lee, Jayjun, Fazeli, Nima, and Chai, Joyce
- Subjects
Computer Science - Robotics, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Developing robust and correctable visuomotor policies for robotic manipulation is challenging due to the lack of self-recovery mechanisms from failures and the limitations of simple language instructions in guiding robot actions. To address these issues, we propose a scalable data generation pipeline that automatically augments expert demonstrations with failure recovery trajectories and fine-grained language annotations for training. We then introduce Rich languAge-guided failure reCovERy (RACER), a supervisor-actor framework, which combines failure recovery data with rich language descriptions to enhance robot control. RACER features a vision-language model (VLM) that acts as an online supervisor, providing detailed language guidance for error correction and task execution, and a language-conditioned visuomotor policy as an actor that predicts the next actions. Our experimental results show that RACER outperforms the state-of-the-art Robotic View Transformer (RVT) on RLBench across various evaluation settings, including standard long-horizon tasks, dynamic goal-change tasks, and zero-shot unseen tasks, achieving superior performance in both simulated and real-world environments. Videos and code are available at: https://rich-language-failure-recovery.github.io., Comment: Project Website: https://rich-language-failure-recovery.github.io
- Published
- 2024
9. Haptic Shoulder for Rendering Biomechanically Accurate Joint Limits for Human-Robot Physical Interactions
- Author
Peiros, Elizabeth, Joyce, Calvin, Murugesan, Tarun, Nguyen, Roger, Fiorini, Isabella, Galibut, Rizzi, and Yip, Michael C.
- Subjects
Computer Science - Robotics
- Abstract
Physical human-robot interaction (pHRI) is a rapidly evolving research field with significant implications for physical therapy, search and rescue, and telemedicine. However, a major challenge lies in accurately understanding human constraints and safety without conducting physical human-subject experiments, which require IRB approval. Concerns regarding human studies include safety, repeatability, and scalability in the number and diversity of participants. This paper examines whether a physical approximation can serve as a stand-in for human subjects to enhance robot autonomy for physical assistance. It introduces the SHULDRD (Shoulder Haptic Universal Limb Dynamic Repositioning Device), an economical and anatomically similar device designed for real-time testing and deployment of pHRI planning tasks onto robots in the real world. SHULDRD replicates human shoulder motion, providing crucial force feedback and safety data. The device's open-source CAD and software facilitate easy construction and use, ensuring broad accessibility for researchers. By providing a flexible platform able to emulate an unlimited pool of human subjects, ensure repeatable trials, and provide quantitative metrics to assess the effectiveness of the robotic intervention, SHULDRD aims to improve the safety and efficacy of human-robot physical interactions., Comment: Submitted to ICRA '25
- Published
- 2024
10. A Convolutional Neural Network-based Ensemble Post-processing with Data Augmentation for Tropical Cyclone Precipitation Forecasts
- Author
Chen, Sing-Wen, Juang, Joyce, Wang, Charlotte, Chang, Hui-Ling, Hong, Jing-Shan, and Hsiao, Chuhsing Kate
- Subjects
Statistics - Applications, Physics - Geophysics, 86A05 (Primary) 62-08 (Secondary)
- Abstract
Heavy precipitation from tropical cyclones (TCs) may result in disasters, such as floods and landslides, leading to substantial economic damage and loss of life. Prediction of TC precipitation based on ensemble post-processing procedures using machine learning (ML) approaches has received considerable attention for its flexibility in modeling and its computational power in managing complex models. However, when applying ML techniques to TC precipitation for a specific area, the available observation data are typically insufficient for comprehensive training, validation, and testing of the ML model, primarily due to the rapid movement of TCs. We propose to use the convolutional neural network (CNN) as a deep ML model to leverage the spatial information of precipitation. The proposed model has three distinct features that differentiate it from traditional CNNs applied in meteorology. First, it utilizes data augmentation to alleviate challenges posed by the small sample size. Second, it contains geographical and dynamic variables to account for area-specific features and the relative distance between the study area and the moving TC. Third, it applies unequal weights to accommodate the temporal structure in the training data when calculating the objective function. The proposed CNN-all model is then illustrated with TC Soudelor's impact on Taiwan. Soudelor was the strongest TC of the 2015 Pacific typhoon season. The results show that the inclusion of augmented data and dynamic variables improves the prediction of heavy precipitation. The proposed CNN-all outperforms traditional CNN models, based on the continuous ranked probability skill score (CRPSS), probability plots, and reliability diagrams. The proposed model has the potential to be utilized in a wide range of meteorological studies., Comment: 21 pages, 6 figures, 2 tables, 2 supplementary figures
- Published
- 2024
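The third model feature listed in the abstract above — unequal weights that respect the temporal structure of the training data when calculating the objective — can be sketched as a weighted loss. The exponential down-weighting of older samples below is an illustrative assumption; the abstract does not specify the exact weighting scheme:

```python
import numpy as np

def temporally_weighted_mse(y_true, y_pred, decay=0.9):
    """MSE in which later (more recent) training samples receive larger
    weights. The exponential scheme is illustrative only, not the
    paper's actual weighting."""
    n = len(y_true)
    # w_i = decay^(n-1-i): the most recent sample gets weight 1.
    w = decay ** np.arange(n - 1, -1, -1)
    w = w / w.sum()  # normalize weights to sum to 1
    err2 = (np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2
    return float(np.sum(w * err2))
```

With `decay=1.0` this reduces to the ordinary mean squared error; with `decay<1`, an error on the newest sample costs more than the same error on the oldest one.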
11. Unveiling the 5$f$ electron hybridization process in UPd$_2$Al$_3$ via ARPES and Time-resolved PES
- Author
Song, Jiao-Jiao, Wu, Qi-Yi, Zhang, Chen, Gilbertson, Steve M., Riseborough, Peter S., Rusz, Jan, Joyce, John J., Graham, Kevin S., Olson, Clifford G., Tobash, Paul H., Bauer, Eric D., Chen, Bo, Liu, Hao, Duan, Yu-Xia, Oppeneer, Peter M., Rodriguez, George, Durakiewicz, Tomasz, and Meng, Jian-Qiao
- Subjects
Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Superconductivity
- Abstract
This study investigates the 5$f$-electron-conduction electron hybridization process in the heavy fermion superconductor UPd$_2$Al$_3$ using a combination of angle-resolved photoemission spectroscopy (ARPES) and time-resolved photoemission spectroscopy (tr-PES). ARPES measurements reveal the formation of a hybridization gap at a temperature of approximately 75 K, which becomes more pronounced as the temperature decreases. Notably, the persistence of a flat U 5$f$ band at temperatures well above the hybridization onset challenges conventional understanding. Our findings demonstrate a non-monotonic temperature dependence of the quasiparticle relaxation time, with an anomalous decrease at 20 K, suggesting complex electronic and magnetic interactions. These findings provide detailed insights into the 5$f$-electron hybridization process in UPd$_2$Al$_3$, with significant implications for the understanding of heavy fermion superconductivity and the role of 5$f$-electron hybridization in uranium-based materials., Comment: 5 pages, 4 figures
- Published
- 2024
12. Matching seismic masses for RR Lyrae-type and oscillating red horizontal-branch stars in M4
- Author
Molnár, László, Netzel, Henryka, Howell, Madeline, Kalup, Csilla, and Joyce, Meridith
- Subjects
Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Astrophysics of Galaxies
- Abstract
Globular clusters offer a powerful way to test the properties of stellar populations and the late stages of low-mass stellar evolution. In this paper we study oscillating giant stars and overtone RR Lyrae-type pulsators in the nearest globular cluster, M4, with the help of high-precision, continuous light curves collected by the Kepler space telescope in the K2 mission. We determine the frequency composition of five RRc stars and model their physical parameters with a grid of linear pulsation models. We are able, for the first time, to compare seismic masses of RR Lyrae stars directly to the masses of the very similar red horizontal-branch stars in the same stellar population, independently determined from asteroseismic scaling relations. We find a close match, with an average seismic mass of $0.651\pm0.028\,M_\odot$ for RR Lyrae stars and $0.657\pm0.034\,M_\odot$ for red horizontal-branch stars. While the validity of our RR Lyrae masses still relies on the similarity of neighboring horizontal-branch subgroups, this result strongly indicates that RRc stars may indeed exhibit high-degree, $l = 8$ and 9 non-radial modes, and that modeling these modes can provide realistic mass estimates. We also determine the helium content of the cluster to be $Y = 0.266\pm 0.008$, compare the seismic masses for our sample of RR Lyrae stars to theoretical mass relations, and highlight the limitations of these relations., Comment: 11 pages, 10 figures, submitted to A&A for review
- Published
- 2024
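The asteroseismic scaling relations mentioned in the abstract above are standard in the field: stellar mass scales with the frequency of maximum power $\nu_{\max}$, the large frequency separation $\Delta\nu$, and the effective temperature, relative to their solar reference values. A direct transcription follows; the reference constants are the commonly used solar values, and the example inputs are illustrative, not the M4 measurements:

```python
# Canonical asteroseismic scaling relation for stellar mass:
#   M/M_sun = (nu_max/nu_max_sun)^3 * (dnu/dnu_sun)^-4 * (Teff/Teff_sun)^1.5
NU_MAX_SUN = 3090.0   # frequency of maximum power, muHz
DELTA_NU_SUN = 135.1  # large frequency separation, muHz
TEFF_SUN = 5777.0     # effective temperature, K

def seismic_mass(nu_max, delta_nu, teff):
    """Mass in solar units from global oscillation parameters."""
    return ((nu_max / NU_MAX_SUN) ** 3
            * (delta_nu / DELTA_NU_SUN) ** -4
            * (teff / TEFF_SUN) ** 1.5)

# Sanity check: solar inputs return one solar mass.
print(seismic_mass(3090.0, 135.1, 5777.0))  # 1.0
```

This is the "independent" mass route the abstract contrasts with the pulsation-model masses of the RRc stars.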
13. Impact of Transit on Mobility, Equity, and Economy in the Chicago Metropolitan Region
- Author
Verbas, Omer, Cokyasar, Taner, Joyce-Johnson, Seamus, Wainwright, Scott, Coates, Maeve, Rousseau, Aymeric, Aloisi, Jim, Stewart, Anson, and Auld, Joshua
- Subjects
Mathematics - Optimization and Control
- Abstract
Transit is essential for urban transportation and achieving net-zero targets. In urban areas like the Chicago Metropolitan Region, transit enhances mobility and connects people, fostering a dynamic economy. To quantify the mobility and selected economic impacts of transit, we use a novel agent-based simulation model POLARIS to compare baseline service against a scenario in which transit is completely removed. The transit-removal scenario assumes higher car ownership and results in higher traffic congestion, numerous activity cancellations, and economic decline. In this scenario, average travel times increase by 14.2% regionally and 34.7% within the City of Chicago. The resulting congestion causes significant activity cancellations despite increased car ownership: 11.8% of non-work and 2.8% of work/school activities regionally, totaling an 8.6% overall cancellation rate. In the city, non-work cancellations would reach 26.9%, and work/school cancellations 7.3%, leading to a 19.9% overall cancellation rate. The impact varies between groups. Women and lower-income individuals are more likely to cancel activities than men and higher-income groups. Women account for 53.7% of non-work and 53.0% of total cancellations. The lowest 40% income group experiences 50.2% of non-work and 48.0% of overall cancellations. Combined, activity cancellations, travel time losses, and increased car ownership cost the region $35.4 billion. With annual public transit funding at $2.7 billion, the ratio is 13 to 1, underscoring transit's critical role in mobility, equity, and economic health.
- Published
- 2024
14. The Continuous Electron Beam Accelerator Facility at 12 GeV
- Author
Adderley, P. A., Ahmed, S., Allison, T., Bachimanchi, R., Baggett, K., BastaniNejad, M., Bevins, B., Bevins, M., Bickley, M., Bodenstein, R. M., Bogacz, S. A., Bruker, M., Burrill, A., Cardman, L., Creel, J., Chao, Y. -C., Cheng, G., Ciovati, G., Chattopadhyay, S., Clark, J., Clemens, W. A., Croke, G., Daly, E., Davis, G. K., Delayen, J., De Silva, S. U., Dickson, R., Diaz, M., Drury, M., Doolittle, L., Douglas, D., Feldl, E., Fischer, J., Freyberger, A., Ganni, V., Geng, R. L., Ginsburg, C., Gomez, J., Grames, J., Gubeli, J., Guo, J., Hannon, F., Hansknecht, J., Harwood, L., Henry, J., Hernandez-Garcia, C., Higgins, S., Higinbotham, D., Hofler, A. S., Hiatt, T., Hogan, J., Hovater, C., Hutton, A., Jones, C., Jordan, K., Joyce, M., Kazimi, R., Keesee, M., Kelley, M. J., Keppel, C., Kimber, A., King, L., Kjeldsen, P., Kneisel, P., Koval, J., Krafft, G. A., Lahti, G., Larrieu, T., Lauze, R., Leemann, C., Legg, R., Li, R., Lin, F., Machie, D., Mammosser, J., Macha, K., Mahoney, K., Marhauser, F., Mastracci, B., Matalevich, J., McCarter, J., McCaughan, M., Merminga, L., Michaud, R., Morozov, V., Mounts, C., Musson, J., Nelson, R., Oren, W., Overton, R. B., Palacios-Serrano, G., Park, H. -K., Phillips, L., Philip, S., Pilat, F., Plawski, T., Poelker, M., Powers, P., Powers, T., Preble, J., Reilly, T., Rimmer, R., Reece, C., Robertson, H., Roblin, Y., Rode, C., Satogata, T., Seidman, D. J., Seryi, A., Shabalina, A., Shin, I., Slominski, R., Slominski, C., Spata, M., Spell, D., Spradlin, J., Stirbet, M., Stutzman, M. L., Suhring, S., Surles-Law, K., Suleiman, R., Tennant, C., Tian, H., Turner, D., Tiefenback, M., Trofimova, O., Valente, A. -M., Wang, H., Wang, Y., White, K., Whitlatch, C., Whitlatch, T., Wiseman, M., Wissman, M. J., Wu, G., Yang, S., Yunn, B., Zhang, S., and Zhang, Y.
- Subjects
Physics - Accelerator Physics, Nuclear Experiment
- Abstract
This review paper describes the energy-upgraded CEBAF accelerator. This superconducting linac has achieved 12 GeV beam energy by adding 11 new high-performance cryomodules containing eighty-eight superconducting cavities that have operated CW at an average accelerating gradient of 20 MV/m. After reviewing the attributes and performance of the previous 6 GeV CEBAF accelerator, we discuss the upgraded CEBAF accelerator system in detail with particular attention paid to the new beam acceleration systems. In addition to doubling the acceleration in each linac, the upgrade included improving the beam recirculation magnets, adding more helium cooling capacity to allow the newly installed modules to run cold, adding a new experimental hall, and improving numerous other accelerator components. We review several of the techniques deployed to operate and analyze the accelerator performance, and document system operating experience and performance. In the final portion of the document, we present much of the current planning regarding projects to improve accelerator performance and enhance operating margins, and our plans for ensuring CEBAF operates reliably into the future. For the benefit of potential users of CEBAF, the performance and quality measures for beam delivered to each of the experimental halls is summarized in the appendix., Comment: 66 pages, 73 figures, 21 tables
- Published
- 2024
15. Multimodal Methods for Analyzing Learning and Training Environments: A Systematic Literature Review
- Author
Cohn, Clayton, Davalos, Eduardo, Vatral, Caleb, Fonteles, Joyce Horn, Wang, Hanchen David, Ma, Meiyi, and Biswas, Gautam
- Subjects
Computer Science - Machine Learning, Computer Science - Multimedia
- Abstract
Recent technological advancements have enhanced our ability to collect and analyze rich multimodal data (e.g., speech, video, and eye gaze) to better inform learning and training experiences. While previous reviews have focused on parts of the multimodal pipeline (e.g., conceptual models and data fusion), a comprehensive literature review on the methods informing multimodal learning and training environments has not been conducted. This literature review provides an in-depth analysis of research methods in these environments, proposing a taxonomy and framework that encapsulates recent methodological advances in this field and characterizes the multimodal domain in terms of five modality groups: Natural Language, Video, Sensors, Human-Centered, and Environment Logs. We introduce a novel data fusion category -- mid fusion -- and a graph-based technique for refining literature reviews, termed citation graph pruning. Our analysis reveals that leveraging multiple modalities offers a more holistic understanding of the behaviors and outcomes of learners and trainees. Even when multimodality does not enhance predictive accuracy, it often uncovers patterns that contextualize and elucidate unimodal data, revealing subtleties that a single modality may miss. However, there remains a need for further research to bridge the divide between multimodal learning and training studies and foundational AI research., Comment: Submitted to ACM Computing Surveys. Currently under review
- Published
- 2024
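The "citation graph pruning" technique in the review above is only named, not specified, in the abstract. One plausible reading — dropping candidate papers with too few citation links to a seed set — can be sketched as follows; the two-link threshold, the helper names, and the paper identifiers are all assumptions, not the authors' actual criterion:

```python
def prune_candidates(seeds, cites, min_links=2):
    """Keep candidate papers with at least `min_links` citation links
    (in either direction) to the seed set. `cites[p]` lists the papers
    that p cites; all identifiers here are hypothetical."""
    keep = set(seeds)
    for p, refs in cites.items():
        if p in seeds:
            continue
        links = sum(1 for q in refs if q in seeds)               # p cites a seed
        links += sum(1 for s in seeds if p in cites.get(s, ()))  # a seed cites p
        if links >= min_links:
            keep.add(p)
    return keep

# Hypothetical corpus: "X" is cited by both seeds, so it survives pruning.
cites = {"A": ["X"], "B": ["X"], "C": ["A"], "X": [], "D": ["E"], "E": []}
print(prune_candidates({"A", "B"}, cites))
```

The effect is the one the review attributes to the technique: the candidate pool shrinks to papers tightly coupled to the core literature before manual screening.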
16. Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification
- Author
Murindanyi, Sudi, Nakatumba-Nabende, Joyce, Sanya, Rahman, Nakibuule, Rose, and Katumba, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
The increasing popularity of Artificial Intelligence in recent years has led to a surge in interest in image classification, especially in the agricultural sector. With the help of Computer Vision, Machine Learning, and Deep Learning, the sector has undergone a significant transformation, leading to the development of new techniques for crop classification in the field. Despite the extensive research on various image classification techniques, most have limitations such as low accuracy, limited use of data, and a lack of reporting of model size and prediction time. The most significant limitation of all is the need for model explainability. This research evaluates four different approaches for crop classification, namely traditional ML with handcrafted feature extraction methods like SIFT, ORB, and Color Histogram; a custom-designed CNN and an established DL architecture, AlexNet; transfer learning on five models pre-trained on ImageNet, namely EfficientNetV2, ResNet152V2, Xception, Inception-ResNetV2, and MobileNetV3; and cutting-edge foundation models like YOLOv8 and DINOv2, a self-supervised Vision Transformer model. All models performed well, but Xception outperformed all of them in terms of generalization, achieving 98% accuracy on the test data, with a model size of 80.03 MB and a prediction time of 0.0633 seconds. A key aspect of this research was the application of Explainable AI to provide the explainability of all the models. This paper presents the explainability of the Xception model with LIME, SHAP, and GradCAM, ensuring transparency and trustworthiness in the models' predictions. This study highlights the importance of selecting the right model according to task-specific needs. It also underscores the important role of explainability in deploying AI in agriculture, providing insightful information to help enhance AI-driven crop management strategies.
- Published
- 2024
17. A Buddy for Betelgeuse: Binarity as the Origin of the Long Secondary Period in $\alpha$ Orionis
- Author
Goldberg, Jared A., Joyce, Meridith, and Molnár, László
- Subjects
Astrophysics - Solar and Stellar Astrophysics, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
We predict the existence of $\alpha$ Ori B, a low-mass companion orbiting Betelgeuse. This is motivated by the presence of a 2170-day Long Secondary Period (LSP) in Betelgeuse's lightcurve, a periodicity $\approx5$ times longer than the star's 416-day fundamental radial pulsation mode. While binarity is currently the leading hypothesis for LSPs in general, the LSP and the radial velocity variation observed in Betelgeuse, taken together, necessitate a revision of the prevailing physical picture. The lightcurve-RV phase difference requires a companion to be behind Betelgeuse at the LSP luminosity minimum, 180 degrees out of phase with the system orientation associated with occultation. We demonstrate the consistency of this model with available observational constraints and identify tensions in all other proposed LSP hypotheses. Within this framework, we calculate a mass for $\alpha$ Ori B of $1.17\pm0.7\,M_\odot$ and an orbital separation of $1850\pm70\,R_\odot$, or $2.43^{+0.21}_{-0.32}$ times the radius of Betelgeuse. We then describe the features of the companion as constrained by the fundamental parameters of Betelgeuse and its orbital system, and discuss what would be required to confirm the companion's existence observationally., Comment: 16 pages, 4 figures, 2 tables, submitted to AAS Journals. Comments welcome
- Published
- 2024
18. Focused deposition of levitated nanoscale Au droplets
- Author
Coppock, Joyce and Kane, B. E.
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
We describe a method for depositing nanoscale liquid Au droplets, initially levitated in an ion trap in high vacuum, onto a remote substrate. A levitated Au nanosphere is melted, expelled from the trap, and maintained in the molten state with a laser directed along the droplet trajectory until it reaches the substrate and rapidly solidifies. Also during transit, the charged droplets are focused onto a small region of the substrate with an electrostatic lens. After deposition, the substrate can be removed from the vacuum chamber and imaged and analyzed by techniques such as electron microscopy and energy-dispersive spectroscopy (EDS). Over 90% of launched particles are deposited on the substrate, and when the lens is focused, particles land in a region of diameter 120 $\mu$m after traversing a distance of 236 mm. Our technique is of value for the analysis of meltable materials prepared or processed while levitated. Au droplets may also be useful as tracers for future experiments involving smaller projectiles or oriented solids.
- Published
- 2024
19. Algebraic Representations for Faster Predictions in Convolutional Neural Networks
- Author
-
Joyce, Johnny and Verschelde, Jan
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Symbolic Computation - Abstract
Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision. When CNNs are made with many layers, resulting in a deep neural network, skip connections may be added to create an easier gradient optimization problem while retaining model expressiveness. In this paper, we show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model, resulting in greatly reduced computational requirements during prediction time. We also present a method for training nonlinear models with skip connections that are gradually removed throughout training, giving the benefits of skip connections without requiring computational overhead during prediction time. These results are demonstrated with practical examples on the Residual Network (ResNet) architecture., Comment: Accepted for publication in the proceedings of the 27th International Workshop on Computer Algebra in Scientific Computing (CASC 2024)
- Published
- 2024
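The collapsibility result in the entry above can be illustrated with a toy 1-D sketch (our own construction under a circular-convolution assumption, not the authors' code): a linear residual layer $x \mapsto x + k * x$ is convolution with $\delta + k$, so two such layers compose into a single equivalent kernel.

```python
import numpy as np

def circ_conv(x, k):
    """Circular convolution via the FFT (exact, no boundary effects)."""
    n = len(x)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, n)))

rng = np.random.default_rng(0)
n = 16
x = rng.normal(size=n)

# Two linear "layers", each with a skip connection: layer(x) = x + k * x
k1 = np.zeros(n); k1[:3] = rng.normal(size=3)
k2 = np.zeros(n); k2[:3] = rng.normal(size=3)
delta = np.zeros(n); delta[0] = 1.0   # identity kernel: delta * x == x

h = x + circ_conv(x, k1)              # layer 1 with skip connection
deep = h + circ_conv(h, k2)           # layer 2 with skip connection

# Collapse: applying (delta + k1) then (delta + k2) equals one convolution
# with the single kernel (delta + k1) * (delta + k2)
k_eff = circ_conv(delta + k1, delta + k2)
single = circ_conv(x, k_eff)

assert np.allclose(deep, single)      # one layer reproduces the deep model
```

The same algebra extends to any depth, which is why prediction cost drops to a single layer once the network is trained.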
20. Impact of Jahn-Teller distortions on persistent molecular ring current in benzene
- Author
-
Joyce, T. and Jaron, A.
- Subjects
Quantum Physics - Abstract
A circularly polarized femtosecond UV laser pulse can remove a $\pi$ electron from benzene in such a way that the leftover hole circulates around the cation as a persistent ring current. We investigate the time-dependent strength of the current as the molecule relaxes from the D6h symmetry of the neutral to the D2h symmetry of the cation due to the Jahn-Teller effect. We explore the effect of spontaneous symmetry breaking on persistent ring currents for the benzene cation, because it is one of the most comprehensively studied examples of the Jahn-Teller effect.
- Published
- 2024
21. High School Students' Use and Impressions of AI Tools. ACT Research
- Author
-
ACT, Inc., Jeff Schiel, Becky L. Bobek, and Joyce Z. Schnieders
- Abstract
There is growing interest in artificial intelligence (AI) tools, especially high-profile tools like ChatGPT, and these tools now appear to be part of the education experience for many high school students. To investigate students' use of AI tools for school assignments, their impressions of how using the tools might affect them cognitively and academically, and their thoughts on using AI tools to write their college admissions essays, ACT developed a survey in June 2023 which was administered to a large nationwide sample of students in Grades 10 through 12. In this study, almost half of the participating students reported that they had used AI tools, and the most common tool they used was ChatGPT. Among students who did not use AI tools, the top reason for not using them was having no interest in them. About two thirds of students also reported that they did not trust the information provided by AI tools, and a little over half indicated that they did not know enough about AI tools to use them. Students with higher academic performance were significantly more likely to use AI tools than were students with lower academic performance. Findings show that these tools appear to have much potential to enhance student learning, but there are concerns about appropriate use and potential negative outcomes.
- Published
- 2023
22. Stellar Models are Reliable at Low Metallicity: An Asteroseismic Age for the Ancient Very Metal-Poor Star KIC 8144907
- Author
-
Huber, Daniel, Slumstrup, Ditte, Hon, Marc, Li, Yaguang, Borsen-Koch, Victor Aguirre, Bedding, Timothy R., Joyce, Meridith, Ong, J. M. Joel, Serenelli, Aldo, Stello, Dennis, Berger, Travis, Grunblatt, Samuel K., Greklek-McKeon, Michael, Hirano, Teruyuki, Kirby, Evan N., Pinsonneault, Marc H., Puls, Arthur Alencastro, and Zinn, Joel
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Very metal-poor stars ([Fe/H]<-2) are important laboratories for testing stellar models and reconstructing the formation history of our galaxy. Asteroseismology is a powerful tool to probe stellar interiors and measure ages, but few asteroseismic detections are known in very metal-poor stars and none have allowed detailed modeling of oscillation frequencies. We report the discovery of a low-luminosity Kepler red giant (KIC 8144907) with high S/N oscillations, [Fe/H]=-2.66+/-0.08 and [alpha/Fe]=0.38+/-0.06, making it by far the most metal-poor star to date for which detailed asteroseismic modeling is possible. By combining the oscillation spectrum from Kepler with high-resolution spectroscopy we measure an asteroseismic mass and age of 0.79+/-0.02(ran)+/-0.01(sys) Msun and 12.0+/-0.6(ran)+/-0.4(sys) Gyr, with remarkable agreement across different codes and input physics, demonstrating that stellar models and asteroseismology are reliable for very metal-poor stars when individual frequencies are used. The results also provide a direct age anchor for the early formation of the Milky Way, implying that substantial star formation did not commence until redshift z~3 (if the star formed in-situ) or that the Milky Way has undergone merger events for at least ~12 Gyr (if the star was accreted by a dwarf satellite merger such as Gaia Enceladus)., Comment: 10 pages, 4 figures, accepted for publication in ApJ
- Published
- 2024
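The redshift implication in the KIC 8144907 entry can be sanity-checked numerically (our own flat-$\Lambda$CDM sketch with Planck-like parameters, not the authors' calculation): a 12.0 Gyr star in a $\sim$13.8 Gyr universe must have formed within the first $\sim$1.8 Gyr, and the universe was about that old near $z\sim3$, so the quoted formation redshift follows.

```python
import numpy as np
from scipy.integrate import quad

H0 = 67.7                       # Hubble constant [km/s/Mpc] (Planck-like)
Om, OL = 0.31, 0.69             # flat LambdaCDM matter / dark-energy densities
t_H = 977.8 / H0                # Hubble time in Gyr (977.8 ~ 1/(km/s/Mpc) in Gyr)

def age_at(z):
    """Age of the universe at redshift z, in Gyr (flat LambdaCDM)."""
    E = lambda zp: np.sqrt(Om * (1 + zp)**3 + OL)
    integral, _ = quad(lambda zp: 1.0 / ((1 + zp) * E(zp)), z, np.inf)
    return t_H * integral

t_now = age_at(0.0)             # ~13.8 Gyr
t_form = t_now - 12.0           # latest formation time for a 12.0 Gyr star
print(f"universe age: {t_now:.1f} Gyr; star formed by t = {t_form:.1f} Gyr")
print(f"age at z=3:   {age_at(3.0):.2f} Gyr")   # > t_form, so z >~ 3
```

Since the cosmic age at $z=3$ ($\approx$2.1 Gyr) exceeds the star's latest formation time, in-situ formation indeed requires $z \gtrsim 3$, as the abstract states.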
23. A Security Assessment tool for Quantum Threat Analysis
- Author
-
Halak, Basel, Csete, Cristian Sebastian, Joyce, Edward, Papaioannou, Jack, Pires, Alexandre, Soma, Jin, Gokkaya, Betul, and Murphy, Michael
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computers and Society - Abstract
The rapid advancement of quantum computing poses a significant threat to many current security algorithms used for secure communication, digital authentication, and information encryption. A sufficiently powerful quantum computer could potentially exploit vulnerabilities in these algorithms, rendering data in transit insecure. This threat is expected to materialize within the next 20 years. Immediate transition to quantum-resilient cryptographic schemes is crucial, primarily to mitigate store-now-decrypt-later attacks and to ensure the security of products with decade-long operational lives. This transition requires a systematic approach to identifying and upgrading vulnerable cryptographic implementations. This work developed a quantum assessment tool for organizations, providing tailored recommendations for transitioning their security protocols into a post-quantum world. The work included a systematic evaluation of the proposed solution using qualitative feedback from network administrators and cybersecurity experts. This feedback was used to refine the accuracy and usability of the assessment process. The results demonstrate its effectiveness and usefulness in helping organizations prepare for quantum computing threats. The assessment tool is publicly available at (https://quantum-watch.soton.ac.uk).
- Published
- 2024
24. Realistic Uncertainties for Fundamental Properties of Asteroseismic Red Giants and the Interplay Between Mixing Length, Metallicity and $\nu_{\rm max}$
- Author
-
Li, Yaguang, Bedding, Timothy R., Huber, Daniel, Stello, Dennis, van Saders, Jennifer, Zhou, Yixiao, Crawford, Courtney L., Joyce, Meridith, Li, Tanda, Murphy, Simon J., and Sreenivas, K. R.
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
Asteroseismic modelling is a powerful way to derive stellar properties. However, the derived quantities are limited by built-in assumptions used in stellar models. This work presents a detailed characterisation of stellar model uncertainties in asteroseismic red giants, focusing on the mixing-length parameter $\alpha_{\rm MLT}$, the initial helium fraction $Y_{\rm init}$, the solar abundance scale, and the overshoot parameters. First, we estimate error floors due to model uncertainties to be $\approx$0.4\% in mass, $\approx$0.2\% in radius, and $\approx$17\% in age, primarily due to the uncertain state of $\alpha_{\rm MLT}$ and $Y_{\rm init}$. The systematic uncertainties in age exceed typical statistical uncertainties, suggesting the importance of their evaluation in asteroseismic applications. Second, we demonstrate that the uncertainties from $\alpha_{\rm MLT}$ can be entirely mitigated by direct radius measurements or partially through $\nu_{\rm max}$. Utilizing radii from Kepler eclipsing binaries, we determined the $\alpha_{\rm MLT}$ values and calibrated the $\alpha_{\rm MLT}$--[M/H] relation. The correlation observed between the two variables is positive, consistent with previous studies using 1-D stellar models, but in contrast with outcomes from 3-D simulations. Third, we explore the implications of using asteroseismic modelling to test the $\nu_{\rm max}$ scaling relation. We found that a perceived dependency of $\nu_{\rm max}$ on [M/H] from individual frequency modelling can be largely removed by incorporating the calibrated $\alpha_{\rm MLT}$--[M/H] relation. Variations in $Y_{\rm init}$ can also affect $\nu_{\rm max}$ predictions. These findings suggest that $\nu_{\rm max}$ conveys information not fully captured by individual frequencies, and that it should be carefully considered as an important observable for asteroseismic modelling., Comment: 18 pages, 7 figures; submitted to ApJ
- Published
- 2024
25. Large Language Models for Integrating Social Determinant of Health Data: A Case Study on Heart Failure 30-Day Readmission Prediction
- Author
-
Fensore, Chase, Carrillo-Larco, Rodrigo M., Patel, Shivani A., Morris, Alanna A., and Ho, Joyce C.
- Subjects
Computer Science - Computation and Language - Abstract
Social determinants of health (SDOH), the myriad circumstances in which people live, grow, and age, play an important role in health outcomes. However, existing outcome prediction models often only use proxies of SDOH as features. Recent open data initiatives present an opportunity to construct a more comprehensive view of SDOH, but manually integrating the most relevant data for individual patients becomes increasingly challenging as the volume and diversity of public SDOH data grows. Large language models (LLMs) have shown promise at automatically annotating structured data. Here, we conduct an end-to-end case study evaluating the feasibility of using LLMs to integrate SDOH data, and the utility of these SDOH features for clinical prediction. We first manually label 700+ variables from two publicly-accessible SDOH data sources to one of five semantic SDOH categories. Then, we benchmark performance of 9 open-source LLMs on this classification task. Finally, we train ML models to predict 30-day hospital readmission among 39k heart failure (HF) patients, and we compare the prediction performance of the categorized SDOH variables with standard clinical variables. Additionally, we investigate the impact of few-shot LLM prompting on LLM annotation performance, and perform a metadata ablation study on prompts to evaluate which information helps LLMs accurately annotate these variables. We find that some open-source LLMs can effectively, accurately annotate SDOH variables with zero-shot prompting without the need for fine-tuning. Crucially, when combined with standard clinical features, the LLM-annotated Neighborhood and Built Environment subset of the SDOH variables shows the best performance predicting 30-day readmission of HF patients., Comment: 36 pages including references and appendix. This is a work in progress
- Published
- 2024
26. Cultural Transmission, Technology, and Treatment of the Elderly
- Author
-
Baker, Matthew J. and Jacobsen, Joyce P.
- Subjects
Economics - General Economics - Abstract
We discuss the interrelationship between the treatment of the elderly, the nature of production, and the transmission of culture. Respect for the elderly is endogenous. Parents cultivate an interest in consuming culture in their children; when they are older, children compensate their elders proportional to the degree to which their interests were previously cultivated. We show that this model is functionally equivalent to one in which cultural goods are transferred across generations. We focus on the relative well-being of the elderly and use the model to explain patterns in their relative well-being across societies. An important theme is that the cultivation of culture and norms for the respect and support of the elderly bear a nonlinear relationship with many economic variables, such as capital and/or land intensity in production. We also discuss the interaction of property rights with production, assets such as productive resources, and relative treatment of the elderly. Insecurity of some types of property rights, such as rights over output, may benefit the elderly, while secure rights over productive resources may also benefit the elderly. We discuss how the elderly could be affected by demographic, technological and policy changes in both developing and developed economies.
- Published
- 2024
27. Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
- Author
-
Zhang, Yue, Ma, Ziqiao, Li, Jialu, Qiao, Yanyuan, Wang, Zun, Chai, Joyce, Wu, Qi, Bansal, Mohit, and Kordjamshidi, Parisa
- Subjects
Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years and many approaches have emerged to advance their development. The remarkable achievements of foundation models have shaped the challenges and proposed methods for VLN research. In this survey, we provide a top-down review that adopts a principled framework for embodied planning and reasoning, and emphasizes the current methods and future opportunities leveraging foundation models to address VLN challenges. We hope our in-depth discussions could provide valuable resources and insights: on one hand, to milestone the progress and explore opportunities and potential roles for foundation models in this field, and on the other, to organize different challenges and solutions in VLN to foundation model researchers., Comment: Authors contributed equally to this work, and supervisors contributed equal advising to this work
- Published
- 2024
28. Multi-Object Hallucination in Vision-Language Models
- Author
-
Chen, Xuweiyi, Ma, Ziqiao, Zhang, Xuejun, Xu, Sihan, Qian, Shengyi, Yang, Jianing, Fouhey, David F., and Chai, Joyce
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Large vision language models (LVLMs) often suffer from object hallucination, producing objects not present in the given images. While current benchmarks for object hallucination primarily concentrate on the presence of a single object class rather than individual entities, this work systematically investigates multi-object hallucination, examining how models misperceive (e.g., invent nonexistent objects or become distracted) when tasked with focusing on multiple objects simultaneously. We introduce Recognition-based Object Probing Evaluation (ROPE), an automated evaluation protocol that considers the distribution of object classes within a single image during testing and uses visual referring prompts to eliminate ambiguity. With comprehensive empirical studies and analysis of potential factors leading to multi-object hallucination, we found that (1) LVLMs suffer more hallucinations when focusing on multiple objects compared to a single object. (2) The tested object class distribution affects hallucination behaviors, indicating that LVLMs may follow shortcuts and spurious correlations. (3) Hallucinatory behaviors are influenced by data-specific factors, salience and frequency, and model intrinsic behaviors. We hope to enable LVLMs to recognize and reason about multiple objects that often occur in realistic visual scenes, provide insights, and quantify our progress towards mitigating the issues., Comment: Accepted to ALVR @ ACL 2024 | Project page: https://multi-object-hallucination.github.io/
- Published
- 2024
29. Empirical Investigation of the Relationship Between Design Smells and Role Stereotypes
- Author
-
Ogenrwot, Daniel, Nakatumba-Nabende, Joyce, Businge, John, and Chaudron, Michel R. V.
- Subjects
Computer Science - Software Engineering ,D.2.7 ,K.6.3 - Abstract
During software development, poor design and implementation choices can detrimentally impact software maintainability. Design smells, recurring patterns of poorly designed fragments, signify these issues. Role-stereotypes denote the generic responsibilities that classes assume in system design. Although the concepts of role-stereotypes and design smells differ, both significantly contribute to the design and maintenance of software systems. Understanding the relationship between these aspects is crucial for enhancing software maintainability, code quality, efficient code review, guided refactoring, and the design of role-specific metrics. This paper employs an exploratory approach, combining statistical analysis and unsupervised learning methods, to understand how design smells relate to role-stereotypes across desktop and mobile applications. Analyzing 11,350 classes from 30 GitHub repositories, we identified several design smells that frequently co-occur within certain role-stereotypes. Specifically, three (3) out of six (6) role-stereotypes we studied are more prone to design smells. We also examined the variation of design smells across the two ecosystems, driven by notable differences in their underlying architecture. Findings revealed that design smells are more prevalent in desktop than in mobile applications, especially within the Service Provider and Information Holder role-stereotypes. Additionally, the unsupervised learning method showed that certain pairs or groups of role-stereotypes are prone to similar types of design smells. We believe these relationships are associated with the characteristic and collaborative properties between role-stereotypes. The insights from this research provide valuable guidance for software teams on implementing design smell prevention and correction mechanisms, ensuring conceptual integrity during design and maintenance phases., Comment: 32 pages, 8 figures
- Published
- 2024
30. The Rise of Artificial Intelligence in Educational Measurement: Opportunities and Ethical Challenges
- Author
-
Bulut, Okan, Beiting-Parrish, Maggie, Casabianca, Jodi M., Slater, Sharon C., Jiao, Hong, Song, Dan, Ormerod, Christopher M., Fabiyi, Deborah Gbemisola, Ivan, Rodica, Walsh, Cole, Rios, Oscar, Wilson, Joshua, Yildirim-Erbasli, Seyma N., Wongvorachan, Tarid, Liu, Joyce Xinle, Tan, Bin, and Morilova, Polina
- Subjects
Computer Science - Computers and Society ,Computer Science - Artificial Intelligence - Abstract
The integration of artificial intelligence (AI) in educational measurement has revolutionized assessment methods, enabling automated scoring, rapid content analysis, and personalized feedback through machine learning and natural language processing. These advancements provide timely, consistent feedback and valuable insights into student performance, thereby enhancing the assessment experience. However, the deployment of AI in education also raises significant ethical concerns regarding validity, reliability, transparency, fairness, and equity. Issues such as algorithmic bias and the opacity of AI decision-making processes pose risks of perpetuating inequalities and affecting assessment outcomes. Responding to these concerns, various stakeholders, including educators, policymakers, and organizations, have developed guidelines to ensure ethical AI use in education. The National Council of Measurement in Education's Special Interest Group on AI in Measurement and Education (AIME) also focuses on establishing ethical standards and advancing research in this area. In this paper, a diverse group of AIME members examines the ethical implications of AI-powered tools in educational measurement, explores significant challenges such as automation bias and environmental impact, and proposes solutions to ensure AI's responsible and effective use in education., Comment: 59 pages, 3 figures, a joint work of the Special Interest Group on Artificial Intelligence in Measurement and Education (AIME) from the National Council of Measurement in Education (NCME)
- Published
- 2024
31. Algorithms for College Admissions Decision Support: Impacts of Policy Change and Inherent Variability
- Author
-
Lee, Jinsook, Harvey, Emma, Zhou, Joyce, Garg, Nikhil, Joachims, Thorsten, and Kizilcec, Rene F.
- Subjects
Computer Science - Computers and Society - Abstract
Each year, selective American colleges sort through tens of thousands of applications to identify a first-year class that displays both academic merit and diversity. In the 2023-2024 admissions cycle, these colleges faced unprecedented challenges. First, the number of applications has been steadily growing. Second, test-optional policies that have remained in place since the COVID-19 pandemic limit access to key information historically predictive of academic success. Most recently, longstanding debates over affirmative action culminated in the Supreme Court banning race-conscious admissions. Colleges have explored machine learning (ML) models to address the issues of scale and missing test scores, often via ranking algorithms intended to focus on 'top' applicants. However, the Court's ruling will force changes to these models, which were able to consider race as a factor in ranking. There is currently a poor understanding of how these mandated changes will shape applicant ranking algorithms, and, by extension, admitted classes. We seek to address this by quantifying the impact of different admission policies on the applications prioritized for review. We show that removing race data from a developed applicant ranking algorithm reduces the diversity of the top-ranked pool without meaningfully increasing the academic merit of that pool. We contextualize this impact by showing that excluding data on applicant race has a greater impact than excluding other potentially informative variables like intended majors. Finally, we measure the impact of policy change on individuals by comparing the arbitrariness in applicant rank attributable to policy change to the arbitrariness attributable to randomness. We find that any given policy has a high degree of arbitrariness and that removing race data from the ranking algorithm increases arbitrariness in outcomes for most applicants., Comment: 25 pages, 8 figures
- Published
- 2024
32. SpoT-Mamba: Learning Long-Range Dependency on Spatio-Temporal Graphs with Selective State Spaces
- Author
-
Choi, Jinhyeok, Kim, Heehyeon, An, Minhyeong, and Whang, Joyce Jiyoung
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Spatio-temporal graph (STG) forecasting is a critical task with extensive applications in the real world, including traffic and weather forecasting. Although several recent methods have been proposed to model complex dynamics in STGs, addressing long-range spatio-temporal dependencies remains a significant challenge, leading to limited performance gains. Inspired by a recently proposed state space model named Mamba, which has shown remarkable capability of capturing long-range dependency, we propose a new STG forecasting framework named SpoT-Mamba. SpoT-Mamba generates node embeddings by scanning various node-specific walk sequences. Based on the node embeddings, it conducts temporal scans to capture long-range spatio-temporal dependencies. Experimental results on the real-world traffic forecasting dataset demonstrate the effectiveness of SpoT-Mamba., Comment: 6 pages, 2 figures, 3 tables. Spatio-Temporal Reasoning and Learning (STRL) Workshop at the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)
- Published
- 2024
33. 3D Gaze Tracking for Studying Collaborative Interactions in Mixed-Reality Environments
- Author
-
Davalos, Eduardo, Zhang, Yike, S., Ashwin T., Fonteles, Joyce H., Timalsina, Umesh, and Biswas, Gautam
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
This study presents a novel framework for 3D gaze tracking tailored for mixed-reality settings, aimed at enhancing joint attention and collaborative efforts in team-based scenarios. Conventional gaze tracking, often limited by monocular cameras and traditional eye-tracking apparatus, struggles with simultaneous data synchronization and analysis from multiple participants in group contexts. Our proposed framework leverages state-of-the-art computer vision and machine learning techniques to overcome these obstacles, enabling precise 3D gaze estimation without dependence on specialized hardware or complex data fusion. Utilizing facial recognition and deep learning, the framework achieves real-time tracking of gaze patterns across several individuals, addressing common depth estimation errors, and ensuring spatial and identity consistency within the dataset. Empirical results demonstrate the accuracy and reliability of our method in group environments. This provides mechanisms for significant advances in behavior and interaction analysis in educational and professional training applications in dynamic and unstructured environments., Comment: 9 pages, 8 figures, conference, submitted to ICMI 2024
- Published
- 2024
34. TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data
- Author
-
Zhang, Ziyang, Cui, Hejie, Xu, Ran, Xie, Yuzhang, Ho, Joyce C., and Yang, Carl
- Subjects
Computer Science - Machine Learning - Abstract
The growing availability of well-organized Electronic Health Records (EHR) data has enabled the development of various machine learning models towards disease risk prediction. However, existing risk prediction methods overlook the heterogeneity of complex diseases, failing to model the potential disease subtypes regarding their corresponding patient visits and clinical concept subgroups. In this work, we introduce TACCO, a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data. Specifically, we develop a novel self-supervised co-clustering framework that can be guided by the risk prediction task of specific diseases. Furthermore, we enhance the hypergraph model of EHR data with textual embeddings and enforce the alignment between the clusters of clinical concepts and patient visits through a contrastive objective. Comprehensive experiments conducted on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction demonstrate an average 31.25% performance improvement compared to traditional ML baselines and a 5.26% improvement on top of the vanilla hypergraph model without our co-clustering mechanism. In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO. Code is available at https://github.com/PericlesHat/TACCO., Comment: 11 pages, 5 figures, to be published in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
- Published
- 2024
35. Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions
- Author
-
Shen, Hua, Knearem, Tiffany, Ghosh, Reshmi, Alkiek, Kenan, Krishna, Kundan, Liu, Yachuan, Ma, Ziqiao, Petridis, Savvas, Peng, Yi-Hao, Qiwei, Li, Rakshit, Sushrita, Si, Chenglei, Xie, Yutong, Bigham, Jeffrey P., Bentley, Frank, Chai, Joyce, Lipton, Zachary, Mei, Qiaozhu, Mihalcea, Rada, Terry, Michael, Yang, Diyi, Morris, Meredith Ringel, Resnick, Paul, and Jurgens, David
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment. In particular, ML- and philosophy-oriented alignment research often views AI alignment as a static, unidirectional process (i.e., aiming to ensure that AI systems' objectives match humans) rather than an ongoing, mutual alignment problem. This perspective largely neglects the long-term interaction and dynamic changes of alignment. To understand these gaps, we introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML). We characterize, define and scope human-AI alignment. From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective. This framework encompasses both 1) conventional studies of aligning AI to humans that ensures AI produces the intended outcomes determined by humans, and 2) a proposed concept of aligning humans to AI, which aims to help individuals and society adjust to AI advancements both cognitively and behaviorally. Additionally, we articulate the key findings derived from literature analysis, including literature gaps and trends, human values, and interaction techniques. To pave the way for future studies, we envision three key challenges and give recommendations for future research., Comment: proposing "bidirectional human-AI alignment" framework after a systematic review of over 400 alignment papers
- Published
- 2024
36. From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR
- Author
-
Xu, Ran, Lu, Yiwen, Liu, Chang, Chen, Yong, Sun, Yan, Hu, Xiao, Ho, Joyce C., and Yang, Carl
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Electronic Health Records (EHRs) contain rich patient information and are crucial for clinical research and practice. In recent years, deep learning models have been applied to EHRs, but they often rely on massive features, which may not be readily available for all patients. We propose HTP-Star, which leverages hypergraph structures with a pretrain-then-finetune framework for modeling EHR data, enabling seamless integration of additional features. Additionally, we design two techniques, namely (1) Smoothness-inducing Regularization and (2) Group-balanced Reweighting, to enhance the model's robustness during fine-tuning. Through experiments conducted on two real EHR datasets, we demonstrate that HTP-Star consistently outperforms various baselines while striking a balance between patients with basic and extra features., Comment: CHIL 2024
- Published
- 2024
37. 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
- Author
-
Yang, Jianing, Chen, Xuweiyi, Madaan, Nikhil, Iyengar, Madhavan, Qian, Shengyi, Fouhey, David F., and Chai, Joyce
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Machine Learning ,Computer Science - Robotics - Abstract
The integration of language and 3D perception is crucial for developing embodied agents and robots that comprehend and interact with the physical world. While large language models (LLMs) have demonstrated impressive language understanding and generation capabilities, their adaptation to 3D environments (3D-LLMs) remains in its early stages. A primary challenge is the absence of large-scale datasets that provide dense grounding between language and 3D scenes. In this paper, we introduce 3D-GRAND, a pioneering large-scale dataset comprising 40,087 household scenes paired with 6.2 million densely-grounded scene-language instructions. Our results show that instruction tuning with 3D-GRAND significantly enhances grounding capabilities and reduces hallucinations in 3D-LLMs. As part of our contributions, we propose a comprehensive benchmark 3D-POPE to systematically evaluate hallucination in 3D-LLMs, enabling fair comparisons among future models. Our experiments highlight a scaling effect between dataset size and 3D-LLM performance, emphasizing the critical role of large-scale 3D-text datasets in advancing embodied AI research. Notably, our results demonstrate early signals for effective sim-to-real transfer, indicating that models trained on large synthetic data can perform well on real-world 3D scans. Through 3D-GRAND and 3D-POPE, we aim to equip the embodied AI community with essential resources and insights, setting the stage for more reliable and better-grounded 3D-LLMs. Project website: https://3d-grand.github.io, Comment: Project website: https://3d-grand.github.io
- Published
- 2024
38. LinkGPT: Teaching Large Language Models To Predict Missing Links
- Author
-
He, Zhongmou, Zhu, Jing, Qian, Shengyi, Chai, Joyce, and Koutra, Danai
- Subjects
Computer Science - Machine Learning - Abstract
Large Language Models (LLMs) have shown promising results on various language and vision tasks. Recently, there has been growing interest in applying LLMs to graph-based tasks, particularly on Text-Attributed Graphs (TAGs). However, most studies have focused on node classification, while the use of LLMs for link prediction (LP) remains understudied. In this work, we propose a new task for LLMs, where the objective is to leverage LLMs to predict missing links between nodes in a graph. This task evaluates an LLM's ability to reason over structured data and infer new facts based on learned patterns. This new task poses two key challenges: (1) How to effectively integrate pairwise structural information into the LLMs, which is known to be crucial for LP performance, and (2) how to solve the computational bottleneck when teaching LLMs to perform LP. To address these challenges, we propose LinkGPT, the first end-to-end trained LLM for LP tasks. To effectively enhance the LLM's ability to understand the underlying structure, we design a two-stage instruction tuning approach where the first stage fine-tunes the pairwise encoder, projector, and node projector, and the second stage further fine-tunes the LLMs to predict links. To address the efficiency challenges at inference time, we introduce a retrieval-reranking scheme. Experiments show that LinkGPT can achieve state-of-the-art performance on real-world graphs as well as superior generalization in zero-shot and few-shot learning, surpassing existing benchmarks. At inference time, it can achieve a $10\times$ speedup while maintaining high LP accuracy.
- Published
- 2024
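The retrieval-reranking idea mentioned in the abstract can be sketched generically (illustrative only, not LinkGPT's code; `cheap_score` and `expensive_score` are hypothetical stand-ins for an embedding retriever and an LLM reranker): a cheap scorer shortlists candidates so the expensive scorer runs only k times.

```python
# Illustrative retrieve-then-rerank sketch for link prediction (hypothetical).
def retrieve_then_rerank(source, candidates, cheap_score, expensive_score, k=3):
    """Shortlist k candidates with a cheap scorer, then pick the best under
    an expensive scorer; only k expensive calls are made."""
    shortlist = sorted(candidates, key=lambda c: cheap_score(source, c),
                       reverse=True)[:k]
    return max(shortlist, key=lambda c: expensive_score(source, c))

# Toy scorers: cheap = positionwise matches, expensive = character overlap.
def cheap_score(s, c):
    return sum(a == b for a, b in zip(s, c))

def expensive_score(s, c):
    return len(set(s) & set(c))

best = retrieve_then_rerank("abcd", ["abce", "abzz", "wxyz", "abcf"],
                            cheap_score, expensive_score, k=2)
```

The cost saving is the point of the design: the expensive scorer never sees candidates outside the shortlist.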
39. DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences
- Author
-
Huang, Yidong, Sansom, Jacob, Ma, Ziqiao, Gervits, Felix, and Chai, Joyce
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Recent advancements in foundation models (FMs) have unlocked new prospects in autonomous driving, yet the experimental settings of these studies are preliminary, over-simplified, and fail to capture the complexity of real-world driving scenarios in human environments. It remains under-explored whether FM agents can handle long-horizon navigation tasks with free-form dialogue and deal with unexpected situations caused by environmental dynamics or task changes. To explore the capabilities and boundaries of FMs faced with the challenges above, we introduce DriVLMe, a video-language-model-based agent to facilitate natural and effective communication between humans and autonomous vehicles that perceive the environment and navigate. We develop DriVLMe from both embodied experiences in a simulated environment and social experiences from real human dialogue. While DriVLMe demonstrates competitive performance in both open-loop benchmarks and closed-loop human studies, we reveal several limitations and challenges, including unacceptable inference time, imbalanced training data, limited visual understanding, challenges with multi-turn interactions, simplified language generation from robotic experiences, and difficulties in handling on-the-fly unexpected situations like environmental dynamics and task changes., Comment: First Vision and Language for Autonomous Driving and Robotics Workshop (VLADR @ CVPR 2024)
- Published
- 2024
40. An Expanded Set of Los Alamos OPLIB Tables in MESA: Type-1 Rosseland-mean Opacities and Solar Models
- Author
-
Farag, Ebraheem, Fontes, Christopher J., Timmes, F. X., Bellinger, Earl P., Guzik, Joyce A., Bauer, Evan B., Wood, Suzannah R., Mussack, Katie, Hakel, Peter, Colgan, James, Kilcrease, David P., Sherrill, Manolo E., Raecke, Tryston C., and Chidester, Morgan T.
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
We present a set of 1194 Type-1 Rosseland-mean opacity tables for four different metallicity mixtures. These new Los Alamos OPLIB atomic radiative opacity tables are an order of magnitude larger in number than any previous opacity table release, and span regimes where previous opacity tables have not existed. For example, the new set of opacity tables expands the metallicity range to $Z$\,=\,10$^{-6}$ to $Z$\,=\,0.2 which allows improved accuracy of opacities at low and high metallicity, increases the table density in the metallicity range $Z$\,=\,10$^{-4}$ to $Z$\,=\,0.1 to enhance the accuracy of opacities drawn from interpolations across neighboring metallicities, and adds entries for hydrogen mass fractions between $X$\,=\,0 and $X$\,=\,0.1 including $X$\,=\,$10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}, 10^{-6}$ that can improve stellar models of hydrogen deficient stars. We implement these new OPLIB radiative opacity tables in MESA, and find that calibrated solar models agree broadly with previously published helioseismic and solar neutrino results. We find differences between using the new 1194 OPLIB opacity tables and the 126 OPAL opacity tables range from $\approx$\,20--80\% across individual chemical mixtures, up to $\approx$\,8\% and $\approx$\,15\% at the bottom and top of the solar convection zone respectively, and $\approx$\,7\% in the solar core. We also find differences between standard solar models using different opacity table sources that are on par with altering the initial abundance mixture. We conclude that this new, open-access set of OPLIB opacity tables does not solve the solar modeling problem, and suggest the investigation of physical mechanisms other than the atomic radiative opacity., Comment: 28 pages, 12 (13) figures. Accepted for Publication in ApJ
- Published
- 2024
41. A Staged Approach using Machine Learning and Uncertainty Quantification to Predict the Risk of Hip Fracture
- Author
-
Shaik, Anjum, Larsen, Kristoffer, Lane, Nancy E., Zhao, Chen, Su, Kuan-Jui, Keyak, Joyce H., Tian, Qing, Sha, Qiuying, Shen, Hui, Deng, Hong-Wen, and Zhou, Weihua
- Subjects
Physics - Medical Physics ,Computer Science - Machine Learning - Abstract
Despite advancements in medical care, hip fractures impose a significant burden on individuals and healthcare systems. This paper focuses on the prediction of hip fracture risk in older and middle-aged adults, where falls and compromised bone quality are predominant factors. We propose a novel staged model that combines advanced imaging and clinical data to improve predictive performance. By using CNNs to extract features from hip DXA images, along with clinical variables, shape measurements, and texture features, our method provides a comprehensive framework for assessing fracture risk. A staged machine learning-based model was developed using two ensemble models: Ensemble 1 (clinical variables only) and Ensemble 2 (clinical variables and DXA imaging features). This staged approach used uncertainty quantification from Ensemble 1 to decide if DXA features are necessary for further prediction. Ensemble 2 exhibited the highest performance, achieving an AUC of 0.9541, an accuracy of 0.9195, a sensitivity of 0.8078, and a specificity of 0.9427. The staged model also performed well, with an AUC of 0.8486, an accuracy of 0.8611, a sensitivity of 0.5578, and a specificity of 0.9249, outperforming Ensemble 1, which had an AUC of 0.5549, an accuracy of 0.7239, a sensitivity of 0.1956, and a specificity of 0.8343. Furthermore, the staged model suggested that 54.49% of patients did not require DXA scanning. It effectively balanced accuracy and specificity, offering a robust solution when DXA data acquisition is not always feasible. Statistical tests confirmed significant differences between the models, highlighting the advantages of the advanced modeling strategies. Our staged approach can identify individuals at risk with high accuracy while reducing unnecessary DXA scanning. It holds great promise for guiding interventions to prevent hip fractures at reduced cost and radiation exposure., Comment: 29 pages, 5 figures, 6 tables
- Published
- 2024
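The uncertainty-gated staging described in the abstract can be sketched as follows (a toy illustration, not the paper's ensembles; the stand-in models and the uncertainty band are assumptions): a second, feature-richer model runs only when the first model's probability is too close to 0.5.

```python
# Illustrative sketch of an uncertainty-gated staged predictor (hypothetical).
def staged_predict(x_clinical, x_imaging, stage1, stage2, band=0.2):
    """Return (probability, used_stage2). Stage 2 runs only when the stage-1
    probability falls inside the uncertainty band around 0.5."""
    p1 = stage1(x_clinical)
    if abs(p1 - 0.5) >= band:          # confident: stop at stage 1
        return p1, False
    return stage2(x_clinical, x_imaging), True   # uncertain: use imaging too

# Toy stand-in models (made up, clamped to [0, 1]).
stage1 = lambda xc: min(1.0, max(0.0, 0.1 * sum(xc)))
stage2 = lambda xc, xi: min(1.0, max(0.0, 0.05 * (sum(xc) + sum(xi))))

p, used = staged_predict([2, 2], [4, 4], stage1, stage2)     # gated to stage 2
p2, used2 = staged_predict([8, 0], [1, 1], stage1, stage2)   # stage 1 suffices
```

The gate is what lets a fraction of patients skip the second stage's data acquisition entirely.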
42. Barium stars as tracers of s-process nucleosynthesis in AGB stars III. Systematic deviations from the AGB models
- Author
-
Világos, B., Cseh, B., López, A. Yagüe, Joyce, M., Karakas, A., Tagliente, G., and Lugaro, M.
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Barium (Ba) stars help to verify asymptotic giant branch (AGB) star nucleosynthesis models since they experienced pollution from an AGB binary companion and thus their spectra carry the signatures of the slow neutron capture process (s process). For 180 Ba stars, we searched for AGB stellar models that match the observed abundance patterns. We employed three machine learning algorithms as classifiers: a Random Forest method, developed for this work, and the two classifiers used in our previous study. We studied the statistical behaviour of the s-process elements in the observational sample to investigate if the AGB models systematically under- or overpredict the abundances observed in the Ba stars and show the results in the form of violin plots of the residuals between spectroscopic abundances and model predictions. We find a significant trend in the residuals that implies an underproduction of the elements Nb, Mo, and Ru in the models relative to the observations. This may originate from a process (e.g. the intermediate neutron-capture process, i process) at the metallicity of the Ba stars not yet included in the AGB models. Correlations are found between the residuals of these elements, suggesting a common origin for the deviations. In addition, there is a weak metallicity dependence of their residuals. The s-process temperatures derived with the [Zr/Fe] - [Nb/Fe] thermometer have an unrealistic value for the majority of our stars. The most likely explanation is that at least a fraction of these elements are not produced in a steady-state s process, and instead may be due to processes not included in the AGB models. The mass distribution of the identified models confirms that our sample of Ba stars was polluted by low-mass AGB stars. Most of the matching AGB models require low accreted mass, but a few systems with high accreted mass are needed to explain the observations. (abridged), Comment: 20 pages, 20 figures
- Published
- 2024
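The residual analysis described in the abstract (spectroscopic abundances minus model predictions, summarized per element) can be sketched with made-up numbers (not the authors' data or code): a consistently positive median residual for an element signals that the models underproduce it.

```python
# Illustrative sketch of per-element residual summaries (hypothetical numbers).
def median_residuals(observed, predicted):
    """observed/predicted: {element: [values per star]} -> {element: median residual}."""
    out = {}
    for elem in observed:
        res = sorted(o - p for o, p in zip(observed[elem], predicted[elem]))
        n = len(res)
        out[elem] = res[n // 2] if n % 2 else 0.5 * (res[n // 2 - 1] + res[n // 2])
    return out

obs = {"Nb": [0.9, 1.1, 1.0], "Zr": [0.8, 1.0, 0.9]}
pred = {"Nb": [0.6, 0.7, 0.8], "Zr": [0.8, 1.0, 0.9]}
med = median_residuals(obs, pred)   # "Nb" systematically above the models
```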
43. X-ray and UV radiation in the planet-forming T-Tauri system PDS 70. Signs of accretion and coronal activity
- Author
-
Joyce, Simon R. G., Pye, John P., Nichols, Jonathan D., Alexander, Richard, Gudel, Manuel, and Barrado, David
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Astrophysics - Earth and Planetary Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
Planet formation takes place in protoplanetary discs around young T-Tauri stars. PDS 70 is one of the first confirmed examples of a system where the planets are currently forming in gaps in the disc, and can be directly imaged. One of the main early influences on planet formation is the lifetime of the protoplanetary disc, which is limited by the intense stellar X-ray and UV radiation. Stellar coronal activity and accretion of material onto the star are both potential sources of XUV radiation. Previous \textit{Swift} observations detected UV emission, which was consistent with a low rate of accretion. We present follow up observations with the XMM-Newton observatory, which observed PDS 70 simultaneously in X-ray and UV in order to determine the intensity of XUV radiation in the system, and identify if the source is coronal, accretion, or both. We detect a strong source in both X-ray and UV, with an average X-ray 0.2-12 keV luminosity of $1.37\times10^{30}\ \mathrm{erg\ s}^{-1}$, and a possible flare which increased the luminosity to $2.8\times10^{30}\ \mathrm{erg\ s}^{-1}$. The UV flux density is in excess of what would be expected from chromospheric emission, and supports the interpretation that PDS 70 has continuing weak accretion less than $\sim10^{-10}\ \mathrm{M_{\odot}\ yr^{-1}}$. The implications of the detected X-ray and UV radiation are that the disc is likely to be in the final stages of dispersal, and will be completely evaporated in the next million years, bringing an end to the primary planet formation process., Comment: 17 pages, Published in MNRAS
- Published
- 2024
44. Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
- Author
-
Ma, Ziqiao, Wang, Zekun, and Chai, Joyce
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Humans are efficient language learners and inherently social creatures. Our language development is largely shaped by our social interactions, for example, the demonstration and feedback from caregivers. Contrary to human language learning, recent advancements in large language models have primarily adopted a non-interactive training paradigm, and refined pre-trained models through feedback afterward. In this work, we aim to examine how corrective feedback from interactions influences neural language acquisition from the ground up through systematically controlled experiments, assessing whether it contributes to learning efficiency in language models. We introduce a trial-and-demonstration (TnD) learning framework that incorporates three components: student trials, teacher demonstrations, and a reward conditioned on language competence at various developmental stages. Our experiments reveal that the TnD approach accelerates word acquisition for student models of equal or smaller parameter counts, and we highlight the significance of both trials and demonstrations. We further show that the teacher's choices of words influence students' word-specific learning efficiency, and a practice-makes-perfect effect is evident by a strong correlation between the frequency of words in trials and their respective learning curves. Our findings suggest that interactive language learning, with teacher demonstrations and student trials, can facilitate efficient word learning in language models.
- Published
- 2024
45. Data-driven Discovery for Robust Optimization of Semiconductor Nanowire Lasers
- Author
-
Church, Stephen A, Vitale, Francesco, Gopakumar, Aswani, Gagrani, Nikita, Zhang, Yunyan, Jiang, Nian, Tan, Hark Hoe, Jagadish, Chennupati, Liu, Huiyun, Joyce, Hannah, Ronning, Carsten, and Parkinson, Patrick
- Subjects
Physics - Optics ,Condensed Matter - Materials Science - Abstract
Active wavelength-scale optoelectronic components are widely used in photonic integrated circuitry; however, coherent sources of light -- namely optical lasers -- remain the most challenging component to integrate. Semiconductor nanowire lasers represent a flexible class of light source where each nanowire is both gain material and cavity; however, strong coupling between these properties and the performance leads to inhomogeneity across the population. While this has been studied and optimized for individual material systems, no architecture-wide insight is available. Here, nine nanowire laser material systems are studied and compared using 55,516 nanowire lasers to provide statistically robust insight into performance. These results demonstrate that, while it may be important to optimise internal quantum efficiency for certain materials, cavity effects are always critical. Our study provides a roadmap to optimize the performance of nanowire lasers made from any material: this can be achieved by ensuring a narrow spread of lengths and end-facet reflectivities.
- Published
- 2024
46. A New Asteroseismic $\textit{Kepler}$ Benchmark Constrains the Onset of Weakened Magnetic Braking in Mature Sun-Like Stars
- Author
-
Bhalotia, Vanshree, Huber, Daniel, van Saders, Jennifer L., Metcalfe, Travis S., Stassun, Keivan G., White, Timothy R., Børsen-Koch, Víctor Aguirre, Ball, Warrick H., Basu, Sarbani, Serenelli, Aldo M., Sawczynec, Erica, Guzik, Joyce A., Howard, Andrew W., and Isaacson, Howard
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
Stellar spin down is a critical yet poorly understood component of stellar evolution. In particular, results from the Kepler Mission imply that mature solar-type stars have inefficient magnetic braking, resulting in a stalled spin down rate. However, a large number of precise asteroseismic ages are needed for mature ($\geq$ 3 Gyr) stars in order to probe the regime where traditional and stalled spin-down models differ. In this paper, we present a new asteroseismic benchmark star for gyrochronology discovered using reprocessed Kepler short cadence data. KIC 11029516 (Papayu) is a bright ($K_{p}$ = 9.6 mag) solar-type star with a well-measured rotation period (21.1$\pm$0.8 days) from spot modulation using 4 years of Kepler long cadence data. We combine asteroseismology and spectroscopy to obtain $T_{eff}=5888\pm100$ K, $\rm{[Fe/H]} = 0.30 \pm 0.06\,$ dex, $M = 1.24 \pm 0.05 M_{\odot}$, $R = 1.34 \pm 0.02 R_{\odot}$ and age of 4.0 $\pm$ 0.4 Gyr, making Papayu one of the most similar stars to the Sun in terms of temperature and radius with an asteroseismic age and a rotation period measured from spot modulation. We find that Papayu sits at the transition of where traditional and weakened spin-down models diverge. A comparison with stars of similar zero-age main-sequence temperatures supports previous findings that weakened spin-down models are required to explain the ages and rotation periods of old solar-type stars.
- Published
- 2024
47. Neural Optimization with Adaptive Heuristics for Intelligent Marketing System
- Author
-
Wei, Changshuai, Zelditch, Benjamin, Chen, Joyce, Ribeiro, Andre Assuncao Silva T, Tay, Jingyi Kenneth, Elizondo, Borja Ocejo, Selvaraj, Keerthi, Gupta, Aman, and De Almeida, Licurgo Benemann
- Subjects
Statistics - Methodology ,Computer Science - Artificial Intelligence ,Computer Science - Information Retrieval ,Computer Science - Machine Learning ,Mathematics - Optimization and Control ,G.3 ,G.1.6 ,I.2 - Abstract
Computational marketing has become increasingly important in today's digital world, facing challenges such as massive heterogeneous data, multi-channel customer journeys, and limited marketing budgets. In this paper, we propose a general framework for marketing AI systems, the Neural Optimization with Adaptive Heuristics (NOAH) framework. NOAH is the first general framework for marketing optimization that considers both to-business (2B) and to-consumer (2C) products, as well as both owned and paid channels. We describe key modules of the NOAH framework, including prediction, optimization, and adaptive heuristics, providing examples for bidding and content optimization. We then detail the successful application of NOAH to LinkedIn's email marketing system, showcasing significant wins over the legacy ranking system. Additionally, we share details and insights that are broadly useful, particularly on: (i) addressing delayed feedback with lifetime value, (ii) performing large-scale linear programming with randomization, (iii) improving retrieval with audience expansion, (iv) reducing signal dilution in targeting tests, and (v) handling zero-inflated heavy-tail metrics in statistical testing., Comment: KDD 2024
- Published
- 2024
48. Building a Luganda Text-to-Speech Model From Crowdsourced Data
- Author
-
Kagumire, Sulaiman, Katumba, Andrew, Nakatumba-Nabende, Joyce, and Quinn, John
- Subjects
Computer Science - Sound ,Computer Science - Computation and Language ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Text-to-speech (TTS) development for African languages such as Luganda is still limited, primarily due to the scarcity of high-quality, single-speaker recordings essential for training TTS models. Prior work has focused on utilizing the Luganda Common Voice recordings of multiple speakers aged 20-49. Although the generated speech is intelligible, it is still of lower quality than the model trained on studio-grade recordings. This is due to the insufficient data preprocessing methods applied to improve the quality of the Common Voice recordings. Furthermore, speech convergence is more difficult to achieve due to varying intonations, as well as background noise. In this paper, we show that the quality of Luganda TTS from Common Voice can improve by training on multiple speakers of close intonation in addition to further preprocessing of the training data. Specifically, we selected six female speakers with close intonation determined by subjectively listening to and comparing their voice recordings. In addition to trimming out silent portions from the beginning and end of the recordings, we applied a pre-trained speech enhancement model to reduce background noise and enhance audio quality. We also utilized a pre-trained, non-intrusive, self-supervised Mean Opinion Score (MOS) estimation model to filter recordings with an estimated MOS over 3.5, indicating high perceived quality. Subjective MOS evaluations from nine native Luganda speakers demonstrate that our TTS model achieves a significantly better MOS of 3.55 compared to the reported 2.5 MOS of the existing model. Moreover, for a fair comparison, our model trained on six speakers outperforms models trained on a single speaker (3.13 MOS) or two speakers (3.22 MOS). This showcases the effectiveness of compensating for the lack of data from one speaker with data from multiple speakers of close intonation to improve TTS quality., Comment: Presented at the AfricaNLP workshop at ICLR 2024
- Published
- 2024
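The clip-selection step described in the abstract (keep recordings from the chosen speakers whose estimated MOS clears 3.5) can be sketched as follows (illustrative only; the MOS values are made up, and in practice would come from a pretrained non-intrusive estimator):

```python
# Illustrative sketch of MOS- and speaker-based training-clip filtering (hypothetical).
def select_clips(clips, speakers, mos_threshold=3.5):
    """Keep clips whose speaker is in the chosen set and whose estimated
    MOS meets the threshold."""
    return [c for c in clips
            if c["speaker"] in speakers and c["mos"] >= mos_threshold]

clips = [
    {"id": "a", "speaker": "f1", "mos": 3.8},
    {"id": "b", "speaker": "f1", "mos": 3.1},   # too noisy -> dropped
    {"id": "c", "speaker": "m1", "mos": 4.2},   # wrong speaker -> dropped
    {"id": "d", "speaker": "f2", "mos": 3.6},
]
kept = select_clips(clips, speakers={"f1", "f2"})
```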
49. PromptLink: Leveraging Large Language Models for Cross-Source Biomedical Concept Linking
- Author
-
Xie, Yuzhang, Lu, Jiaying, Ho, Joyce, Nahab, Fadi, Hu, Xiao, and Yang, Carl
- Subjects
Computer Science - Information Retrieval ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Linking (aligning) biomedical concepts across diverse data sources enables various integrative analyses, but it is challenging due to the discrepancies in concept naming conventions. Various strategies have been developed to overcome this challenge, such as those based on string-matching rules, manually crafted thesauri, and machine learning models. However, these methods are constrained by limited prior biomedical knowledge and can hardly generalize beyond their limited sets of rules, thesauri, or training samples. Recently, large language models (LLMs) have exhibited impressive results in diverse biomedical NLP tasks due to their unprecedentedly rich prior knowledge and strong zero-shot prediction abilities. However, LLMs suffer from issues including high costs, limited context length, and unreliable predictions. In this research, we propose PromptLink, a novel biomedical concept linking framework that leverages LLMs. It first employs a biomedical-specialized pre-trained language model to generate candidate concepts that can fit in the LLM context windows. Then it utilizes an LLM to link concepts through two-stage prompts, where the first-stage prompt aims to elicit the biomedical prior knowledge from the LLM for the concept linking task and the second-stage prompt asks the LLM to reflect on its own predictions to further enhance their reliability. Empirical results on the concept linking task between two EHR datasets and an external biomedical KG demonstrate the effectiveness of PromptLink. Furthermore, PromptLink is a generic framework without reliance on additional prior knowledge, context, or training data, making it well-suited for concept linking across various types of data sources. The source code is available at https://github.com/constantjxyz/PromptLink.
- Published
- 2024
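The candidate-generation plus two-stage prompting flow can be sketched generically (not PromptLink's actual prompts or code; `llm` is a hypothetical callable, here a toy deterministic function stands in for it):

```python
# Illustrative sketch of candidate generation + two-stage prompting (hypothetical).
def link_concept(query, candidates, llm, k=2):
    # Stage 0: cheap candidate generation by word overlap, so the shortlist
    # fits in the model's context window.
    shortlist = sorted(candidates,
                       key=lambda c: len(set(query.split()) & set(c.split())),
                       reverse=True)[:k]
    # Stage 1: elicit a first answer.
    first = llm(f"Link '{query}' to one of {shortlist}.")
    # Stage 2: ask the model to reflect on its own answer before finalizing.
    final = llm(f"You answered '{first}'. Reflect and give a final answer "
                f"for '{query}' from {shortlist}.")
    return final

toy_llm = lambda prompt: ("myocardial infarction"
                          if "heart attack" in prompt else "unknown")
answer = link_concept("heart attack", ["myocardial infarction", "stroke"], toy_llm)
```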
50. Robust Online Convex Optimization for Disturbance Rejection
- Author
-
Lai, Joyce and Seiler, Peter
- Subjects
Electrical Engineering and Systems Science - Systems and Control ,Mathematics - Optimization and Control - Abstract
Online convex optimization (OCO) is a powerful tool for learning sequential data, making it ideal for high precision control applications where the disturbances are arbitrary and unknown in advance. However, the ability of OCO-based controllers to accurately learn the disturbance while maintaining closed-loop stability relies on having an accurate model of the plant. This paper studies the performance of OCO-based controllers for linear time-invariant (LTI) systems subject to disturbance and model uncertainty. The model uncertainty can cause the closed-loop to become unstable. We provide a sufficient condition for robust stability based on the small gain theorem. This condition is easily incorporated as an online constraint in the OCO controller. Finally, we verify via numerical simulations that imposing the robust stability condition on the OCO controller ensures closed-loop stability.
- Published
- 2024
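The idea of imposing a stability condition as an online constraint can be sketched with projected online gradient descent (an illustrative stand-in, not the paper's small-gain condition; the norm-ball bound, learning rate, and gradients are made up): each update is projected back onto a feasible set of controller parameters.

```python
# Illustrative sketch of constraint-projected online gradient descent (hypothetical).
def project(theta, bound):
    """Project theta onto the Euclidean norm ball of radius `bound`, a simple
    stand-in for a robust-stability constraint set."""
    norm = sum(t * t for t in theta) ** 0.5
    if norm <= bound:
        return theta
    return [t * bound / norm for t in theta]

def ogd_step(theta, grad, lr, bound):
    updated = [t - lr * g for t, g in zip(theta, grad)]
    return project(updated, bound)

theta = [0.0, 0.0]
for grad in [[-1.0, 0.0], [0.0, -2.0], [-1.0, -1.0]]:   # made-up loss gradients
    theta = ogd_step(theta, grad, lr=1.0, bound=1.0)
```

The projection guarantees the iterate never leaves the constraint set, regardless of the disturbance-driven gradients.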