Author: "Stuhlmüller, Andreas" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Stuhlmüller, Andreas"' showing total 15 results

Start Over Author "Stuhlmüller, Andreas"

15 results on '"Stuhlmüller, Andreas"'

1. Elicit: Language models as research tools

Author: Byun, Jungwon, primary and Stuhlmüller, Andreas, additional
Published: 2023
Full Text: View/download PDF

2. Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes

Author: Reppert, Justin, Rachbach, Ben, George, Charlie, Stebbing, Luke, Byun, Jungwon, Appleton, Maggie, Stuhlmüller, Andreas, Reppert, Justin, Rachbach, Ben, George, Charlie, Stebbing, Luke, Byun, Jungwon, Appleton, Maggie, and Stuhlmüller, Andreas
Abstract: Language models (LMs) can perform complex reasoning either end-to-end, with hidden latent state, or compositionally, with transparent intermediate state. Composition offers benefits for interpretability and safety, but may need workflow support and infrastructure to remain competitive. We describe iterated decomposition, a human-in-the-loop workflow for developing and refining compositional LM programs. We improve the performance of compositions by zooming in on failing components and refining them through decomposition, additional context, chain of thought, etc. To support this workflow, we develop ICE, an open-source tool for visualizing the execution traces of LM programs. We apply iterated decomposition to three real-world tasks and improve the accuracy of LM programs over less compositional baselines: describing the placebo used in a randomized controlled trial (25% to 65%), evaluating participant adherence to a medical intervention (53% to 70%), and answering NLP questions on the Qasper dataset (38% to 69%). These applications serve as case studies for a workflow that, if automated, could keep ML systems interpretable and safe even as they scale to increasingly complex tasks.
Published: 2023

3. Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers

Author: George, Charlie, Stuhlmüller, Andreas, George, Charlie, and Stuhlmüller, Andreas
Abstract: Hallucination plagues even frontier LLMs--but how bad is it really for summarizing academic papers? We evaluate Factored Verification, a simple automated method for detecting hallucinations in abstractive summaries. This method sets a new SotA on hallucination detection in the summarization task of the HaluEval benchmark, achieving 76.2% accuracy. We then use this method to estimate how often language models hallucinate when summarizing across multiple academic papers and find 0.62 hallucinations in the average ChatGPT (16k) summary, 0.84 for GPT-4, and 1.55 for Claude 2. We ask models to self-correct using Factored Critiques and find that this lowers the number of hallucinations to 0.49 for ChatGPT, 0.46 for GPT-4, and 0.95 for Claude 2. The hallucinations we find are often subtle, so we advise caution when using models to synthesize academic papers., Comment: Second Workshop on Information Extraction from Scientific Publications (WIESP) at IJCNLP-AACL 2023
Published: 2023

4. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Author: Srivastava, Aarohi, Rastogi, Abhinav, Rao, Abhishek, Shoeb, Abu Awal Md, Abid, Abubakar, Fisch, Adam, Brown, Adam R., Santoro, Adam, Gupta, Aditya, Garriga-Alonso, Adrià, Kluska, Agnieszka, Lewkowycz, Aitor, Agarwal, Akshat, Power, Alethea, Ray, Alex, Warstadt, Alex, Kocurek, Alexander W., Safaya, Ali, Tazarv, Ali, Xiang, Alice, Parrish, Alicia, Nie, Allen, Hussain, Aman, Askell, Amanda, Dsouza, Amanda, Slone, Ambrose, Rahane, Ameet, Iyer, Anantharaman S., Andreassen, Anders, Madotto, Andrea, Santilli, Andrea, Stuhlmüller, Andreas, Dai, Andrew, La, Andrew, Lampinen, Andrew, Zou, Andy, Jiang, Angela, Chen, Angelica, Vuong, Anh, Gupta, Animesh, Gottardi, Anna, Norelli, Antonio, Venkatesh, Anu, Gholamidavoodi, Arash, Tabassum, Arfa, Menezes, Arul, Kirubarajan, Arun, Mullokandov, Asher, Sabharwal, Ashish, Herrick, Austin, Efrat, Avia, Erdem, Aykut, Karakaş, Ayla, Roberts, B. Ryan, Loe, Bao Sheng, Zoph, Barret, Bojanowski, Bartłomiej, Özyurt, Batuhan, Hedayatnia, Behnam, Neyshabur, Behnam, Inden, Benjamin, Stein, Benno, Ekmekci, Berk, Lin, Bill Yuchen, Howald, Blake, Orinion, Bryan, Diao, Cameron, Dour, Cameron, Stinson, Catherine, Argueta, Cedrick, Ramírez, César Ferri, Singh, Chandan, Rathkopf, Charles, Meng, Chenlin, Baral, Chitta, Wu, Chiyu, Callison-Burch, Chris, Waites, Chris, Voigt, Christian, Manning, Christopher D., Potts, Christopher, Ramirez, Cindy, Rivera, Clara E., Siro, Clemencia, Raffel, Colin, Ashcraft, Courtney, Garbacea, Cristina, Sileo, Damien, Garrette, Dan, Hendrycks, Dan, Kilman, Dan, Roth, Dan, Freeman, Daniel, Khashabi, Daniel, Levy, Daniel, González, Daniel Moseguí, Perszyk, Danielle, Hernandez, Danny, Chen, Danqi, Ippolito, Daphne, Gilboa, Dar, Dohan, David, Drakard, David, Jurgens, David, Datta, Debajyoti, Ganguli, Deep, Emelin, Denis, Kleyko, Denis, Yuret, Deniz, Chen, Derek, Tam, Derek, Hupkes, Dieuwke, Misra, Diganta, Buzan, Dilyar, Mollo, Dimitri Coelho, Yang, Diyi, Lee, Dong-Ho, Schrader, Dylan, Shutova, Ekaterina, Cubuk, Ekin Dogus, Segal, Elad, Hagerman, Eleanor, Barnes, Elizabeth, Donoway, Elizabeth, Pavlick, Ellie, Rodola, Emanuele, Lam, Emma, Chu, Eric, Tang, Eric, Erdem, Erkut, Chang, Ernie, Chi, Ethan A., Dyer, Ethan, Jerzak, Ethan, Kim, Ethan, Manyasi, Eunice Engefu, Zheltonozhskii, Evgenii, Xia, Fanyue, Siar, Fatemeh, Martínez-Plumed, Fernando, Happé, Francesca, Chollet, Francois, Rong, Frieda, Mishra, Gaurav, Winata, Genta Indra, de Melo, Gerard, Kruszewski, Germán, Parascandolo, Giambattista, Mariani, Giorgio, Wang, Gloria, Jaimovitch-López, Gonzalo, Betz, Gregor, Gur-Ari, Guy, Galijasevic, Hana, Kim, Hannah, Rashkin, Hannah, Hajishirzi, Hannaneh, Mehta, Harsh, Bogar, Hayden, Shevlin, Henry, Schütze, Hinrich, Yakura, Hiromu, Zhang, Hongming, Wong, Hugh Mee, Ng, Ian, Noble, Isaac, Jumelet, Jaap, Geissinger, Jack, Kernion, Jackson, Hilton, Jacob, Lee, Jaehoon, Fisac, Jaime Fernández, Simon, James B., Koppel, James, Zheng, James, Zou, James, Kocoń, Jan, Thompson, Jana, Wingfield, Janelle, Kaplan, Jared, Radom, Jarema, Sohl-Dickstein, Jascha, Phang, Jason, Wei, Jason, Yosinski, Jason, Novikova, Jekaterina, Bosscher, Jelle, Marsh, Jennifer, Kim, Jeremy, Taal, Jeroen, Engel, Jesse, Alabi, Jesujoba, Xu, Jiacheng, Song, Jiaming, Tang, Jillian, Waweru, Joan, Burden, John, Miller, John, Balis, John U., Batchelder, Jonathan, Berant, Jonathan, Frohberg, Jörg, Rozen, Jos, Hernandez-Orallo, Jose, Boudeman, Joseph, Guerr, Joseph, Jones, Joseph, Tenenbaum, Joshua B., Rule, Joshua S., Chua, Joyce, Kanclerz, Kamil, Livescu, Karen, Krauth, Karl, Gopalakrishnan, Karthik, Ignatyeva, Katerina, Markert, Katja, Dhole, Kaustubh D., Gimpel, Kevin, Omondi, Kevin, Mathewson, Kory, Chiafullo, Kristen, Shkaruta, Ksenia, Shridhar, Kumar, McDonell, Kyle, Richardson, Kyle, Reynolds, Laria, Gao, Leo, Zhang, Li, Dugan, Liam, Qin, Lianhui, Contreras-Ochando, Lidia, Morency, Louis-Philippe, Moschella, Luca, Lam, Lucas, Noble, Lucy, Schmidt, Ludwig, He, Luheng, Colón, Luis Oliveros, Metz, Luke, Şenel, Lütfi Kerem, Bosma, Maarten, Sap, Maarten, ter Hoeve, Maartje, Farooqi, Maheen, Faruqui, Manaal, Mazeika, Mantas, Baturan, Marco, Marelli, Marco, Maru, Marco, Quintana, Maria Jose Ramírez, Tolkiehn, Marie, Giulianelli, Mario, Lewis, Martha, Potthast, Martin, Leavitt, Matthew L., Hagen, Matthias, Schubert, Mátyás, Baitemirova, Medina Orduna, Arnaud, Melody, McElrath, Melvin, Yee, Michael A., Cohen, Michael, Gu, Michael, Ivanitskiy, Michael, Starritt, Michael, Strube, Michael, Swędrowski, Michał, Bevilacqua, Michele, Yasunaga, Michihiro, Kale, Mihir, Cain, Mike, Xu, Mimee, Suzgun, Mirac, Walker, Mitch, Tiwari, Mo, Bansal, Mohit, Aminnaseri, Moin, Geva, Mor, Gheini, Mozhdeh, T, Mukund Varma, Peng, Nanyun, Chi, Nathan A., Lee, Nayeon, Krakover, Neta Gur-Ari, Cameron, Nicholas, Roberts, Nicholas, Doiron, Nick, Martinez, Nicole, Nangia, Nikita, Deckers, Niklas, Muennighoff, Niklas, Keskar, Nitish Shirish, Iyer, Niveditha S., Constant, Noah, Fiedel, Noah, Wen, Nuan, Zhang, Oliver, Agha, Omar, Elbaghdadi, Omar, Levy, Omer, Evans, Owain, Casares, Pablo Antonio Moreno, Doshi, Parth, Fung, Pascale, Liang, Paul Pu, Vicol, Paul, Alipoormolabashi, Pegah, Liao, Peiyuan, Liang, Percy, Chang, Peter, Eckersley, Peter, Htut, Phu Mon, Hwang, Pinyu, Miłkowski, Piotr, Patil, Piyush, Pezeshkpour, Pouya, Oli, Priti, Mei, Qiaozhu, Lyu, Qing, Chen, Qinlang, Banjade, Rabin, Rudolph, Rachel Etta, Gabriel, Raefer, Habacker, Rahel, Risco, Ramon, Millière, Raphaël, Garg, Rhythm, Barnes, Richard, Saurous, Rif A., Arakawa, Riku, Raymaekers, Robbe, Frank, Robert, Sikand, Rohan, Novak, Roman, Sitelew, Roman, LeBras, Ronan, Liu, Rosanne, Jacobs, Rowan, Zhang, Rui, Salakhutdinov, Ruslan, Chi, Ryan, Lee, Ryan, Stovall, Ryan, Teehan, Ryan, Yang, Rylan, Singh, Sahib, Mohammad, Saif M., Anand, Sajant, Dillavou, Sam, Shleifer, Sam, Wiseman, Sam, Gruetter, Samuel, Bowman, Samuel R., Schoenholz, Samuel S., Han, Sanghyun, Kwatra, Sanjeev, Rous, Sarah A., Ghazarian, Sarik, Ghosh, Sayan, Casey, Sean, Bischoff, Sebastian, Gehrmann, Sebastian, Schuster, Sebastian, Sadeghi, Sepideh, Hamdan, Shadi, Zhou, Sharon, Srivastava, Shashank, Shi, Sherry, Singh, Shikhar, Asaadi, Shima, Gu, Shixiang Shane, Pachchigar, Shubh, Toshniwal, Shubham, Upadhyay, Shyam, Shyamolima, Debnath, Shakeri, Siamak, Thormeyer, Simon, Melzi, Simone, Reddy, Siva, Makini, Sneha Priscilla, Lee, Soo-Hwan, Torene, Spencer, Hatwar, Sriharsha, Dehaene, Stanislas, Divic, Stefan, Ermon, Stefano, Biderman, Stella, Lin, Stephanie, Prasad, Stephen, Piantadosi, Steven T., Shieber, Stuart M., Misherghi, Summer, Kiritchenko, Svetlana, Mishra, Swaroop, Linzen, Tal, Schuster, Tal, Li, Tao, Yu, Tao, Ali, Tariq, Hashimoto, Tatsu, Wu, Te-Lin, Desbordes, Théo, Rothschild, Theodore, Phan, Thomas, Wang, Tianle, Nkinyili, Tiberius, Schick, Timo, Kornev, Timofei, Tunduny, Titus, Gerstenberg, Tobias, Chang, Trenton, Neeraj, Trishala, Khot, Tushar, Shultz, Tyler, Shaham, Uri, Misra, Vedant, Demberg, Vera, Nyamai, Victoria, Raunak, Vikas, Ramasesh, Vinay, Prabhu, Vinay Uday, Padmakumar, Vishakh, Srikumar, Vivek, Fedus, William, Saunders, William, Zhang, William, Vossen, Wout, Ren, Xiang, Tong, Xiaoyu, Zhao, Xinran, Wu, Xinyi, Shen, Xudong, Yaghoobzadeh, Yadollah, Lakretz, Yair, Song, Yangqiu, Bahri, Yasaman, Choi, Yejin, Yang, Yichi, Hao, Yiding, Chen, Yifu, Belinkov, Yonatan, Hou, Yu, Hou, Yufang, Bai, Yuntao, Seid, Zachary, Zhao, Zhuoye, Wang, Zijian, Wang, Zijie J., Wang, Zirui, Wu, Ziyi, and Department of Philosophy
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Artificial Intelligence, cs.LG, cs.CL, Machine Learning (stat.ML), cs.AI, stat.ML, Machine Learning (cs.LG), Computer Science - Computers and Society, Artificial Intelligence (cs.AI), Statistics - Machine Learning, Computers and Society (cs.CY), Computation and Language (cs.CL), cs.CY
Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting., 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench
Published: 2022

5. RAFT: A Real-World Few-Shot Text Classification Benchmark

Author: Alex, Neel, Lifland, Eli, Tunstall, Lewis, Thakur, Abhishek, Maham, Pegah, Riedel, C. Jess, Hine, Emmie, Ashurst, Carolyn, Sedille, Paul, Carlier, Alexis, Noetel, Michael, Stuhlmüller, Andreas, Alex, Neel, Lifland, Eli, Tunstall, Lewis, Thakur, Abhishek, Maham, Pegah, Riedel, C. Jess, Hine, Emmie, Ashurst, Carolyn, Sedille, Paul, Carlier, Alexis, Noetel, Michael, and Stuhlmüller, Andreas
Abstract: Large pre-trained language models have shown promise for few-shot learning, completing text-based tasks given only a few task-specific examples. Will models soon solve classification tasks that have so far been reserved for human research assistants? Existing benchmarks are not designed to measure progress in applied settings, and so don't directly answer this question. The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment. Baseline evaluations on RAFT reveal areas current techniques struggle with: reasoning over long texts and tasks with many classes. Human baselines show that some classification tasks are difficult for non-expert humans, reflecting that real-world value sometimes depends on domain expertise. Yet even non-expert human baseline F1 scores exceed GPT-3 by an average of 0.11. The RAFT datasets and leaderboard will track which model improvements translate into real-world benefits at https://raft.elicit.org ., Comment: Dataset, submission instructions, code and leaderboard available at https://raft.elicit.org
Published: 2021

6. Learning physical parameters from dynamic scenes

Author: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences, Ullman, Tomer D., Stuhlmüller, Andreas, Goodman, Noah D., Tenenbaum, Joshua B., Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences, Ullman, Tomer D., Stuhlmüller, Andreas, Goodman, Noah D., and Tenenbaum, Joshua B.
Abstract: Humans acquire their most basic physical concepts early in development, and continue to enrich and expand their intuitive physics throughout life as they are exposed to more and varied dynamical environments. We introduce a hierarchical Bayesian framework to explain how people can learn physical parameters at multiple levels. In contrast to previous Bayesian models of theory acquisition (Tenenbaum, Kemp, Griffiths, & Goodman, 2011), we work with more expressive probabilistic program representations suitable for learning the forces and properties that govern how objects interact in dynamic scenes unfolding over time. We compare our model to human learners on a challenging task of estimating multiple physical parameters in novel microworlds given short movies. This task requires people to reason simultaneously about multiple interacting physical laws and properties. People are generally able to learn in this setting and are consistent in their judgments. Yet they also make systematic errors indicative of the approximations people might make in solving this computationally demanding problem with limited computational resources. We propose two approximations that complement the top-down Bayesian approach. One approximation model relies on a more bottom-up feature-based inference scheme. The second approximation combines the strengths of the bottom-up and top-down approaches, by taking the feature-based inference as its point of departure for a search in physical-parameter space. Keywords: Learning; Intuitive physics; Probabilistic inference; Physical reasoning; Intuitive theory, National Science Foundation (U.S.) (Grant CCF-1231216), National Science Foundation (U.S.) (Grant N00014-13-1-0333)
Published: 2020

7. Evaluating Compositionality in Sentence Embeddings

Author: Dasgupta, Ishita, Guo, Demi, Stuhlmüller, Andreas, Gershman, Samuel J., Goodman, Noah D., Dasgupta, Ishita, Guo, Demi, Stuhlmüller, Andreas, Gershman, Samuel J., and Goodman, Noah D.
Abstract: An important challenge for human-like AI is compositional semantics. Recent research has attempted to address this by using deep neural networks to learn vector space embeddings of sentences, which then serve as input to other tasks. We present a new dataset for one such task, `natural language inference' (NLI), that cannot be solved using only word-level knowledge and requires some compositionality. We find that the performance of state of the art sentence embeddings (InferSent; Conneau et al., 2017) on our new dataset is poor. We analyze the decision rules learned by InferSent and find that they are consistent with simple heuristics that are ecologically valid in its training dataset. Further, we find that augmenting training with our dataset improves test performance on our dataset without loss of performance on the original training dataset. This highlights the importance of structured datasets in better understanding and improving AI systems.
Published: 2018

8. Learning physical parameters from dynamic scenes

Author: Ullman, Tomer D., primary, Stuhlmüller, Andreas, additional, Goodman, Noah D., additional, and Tenenbaum, Joshua B., additional
Published: 2018
Full Text: View/download PDF

9. Agent-Agnostic Human-in-the-Loop Reinforcement Learning

Author: Abel, David, Salvatier, John, Stuhlmüller, Andreas, Evans, Owain, Abel, David, Salvatier, John, Stuhlmüller, Andreas, and Evans, Owain
Abstract: Providing Reinforcement Learning agents with expert advice can dramatically improve various aspects of learning. Prior work has developed teaching protocols that enable agents to learn efficiently in complex environments; many of these methods tailor the teacher's guidance to agents with a particular representation or underlying learning scheme, offering effective but specialized teaching procedures. In this work, we explore protocol programs, an agent-agnostic schema for Human-in-the-Loop Reinforcement Learning. Our goal is to incorporate the beneficial properties of a human teacher into Reinforcement Learning without making strong assumptions about the inner workings of the agent. We show how to represent existing approaches such as action pruning, reward shaping, and training in simulation as special cases of our schema and conduct preliminary experiments on simple domains., Comment: Presented at the NIPS Workshop on the Future of Interactive Learning Machines, 2016
Published: 2017

10. Coarse-to-Fine Sequential Monte Carlo for Probabilistic Programs

Author: Stuhlmüller, Andreas, Hawkins, Robert X. D., Siddharth, N., and Goodman, Noah D.
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML)
Abstract: Many practical techniques for probabilistic inference require a sequence of distributions that interpolate between a tractable distribution and an intractable distribution of interest. Usually, the sequences used are simple, e.g., based on geometric averages between distributions. When models are expressed as probabilistic programs, the models themselves are highly structured objects that can be used to derive annealing sequences that are more sensitive to domain structure. We propose an algorithm for transforming probabilistic programs to coarse-to-fine programs which have the same marginal distribution as the original programs, but generate the data at increasing levels of detail, from coarse to fine. We apply this algorithm to an Ising model, its depth-from-disparity variation, and a factorial hidden Markov model. We show preliminary evidence that the use of coarse-to-fine models can make existing generic inference algorithms more efficient.
Published: 2015
Full Text: View/download PDF

11. C3: Lightweight Incrementalized MCMC for Probabilistic Programs using Continuations and Callsite Caching

Author: Ritchie, Daniel, Stuhlmüller, Andreas, and Goodman, Noah D.
Subjects: FOS: Computer and information sciences, Computer Science - Programming Languages, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Programming Languages (cs.PL)
Abstract: Lightweight, source-to-source transformation approaches to implementing MCMC for probabilistic programming languages are popular for their simplicity, support of existing deterministic code, and ability to execute on existing fast runtimes. However, they are also slow, requiring a complete re-execution of the program on every Metropolis Hastings proposal. We present a new extension to the lightweight approach, C3, which enables efficient, incrementalized re-execution of MH proposals. C3 is based on two core ideas: transforming probabilistic programs into continuation passing style (CPS), and caching the results of function calls. We show that on several common models, C3 reduces proposal runtime by 20-100x, in some cases reducing runtime complexity from linear in model size to constant. We also demonstrate nearly an order of magnitude speedup on a complex inverse procedural modeling application., Comment: Fix typo in author name
Published: 2015
Full Text: View/download PDF

12. Modeling cognition with probabilistic programs : representations and algorithms

Author: Noah D. Goodman and Joshua B. Tenenbaum., Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences., Stuhlmüller, Andreas, Noah D. Goodman and Joshua B. Tenenbaum., Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences., and Stuhlmüller, Andreas
Abstract: Thesis: Ph. D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2015., This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections., Cataloged from student-submitted PDF version of thesis., Includes bibliographical references (pages 167-176)., This thesis develops probabilistic programming as a productive metaphor for understanding cognition, both with respect to mental representations and the manipulation of such representations. In the first half of the thesis, I demonstrate the representational power of probabilistic programs in the domains of concept learning and social reasoning. I provide examples of richly structured concepts, defined in terms of systems of relations, subparts, and recursive embeddings, that are naturally expressed as programs and show initial experimental evidence that they match human generalization patterns. I then proceed to models of reasoning about reasoning, a domain where the expressive power of probabilistic programs is necessary to formalize our intuitive domain understanding due to the fact that, unlike previous formalisms, probabilistic programs allow conditioning to be represented in a model, not just applied to a model. I illustrate this insight with programs that model nested reasoning in game theory, artificial intelligence, and linguistics. In the second half, I develop three inference algorithms with the dual intent of showing how to efficiently compute the marginal distributions defined by probabilistic programs, and providing building blocks for process-level accounts of human cognition. First, I describe a Dynamic Programming algorithm for computing the marginal distribution of discrete probabilistic programs by compiling to systems of equations and show that it can make inference in models of "reasoning about reasoning" tractable by merging and reusing subcomputations. Second, I introduce the setting of amortized inference and show how learning inverse models lets us leverage samples generated by other inference algorithms to compile probabilistic models into fast recognition functions. Third, I develop a generic approach to coarse-to-fine inference in probabilistic programs and provide evidence that it can speed up inference in models with large state spaces, by Andreas Stuhlmüller., Ph. D.
Published: 2016

13. A Dynamic Programming Algorithm for Inference in Recursive Probabilistic Programs

Author: Stuhlmüller, Andreas and Goodman, Noah D.
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS)
Abstract: We describe a dynamic programming algorithm for computing the marginal distribution of discrete probabilistic programs. This algorithm takes a functional interpreter for an arbitrary probabilistic programming language and turns it into an efficient marginalizer. Because direct caching of sub-distributions is impossible in the presence of recursion, we build a graph of dependencies between sub-distributions. This factored sum-product network makes (potentially cyclic) dependencies between subproblems explicit, and corresponds to a system of equations for the marginal distribution. We solve these equations by fixed-point iteration in topological order. We illustrate this algorithm on examples used in teaching probabilistic models, computational cognitive science research, and game theory., Comment: Second Statistical Relational AI workshop at UAI 2012 (StaRAI-12)
Published: 2012
Full Text: View/download PDF

14. Inducing Probabilistic Programs by Bayesian Program Merging

Author: Hwang, Irvin, Stuhlmüller, Andreas, and Goodman, Noah D.
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: This report outlines an approach to learning generative models from data. We express models as probabilistic programs, which allows us to capture abstract patterns within the examples. By choosing our language for programs to be an extension of the algebraic data type of the examples, we can begin with a program that generates all and only the examples. We then introduce greater abstraction, and hence generalization, incrementally to the extent that it improves the posterior probability of the examples given the program. Motivated by previous approaches to model merging and program induction, we search for such explanatory abstractions using program transformations. We consider two types of transformation: Abstraction merges common subexpressions within a program into new functions (a form of anti-unification). Deargumentation simplifies functions by reducing the number of arguments. We demonstrate that this approach finds key patterns in the domain of nested lists, including parameterized sub-functions and stochastic recursion.
Published: 2011
Full Text: View/download PDF

15. Knowledge and Implicature: Modeling Language Understanding as Social Cognition

Author: Goodman, Noah D., primary and Stuhlmüller, Andreas, additional
Published: 2013
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

15 results on '"Stuhlmüller, Andreas"'

1. Elicit: Language models as research tools

2. Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes

3. Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers

4. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

5. RAFT: A Real-World Few-Shot Text Classification Benchmark

6. Learning physical parameters from dynamic scenes

7. Evaluating Compositionality in Sentence Embeddings

8. Learning physical parameters from dynamic scenes

9. Agent-Agnostic Human-in-the-Loop Reinforcement Learning

10. Coarse-to-Fine Sequential Monte Carlo for Probabilistic Programs

11. C3: Lightweight Incrementalized MCMC for Probabilistic Programs using Continuations and Callsite Caching

12. Modeling cognition with probabilistic programs : representations and algorithms

13. A Dynamic Programming Algorithm for Inference in Recursive Probabilistic Programs

14. Inducing Probabilistic Programs by Bayesian Program Merging

15. Knowledge and Implicature: Modeling Language Understanding as Social Cognition

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

15 results on '"Stuhlmüller, Andreas"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources