Author: "Hewitt, John" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hewitt, John"' showing total 2,478 results

1. Instruction Following without Instruction Tuning

Author: Hewitt, John, Liu, Nelson F., Liang, Percy, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language
Abstract: Instruction tuning commonly means finetuning a language model on instruction-response pairs. We discover two forms of adaptation (tuning) that are deficient compared to instruction tuning, yet still yield instruction following; we call this implicit instruction tuning. We first find that instruction-response pairs are not necessary: training solely on responses, without any corresponding instructions, yields instruction following. This suggests pretrained models have an instruction-response mapping which is revealed by teaching the model the desired distribution of responses. However, we then find it's not necessary to teach the desired distribution of responses: instruction-response training on narrow-domain data like poetry still leads to broad instruction-following behavior like recipe generation. In particular, when instructions are very different from those in the narrow finetuning domain, models' responses do not adhere to the style of the finetuning domain. To begin to explain implicit instruction tuning, we hypothesize that very simple changes to a language model's distribution yield instruction following. We support this by hand-writing a rule-based language model which yields instruction following in a product-of-experts with a pretrained model. The rules are to slowly increase the probability of ending the sequence, penalize repetition, and uniformly change 15 words' probabilities. In summary, adaptations made without being designed to yield instruction following can do so implicitly.
Published: 2024

2. Learning Translations via Matrix Completion

Author: Wijaya, Derry, Callahan, Brendan, Hewitt, John, Gao, Jie, Ling, Xiao, Apidianaki, Marianna, and Callison-Burch, Chris
Subjects: Computer Science - Computation and Language, I.2.7
Abstract: Bilingual Lexicon Induction is the task of learning word translations without bilingual parallel corpora. We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix. This method harnesses diverse bilingual and monolingual signals, each of which may be incomplete or noisy. Our model achieves state-of-the-art performance for both high and low resource languages.
Comment: This is a late posting of an old paper as Google Scholar somehow misses indexing the ACL anthology version of the paper
Published: 2024

3. Model Editing with Canonical Examples

Author: Hewitt, John, Chen, Sarah, Xie, Lanruo Lora, Adams, Edward, Liang, Percy, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language
Abstract: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited. A canonical example is a simple instance of good behavior (e.g., The capital of Mauritius is Port Louis) or bad behavior (e.g., An aspect of researchers is coldhearted). The evaluation set contains more complex examples of each behavior (like a paragraph in which the capital of Mauritius is called for). We create three datasets and modify three more for model editing with canonical examples, covering knowledge-intensive improvements, social bias mitigation, and syntactic edge cases. In our experiments on Pythia language models, we find that LoRA outperforms full finetuning and MEMIT. We then turn to the Backpack language model architecture because it is intended to enable targeted improvement. The Backpack defines a large bank of sense vectors--a decomposition of the different uses of each word--which are weighted and summed to form the output logits of the model. We propose sense finetuning, which selects and finetunes a few ($\approx$ 10) sense vectors for each canonical example, and find that it outperforms other finetuning methods, e.g., 4.8% improvement vs 0.3%. Finally, we improve GPT-J-6B by an inference-time ensemble with just the changes from sense finetuning of a 35x smaller Backpack, in one setting outperforming editing GPT-J itself (4.1% vs 1.0%).
Published: 2024

4. Character-level Chinese Backpack Language Models

Author: Sun, Hao and Hewitt, John
Subjects: Computer Science - Computation and Language
Abstract: The Backpack is a Transformer alternative shown to improve interpretability in English language modeling by decomposing predictions into a weighted sum of token sense components. However, Backpacks' reliance on token-defined meaning raises questions as to their potential for languages other than English, a language for which subword tokenization provides a reasonable approximation for lexical items. In this work, we train, evaluate, interpret, and control Backpack language models in character-tokenized Chinese, in which words are often composed of many characters. We find that our (134M parameter) Chinese Backpack language model performs comparably to a (104M parameter) Transformer, and learns rich character-level meanings that log-additively compose to form word meanings. In SimLex-style lexical semantic evaluations, simple averages of Backpack character senses outperform input embeddings from a Transformer. We find that complex multi-character meanings are often formed by using the same per-character sense weights consistently across context. Exploring interpretability-through-control, we show that we can localize a source of gender bias in our Backpacks to specific character senses and intervene to reduce the bias.
Comment: BlackboxNLP 2023 Camera-Ready
Published: 2023

5. Closing the Curious Case of Neural Text Degeneration

Author: Finlayson, Matthew, Hewitt, John, Koller, Alexander, Swayamdipta, Swabha, and Sabharwal, Ashish
Subjects: Computer Science - Computation and Language, 68T50, I.2.7
Abstract: Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the effectiveness of truncation sampling by proving that truncation methods that discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonzero true probability. However, thresholds are a coarse heuristic, and necessarily discard some tokens with nonzero true probability as well. In pursuit of a more precise sampling strategy, we show that we can leverage a known source of model errors, the softmax bottleneck, to prove that certain tokens have nonzero true probability, without relying on a threshold. Based on our findings, we develop an experimental truncation strategy and present pilot studies demonstrating the promise of this type of algorithm. Our evaluations show that our method outperforms its threshold-based counterparts under automatic and human evaluation metrics for low-entropy (i.e., close to greedy) open-ended text generation. Our theoretical findings and pilot experiments both provide insight into why truncation sampling works and make progress toward more expressive sampling algorithms that better surface the generative capabilities of large language models.
Published: 2023

6. The Third Fermi Large Area Telescope Catalog of Gamma-ray Pulsars

Author: Smith, David A., Bruel, Philippe, Clark, Colin J., Guillemot, Lucas, Kerr, Matthew T., Ray, Paul, Abdollahi, Soheila, Ajello, Marco, Baldini, Luca, Ballet, Jean, Baring, Matthew, Bassa, Cees, Gonzalez, Josefa Becerra, Bellazzini, Ronaldo, Berretta, Alessandra, Bhattacharyya, Bhaswati, Bissaldi, Elisabetta, Bonino, Raffaella, Bottacini, Eugenio, Bregeon, Johan, Burgay, Marta, Burnett, Toby, Cameron, Rob, Camilo, Fernando, Caputo, Regina, Caraveo, Patrizia, Cavazzuti, Elisabetta, Chiaro, Graziano, Ciprini, Stefano, Cognard, Ismael, Orestano, Paolo Cristarella, Crnogorcevic, Milena, Cuoco, Alessandro, Cutini, Sara, D'Ammando, Filippo, de Angelis, Alessandro, De Gaetano, Salvatore, de Menezes, Raniere, de Palma, Francesco, DeCesar, Megan, Deneva, Julia, Di Lalla, Niccola, Di Venere, Leonardo, Dirirsa, Feraol Fana, Dominguez, Alberto, Dumora, Denis, Fegan, Stephen, Ferrara, Elizabeth, Fiori, Alessio, Fleischhack, Henrike, Flynn, Chris, Franckowiak, Anna, Freire, Paulo, Fukazawa, Yasushi, Fusco, Piergiorgio, Galanti, Giorgio, Gammaldi, Viviana, Gargano, Fabio, Gasparrini, Dario, Giacchino, Federica, Giglietto, Nico, Giordano, Francesco, Giroletti, Marcello, Green, David, Grenier, Isabelle, Guiriec, Sylvain, Gustafsson, Michael, Harding, Alice, Hays, Liz, Hewitt, John, Horan, Deirdre, Hou, Xian, Jankowski, Fabian, Johnson, Tyrel, Johnson, Robert, Johnston, Simon, Kataoka, Jun, Keith, Michael J., Kramer, Michael, Kuss, Michael, Latronico, Luca, Lee, Shiu-Hang, Li, Di, Li, Jian, Limyansky, Brent, Longo, Francesco, Loparco, Francesco, Lorusso, Leonarda, Lovellette, Michael, Lower, Marcus, Lubrano, Pasquale, Lyne, Andrew, Maldera, Simone, Manchester, Richard, Manfreda, Alberto, Marelli, Martino, Marta-Devesa, Guillem, Mazziotta, Mario Nicola, McEnery, Julie, Mereu, Isabella, Michelson, Peter, Mitthumsiri, Warit, Mizuno, Tsunefumi, Moiseev, Alex, Monzani, Maria Elena, Morselli, Aldo, Negro, Michela, Nemmen, Rodrigo, Nieder, Lars, Nuss, Eric, Omodei, Nicola, Orienti, Monica, Orlando, Elena, Ormes, Jonathan F., Palatiello, Michele, Paneque, David, Panzarini, Giuliana, Persic, Massimo, Pesce-Rollins, Melissa, Pillera, Roberta, Poon, Helen, Porter, Troy, Principe, Giacomo, Raino, Silvia, Rando, Riccardo, Ransom, Scott, Razzano, Massimiliano, Razzaque, Soebur, Reimer, Anita, Reimer, Olaf, Renault-Tinacci, Nicolas, Romani, Roger, Sanchez-Conde, Miguel A., Parkinson, Pablo Saz, Scotton, Lorenzo, Serini, Davide, Sgro, Carmelo, Shannon, Ryan, Sharma, Vidushi, Siskind, Eric J., Spandre, Gloria, Spinelli, Paolo, Stappers, Ben, Stephens, Tom, Suson, Dan, Tajima, Hiro, Tak, Dongguen, Theureau, Gilles, Thompson, David, Tibolla, Omar, Torres, Diego F., Valverde, Janeth, Venter, Christo, Wadiasingh, Zorawar, Wang, Nina, Wang, Pei, Weltevrede, Patrick, Wood, Kent, and Zaharijas, Gabrijela
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present 294 pulsars found in GeV data from the Large Area Telescope (LAT) on the Fermi Gamma-ray Space Telescope. Another 33 millisecond pulsars (MSPs) discovered in deep radio searches of LAT sources will likely reveal pulsations once phase-connected rotation ephemerides are achieved. A further dozen optical and/or X-ray binary systems co-located with LAT sources also likely harbor gamma-ray MSPs. This catalog thus reports roughly 340 gamma-ray pulsars and candidates, 10% of all known pulsars, compared to $\leq 11$ known before Fermi. Half of the gamma-ray pulsars are young. Of these, the half that are undetected in radio have a broader Galactic latitude distribution than the young radio-loud pulsars. The others are MSPs, with 6 undetected in radio. Overall, >235 are bright enough above 50 MeV to fit the pulse profile, the energy spectrum, or both. For the common two-peaked profiles, the gamma-ray peak closest to the magnetic pole crossing generally has a softer spectrum. The spectral energy distributions tend to narrow as the spindown power $\dot E$ decreases to its observed minimum near $10^{33}$ erg s$^{-1}$, approaching the shape for synchrotron radiation from monoenergetic electrons. We calculate gamma-ray luminosities when distances are available. Our all-sky gamma-ray sensitivity map is useful for population syntheses. The electronic catalog version provides gamma-ray pulsar ephemerides, properties and fit results to guide and be compared with modeling results.
Comment: 142 pages. Accepted by the Astrophysical Journal Supplement
Published: 2023

7. Lost in the Middle: How Language Models Use Long Contexts

Author: Liu, Nelson F., Lin, Kevin, Hewitt, John, Paranjape, Ashwin, Bevilacqua, Michele, Petroni, Fabio, and Liang, Percy
Subjects: Computer Science - Computation and Language
Abstract: While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do not robustly make use of information in long input contexts. In particular, we observe that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts, even for explicitly long-context models. Our analysis provides a better understanding of how language models use their input context and provides new evaluation protocols for future long-context language models.
Comment: 18 pages, 16 figures. Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2023
Published: 2023

8. Backpack Language Models

Author: Hewitt, John, Thickstun, John, Manning, Christopher D., and Liang, Percy
Subjects: Computer Science - Computation and Language
Abstract: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. Backpacks learn multiple non-contextual sense vectors for each word in a vocabulary, and represent a word in a sequence as a context-dependent, non-negative linear combination of sense vectors in this sequence. We find that, after training, sense vectors specialize, each encoding a different aspect of a word. We can interpret a sense vector by inspecting its (non-contextual, linear) projection onto the output space, and intervene on these interpretable hooks to change the model's behavior in predictable ways. We train a 170M-parameter Backpack language model on OpenWebText, matching the loss of a GPT-2 small (124M-parameter) Transformer. On lexical similarity evaluations, we find that Backpack sense vectors outperform even a 6B-parameter Transformer LM's word embeddings. Finally, we present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing. For example, we can edit the sense vocabulary to tend more towards a topic, or localize a source of gender bias to a sense vector and globally suppress that sense.
Comment: ACL 2023 Camera-Ready
Published: 2023

9. Two Decades of Accomplishment and Progress in Behavior Genetics

Author: Hewitt, John K
Published: 2024

10. Announcement of the Editors’ Choice Award (Formerly the Fulker Award) for a Paper Published in Behavior Genetics, Volume 53, 2023

Author: Hewitt, John K
Published: 2024

11. An Examination of the Protective Role of Internalizing Symptoms in Adolescent Substance Use

Author: Rieselbach, Maya M., Gresko, Shelley, Corley, Robin P., Hewitt, John K., and Rhee, Soo Hyun
Published: 2024

12. Negotiating Roles and Relationships: Stepping Through the Minefield of Co-Authors and Textbook Publishers

Author: Hewitt, John D. and Regoli, Robert M.
Published: 2010

13. JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

Author: Armstrong, Ruth-Ann, Hewitt, John, and Manning, Christopher
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, I.2.7
Abstract: JamPatoisNLI provides the first dataset for natural language inference in a creole language, Jamaican Patois. Many of the most-spoken low-resource languages are creoles. These languages commonly have a lexicon derived from a major world language and a distinctive grammar reflecting the languages of the original speakers and the process of language birth by creolization. This gives them a distinctive place in exploring the effectiveness of transfer from large monolingual or multilingual pretrained models. While our work, along with previous work, shows that transfer from these models to low-resource languages that are unrelated to languages in their training set is not very effective, we would expect stronger results from transfer to creoles. Indeed, our experiments show considerably better results from few-shot learning of JamPatoisNLI than for such unrelated languages, and help us begin to understand how the unique relationship between creoles and their high-resource base languages affects cross-lingual transfer. JamPatoisNLI, which consists of naturally-occurring premises and expert-written hypotheses, is a step towards steering research into a traditionally underserved language and a useful benchmark for understanding cross-lingual NLP.
Comment: 14 pages, 3 figures, Findings of EMNLP 2022
Published: 2022

14. Mercator - mapping Marco Polo's hearsay

Author: Hewitt, John
Published: 2022

15. Truncation Sampling as Language Model Desmoothing

Author: Hewitt, John, Manning, Christopher D., and Liang, Percy
Subjects: Computer Science - Computation and Language
Abstract: Long samples of text from neural language models can be of poor quality. Truncation sampling algorithms--like top-$p$ or top-$k$--address this by setting some words' probabilities to zero at each step. This work provides framing for the aim of truncation, and an improved algorithm for that aim. We propose thinking of a neural language model as a mixture of a true distribution and a smoothing distribution that avoids infinite perplexity. In this light, truncation algorithms aim to perform desmoothing, estimating a subset of the support of the true distribution. Finding a good subset is crucial: we show that top-$p$ unnecessarily truncates high-probability words, for example causing it to truncate all words but Trump for a document that starts with Donald. We introduce $\eta$-sampling, which truncates words below an entropy-dependent probability threshold. Compared to previous algorithms, $\eta$-sampling generates more plausible long English documents according to humans, is better at breaking out of repetition, and behaves more reasonably on a battery of test distributions.
Comment: Findings of EMNLP, + small fixes
Published: 2022

16. Jean Mallard's world map (ca. 1538-39)

Author: Hewitt, John
Published: 2016

17. Genetic diversity fuels gene discovery for tobacco and alcohol use

Author: Saunders, Gretchen RB, Wang, Xingyan, Chen, Fang, Jang, Seon-Kyeong, Liu, Mengzhen, Wang, Chen, Gao, Shuang, Jiang, Yu, Khunsriraksakul, Chachrit, Otto, Jacqueline M, Addison, Clifton, Akiyama, Masato, Albert, Christine M, Aliev, Fazil, Alonso, Alvaro, Arnett, Donna K, Ashley-Koch, Allison E, Ashrani, Aneel A, Barnes, Kathleen C, Barr, R Graham, Bartz, Traci M, Becker, Diane M, Bielak, Lawrence F, Benjamin, Emelia J, Bis, Joshua C, Bjornsdottir, Gyda, Blangero, John, Bleecker, Eugene R, Boardman, Jason D, Boerwinkle, Eric, Boomsma, Dorret I, Boorgula, Meher Preethi, Bowden, Donald W, Brody, Jennifer A, Cade, Brian E, Chasman, Daniel I, Chavan, Sameer, Chen, Yii-Der Ida, Chen, Zhengming, Cheng, Iona, Cho, Michael H, Choquet, Hélène, Cole, John W, Cornelis, Marilyn C, Cucca, Francesco, Curran, Joanne E, de Andrade, Mariza, Dick, Danielle M, Docherty, Anna R, Duggirala, Ravindranath, Eaton, Charles B, Ehringer, Marissa A, Esko, Tõnu, Faul, Jessica D, Fernandes Silva, Lilian, Fiorillo, Edoardo, Fornage, Myriam, Freedman, Barry I, Gabrielsen, Maiken E, Garrett, Melanie E, Gharib, Sina A, Gieger, Christian, Gillespie, Nathan, Glahn, David C, Gordon, Scott D, Gu, Charles C, Gu, Dongfeng, Gudbjartsson, Daniel F, Guo, Xiuqing, Haessler, Jeffrey, Hall, Michael E, Haller, Toomas, Harris, Kathleen Mullan, He, Jiang, Herd, Pamela, Hewitt, John K, Hickie, Ian, Hidalgo, Bertha, Hokanson, John E, Hopfer, Christian, Hottenga, JoukeJan, Hou, Lifang, Huang, Hongyan, Hung, Yi-Jen, Hunter, David J, Hveem, Kristian, Hwang, Shih-Jen, Hwu, Chii-Min, Iacono, William, Irvin, Marguerite R, Jee, Yon Ho, Johnson, Eric O, Joo, Yoonjung Y, Jorgenson, Eric, Justice, Anne E, Kamatani, Yoichiro, Kaplan, Robert C, Kaprio, Jaakko, Kardia, Sharon LR, and Keller, Matthew C
Subjects: Genetics, Alcoholism, Alcohol Use and Health, Prevention, Human Genome, Substance Misuse, Aetiology, 2.1 Biological and endogenous factors, Cancer, Good Health and Well Being, Humans, Genetic Predisposition to Disease, Genetic Variation, Genome-Wide Association Study, Multifactorial Inheritance, Risk Factors, Tobacco Use, Alcohol Drinking, Transcriptome, Sample Size, Genetic Loci, Internationality, Europe, 23andMe Research Team, Biobank Japan Project, General Science & Technology
Abstract: Tobacco and alcohol use are heritable behaviours associated with 15% and 5.3% of worldwide deaths, respectively, due largely to broad increased risk for disease and injury. These substances are used across the globe, yet genome-wide association studies have focused largely on individuals of European ancestries. Here we leveraged global genetic diversity across 3.4 million individuals from four major clines of global ancestry (approximately 21% non-European) to power the discovery and fine-mapping of genomic loci associated with tobacco and alcohol use, to inform function of these loci via ancestry-aware transcriptome-wide association studies, and to evaluate the genetic architecture and predictive power of polygenic risk within and across populations. We found that increases in sample size and genetic diversity improved locus identification and fine-mapping resolution, and that a large majority of the 3,823 associated variants (from 2,143 loci) showed consistent effect sizes across ancestry dimensions. However, polygenic risk scores developed in one ancestry performed poorly in others, highlighting the continued need to increase sample sizes of diverse ancestries to realize any potential benefit of polygenic prediction.
Published: 2022

18. Two Indian ocean voyages by the great ship Sao Paulo

Author: Hewitt, John
Published: 2024

19. Incremental Fermi Large Area Telescope Fourth Source Catalog

Author: Fermi-LAT Collaboration, Abdollahi, Soheila, Acero, Fabio, Baldini, Luca, Ballet, Jean, Bastieri, Denis, Bellazzini, Ronaldo, Berenji, Bijan, Berretta, Alessandra, Bissaldi, Elisabetta, Blandford, Roger D., Bloom, Elliott, Bonino, Raffaella, Brill, Ari, Britto, Richard J., Bruel, Philippe, Burnett, Toby H., Buson, Sara, Cameron, Rob A., Caputo, Regina, Caraveo, Patrizia A., Castro, Daniel, Chaty, Sylvain, Cheung, Teddy C., Chiaro, Graziano, Cibrario, Nicolo, Ciprini, Stefano, Coronado-Blazquez, Javier, Crnogorcevic, Milena, Cutini, Sara, D'Ammando, Filippo, De Gaetano, Salvatore, Digel, Seth W., Di Lalla, Niccolo, Dirirsa, Feraol F., Di Venere, Leonardo, Dominguez, Alberto, Ramazani, Vandad Fallah, Fegan, Stephen J., Ferrara, Elizabeth C., Fiori, Alessio, Fleischhack, Henrike, Franckowiak, Anna, Fukazawa, Yasushi, Funk, Stefan, Fusco, Piergiorgio, Galanti, Giorgio, Gammaldi, Viviana, Gargano, Fabio, Garrappa, Simone, Gasparrini, Dario, Giacchino, Federica, Giglietto, Nico, Giordano, Francesco, Giroletti, Marcello, Glanzman, Thomas, Green, David, Grenier, Isabelle A., Grondin, Marie-Helene, Guillemot, Lucas, Guiriec, Sylvain, Gustafsson, Michael, Harding, Alice K., Hays, Liz, Hewitt, John W., Horan, Deirdre, Hou, Xian, Johannesson, Gudlaugur, Karwin, Christopher M., Kayanoki, Taishu, Kerr, Matthew T., Kuss, Michael, Landriu, David, Larsson, Stefan, Latronico, Luca, Lemoine-Goumard, Marianne, Li, Jian, Liodakis, Ioannis, Longo, Francesco, Loparco, Francesco, Lott, Benoit, Lubrano, Pasquale, Maldera, Simone, Malyshev, Dmitry, Manfreda, Alberto, Marti-Devesa, Guillem, Mazziotta, Mario N., Mereu, Isabella, Meyer, Manuel, Michelson, Peter F., Mirabal, Nestor, Mitthumsiri, Warit, Mizuno, Tsunefumi, Moiseev, Alex A., Monzani, Maria E., Morselli, Aldo, Moskalenko, Igor V., Negro, Michela, Nuss, Eric, Omodei, Nicola, Orienti, Monica, Orlando, Elena, Paneque, David, Pei, Zhiyuan, Perkins, Jeremy S., Persic, Massimo, Pesce-Rollins, Melissa, Petrosian, Vahe, Pillera, Roberta, Poon, Helen, Porter, Troy A., Principe, Giacomo, Raino, Silvia, Rando, Riccardo, Rani, Bindu, Razzano, Massimiliano, Razzaque, Soebur, Reimer, Anita, Reimer, Olaf, Reposeur, Thierry, Sanchez-Conde, Miguel A., Parkinson, Pablo M. Saz, Scotton, Lorenzo, Serini, Davide, Sgro, Carmelo, Siskind, Eric J., Smith, David A., Spandre, Gloria, Spinelli, Paolo, Sueoka, Kohei, Suson, Dan J., Tajima, Hiro, Tak, Dongguen, Thayer, Jana B., Thompson, David J., Torres, Diego F., Troja, Eleonora, Valverde, Janeth, Wood, Kent, and Zaharijas, Gabrijela
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present an incremental version (4FGL-DR3, for Data Release 3) of the fourth Fermi-LAT catalog of gamma-ray sources. Based on the first twelve years of science data in the energy range from 50 MeV to 1 TeV, it contains 6658 sources. The analysis improves on that used for the 4FGL catalog over eight years of data: more sources are fit with curved spectra, we introduce a more robust spectral parameterization for pulsars, and we extend the spectral points to 1 TeV. The spectral parameters, spectral energy distributions, and associations are updated for all sources. Light curves are rebuilt for all sources with 1 yr intervals (not 2 month intervals). Among the 5064 original 4FGL sources, 16 were deleted, 112 are formally below the detection threshold over 12 yr (but are kept in the list), while 74 are newly associated, 10 have an improved association, and seven associations were withdrawn. Pulsars are split explicitly between young and millisecond pulsars. Pulsars and binaries newly detected in LAT sources, as well as more than 100 newly classified blazars, are reported. We add three extended sources and 1607 new point sources, mostly just above the detection threshold, among which eight are considered identified, and 699 have a plausible counterpart at other wavelengths. We discuss degree-scale residuals to the global sky model and clusters of soft unassociated point sources close to the Galactic plane, which are possibly related to limitations of the interstellar emission model and missing extended sources.
Comment: accepted in ApJS; follow-up paper to 1902.10045
Published: 2022

20. Genome-wide Association Meta-analysis of Childhood and Adolescent Internalizing Symptoms

Author: Jami, Eshim S, Hammerschlag, Anke R, Ip, Hill F, Allegrini, Andrea G, Benyamin, Beben, Border, Richard, Diemer, Elizabeth W, Jiang, Chang, Karhunen, Ville, Lu, Yi, Lu, Qing, Mallard, Travis T, Mishra, Pashupati P, Nolte, Ilja M, Palviainen, Teemu, Peterson, Roseann E, Sallis, Hannah M, Shabalin, Andrey A, Tate, Ashley E, Thiering, Elisabeth, Vilor-Tejedor, Natàlia, Wang, Carol, Zhou, Ang, Adkins, Daniel E, Alemany, Silvia, Ask, Helga, Chen, Qi, Corley, Robin P, Ehli, Erik A, Evans, Luke M, Havdahl, Alexandra, Hagenbeek, Fiona A, Hakulinen, Christian, Henders, Anjali K, Hottenga, Jouke Jan, Korhonen, Tellervo, Mamun, Abdullah, Marrington, Shelby, Neumann, Alexander, Rimfeld, Kaili, Rivadeneira, Fernando, Silberg, Judy L, van Beijsterveldt, Catharina E, Vuoksimaa, Eero, Whipp, Alyce M, Tong, Xiaoran, Andreassen, Ole A, Boomsma, Dorret I, Brown, Sandra A, Burt, S Alexandra, Copeland, William, Dick, Danielle M, Harden, K Paige, Harris, Kathleen Mullan, Hartman, Catharina A, Heinrich, Joachim, Hewitt, John K, Hopfer, Christian, Hypponen, Elina, Jarvelin, Marjo-Riitta, Kaprio, Jaakko, Keltikangas-Järvinen, Liisa, Klump, Kelly L, Krauter, Kenneth, Kuja-Halkola, Ralf, Larsson, Henrik, Lehtimäki, Terho, Lichtenstein, Paul, Lundström, Sebastian, Maes, Hermine H, Magnus, Per, Munafò, Marcus R, Najman, Jake M, Njølstad, Pål R, Oldehinkel, Albertine J, Pennell, Craig E, Plomin, Robert, Reichborn-Kjennerud, Ted, Reynolds, Chandra, Rose, Richard J, Smolen, Andrew, Snieder, Harold, Stallings, Michael, Standl, Marie, Sunyer, Jordi, Tiemeier, Henning, Wadsworth, Sally J, Wall, Tamara L, Whitehouse, Andrew JO, Williams, Gail M, Ystrøm, Eivind, Nivard, Michel G, Bartels, Meike, and Middeldorp, Christel M
Subjects: Biological Psychology, Psychology, Serious Mental Illness, Brain Disorders, Pediatric, Human Genome, Genetics, Behavioral and Social Science, Depression, Mental Health, Mental Illness, 2.1 Biological and endogenous factors, 2.3 Psychological, social and economic factors, Mental health, Adolescent, Adult, Aggression, Anxiety, Attention Deficit Disorder with Hyperactivity, Autistic Disorder, Bipolar Disorder, Child, Child, Preschool, Genome-Wide Association Study, Humans, Loneliness, Polymorphism, Single Nucleotide, Schizophrenia, Sleep Initiation and Maintenance Disorders, depression, anxiety, repeated measures, genetic epidemiology, molecular genetics, Medical and Health Sciences, Psychology and Cognitive Sciences, Developmental & Child Psychology, Clinical sciences, Paediatrics, Applied and developmental psychology
Abstract: Objective: To investigate the genetic architecture of internalizing symptoms in childhood and adolescence. Method: In 22 cohorts, multiple univariate genome-wide association studies (GWASs) were performed using repeated assessments of internalizing symptoms, in a total of 64,561 children and adolescents between 3 and 18 years of age. Results were aggregated in meta-analyses that accounted for sample overlap, first using all available data, and then using subsets of measurements grouped by rater, age, and instrument. Results: The meta-analysis of overall internalizing symptoms (INToverall) detected no genome-wide significant hits and showed low single nucleotide polymorphism (SNP) heritability (1.66%, 95% CI = 0.84-2.48%, neffective = 132,260). Stratified analyses indicated rater-based heterogeneity in genetic effects, with self-reported internalizing symptoms showing the highest heritability (5.63%, 95% CI = 3.08%-8.18%). The contribution of additive genetic effects on internalizing symptoms appeared to be stable over age, with overlapping estimates of SNP heritability from early childhood to adolescence. Genetic correlations were observed with adult anxiety, depression, and the well-being spectrum (|rg| > 0.70), as well as with insomnia, loneliness, attention-deficit/hyperactivity disorder, autism, and childhood aggression (range |rg| = 0.42-0.60), whereas there were no robust associations with schizophrenia, bipolar disorder, obsessive-compulsive disorder, or anorexia nervosa. Conclusion: Genetic correlations indicate that childhood and adolescent internalizing symptoms share substantial genetic vulnerabilities with adult internalizing disorders and other childhood psychiatric traits, which could partially explain both the persistence of internalizing symptoms over time and the high comorbidity among childhood psychiatric traits. Reducing phenotypic heterogeneity in childhood samples will be key in paving the way to future GWAS success.
Published: 2022

21. Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects

Author: Howe, Laurence J, Nivard, Michel G, Morris, Tim T, Hansen, Ailin F, Rasheed, Humaira, Cho, Yoonsu, Chittoor, Geetha, Ahlskog, Rafael, Lind, Penelope A, Palviainen, Teemu, van der Zee, Matthijs D, Cheesman, Rosa, Mangino, Massimo, Wang, Yunzhang, Li, Shuai, Klaric, Lucija, Ratliff, Scott M, Bielak, Lawrence F, Nygaard, Marianne, Giannelis, Alexandros, Willoughby, Emily A, Reynolds, Chandra A, Balbona, Jared V, Andreassen, Ole A, Ask, Helga, Baras, Aris, Bauer, Christopher R, Boomsma, Dorret I, Campbell, Archie, Campbell, Harry, Chen, Zhengming, Christofidou, Paraskevi, Corfield, Elizabeth, Dahm, Christina C, Dokuru, Deepika R, Evans, Luke M, de Geus, Eco JC, Giddaluru, Sudheer, Gordon, Scott D, Harden, K Paige, Hill, W David, Hughes, Amanda, Kerr, Shona M, Kim, Yongkang, Kweon, Hyeokmoon, Latvala, Antti, Lawlor, Deborah A, Li, Liming, Lin, Kuang, Magnus, Per, Magnusson, Patrik KE, Mallard, Travis T, Martikainen, Pekka, Mills, Melinda C, Njølstad, Pål Rasmus, Overton, John D, Pedersen, Nancy L, Porteous, David J, Reid, Jeffrey, Silventoinen, Karri, Southey, Melissa C, Stoltenberg, Camilla, Tucker-Drob, Elliot M, Wright, Margaret J, Hewitt, John K, Keller, Matthew C, Stallings, Michael C, Lee, James J, Christensen, Kaare, Kardia, Sharon LR, Peyser, Patricia A, Smith, Jennifer A, Wilson, James F, Hopper, John L, Hägg, Sara, Spector, Tim D, Pingault, Jean-Baptiste, Plomin, Robert, Havdahl, Alexandra, Bartels, Meike, Martin, Nicholas G, Oskarsson, Sven, Justice, Anne E, Millwood, Iona Y, Hveem, Kristian, Naess, Øyvind, Willer, Cristen J, Åsvold, Bjørn Olav, Koellinger, Philipp D, Kaprio, Jaakko, Medland, Sarah E, Walters, Robin G, Benjamin, Daniel J, Turley, Patrick, Evans, David M, Davey Smith, George, Hayward, Caroline, Brumpton, Ben, Hemani, Gibran, and Davies, Neil M
Subjects: Human Genome, Genetics, Pediatric, 2.1 Biological and endogenous factors, Aetiology, Mental health, Generic health relevance, Genome-Wide Association Study, Humans, Mendelian Randomization Analysis, Multifactorial Inheritance, Phenotype, Polymorphism, Single Nucleotide, Social Science Genetic Association Consortium, Within Family Consortium, Biological Sciences, Medical and Health Sciences, Developmental Biology
Abstract: Estimates from genome-wide association studies (GWAS) of unrelated individuals capture effects of inherited variation (direct effects), demography (population stratification, assortative mating) and relatives (indirect genetic effects). Family-based GWAS designs can control for demographic and indirect genetic effects, but large-scale family datasets have been lacking. We combined data from 178,086 siblings from 19 cohorts to generate population (between-family) and within-sibship (within-family) GWAS estimates for 25 phenotypes. Within-sibship GWAS estimates were smaller than population estimates for height, educational attainment, age at first birth, number of children, cognitive ability, depressive symptoms and smoking. Some differences were observed in downstream SNP heritability, genetic correlations and Mendelian randomization analyses. For example, the within-sibship genetic correlation between educational attainment and body mass index attenuated towards zero. In contrast, analyses of most molecular phenotypes (for example, low-density lipoprotein-cholesterol) were generally consistent. We also found within-sibship evidence of polygenic adaptation on taller height. Here, we illustrate the importance of family-based GWAS data for phenotypes influenced by demographic and indirect genetic effects.
Published: 2022

22. Frontmatter

Author: Hewitt, John
Published: 1991

23. Cover

Author: Hewitt, John
Published: 1991

24. Index

Author: Hewitt, John
Published: 1991

25. Bibliography

Author: Hewitt, John
Published: 1991

26. VI. Strategies of Self-Construction

Author: Hewitt, John
Published: 1991

27. V. A Theory of Identity

Author: Hewitt, John
Published: 1991

28. IV. Modernity, Society, and Community

Author: Hewitt, John
Published: 1991

29. VII. In the Last Analysis

Author: Hewitt, John
Published: 1991

30. Notes

Author: Hewitt, John
Published: 1991

31. II. Social Theory as Cultural Text

Author: Hewitt, John
Published: 1991

32. I. The Ubiquity of the Self

Author: Hewitt, John
Published: 1991

33. III. A View of American Culture

Author: Hewitt, John
Published: 1991

34. Preface

Author: Hewitt, John
Published: 1991

35. Conditional probing: measuring usable information beyond a baseline

Author: Hewitt, John, Ethayarajh, Kawin, Liang, Percy, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language
Abstract: Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable. One suggests that a representation encodes a property if probing that representation produces higher accuracy than probing a baseline representation like non-contextual word embeddings. Instead of using baselines as a point of comparison, we're interested in measuring information that is contained in the representation but not in the baseline. For example, current methods can detect when a representation is more useful than the word identity (a baseline) for predicting part-of-speech; however, they cannot detect when the representation is predictive of just the aspects of part-of-speech not explainable by the word identity. In this work, we extend a theory of usable information called $\mathcal{V}$-information and propose conditional probing, which explicitly conditions on the information in the baseline. In a case study, we find that after conditioning on non-contextual word embeddings, properties like part-of-speech are accessible at deeper layers of a network than previously thought., Comment: EMNLP 2021 + typo fixes
Published: 2021

36. On the Opportunities and Risks of Foundation Models

Author: Bommasani, Rishi, Hudson, Drew A., Adeli, Ehsan, Altman, Russ, Arora, Simran, von Arx, Sydney, Bernstein, Michael S., Bohg, Jeannette, Bosselut, Antoine, Brunskill, Emma, Brynjolfsson, Erik, Buch, Shyamal, Card, Dallas, Castellon, Rodrigo, Chatterji, Niladri, Chen, Annie, Creel, Kathleen, Davis, Jared Quincy, Demszky, Dora, Donahue, Chris, Doumbouya, Moussa, Durmus, Esin, Ermon, Stefano, Etchemendy, John, Ethayarajh, Kawin, Fei-Fei, Li, Finn, Chelsea, Gale, Trevor, Gillespie, Lauren, Goel, Karan, Goodman, Noah, Grossman, Shelby, Guha, Neel, Hashimoto, Tatsunori, Henderson, Peter, Hewitt, John, Ho, Daniel E., Hong, Jenny, Hsu, Kyle, Huang, Jing, Icard, Thomas, Jain, Saahil, Jurafsky, Dan, Kalluri, Pratyusha, Karamcheti, Siddharth, Keeling, Geoff, Khani, Fereshte, Khattab, Omar, Koh, Pang Wei, Krass, Mark, Krishna, Ranjay, Kuditipudi, Rohith, Kumar, Ananya, Ladhak, Faisal, Lee, Mina, Lee, Tony, Leskovec, Jure, Levent, Isabelle, Li, Xiang Lisa, Li, Xuechen, Ma, Tengyu, Malik, Ali, Manning, Christopher D., Mirchandani, Suvir, Mitchell, Eric, Munyikwa, Zanele, Nair, Suraj, Narayan, Avanika, Narayanan, Deepak, Newman, Ben, Nie, Allen, Niebles, Juan Carlos, Nilforoshan, Hamed, Nyarko, Julian, Ogut, Giray, Orr, Laurel, Papadimitriou, Isabel, Park, Joon Sung, Piech, Chris, Portelance, Eva, Potts, Christopher, Raghunathan, Aditi, Reich, Rob, Ren, Hongyu, Rong, Frieda, Roohani, Yusuf, Ruiz, Camilo, Ryan, Jack, Ré, Christopher, Sadigh, Dorsa, Sagawa, Shiori, Santhanam, Keshav, Shih, Andy, Srinivasan, Krishnan, Tamkin, Alex, Taori, Rohan, Thomas, Armin W., Tramèr, Florian, Wang, Rose E., Wang, William, Wu, Bohan, Wu, Jiajun, Wu, Yuhuai, Xie, Sang Michael, Yasunaga, Michihiro, You, Jiaxuan, Zaharia, Matei, Zhang, Michael, Zhang, Tianyi, Zhang, Xikun, Zhang, Yuhui, Zheng, Lucia, Zhou, Kaitlyn, and Liang, Percy
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature., Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html
Published: 2021

37. Hybrid cosmic ray measurements using the IceAct telescopes in coincidence with the IceCube and IceTop detectors

Author: Paul, Larissa, Plum, Matthias, Schaufel, Merlin, Bretz, Thomas, Do, Giang, Hewitt, John W., Maslowski, Frank, Rehbein, Florian, Schäfer, Johannes, and Zink, Adrian
Subjects: Astrophysics - High Energy Astrophysical Phenomena, Astrophysics - Instrumentation and Methods for Astrophysics
Abstract: IceAct is a proposed surface array of compact (50 cm diameter) and cost-effective Imaging Air Cherenkov Telescopes installed at the site of the IceCube Neutrino Observatory at the geographic South Pole. Since January 2019, two IceAct telescope demonstrators, featuring 61 silicon photomultiplier (SiPM) pixels have been taking data in the center of the IceTop surface array during the austral winter. We present the first analysis of hybrid cosmic ray events detected by the IceAct imaging air-Cherenkov telescopes in coincidence with the IceCube Neutrino Observatory, including the IceTop surface array and the IceCube in-ice array. By featuring an energy threshold of about 10 TeV and a wide field-of-view, the IceAct telescopes show promising capabilities of improving current cosmic ray composition studies: measuring the Cherenkov light emissions in the atmosphere adds new information about the shower development not accessible with the current detectors, enabling significantly better primary particle type discrimination on a statistical basis. The hybrid measurement also allows for detailed feasibility studies of detector cross-calibration and of cosmic ray veto capabilities for neutrino analyses. We present the performance of the telescopes, the results from the analysis of two years of data, and an outlook of a hybrid simulation for a future telescope array., Comment: Presented at the 37th International Cosmic Ray Conference (ICRC 2021). See arXiv:2107.06966 for all IceCube contributions
Published: 2021

38. Exploring Relationships Between Internalizing Problems and Risky Sexual Behavior: A Twin Study

Author: Paulich, Katie N., Freis, Samantha M., Dokuru, Deepika R., Alexander, Jordan D., Vrieze, Scott I., Corley, Robin P., McGue, Matt, Hewitt, John K., and Stallings, Michael C.
Published: 2023
Full Text: View/download PDF

39. Genotype Data and Derived Genetic Instruments of Adolescent Brain Cognitive Development Study® for Better Understanding of Human Brain Development

Author: Fan, Chun Chieh, Loughnan, Robert, Wilson, Sylia, and Hewitt, John K.
Published: 2023
Full Text: View/download PDF

40. ABCD Behavior Genetics: Twin, Family, and Genomic Studies Using the Adolescent Brain Cognitive Development (ABCD) Study Dataset

Author: Wilson, Sylia, Fan, Chun Chieh, and Hewitt, John
Published: 2023
Full Text: View/download PDF

41. Refining Targeted Syntactic Evaluation of Language Models

Author: Newman, Benjamin, Ang, Kai-Siang, Gong, Julia, and Hewitt, John
Subjects: Computer Science - Computation and Language, I.2.7
Abstract: Targeted syntactic evaluation of subject-verb number agreement in English (TSE) evaluates language models' syntactic knowledge using hand-crafted minimal pairs of sentences that differ only in the main verb's conjugation. The method evaluates whether language models rate each grammatical sentence as more likely than its ungrammatical counterpart. We identify two distinct goals for TSE. First, evaluating the systematicity of a language model's syntactic knowledge: given a sentence, can it conjugate arbitrary verbs correctly? Second, evaluating a model's likely behavior: given a sentence, does the model concentrate its probability mass on correctly conjugated verbs, even if only on a subset of the possible verbs? We argue that current implementations of TSE do not directly capture either of these goals, and propose new metrics to capture each goal separately. Under our metrics, we find that TSE overestimates systematicity of language models, but that models score up to 40% better on verbs that they predict are likely in context., Comment: 14 pages, 5 figures, 3 tables. To appear at NAACL 2021
Published: 2021

42. Probing artificial neural networks: insights from neuroscience

Author: Ivanova, Anna A., Hewitt, John, and Zaslavsky, Noga
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: A major challenge in both neuroscience and machine learning is the development of useful tools for understanding complex information processing systems. One such tool is probes, i.e., supervised models that relate features of interest to activation patterns arising in biological or artificial neural networks. Neuroscience has paved the way in using such models through numerous studies conducted in recent decades. In this work, we draw insights from neuroscience to help guide probing research in machine learning. We highlight two important design choices for probes $-$ direction and expressivity $-$ and relate these choices to research goals. We argue that specific research goals play a paramount role when designing a probe and encourage future probing studies to be explicit in stating these goals., Comment: ICLR 2021 Workshop: How Can Findings About The Brain Improve AI Systems?
Published: 2021

43. Our native land.

Author: Hewitt, John Hill
Subjects: Broadsides 19th century. United States, Songs Texts. 19th century United States, Popular music Texts. 19th century United States, Patriotic music Texts. 19th century United States, Fratricide Songs and music Texts., Flags Songs and music Texts. United States, Musique populaire Textes. 19e siècle États-Unis, Musique patriotique Textes. 19e siècle États-Unis, Broadsides., Flags., Fratricide., Patriotic music., Popular music., Songs., United States Songs and music Texts. History Civil War, 1861-1865, United States.
Published: 2024

44. Yes! We think of thee at home.

Author: Hewitt, John Hill
Subjects: Broadsides 19th century. United States, Songs Texts. 19th century United States, Popular music Texts. 19th century United States, Reconciliation Songs and music Texts., Loneliness Songs and music Texts., Musique populaire Textes. 19e siècle États-Unis, Broadsides., Loneliness., Popular music., Reconciliation., Songs., United States.
Published: 2024

45. The Mountain bugle.

Author: Hewitt, John Hill
Subjects: Broadsides 19th century. United States, Songs Texts. 19th century United States, Popular music Texts. 19th century United States, Bugle calls Songs and music Texts., Hunting Songs and music Texts., Musique populaire Textes. 19e siècle États-Unis, Broadsides., Bugle calls., Hunting., Popular music., Songs., United States.
Published: 2024

46. Genetic and Environmental Variation in Continuous Phenotypes in the ABCD Study®

Author: Maes, Hermine H. M., Lapato, Dana M., Schmitt, J. Eric, Luciana, Monica, Banich, Marie T., Bjork, James M., Hewitt, John K., Madden, Pamela A., Heath, Andrew C., Barch, Deanna M., Thompson, Wes K., Iacono, William G., and Neale, Michael C.
Published: 2023
Full Text: View/download PDF

47. RNNs can generate bounded hierarchical languages with optimal memory

Author: Hewitt, John, Hahn, Michael, Ganguli, Surya, Liang, Percy, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language
Abstract: Recurrent neural networks empirically generate natural language with high syntactic fidelity. However, their success is not well-understood theoretically. We provide theoretical insight into this success, proving in a finite-precision setting that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax. We introduce Dyck-($k$,$m$), the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth, reflecting the bounded memory needs and long-distance dependencies of natural language syntax. The best known results use $O(k^{\frac{m}{2}})$ memory (hidden units) to generate these languages. We prove that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction. Finally, we show that no algorithm, even with unbounded computation, can suffice with $o(m \log k)$ hidden units., Comment: EMNLP2020 + appendix typo fixes
Published: 2020

48. The EOS Decision and Length Extrapolation

Author: Newman, Benjamin, Hewitt, John, Liang, Percy, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language
Abstract: Extrapolation to unseen sequence lengths is a challenge for neural generative models of language. In this work, we characterize the effect on length extrapolation of a modeling decision often overlooked: predicting the end of the generative process through the use of a special end-of-sequence (EOS) vocabulary item. We study an oracle setting - forcing models to generate to the correct sequence length at test time - to compare the length-extrapolative behavior of networks trained to predict EOS (+EOS) with networks not trained to (-EOS). We find that -EOS substantially outperforms +EOS, for example extrapolating well to lengths 10 times longer than those seen at training time in a bracket closing task, as well as achieving a 40% improvement over +EOS in the difficult SCAN dataset length generalization task. By comparing the hidden states and dynamics of -EOS and +EOS models, we observe that +EOS models fail to generalize because they (1) unnecessarily stratify their hidden states by their linear position in a sequence (structures we call length manifolds) or (2) get stuck in clusters (which we refer to as length attractors) once the EOS token is the highest-probability prediction., Comment: 16 pages, 7 Figures, 9 Tables, Blackbox NLP Workshop at EMNLP 2020
Published: 2020

49. High-energy gamma-ray study of the dynamically young SNR G150.3+4.5

Author: Devin, Justine, Lemoine-Goumard, Marianne, Grondin, Marie-Hélène, Castro, Daniel, Ballet, Jean, Cohen, Jamie, and Hewitt, John W.
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: The supernova remnant (SNR) G150.3+4.5 was recently discovered in the radio band; it exhibits a shell-like morphology with an angular size of $\sim 3^{\circ}$, suggesting either an old or a nearby SNR. Extended $\gamma$-ray emission spatially coincident with the SNR was reported in the Fermi Galactic Extended Source Catalog, with a power-law spectral index of $\Gamma$ = 1.91 $\pm$ 0.09. Studying particle acceleration in SNRs through their $\gamma$-ray emission is of primary concern to assess the nature of accelerated particles and the maximum energy they can reach. Using more than ten years of Fermi-LAT data, we investigate the morphological and spectral properties of the SNR G150.3+4.5 from 300 MeV to 3 TeV. We use the latest releases of the Fermi-LAT catalog, the instrument response functions and the Galactic and isotropic diffuse emissions. We use ROSAT all-sky survey data to assess any thermal and nonthermal X-ray emission, and we derive minimum and maximum distance to G150.3+4.5. We describe the $\gamma$-ray emission of G150.3+4.5 by an extended component which is found to be spatially coincident with the radio SNR. The spectrum is hard and the detection of photons up to hundreds of GeV points towards an emission from a dynamically young SNR. The lack of X-ray emission gives a tight constraint on the ambient density $n_0 \leq 3.6 \times 10^{-3}$ cm$^{-3}$. Since G150.3+4.5 is not reported as a historical SNR, we impose a lower limit on its age of $t$ = 1 kyr. We estimate its distance to be between 0.7 and 4.5 kpc. We find that G150.3+4.5 is spectrally similar to other dynamically young and shell-type SNRs, such as RX J1713.7$-$3946 or Vela Junior. The broadband nonthermal emission is explained with a leptonic scenario, implying a downstream magnetic field of $B = 5$ $\mu$G and acceleration of particles up to few TeV energies.
Published: 2020
Full Text: View/download PDF

50. Finding Universal Grammatical Relations in Multilingual BERT

Author: Chi, Ethan A., Hewitt, John, and Manning, Christopher D.
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, I.2.7
Abstract: Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the multilingual setting. We show that subspaces of mBERT representations recover syntactic tree distances in languages other than English, and that these subspaces are approximately shared across languages. Motivated by these results, we present an unsupervised analysis method that provides evidence mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. This evidence suggests that even without explicit supervision, multilingual masked language models learn certain linguistic universals., Comment: To appear in ACL 2020; Farsi typo corrected
Published: 2020
