44 results for "Kartsaklis, Dimitri"
Search Results
2. lambeq: An Efficient High-Level Python Library for Quantum NLP
- Author
-
Kartsaklis, Dimitri, Fan, Ian, Yeung, Richie, Pearson, Anna, Lorenz, Robin, Toumi, Alexis, de Felice, Giovanni, Meichanetzidis, Konstantinos, Clark, Stephen, and Coecke, Bob
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Quantum Physics
- Abstract
We present lambeq, the first high-level Python library for Quantum Natural Language Processing (QNLP). The open-source toolkit offers a detailed hierarchy of modules and classes implementing all stages of a pipeline for converting sentences to string diagrams, tensor networks, and quantum circuits ready to be used on a quantum computer. lambeq supports syntactic parsing, rewriting and simplification of string diagrams, ansatz creation and manipulation, as well as a number of compositional models for preparing quantum-friendly representations of sentences, employing various degrees of syntax sensitivity. We present the generic architecture and describe the most important modules in detail, demonstrating the usage with illustrative examples. Further, we test the toolkit in practice by using it to perform a number of experiments on simple NLP tasks, implementing both classical and quantum pipelines.
- Published
- 2021
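The pipeline described above begins with a grammatical analysis in which a sentence's word types must reduce to the sentence type. A minimal pure-Python sketch of such a pregroup reduction check (illustrative only; this is not the lambeq API, and the type names are invented):

```python
# Toy pregroup reduction: cancel adjacent (x, x_r) and (x_l, x) pairs.
# lambeq automates this stage via its parser; this only shows the idea.
def reduce_types(types):
    types = list(types)
    changed = True
    while changed:
        changed = False
        for i in range(len(types) - 1):
            a, b = types[i], types[i + 1]
            if b == a + "_r" or a == b + "_l":
                del types[i:i + 2]  # the adjoint pair cancels to the empty type
                changed = True
                break
    return types

# "Alice likes Bob": n . (n_r s n_l) . n reduces to the sentence type s
print(reduce_types(["n", "n_r", "s", "n_l", "n"]))  # ['s']
```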
3. A CCG-Based Version of the DisCoCat Framework
- Author
-
Yeung, Richie and Kartsaklis, Dimitri
- Subjects
Computer Science - Computation and Language, Mathematics - Category Theory
- Abstract
While the DisCoCat model (Coecke et al., 2010) has been proved a valuable tool for studying compositional aspects of language at the level of semantics, its strong dependency on pregroup grammars poses important restrictions: first, it prevents large-scale experimentation due to the absence of a pregroup parser; and second, it limits the expressibility of the model to context-free grammars. In this paper we solve these problems by reformulating DisCoCat as a passage from Combinatory Categorial Grammar (CCG) to a category of semantics. We start by showing that standard categorial grammars can be expressed as a biclosed category, where all rules emerge as currying/uncurrying the identity; we then proceed to model permutation-inducing rules by exploiting the symmetry of the compact closed category encoding the word meaning. We provide a proof of concept for our method, converting "Alice in Wonderland" into DisCoCat form, a corpus that we make available to the community., Comment: SemSpace 2021: Semantic Spaces at the Intersection of NLP, Physics, and Cognitive Science
- Published
- 2021
4. QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer
- Author
-
Lorenz, Robin, Pearson, Anna, Meichanetzidis, Konstantinos, Kartsaklis, Dimitri, and Coecke, Bob
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantum Physics
- Abstract
Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size greater than 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke, Sadrzadeh and Clark (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train NLP models that solve simple sentence classification tasks on quantum hardware. We conduct quantum simulations that compare the syntax-sensitive model of Coecke et al. with two baselines that use less or no syntax; specifically, we implement the quantum analogues of a "bag-of-words" model, where syntax is not taken into account at all, and of a word-sequence model, where only word order is respected. We demonstrate that all models converge smoothly both in simulations and when run on quantum hardware, and that the results are the expected ones based on the nature of the tasks and the datasets used. Another important goal of this paper is to describe in a way accessible to AI and NLP researchers the main principles, process and challenges of experiments on quantum hardware. Our aim in doing this is to take the first small steps in this unexplored research territory and pave the way for practical Quantum Natural Language Processing., Comment: 38 pages
- Published
- 2021
5. Conversational Semantic Parsing for Dialog State Tracking
- Author
-
Cheng, Jianpeng, Agrawal, Devang, Alonso, Hector Martinez, Bhargava, Shruti, Driesen, Joris, Flego, Federico, Ghosh, Shaona, Kaplan, Dain, Kartsaklis, Dimitri, Li, Lin, Piraviperumal, Dhivya, Williams, Jason D, Yu, Hong, Seaghdha, Diarmuid O, and Johannsen, Anders
- Subjects
Computer Science - Computation and Language
- Abstract
We consider a new perspective on dialog state tracking (DST), the task of estimating a user's goal through the course of a dialog. By formulating DST as a semantic parsing task over hierarchical representations, we can incorporate semantic compositionality, cross-domain knowledge sharing and co-reference. We present TreeDST, a dataset of 27k conversations annotated with tree-structured dialog states and system acts. We describe an encoder-decoder framework for DST with hierarchical representations, which leads to 20% improvement over state-of-the-art DST approaches that operate on a flat meaning space of slot-value pairs., Comment: Published as a conference paper at EMNLP 2020
- Published
- 2020
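The gap between hierarchical and flat meaning spaces that motivates the paper can be illustrated with a toy state (the schema and slot names below are invented, not the TreeDST annotation format):

```python
# A hypothetical tree-structured dialog state, versus the flat
# slot-value view that conventional DST models operate on.
tree_state = {
    "intent": "book_flight",
    "flight": {
        "origin": {"city": "London"},
        "destination": {"city": "Boston"},
    },
}

def flatten(node, prefix=""):
    """Flatten the tree into the slot-value pairs a flat DST model sees."""
    pairs = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.update(flatten(value, path))
        else:
            pairs[path] = value
    return pairs

print(flatten(tree_state))
# {'intent': 'book_flight', 'flight.origin.city': 'London', 'flight.destination.city': 'Boston'}
```

The tree form makes sharing substructure (e.g. a `city` subtree) and co-reference natural, which the flat form cannot express.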
6. Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces
- Author
-
Prokhorov, Victor, Pilehvar, Mohammad Taher, Kartsaklis, Dimitri, Lio, Pietro, and Collier, Nigel
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Word embedding techniques heavily rely on the abundance of training data for individual words. Given the Zipfian distribution of words in natural language texts, a large number of words do not usually appear frequently or at all in the training data. In this paper we put forward a technique that exploits the knowledge encoded in lexical resources, such as WordNet, to induce embeddings for unseen words. Our approach adapts graph embedding and cross-lingual vector space transformation techniques in order to merge lexical knowledge encoded in ontologies with that derived from corpus statistics. We show that the approach can provide consistent performance improvements across multiple evaluation benchmarks: in-vitro, on multiple rare word similarity datasets, and in-vivo, in two downstream text classification tasks., Comment: Accepted for presentation at AAAI 2019
- Published
- 2018
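The cross-lingual-style vector space transformation the authors adapt can be sketched in miniature: fit a linear map on anchor words present in both spaces, then project an unseen word across. All vectors below are invented toy data, and the exact solve stands in for a general least-squares fit.

```python
# Minimal sketch (pure Python, hypothetical data): learn a linear map W
# from a "graph-embedding" space to a "corpus" space on shared anchor
# words, then apply it to a word seen only in the graph space.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

X = [[1.0, 0.0], [0.0, 2.0]]   # anchor vectors in the graph space
Y = [[2.0, 1.0], [0.0, 4.0]]   # the same words in the corpus space
W = matmul(inverse2x2(X), Y)   # exact fit here; least squares in general

unseen = [[3.0, 1.0]]          # unseen word, known only in graph space
print(matmul(unseen, W))       # [[6.0, 5.0]] -- its induced corpus vector
```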
7. Proceedings of the 2018 Workshop on Compositional Approaches in Physics, NLP, and Social Sciences
- Author
-
Lewis, Martha, Coecke, Bob, Hedges, Jules, Kartsaklis, Dimitri, and Marsden, Dan
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory
- Abstract
The ability to compose parts to form a more complex whole, and to analyze a whole as a combination of elements, is desirable across disciplines. This workshop brings together researchers applying compositional approaches to physics, NLP, cognitive science, and game theory. Within NLP, a long-standing aim is to represent how words can combine to form phrases and sentences. Within the framework of distributional semantics, words are represented as vectors in vector spaces. The categorical model of Coecke et al. [2010], inspired by quantum protocols, has provided a convincing account of compositionality in vector space models of NLP. There is furthermore a history of vector space models in cognitive science. Theories of categorization such as those developed by Nosofsky [1986] and Smith et al. [1988] utilise notions of distance between feature vectors. More recently Gärdenfors [2004, 2014] has developed a model of concepts in which conceptual spaces provide geometric structures, and information is represented by points, vectors and regions in vector spaces. The same compositional approach has been applied to this formalism, giving conceptual spaces theory a richer model of compositionality than previously [Bolt et al., 2018]. Compositional approaches have also been applied in the study of strategic games and Nash equilibria. In contrast to classical game theory, where games are studied monolithically as one global object, compositional game theory works bottom-up by building large and complex games from smaller components. Such an approach is inherently difficult since the interaction between games has to be considered. Research into categorical compositional methods for this field has recently begun [Ghani et al., 2018]. Moreover, the interaction between the three disciplines of cognitive science, linguistics and game theory is a fertile ground for research. Game theory in cognitive science is a well-established area [Camerer, 2011]. Similarly, game theoretic approaches have been applied in linguistics [Jäger, 2008]. Lastly, the study of linguistics and cognitive science is intimately intertwined [Smolensky and Legendre, 2006, Jackendoff, 2007]. Physics supplies compositional approaches via vector spaces and categorical quantum theory, allowing the interplay between the three disciplines to be examined.
- Published
- 2018
8. Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models
- Author
-
Pilehvar, Mohammad Taher, Kartsaklis, Dimitri, Prokhorov, Victor, and Collier, Nigel
- Subjects
Computer Science - Computation and Language
- Abstract
Rare word representation has recently enjoyed a surge of interest, owing to the crucial role that effective handling of infrequent words can play in accurate semantic understanding. However, there is a paucity of reliable benchmarks for evaluation and comparison of these techniques. We show in this paper that the only existing benchmark (the Stanford Rare Word dataset) suffers from low-confidence annotations and limited vocabulary; hence, it does not constitute a solid comparison framework. In order to fill this evaluation gap, we propose CAmbridge Rare word Dataset (Card-660), an expert-annotated word similarity dataset which provides a highly reliable, yet challenging, benchmark for rare word representation techniques. Through a set of experiments we show that even the best mainstream word embeddings, with millions of words in their vocabularies, are unable to achieve performances higher than 0.43 (Pearson correlation) on the dataset, compared to a human-level upper bound of 0.90. We release the dataset and the annotation materials at https://pilehvar.github.io/card-660/., Comment: EMNLP 2018
- Published
- 2018
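Evaluation on such similarity datasets is typically a Pearson correlation between model scores and gold annotations, as in the 0.43 vs 0.90 figures quoted above. A self-contained sketch (the score lists are invented, not Card-660 data):

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

gold = [0.90, 0.10, 0.55, 0.30]   # hypothetical annotator similarities
model = [0.80, 0.25, 0.60, 0.20]  # hypothetical embedding similarities
print(round(pearson(gold, model), 2))
```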
9. Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs
- Author
-
Kartsaklis, Dimitri, Pilehvar, Mohammad Taher, and Collier, Nigel
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
This paper addresses the problem of mapping natural language text to knowledge base entities. The mapping process is approached as a composition of a phrase or a sentence into a point in a multi-dimensional entity space obtained from a knowledge graph. The compositional model is an LSTM equipped with a dynamic disambiguation mechanism on the input word embeddings (a Multi-Sense LSTM), addressing polysemy issues. Further, the knowledge base space is prepared by collecting random walks from a graph enhanced with textual features, which act as a set of semantic bridges between text and knowledge base entities. The ideas of this work are demonstrated on large-scale text-to-entity mapping and entity classification tasks, with state of the art results., Comment: Accepted for presentation at EMNLP 2018 (main conference)
- Published
- 2018
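The random-walk collection step that prepares the entity space can be sketched as follows (a toy graph with invented node names; the paper works over a large knowledge graph enriched with textual features):

```python
import random

def random_walk(graph, start, length, rng):
    """Collect one walk of `length` nodes by hopping to random neighbours."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

# Toy knowledge graph as adjacency lists (node names are invented).
graph = {
    "cat": ["mammal", "pet"],
    "pet": ["cat", "dog"],
    "dog": ["mammal", "pet"],
    "mammal": ["cat", "dog", "animal"],
    "animal": ["mammal"],
}
rng = random.Random(0)
for _ in range(3):
    print(" -> ".join(random_walk(graph, "cat", 5, rng)))
```

The collected walks serve as pseudo-sentences over which ordinary embedding training can run.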
10. Learning Rare Word Representations using Semantic Bridging
- Author
-
Prokhorov, Victor, Pilehvar, Mohammad Taher, Kartsaklis, Dimitri, Lió, Pietro, and Collier, Nigel
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
We propose a methodology that adapts graph embedding techniques (DeepWalk (Perozzi et al., 2014) and node2vec (Grover and Leskovec, 2016)) as well as cross-lingual vector space mapping approaches (Least Squares and Canonical Correlation Analysis) in order to merge the corpus and ontological sources of lexical knowledge. We also perform comparative analysis of the algorithms used in order to identify the best combination for the proposed system. We then apply this to the task of enhancing the coverage of an existing word embedding's vocabulary with rare and unseen words. We show that our technique can provide considerable extra coverage (over 99%), leading to consistent performance gain (around 10% absolute gain is achieved with w2v-gn-500K, cf. §3.3) on the Rare Word Similarity dataset.
- Published
- 2017
11. Distributional Inclusion Hypothesis for Tensor-based Composition
- Author
-
Kartsaklis, Dimitri and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
According to the distributional inclusion hypothesis, entailment between words can be measured via the feature inclusions of their distributional vectors. In recent work, we showed how this hypothesis can be extended from words to phrases and sentences in the setting of compositional distributional semantics. This paper focuses on inclusion properties of tensors; its main contribution is a theoretical and experimental analysis of how feature inclusion works in different concrete models of verb tensors. We present results for relational, Frobenius, projective, and holistic methods and compare them to the simple vector addition, multiplication, min, and max models. The degrees of entailment thus obtained are evaluated via a variety of existing word-based measures, such as Weeds' and Clarke's, KL-divergence, APinc, balAPinc, and two of our previously proposed metrics at the phrase/sentence level. We perform experiments on three entailment datasets, investigating which version of tensor-based composition achieves the highest performance when combined with the sentence-level measures., Comment: To appear in COLING 2016
- Published
- 2016
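The core intuition, feature inclusion as a proxy for entailment, can be sketched with toy count vectors. This is a deliberately simplified stand-in for measures such as APinc and balAPinc, and the counts are invented:

```python
# Degree to which the features of vector u are included in vector v.
# Full inclusion (1.0) suggests u's word entails v's word.
def inclusion_degree(u, v):
    included = sum(min(a, b) for a, b in zip(u, v))
    total = sum(u)
    return included / total if total else 0.0

dog    = [3, 2, 0, 1]   # toy co-occurrence counts
animal = [5, 4, 2, 3]
print(inclusion_degree(dog, animal))  # 1.0: every 'dog' feature is covered
print(inclusion_degree(animal, dog))  # ~0.43: entailment is asymmetric
```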
12. Coordination in Categorical Compositional Distributional Semantics
- Author
-
Kartsaklis, Dimitri
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory
- Abstract
An open problem with categorical compositional distributional semantics is the representation of words that are considered semantically vacuous from a distributional perspective, such as determiners, prepositions, relative pronouns or coordinators. This paper deals with the topic of coordination between identical syntactic types, which accounts for the majority of coordination cases in language. By exploiting the compact closed structure of the underlying category and Frobenius operators canonically induced over the fixed basis of finite-dimensional vector spaces, we provide a morphism as representation of a coordinator tensor, and we show how it lifts from atomic types to compound types. Linguistic intuitions are provided, and the importance of the Frobenius operators as an addition to the compact closed setting with regard to language is discussed., Comment: In Proceedings SLPCS 2016, arXiv:1608.01018
- Published
- 2016
13. Sentence Entailment in Compositional Distributional Semantics
- Author
-
Balkir, Esma, Kartsaklis, Dimitri, and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory, 03B65, I.2.7
- Abstract
Distributional semantic models provide vector representations for words by gathering co-occurrence frequencies from corpora of text. Compositional distributional models extend these from words to phrases and sentences. In categorical compositional distributional semantics, phrase and sentence representations are functions of their grammatical structure and representations of the words therein. In this setting, grammatical structures are formalised by morphisms of a compact closed category and meanings of words are formalised by objects of the same category. These can be instantiated in the form of vectors or density matrices. This paper concerns the applications of this model to phrase and sentence level entailment. We argue that entropy-based distances of vectors and density matrices provide a good candidate to measure word-level entailment, show the advantage of density matrices over vectors for word level entailments, and prove that these distances extend compositionally from words to phrases and sentences. We exemplify our theoretical constructions on real data and a toy entailment dataset and provide preliminary experimental evidence., Comment: 8 pages, 1 figure, 2 tables, short version presented in the International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2016
- Published
- 2015
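The entropy-based, asymmetric distances the paper argues for can be sketched for plain vectors with a smoothed KL divergence (toy vectors; the paper's full treatment also covers density matrices):

```python
import math

def normalise(v):
    s = sum(v)
    return [x / s for x in v]

def kl(p, q, eps=1e-9):
    """Smoothed KL divergence D(p || q): asymmetric, zero iff p == q."""
    p, q = normalise(p), normalise(q)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

narrow = [4, 0, 1]  # toy vector for a specific word
broad  = [3, 2, 2]  # toy vector for a more general word
print(kl(narrow, broad) < kl(broad, narrow))  # True: the distance is directed
```

The asymmetry is the point: a specific word's features sit inside the general word's, but not conversely, which a symmetric distance cannot capture.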
14. Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning
- Author
-
Cheng, Jianpeng and Kartsaklis, Dimitri
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing
- Abstract
Deep compositional models of meaning acting on distributional representations of words in order to produce vectors of larger text constituents are evolving into a popular area of NLP research. We detail a compositional distributional framework based on a rich form of word embeddings that aims at facilitating the interactions between words in the context of a sentence. Embeddings and composition layers are jointly learned against a generic objective that enhances the vectors with syntactic information from the surrounding context. Furthermore, each word is associated with a number of senses, the most plausible of which is selected dynamically during the composition process. We evaluate the produced vectors qualitatively and quantitatively with positive results. At the sentence level, the effectiveness of the framework is demonstrated on the MSRPar task, for which we report results within the state-of-the-art range., Comment: Accepted for presentation at EMNLP 2015
- Published
- 2015
15. A Frobenius Model of Information Structure in Categorical Compositional Distributional Semantics
- Author
-
Kartsaklis, Dimitri and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory, Mathematics - Rings and Algebras
- Abstract
The categorical compositional distributional model of Coecke, Sadrzadeh and Clark provides a linguistically motivated procedure for computing the meaning of a sentence as a function of the distributional meaning of the words therein. The theoretical framework allows for reasoning about compositional aspects of language and offers structural ways of studying the underlying relationships. While the model so far has been applied on the level of syntactic structures, a sentence can bring extra information conveyed in utterances via intonational means. In the current paper we extend the framework in order to accommodate this additional information, using Frobenius algebraic structures canonically induced over the basis of finite-dimensional vector spaces. We detail the theory, provide truth-theoretic and distributional semantics for meanings of intonationally-marked utterances, and present justifications and extensive examples., Comment: Accepted for presentation in the 14th Meeting on Mathematics of Language (2015)
- Published
- 2015
16. Compositional Distributional Semantics with Compact Closed Categories and Frobenius Algebras
- Author
-
Kartsaklis, Dimitri
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory, Mathematics - Quantum Algebra, Quantum Physics
- Abstract
This thesis contributes to ongoing research related to the categorical compositional model for natural language of Coecke, Sadrzadeh and Clark in three ways: Firstly, I propose a concrete instantiation of the abstract framework based on Frobenius algebras (joint work with Sadrzadeh). The theory addresses shortcomings of previous proposals, extends the coverage of the language, and is supported by experimental work that improves existing results. The proposed framework describes a new class of compositional models that find intuitive interpretations for a number of linguistic phenomena. Secondly, I propose and evaluate in practice a new compositional methodology which explicitly deals with the different levels of lexical ambiguity (joint work with Pulman). A concrete algorithm is presented, based on the separation of vector disambiguation from composition in an explicit prior step. Extensive experimental work shows that the proposed methodology indeed results in more accurate composite representations for the framework of Coecke et al. in particular and every other class of compositional models in general. As a last contribution, I formalize the explicit treatment of lexical ambiguity in the context of the categorical framework by resorting to categorical quantum mechanics (joint work with Coecke). In the proposed extension, the concept of a distributional vector is replaced with that of a density matrix, which compactly represents a probability distribution over the potential different meanings of the specific word. Composition takes the form of quantum measurements, leading to interesting analogies between quantum physics and linguistics., Comment: Ph.D. Dissertation, University of Oxford
- Published
- 2015
17. Open System Categorical Quantum Semantics in Natural Language Processing
- Author
-
Piedeleu, Robin, Kartsaklis, Dimitri, Coecke, Bob, and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language, Computer Science - Logic in Computer Science, Mathematics - Category Theory, Mathematics - Quantum Algebra
- Abstract
Originally inspired by categorical quantum mechanics (Abramsky and Coecke, LiCS'04), the categorical compositional distributional model of natural language meaning of Coecke, Sadrzadeh and Clark provides a conceptually motivated procedure to compute the meaning of a sentence, given its grammatical structure within a Lambek pregroup and a vectorial representation of the meaning of its parts. The predictions of this first model have outperformed that of other models in mainstream empirical language processing tasks on large scale data. Moreover, just like CQM allows for varying the model in which we interpret quantum axioms, one can also vary the model in which we interpret word meaning. In this paper we show that further developments in categorical quantum mechanics are relevant to natural language processing too. Firstly, Selinger's CPM-construction allows for explicitly taking into account lexical ambiguity and distinguishing between the two inherently different notions of homonymy and polysemy. In terms of the model in which we interpret word meaning, this means a passage from the vector space model to density matrices. Despite this change of model, standard empirical methods for comparing meanings can be easily adopted, which we demonstrate by a small-scale experiment on real-world data. This experiment moreover provides preliminary evidence of the validity of our proposed new model for word meaning. Secondly, commutative classical structures as well as their non-commutative counterparts that arise in the image of the CPM-construction allow for encoding relative pronouns, verbs and adjectives, and finally, iteration of the CPM-construction, something that has no counterpart in the quantum realm, enables one to accommodate both entailment and ambiguity.
- Published
- 2015
18. Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning
- Author
-
Cheng, Jianpeng, Kartsaklis, Dimitri, and Grefenstette, Edward
- Subjects
Computer Science - Computation and Language, Computer Science - Learning, Computer Science - Neural and Evolutionary Computing
- Abstract
This paper aims to explore the effect of prior disambiguation on neural network-based compositional models, with the hope that better semantic representations for text compounds can be produced. We disambiguate the input word vectors before they are fed into a compositional deep net. A series of evaluations shows the positive effect of prior disambiguation for such deep models., Comment: NIPS 2014
- Published
- 2014
19. Resolving Lexical Ambiguity in Tensor Regression Models of Meaning
- Author
-
Kartsaklis, Dimitri, Kalchbrenner, Nal, and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language
- Abstract
This paper provides a method for improving tensor-based compositional distributional models of meaning by the addition of an explicit disambiguation step prior to composition. In contrast with previous research where this hypothesis has been successfully tested against relatively simple compositional models, in our work we use a robust model trained with linear regression. The results we get in two experiments show the superiority of the prior disambiguation method and suggest that the effectiveness of this approach is model-independent.
- Published
- 2014
20. Evaluating Neural Word Representations in Tensor-Based Compositional Settings
- Author
-
Milajevs, Dmitrijs, Kartsaklis, Dimitri, Sadrzadeh, Mehrnoosh, and Purver, Matthew
- Subjects
Computer Science - Computation and Language
- Abstract
We provide a comparative study between neural word representations and traditional vector spaces based on co-occurrence counts, in a number of compositional tasks. We use three different semantic spaces and implement seven tensor-based compositional models, which we then test (together with simpler additive and multiplicative approaches) in tasks involving verb disambiguation and sentence similarity. To check their scalability, we additionally evaluate the spaces using simple compositional methods on larger-scale tasks with less constrained language: paraphrase detection and dialogue act tagging. In the more constrained tasks, co-occurrence vectors are competitive, although choice of compositional method is important; on the larger-scale tasks, they are outperformed by neural word embeddings, which show robust, stable performance across the tasks., Comment: To be published in EMNLP 2014
- Published
- 2014
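The simple additive and multiplicative baselines mentioned above compose word vectors elementwise; a minimal sketch with invented toy vectors:

```python
def add_comp(u, v):
    """Additive composition: ignores which word contributed what."""
    return [a + b for a, b in zip(u, v)]

def mult_comp(u, v):
    """Multiplicative composition: keeps only shared features."""
    return [a * b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sum(x * x for x in w) ** 0.5
    return dot / (norm(u) * norm(v))

red, car = [2.0, 1.0, 0.0], [1.0, 3.0, 1.0]  # toy word vectors
print(cosine(add_comp(red, car), car))       # similarity of "red car" to "car"
print(cosine(mult_comp(red, car), car))
```

The tensor-based models the paper implements replace these elementwise operations with structured multi-linear maps learned per functional word.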
21. A Study of Entanglement in a Categorical Framework of Natural Language
- Author
-
Kartsaklis, Dimitri and Sadrzadeh, Mehrnoosh
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory, Quantum Physics
- Abstract
In both quantum mechanics and corpus linguistics based on vector spaces, the notion of entanglement provides a means for the various subsystems to communicate with each other. In this paper we examine a number of implementations of the categorical framework of Coecke, Sadrzadeh and Clark (2010) for natural language, from an entanglement perspective. Specifically, our goal is to better understand in what way the level of entanglement of the relational tensors (or the lack of it) affects the compositional structures in practical situations. Our findings reveal that a number of proposals for verb construction lead to almost separable tensors, a fact that considerably simplifies the interactions between the words. We examine the ramifications of this fact, and we show that the use of Frobenius algebras mitigates the potential problems to a great extent. Finally, we briefly examine a machine learning method that creates verb tensors exhibiting a sufficient level of entanglement., Comment: In Proceedings QPL 2014, arXiv:1412.8102
- Published
- 2014
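For the smallest case, a 2x2 "verb matrix", the separability the paper probes has a one-line test: a matrix is a rank-1 (separable) tensor exactly when its determinant vanishes. A toy sketch (the matrices are invented for illustration):

```python
def is_separable_2x2(M, tol=1e-9):
    """A 2x2 matrix is an outer product u (x) v iff its determinant is 0."""
    (a, b), (c, d) = M
    return abs(a * d - b * c) < tol

outer = [[1, 2], [2, 4]]   # outer product of (1, 2) with (1, 2): separable
mixed = [[1, 0], [0, 1]]   # not expressible as u (x) v: "entangled"
print(is_separable_2x2(outer), is_separable_2x2(mixed))  # True False
```

A nearly separable verb tensor factors into independent subject and object parts, which is exactly the simplification of word interactions the paper observes and analyses.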
22. Reasoning about Meaning in Natural Language with Compact Closed Categories and Frobenius Algebras
- Author
-
Kartsaklis, Dimitri, Sadrzadeh, Mehrnoosh, Pulman, Stephen, and Coecke, Bob
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory
- Abstract
Compact closed categories have found applications in modeling quantum information protocols by Abramsky-Coecke. They also provide semantics for Lambek's pregroup algebras, applied to formalizing the grammatical structure of natural language, and are implicit in a distributional model of word meaning based on vector spaces. Specifically, in previous work Coecke-Clark-Sadrzadeh used the product category of pregroups with vector spaces and provided a distributional model of meaning for sentences. We recast this theory in terms of strongly monoidal functors and advance it via Frobenius algebras over vector spaces. The former are used to formalize topological quantum field theories by Atiyah and Baez-Dolan, and the latter are used to model classical data in quantum protocols by Coecke-Pavlovic-Vicary. The Frobenius algebras enable us to work in a single space in which meanings of words, phrases, and sentences of any structure live. Hence we can compare meanings of different language constructs and enhance the applicability of the theory. We report on experimental results on a number of language tasks and verify the theoretical predictions.
- Published
- 2014
23. Compositional Operators in Distributional Semantics
- Author
-
Kartsaklis, Dimitri
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Mathematics - Category Theory
- Abstract
This survey presents in some detail the main advances that have been recently taking place in Computational Linguistics towards the unification of the two prominent semantic paradigms: the compositional formal semantics view and the distributional models of meaning based on vector spaces. After an introduction to these two approaches, I review the most important models that aim to provide compositionality in distributional semantics. Then I proceed and present in more detail a particular framework by Coecke, Sadrzadeh and Clark (2010) based on the abstract mathematical setting of category theory, as a more complete example capable to demonstrate the diversity of techniques and scientific disciplines that this kind of research can draw from. This paper concludes with a discussion about important open issues that need to be addressed by the researchers in the future.
- Published
- 2014
24. Non-commutative Logic for Compositional Distributional Semantics
- Author
-
Cvetko-Vah, Karin, Sadrzadeh, Mehrnoosh, Kartsaklis, Dimitri, and Blundell, Benjamin; edited by Kennedy, Juliette and de Queiroz, Ruy J.G.B.
- Published
- 2017
25. A Compositional Distributional Inclusion Hypothesis
- Author
-
Kartsaklis, Dimitri and Sadrzadeh, Mehrnoosh; edited by Amblard, Maxime, de Groote, Philippe, Pogodalla, Sylvain, and Retoré, Christian
- Published
- 2016
- Full Text
- View/download PDF
26. Sentence entailment in compositional distributional semantics
- Author
-
Sadrzadeh, Mehrnoosh, Kartsaklis, Dimitri, and Balkır, Esma
- Published
- 2018
- Full Text
- View/download PDF
27. QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer
- Author
-
Lorenz, Robin, primary, Pearson, Anna, additional, Meichanetzidis, Konstantinos, additional, Kartsaklis, Dimitri, additional, and Coecke, Bob, additional
- Published
- 2023
- Full Text
- View/download PDF
28. Peptide Binding Classification on Quantum Computers
- Author
-
London, Charles, Brown, Douglas, Xu, Wenduan, Vatansever, Sezen, Langmead, Christopher James, Kartsaklis, Dimitri, Clark, Stephen, and Meichanetzidis, Konstantinos
- Abstract
We conduct an extensive study on using near-term quantum computers for a task in the domain of computational biology. By constructing quantum models based on parameterised quantum circuits we perform sequence classification on a task relevant to the design of therapeutic proteins, and find competitive performance with classical baselines of similar scale. To study the effect of noise, we run some of the best-performing quantum models with favourable resource requirements on emulators of state-of-the-art noisy quantum processors. We then apply error mitigation methods to improve the signal. We further execute these quantum models on the Quantinuum H1-1 trapped-ion quantum processor and observe very close agreement with noiseless exact simulation. Finally, we apply feature attribution methods and find that the quantum models indeed identify sensible relationships, at least as well as the classical baselines. This work constitutes the first proof-of-concept application of near-term quantum computing to a task critical to the design of therapeutic proteins, opening the route toward larger-scale applications in this and related fields, in line with the hardware development roadmaps of near-term quantum technologies.
- Published
- 2023
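The parameterised-quantum-circuit classifiers used in the study above can be sketched in miniature. The one-qubit circuit below, simulated with plain numpy, is a hypothetical toy model, not the paper's ansatz (which targeted real hardware such as the Quantinuum H1-1): trainable rotation angles parameterise the circuit, and the probability of measuring |1⟩ serves as the positive-class score.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def classify(thetas):
    """A one-qubit circuit with a layer of RY rotations per parameter.
    Returns P(measure |1>), used as the probability of the positive class."""
    state = np.array([1.0, 0.0])        # start in |0>
    for t in thetas:
        state = ry(t) @ state
    return float(np.abs(state[1]) ** 2)

# RY rotations compose: RY(a)RY(b)|0> gives P(1) = sin^2((a + b) / 2).
p = classify([np.pi / 2, np.pi / 2])    # total angle pi, so P(1) = 1
print(p)
```

Training such a model means adjusting the angles to minimise a classification loss, exactly as one would tune weights in a classical model; the quantum hardware (or emulator) supplies the forward pass.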
29. Non-commutative Logic for Compositional Distributional Semantics
- Author
-
Cvetko-Vah, Karin, primary, Sadrzadeh, Mehrnoosh, additional, Kartsaklis, Dimitri, additional, and Blundell, Benjamin, additional
- Published
- 2017
- Full Text
- View/download PDF
30. A Compositional Distributional Inclusion Hypothesis
- Author
-
Kartsaklis, Dimitri, primary and Sadrzadeh, Mehrnoosh, additional
- Published
- 2016
- Full Text
- View/download PDF
31. Conversational Semantic Parsing for Dialog State Tracking
- Author
-
Cheng, Jianpeng, primary, Agrawal, Devang, additional, Martínez Alonso, Héctor, additional, Bhargava, Shruti, additional, Driesen, Joris, additional, Flego, Federico, additional, Kaplan, Dain, additional, Kartsaklis, Dimitri, additional, Li, Lin, additional, Piraviperumal, Dhivya, additional, Williams, Jason D., additional, Yu, Hong, additional, Ó Séaghdha, Diarmuid, additional, and Johannsen, Anders, additional
- Published
- 2020
- Full Text
- View/download PDF
32. Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces
- Author
-
Prokhorov, Victor, primary, Pilehvar, Mohammad Taher, additional, Kartsaklis, Dimitri, additional, Lio, Pietro, additional, and Collier, Nigel, additional
- Published
- 2019
- Full Text
- View/download PDF
33. Proceedings of the 2018 Workshop on Compositional Approaches in Physics, NLP, and Social Sciences
- Author
-
Lewis, Martha, primary, Coecke, Bob, additional, Hedges, Jules, additional, Kartsaklis, Dimitri, additional, and Marsden, Dan, additional
- Published
- 2018
- Full Text
- View/download PDF
34. Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs
- Author
-
Kartsaklis, Dimitri, primary, Pilehvar, Mohammad Taher, additional, and Collier, Nigel, additional
- Published
- 2018
- Full Text
- View/download PDF
35. Compositional distributional semantics with compact closed categories and Frobenius algebras
- Author
-
Kartsaklis, Dimitri, Sadrzadeh, M, Coecke, B, and Pulman, S
- Subjects
FOS: Computer and information sciences ,Quantum Physics ,Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computing ,FOS: Physical sciences ,Mathematics - Category Theory ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,Physics and CS ,Computer science (mathematics) ,Computational Linguistics ,Artificial Intelligence (cs.AI) ,Quantum theory (mathematics) ,Artificial Intelligence ,Mathematics - Quantum Algebra ,FOS: Mathematics ,Quantum information processing ,Quantum Algebra (math.QA) ,Category Theory (math.CT) ,Quantum Physics (quant-ph) ,Computation and Language (cs.CL) ,Natural Language Processing - Abstract
This thesis contributes to ongoing research related to the categorical compositional model for natural language of Coecke, Sadrzadeh and Clark in three ways: Firstly, I propose a concrete instantiation of the abstract framework based on Frobenius algebras (joint work with Sadrzadeh). The theory addresses shortcomings of previous proposals, extends the coverage of the language, and is supported by experimental work that improves existing results. The proposed framework describes a new class of compositional models that find intuitive interpretations for a number of linguistic phenomena. Secondly, I propose and evaluate in practice a new compositional methodology which explicitly deals with the different levels of lexical ambiguity (joint work with Pulman). A concrete algorithm is presented, based on the separation of vector disambiguation from composition in an explicit prior step. Extensive experimental work shows that the proposed methodology indeed results in more accurate composite representations, for the framework of Coecke et al. in particular and for compositional models in general. As a last contribution, I formalize the explicit treatment of lexical ambiguity in the context of the categorical framework by resorting to categorical quantum mechanics (joint work with Coecke). In the proposed extension, the concept of a distributional vector is replaced with that of a density matrix, which compactly represents a probability distribution over the different potential meanings of a word. Composition takes the form of quantum measurements, leading to interesting analogies between quantum physics and linguistics., Comment: Ph.D. Dissertation, University of Oxford
- Published
- 2016
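The thesis's density-matrix treatment of ambiguity admits a compact sketch. The sense vectors and probabilities below are hypothetical illustrations: an ambiguous word such as "bank" becomes a probability-weighted mixture of projectors onto its sense vectors, and the matrix's purity reflects how ambiguous the word is.

```python
import numpy as np

# Hypothetical orthogonal unit vectors for two senses of "bank".
finance = np.array([1.0, 0.0])
river = np.array([0.0, 1.0])

def density(senses, probs):
    """Density matrix: probability-weighted mixture of sense projectors."""
    return sum(p * np.outer(v, v) for v, p in zip(senses, probs))

rho = density([finance, river], [0.5, 0.5])  # maximally mixed: fully ambiguous
purity = float(np.trace(rho @ rho))          # equals 1.0 only for a pure state
print(rho, purity)
```

A word with a single sense yields a rank-one projector with purity 1; the equal-weight mixture above has purity 0.5, the minimum for two dimensions, encoding maximal uncertainty about which sense is intended.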
36. Coordination in Categorical Compositional Distributional Semantics
- Author
-
Kartsaklis, Dimitri, primary
- Published
- 2016
- Full Text
- View/download PDF
37. Open System Categorical Quantum Semantics in Natural Language Processing
- Author
-
Piedeleu, Robin, Kartsaklis, Dimitri, Coecke, Bob, and Sadrzadeh, Mehrnoosh
- Abstract
Originally inspired by categorical quantum mechanics (Abramsky and Coecke, LiCS'04), the categorical compositional distributional model of natural language meaning of Coecke, Sadrzadeh and Clark provides a conceptually motivated procedure to compute the meaning of a sentence, given its grammatical structure within a Lambek pregroup and a vectorial representation of the meaning of its parts. Moreover, just like CQM allows for varying the model in which we interpret quantum axioms, one can also vary the model in which we interpret word meaning. In this paper we show that further developments in categorical quantum mechanics are relevant to natural language processing too. Firstly, Selinger's CPM-construction allows for explicitly taking into account lexical ambiguity and distinguishing between the two inherently different notions of homonymy and polysemy. In terms of the model in which we interpret word meaning, this means a passage from the vector space model to density matrices. Despite this change of model, standard empirical methods for comparing meanings can be easily adopted, which we demonstrate by a small-scale experiment on real-world data. Secondly, commutative classical structures as well as their non-commutative counterparts that arise in the image of the CPM-construction allow for encoding relative pronouns, verbs and adjectives, and finally, iteration of the CPM-construction, something that has no counterpart in the quantum realm, enables one to accommodate both entailment and ambiguity.
- Published
- 2015
- Full Text
- View/download PDF
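The abstract's claim that standard empirical methods for comparing meanings carry over to density matrices can be sketched as follows. The meaning vectors are made-up examples; the trace inner product Tr(ρσ) plays the role that cosine similarity plays for vectors:

```python
import numpy as np

def rho(v):
    """Pure-state density matrix for a normalised meaning vector."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

def similarity(a, b):
    """Trace inner product Tr(ab): the density-matrix analogue of
    (squared) cosine similarity between meaning vectors."""
    return float(np.trace(a @ b))

dog = rho(np.array([1.0, 1.0]))  # hypothetical meaning vectors
cat = rho(np.array([1.0, 0.8]))
car = rho(np.array([0.0, 1.0]))
print(similarity(dog, cat) > similarity(dog, car))  # closer meanings score higher
```

For pure states this reduces to the squared cosine of the angle between the underlying vectors, which is why existing vector-based evaluation pipelines adapt to the CPM-based model with little change.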
38. Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning
- Author
-
Cheng, Jianpeng, primary and Kartsaklis, Dimitri, additional
- Published
- 2015
- Full Text
- View/download PDF
39. A Frobenius Model of Information Structure in Categorical Compositional Distributional Semantics
- Author
-
Kartsaklis, Dimitri, primary and Sadrzadeh, Mehrnoosh, additional
- Published
- 2015
- Full Text
- View/download PDF
40. A Study of Entanglement in a Categorical Framework of Natural Language
- Author
-
Kartsaklis, Dimitri, primary and Sadrzadeh, Mehrnoosh, additional
- Published
- 2014
- Full Text
- View/download PDF
41. Reasoning about meaning in natural language with compact closed categories and Frobenius algebras
- Author
-
Kartsaklis, Dimitri, primary, Sadrzadeh, Mehrnoosh, additional, Pulman, Stephen, additional, and Coecke, Bob, additional
- Full Text
- View/download PDF
42. Compositional Operators in Distributional Semantics
- Author
-
Kartsaklis, Dimitri, primary
- Published
- 2014
- Full Text
- View/download PDF
43. Evaluating Neural Word Representations in Tensor-Based Compositional Settings
- Author
-
Milajevs, Dmitrijs, primary, Kartsaklis, Dimitri, additional, Sadrzadeh, Mehrnoosh, additional, and Purver, Matthew, additional
- Published
- 2014
- Full Text
- View/download PDF
44. Resolving Lexical Ambiguity in Tensor Regression Models of Meaning
- Author
-
Kartsaklis, Dimitri, primary, Kalchbrenner, Nal, additional, and Sadrzadeh, Mehrnoosh, additional
- Published
- 2014
- Full Text
- View/download PDF