22 results on '"automatic morphological analysis"'
Search Results
2. The morpholexicon of the Uzbek language as a source for the Corpus
- Author
-
Kholiyorov, Ural Menglievich
- Published
- 2021
3. Kladenští type as a problem of automatic morphological analysis.
- Author
-
Osolsobě, Klára and Žižková, Hana
- Subjects
- PARTS of speech, ADJECTIVES (Grammar)
- Abstract
The aim of our paper is to demonstrate the procedures by which the data needed to refine tools for automatic morphological analysis of Czech can be obtained using a corpus, namely the Araneum Bohemicum IV Maximum (Czech, 20.03) 7.10 G web corpus of the ARANEA series and Araneum Bohemicum Maximum (Czech, 15.04) 3.20 G (hereinafter Araneum). In particular, we focus on propria of the Kladenští type, i.e., substantivized adjectives denoting groups of persons according to affiliation. The goal of the probe into the Aranea web corpus is: 1) a corpus-based description of frequent properties of the Kladenští type, which can be used as a starting point for rule disambiguation; 2) creating a list of the most frequent lemmas belonging to the Kladenští type, which can then be included in the dictionaries of automatic morphological analyzers (e.g. the MorfFlex dictionary by Hajič and Hlaváčová). We believe that the probe can help improve the results of tools for automatic morphological analysis of Czech. [ABSTRACT FROM AUTHOR]
- Published
- 2022
4. IMPROVING NOMINALIZED ADJECTIVES TAGGING.
- Author
-
OSOLSOBĚ, KLÁRA and ŽIŽKOVÁ, HANA
- Subjects
- PARTS of speech, DATA analysis
- Abstract
Part-of-speech transitions represent an interesting issue in terms of Automatic Morphological Analysis (AMA). In these cases, two parts of speech have to be considered: the initial one and the final one. However, their automatic recognition is complicated by their identical form. This article presents the results of a corpus study aimed at mapping the tagging of nominalized adjectives, with a focus on detecting candidates for nominalization among frequent adjectives. Analysis of the data obtained from the ČNK SYN v5 corpus shows different reasons for incorrect tagging. Taking these reasons into account, we propose three solutions for improving the tagging of nominalized adjectives. [ABSTRACT FROM AUTHOR]
- Published
- 2019
5. Nástroj na tvaroslovnou analýzu staré angličtiny : Morphological Analyser of Old English
- Author
-
Ondřej Tichý
- Subjects
Old English, historical linguistics, computational linguistics, automatic morphological analysis, morphology, forms generator, stará angličtina, historická lingvistika, komputační lingvistika, automatická morfologická analýza, morfologie, generátor tvarů, Philology. Linguistics, P1-1091
- Abstract
The paper describes the construction and testing of an electronic application for semi-automatic morphological analysis of Old English. It introduces the state of the art in the field of electronic analysis of Old English, provides a brief overview of Old English morphology and discusses the reasoning behind our theoretical framework. An account of the chosen methodology is offered and a specific description of its implementation is provided: from the acquisition and preparation of the lexical input data, through the programming of the forms generator to the testing of the results by analysing Old English text. The resulting recall of 95% is a success; however, the paper also hints at how it may be improved. It also discusses further use and development of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility.
- Published
- 2017
6. Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování) : Corpora as Data Sources for the Up-Grading of Morphological Tagging
- Author
-
Osolsobě, Klára
- Subjects
gerund/deverbal adjective, pos tagging, automatic morphological analysis, variant, derivational, morphology, verbální adjektivum, morfologické značkování, automatická morfologická analýza, Philology. Linguistics, P1-1091
- Abstract
Adjectives ending in -oucí/-ící are regularly derived from verbs and hence are not usually listed in any of the Czech monolingual dictionaries. On the level of automatic morphological analysis (the dictionary) of Czech they should be generated from verbal roots and tagged as verbal adjectives (pos tag: AG.*). The data from Czech corpora reveal a) inconsistencies in tagging and b) gaps in the dictionary. The main cause of both kinds of insufficiency is the existence of variants on the level of the verbal forms from which the verbal adjectives are potentially derived. Consequently, text corpora are a significant source of knowledge about the formation and use of adjectives with the endings -oucí/-ící that can be important for both a) automatic morphological analysis of Czech and b) theoretical description of Czech grammar (derivational morphology). Our goal is to present a corpus-based study of the Czech gerund, i.e. verbal adjectives with -oucí/-ící. The link between the inflected and the word-formation variants will be demonstrated using material from the SYN corpus (2.6 billion tokens of written Czech) and the large web corpus czTenTen12 (5.2 billion tokens of Czech text from the Internet, cleaned and deduplicated).
- Published
- 2015
7. On the Root-Based Lexicon for Polish
- Author
-
Rabiega-Wiśniewska, Joanna. Editors: Marciniak, Małgorzata and Mykowiecka, Agnieszka. Series editors: Hutchison, David; Kanade, Takeo; Kittler, Josef; Kleinberg, Jon M.; Mattern, Friedemann; Mitchell, John C.; Naor, Moni; Nierstrasz, Oscar; Pandu Rangan, C.; Steffen, Bernhard; Sudan, Madhu; Terzopoulos, Demetri; Tygar, Doug; Vardi, Moshe Y.; Weikum, Gerhard
- Published
- 2009
8. COMPOUND ADVERBS AS AN ISSUE IN MACHINE ANALYSIS OF CZECH LANGUAGE.
- Author
-
ŽIŽKOVÁ, HANA
- Subjects
- ADVERBS (Grammar), CZECH language, MORPHOLOGY (Grammar), PARTS of speech, CORPORA
- Abstract
Compound adverbs represent an interesting issue in terms of Automatic Morphological Analysis (AMA). The reason is that compound adverbs in Czech are expressions formed by compounding existing words that are different parts of speech without any change in their form. An indicative sign of compound adverbs is that they can always be decomposed again. Compound adverbs may be written as one word, but sometimes a multiword form coexists. A word that is originally a different part of speech gains an adverbial meaning and becomes an adverb. This article presents the results of a corpus probe aimed at mapping expressions that are demonstrably compound adverbs and were not recognized by AMA or were incorrectly tagged by AMA as another part of speech. Analysis of data obtained from the Czech National Corpus (ČNK) SYN v3 shows that the unrecognized and incorrectly tagged units can be divided into several groups. Based on knowledge of these groups it is possible to refine part-of-speech tagging by AMA. The corpus probe examined units written in accordance with the current codification as well as substandard units. [ABSTRACT FROM AUTHOR]
- Published
- 2017
9. Nová automatická morfologická analýza češtiny.
- Author
-
OSOLSOBĚ, KLÁRA, HLAVÁČOVÁ, JAROSLAVA, PETKEVIČ, VLADIMÍR, SVÁŠEK, MARTIN, and ŠIMANDL, JOSEF
- Abstract
A detailed morphological description of word forms in any language is one of the necessary conditions for the successful automatic processing of linguistic data. The aim of this paper is to present a project aimed at a new description of Czech morphology, especially the planned changes in the tagset. The key changes are as follows: 1) the unambiguous description of variants; 2) the concept of a multiple lemma; 3) the revision of part-of-speech definitions. [ABSTRACT FROM AUTHOR]
- Published
- 2017
10. Nástroj na tvaroslovnou analýzu staré angličtiny.
- Author
-
Tichý, Ondřej
- Abstract
The paper describes the construction and testing of an electronic application for semi-automatic morphological analysis of Old English. It introduces the state of the art in the field of electronic analysis of Old English, provides a brief overview of Old English morphology and discusses the reasoning behind our theoretical framework. An account of the chosen methodology is offered and a specific description of its implementation is provided: from the acquisition and preparation of the lexical input data, through the programming of the forms generator to the testing of the results by analysing Old English text. The resulting recall of 95% is a success; however, the paper also hints at how it may be improved. It also discusses further use and development of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility. [ABSTRACT FROM AUTHOR]
- Published
- 2017
11. Compound Adverbs as an Issue in Machine Analysis of Czech language
- Author
-
Hana Žižková
- Subjects
Czech, Linguistics and Language, automatic morphological analysis, Computer science, corpus, Part of speech, multiword expression, Language and Linguistics, Philology. Linguistics, nominal form, P1-1091, tag, compound adverb, Artificial intelligence, Natural language processing
- Abstract
Compound adverbs represent an interesting issue in terms of Automatic Morphological Analysis (AMA). The reason is that compound adverbs in Czech are expressions formed by compounding existing words that are different parts of speech without any change in their form. An indicative sign of compound adverbs is that they can always be decomposed again. Compound adverbs may be written as one word, but sometimes a multiword form coexists. A word that is originally a different part of speech gains an adverbial meaning and becomes an adverb. This article presents the results of a corpus probe aimed at mapping expressions that are demonstrably compound adverbs and were not recognized by AMA or were incorrectly tagged by AMA as another part of speech. Analysis of data obtained from the Czech National Corpus (ČNK) SYN v3 shows that the unrecognized and incorrectly tagged units can be divided into several groups. Based on knowledge of these groups it is possible to refine part-of-speech tagging by AMA. The corpus probe examined units written in accordance with the current codification as well as substandard units.
- Published
- 2017
12. ANÁLISIS MORFOLÓGICO CON HERRAMIENTAS INFORMÁTICAS. RECONOCIMIENTO DE NOMBRES EN TEXTOS DE ESPAÑOL CON EL SISTEMA NOOJ.
- Author
-
Tramallino, Carolina Paola
- Subjects
COMPUTATIONAL linguistics, SEMANTICS, NOUNS, GRAMMAR
- Published
- 2013
13. Morphological Analyser of Old English
- Author
-
Tichý, Ondřej
- Subjects
automatická morfologická analýza, computational linguistics, forms generator, historical linguistics, stará angličtina, automatic morphological analysis, historická lingvistika, komputační lingvistika, morphology, generátor tvarů, Old English, morfologie
- Published
- 2017
14. Corpora as Data Sources for the Up-Grading of Morphological Tagging
- Author
-
Osolsobě, Klára and Čermák, Petr
- Subjects
automatická morfologická analýza, varianta, variant, pos tagging, automatic morphological analysis, morphology, slovotvorba, derivational, morfologické značkování, gerund/deverbal adjective, verbální adjektivum
- Published
- 2015
15. Morphological Analyser of Old English
- Author
-
Tichý, Ondřej, Čermák, Jan, Petkevič, Vladimír, and Kučera, Karel
- Subjects
komputační lingvistika, automatická morfologická analýza, Old English, morphology, computational linguistics, historická lingvistika, morfologie, historical linguistics, stará angličtina, automatic morphological analysis
- Abstract
The paper describes the construction and testing of an electronic application for automatic morphological analysis of Old English. It introduces resources and methodologies at our disposal based on the state of the art in the field of electronic analysis of Old English and on an overview of Old English morphology. A detailed account of the chosen methodology is offered and a specific description of the implementation is provided: from the acquisition and preparation of the input data and choice of technology to the programming and testing of the results. The resulting recall of 95% can be seen as a success of the project; however, the paper also shows how the recall may be improved. It also discusses further use of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility.
- Published
- 2014
16. Análisis morfológico con herramientas informáticas: reconocimiento de nombres en textos de español con el sistema Nooj
- Abstract
The objective of this research is to show the scope of computational linguistics in the use of software tools for automatic morphological analysis. Two programs are described: on the one hand, Smorph, software created by Gabriel Bes, whose formalization refers to lemmas and endings; on the other hand, the Nooj system, designed by Max Silberztein to perform morphological, syntactic and semantic analysis of natural languages. Because Nooj does not yet have linguistic data for Spanish, the paper shows how the models for the noun category, as declared in Smorph, are adapted to create the Spanish grammars and dictionaries that Nooj requires.
- Published
- 2013
17. Análisis morfológico con herramientas informáticas: reconocimiento de nombres en textos de español con el sistema Nooj
- Abstract
The objective of this research is to show the scope of computational linguistics in the use of software tools for automatic morphological analysis. Two programs are described: on the one hand, Smorph, software created by Gabriel Bes, whose formalization refers to lemmas and endings; on the other hand, the Nooj system, designed by Max Silberztein to perform morphological, syntactic and semantic analysis of natural languages. Because Nooj does not yet have linguistic data for Spanish, the paper shows how the models for the noun category, as declared in Smorph, are adapted to create the Spanish grammars and dictionaries that Nooj requires.
- Published
- 2013
18. Annotation of Lithuanian lexemes : peculiarities and problems
- Author
-
Rimkutė, Erika, Valskys, Vidas, and Vaskelienė, Jolanta
- Subjects
Morphological annotator, Lietuva (Lithuania), Lexical database, Žodžių daryba. Žodžio dalys / Word formation. Parts of a word, Morfologija / Morphology, Tekstynas, Substantive, Automatic morphological analysis, Corpus, Automatinė morfologinė analizė, Kalbos dalys. Morfologija / Morphology
- Abstract
The article presents the operating principles of the Lithuanian morphological annotator and the peculiarities of automatic morphological analysis. The paper focuses on building the lexical database of the Lithuanian morphological annotator, one of the completed tasks of the project Internet Resources: Annotated Corpus of the Lithuanian Language and Tools of Annotation (ALKA 2), implemented in 2007-2008 and sponsored by the Lithuanian State Science and Studies Foundation. The selection of words to be included in the lexical database is described in detail, along with the stages of morphological annotation and the difficulties encountered in this work. The lexical database has grown by 24 000 words (mostly proper and common nouns), so the quality of the morphological annotator is expected to improve considerably and many unrecognized words to be avoided. The goal of the article is to show the process of annotation. It reveals that problems arise not only in evaluating whether new words are acceptable for the Lithuanian language and in identifying their meanings, but also in their morphological analysis: it is difficult to determine their declension paradigms, gender, number inflection, derivatives, etc.
- Published
- 2009
19. Problems of the automatic morphological analysis
- Author
-
Rimkutė, Erika
- Subjects
Antraštinis žodis (lema), Vienareikšminimas, Lietuva (Lithuania), Headline word (lemma), Morfologija / Morphology, Tekstynas, Automatic morphological analysis, Corpus, Monosemy, Automatinė morfologinė analizė, Polysemy
- Abstract
The article presents automatic morphological analysis of Lithuanian and an automatically lemmatized and morphologically annotated corpus of 1 million words. The morphologically tagged corpus has revealed a high degree of ambiguity in the language: about 40 percent of word forms are ambiguous. Corpus linguistics has drawn attention to issues that had not been analysed with the methods of traditional linguistics; the morphological ambiguity of the language became obvious only in this automatically tagged corpus. The computer program „Lemuoklis“, created by V. Zinkevičius, identifies the lemmas and morphological categories of word forms, and the ambiguity surfaced only in texts processed by this program. It is very important for automatic morphological analysis to define parts of speech clearly, because doing so helps to avoid morphological ambiguity. Often, however, it is difficult even for a human tagger to choose the right form, as dictionaries and grammars do not agree on how to define parts of speech and some other morphological categories. Some problematic aspects of Lithuanian morphology are analysed in this article: it is often difficult to decide which part of speech a non-inflective word belongs to, what the boundaries of some words are, how to separate ambiguous inflective and non-inflective words, and which morphological categories particular parts of speech have.
- Published
- 2003
20. Morphological analysis of inflective languages through generation
- Author
-
Gelbukh Khan, Alexander Felixovitch and Sidorov, Grigori
- Subjects
Lenguajes flexivos, Inflective languages, Análisis morfológico automático, Automatic morphological analysis, Análisis a través de generación, Analysis through generation
- Abstract
A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. Existing models require the development of rules that specify which stem variants can be generated from a given one. Many of these rules (e.g., about a thousand for Russian) have no reasonable linguistic interpretation. We suggest a method that avoids such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type, known as analysis through generation, make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented. Work done under partial support of the Mexican Government (CONACyT and SNI) and CGEPI-IPN, Mexico.
- Published
- 2002
21. Morphological analysis of inflective languages through generation
- Abstract
A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. Existing models require the development of rules that specify which stem variants can be generated from a given one. Many of these rules (e.g., about a thousand for Russian) have no reasonable linguistic interpretation. We suggest a method that avoids such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type, known as analysis through generation, make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented.
- Published
- 2002
22. Morphological analysis of inflective languages through generation
- Abstract
A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. Existing models require the development of rules that specify which stem variants can be generated from a given one. Many of these rules (e.g., about a thousand for Russian) have no reasonable linguistic interpretation. We suggest a method that avoids such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type, known as analysis through generation, make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented.
- Published
- 2002