Descriptor: "automatic morphological analysis" - Searchworks@Jio Institute Digital Library Search Results

Your search for "automatic morphological analysis" returned 22 results

1. Building a Combined Morphological Model for Russian Word Forms

Author: Bolshakova, Elena I. and Sapin, Alexander S.
Founding Editors: Goos, Gerhard; Hartmanis, Juris
Editorial Board Members: Bertino, Elisa; Gao, Wen; Steffen, Bernhard; Yung, Moti
Editors: Burnaev, Evgeny; Ignatov, Dmitry I.; Ivanov, Sergei; Khachay, Michael; Koltsova, Olessia; Kutuzov, Andrei; Kuznetsov, Sergei O.; Loukachevitch, Natalia; Napoli, Amedeo; Panchenko, Alexander; Pardalos, Panos M.; Saramäki, Jari; Savchenko, Andrey V.; Tsymbalov, Evgenii; Tutubalina, Elena
Published: 2022

2. The morpholexicon of the Uzbek language as a source for the Corpus

Author: Kholiyorov, Ural Menglievich
Published: 2021

3. Kladenští type as a problem of automatic morphological analysis.

Author: Osolsobě, Klára and Žižková, Hana
Subjects: *PARTS of speech, *ADJECTIVES (Grammar)
Abstract: The aim of our paper is to demonstrate the procedures by which the data needed to refine tools for automatic morphological analysis of Czech can be obtained using a corpus, namely the Araneum Bohemicum IV Maximum (Czech, 20.03) 7.10 G web corpus of the ARANEA series and Araneum Bohemicum Maximum (Czech, 15.04) 3.20 G (hereinafter Araneum). In particular, we focus on propria of the Kladenští type, i.e., substantivized adjectives denoting groups of persons according to affiliation. The goal of the probe into the Aranea web corpus is: 1) a corpus-based description of frequent properties of the Kladenští type, which can be used as a starting point for rule disambiguation; 2) creating a list of the most frequent lemmas belonging to the Kladenští type, which can then be included in the dictionaries of automatic morphological analyzers (e.g. the MorfFlex dictionary by Hajič and Hlaváčová). We believe that the probe can help improve the results of tools for automatic morphological analysis of Czech. [ABSTRACT FROM AUTHOR]
Published: 2022

4. IMPROVING NOMINALIZED ADJECTIVES TAGGING.

Author: OSOLSOBĚ, KLÁRA and ŽIŽKOVÁ, HANA
Subjects: *PARTS of speech, *DATA analysis
Abstract: Part-of-speech transitions represent an interesting issue in terms of Automatic Morphological Analysis (AMA). In these cases, two parts of speech have to be considered: the initial one and the final one. However, their automatic recognition is complicated by the fact that both share the same form. This article presents the results of a corpus study aimed at mapping the tagging of nominalized adjectives, with a focus on detecting candidates for nominalization among frequent adjectives. Analysis of the data obtained from the ČNK SYN v5 corpus shows different reasons for incorrect tagging. Taking these reasons into account, we propose three solutions for improving the tagging of nominalized adjectives. [ABSTRACT FROM AUTHOR]
Published: 2019

5. Nástroj na tvaroslovnou analýzu staré angličtiny : Morphological Analyser of Old English

Author: Ondřej Tichý
Subjects: Old English, historical linguistics, computational linguistics, automatic morphological analysis, morphology, forms generator, stará angličtina, historická lingvistika, komputační lingvistika, automatická morfologická analýza, morfologie, generátor tvarů, Philology. Linguistics, P1-1091
Abstract: The paper describes the construction and testing of an electronic application for semi-automatic morphological analysis of Old English. It introduces the state of the art in the field of electronic analysis of Old English, provides a brief overview of Old English morphology and discusses the reasoning behind our theoretical framework. An account of the chosen methodology is offered and a specific description of its implementation is provided: from the acquisition and preparation of the lexical input data, through the programming of the forms generator to the testing of the results by analysing Old English text. The resulting recall of 95% is a success; however, the paper also hints at how it may be improved. It also discusses further use and development of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility.
Published: 2017

6. Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování) : Corpora as Data Sources for the Up-Grading of Morphological Tagging

Author: Osolsobě, Klára
Subjects: gerund/deverbal adjective, pos tagging, automatic morphological analysis, variant, derivational, morphology, verbální adjektivum, morfologické značkování, automatická morfologická analýza, Philology. Linguistics, P1-1091
Abstract: Adjectives ending with -oucí/-ící are regularly derived from verbs and hence are not usually listed in any of the Czech monolingual dictionaries. On the level of automatic morphological analysis (the dictionary) of Czech they should be generated from verbal roots and tagged as verbal adjectives (pos tag: AG.*). The data from Czech corpora prove a) inconsistencies in tagging and b) gaps in the dictionary. The main cause of both kinds of insufficiency is the existence of variants on the level of verbal forms from which the verbal adjectives are potentially derived. Consequently, text corpora are a significant source of knowledge about the formation and use of adjectives with endings -oucí/-ící that can be important for both a) automatic morphological analysis of Czech and b) theoretical description of Czech grammar (derivational morphology). Our goal is to present a corpus-based study of the Czech gerund, i.e. verbal adjectives with -oucí/-ící. The link between the inflected and the word-formation variants will be demonstrated using material from the SYN corpus (2.6 billion tokens of written Czech) and the large web corpus czTenTen12 (5.2 billion tokens of Czech text from the Internet, cleaned and deduplicated).
Published: 2015

7. On the Root-Based Lexicon for Polish

Author: Rabiega-Wiśniewska, Joanna
Series editors: Hutchison, David; Kanade, Takeo; Kittler, Josef; Kleinberg, Jon M.; Mattern, Friedemann; Mitchell, John C.; Naor, Moni; Nierstrasz, Oscar; Pandu Rangan, C.; Steffen, Bernhard; Sudan, Madhu; Terzopoulos, Demetri; Tygar, Doug; Vardi, Moshe Y.; Weikum, Gerhard
Editors: Marciniak, Małgorzata; Mykowiecka, Agnieszka
Published: 2009

8. COMPOUND ADVERBS AS AN ISSUE IN MACHINE ANALYSIS OF CZECH LANGUAGE.

Author: ŽIŽKOVÁ, HANA
Subjects: *ADVERBS (Grammar), *CZECH language, *MORPHOLOGY (Grammar), *PARTS of speech, *CORPORA
Abstract: Compound adverbs represent an interesting issue in terms of Automatic Morphological Analysis (AMA). The reason is that compound adverbs in Czech are expressions formed by compounding existing words that are different parts of speech without any change in their form. An indicative sign of compound adverbs is that they can always be decomposed again. Compound adverbs may be written as one word, but sometimes a multiword form coexists. A word that is originally a different part of speech gains an adverbial meaning and becomes an adverb. This article presents the results of a corpus probe aimed at mapping expressions that are demonstrably compound adverbs and were not recognized by AMA or were incorrectly tagged by AMA as another part of speech. Analysis of data obtained from the Czech National Corpus (ČNK) SYN v3 shows that the unrecognized and incorrectly tagged units can be divided into several groups. Based on knowledge of these groups, it is possible to refine part-of-speech tagging by AMA. The corpus probe examined units written in accordance with the current codification as well as substandard units. [ABSTRACT FROM AUTHOR]
Published: 2017

9. Nová automatická morfologická analýza češtiny.

Author: OSOLSOBĚ, KLÁRA, HLAVÁČOVÁ, JAROSLAVA, PETKEVIČ, VLADIMÍR, SVÁŠEK, MARTIN, and ŠIMANDL, JOSEF
Abstract: A detailed morphological description of word forms in any language is one of the necessary conditions for the successful automatic processing of linguistic data. The aim of this paper is to present a project aimed at a new description of Czech morphology, especially the planned changes in the tagset. The key changes are as follows: 1) the unambiguous description of variants; 2) the concept of a multiple lemma; 3) the revision of part-of-speech definitions. [ABSTRACT FROM AUTHOR]
Published: 2017

10. Nástroj na tvaroslovnou analýzu staré angličtiny.

Author: Tichý, Ondřej
Abstract: The paper describes the construction and testing of an electronic application for semi-automatic morphological analysis of Old English. It introduces the state of the art in the field of electronic analysis of Old English, provides a brief overview of Old English morphology and discusses the reasoning behind our theoretical framework. An account of the chosen methodology is offered and a specific description of its implementation is provided: from the acquisition and preparation of the lexical input data, through the programming of the forms generator to the testing of the results by analysing Old English text. The resulting recall of 95% is a success; however, the paper also hints at how it may be improved. It also discusses further use and development of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility. [ABSTRACT FROM AUTHOR]
Published: 2017

11. Compound Adverbs as an Issue in Machine Analysis of Czech language

Author: Hana Žižková
Subjects: Czech, Linguistics and Language, automatic morphological analysis, Computer science, corpus, engineering and technology, Part of speech, multiword expression, Language and Linguistics, Philology. Linguistics, nominal form, P1-1091, tag, electrical engineering, electronic engineering, information engineering, artificial intelligence & image processing, compound adverb, Artificial intelligence, Natural language processing
Abstract: Compound adverbs represent an interesting issue in terms of Automatic Morphological Analysis (AMA). The reason is that compound adverbs in Czech are expressions formed by compounding existing words that are different parts of speech without any change in their form. An indicative sign of compound adverbs is that they can always be decomposed again. Compound adverbs may be written as one word, but sometimes a multiword form coexists. A word that is originally a different part of speech gains an adverbial meaning and becomes an adverb. This article presents the results of a corpus probe aimed at mapping expressions that are demonstrably compound adverbs and were not recognized by AMA or were incorrectly tagged by AMA as another part of speech. Analysis of data obtained from the Czech National Corpus (ČNK) SYN v3 shows that the unrecognized and incorrectly tagged units can be divided into several groups. Based on knowledge of these groups, it is possible to refine part-of-speech tagging by AMA. The corpus probe examined units written in accordance with the current codification as well as substandard units.
Published: 2017

12. ANÁLISIS MORFOLÓGICO CON HERRAMIENTAS INFORMÁTICAS. RECONOCIMIENTO DE NOMBRES EN TEXTOS DE ESPAÑOL CON EL SISTEMA NOOJ.

Author: Tramallino, Carolina Paola
Subjects: COMPUTATIONAL linguistics, SEMANTICS, NOUNS, GRAMMAR
Published: 2013

13. Morphological Analyser of Old English

Author: Tichý, Ondřej
Subjects: automatická morfologická analýza, computational linguistics, forms generator, historical linguistics, stará angličtina, automatic morphological analysis, historická lingvistika, komputační lingvistika, morphology, generátor tvarů, Old English, morfologie
Published: 2017

14. Corpora as Data Sources for the Up-Grading of Morphological Tagging

Author: Osolsobě, Klára and Čermák, Petr
Subjects: automatická morfologická analýza, varianta, variant, pos tagging, automatic morphological analysis, morphology, slovotvorba, derivational, morfologické značkování, gerund/deverbal adjective, verbální adjektivum
Published: 2015

15. Morphological Analyser of Old English

Author: Tichý, Ondřej, Čermák, Jan, Petkevič, Vladimír, and Kučera, Karel
Subjects: komputační lingvistika, automatická morfologická analýza, Old English, morphology, computational linguistics, historická lingvistika, morfologie, historical linguistics, stará angličtina, automatic morphological analysis
Abstract: The paper describes the construction and testing of an electronic application for automatic morphological analysis of Old English. It introduces resources and methodologies at our disposal based on the state of the art in the field of electronic analysis of Old English and on an overview of Old English morphology. A detailed account of the chosen methodology is offered and a specific description of the implementation is provided: from the acquisition and preparation of the input data and choice of technology to the programming and testing of the results. The resulting recall of 95% can be seen as a success of the project; however, the paper also shows how the recall may be improved. It also discusses further use of the analyser, especially the disambiguation of its results. The paper makes a future semi-automatic morphological tagging of Old English texts a real possibility.
Published: 2014

16. Análisis morfológico con herramientas informáticas: reconocimiento de nombres en textos de español con el sistema Nooj

Abstract: The objective of this work is to show the scope of computational linguistics in the use of software tools for automatic morphological analysis. Two programs are described: on the one hand, Smorph, software created by Gabriel Bes, whose formalization refers to lemmas and endings; on the other hand, the NooJ system, designed by Max Silberztein to perform morphological, syntactic and semantic analysis of natural languages. Because NooJ does not yet have linguistic data for Spanish, the paper shows how the models for the noun category declared in Smorph are adapted to create the Spanish grammars and dictionaries that NooJ requires.
Published: 2013

17. Análisis morfológico con herramientas informáticas: reconocimiento de nombres en textos de español con el sistema Nooj

Abstract: The objective of this work is to show the scope of computational linguistics in the use of software tools for automatic morphological analysis. Two programs are described: on the one hand, Smorph, software created by Gabriel Bes, whose formalization refers to lemmas and endings; on the other hand, the NooJ system, designed by Max Silberztein to perform morphological, syntactic and semantic analysis of natural languages. Because NooJ does not yet have linguistic data for Spanish, the paper shows how the models for the noun category declared in Smorph are adapted to create the Spanish grammars and dictionaries that NooJ requires.
Published: 2013

18. Annotation of Lithuanian lexemes : peculiarities and problems

Author: Rimkutė, Erika, Valskys, Vidas, and Vaskelienė, Jolanta
Subjects: Morphological annotator, Lietuva (Lithuania), Lexical database, Žodžių daryba. Žodžio dalys / Word formation. Parts of a word, Morfologija / Morphology, Tekstynas, Substantive, Automatic morphological analysis, Corpus, Automatinė morfologinė analizė, Kalbos dalys. Morfologija / Morphology
Abstract: The article presents the operating principles of the Lithuanian morphological annotator and the peculiarities of automatic morphological analysis. The paper focuses on building the lexical database of the Lithuanian morphological annotator, which is one of the completed tasks of the project Internet Resources: Annotated Corpus of the Lithuanian Language and Tools of Annotation (ALKA 2), implemented in 2007-2008 and sponsored by the Lithuanian State Science and Studies Foundation. The selection of words to be included in the lexical database of the morphological annotator is described in detail. The stages of morphological annotation and the difficulties encountered in this work are also discussed. The lexical database of the morphological annotator has grown by 24 000 words (mostly proper and common nouns). It is therefore expected that the quality of the morphological annotator will improve considerably and that many cases of unrecognized words will be avoided. The goal of the article is to show the process of annotation. It reveals that problems arise not only in evaluating whether new words are acceptable in Lithuanian and in identifying their meanings, but also in their morphological analysis: it is difficult to determine their declension paradigms, gender, number inflection, derivatives, etc.
Published: 2009

19. Problems of the automatic morphological analysis

Author: Rimkutė, Erika
Subjects: Antraštinis žodis (lema), Vienareikšminimas, Lietuva (Lithuania), Headline word (lemma), Morfologija / Morphology, Tekstynas, Automatic morphological analysis, Corpus, Monosemy, Automatinė morfologinė analizė, Polysemy
Abstract: The article deals with the automatically tagged corpus of the Lithuanian language. The morphologically tagged corpus has revealed a high degree of ambiguity in the language: about 40 percent of word forms are ambiguous. Corpus linguistics has drawn attention to issues that had not been analysed using the methods of traditional linguistics. The morphological ambiguity of the language became obvious only in this automatically tagged corpus. The computer program „Lemuoklis“, created by V. Zinkevičius, identifies the lemmas and morphological categories of word forms. The morphological ambiguity appeared only in texts that were processed by this program. For automatic morphological analysis it is very important to define parts of speech clearly, because precise definitions help to avoid morphological ambiguity. Often, however, it is difficult even for a human tagger to choose the right form, as dictionaries and grammars do not agree on how to define parts of speech and some other morphological categories. Some problematic aspects of Lithuanian morphology are analysed in this article. Very often it is difficult to decide which part of speech a non-inflective word belongs to, what the boundaries of some words are, how to separate ambiguous inflective and non-inflective words, and what morphological categories some parts of speech have.
Published: 2003

20. Morphological analysis of inflective languages through generation

Author: Gelbukh Khan, Alexander Felixovitch and Sidorov, Grigori
Subjects: Lenguajes flexivos, Inflective languages, Análisis morfológico automático, Automatic morphological analysis, Análisis a través de generación, Analysis through generation
Abstract: A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. The existing models require the development of corresponding rules that specify which stems can be generated from a given one. Many such rules (e.g., about a thousand for Russian) do not have any reasonable linguistic interpretation. We suggest a method that avoids the use of such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type are known as analysis through generation; they make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented. Work done under partial support of Mexican Government (CONACyT and SNI) and CGEPI-IPN, Mexico.
Published: 2002
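
Note on record 20: the idea of analysis through generation described in the abstract can be illustrated with a minimal sketch. This is not the authors' system. The two-verb Spanish lexicon, the `PRESENT_AR` paradigm table and the names `Lexeme`, `generate` and `analyze` are all hypothetical and chosen only for illustration. The lexicon stores stem variants for generation only; analysis strips a known ending to form a hypothesis, regenerates the form from the lexicon, and accepts the hypothesis only if the result matches the input.

```python
from typing import NamedTuple


class Lexeme(NamedTuple):
    lemma: str
    stems: dict[str, str]  # stem kind -> stem string (toy data, illustrative only)


# Hypothetical toy paradigm: grammatical tag -> (stem kind, ending).
PRESENT_AR = {
    "1sg": ("stressed", "o"),
    "2sg": ("stressed", "as"),
    "3sg": ("stressed", "a"),
    "1pl": ("base", "amos"),
    "2pl": ("base", "áis"),
    "3pl": ("stressed", "an"),
}

# "contar" alternates cont-/cuent-; "cantar" does not alternate.
LEXICON = [
    Lexeme("contar", {"base": "cont", "stressed": "cuent"}),
    Lexeme("cantar", {"base": "cant", "stressed": "cant"}),
]

# Index every stored stem variant so a hypothesized stem can be looked up.
STEM_INDEX: dict[str, list[Lexeme]] = {}
for lex in LEXICON:
    for stem in set(lex.stems.values()):
        STEM_INDEX.setdefault(stem, []).append(lex)


def generate(lex: Lexeme, tag: str) -> str:
    """Generate the word form of `lex` for `tag` (the only place alternation lives)."""
    kind, ending = PRESENT_AR[tag]
    return lex.stems[kind] + ending


def analyze(word: str) -> list[tuple[str, str]]:
    """Hypothesize (stem, tag) splits, then verify each one by generation."""
    results = []
    for tag, (_, ending) in PRESENT_AR.items():
        if not word.endswith(ending):
            continue
        stem = word[: len(word) - len(ending)]
        for lex in STEM_INDEX.get(stem, []):
            # Verification step: keep the hypothesis only if generation
            # reproduces the input form exactly.
            if generate(lex, tag) == word:
                results.append((lex.lemma, tag))
    return results
```

In this sketch the non-existent form "conto" is rejected: the stem "cont" is found in the index, but generating the first person singular of "contar" yields "cuento", so the hypothesis fails verification. No analysis-side rule about the cont-/cuent- alternation is needed.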

21. Morphological analysis of inflective languages through generation

Abstract: A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. The existing models require the development of corresponding rules that specify which stems can be generated from a given one. Many such rules (e.g., about a thousand for Russian) do not have any reasonable linguistic interpretation. We suggest a method that avoids the use of such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type are known as analysis through generation; they make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented.
Published: 2002

22. Morphological analysis of inflective languages through generation

Abstract: A crucial problem in the development of systems for automatic morphological analysis of inflective languages is the treatment of stem alternations. The existing models require the development of corresponding rules that specify which stems can be generated from a given one. Many such rules (e.g., about a thousand for Russian) do not have any reasonable linguistic interpretation. We suggest a method that avoids the use of such rules by generating and verifying hypotheses about possible grammatical forms. Methods of this type are known as analysis through generation; they make system development much simpler than the standard direct approach. A morphological analysis and generation system for Russian developed with our method is freely available for academic use; a Spanish system is being implemented.
Published: 2002
