22 results on '"Tomohiro Ohno"'
Search Results
2. Hierarchical Coordinate Structure Analysis for Japanese Statutory Sentences Using Neural Language Models
- Author
-
Takahiro Yamakoshi, Katsuhiko Toyama, Makoto Nakamura, Yasuhiro Ogawa, and Tomohiro Ohno
- Subjects
Coordinate structure, business.industry, Statutory law, Computer science, Language model, Artificial intelligence, business, computer.software_genre, computer, Natural language processing
- Published
- 2018
- Full Text
- View/download PDF
3. Sequential Linefeed Insertion into Lecture Transcriptions for Real-Time Captioning
- Author
-
Masaki Murata, Shigeki Matsubara, and Tomohiro Ohno
- Subjects
Closed captioning, Phrase, Computer Networks and Communications, business.industry, Computer science, Applied Mathematics, Speech recognition, General Physics and Astronomy, Speech corpus, computer.software_genre, Dependency structure, Signal Processing, Information support, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Natural language processing, Sentence, Delay time, Spoken language
- Abstract
To generate readable captions for Japanese spoken monologues such as lectures in real time, it is necessary to sequentially display captions that have proper linefeeds inserted. This paper proposes a technique for sequentially inserting proper linefeeds into a lecture transcript whenever a bunsetsu is identified. A bunsetsu is a Japanese linguistic unit shorter than a sentence that roughly corresponds to a basic phrase in English. Under the assumption that linefeeds are inserted at bunsetsu boundaries, this technique minimizes the delay time of captioning. The technique statistically judges whether or not a linefeed should be inserted at each bunsetsu boundary by using only the information that is available at that time. We conducted experiments on linefeed insertion using a Japanese lecture corpus. The experimental results confirmed that our bunsetsu-based linefeed insertion method was almost as accurate as the sentence-based linefeed insertion method. In addition, we conducted comparative evaluations using four baseline methods. The results confirmed that our method could insert linefeeds more accurately than simple methods that are thought to have the same delay time as our method.
- Published
- 2015
- Full Text
- View/download PDF
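The sequential setting described in the abstract above can be illustrated with a minimal sketch. It is not the paper's statistical model: the decision rule here is a stand-in (a fixed character budget per line), and the names `insert_linefeeds` and `max_chars` are hypothetical. What the sketch does show is the constraint the abstract states: each linefeed decision is made at a bunsetsu boundary using only what has arrived so far.

```python
from typing import Iterable, Iterator


def insert_linefeeds(bunsetsu_stream: Iterable[str], max_chars: int = 20) -> Iterator[str]:
    """Yield bunsetsu one by one, emitting "\n" at a boundary when needed.

    Assumption: the linefeed decision is a simple line-length check, used
    here only as a placeholder for a learned statistical judgment.
    """
    line_len = 0  # characters already placed on the current caption line
    for bunsetsu in bunsetsu_stream:
        # Decide at the boundary *before* this bunsetsu, using only past input.
        if line_len > 0 and line_len + len(bunsetsu) > max_chars:
            yield "\n"
            line_len = 0
        yield bunsetsu
        line_len += len(bunsetsu)


# Usage: units are joined without spaces, as in Japanese text.
caption = "".join(insert_linefeeds(["aaaa", "bbbb", "cccc"], max_chars=8))
print(caption)
```

Because the function is a generator, each unit can be displayed as soon as it is identified, without waiting for the end of the sentence.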
4. Utilization of Multi-word Expressions to Improve Statistical Machine Translation of Statutory Sentences
- Author
-
Katsuhiko Toyama, Makoto Nakamura, Satomi Sakamoto, Yasuhiro Ogawa, and Tomohiro Ohno
- Subjects
Machine translation, Computer science, business.industry, Speech recognition, Automatic translation, computer.software_genre, Translation (geometry), Sørensen–Dice coefficient, Rule-based machine translation, Artificial intelligence, business, computer, Word (computer architecture), Natural language processing
- Abstract
Statutory sentences are generally difficult to read because of their complicated expressions and length. Such difficulty is one reason for the low quality of statistical machine translation (SMT). Multi-word expressions (MWEs) also complicate statutory sentences and extend their length. Therefore, we proposed a method that utilizes MWEs to improve an SMT system for statutory sentences. In our method, we extracted monolingual MWEs from a parallel corpus, automatically acquired their translations based on the Dice coefficient, and integrated the extracted bilingual MWEs into an SMT system by the single-tokenization strategy. Experimental results showed that the proposed method significantly improved the translation quality of our SMT system. Although automatic translation equivalent acquisition using the Dice coefficient is not perfect, the best system's score was close to that of a system that used bilingual MWEs whose equivalents were translated by hand.
- Published
- 2017
- Full Text
- View/download PDF
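The Dice coefficient mentioned in the abstract above is 2·f(x, y) / (f(x) + f(y)), where f counts occurrences and co-occurrences. The following is a minimal sketch of how it could rank translation candidates, assuming sentence-level co-occurrence counts over a parallel corpus whose source-side MWEs are already joined into single tokens. The function name `dice_scores`, the toy corpus, and the placeholder tokens are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Tuple

SentencePair = Tuple[List[str], List[str]]  # (source tokens, target tokens)


def dice_scores(pairs: List[SentencePair], source_mwe: str) -> Dict[str, float]:
    """Score target tokens against source_mwe by sentence-level co-occurrence.

    Dice(x, y) = 2 * f(x, y) / (f(x) + f(y)), with f counted per sentence pair.
    """
    src_count = 0            # sentence pairs whose source side contains source_mwe
    tgt_count = Counter()    # sentence pairs whose target side contains each token
    joint_count = Counter()  # sentence pairs containing both
    for src_tokens, tgt_tokens in pairs:
        has_src = source_mwe in src_tokens
        if has_src:
            src_count += 1
        for token in set(tgt_tokens):
            tgt_count[token] += 1
            if has_src:
                joint_count[token] += 1
    return {t: 2 * joint_count[t] / (src_count + tgt_count[t]) for t in joint_count}


# Toy corpus with placeholder tokens (illustrative only).
toy_pairs = [
    (["s_mwe", "x"], ["t_cand1", "p"]),
    (["s_mwe", "y"], ["t_cand1", "q"]),
    (["z"], ["t_cand2", "p"]),
    (["s_mwe", "z"], ["t_cand1", "r"]),
]
scores = dice_scores(toy_pairs, "s_mwe")
best = max(scores, key=scores.get)
print(best, scores[best])
```

In this toy corpus `t_cand1` appears in exactly the sentence pairs that contain `s_mwe`, so it receives the highest score; tokens that never co-occur with the source MWE are not scored at all.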
5. Sequential Linefeed Insertion into Lecture Transcription for Real-time Captioning
- Author
-
Tomohiro Ohno, Masaki Murata, and Shigeki Matsubara
- Subjects
Closed captioning, business.industry, Computer science, Speech recognition, Speech corpus, computer.software_genre, Dependency structure, Artificial intelligence, Information support, Electrical and Electronic Engineering, Transcription (software), business, computer, Natural language processing, Spoken language
- Published
- 2013
- Full Text
- View/download PDF
6. Construction of linefeed insertion rules for lecture transcript and their evaluation
- Author
-
Shigeki Matsubara, Masaki Murata, and Tomohiro Ohno
- Subjects
Closed captioning, linefeed insertion rules, business.industry, Computer science, Speech corpus, computer.software_genre, spoken language, sentence analysis, Japanese language, Morpheme, clause boundaries, real-time captioning, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Artificial intelligence, business, computer, Sentence analysis, speech corpus, Sentence, Natural language processing, Spoken language
- Abstract
The development of a captioning system that supports the real-time understanding of monologue speech such as lectures and commentaries is required. In monologues, since a sentence tends to be long, each sentence is often displayed across multiple lines on the screen. In such cases, it is necessary to insert linefeeds into the text so that it becomes easy to read. This paper proposes a rule-based technique for inserting linefeeds into a Japanese spoken monologue sentence as an elemental technique for generating readable captions. Our method inserts linefeeds into a sentence by applying rules based on morphemes, dependencies and clause boundaries. We established the rules by closely investigating a corpus annotated with linefeeds. An experiment using a Japanese monologue corpus has shown the effectiveness of our rules.
- Published
- 2010
7. Text-Style Conversion of Speech Transcript into Web Document for Lecture Archive
- Author
-
Masashi Ito, Shigeki Matsubara, and Tomohiro Ohno
- Subjects
Knowledge society, Information retrieval, Web 2.0, business.industry, Computer science, digital archiving, paraphrasing, computer.software_genre, Readability, Human-Computer Interaction, Artificial Intelligence, web contents, Web page, natural languages, Redundancy (engineering), ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Segmentation, The Internet, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Natural language processing, Natural language, spoken language processing
- Abstract
Accumulating spoken documents on the web is very significant to the knowledge society. However, because of the high redundancy of spontaneous speech, the faithfully transcribed text is not readable on an Internet browser, and therefore not suitable as a web document. This paper proposes a technique for converting spoken documents into web documents for the purpose of building a speech archiving system. The technique edits automatically transcribed texts and improves their readability on the browser. The readable text can be generated by applying technologies such as paraphrasing, segmentation, and structuring to the transcribed texts. Editing experiments using lecture data demonstrated the feasibility of the technique. A prototype system of spoken document archiving was implemented to confirm its effectiveness.
- Published
- 2009
8. Dependency parsing of Japanese monologue using clause boundaries
- Author
-
Yasuyoshi Inagaki, Takehiko Maruyama, Tomohiro Ohno, Hideki Kashioka, Hideki Tanaka, and Shigeki Matsubara
- Subjects
Structure (mathematical logic), Linguistics and Language, Sentence length, Computer science, business.industry, Balanced sentence, Speech recognition, General Social Sciences, Speech corpus, Library and Information Sciences, computer.software_genre, ComputingMethodologies_ARTIFICIALINTELLIGENCE, Language and Linguistics, Education, Feature (linguistics), TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Dependency grammar, Artificial intelligence, Computational linguistics, business, computer, Sentence, Natural language processing, Spoken language
- Abstract
Spoken monologues feature greater sentence length and structural complexity than spoken dialogues. To achieve high parsing performance for spoken monologues, simplifying the structure by dividing a sentence into suitable language units could prove effective. This paper proposes a method for dependency parsing of Japanese spoken monologues based on sentence segmentation. In this method, dependency parsing is executed in two stages: at the clause level and the sentence level. First, dependencies within a clause are identified by dividing a sentence into clauses and executing stochastic dependency parsing for each clause. Next, dependencies across clause boundaries are identified stochastically, and the dependency structure of the entire sentence is thus completed. An experiment using a spoken monologue corpus shows the effectiveness of this method for efficient dependency parsing of Japanese monologue sentences.
- Published
- 2007
- Full Text
- View/download PDF
9. Robust Dependency Parsing of Spontaneous Japanese Spoken Language
- Author
-
Yasuyoshi Inagaki, Nobuo Kawaguchi, Tomohiro Ohno, and Shigeki Matsubara
- Subjects
Parsing, Computer science, business.industry, Speech recognition, computer.software_genre, syntactically annotated corpus, dependency parsing, linguistic phenomena, Japanese speech, Artificial Intelligence, Hardware and Architecture, Dependency grammar, S-attributed grammar, Written language, stochastic parsing, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Software, Natural language processing, Utterance, Bottom-up parsing, Spoken language
- Abstract
Spontaneously spoken Japanese includes many grammatically ill-formed linguistic phenomena such as fillers, hesitations, and inversions, which do not appear in written language. This paper proposes a novel method of robust dependency parsing using a large-scale spoken language corpus, and evaluates the availability and robustness of the method using spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese including fillers, inversions, or dependencies over utterance units. Experimental results reveal that the parsing accuracy reached 87.0%, and we confirmed that it is effective to utilize the location information of a bunsetsu and the distance information between bunsetsus as stochastic information.
- Published
- 2005
10. Robust Dependency Parsing of Spontaneous Japanese Speech and Its Evaluation
- Author
-
Yasuyoshi Inagaki, Tomohiro Ohno, Shigeki Matsubara, and Nobuo Kawaguchi
- Subjects
Parsing, business.industry, Computer science, Foundation (evidence), parsing, corpus, computer.software_genre, Linguistics, Japanese speech, Dependency grammar, dependency grammar, Artificial intelligence, business, computer, Natural language processing
- Abstract
Spontaneously spoken Japanese includes many grammatically ill-formed linguistic phenomena such as fillers, hesitations, and inversions, which do not appear in written language. This paper proposes a method of robust dependency parsing using a large-scale spoken language corpus, and evaluates the availability and robustness of the method using spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese including fillers, inversions, or dependencies over utterance units. In an experiment, the parsing accuracy reached 87.0%, and we confirmed that it is effective to utilize the location information of a bunsetsu and the distance information between bunsetsus as stochastic information. Funding: Grant-in-Aid for Young Scientists of the Ministry of Education, Science, Sports and Culture, Japan; The Tatematsu Foundation.
- Published
- 2004
11. SPIRAL CONSTRUCTION OF SYNTACTICALLY ANNOTATED SPOKEN LANGUAGE CORPUS
- Author
-
Tomohiro Ohno, Yasuyoshi Inagaki, Nobuo Kawaguchi, and Shigeki Matsubara
- Subjects
Dependency (UML), Parsing, Computer science, business.industry, Speech recognition, computer.software_genre, Speech processing, ComputingMethodologies_ARTIFICIALINTELLIGENCE, Rule-based machine translation, Dependency grammar, Language database, Artificial intelligence, Computational linguistics, business, Stochastic parsing, Dependency parsing, Spoken dialogue corpus, computer, Natural language processing, Natural language, Spoken language
- Abstract
Spontaneous speech includes a broad range of linguistic phenomena characteristic of spoken language, and therefore a statistical approach would be effective for robust parsing of spoken language. Though a large-scale syntactically annotated corpus is required for stochastic parsing, its construction requires a lot of human resources. This paper proposes a method of efficiently constructing a spoken language corpus annotated with dependency analyses. This method uses an existing spoken language corpus. A stochastic dependency parser is employed to tag spoken language sentences with dependency structures, and the results are corrected manually. The tagged corpus is constructed in a spiral fashion, wherein the corrected data is utilized as statistical information for the automatic parsing of other data. Taking this spiral approach reduces the parsing errors, which in turn reduces the correction cost. An experiment using 10,995 Japanese utterances shows the spiral approach to be effective for efficient corpus construction.
- Published
- 2003
12. Personalized Text Formatting for E-mail Messages
- Author
-
Shigeki Matsubara, Tomohiro Ohno, and Masaki Murata
- Subjects
Information retrieval, Computer science, InformationSystems_INFORMATIONSYSTEMSAPPLICATIONS, Maximum entropy method, Line length, computer.file_format, Blank, GeneralLiterature_MISCELLANEOUS, Readability, Learning data, Disk formatting, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Formatted text, computer
- Abstract
E-mail systems are common communication tools, and it is desirable that e-mail messages be written so that recipients can read them easily. One technique for writing readable e-mail messages is to insert linefeeds and blank lines appropriately. However, linefeeds and blank lines in incoming e-mails are not always inserted at positions that recipients find readable. If linefeeds and blank lines are inserted automatically at proper positions, the readability of e-mail texts is improved and recipients can read them efficiently. This paper proposes a method for formatting e-mail texts by inserting linefeeds and blank lines into incoming e-mails at positions that recipients find readable.
- Published
- 2012
- Full Text
- View/download PDF
13. Automatic Text Formatting for Social Media Based on Linefeed and Comma Insertion
- Author
-
Tomohiro Ohno, Masaki Murata, and Shigeki Matsubara
- Subjects
Computer science, business.industry, Dependency relation, Maximum entropy method, computer.software_genre, Boundary (real estate), Disk formatting, Factor (programming language), Social media, Artificial intelligence, business, computer, Natural language processing, Spoken language, computer.programming_language
- Abstract
With the emergence of social media, people can now easily transmit information on a personal level. However, because users of social media generally spend little time on writing, low-quality texts are transmitted, which hinders the spread of information. In texts transmitted on social media, commas and linefeeds are inserted incorrectly, which is one factor behind their low quality. This paper proposes a method for automatically formatting Japanese texts in social media. Our method formats texts by inserting commas and linefeeds appropriately. The positions where commas and linefeeds should be inserted are decided by machine learning using morphological information, dependency relations and clause boundary information. An experiment using Japanese spoken language texts has shown the effectiveness of our method.
- Published
- 2011
- Full Text
- View/download PDF
14. Automatic Linefeed Insertion for Improving Readability of Lecture Transcript
- Author
-
Shigeki Matsubara, Masaki Murata, and Tomohiro Ohno
- Subjects
Closed captioning, Computer science, Dependency relation, business.industry, Morpheme, Artificial intelligence, business, computer.software_genre, Adverbial clause, computer, Readability, Natural language processing, Sentence
- Abstract
The development of a captioning system that supports the real-time understanding of monologue speech such as lectures and commentaries is required. In monologues, since a sentence tends to be long, each sentence is often displayed across multiple lines on the screen and becomes hard to read. In such cases, it is necessary to insert linefeeds into the text so that it becomes easy to read. This paper proposes a technique for inserting linefeeds into a Japanese spoken monologue sentence as an elemental technique for generating readable captions. Our method inserts linefeeds into a sentence by applying rules based on morphemes, dependencies and clause boundaries. We established the rules by closely investigating a corpus annotated with linefeeds. An experiment using a Japanese monologue corpus has shown the effectiveness of our rules.
- Published
- 2009
- Full Text
- View/download PDF
15. Text Editing for Lecture Speech Archiving on the Web
- Author
-
Tomohiro Ohno, Masashi Ito, and Shigeki Matsubara
- Subjects
Knowledge society, Information retrieval, Computer science, business.industry, computer.software_genre, Structuring, Readability, Text editing, World Wide Web, Web page, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Redundancy (engineering), Segmentation, The Internet, Artificial intelligence, business, computer, Natural language processing
- Abstract
Accumulating spoken documents on the web is very significant in the knowledge society. However, because of the high redundancy of spontaneous speech, the transcribed text in itself is not readable on an Internet browser, and therefore not suitable as a web document. This paper proposes a technique for converting spoken documents into web documents for the purpose of building a speech archiving system. The technique edits automatically transcribed texts and improves their readability on the browser. The readable text can be generated by applying technologies such as paraphrasing, segmentation and structuring to the transcribed texts. An editing experiment using lecture data showed the feasibility of the technique. A prototype system of spoken document archiving was implemented to confirm its effectiveness.
- Published
- 2009
- Full Text
- View/download PDF
16. Linefeed insertion into Japanese spoken monologue for captioning
- Author
-
Shigeki Matsubara, Tomohiro Ohno, and Masaki Murata
- Subjects
Closed captioning, Computer science, business.industry, Speech recognition, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Line length, Artificial intelligence, computer.software_genre, business, computer, Natural language processing, Sentence
- Abstract
To support the real-time understanding of spoken monologue such as lectures and commentaries, the development of a captioning system is required. In monologues, since a sentence tends to be long, each sentence is often displayed across multiple lines on one screen. It is therefore necessary to insert linefeeds into the text so that it becomes easy to read. This paper proposes a technique for inserting linefeeds into a Japanese spoken monologue text as an elemental technique for generating readable captions. Our method inserts linefeeds into a sentence at appropriate positions by machine learning, based on information such as dependencies, clause boundaries, pauses and line length. An experiment using Japanese speech data has shown the effectiveness of our technique.
- Published
- 2009
- Full Text
- View/download PDF
17. Simultaneous Summarization of Japanese Spoken Monologue for Real-time Captioning
- Author
-
Hideki Kashioka, Y. Inagaki, Tomohiro Ohno, and Shigeki Matsubara
- Subjects
Closed captioning, Computer science, business.industry, Speech recognition, Speech input, computer.software_genre, Speech processing, Automatic summarization, Dependency structure, Rule-based machine translation, Artificial intelligence, business, computer, Natural language processing
- Abstract
The development of a captioning system that supports the real-time understanding of monologue speech such as lectures and commentary is now in demand. In a real-time captioning system, it is necessary to summarize speech so that the audience can understand it within the display time, and to output the caption simultaneously with the monologue speech input. This paper proposes a technique for simultaneous summarization of Japanese spoken monologue toward real-time captioning. Each time a clause boundary is detected, our technique identifies a unit on which summarization is executed. It then summarizes the unit based on its dependency structure. An experiment using Japanese monologues has shown the feasibility of our technique.
- Published
- 2007
- Full Text
- View/download PDF
18. Towards Robust Spoken Dialogue Systems Using Large-Scale In-Car Speech Corpus
- Author
-
Nobuo Kawaguchi, Yukiko Yamaguchi, Shigeki Matsubara, Keita Hayashi, Shingo Kato, Tomohiro Ohno, Kazuya Takeda, Takahiro Ono, Yuki Irie, and Hiroya Murao
- Subjects
Data collection, Scope (project management), Computer science, business.industry, Speech recognition, Speech corpus, computer.software_genre, Knowledge acquisition, Dependency structure, Scale (social sciences), Artificial intelligence, business, computer, Natural language processing, Dependency (project management)
- Abstract
Researchers of the CIAIR project at Nagoya University have constructed a data collection vehicle and have collected about 179 hours of multi-modal data. Speech data from about 800 subjects have been transcribed, and the text data have been annotated with speech intentions, dependency structures, and dialogue structures. Various research activities within the project's scope, such as speech intention understanding and speaker's knowledge acquisition, are continuing using the annotated data. In this chapter, we introduce these research activities and present several findings from the in-car speech corpus.
- Published
- 2007
- Full Text
- View/download PDF
19. Dependency parsing of Japanese spoken monologue based on clause boundaries
- Author
-
Yasuyoshi Inagaki, Shigeki Matsubara, Takehiko Maruyama, Hideki Kashioka, and Tomohiro Ohno
- Subjects
Structure (mathematical logic), Parsing, Sentence length, business.industry, Computer science, Speech recognition, computer.software_genre, ComputingMethodologies_ARTIFICIALINTELLIGENCE, Dependency structure, Feature (linguistics), TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Dependency grammar, S-attributed grammar, Artificial intelligence, business, computer, Natural language processing, Sentence, Bottom-up parsing
- Abstract
Spoken monologues feature greater sentence length and structural complexity than do spoken dialogues. To achieve high parsing performance for spoken monologues, it could prove effective to simplify the structure by dividing a sentence into suitable language units. This paper proposes a method for dependency parsing of Japanese monologues based on sentence segmentation. In this method, the dependency parsing is executed in two stages: at the clause level and the sentence level. First, the dependencies within a clause are identified by dividing a sentence into clauses and executing stochastic dependency parsing for each clause. Next, the dependencies over clause boundaries are identified stochastically, and the dependency structure of the entire sentence is thus completed. An experiment using a spoken monologue corpus shows this method to be effective for efficient dependency parsing of Japanese monologue sentences.
- Published
- 2006
- Full Text
- View/download PDF
20. 節境界単位での漸進的な独話係り受け解析 [Incremental Dependency Parsing of Monologue in Clause-Boundary Units]
- Author
-
Yasuyoshi Inagaki, Naoto Kato, Hideki Kashioka, Shigeki Matsubara, and Tomohiro Ohno
- Subjects
Dependency (UML), Computer science, Speech recognition, spoken language, corpus, Top-down parsing, computer.software_genre, incremental parsing, Dependency grammar, dependency parsing, monologue, clause boundary analysis, Interpretation (logic), Parsing, business.industry, clause boundary, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Artificial intelligence, business, computer, Natural language processing, Bottom-up parsing
- Abstract
In applications of spoken monologue processing such as simultaneous machine interpretation and automatic caption generation, incremental language parsing that proceeds sequentially with the speech input is strongly required. This paper proposes a technique for incremental dependency parsing of spoken Japanese monologue on a clause-by-clause basis. Simultaneously with the monologue speech input, the technique identifies clauses based on clause boundary analysis. Each time a clause is input, it builds the dependency structure inside that clause and dynamically decides the dependency relations with the clauses already input. The dependency relations can be output before the entire monologue sentence has been input, and therefore our technique can be used for language parsing in simultaneous Japanese speech understanding. An experiment using Japanese monologues has shown that our technique performs comparably to our previous dependency parsing method for monologue sentences.
- Published
- 2005
- Full Text
- View/download PDF
21. Acquisition of hyponymy relations for agricultural terms from a Japanese statutory corpus
- Author
-
Yasuhiro Ogawa, Tomohiro Ohno, Makoto Nakamura, and Katsuhiko Toyama
- Subjects
Vocabulary, Relation (database), Computer science, media_common.quotation_subject, Aquatic Science, computer.software_genre, Legal text processing, Statutory law, lcsh:Agriculture (General), media_common, Thesaurus (information retrieval), Japanese statutory corpus, Information retrieval, lcsh:T58.5-58.64, lcsh:Information technology, business.industry, Forestry, AGROVOC, lcsh:S1-972, Computer Science Applications, Animal Science and Zoology, Artificial intelligence, business, Agronomy and Crop Science, computer, Natural language processing
- Abstract
This paper, which aims to expand the vocabulary of an existing thesaurus using hyponymy relations, focuses on an agricultural thesaurus called AGROVOC. Our main goal is to acquire AGROVOC-qualified candidates from the hyponymy relations found in legal texts and tables. We propose a pattern-based approach to hyponymy relation acquisition. Our experimental results showed that 222 candidates were extracted from statutory sentences with 67.1% precision and 868 candidates were extracted from tables with 37.0% precision.
- Full Text
- View/download PDF
22. Japanese word reordering executed concurrently with dependency parsing and its evaluation
- Author
-
Tomohiro Ohno, Shigeki Matsubara, Yoshihide Kato, and Kazushi Yoshida
- Subjects
business.industry, Computer science, Speech recognition, Evaluation data, Dependency grammar, Artificial intelligence, computer.software_genre, business, computer, Natural language processing, Sentence, Word (computer architecture)
- Abstract
This paper proposes a method for reordering words in a Japanese sentence, executed concurrently with dependency parsing, so that the sentence becomes more readable. Our contributions are summarized as follows: (1) we extend a probabilistic model used in the previous work, which concurrently performs word reordering and dependency parsing; (2) we conduct an evaluation experiment using semi-automatically constructed evaluation data, whose sentences are more likely to resemble those spontaneously written by native speakers than the automatically constructed evaluation data used in the previous work.