6 results for "Iroro Orife"
Search Results
2. A large TV dataset for speech and music activity detection
- Author
Yun-Ning Hung, Chih-Wei Wu, Iroro Orife, Aaron Hipple, William Wolcott, and Alexander Lerch
- Subjects
Acoustics and Ultrasonics; Electrical and Electronic Engineering
- Abstract
Automatic speech and music activity detection (SMAD) is an enabling task that can help segment, index, and pre-process audio content in radio broadcast and TV programs. However, due to copyright concerns and the cost of manual annotation, the limited availability of diverse and sizeable datasets hinders the progress of state-of-the-art (SOTA) data-driven approaches. We address this challenge by presenting a large-scale dataset containing Mel spectrogram, VGGish, and MFCC features extracted from around 1600 h of professionally produced audio tracks, together with noisy labels indicating the approximate location of speech and music segments. The labels are collected from several sources, such as subtitles and cue sheets. A test set curated by human annotators is also included as a subset for evaluation. To validate the generalizability of the proposed dataset, we conduct several experiments comparing various model architectures and their variants under different conditions. The results suggest that the proposed dataset can serve as a reliable training resource and leads to SOTA performance on various public datasets. To the best of our knowledge, this is the first large-scale, open-sourced dataset that contains features extracted from professionally produced audio tracks and their corresponding frame-level speech and music annotations.
- Published
- 2022
3. MasakhaNER: Named entity recognition for African languages
- Author
Julia Kreutzer, Ayodele Awokoya, Ignatius Ezeani, Rubungo Andre Niyongabo, Happy Buzaaba, Adewale Akinfaderin, Samuel Oyerinde, Stephen Mayhew, Emmanuel Anebi, Mofetoluwa Adeyemi, Kelechi Ogueji, Abdoulaye Diallo, Seid Muhie Yimam, Jade Abbott, Joyce Nakatumba-Nabende, Victor Akinode, Blessing Sibanda, Catherine Gitau, Chester Palen-Michel, Shamsuddeen Hassan Muhammad, Degaga Wolde, Graham Neubig, Tendai Marengereke, Paul Rayson, Derguene Mbaye, Eric Peter Wairagala, Daniel D'souza, Tosin P. Adewumi, Jonathan Mukiibi, Chris Chinenye Emezue, David Ifeoluwa Adelani, Shruti Rijhwani, Iroro Orife, Verrah Otiende, Maurice Katusiime, Yvonne Wambui, Dibora Gebreyohannes, Kelechi Nwaike, Salomey Osei, Chiamaka Chukwuneke, Henok Tilaye, Deborah Nabagereka, Thierno Ibrahima Diop, Orevaoghene Ahia, Jesujoba O. Alabi, Sebastian Ruder, Davis David, Mouhamadane Mboup, Samba Ngom, Tajuddeen R. Gwadabe, Bonaventure F. P. Dossou, Temilola Oloyede, Perez Ogayo, Clemencia Siro, Gerald Muriuki, Aremu Anuoluwapo, Nkiruka Odu, Tobius Saul Bateesa, Abdoulaye Faye, Israel Abebe Azime, Constantine Lignos, Saarland University [Saarbrücken], Masakhane NLP, Retro Rabbit, Carnegie Mellon University [Pittsburgh] (CMU), ProQuest, Google Research, Brandeis University, Université de Tsukuba = University of Tsukuba, DeepMind, DeepMind Technologies, Duolingo, African Institute for Mathematical Sciences (AIMS), University of Porto, Bayero University Kano (BUK), Technische Universität Munchen - Université Technique de Munich [Munich, Allemagne] (TUM), Makerere University [Kampala, Ouganda] (MAK), African Leadership University, University of Lagos, Max Planck Institute for Informatics [Saarbrücken], Universität Hamburg (UHH), University of Chinese Academy of Sciences [Beijing] (UCAS), Lancaster University, University of Electronic Science and Technology of China (UESTC), United States International University - Africa, Niger-Volta Language Technologies Institute, Luleå University of Technology (LUT), 
African University of Science and Technology (AUST), University of Ibadan, Namibia University of Science and Technology (NUST), InstaDeep, Jacobs University [Bremen], University of Waterloo [Waterloo], European Project: 825081,H2020,COMPRISE(2018), Technical University of Munich (TUM), DeepMind [London], Universidade do Porto = University of Porto, and University of Electronic Science and Technology of China [Chengdu] (UESTC)
- Subjects
FOS: Computer and information sciences; Linguistics and Language; Computer Science - Computation and Language; Computer science; business.industry; Computer Science - Artificial Intelligence; Communication; Languages of Africa; computer.software_genre; Code (semiotics); [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Computer Science Applications; Human-Computer Interaction; Artificial Intelligence (cs.AI); Named-entity recognition; Artificial Intelligence; Artificial intelligence; Transfer of learning; business; computer; Computation and Language (cs.CL); Natural language processing
- Abstract
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP. Comment: Accepted to TACL 2021, pre-MIT Press publication version
- Published
- 2021
4. Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages
- Author
Blessing Itoro Bassey, Ghollah Kioko, Masabata Mokgesi-Selinga, Mofe Adeyemi, Musie Meressa, Julia Kreutzer, Herman Kamper, Rubungo Andre Niyongabo, Chris Chinenye Emezue, Arshath Ramkilowan, Taiwo Fagbohungbe, Timi E. Fasubaa, Hady Elsahar, Salomey Osei, Daniel Whitenack, Tajudeen Kolawole, Ignatius Ezeani, Shamsuddeen Hassan Muhammad, Kathleen Siminyu, Jamiil Toure Ali, Adewale Akinfaderin, Tshinondiwa Matsila, Bonaventure F. P. Dossou, Vukosi Marivate, Wilhelmina Nekoto, Idris Abdulkabir Dangana, Iroro Orife, Lawrence Okegbemi, Espoir Murhabazi, Salomon Kabongo, Orevaoghene Ahia, Alp Öktem, Ricky Macharm, Elan van Biljon, Jade Abbott, Kolawole Tajudeen, Blessing Sibanda, Perez Ogayo, Solomon Oluwole Akinola, Kelechi Ogueji, Christopher Onyefuluchi, Abdallah Bashir, Ayodele Olabiyi, Sackey Freshia, Kevin Degila, Jason Webster, Goodness Duru, and Laura Martinus
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Computation and Language; Machine translation; Computer Science - Artificial Intelligence; Process (engineering); Computer science; Languages of Africa; Participatory action research; 02 engineering and technology; computer.software_genre; Data science; Machine Learning (cs.LG); Task (project management); Focus (linguistics); 03 medical and health sciences; Artificial Intelligence (cs.AI); 0302 clinical medicine; 030221 ophthalmology & optometry; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computation and Language (cs.CL); computer
- Abstract
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem that goes beyond data availability and reflects systemic problems in society. In this paper, we focus on the task of Machine Translation (MT), which plays a crucial role in information accessibility and communication worldwide. Despite immense improvements in MT over the past decade, MT is centered around a few high-resourced languages. As MT researchers cannot solve the problem of low-resourcedness alone, we propose participatory research as a means to involve all necessary agents required in the MT development process. We demonstrate the feasibility and scalability of participatory research with a case study on MT for African languages. Its implementation leads to a collection of novel translation datasets and MT benchmarks for over 30 languages, with human evaluations for a third of them, and enables participants without formal training to make a unique scientific contribution. Benchmarks, models, data, code, and evaluation results are released under https://github.com/masakhane-io/masakhane-mt. Comment: Findings of EMNLP 2020; updated benchmarks
- Published
- 2020
5. Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text
- Author
Iroro Orife
- Subjects
060201 languages & linguistics; Computer science; business.industry; Yoruba; 06 humanities and the arts; 02 engineering and technology; computer.software_genre; language.human_language; 0602 languages and literature; Diacritic; 0202 electrical engineering, electronic engineering, information engineering; language; 020201 artificial intelligence & image processing; Sequence learning; Artificial intelligence; business; computer; Natural language processing; Sequence (medicine)
- Published
- 2018
6. Tae Hong Park: Introduction to Digital Signal Processing: Computer Musically Speaking
- Author
Iroro Orife
- Subjects
Multimedia; business.industry; Computer science; Speech recognition; Media Technology; business; computer.software_genre; computer; Music; Digital signal processing; Computer Science Applications
- Published
- 2009