45 results for "Cagdas Bilen"
Search Results
2. The CNN news footage datasets: Enabling supervision in image retrieval.
- Author
-
Cagdas Bilen, Joaquin Zepeda, and Patrick Pérez
- Published
- 2016
- Full Text
- View/download PDF
3. Multichannel audio declipping.
- Author
-
Alexey Ozerov, Cagdas Bilen, and Patrick Pérez
- Published
- 2016
- Full Text
- View/download PDF
4. Automatic allocation of NTF components for user-guided audio source separation.
- Author
-
Cagdas Bilen, Alexey Ozerov, and Patrick Pérez
- Published
- 2016
- Full Text
- View/download PDF
5. Supervised learning of low-rank transforms for image retrieval.
- Author
-
Cagdas Bilen, Joaquin Zepeda, and Patrick Pérez
- Published
- 2016
- Full Text
- View/download PDF
6. Compressive sampling-based informed source separation.
- Author
-
Cagdas Bilen, Alexey Ozerov, and Patrick Pérez
- Published
- 2015
- Full Text
- View/download PDF
7. Audio declipping via nonnegative matrix factorization.
- Author
-
Cagdas Bilen, Alexey Ozerov, and Patrick Pérez
- Published
- 2015
- Full Text
- View/download PDF
8. A Framework for the Robust Evaluation of Sound Event Detection.
- Author
-
Cagdas Bilen, Giacomo Ferroni, Francesco Tuveri, Juan Azcarreta, and Sacha Krstulovic
- Published
- 2019
9. A conjugate gradient algorithm for blind sensor calibration in sparse recovery.
- Author
-
Hao Shen 0002, Martin Kleinsteuber, Cagdas Bilen, and Rémi Gribonval
- Published
- 2013
- Full Text
- View/download PDF
10. Blind phase calibration in sparse recovery.
- Author
-
Cagdas Bilen, Gilles Puy, Rémi Gribonval, and Laurent Daudet
- Published
- 2013
11. On compressed sensing in parallel MRI of cardiac perfusion using temporal wavelet and TV regularization.
- Author
-
Cagdas Bilen, Ivan W. Selesnick, Yao Wang 0001, Ricardo Otazo, Daniel Kim, Leon Axel, and Daniel K. Sodickson
- Published
- 2010
- Full Text
- View/download PDF
12. End-to-End Stereoscopic Video Streaming System.
- Author
-
Selen Pehlivan, Anil Aksay, Cagdas Bilen, Gozde Bozdagi Akar, and M. Reha Civanlar
- Published
- 2006
- Full Text
- View/download PDF
13. Temporal and spatial scaling for stereoscopic video compression.
- Author
-
Anil Aksay, Cagdas Bilen, Engin Kurutepe, Tanir Ozcelebi, Gozde Bozdagi Akar, M. Reha Civanlar, and A. Murat Tekalp
- Published
- 2006
14. Schemes for Multiple Description Coding of Stereoscopic Video.
- Author
-
Andrey Norkin, Anil Aksay, Cagdas Bilen, Gozde Bozdagi Akar, Atanas P. Gotchev, and Jaakko Astola
- Published
- 2006
- Full Text
- View/download PDF
15. A Multi-View Video Codec Based on H.264.
- Author
-
Cagdas Bilen, Anil Aksay, and Gozde Bozdagi Akar
- Published
- 2006
- Full Text
- View/download PDF
16. Subjective evaluation of effects of spectral and spatial redundancy reduction on stereo images.
- Author
-
Anil Aksay, Cagdas Bilen, and Gozde Bozdagi Akar
- Published
- 2005
17. Convex Optimization Approaches for Blind Sensor Calibration Using Sparsity.
- Author
-
Cagdas Bilen, Gilles Puy, Rémi Gribonval, and Laurent Daudet
- Published
- 2014
- Full Text
- View/download PDF
18. High-Speed Compressed Sensing Reconstruction in Dynamic Parallel MRI Using Augmented Lagrangian and Parallel Processing.
- Author
-
Cagdas Bilen, Yao Wang 0001, and Ivan W. Selesnick
- Published
- 2012
- Full Text
- View/download PDF
19. End-to-end stereoscopic video streaming with content-adaptive rate and format control.
- Author
-
Anil Aksay, Selen Pehlivan, Engin Kurutepe, Cagdas Bilen, Tanir Ozcelebi, Gozde Bozdagi Akar, M. Reha Civanlar, and A. Murat Tekalp
- Published
- 2007
- Full Text
- View/download PDF
20. Proceedings of the second 'international Traveling Workshop on Interactions between Sparse models and Technology' (iTWIST'14).
- Author
-
Laurent Jacques, Christophe De Vleeschouwer, Yannick Boursier, Prasad Sudhakar, C. De Mol, Aleksandra Pizurica, Sandrine Anthoine, Pierre Vandergheynst, Pascal Frossard, Cagdas Bilen, Srdan Kitic, Nancy Bertin, Rémi Gribonval, Nicolas Boumal, Bamdev Mishra, Pierre-Antoine Absil, Rodolphe Sepulchre, Shaun Bundervoet, Colas Schretter, Ann Dooms, Peter Schelkens, Olivier Chabiron, François Malgouyres, Jean-Yves Tourneret, Nicolas Dobigeon, Pierre Chainais, Cédric Richard, Bruno Cornelis, Ingrid Daubechies, David B. Dunson, Marie Danková, Pavel Rajmic, Kévin Degraux, Valerio Cambareri, Bert Geelen, Gauthier Lafruit, Gianluca Setti, Jean-François Determe, Jérôme Louveaux, François Horlin, Angélique Drémeau, Patrick Héas, Cédric Herzet, Vincent Duval, Gabriel Peyré, Alhussein Fawzi, Mike E. Davies 0001, Nicolas Gillis, Stephen A. Vavasis, Charles Soussen, Luc Le Magoarou, Jingwei Liang, Jalal Fadili, Antoine Liutkus, David Martina, Sylvain Gigan, Laurent Daudet, Mauro Maggioni, Stanislav Minsker, Nate Strawn, C. Mory, Fred Maurice Ngolè Mboula, Jean-Luc Starck, Ignace Loris, Samuel Vaiter, Mohammad Golbabaee, and Dejan Vukobratovic
- Published
- 2014
21. Balancing Sparsity and Rank Constraints in Quadratic Basis Pursuit.
- Author
-
Cagdas Bilen, Gilles Puy, Rémi Gribonval, and Laurent Daudet
- Published
- 2014
22. Convex Optimization Approaches for Blind Sensor Calibration using Sparsity.
- Author
-
Cagdas Bilen, Gilles Puy, Rémi Gribonval, and Laurent Daudet
- Published
- 2013
23. Compressed Sensing for Moving Imagery in Medical Imaging
- Author
-
Cagdas Bilen, Yao Wang 0001, and Ivan W. Selesnick
- Published
- 2012
24. Improving Sound Event Detection Metrics: Insights from DCASE 2020
- Author
-
Francesco Tuveri, Sacha Krstulovic, Cagdas Bilen, Romain Serizel, Juan Azcarreta, Giacomo Ferroni, and Nicolas Turpault
- Affiliations
Audio Analytic; Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Université de Lorraine (UL); Centre National de la Recherche Scientifique (CNRS); Grid'5000
- Funding
UL/INRIA's work for this article was partly supported by the French National Research Agency (project LEAUDS, 'Learning to understand audio scenes', ANR-18-CE23-0020) and by the French region Grand-Est. ANR-18-CE23-0020, LEAUDS, Apprentissage statistique pour la compréhension de scènes audio (2018)
- Subjects
FOS: Computer and information sciences; Sound (cs.SD); Dependency (UML); Computer science; 02 engineering and technology; Computer Science - Sound; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; Audio and Speech Processing (eess.AS); Robustness (computer science); FOS: Electrical engineering, electronic engineering, information engineering; 0202 electrical engineering, electronic engineering, information engineering; Event (probability theory); Operating point; Signal processing; Intersection (set theory); Sound detection; segment vs event criteria; sound event detection; evaluation metrics; Ranking; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; 020201 artificial intelligence & image processing; Data mining; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; polyphonic sound detection score; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
The ranking of sound event detection (SED) systems may be biased by assumptions inherent to evaluation criteria and to the choice of an operating point. This paper compares conventional event-based and segment-based criteria against the Polyphonic Sound Detection Score (PSDS)'s intersection-based criterion, over a selection of systems from DCASE 2020 Challenge Task 4. It shows that, by relying on collars, the conventional event-based criterion introduces different strictness levels depending on the length of the sound events, and that the segment-based criterion may lack precision and be application-dependent. Alternatively, PSDS's intersection-based criterion overcomes the dependency of the evaluation on sound event duration and provides robustness to labelling subjectivity, by allowing valid detections of interrupted events. Furthermore, PSDS enhances the comparison of SED systems by measuring sound event modelling performance independently from the systems' operating points.
- Published
- 2020
- Full Text
- View/download PDF
25. A Framework for the Robust Evaluation of Sound Event Detection
- Author
-
Francesco Tuveri, Cagdas Bilen, Juan Azcarreta, Giacomo Ferroni, and Sacha Krstulovic
- Subjects
FOS: Computer and information sciences; Sound (cs.SD); Event (computing); Computer science; Sound detection; Stability (learning theory); Computer Science - Sound; Task (project management); Reduction (complexity); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Polyphony; Data mining; Baseline (configuration management); Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
This work defines a new framework for performance evaluation of polyphonic sound event detection (SED) systems, which overcomes the limitations of the conventional collar-based event decisions, event F-scores and event error rates. The proposed framework introduces a definition of event detection that is more robust against labelling subjectivity. It also resorts to polyphonic receiver operating characteristic (ROC) curves to deliver more global insight into system performance than F1-scores, and proposes a reduction of these curves into a single polyphonic sound detection score (PSDS), which allows system comparison independently from operating points (OPs). The presented method also delivers better insight into data biases and classification stability across sound classes. Furthermore, it can be tuned to varying applications in order to match a variety of user experience requirements. The benefits of the proposed approach are demonstrated by re-evaluating the baseline and two of the top-performing systems from DCASE 2019 Task 4. Accepted to ICASSP 2020.
- Published
- 2019
26. Solving Time Domain Audio Inverse Problems using Nonnegative Tensor Factorization
- Author
-
Alexey Ozerov, Cagdas Bilen, and Patrick Pérez
- Affiliations
Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie (ICube); Institut National des Sciences Appliquées - Strasbourg (INSA Strasbourg); Institut National des Sciences Appliquées (INSA); Université de Strasbourg (UNISTRA); Centre National de la Recherche Scientifique (CNRS); École Nationale du Génie de l'Eau et de l'Environnement de Strasbourg (ENGEES); Réseau nanophotonique et optique; Université de Haute-Alsace (UHA) Mulhouse - Colmar; Matériaux et nanosciences d'Alsace (FMNGE); Matériaux et Nanosciences Grand-Est (MNGE); Institut de Chimie du CNRS (INC); Institut National de la Santé et de la Recherche Médicale (INSERM); Institut National de Recherche en Informatique et en Automatique (Inria); Les Hôpitaux Universitaires de Strasbourg (HUS); Technicolor R & I [Cesson Sévigné]; Technicolor
- Funding
ANR-14-CE27-0002, MAD, Inpainting de données audio manquantes (2014), Appel à projets générique
- Subjects
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Computer science; 020206 networking & telecommunications; 02 engineering and technology; Inverse problem; Non-negative matrix factorization; 030507 speech-language pathology & audiology; 03 medical and health sciences; Compressed sensing; Computer Science::Sound; Frequency domain; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; Source separation; Time domain; Electrical and Electronic Engineering; 0305 other medical science; Audio signal processing; Joint (audio engineering); Algorithm
- Abstract
Nonnegative matrix factorization (NMF) and nonnegative tensor factorization (NTF) are important tools for modeling nonnegative data, which gained increasing popularity in various fields, a significant one of which is audio processing. However, there are still many problems in audio processing, for which the NMF (or NTF) model has not been successfully utilized. In this paper, we propose a new algorithm based on the NMF (and NTF) in the short-time Fourier domain for solving a large class of audio inverse problems with missing or corrupted time-domain samples. The proposed approach overcomes the difficulty of employing a model in the frequency domain to recover time-domain samples with the help of probabilistic modeling. Its performance is demonstrated for the following applications: audio declipping and declicking (never solved with NMF/NTF modeling prior to this paper); joint audio declipping/declicking and source separation (never solved with NMF/NTF modeling or any other method prior to this paper); and compressive sampling recovery and compressive sampling-based informed source separation (an extremely low complexity encoding scheme that is possible with the proposed approach and has never been proposed prior to this paper).
- Published
- 2018
- Full Text
- View/download PDF
27. Supervised learning of low-rank transforms for image retrieval
- Author
-
Joaquin Zepeda, Cagdas Bilen, and Patrick Pérez
- Subjects
Computer Science::Machine Learning; Rank (linear algebra); Computer science; Supervised learning; Online machine learning; Approximation algorithm; Pattern recognition; 02 engineering and technology; Semi-supervised learning; 010501 environmental sciences; Machine learning; 01 natural sciences; Generalization error; Stochastic gradient descent; 0202 electrical engineering, electronic engineering, information engineering; Unsupervised learning; 020201 artificial intelligence & image processing; Learning to rank; Artificial intelligence; Image retrieval; 0105 earth and related environmental sciences
- Abstract
In this paper we propose a new method to automatically select the rank of linear transforms during supervised learning. Our approach relies on a sparsity-enforcing element-wise soft-thresholding operation applied after the linear transform. This novel approach to supervised rank learning has the important advantage that it is very simple to implement and incurs no extra complexity relative to linear transform learning. Furthermore, we propose a simple Stochastic Gradient Descent (SGD) implementation suitable for large-scale learning, where SGD solvers have established themselves as the default workhorse. We compare our method to various other metric learning techniques in the application of image retrieval. This is one of the few remaining areas where supervised learning of low-rank linear transforms has not been fully exploited. The main reason for this is the lack of adequate datasets that are large enough, and hence we further introduce a new dataset consisting of groups of matching images derived from Cable News Network (CNN) videos using geometric verification and manual selection to find matching frames with adequate variability.
- Published
- 2016
- Full Text
- View/download PDF
28. End-to-end stereoscopic video streaming with content-adaptive rate and format control
- Author
-
Gozde Bozdagi Akar, Engin Kurutepe, Selen Pehlivan, Anil Aksay, Tanir Ozcelebi, M. Reha Civanlar, A. Murat Tekalp, and Cagdas Bilen
- Affiliations
Mathematics and Computer Science; Interconnected Resource-aware Intelligent Systems
- Subjects
Multimedia; Computer science; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Stereoscopy; Video processing; Video quality; Coding gain; Adaptive coding; Signal Processing; Human visual system model; Computer vision; Computer Vision and Pattern Recognition; Artificial intelligence; Electrical and Electronic Engineering; Multiview Video Coding; Software; Coding (social sciences)
- Abstract
We address efficient compression and real-time streaming of stereoscopic video over the current Internet. We first propose content-adaptive stereo video coding (CA-SC), where additional coding gain, beyond what can be achieved by exploiting only inter-view correlations, is targeted by down-sampling one of the views spatially or temporally depending on the content, based on the well-known theory that the human visual system can perceive high frequencies in three-dimensional (3D) video from the higher-quality view. We also developed a stereoscopic 3D video streaming server and clients by modifying available open source platforms, where each client can view the video in mono or stereo mode depending on its display capabilities. The performance of the end-to-end stereoscopic streaming system is demonstrated using subjective quality tests.
- Published
- 2007
29. Audio declipping via nonnegative matrix factorization
- Author
-
Patrick Pérez, Alexey Ozerov, and Cagdas Bilen
- Affiliations
Technicolor [Cesson Sévigné]; Technicolor
- Funding
ANR-14-CE27-0002, MAD, Inpainting de données audio manquantes (2014), Appel à projets générique
- Subjects
Audio inpainting; Signal processing; Audio signal; Noise measurement; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Speech recognition; nonnegative matrix factorization; audio declipping; Inpainting; generalized expectation-maximization; Pattern recognition; Itakura-Saito divergence; Matrix decomposition; Non-negative matrix factorization; Source separation; Artificial intelligence; Audio signal processing; Mathematics
- Abstract
Audio inpainting and audio declipping are important problems in audio signal processing, which are encountered in various practical applications. A number of approaches have been proposed in the literature to address these problems, the most successful of which are based on sparsity of the audio signals in certain dictionary representations. Non-negative matrix factorization (NMF) is another powerful tool that has been successfully used in applications such as audio source separation. In this paper we propose a new algorithm that makes use of a low-rank NMF model to perform audio inpainting and declipping. In addition to utilizing for the first time the NMF model to perform audio inpainting in the presence of arbitrary losses in the time domain, the proposed approach also introduces a novel way to enforce additional constraints on the signal magnitude in order to improve the performance in declipping applications. The proposed approach is shown to have comparable performance to state-of-the-art dictionary-based methods while providing a number of advantages.
- Published
- 2015
- Full Text
- View/download PDF
30. Joint Audio Inpainting and Source Separation
- Author
-
Cagdas Bilen, Alexey Ozerov, and Patrick Pérez
- Affiliations
Technicolor R & I [Cesson Sévigné]; Technicolor
- Funding
ANR-14-CE27-0002, MAD, Inpainting de données audio manquantes (2014), Appel à projets générique
- Subjects
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Computer science; Inpainting; generalized expectation-maximization; 02 engineering and technology; Blind signal separation; 030507 speech-language pathology & audiology; 03 medical and health sciences; Clipping (photography); 0202 electrical engineering, electronic engineering, information engineering; Source separation; Computer vision; Audio signal processing; Audio signal; Minimum mean square error; audio inpainting; nonnegative tensor factorization; audio source separation; 020206 networking & telecommunications; Pattern recognition; Artificial intelligence; 0305 other medical science; Joint (audio engineering)
- Abstract
Despite being two important problems in audio signal processing that are interconnected in practice, audio inpainting and audio source separation have not been considered jointly. It is not uncommon in practice for the mixtures to be separated to also suffer from artifacts due to clipping or other losses. In the present work, we consider this problem of source separation using partially observed mixtures. We introduce a flexible framework based on non-negative tensor factorisation (NTF) to attack this new task, and we apply it to source separation with clipped mixtures. It allows us to perform declipping and source separation either in turn or jointly. We investigate these two regimes experimentally and report large performance gains compared to source separation with clipping artefacts being ignored, which is the common approach in practice.
- Published
- 2015
- Full Text
- View/download PDF
31. Compressive sampling-based informed source separation
- Author
-
Patrick Pérez, Cagdas Bilen, and Alexey Ozerov
- Affiliations
Technicolor [Cesson Sévigné]; Technicolor
- Funding
ANR-14-CE27-0002, MAD, Inpainting de données audio manquantes (2014), Appel à projets générique
- Subjects
Theoretical computer science; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; nonnegative tensor factorization; compressive sampling; generalized expectation-maximization; Data_CODINGANDINFORMATIONTHEORY; Informed source separation; Compressed sensing; Fourier transform; Prior probability; Source separation; Time domain; Encoder; Algorithm; Decoding methods; low complexity encoder; Coding (social sciences); Mathematics
- Abstract
The paradigm of using a very simple encoder and a sophisticated decoder for compression of signals became popular with the theory of distributed coding, and it has been exercised for the compression of various types of signals such as images and video. The theory of compressive sampling later introduced a similar concept, but with the focus on guarantees of signal recovery using sparse and low-rank priors lying in a domain incoherent with the domain of sampling. In this paper, we bring together the concepts introduced in distributed coding and compressive sampling with informed source separation, in which the goal is to efficiently compress the audio sources so that they can be decoded with the knowledge of the mixture of the sources. The proposed framework uses a very simple time-domain sampling scheme to encode the sources, and a sophisticated decoding algorithm that makes use of the low-rank non-negative tensor factorization model of the distribution of short-time Fourier transform coefficients to recover the sources, which is a direct application of the principles of both compressive sampling and distributed coding.
- Published
- 2015
32. Convex Optimization Approaches for Blind Sensor Calibration using Sparsity
- Author
-
Gilles Puy, Rémi Gribonval, Cagdas Bilen, and Laurent Daudet
- Affiliations
Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio (PANAMA); SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE (IRISA-D5); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); CentraleSupélec; Télécom Bretagne; Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES); Université de Rennes (UR); Institut National de Recherche en Informatique et en Automatique (Inria); Inria Rennes – Bretagne Atlantique; École normale supérieure - Rennes (ENS Rennes); Université de Bretagne Sud (UBS); Centre National de la Recherche Scientifique (CNRS); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Electrical Engineering Institute - EPFL; Eidgenössische Technische Hochschule - Swiss Federal Institute of Technology [Zürich] (ETH Zürich); Institut Langevin - Ondes et Images (UMR7587) (IL); Sorbonne Université (SU); Ecole Superieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris); Université Paris sciences et lettres (PSL); Université de Paris (UP)
- Funding
ANR-08-EMER-0006, ECHANGE, Echantillonnage Acoustique Nouvelle Génération (2008); European Project: 277906, EC:FP7:ERC, ERC-2011-StG_20101014, PLEASE (2012)
- Subjects
FOS: Computer and information sciences; Mathematical optimization; Lifting; Information Theory (cs.IT); Computer Science - Information Theory; [INFO.INFO-OH]Computer Science [cs]/Other [cs.OH]; gain calibration; Basis pursuit; phase estimation; Compressed sensing; Quadratic equation; blind calibration; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; Distortion; Signal Processing; Scalability; Convex optimization; Electrical and Electronic Engineering; Focus (optics); Phase retrieval; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Mathematics
- Abstract
We investigate a compressive sensing framework in which the sensors introduce a distortion to the measurements in the form of unknown gains. We focus on blind calibration, using measurements performed on multiple unknown (but sparse) signals, and formulate the joint recovery of the gains and the sparse signals as a convex optimization problem. We divide this problem into three subproblems with different conditions on the gains, specifically (i) gains with different amplitude and the same phase, (ii) gains with the same amplitude and different phase and (iii) gains with different amplitude and phase. In order to solve the first case, we propose an extension to the basis pursuit optimization which can estimate the unknown gains along with the unknown sparse signals. For the second case, we formulate a quadratic approach that eliminates the unknown phase shifts and retrieves the unknown sparse signals. An alternative form of this approach is also formulated to reduce complexity and memory requirements and provide scalability with respect to the number of input signals. Finally, for the third case, we propose a formulation that combines the earlier two approaches to solve the problem. The performance of the proposed algorithms is investigated extensively through numerical simulations, which demonstrate that simultaneous signal recovery and calibration is possible with convex methods when sufficiently many (unknown, but sparse) calibrating signals are provided.
- Published
- 2014
- Full Text
- View/download PDF
33. High Speed Compressed Sensing Reconstruction in Dynamic Parallel MRI Using Augmented Lagrangian and Parallel Processing
- Author
-
Ivan Selesnick, Cagdas Bilen, and Yao Wang
- Subjects
FOS: Computer and information sciences, Computer science, Augmented Lagrangian method, Information Theory (cs.IT), Computer Science - Information Theory, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Iterative reconstruction, Compressed sensing, Parallel processing (DSP implementation), Dynamic contrast-enhanced MRI, Singular value decomposition, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), Algorithm design, Electrical and Electronic Engineering, Algorithm, Image resolution
- Abstract
Magnetic Resonance Imaging (MRI) is one of the fields in which compressed sensing theory is used effectively to reduce the scan time significantly, leading to faster imaging or higher-resolution images. It has been shown that a small fraction of the overall measurements is sufficient to reconstruct images with the combination of compressed sensing and parallel imaging. Various reconstruction algorithms have been proposed for compressed sensing, among which Augmented Lagrangian based methods have been shown to often perform better than others for many different applications. In this paper, we propose new Augmented Lagrangian based solutions to the compressed sensing reconstruction problem with analysis and synthesis prior formulations. We also propose a computational method which makes use of properties of the sampling pattern to significantly improve the speed of the reconstruction for the proposed algorithms in Cartesian sampled MRI. The proposed algorithms are shown to outperform earlier methods, especially for the case of dynamic MRI, for which the transfer function tends to be a very large and significantly ill-conditioned matrix. It is also demonstrated that the proposed algorithm can be accelerated much further than other methods in the case of a parallel implementation with graphics processing units (GPUs). Submitted to IEEE JETCAS, Special Issue on Circuits, Systems and Algorithms for Compressed Sensing.
- Published
- 2012
34. A motion compensating prior for dynamic MRI reconstruction using combination of compressed sensing and parallel imaging
- Author
-
Ivan Selesnick, Daniel K. Sodickson, Ricardo Otazo, Yao Wang, and Cagdas Bilen
- Subjects
Signal processing, Motion compensation, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Real-time MRI, Iterative reconstruction, Compressed sensing, Parallel processing (DSP implementation), Dynamic contrast-enhanced MRI, Computer vision, Artificial intelligence, business, Reference frame
- Abstract
Many areas in signal processing have benefited from the emergence of compressed sensing and sparse reconstruction methods, one of which is magnetic resonance imaging (MRI). Recent studies showed that MRI acquisition can be highly accelerated with the joint use of compressed sensing and parallel imaging methods. It has also been suggested that dynamic MRI can be further improved by making use of temporal correlations. Although methods using motion compensation have been proposed to exploit temporal dependence, most of these require reference frames and/or a sub-portion of k-space to be fully sampled. In this paper, we propose a new approach to exploit the motion information during compressed sensing reconstruction without any requirement for reference frames, modeled motion, or a specific sampling pattern on the k-space measurements.
- Published
- 2011
- Full Text
- View/download PDF
35. Layered video multicast using diversity embedded space time codes
- Author
-
Yao Wang, Cagdas Bilen, and Elza Erkip
- Subjects
Multicast, Computer science, business.industry, Reliability (computer networking), media_common.quotation_subject, Real-time computing, Transmission (telecommunications), Time-division multiplexing, Bandwidth (computing), Code (cryptography), Quality (business), business, Communication channel, media_common, Computer network
- Abstract
In traditional wireless multicast systems, the system is optimized to provide high reliability to users with the worst channel conditions. However, this results in a waste of available bandwidth for users with good channel conditions. In particular, for transmission of video, this wasted bandwidth can be used to deliver higher quality video to some of the users. This paper addresses this problem by using diversity embedded space time codes (DESTC) to send multiple layers of video simultaneously. DESTC provides unequal error protection to different video layers, thereby delivering high quality video to users with good channel conditions while still providing acceptable quality to users with poor channel conditions. The performance of DESTC is compared to a single-layer orthogonal space-time code used for single-layer video delivery as well as to layered video delivery with time division multiplexing. The results show that the use of DESTC is advantageous over both strategies in multicast video delivery.
- Published
- 2009
- Full Text
- View/download PDF
36. A standards-based, flexible, end-to-end multi-view video streaming architecture
- Author
-
C. Goktug Gurler, Engin Kurutepe, Cagdas Bilen, Anil Aksay, A. Murat Tekalp, Thomas Sikora, and Gozde Bozdagi Akar
- Subjects
Computer science, computer.internet_protocol, business.industry, Network packet, Application software, computer.software_genre, Backward compatibility, End-to-end principle, Time Protocol, Real Time Streaming Protocol, Session Description Protocol, business, computer, Data compression, Computer network
- Abstract
In this paper we propose a novel framework for the streaming of 3-D representations in the form of Multi-View Videos (MVV). The proposed streaming system is completely standards-based, flexible, and backwards compatible in order to support monoscopic streaming to legacy clients. We demonstrate compatibility of the proposed system with various possible encoding schemes and operating scenarios. In the current implementation, the MVVs in the server are compressed using a simplified form of MVC with negligible loss of compression efficiency and streamed to the clients using Real Time Streaming Protocol (RTSP), Session Description Protocol (SDP), and Real-time Transport Protocol (RTP). We describe our extensions to SDP and discuss a preliminary RTP payload format for MVC. The clients in this implementation perform basic error concealment to reduce the effects of packet losses and decode MVC in near-real-time. The modular clients can display decoded 3-D content on a multitude of 3-D display systems.
- Published
- 2007
- Full Text
- View/download PDF
37. Motion and Disparity Aided Stereoscopic Full Frame Loss Concealment Method
- Author
-
Cagdas Bilen, Anil Aksay, and G. Bozdagi Akar
- Subjects
Redundancy (information theory), business.industry, law, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer vision, Stereoscopy, Artificial intelligence, Multiview Video Coding, business, Error detection and correction, Data compression, law.invention
- Abstract
Stereoscopic video is one of the emerging research areas, especially within the video coding community. Along with studies on efficiently compressing stereoscopic and multiview video, new error concealment and error protection methods are also necessary to overcome the problems due to erroneous channel conditions in practical applications. In this paper we propose a full frame loss concealment algorithm for stereoscopic sequences. The proposed method uses redundancy and disparity between the two views and motion information between the previously decoded frames to estimate the lost frame. The results show that the proposed algorithm outperforms the monoscopic methods when they are applied to the same view as they are simulcast coded.
- Published
- 2007
- Full Text
- View/download PDF
38. Unequal Error Protection for Stereoscopic Video Streaming
- Author
-
Erdal Arikan, Gozde Bozdagi Akar, Cagdas Bilen, A. Serdar Tan, and Anil Aksay
- Subjects
Signal processing, Computer science, Computer graphics (images), Real-time computing, Stereoscopic video, Data_CODINGANDINFORMATIONTHEORY, Forward error correction, Luby transform code, Decoding methods, Transform coding
- Abstract
The utilization of forward error correction (FEC) schemes for stereo video streaming is investigated. Stereo video is categorized into three layers, and each layer is protected with a different protection ratio for efficient streaming. Systematic Reed-Solomon (RS) and Luby Transform (LT) codes are utilized as the error protection schemes. Detailed simulations are performed in order to observe the optimum unequal error protection (UEP) strategies for the defined video layers. Moreover, as a result of these simulations, a performance comparison of RS and LT codes for video streaming is provided.
- Published
- 2007
- Full Text
- View/download PDF
39. Error resilient layered stereoscopic video streaming
- Author
-
Erdal Arikan, Cagdas Bilen, Gozde Bozdagi Akar, Anil Aksay, and A.S. Tan
- Subjects
Scheme (programming language), Error detection, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Acoustic streaming, Luby transform code, Stereoscopic video, Computer vision, Programming theory, Forward error correction, Rapid solidification, computer.programming_language, business.industry, Image coding, Stereo vision, Stereopsis, Digital television, Artificial intelligence, Video coding, Multiview Video Coding, business, computer, Decoding methods, Visual communication, Transmission errors
- Abstract
Date of Conference: 7-9 May 2007. Conference Name: 3DTV Conference, IEEE 2007.
In this paper, the problem of error-resilient stereoscopic video streaming is addressed. Two different Forward Error Correction (FEC) codes, namely systematic LT and RS codes, are utilized to protect the stereoscopic video data against transmission errors. Initially, the stereoscopic video is categorized into three layers with different priorities. Then, a packetization scheme is used to increase the efficiency of error protection. A comparative analysis of RS and LT codes is provided via simulations to observe the optimum packetization and UEP strategies.
- Published
- 2007
40. Rate-distortion optimized layered stereoscopic video streaming with raptor codes
- Author
-
Erdal Arikan, Cagdas Bilen, Gozde Bozdagi Akar, Anil Aksay, and A. Serdar Tan
- Subjects
Computer science, Real-time computing, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Channel bandwidths, Motion estimation, Codes, Stereoscopic video, Channel capacity, Distortion, Video transmissions, Insertion loss, Codec, Image communication systems, Forward error correction, Rate distortions, Source encoding, Raptor code, Electric breakdown, Error protection schemes, Approximate analytical model, Packet videos, Bandwidth (signal processing), Analytical modelling, Electric distortion, Signal distortion, Minification, Raptor coding
- Abstract
Date of Conference: 12-13 November 2007. Conference Name: Packet Video 2007 - 16th International Packet Video Workshop.
A near-optimal streaming system for stereoscopic video is proposed. Initially, the stereoscopic video is separated into three layers, and an approximate analytical model of the Rate-Distortion (RD) curve of each layer is calculated from a sufficient number of rate and distortion samples. The analytical modeling includes the interdependency of the defined layers. Then, the analytical models are used to derive the optimal source encoding rates for a given channel bandwidth. The distortion in the quality of the stereoscopic video that is caused by losing a NAL unit from the defined layers is estimated in order to minimize the average distortion of a single NAL unit loss. The minimization is performed over the protection rates allocated to each layer. Raptor codes are utilized as the error protection scheme due to their novelty and suitability for video transmission. The layers are protected unequally using Raptor codes according to the parity ratios allocated to the layers. A comparison of the defined scheme with two other protection allocation schemes is provided via simulations to observe the quality of the stereoscopic video.
- Published
- 2007
41. A Multi-View Video Codec Based on H.264
- Author
-
Anil Aksay, Gozde Bozdagi Akar, and Cagdas Bilen
- Subjects
Motion compensation, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Data_CODINGANDINFORMATIONTHEORY, Intra-frame, Coding gain, Adaptive Multi-Rate audio codec, Motion estimation, Codec, Computer vision, Artificial intelligence, business, Data compression
- Abstract
H.264 is the current state-of-the-art monoscopic video codec, providing almost twice the coding efficiency at the same quality compared with previous codecs. With the increasing interest in 3D TV, multi-view video sequences, which are provided by multiple cameras capturing three-dimensional objects and/or scenes, are more widely used. Compressing multi-view sequences independently with H.264 (simulcast) is not efficient since the redundancy between the closer cameras is not exploited. In order to reduce these redundancies, we propose a multi-view video codec based on H.264 using disparity estimation/compensation as well as motion estimation/compensation. In order to effectively search for disparity/motion without increasing computational complexity, we modified the buffering structure of H.264 and implemented several referencing modes. Our results show that for closely located cameras, our codec outperforms simulcast H.264 coding. For sparsely located cameras, our method can still improve coding gain depending on the video characteristics.
- Published
- 2006
- Full Text
- View/download PDF
42. End-to-End Stereoscopic Video Streaming System
- Author
-
Anil Aksay, Selen Pehlivan, M.R. Civanlar, Cagdas Bilen, and Gozde Bozdagi Akar
- Subjects
Motion compensation, Video capture, Computer science, Real-time computing, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Video processing, computer.file_format, Smacker video, Video compression picture types, Computer graphics (images), Video tracking, Multiview Video Coding, computer, Data compression
- Abstract
Today, stereoscopic and multi-view video are among the popular research areas in the multimedia world. In this study, we have designed and built a platform consisting of stereo-view capturing, real-time transmission, and display. At the display stage, end users view video in 3D by using polarized glasses. Multi-view video is compressed efficiently using multi-view video coding techniques and streamed using standard real-time transport protocols. The entire system is built by modifying available open-source systems whenever possible. A receiver can view the video content, built from multiple channels, as mono or stereo depending on its display and bandwidth capabilities.
- Published
- 2006
- Full Text
- View/download PDF
43. A standards-based, flexible, end-to-end multi-view video streaming architecture.
- Author
-
Engin Kurutepe, Anil Aksay, Cagdas Bilen, C. Goktug Gurler, Thomas Sikora, Gozde Bozdagi Akar, and A. Murat Tekalp
- Published
- 2007
- Full Text
- View/download PDF
44. Rate-distortion optimized layered stereoscopic video streaming with raptor codes.
- Author
-
A. Serdar Tan, Anil Aksay, Cagdas Bilen, Gozde Bozdagi Akar, and Erdal Arikan
- Published
- 2007
- Full Text
- View/download PDF
45. Multiple description coding and its relevance to 3DTV
- Author
-
Jaakko Astola, Anil Aksay, Atanas Gotchev, Cagdas Bilen, M. Oguz Bici, Karen Egiazarian, Andrey Norkin, and Gozde Bozdagi Akar
- Subjects
Computer science, Multiple description, Multiple description coding, Packet loss rate, Relevance (information retrieval), Forward error correction, Rate distortion, Algorithm, Motion vector