891 results on '"C.-C. Jay Kuo"'
Search Results
2. A Survey on Perceptually Optimized Video Coding
- Author
-
Yun Zhang, Linwei Zhu, Gangyi Jiang, Sam Kwong, and C.-C. Jay Kuo
- Subjects
FOS: Computer and information sciences ,General Computer Science ,Image and Video Processing (eess.IV) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Multimedia ,Multimedia (cs.MM) ,Theoretical Computer Science
- Abstract
To provide users with more realistic visual experiences, videos are developing in the trends of Ultra High Definition (UHD), High Frame Rate (HFR), High Dynamic Range (HDR), Wide Color Gamut (WCG) and high clarity. However, the data amount of videos increases exponentially, which requires high efficiency video compression for storage and network transmission. Perceptually optimized video coding aims to maximize compression efficiency by exploiting visual redundancies. In this paper, we present a broad and systematic survey on perceptually optimized video coding. Firstly, we present problem formulation and framework of the perceptually optimized video coding, which includes visual perception modelling, visual quality assessment and perceptual video coding optimization. Secondly, recent advances on visual factors, computational perceptual models and quality assessment models are presented. Thirdly, we review perceptual video coding optimizations from four key aspects, including perceptually optimized bit allocation, rate-distortion optimization, transform and quantization, filtering and enhancement. In each part, problem formulation, working flow, recent advances, advantages and challenges are presented. Fourthly, perceptual coding performances of the latest coding standards and tools are experimentally analyzed. Finally, challenging issues and future opportunities are identified. Comment: 36 pages, 12 figures, 6 tables, accepted by ACM Computing Surveys
- Published
- 2023
- Full Text
- View/download PDF
3. Joint Graph Attention and Asymmetric Convolutional Neural Network for Deep Image Compression
- Author
-
Zhisen Tang, Hanli Wang, Xiaokai Yi, Yun Zhang, Sam Kwong, and C.-C. Jay Kuo
- Subjects
Media Technology ,Electrical and Electronic Engineering
- Published
- 2023
- Full Text
- View/download PDF
4. Brain MR Atlas Construction Using Symmetric Deep Neural Inpainting
- Author
-
Fangxu Xing, Xiaofeng Liu, C.-C. Jay Kuo, Georges El Fakhri, and Jonghye Woo
- Subjects
Health Information Management ,Image Processing, Computer-Assisted ,Brain ,Humans ,Health Informatics ,Electrical and Electronic Engineering ,Magnetic Resonance Imaging ,Article ,Computer Science Applications
- Abstract
Modeling statistical properties of anatomical structures using magnetic resonance imaging is essential for revealing common information of a target population and unique properties of specific subjects. In brain imaging, a statistical brain atlas is often constructed using a number of healthy subjects. When tumors are present, however, it is difficult to either provide a common space for various subjects or align their imaging data due to the unpredictable distribution of lesions. Here we propose a deep learning-based image inpainting method to replace the tumor regions with normal tissue intensities using only a patient population. Our framework has three major innovations: 1) incompletely distributed datasets with random tumor locations can be used for training; 2) irregularly-shaped tumor regions are properly learned, identified, and corrected; and 3) a symmetry constraint between the two brain hemispheres is applied to regularize inpainted regions. Henceforth, regular atlas construction and image registration methods can be applied using inpainted data to obtain tissue deformation, thereby achieving group-specific statistical atlases and patient-to-atlas registration. Our framework was tested using the public database from the Multimodal Brain Tumor Segmentation challenge. Results showed increased similarity scores as well as reduced reconstruction errors compared with three existing image inpainting methods. Patient-to-atlas registration also yielded better results with improved normalized cross-correlation and mutual information and a reduced amount of deformation over the tumor regions.
- Published
- 2022
- Full Text
- View/download PDF
5. Classification via Subspace Learning Machine (SLM): Methodology and Performance Evaluation
- Author
-
Hongyu Fu, Yijing Yang, Vinod K. Mishra, and C.-C. Jay Kuo
- Published
- 2023
- Full Text
- View/download PDF
6. MP55-18 A NOVEL MACHINE LEARNING FRAMEWORK TO AUTOMATED CHARACTERIZE PROSTATE IMAGING REPORTING AND DATA SYSTEM (PIRADS) ON MRI
- Author
-
Giovanni E. Cacciamani, Masatomo Kaneko, Yijing Yang, Vasileios Magoulianitis, Jintang Xue, Jiaxin Yang, Jinyuan Liu, Maria Sarah L. Lenon, Passant Mohamed, Darryl H. Hwang, Karan Gill, Manju Aron, Vinay Duddalwar, Suzanne L. Palmer, C.-C. Jay Kuo, Inderbir Gill, Andre Luis Abreu, and Chrysostomos L. Nikias
- Subjects
Urology
- Published
- 2023
- Full Text
- View/download PDF
7. MP09-06 ASSESSMENT OF A NOVEL BPMRI-BASED MACHINE LEARNING FRAMEWORK TO AUTOMATE THE DETECTION OF CLINICALLY SIGNIFICANT PROSTATE CANCER USING THE PI-CAI (PROSTATE IMAGING: CANCER AI) CHALLENGE DATASET
- Author
-
Andre Luis Abreu, Giovanni Cacciamani, Masatomo Kaneko, Vasileios Magoulianitis, Yijing Yang, Vinay Duddalwar, C.-C. Jay Kuo, Inderbir Gill, and Chrysostomos L. Nikias
- Subjects
Urology
- Published
- 2023
- Full Text
- View/download PDF
8. MP09-05 AUTOMATED PROSTATE GLAND AND PROSTATE ZONES SEGMENTATION USING A NOVEL MRI-BASED MACHINE LEARNING FRAMEWORK AND CREATION OF SOFTWARE INTERFACE FOR USERS ANNOTATION
- Author
-
Masatomo Kaneko, Giovanni E. Cacciamani, Yijing Yang, Vasileios Magoulianitis, Jintang Xue, Jiaxin Yang, Jinyuan Liu, Maria Sarah L. Lenon, Passant Mohamed, Darryl H. Hwang, Karan Gill, Manju Aron, Vinay Duddalwar, Suzanne L. Palmer, C.-C. Jay Kuo, Andre Luis Abreu, Inderbir Gill, and Chrysostomos L. Nikias
- Subjects
Urology
- Published
- 2023
- Full Text
- View/download PDF
9. MP55-20 A NOVEL MACHINE LEARNING FRAMEWORK FOR AUTOMATED DETECTION OF PROSTATE CANCER LESIONS CONFIRMED ON MRI-INFORMED TARGET BIOPSY
- Author
-
Masatomo Kaneko, Giovanni E. Cacciamani, Vasileios Magoulianitis, Yijing Yang, Jintang Xue, Jiaxin Yang, Jinyuan Liu, Maria Sarah L. Lenon, Passant Mohamed, Darryl H. Hwang, Karan Gill, Manju Aron, Vinay Duddalwar, Suzanne L. Palmer, C.-C. Jay Kuo, Inderbir Gill, Andre Luis Abreu, and Chrysostomos L. Nikias
- Subjects
Urology
- Published
- 2023
- Full Text
- View/download PDF
10. High Efficiency Intra Video Coding Based on Data-Driven Transform
- Author
-
Na Li, Yun Zhang, and C.-C. Jay Kuo
- Subjects
Media Technology ,Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
11. VoxelHop: Successive Subspace Learning for ALS Disease Classification Using Structural MRI
- Author
-
Xiaofeng Liu, Suma Babu, Fangxu Xing, Georges El Fakhri, C.-C. Jay Kuo, Thomas M Jenkins, Chao Yang, and Jonghye Woo
- Subjects
FOS: Computer and information sciences ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Concatenation ,Computer Science - Computer Vision and Pattern Recognition ,Convolutional neural network ,Article ,Health Information Management ,Robustness (computer science) ,FOS: Electrical engineering, electronic engineering, information engineering ,Humans ,Electrical and Electronic Engineering ,business.industry ,Dimensionality reduction ,Deep learning ,Amyotrophic Lateral Sclerosis ,Image and Video Processing (eess.IV) ,Pattern recognition ,Electrical Engineering and Systems Science - Image and Video Processing ,Magnetic Resonance Imaging ,Backpropagation ,Regression ,Computer Science Applications ,Neural Networks, Computer ,Artificial intelligence ,business ,Subspace topology ,Biotechnology
- Abstract
Deep learning has great potential for accurate detection and classification of diseases with medical imaging data, but the performance is often limited by the number of training datasets and memory requirements. In addition, many deep learning models are considered a "black-box," thereby often limiting their adoption in clinical applications. To address this, we present a successive subspace learning model, termed VoxelHop, for accurate classification of Amyotrophic Lateral Sclerosis (ALS) using T2-weighted structural MRI data. Compared with popular convolutional neural network (CNN) architectures, VoxelHop has modular and transparent structures with fewer parameters without any backpropagation, so it is well-suited to small dataset size and 3D imaging data. Our VoxelHop has four key components, including (1) sequential expansion of near-to-far neighborhood for multi-channel 3D data; (2) subspace approximation for unsupervised dimension reduction; (3) label-assisted regression for supervised dimension reduction; and (4) concatenation of features and classification between controls and patients. Our experimental results demonstrate that our framework using a total of 20 controls and 26 patients achieves an accuracy of 93.48 % and an AUC score of 0.9394 in differentiating patients from controls, even with a relatively small number of datasets, showing its robustness and effectiveness. Our thorough evaluations also show its validity and superiority to the state-of-the-art 3D CNN classification approaches. Our framework can easily be generalized to other classification tasks using different imaging modalities.
- Published
- 2022
- Full Text
- View/download PDF
12. Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video
- Author
-
Xiaoping Fan, Yun Zhang, You Yang, Huanhua Liu, C.-C. Jay Kuo, and Sam Kwong
- Subjects
Computer science ,Just-noticeable difference ,business.industry ,Deep learning ,Pattern recognition ,Feature (computer vision) ,Distortion ,Human visual system model ,Media Technology ,Key frame ,Artificial intelligence ,Electrical and Electronic Engineering ,Quantization (image processing) ,business ,Coding (social sciences)
- Abstract
The human visual system has limited sensitivity in detecting small distortions in an image or video, and the minimum perceptual threshold is the so-called Just Noticeable Difference (JND). JND modelling is challenging since it highly depends on visual contents and perceptual factors are not fully understood. In this paper, we propose deep learning based JND and perceptual quality prediction models, which are able to predict the Satisfied User Ratio (SUR) and Video Wise JND (VWJND) of compressed videos with different resolutions and coding parameters. Firstly, the SUR prediction is modeled as a regression problem that fits deep learning tools. Then, a Video Wise Spatial SUR method (VW-SSUR) is proposed to predict the SUR value for compressed video, which mainly considers the spatial distortion. Thirdly, we further propose a Video Wise Spatial-Temporal SUR (VW-STSUR) method to improve the SUR prediction accuracy by considering the spatial and temporal information. Two fusion schemes that fuse the spatial and temporal information in quality score level and in feature level, respectively, are investigated. Finally, key factors including key frame and patch selections, cross resolution prediction and complexity are analyzed. Experimental results demonstrate that the proposed VW-SSUR method outperforms the state-of-the-art schemes in both SUR and VWJND prediction. Moreover, the proposed VW-STSUR further improves the accuracy as compared with the VW-SSUR and the conventional JND models, where the mean SUR prediction error is 0.049, and mean VWJND prediction error is 1.69 in quantization parameter and 0.84 dB in peak signal-to-noise ratio.
- Published
- 2022
- Full Text
- View/download PDF
13. Subtype-Aware Dynamic Unsupervised Domain Adaptation
- Author
-
Xiaofeng Liu, Fangxu Xing, Jane You, Jun Lu, C.-C. Jay Kuo, Georges El Fakhri, and Jonghye Woo
- Subjects
Signal Processing (eess.SP) ,FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Networks and Communications ,Computer Vision and Pattern Recognition (cs.CV) ,Image and Video Processing (eess.IV) ,Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Image and Video Processing ,Machine Learning (cs.LG) ,Computer Science Applications ,Artificial Intelligence (cs.AI) ,Artificial Intelligence ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Signal Processing ,Software
- Abstract
Unsupervised domain adaptation (UDA) has been successfully applied to transfer knowledge from a labeled source domain to target domains without their labels. The recently introduced transferable prototypical network (TPN) further addresses class-wise conditional alignment. In TPN, while the closeness of class centers between source and target domains is explicitly enforced in a latent space, the underlying fine-grained subtype structure and the cross-domain within-class compactness have not been fully investigated. To counter this, we propose a new approach to adaptively perform a fine-grained subtype-aware alignment to improve performance in the target domain without the subtype label in both domains. The insight of our approach is that the unlabeled subtypes in a class have the local proximity within a subtype, while exhibiting disparate characteristics, because of different conditional and label shifts. Specifically, we propose to simultaneously enforce subtype-wise compactness and class-wise separation, by utilizing intermediate pseudo-labels. In addition, we systematically investigate various scenarios with and without prior knowledge of subtype numbers, and propose to exploit the underlying subtype structure. Furthermore, a dynamic queue framework is developed to evolve the subtype cluster centroids steadily using an alternative processing scheme. Experimental results, carried out with multi-view congenital heart disease data and VisDA and DomainNet, show the effectiveness and validity of our subtype-aware UDA, compared with state-of-the-art UDA methods. IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
- Published
- 2022
- Full Text
- View/download PDF
14. Task-Driven Video Compression for Humans and Machines: Framework Design and Optimization
- Author
-
Xiaokai Yi, Hanli Wang, Sam Kwong, and C.-C. Jay Kuo
- Subjects
Signal Processing ,Media Technology ,Electrical and Electronic Engineering ,Computer Science Applications
- Published
- 2022
- Full Text
- View/download PDF
15. TypeEA: Type-Associated Embedding for Knowledge Graph Entity Alignment
- Author
-
C.-C. Jay Kuo, Bin Wang, Yun Cheng Wang, and Xiou Ge
- Subjects
Signal Processing ,Information Systems
- Published
- 2023
- Full Text
- View/download PDF
16. SALVE: Self-Supervised Adaptive Low-Light Video Enhancement
- Author
-
C.-C. Jay Kuo and Zohreh Azizi
- Subjects
Signal Processing ,Information Systems
- Published
- 2023
- Full Text
- View/download PDF
17. S3I-PointHop: SO(3)-Invariant PointHop for 3D Point Cloud Classification
- Author
-
Pranav Kadam, Hardik Prajapati, Min Zhang, Jintang Xue, Shan Liu, and C.-C. Jay Kuo
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition
- Abstract
Many point cloud classification methods are developed under the assumption that all point clouds in the dataset are well aligned with the canonical axes so that the 3D Cartesian point coordinates can be employed to learn features. When input point clouds are not aligned, the classification performance drops significantly. In this work, we focus on a mathematically transparent point cloud classification method called PointHop, analyze its reason for failure due to pose variations, and solve the problem by replacing its pose dependent modules with rotation invariant counterparts. The proposed method is named SO(3)-Invariant PointHop (or S3I-PointHop for short). We also significantly simplify the PointHop pipeline using only one single hop along with multiple spatial aggregation techniques. The idea of exploiting more spatial information is novel. Experiments on the ModelNet40 dataset demonstrate the superiority of S3I-PointHop over traditional PointHop-like methods. Comment: 5 pages, 3 figures
- Published
- 2023
- Full Text
- View/download PDF
18. An Unsupervised Parameter-Free Nuclei Segmentation Method for Histology Images
- Author
-
Vasileios Magoulianitis, Peida Han, Yijing Yang, and C.-C. Jay Kuo
- Published
- 2022
- Full Text
- View/download PDF
19. Data-Driven Transform-Based Compressed Image Quality Assessment
- Author
-
Xinfeng Zhang, C.-C. Jay Kuo, and Sam Kwong
- Subjects
business.industry ,Image quality ,Computer science ,Feature extraction ,Pattern recognition ,Weighting ,Feature (computer vision) ,Human visual system model ,Media Technology ,Codec ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Energy (signal processing) ,Image compression
- Abstract
Image quality assessment is a critical problem for image compression, which can be utilized as a guidance for image compression and codec evaluation. In this paper, we propose a full reference image quality assessment (IQA) algorithm to measure the perceptual quality of compressed images. The proposed IQA model utilizes a data-driven transform, multi-stage Karhunen-Loeve Transform (MS-KLT), as a feature extractor to decompose both reference and compressed images into feature domain, where the importance of feature distortions in different spectral components to human visual system (HVS) is easy to distinguish. Accordingly, an efficient weighting strategy is proposed to reflect the importance of feature distortions based on the energy of transformed coefficients. Considering HVS characteristics, weighted spatial masking effect is derived from both local and global perspectives. In addition, to avoid influences of random noises, a local adaptive low-pass filtering process is applied as a pre-processing operation. Extensive experimental results on popular datasets show that our proposed method correlates better with the subjective results compared with the state-of-the-art algorithms. Moreover, the proposed method behaves more robustly compared with existing methods, and achieves top-ranking performance on different IQA datasets.
- Published
- 2021
- Full Text
- View/download PDF
20. PRISMA AI reporting guidelines for systematic reviews and meta-analyses on AI in healthcare
- Author
-
Giovanni E. Cacciamani, Timothy N. Chu, Daniel I. Sanford, Andre Abreu, Vinay Duddalwar, Assad Oberai, C.-C. Jay Kuo, Xiaoxuan Liu, Alastair K. Denniston, Baptiste Vasey, Peter McCulloch, Robert F. Wolff, Sue Mallett, John Mongan, Charles E. Kahn, Viknesh Sounderajah, Ara Darzi, Philipp Dahm, Karel G. M. Moons, Eric Topol, Gary S. Collins, David Moher, Inderbir S. Gill, and Andrew J. Hung
- Subjects
General Medicine ,General Biochemistry, Genetics and Molecular Biology
- Published
- 2023
- Full Text
- View/download PDF
21. BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis
- Author
-
Masoud Monajatipoor, Mozhdeh Rouhsedaghat, Liunian Harold Li, C.-C. Jay Kuo, Aichi Chien, and Kai-Wei Chang
- Subjects
Article
- Abstract
Vision-and-language (V&L) models take image and text as input and learn to capture the associations between them. These models can potentially deal with the tasks that involve understanding medical images along with their associated text. However, applying V&L models in the medical domain is challenging due to the cost of data annotation and the need for domain knowledge. In this paper, we identify that the visual representation in general V&L models is not suitable for processing medical data. To overcome this limitation, we propose BERTHop, a transformer-based model based on PixelHop++ and VisualBERT for better capturing the associations between clinical notes and medical images. Experiments on the OpenI dataset, a commonly used thoracic disease diagnosis benchmark, show that BERTHop achieves an average Area Under the Curve (AUC) of 98.12%, which is 1.62% higher than the state of the art, while it is trained on a 9× smaller dataset.
- Published
- 2022
22. Unsupervised Domain Adaptation for Segmentation with Black-box Source Model
- Author
-
Xiaofeng Liu, Chaehwa Yoo, Fangxu Xing, C.-C. Jay Kuo, Georges El Fakhri, Je-Won Kang, and Jonghye Woo
- Subjects
Article
- Abstract
Unsupervised domain adaptation (UDA) has been widely used to transfer knowledge from a labeled source domain to an unlabeled target domain to counter the difficulty of labeling in a new domain. The training of conventional solutions usually relies on the existence of both source and target domain data. However, privacy of the large-scale and well-labeled data in the source domain and trained model parameters can become the major concern of cross center/domain collaborations. In this work, to address this, we propose a practical solution to UDA for segmentation with a black-box segmentation model trained in the source domain only, rather than original source data or a white-box source model. Specifically, we resort to a knowledge distillation scheme with exponential mixup decay (EMD) to gradually learn target-specific representations. In addition, unsupervised entropy minimization is further applied to regularization of the target domain confidence. We evaluated our framework on the BraTS 2018 database, achieving performance on par with white-box source model adaptation approaches.
- Published
- 2022
23. ExpressionHop: A Lightweight Human Facial Expression Classifier
- Author
-
Chengwei Wei, C.-C. Jay Kuo, Rafael Luiz Testa, Ariane Machado-Lima, and Fatima L. S. Nunes
- Published
- 2022
- Full Text
- View/download PDF
24. Geometrical Interpretation and Design of Multilayer Perceptrons
- Author
-
Ruiyuan Lin, Zhiruo Zhou, Suya You, Raghuveer Rao, and C.-C. Jay Kuo
- Subjects
Artificial Intelligence ,Computer Networks and Communications ,Software ,Computer Science Applications
- Abstract
The multilayer perceptron (MLP) neural network is interpreted from the geometrical viewpoint in this work, that is, an MLP partitions an input feature space into multiple nonoverlapping subspaces using a set of hyperplanes, where the great majority of samples in a subspace belong to one object class. Based on this high-level idea, we propose a three-layer feedforward MLP (FF-MLP) architecture for its implementation. In the first layer, the input feature space is split into multiple subspaces by a set of partitioning hyperplanes and rectified linear unit (ReLU) activation, which is implemented by the classical two-class linear discriminant analysis (LDA). In the second layer, each neuron activates one of the subspaces formed by the partitioning hyperplanes with specially designed weights. In the third layer, all subspaces of the same class are connected to an output node that represents the object class. The proposed design determines all MLP parameters in a feedforward one-pass fashion analytically without backpropagation. Experiments are conducted to compare the performance of the traditional backpropagation-based MLP (BP-MLP) and the new FF-MLP. It is observed that the FF-MLP outperforms the BP-MLP in terms of design time, training time, and classification performance in several benchmarking datasets. Our source code is available at https://colab.research.google.com/drive/1Gz0L8A-nT4ijrUchrhEXXsnaacrFdenn?usp=sharing.
- Published
- 2022
25. On Relationship of Multilayer Perceptrons and Piecewise Polynomial Approximators
- Author
-
Suya You, Ruiyuan Lin, Raghuveer M. Rao, and C.-C. Jay Kuo
- Subjects
Polynomial ,Applied Mathematics ,Computer Science::Neural and Evolutionary Computation ,Filter (signal processing) ,Perceptron ,Nonlinear system ,symbols.namesake ,Computer Science::Computer Vision and Pattern Recognition ,Multilayer perceptron ,Signal Processing ,Piecewise ,Taylor series ,symbols ,Applied mathematics ,Point (geometry) ,Electrical and Electronic Engineering ,Mathematics
- Abstract
The relationship between a multilayer perceptron (MLP) regressor and a piecewise polynomial approximator is investigated in this work. We propose an MLP construction method, including the choice of activation, the specification of neuron numbers and filter weights. Through the construction, a one-to-one correspondence between an MLP and a piecewise polynomial is established. Especially, we point out that the form of nonlinear activation is related to the polynomial order. Since the approximation capability of piecewise polynomials is well understood, our study sheds new light on the universal approximation capability of an MLP.
- Published
- 2021
- Full Text
- View/download PDF
26. Saak Transform-Based Machine Learning for Light-Sheet Imaging of Cardiac Trabeculation
- Author
-
René R. Sevag Packard, Yanan Fei, Dengfeng Kuang, Kyung In Baek, Tzung K. Hsiai, Zhaoqiang Wang, Mehrdad Roustaei, Varun Gudapati, Sibo Song, Ruiyuan Lin, C.-C. Jay Kuo, Yichen Ding, and Chih-Chiang Chang
- Subjects
Neural Networks ,Artificial Intelligence and Image Processing ,principal component analysis ,Computer science ,Image Processing ,0206 medical engineering ,Feature extraction ,Biomedical Engineering ,Bioengineering ,Image processing ,02 engineering and technology ,Cardiovascular ,Machine learning ,computer.software_genre ,Fluorescence ,Article ,Edge detection ,Machine Learning ,Computer ,Computer-Assisted ,Image Processing, Computer-Assisted ,Segmentation ,Electrical and Electronic Engineering ,Microscopy ,Image segmentation ,business.industry ,Heart ,Random forests ,Transforms ,020601 biomedical engineering ,Kernel ,Heart Disease ,Networking and Information Technology R&D (NITRD) ,Microscopy, Fluorescence ,cardiology ,Biomedical Imaging ,Neural Networks, Computer ,Artificial intelligence ,Biomedical optical imaging ,business ,computer ,Algorithms ,Subspace topology
- Abstract
Objective: Recent advances in light-sheet fluorescence microscopy (LSFM) enable 3-dimensional (3-D) imaging of cardiac architecture and mechanics in toto. However, segmentation of the cardiac trabecular network to quantify cardiac injury remains a challenge. Methods: We hereby employed “subspace approximation with augmented kernels (Saak) transform” for accurate and efficient quantification of the light-sheet image stacks following chemotherapy-treatment. We established a machine learning framework with augmented kernels based on the Karhunen-Loeve Transform (KLT) to preserve linearity and reversibility of rectification. Results: The Saak transform-based machine learning enhances computational efficiency and obviates iterative optimization of cost function needed for neural networks, minimizing the number of training datasets for segmentation in our scenario. The integration of forward and inverse Saak transforms can also serve as a light-weight module to filter adversarial perturbations and reconstruct estimated images, salvaging robustness of existing classification methods. The accuracy and robustness of the Saak transform are evident following the tests of dice similarity coefficients and various adversary perturbation algorithms, respectively. The addition of edge detection further allows for quantifying the surface area to volume ratio (SVR) of the myocardium in response to chemotherapy-induced cardiac remodeling. Conclusion: The combination of Saak transform, random forest, and edge detection augments segmentation efficiency by 20-fold as compared to manual processing. Significance: This new methodology establishes a robust framework for post light-sheet imaging processing, and creates a data-driven machine learning approach for automated quantification of cardiac ultra-structure.
- Published
- 2021
- Full Text
- View/download PDF
27. Shape-Preserving Stereo Object Remapping via Object-Consistent Grid Warping
- Author
-
Bernard Ghanem, C.-C. Jay Kuo, Wen Gao, Bing Li, Shan Liu, Chia-Wen Lin, and Cheng Zheng
- Subjects
Flexibility (engineering) ,business.industry ,Computer science ,Distortion (optics) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Object (computer science) ,Grid ,Computer Graphics and Computer-Aided Design ,Visualization ,Consistency (database systems) ,Transformation (function) ,Computer vision ,Artificial intelligence ,Image warping ,business ,Software
- Abstract
Viewing various stereo images under different viewing conditions has escalated the need for effective object-level remapping techniques. In this paper, we propose a new object spatial mapping scheme, which adjusts the depth and size of the selected object to match user preference and viewing conditions. Existing warping-based methods often distort the shape of important objects or cannot faithfully adjust the depth/size of the selected object due to improper warping such as local rotations. In this paper, by explicitly reducing the transformation freedom degree of warping, we propose an optimization model based on axis-aligned warping for object spatial remapping. The proposed axis-aligned warping based optimization model can simultaneously adjust the depths and sizes of selected objects to their target values without introducing severe shape distortions. Moreover, we propose object consistency constraints to ensure the size/shape of parts inside a selected object to be consistently adjusted. Such constraints improve the size/shape adjustment performance while remaining robust to some extent to incomplete object extraction. Experimental results demonstrate that the proposed method achieves high flexibility and effectiveness in adjusting the size and depth of objects compared with existing methods.
- Published
- 2021
- Full Text
- View/download PDF
28. SynWMD: Syntax-aware Word Mover's Distance for Sentence Similarity Evaluation
- Author
-
Chengwei Wei, Bin Wang, and C.-C. Jay Kuo
- Subjects
FOS: Computer and information sciences ,History ,Computer Science - Computation and Language ,Polymers and Plastics ,Business and International Management ,Computation and Language (cs.CL) ,Industrial and Manufacturing Engineering
- Abstract
Word Mover's Distance (WMD) computes the distance between words and models text similarity with the moving cost between words in two text sequences. Yet, it does not offer good performance in sentence similarity evaluation since it does not incorporate word importance and fails to take inherent contextual and structural information in a sentence into account. An improved WMD method using the syntactic parse tree, called Syntax-aware Word Mover's Distance (SynWMD), is proposed to address these two shortcomings in this work. First, a weighted graph is built upon the word co-occurrence statistics extracted from the syntactic parse trees of sentences. The importance of each word is inferred from graph connectivities. Second, the local syntactic parsing structure of words is considered in computing the distance between words. To demonstrate the effectiveness of the proposed SynWMD, we conduct experiments on 6 textual semantic similarity (STS) datasets and 4 sentence classification datasets. Experimental results show that SynWMD achieves state-of-the-art performance on STS tasks. It also outperforms other WMD-based methods on sentence classification tasks.
- Published
- 2022
29. Fake Satellite Image Detection via Parallel Subspace Learning (PSL)
- Author
-
Hong-Shuo Chen, Kaitai Zhang, Shuowen Hu, Suya You, and C.-C. Jay Kuo
- Published
- 2022
- Full Text
- View/download PDF
30. Perceptually Weighted Mean Squared Error Based Rate-Distortion Optimization for HEVC
- Author
-
Xiuzhe Wu, Sudeng Hu, C.-C. Jay Kuo, Hanli Wang, and Sam Kwong
- Subjects
Signal processing ,Mean squared error ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,02 engineering and technology ,Video quality ,Rate–distortion optimization ,Nonlinear distortion ,Human visual system model ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Perceptual Distortion ,Electrical and Electronic Engineering ,Algorithm ,Coding (social sciences)
- Abstract
The process of rate-distortion (RD) optimization plays a key role in video coding, which aims to achieve a tradeoff between compression efficiency and video quality distortion. Although the conventional objective distortion metric, mean squared error, has low computational complexity, it is not always in accordance with the quality perceived by the human visual system (HVS). Taking the characteristics of the HVS into consideration, a perceptually weighted mean squared error (PWMSE) based RD model is proposed in this work. First, a low-pass filter is employed to process the original distortion information in order to simulate visual signal processing and obtain the perceptual distortion. Then, masking modulation in both temporal and spatial domains is introduced into the distortion model, and a novel Lagrange multiplier is derived accordingly. The proposed PWMSE based RD model is applied to the high efficiency video coding standard, and comparative experimental results demonstrate its effectiveness. The project page can be found at https://mic.tongji.edu.cn/ab/40/c9778a174912/page.htm.
- Published
- 2020
- Full Text
- View/download PDF
31. Perceptual Image Compression with Block-Level Just Noticeable Difference Prediction
- Author
-
C.-C. Jay Kuo, Sam Kwong, Hanli Wang, and Tao Tian
- Subjects
Standard test image ,Computer Networks and Communications ,Computer science ,business.industry ,Just-noticeable difference ,020206 networking & telecommunications ,Pattern recognition ,02 engineering and technology ,Convolutional neural network ,Image (mathematics) ,Otsu's method ,symbols.namesake ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,Discrete cosine transform ,Preprocessor ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Block (data storage)
- Abstract
A block-level perceptual image compression framework is proposed in this work, including a block-level just noticeable difference (JND) prediction model and a preprocessing scheme. Specifically, block-level JND values are first deduced by utilizing the Otsu method based on the variation of block-level structural similarity values between two adjacent picture-level JND values in the MCL-JCI dataset. After the JND value for each image block is generated, a convolutional neural network–based prediction model is designed to forecast block-level JND values for a given target image. Then, a preprocessing scheme is devised to modify the discrete cosine transform coefficients during JPEG compression on the basis of the distribution of block-level JND values of the target test image. Finally, the test image is compressed by the max JND value across all of its image blocks in light of the initial quality factor setting. The experimental results demonstrate that the proposed block-level perceptual image compression method is able to achieve 16.75% bit saving as compared to the state-of-the-art method with similar subjective quality. The project page can be found at https://mic.tongji.edu.cn/43/3f/c9778a148287/page.htm.
- Published
- 2020
- Full Text
- View/download PDF
32. Just Noticeable Difference Level Prediction for Perceptual Image Compression
- Author
-
C.-C. Jay Kuo, Hanli Wang, Tao Tian, Sam Kwong, and Lingxuan Zuo
- Subjects
Scheme (programming language) ,Computer science ,business.industry ,Just-noticeable difference ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,Pattern recognition ,02 engineering and technology ,Convolutional neural network ,Image (mathematics) ,Visualization ,Compression (functional analysis) ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Quality (business) ,Artificial intelligence ,Electrical and Electronic Engineering ,Perceptual image ,business ,computer ,media_common ,computer.programming_language
- Abstract
A perceptual image compression framework is proposed in this work, including an adaptive picture-level just noticeable difference (PJND) prediction model and a perceptual coding scheme. Specifically, a convolutional neural network (CNN) model is designed with the existing subjective image database to predict the PJND label for a given image. Then, the support vector regression model is utilized to determine the number of PJND levels. After that, a just noticeable difference generation algorithm is developed to compute the corresponding quality factor for each PJND level. Moreover, an effective perceptual coding scheme is devised for perceptual image compression. Finally, the accuracy of the proposed PJND prediction model and the performance of the proposed perceptual coding scheme are evaluated. The experimental results show that the proposed CNN-based PJND prediction model achieves good prediction accuracy and the proposed perceptual coding scheme produces state-of-the-art rate-distortion performance.
- Published
- 2020
- Full Text
- View/download PDF
33. Image Coding With Data-Driven Transforms: Methodology, Performance and Potential
- Author
-
Ioannis Katsavounidis, Shan Liu, Xinfeng Zhang, C.-C. Jay Kuo, Xiaoguang Li, Yang Haitao, Shaw-Min Lei, and Chao Yang
- Subjects
Computer science ,business.industry ,Quantization (signal processing) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,02 engineering and technology ,computer.file_format ,Computer Graphics and Computer-Aided Design ,JPEG ,Kernel (image processing) ,Computer Science::Computer Vision and Pattern Recognition ,Frequency domain ,JPEG 2000 ,0202 electrical engineering, electronic engineering, information engineering ,Discrete cosine transform ,020201 artificial intelligence & image processing ,Artificial intelligence ,Quantization (image processing) ,business ,computer ,Software ,Transform coding ,Image compression
- Abstract
Image compression has been an important topic in recent decades due to the explosive increase in the number of images. The popular image compression formats are based on different transforms, which convert images from the spatial domain into a compact frequency domain to remove the spatial correlation. In this paper, we focus on the exploration of a data-driven transform, the Karhunen-Loève transform (KLT), the kernels of which are derived from specific images via Principal Component Analysis (PCA), and design a highly efficient KLT-based image compression algorithm with variable transform sizes. To explore the optimal compression performance, multiple transform sizes and categories are utilized and determined adaptively according to their rate-distortion (RD) costs. Moreover, comprehensive analyses on the transform coefficients are provided and a band-adaptive quantization scheme is proposed based on the coefficient RD performance. Extensive experiments are performed on several class-specific images as well as general images, and the proposed method achieves significant coding gain over the popular image compression standards including JPEG, JPEG 2000, and the state-of-the-art dictionary learning based methods.
- Published
- 2020
- Full Text
- View/download PDF
34. Perceptual Temporal Incoherence-Guided Stereo Video Retargeting
- Author
-
Wen Gao, Bing Li, C.-C. Jay Kuo, Chia-Wen Lin, Shan Liu, and Tiejun Huang
- Subjects
business.industry ,Computer science ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Computer Graphics and Computer-Aided Design ,Visualization ,Perception ,Distortion ,Retargeting ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,Software ,media_common ,Coherence (physics)
- Abstract
Stereo video retargeting aims at minimizing shape and depth distortions with temporal coherence when resizing stereo video content to a desired size. Existing methods extend stereo image retargeting schemes to stereo video retargeting by adding temporal constraints that demand temporal coherence in all corresponding regions. However, such a straightforward extension incurs conflicts among multiple requirements (i.e., shape and depth preservation and their temporal coherence), thus failing to meet one or more of these requirements satisfactorily. To mitigate conflicts among depth, shape, and temporal constraints and avoid degrading temporal coherence perceptually, we relax temporal constraints for non-paired regions at frame boundaries, derive new temporal constraints to improve the human viewing experience of a 3D scene, and propose an efficient grid-based implementation for stereo video retargeting. Experimental results demonstrate that our method achieves superior visual quality over existing methods.
- Published
- 2020
- Full Text
- View/download PDF
35. Satisfied-User-Ratio Modeling for Compressed Video
- Author
-
Haiqiang Wang, Wei Xu, Xinfeng Zhang, C.-C. Jay Kuo, and Chao Yang
- Subjects
Auditory masking ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Video quality ,Computer Graphics and Computer-Aided Design ,Uncompressed video ,Feature (computer vision) ,0202 electrical engineering, electronic engineering, information engineering ,Codec ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,Software ,Internet video
- Abstract
With the explosive increase of internet video services, perceptual modeling for video quality has attracted more attention as a way to provide high quality-of-experience (QoE) for end-users subject to bandwidth constraints, especially for compressed video quality. In this paper, a novel perceptual model for satisfied-user-ratio (SUR) on compressed video quality is proposed by exploiting compressed video bitrate changes and spatial-temporal statistical characteristics extracted from both uncompressed original video and reference video. In the proposed method, an efficient video feature set is explored and established to model SUR curves against bitrate variations by leveraging the Gaussian Processes Regression (GPR) framework. In particular, the proposed model is based on the recently released large-scale video quality dataset, VideoSet, and takes both spatial and temporal masking effects into consideration. To make it more practical, we further optimize the proposed method in three aspects: feature source simplification, computational complexity reduction, and video codec adaptation. Based on experimental results on VideoSet, the proposed method can accurately model SUR curves for various video contents and predict their required bitrates at given SUR values. Subjective experiments are conducted to further verify the generalization ability of the proposed SUR model.
- Published
- 2020
- Full Text
- View/download PDF
36. SBERT-WK: A Sentence Embedding Method by Dissecting BERT-Based Word Models
- Author
-
Bin Wang and C.-C. Jay Kuo
- Subjects
Thesaurus (information retrieval) ,Context model ,Acoustics and Ultrasonics ,Computer science ,business.industry ,computer.software_genre ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Computational Mathematics ,Similarity (psychology) ,Computer Science (miscellaneous) ,Task analysis ,Embedding ,Artificial intelligence ,Electrical and Electronic Engineering ,0305 other medical science ,Representation (mathematics) ,business ,computer ,Sentence ,Natural language processing ,Word (computer architecture)
- Abstract
Sentence embedding is an important research topic in natural language processing (NLP) since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word representation, called BERT, achieves state-of-the-art performance in quite a few NLP tasks. Yet, it is an open problem to generate a high-quality sentence representation from BERT-based word models. It was shown in a previous study that different layers of BERT capture different linguistic properties. This allows us to fuse information across layers to find better sentence representations. In this work, we study the layer-wise pattern of the word representation of deep contextualized models. Then, we propose a new sentence embedding method by dissecting BERT-based word models through geometric analysis of the space spanned by the word representation. It is called the SBERT-WK method. No further training is required in SBERT-WK. We evaluate SBERT-WK on semantic textual similarity and downstream supervised tasks. Furthermore, ten sentence-level probing tasks are presented for detailed linguistic analysis. Experiments show that SBERT-WK achieves state-of-the-art performance. Our code is publicly available.
- Published
- 2020
- Full Text
- View/download PDF
37. GraphHop: An Enhanced Label Propagation Method for Node Classification
- Author
-
Tian Xie, C.-C. Jay Kuo, and Bin Wang
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Artificial Intelligence ,Computer Networks and Communications ,Software ,Machine Learning (cs.LG) ,Computer Science Applications
- Abstract
A scalable semisupervised node classification method on graph-structured data, called GraphHop, is proposed in this work. The graph contains all nodes' attributes and link connections but labels of only a subset of nodes. Graph convolutional networks (GCNs) have provided superior performance in node label classification over the traditional label propagation (LP) methods for this problem. Nevertheless, current GCN algorithms either require a considerable number of labels for training because of high model complexity, or cannot be easily generalized to large-scale graphs due to the expensive cost of loading the entire graph and node embeddings. Besides, nonlinearity makes the optimization process a mystery. To this end, an enhanced LP method, called GraphHop, is proposed to tackle these problems. GraphHop can be viewed as a smoothening LP algorithm, in which each propagation alternates between two steps: label aggregation and label update. In the label aggregation step, multihop neighbor embeddings are aggregated to the center node. In the label update step, new embeddings are learned and predicted for each node based on aggregated results from the previous step. The two-step iteration improves the graph signal smoothening capacity. Furthermore, to encode attributes, links, and labels on graphs effectively under one framework, we adopt a two-stage training process, i.e., the initialization stage and the iteration stage. Thus, the smooth attribute information extracted from the initialization stage is consistently imposed in the propagation process in the iteration stage. Experimental results show that GraphHop outperforms state-of-the-art graph learning methods on a wide range of tasks in graphs of various sizes (e.g., multilabel and multiclass classification on citation networks, social graphs, and commodity consumption graphs).
- Published
- 2022
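For readers unfamiliar with the traditional label propagation (LP) baseline that entry 37 refers to, the following is a minimal sketch of classical LP on a toy graph. It is not GraphHop: there is no multihop aggregation or learned update step, and the function name `propagate_labels`, the six-node path graph, and the two seed labels are assumptions made purely for illustration.

```python
def propagate_labels(adjacency, seed_labels, num_classes, iterations=200):
    """Classical label propagation with clamped seed nodes.

    adjacency   : list of neighbour lists (every node needs >= 1 neighbour)
    seed_labels : dict mapping labelled node index -> class index
    """
    n = len(adjacency)
    # Seeds start as one-hot vectors, unlabelled nodes as uniform vectors.
    scores = []
    for node in range(n):
        if node in seed_labels:
            scores.append(
                [1.0 if c == seed_labels[node] else 0.0 for c in range(num_classes)]
            )
        else:
            scores.append([1.0 / num_classes] * num_classes)

    for _ in range(iterations):
        updated = []
        for node in range(n):
            if node in seed_labels:
                # Clamp: known labels are never overwritten.
                updated.append(scores[node])
                continue
            neighbours = adjacency[node]
            # Aggregate: average the neighbours' current label scores.
            updated.append(
                [
                    sum(scores[m][c] for m in neighbours) / len(neighbours)
                    for c in range(num_classes)
                ]
            )
        scores = updated

    # Predict the highest-scoring class for every node.
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Toy path graph 0-1-2-3-4-5 with one labelled node at each end.
path_graph = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
predictions = propagate_labels(path_graph, {0: 0, 5: 1}, num_classes=2)
```

On this path graph the scores converge to a linear interpolation between the two seeds, so each unlabelled node takes the label of the nearer seed.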
38. The Future of Video Coding
- Author
-
Jiaying Liu, Wen-Hsiao Peng, Hsueh-Ming Hang, Shan Liu, Dong Xu, Gary J. Sullivan, C.-C. Jay Kuo, and Nam Ling
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
39. Acceleration of Subspace Learning Machine via Particle Swarm Optimization and Parallel Processing
- Author
-
Hongyu Fu, Yijing Yang, Yuhuai Liu, Joseph Lin, Ethan Harrison, Vinod K. Mishra, and C.-C. Jay Kuo
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Machine Learning (cs.LG)
- Abstract
Built upon the decision tree (DT) classification and regression idea, the subspace learning machine (SLM) has been recently proposed to offer higher performance in general classification and regression tasks. Its performance improvement is reached at the expense of higher computational complexity. In this work, we investigate two ways to accelerate SLM. First, we adopt the particle swarm optimization (PSO) algorithm to speed up the search of a discriminant dimension that is expressed as a linear combination of current dimensions. The search of optimal weights in the linear combination is computationally heavy. It is accomplished by probabilistic search in the original SLM. The acceleration of SLM by PSO requires 10-20 times fewer iterations. Second, we leverage parallel processing in the SLM implementation. Experimental results show that the accelerated SLM method achieves a speedup factor of 577 in training time while maintaining classification/regression performance comparable to that of the original SLM.
- Published
- 2022
- Full Text
- View/download PDF
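The particle swarm optimization (PSO) step mentioned in entry 39 can be illustrated with a generic, textbook-style sketch. It searches for a weight vector that minimizes a stand-in quadratic objective; the SLM objective for scoring a discriminant dimension is not given in the abstract, so `toy_objective`, the function name `pso_minimize`, and all hyperparameter values below are assumptions rather than the authors' settings.

```python
import random


def pso_minimize(objective, dim, num_particles=20, iterations=200, seed=0,
                 inertia=0.7, cognitive=1.5, social=1.5, bound=5.0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positions = [[rng.uniform(-bound, bound) for _ in range(dim)]
                 for _ in range(num_particles)]
    velocities = [[0.0] * dim for _ in range(num_particles)]
    personal_best = [p[:] for p in positions]
    personal_value = [objective(p) for p in positions]
    best_index = min(range(num_particles), key=lambda i: personal_value[i])
    global_best = personal_best[best_index][:]
    global_value = personal_value[best_index]

    for _ in range(iterations):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to own best + pull to swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            value = objective(positions[i])
            if value < personal_value[i]:
                personal_best[i] = positions[i][:]
                personal_value[i] = value
                if value < global_value:
                    global_best = positions[i][:]
                    global_value = value
    return global_best, global_value


def toy_objective(weights):
    # Stand-in cost with its minimum at weights = (1, -2).
    return (weights[0] - 1.0) ** 2 + (weights[1] + 2.0) ** 2


best_weights, best_cost = pso_minimize(toy_objective, dim=2)
```

In an SLM-like setting the objective would presumably score how well the projection defined by `weights` separates the data; that part is deliberately left out here.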
40. UHP-SOT++: An Unsupervised Lightweight Single Object Tracker
- Author
-
C.-C. Jay Kuo, Suya You, Hongyu Fu, and Zhiruo Zhou
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
41. PAGER: Progressive Attribute-Guided Extendable Robust Image Generation
- Author
-
C.-C. Jay Kuo and Zohreh Azizi
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
42. Recurrent Neural Networks and Their Memory Behavior: A Survey
- Author
-
C.-C. Jay Kuo and Yuanhang Su
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
43. Just Rank: Rethinking Evaluation with Word and Sentence Similarities
- Author
-
Bin Wang, C.-C. Jay Kuo, and Haizhou Li
- Subjects
FOS: Computer and information sciences ,Computer Science - Computation and Language ,Artificial Intelligence (cs.AI) ,Computer Science - Artificial Intelligence ,Computation and Language (cs.CL)
- Abstract
Word and sentence embeddings are useful feature representations in natural language processing. However, intrinsic evaluation for embeddings lags far behind, and there has been no significant update in the past decade. Word and sentence similarity tasks have become the de facto evaluation method. This leads models to overfit to such evaluations, negatively impacting embedding models' development. This paper first points out the problems with using semantic similarity as the gold standard for word and sentence embedding evaluations. Further, we propose a new intrinsic evaluation method called EvalRank, which shows a much stronger correlation with downstream tasks. Extensive experiments are conducted based on 60+ models and popular datasets to certify our judgments. Finally, the practical evaluation toolkit is released for future benchmarking purposes. Comment: Accepted as Main Conference for ACL 2022. Code: https://github.com/BinWang28/EvalRank-Embedding-Evaluation
- Published
- 2022
- Full Text
- View/download PDF
44. Enhancing Image Rescaling using Dual Latent Variables in Invertible Neural Network
- Author
-
Min Zhang, Zhihong Pan, Xin Zhou, and C.-C. Jay Kuo
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,I.4.5 ,Image and Video Processing (eess.IV) ,Computer Science - Computer Vision and Pattern Recognition ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
Normalizing flow models have been used successfully for generative image super-resolution (SR) by approximating the complex distribution of natural images with a simple tractable distribution in latent space through Invertible Neural Networks (INN). These models can generate multiple realistic SR images from one low-resolution (LR) input using randomly sampled points in the latent space, simulating the ill-posed nature of image upscaling where multiple high-resolution (HR) images correspond to the same LR image. Lately, the invertible process in INN has also been used successfully by bidirectional image rescaling models like IRN and HCFlow for joint optimization of downscaling and inverse upscaling, resulting in significant improvements in upscaled image quality. While they are optimized for image downscaling too, the ill-posed nature of image downscaling, where one HR image could be downsized to multiple LR images depending on different interpolation kernels and resampling methods, is not considered. A new downscaling latent variable, in addition to the original one representing uncertainties in image upscaling, is introduced to model variations in the image downscaling process. This dual latent variable enhancement is applicable to different image rescaling models, and extensive experiments show that it can improve image upscaling accuracy consistently without sacrificing image quality in downscaled LR images. It is also shown to be effective in enhancing other INN-based models for image restoration applications like image hiding. Comment: Accepted by ACM Multimedia 2022
- Published
- 2022
- Full Text
- View/download PDF
45. American Sign Language Fingerspelling Recognition in the Wild with Spatio Temporal Feature Extraction and Multi-task Learning
- Author
-
Peerawat Pannattee, Wuttipong Kumwilaisak, Chatchawarn Hansakunbuntheung, Nattanun Thatphithakkul, and C.-C. Jay Kuo
- Published
- 2022
- Full Text
- View/download PDF
46. Bridging Gap between Image Pixels and Semantics via Supervision: A Survey
- Author
-
C.-C. Jay Kuo and Jiali Duan
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
47. RGGID: A Robust and Green GAN-Fake Image Detector
- Author
-
C.-C. Jay Kuo, Hong-Shuo Chen, Ronald Salloum, Xinyu Wang, and Yao Zhu
- Subjects
Signal Processing ,Information Systems
- Published
- 2022
- Full Text
- View/download PDF
48. Green learning: Introduction, examples and outlook
- Author
-
C.-C. Jay Kuo and Azad M. Madni
- Subjects
Signal Processing ,Media Technology ,Computer Vision and Pattern Recognition ,Electrical and Electronic Engineering
- Published
- 2023
- Full Text
- View/download PDF
49. Segmentation of Cardiac Structures via Successive Subspace Learning with Saab Transform from Cine MRI
- Author
-
Xiaofeng Liu, Fangxu Xing, Hanna K. Gaggin, Weichung Wang, C.-C. Jay Kuo, Georges El Fakhri, and Jonghye Woo
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer Vision and Pattern Recognition (cs.CV) ,Heart Ventricles ,Image and Video Processing (eess.IV) ,Computer Science - Computer Vision and Pattern Recognition ,FOS: Electrical engineering, electronic engineering, information engineering ,Image Processing, Computer-Assisted ,Magnetic Resonance Imaging, Cine ,Heart ,Neural Networks, Computer ,Electrical Engineering and Systems Science - Image and Video Processing ,Machine Learning (cs.LG)
- Abstract
Assessment of cardiovascular disease (CVD) with cine magnetic resonance imaging (MRI) has been used to non-invasively evaluate detailed cardiac structure and function. Accurate segmentation of cardiac structures from cine MRI is a crucial step for early diagnosis and prognosis of CVD, and has been greatly improved with convolutional neural networks (CNN). However, a number of limitations have been identified in CNN models, such as limited interpretability and high complexity, which limit their use in clinical practice. In this work, to address these limitations, we propose a lightweight and interpretable machine learning model, successive subspace learning with the subspace approximation with adjusted bias (Saab) transform, for accurate and efficient segmentation from cine MRI. Specifically, our segmentation framework comprises the following steps: (1) sequential expansion of near-to-far neighborhood at different resolutions; (2) channel-wise subspace approximation using the Saab transform for unsupervised dimension reduction; (3) class-wise entropy guided feature selection for supervised dimension reduction; (4) concatenation of features and pixel-wise classification with gradient boosting; and (5) conditional random field for post-processing. Experimental results on the ACDC 2017 segmentation database showed that our framework performed better than state-of-the-art U-Net models with 200× fewer parameters in delineating the left ventricle, right ventricle, and myocardium, thus showing its potential to be used in clinical practice. Comment: 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2021)
- Published
- 2021
50. Guest Editorial Introduction to the Special Issue on Recent Advances in Point Cloud Processing and Compression
- Author
-
Zhu Li, Shan Liu, Frederic Dufaux, Li Li, Ge Li, C.-C. Jay Kuo, University of Missouri [Kansas City] (UMKC), University of Missouri System, University of Electronic Science and Technology of China (UESTC), Laboratoire des signaux et systèmes (L2S), CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Laboratory of Wildlife Management and Ecosystem Health, Yunnan University of Finance and Economics, Kunming, DeepBlueAI [Shanghai], and University of Southern California (USC)
- Subjects
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,02 engineering and technology ,Electrical and Electronic Engineering ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,ComputingMilieux_MISCELLANEOUS
- Published
- 2021
- Full Text
- View/download PDF