On the Mining of the Minimal Set of Time Series Data Shapelets
- Author
- Soukaina Filali Boubrahimi, Ruizhe Ma, Rafal A. Angryk, and Shah Muhammad Hamdi
- Subjects
- Artificial neural network, Computer science, Big data, Machine learning, Convolutional neural network, Set (abstract data type), Discriminative model, Classifier, Pruning (decision trees), Artificial intelligence, Time series
- Abstract
Shapelets, also known as motifs, are time series subsequences that discriminate between time series classes. Lately, shapelet studies have gained momentum due to their interpretable nature: as opposed to traditional time series classifiers, shapelet-based learners provide a visual representation of the pattern that triggers the classification decision. One of the most challenging issues of shapelet-based classifiers is that they generate a large number of shapelets. To the best of our knowledge, this is the first effort to address the high numerosity of mined shapelets by mining the minimal set of discriminative shapelets for time series data. We propose a new shapelet mining learner, 1DCNN, that learns shapelets of different lengths using a black-box neural network model. 1DCNN optimizes the entire classification schema by learning the shapes of the representative patterns. Our proposed model uses network pruning to sparsify the network and keep only the most discriminative shapelets without compromising classification accuracy. We validated our model using 59 real-world time series datasets from the UCR repository. Our experimental results show the effectiveness and efficiency of our approach in comparison with competing baselines. For fairness, we did not compare 1DCNN with ensemble-based approaches that encapsulate many learners. Our results show that our model outperforms all other baselines in the shapelet-based classifier category, with up to 95% fewer floating-point operations (FLOPs) required by the network.
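The abstract rests on two ingredients: the classic shapelet distance (the minimum Euclidean distance between a candidate shapelet and any equal-length subsequence of a series) and pruning to retain only a small set of discriminative filters. The NumPy sketch below illustrates both ideas under stated assumptions; it is not the authors' 1DCNN implementation, and the function names, the magnitude-based pruning criterion, and the `keep_ratio` parameter are illustrative choices.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    equal-length subsequence of the series (the standard
    shapelet-to-series distance)."""
    m = len(shapelet)
    # All sliding windows of length m, as rows of a 2-D view.
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return np.min(np.linalg.norm(windows - shapelet, axis=1))

def prune_filters(filters, keep_ratio=0.05):
    """Illustrative magnitude-based pruning: keep only the filters
    (candidate shapelets) with the largest L2 norms. This mirrors
    the idea of sparsifying the network down to a minimal set of
    discriminative shapelets; the actual criterion in the paper
    may differ."""
    norms = np.linalg.norm(filters, axis=1)
    k = max(1, int(len(filters) * keep_ratio))
    keep = np.argsort(norms)[-k:]          # indices of the k largest norms
    return filters[keep]

# A series containing the bump [1, 2, 1] matches that shapelet exactly,
# so its shapelet distance is zero.
series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
bump = np.array([1.0, 2.0, 1.0])
d = shapelet_distance(series, bump)

# Pruning 100 candidate filters at keep_ratio=0.05 retains 5 of them.
rng = np.random.RandomState(0)
candidates = rng.randn(100, 8)
kept = prune_filters(candidates, keep_ratio=0.05)
```

A distance of zero signals a perfect occurrence of the pattern; at classification time, such per-shapelet minimum distances act as interpretable features, which is why pruning the filter bank directly shrinks the explanation the model offers.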
- Published
- 2020