7 results on '"YASUKO MATSUBARA"'
Search Results
2. Multi-aspect Mining of Complex Sensor Sequences
- Author
- Yasuko Matsubara, Mutsumi Abe, Yasushi Sakurai, Takato Honda, and Ryo Neyama
- Subjects
Computer science; Feature extraction; 02 engineering and technology; Sensor fusion; Automatic summarization; 020204 information systems; Outlier; Scalability; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Segmentation; Data mining
- Abstract
In recent years, a massive amount of time-stamped sensor data has been generated and collected by many Internet of Things (IoT) applications, such as advanced automobiles and health care devices. Given such a large collection of complex sensor sequences, which consists of multiple attributes (e.g., sensor, user, timestamp), how can we automatically find important dynamic time-series patterns and the points of variation? How can we summarize all the complex sensor sequences, and achieve a meaningful segmentation? Also, can we see any hidden user-specific differences and outliers? In this paper we present CUBEMARKER, an efficient and effective method for capturing such multi-aspect features in sensor sequences. CUBEMARKER performs multi-way summarization for all attributes, namely, sensors, users, and time, and specifically it extracts multi-aspect features, such as important time-series patterns (i.e., time-aspect features) and hidden groups of users (i.e., user-aspect features), in complex sensor sequences. Our proposed method has the following advantages: (a) It is effective: it extracts multi-aspect features from complex sensor sequences and enables the efficient and effective analysis of complicated datasets; (b) It is automatic: it requires no prior training and no parameter tuning; (c) It is scalable: our method is carefully designed to be linear as regards dataset size and applicable to a large number of sensor sequences. Extensive experiments on real datasets show that CUBEMARKER is effective in that it can capture meaningful patterns for various real-world datasets, such as those obtained from smart factories, human activities, and automobiles. CUBEMARKER consistently outperforms the best state-of-the-art methods in terms of both accuracy and execution speed.
- Published
- 2019
3. Automatic Mining of Large IoT Sensor Tensor
- Author
- Yasuko Matsubara, Takato Honda, and Yasushi Sakurai
- Subjects
Series (mathematics); Computer science; 02 engineering and technology; 020204 information systems; Tensor (intrinsic definition); Scalability; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; Time series; Hidden Markov model; Internet of Things; Representation (mathematics)
- Abstract
Given a large collection of multiple time-evolving sensor sequences, how can we capture the transitions of time series patterns? How can we find individual differences between different sequences? In this paper we present CUBEMARKER, an effective method for capturing multi-aspect features of sensor sequences, which provides a compact and powerful representation of sensor behavior. Our second contribution is a novel, scalable, and parameter-free algorithm. CUBEMARKER performs two-way mining for all attributes. Specifically, it simultaneously discovers multi-aspect time series patterns (e.g., human motion, smart factory) and groups of patterns. Extensive experiments on real datasets show that CUBEMARKER is effective in that it can capture meaningful patterns for various real sensor datasets.
- Published
- 2018
4. Data Stream Analysis of Online Activities
- Author
- Yasuko Matsubara, Koki Kawabata, and Yasushi Sakurai
- Subjects
Data stream mining; Event (computing); Computer science; Volume (computing); 02 engineering and technology; Data modeling; Set (abstract data type); 020204 information systems; Scalability; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; Hidden Markov model; Streaming algorithm
- Abstract
Given a large volume of multiple data streams, such as online web-click logs and sensor data, how can we discover typical patterns and compress them into compact models? How can we incrementally distinguish multiple patterns while considering the information obtained from a pattern found in a streaming setting? In this paper, we propose a streaming algorithm, namely STREAMSCOPE, that can find intuitive patterns efficiently from event streams evolving over time. Our method has the following properties: (a) Effective: It operates on semi-infinite data streams and summarizes all the streams into a set of multiple discrete segments grouped by their similarities. (b) Automatic: It automatically and incrementally recognizes such patterns and generates models for each of them if necessary. (c) Scalable: The complexity of our method does not depend on the length of the input streams. Our experiments on real datasets demonstrate that STREAMSCOPE can find meaningful patterns and achieve great improvements in terms of computational time and memory space over its full batch method competitors.
- Published
- 2018
5. Fast and Exact Monitoring of Co-Evolving Data Streams
- Author
- Naonori Ueda, Yasushi Sakurai, Masatoshi Yoshikawa, and Yasuko Matsubara
- Subjects
Data stream mining; Computer science; Markov model; Viterbi algorithm; Exact algorithm; Subsequence; Outlier; Algorithm design; Forward algorithm; Data mining; Hidden Markov model
- Abstract
Given a huge stream of multiple co-evolving sequences, such as motion capture and web-click logs, how can we find meaningful patterns and spot anomalies? Our aim is to monitor data streams statistically, and find subsequences that have the characteristics of a given hidden Markov model (HMM). For example, consider an online web-click stream, where massive amounts of access logs of millions of users are continuously generated every second. How can we find meaningful building blocks and typical access patterns, such as weekday/weekend patterns, and also detect anomalies and intrusions? In this paper, we propose StreamScan, a fast and exact algorithm for monitoring multiple co-evolving data streams. Our method has the following advantages: (a) it is effective, leading to novel discoveries and surprising outliers, (b) it is exact, and we theoretically prove that StreamScan guarantees the exactness of the output, (c) it is fast, and requires O(1) time and space per time-tick. Our experiments on 67GB of real data illustrate that StreamScan does indeed detect the qualifying subsequence patterns correctly and that it can offer great improvements in speed (up to 479,000 times) over its competitors.
- Published
- 2014
6. Scalable Algorithms for Distribution Search
- Author
- Yasuko Matsubara, Masatoshi Yoshikawa, and Yasushi Sakurai
- Subjects
Full table scan; Reduction (complexity); Kullback–Leibler divergence; Speedup; Market segmentation; Nearest neighbor search; Outlier; Search cost; Data mining; Mathematics
- Abstract
Distribution data naturally arise in countless domains, such as meteorology, biology, geology, industry and economics. However, relatively little attention has been paid to data mining for large distribution sets. Given n distributions of multiple categories and a query distribution Q, we want to find similar clouds (i.e., distributions), to discover patterns, rules and outlier clouds. For example, consider the numerical case of sales of items, where, for each item sold, we record the unit price and quantity; then, each customer is represented as a distribution of 2-d points (one for each item he/she bought). We want to find similar users, e.g., for market segmentation or anomaly/fraud detection. We propose to address this problem and present D-Search, which includes fast and effective algorithms for similarity search in large distribution datasets. Our main contributions are (1) approximate KL divergence, which can speed up cloud-similarity computations, and (2) multi-step sequential scan, which efficiently prunes a significant number of search candidates and leads to a direct reduction in the search cost. We also introduce an extended version of D-Search: (3) time-series distribution mining, which finds similar subsequences in time-series distribution datasets. Extensive experiments on real multi-dimensional datasets show that our solution achieves up to 2,300 times faster wall-clock time than the naive implementation without sacrificing accuracy.
- Published
- 2009
7. Development of a Desktop Search System Using Correlation between User's Schedule and Data in a Computer
- Author
- Yasuko Matsubara and Ichiro Kobayashi
- Published
- 2007