Author: "Xu, Shuo" / Topic: machine learning - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Shuo"' showing total 15 results

1. FMDADA: Federated multi-discriminative adversarial domain adaptation

Author: Chi, Hao, Xia, Hui, Xu, Shuo, He, Yusheng, and Hu, Chunqiang
Published: 2024
Full Text: View/download PDF

2. Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets

Author: Xu Shuo, Zhang Yuefu, An Xin, and Pi Sainan
Subjects: multi-label classification, real-world datasets, hierarchical structure, classification system, label correlation, machine learning, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Many science, technology and innovation (STI) resources are attached with several different labels. To assign automatically the resulting labels to an interested instance, many approaches with good performance on the benchmark datasets have been proposed for multilabel classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multilabel patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to evaluate comprehensively seven multi-label classification methods on real-world datasets.
Published: 2024
Full Text: View/download PDF

3. A Deep Learning Based Anomaly Detection Model for IoT Networks

Author: Dai, Li E., Wang, Xiao, Xu, Shuo Bo, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Dong, Jian, editor, Zhang, Long, editor, and Cheng, Deqiang, editor
Published: 2024
Full Text: View/download PDF

4. Estimating Spatiotemporal Fishing Effort of Trawlers with Vessel-Monitoring System Data: A Case Study of the Sea Area of the Bohai Sea and the Yellow Sea, China.

Author: Li, Dan, Lu, Feng, Xu, Shuo, Liu, Huiyuan, Xue, Muhan, Cui, Guohui, Ma, Zhenhua, Fang, Hui, and Wang, Yu
Subjects: MACHINE learning, FEATURE extraction, FISHERY resources, FISHING, FISHERIES, BOOSTING algorithms
Abstract: Measuring the distribution of the fishing effort of trawlers is of great significance for describing marine fishery activities, quantifying fishing systems in terms of marine ecological pressure, and revising the regulations of fishing. The purpose of this paper is to develop an efficient learning algorithm to detect the fishing behavior of trawlers to analyze the distribution of fishing effort. The vessel-monitoring system data of more than 4600 trawlers from September 2019 to April 2023 were used for feature extraction. According to the spatiotemporal information provided by the vessel position data, 11-dimensional features were extracted to form the feature vectors. A Slime Mould Algorithm-optimized Light Gradient-Boosting Machine (SMA-LightGBM) algorithm was proposed to classify the feature vectors to recognize fishing behavior. The presented method showed a remarkable generalization ability and high accuracy, sensitivity, specificity, and Matthews correlation coefficient in the test results, with scores of 98.23%, 98.75%, 97.75%, and 0.9646, respectively. Subsequently, the trained model was used to identify the fishing behavior of trawlers belonging to the coastal provinces of the Bohai Sea and the Yellow Sea in the sea area of 117°E ~ 132°E, 26°N ~ 41°N. The fishing effort was calculated and evaluated according to the fishing behavior recognition results. The mean absolute error was 0.3031 kW·h, and the coefficient of determination score was 0.9772. The thermal map of the fishing effort of the trawler was mapped, and the spatiotemporal characteristics were estimated in the region of interest from 2019 to 2023 with a spatial resolution of 1/8 degree × 1/8 degree. This method is an efficient way of analyzing the spatiotemporal characteristics of the fishing effort of trawlers. It provides a quantitative basis for the assessment of fishery resources and can inform fishing policies. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. K-Base: Platform to Build the Knowledge Base for an Intelligent Service

Author: Shin, Sungho, Um, Jung-Ho, Choi, Sung-Pil, Jung, Hanmin, Xu, Shuo, Zhu, Lijun, Park, James J. (Jong Hyuk), editor, Adeli, Hojjat, editor, Park, Namje, editor, and Woungang, Isaac, editor
Published: 2014
Full Text: View/download PDF

6. Important citations identification by exploiting generative model into discriminative model.

Author: An, Xin, Sun, Xin, Xu, Shuo, Hao, Liyuan, and Li, Jinghong
Subjects: SCIENTIFIC knowledge, MACHINE learning, CONVOLUTIONAL neural networks, DEEP learning, SUPPORT vector machines, SUCCESS, CITATION indexes
Abstract: Although the citations between scientific documents are deemed as a vehicle for dissemination, inheritance and development of scientific knowledge, not all citations are well-positioned to be equal. A plethora of taxonomies and machine-learning models have been implemented to tackle the task of citation function and importance classification from qualitative aspect. Inspired by the success of kernel functions from resulting general models to promote the performance of the support vector machine (SVM) model, this work exploits the potential of combining generative and discriminative models for the task of citation importance classification. In more detail, generative features are generated from a topic model, citation influence model (CIM) and then fed to two discriminative traditional machine-learning models, SVM and RF (random forest), and a deep learning model, convolutional neural network (CNN), with other 13 traditional features to identify important citations. The extensive experiments are performed on two data sets with different characteristics. These three models perform better on the data set from one discipline. It is very possible that the patterns for important citations may vary by the fields, which disable machine-learning models to learn effectively the discriminative patterns from publications from multiple domains. The RF classifier outperforms the SVM classifier, which accords with many prior studies. However, the CNN model does not achieve the desired performance due to small-scaled data set. Furthermore, our CIM model–based features improve further the performance for identifying important citations. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

7. The CHEMDNER corpus of chemicals and drugs and its annotation principles

Author: Krallinger, Martin, Rabal, Obdulia, Leitner, Florian, Vazquez, Miguel, Salgado, David, Lu, Zhiyong, Leaman, Robert, Lu, Yanan, Ji, Donghong, Lowe, Daniel M, Sayle, Roger A, Batista-Navarro, Riza Theresa, Rak, Rafal, Huber, Torsten, Rocktäschel, Tim, Matos, Sérgio, Campos, David, Tang, Buzhou, Xu, Hua, Munkhdalai, Tsendsuren, Ryu, Keun Ho, Ramanan, SV, Nathan, Senthil, Žitnik, Slavko, Bajec, Marko, Weber, Lutz, Irmer, Matthias, Akhondi, Saber A, Kors, Jan A, Xu, Shuo, An, Xin, Sikdar, Utpal Kumar, Ekbal, Asif, Yoshioka, Masaharu, Dieb, Thaer M, Choi, Miji, Verspoor, Karin, Khabsa, Madian, Giles, C Lee, Liu, Hongfang, Ravikumar, Komandur Elayavilli, Lamurias, Andre, Couto, Francisco M, Dai, Hong-Jie, Tsai, Richard Tzong-Han, Ata, Caglar, Can, Tolga, Usié, Anabel, Alves, Rui, Segura-Bedmar, Isabel, Martínez, Paloma, Oyarzabal, Julen, and Valencia, Alfonso
Published: 2015
Full Text: View/download PDF

8. Multisource domain factorization network for cross-domain fault diagnosis of rotating machinery: An unsupervised multisource domain adaptation method

Author: Ding Xue, Shun Zhang, Xu Shuo, Shi Yaowei, Jing Li, and Aidong Deng
Subjects: Generalization, Computer science, Mechanical Engineering, Aerospace Engineering, Negative transfer, Machine learning, Computer Science Applications, Domain (software engineering), Control and Systems Engineering, Signal Processing, Feature (machine learning), Artificial intelligence, Entropy (energy dispersal), Representation (mathematics), Transfer of learning, Focus (optics), Civil and Structural Engineering
Abstract: Unsupervised domain adaptation (DA) provides a promising approach for tackling fault diagnosis tasks of target datasets without labeled data and has been actively studied in recent years. Most of them focus only on single-source DA, compared to multisource DA (MDA), which has remarkable advantages in generalized knowledge learning and generalization performance. Nevertheless, there are very few fault diagnosis studies based on MDA, and it remains challenging to reduce multiple domain shifts to improve diagnostic performance and mitigate negative transfer during learning. To this end, a novel unsupervised MDA-based transfer learning approach called multisource domain factorization network (MDFN) is proposed in this paper, where the generalized diagnosis knowledge is learned from multiple sources and then used for diagnosing the target task. The highlights of MDFN are that the shared-space component analysis and transferability-based entropy penalty strategy are employed to significantly mitigate negative transfer from the two levels of feature representation and instance transferability and effectively learn shared feature representation. Therefore, the MDFN can extract shared features that combine domain-invariance and discriminability, thereby performing better. The results of two experimental cases on six datasets, including cross-operating-condition and cross-component diagnosis tasks, validate the effectiveness and superiority of the proposed method.
Published: 2022

9. Learning Efficient Hash Codes for Fast Graph-Based Data Similarity Retrieval.

Author: Wang, Jinbao, Xu, Shuo, Zheng, Feng, Lu, Ke, Song, Jingkuan, and Shao, Ling
Subjects: *INFORMATION retrieval, *REPRESENTATIONS of graphs, *MACHINE learning, *GRAPH algorithms, *COMPUTER vision, *VISUAL fields
Abstract: Traditional operations, e.g. graph edit distance (GED), are no longer suitable for processing the massive quantities of graph-structured data now available, due to their irregular structures and high computational complexities. With the advent of graph neural networks (GNNs), the problems of graph representation and graph similarity search have drawn particular attention in the field of computer vision. However, GNNs have been less studied for efficient and fast retrieval after graph representation. To represent graph-based data, and maintain fast retrieval while doing so, we introduce an efficient hash model with graph neural networks (HGNN) for a newly designed task (i.e. fast graph-based data retrieval). Due to its flexibility, HGNN can be implemented in both an unsupervised and supervised manner. Specifically, by adopting a graph neural network and hash learning algorithms, HGNN can effectively learn a similarity-preserving graph representation and compute pair-wise similarity or provide classification via low-dimensional compact hash codes. To the best of our knowledge, our model is the first to address graph hashing representation in the Hamming space. Our experimental results reach comparable prediction accuracy to full-precision methods and can even outperform traditional models in some cases. In real-world applications, using hash codes can greatly benefit systems with smaller memory capacities and accelerate the retrieval speed of graph-structured data. Hence, we believe the proposed HGNN has great potential in further research. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

10. Simple Interrogative Sentence Analysis Based on CRF

Author: Wang Zheng, Yan Yingying, Xu Shuo, Zhang Ning, Zhu Li-Jun, and Li Weifeng
Subjects: Conditional random field, Sequence, Process (engineering), Computer science, Software engineering, Engineering and technology, Interrogative, Machine learning, Knowledge-based systems, Knowledge extraction, Electrical engineering, electronic engineering, information engineering, Question answering, Artificial intelligence & image processing, Artificial intelligence, Natural language processing, Simple (philosophy)
Abstract: This paper intends to enhance simple interrogative sentence analysis, which helps a question answering system understand "What is this question asking?". [Methods] When simple interrogative sentence analysis is treated as a sequence labelling problem, a Conditional Random Field (CRF) model can process it well. [Results] A small number of manual labels can lead to improved results. [Limitations] Processing non-factual questions requires more than the defined label system supports. [Conclusions] Using a Conditional Random Field model to process question analysis, treated as a sequence labelling problem, can improve handling capacity at relatively little cost.
Published: 2016

11. Which type of dynamic indicators should be preferred to predict patent commercial potential?

Author: Yang, Guancan, Lu, Guoxuan, Xu, Shuo, Chen, Liang, and Wen, Yuxin
Subjects: PATENTS, MACHINE learning, ARTIFICIAL intelligence, DIGITAL technology, TECHNOLOGICAL forecasting, COVID-19 pandemic
Abstract: The current patent value evaluations increasingly focus on serving realistic predictive scenarios, emphasizing the commercial potential of patents at the early stage from the ex-ante perspective. This requirement poses a serious challenge: those classical dynamic indicators that have been proved to be effective in the literature may not be valid for commercial patent potential prediction from the ex-ante perspective. Thereupon, this study groups the dynamic indicators into cross-sectional indicators and longitudinal indicators. Then, a patent commercial potential prediction framework is proposed from the ex-ante perspective, in which the impact of the chronological order on predictive models is investigated comprehensively. More specifically, this study collects the USPTO cancer-related dataset from 2003 to 2013 as the training set, and combines three dynamic indicators (cross-sectional, longitudinal, and mixed) with classical static indicators to test the prediction performance for the following five years (2014–2018). The biased results caused by the ex-post perspective are indeed observed, and the longitudinal indicators are more sensitive to commercial patent potential, especially in the early stage. The effect of the ex-post perspective will gradually weaken over time, and the cross-sectional indicators provide stable prediction performance three years later. These findings will be helpful for subsequent improvements of commercial patent potential prediction models. • To exploit impact of various indicators on predicting patent commercial potential with a supervised machine learning method. • A framework embedding out-of-time & over sampling methods is raised to predict patent commercial potential from ex-ante view. • Dynamic indicators encode essential information on the commercial potential of a focal patent. • Longitudinal indicators are more sensitive to the commercial patent potential, especially at the early stage. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

12. Patent representation learning with a novel design of patent ontology: Case study on PEM patents.

Author: Zhai, Dongsheng, Zhai, Liang, Li, Mengyang, He, Xijun, Xu, Shuo, and Wang, Feifei
Subjects: PATENTS, TECHNOLOGICAL innovations, MINERAL industries, MACHINE learning, DATA analysis
Abstract: Under the background of innovation-driven knowledge economy globalization, comprehensive and insightful patent technology information mining can help enterprises win the first-mover advantage in the increasingly fierce technology competition. However, existing machine learning-based methods do not entirely incorporate the characteristics of patent technology of technology composition and technology association at the micro-level and macro-level, making it difficult to mine detailed and comprehensive patent information. To fill this research gap, firstly, we conduct a comprehensive analysis from the micro-level technology composition perspective of patent documents and the macro-level technology association perspective of patent data involved in the technology field, and then we design a novel patent ontology that includes the entity of patent, function, solution and application field. Secondly, we create a patent heterogeneous network with the help of the proposed patent ontology and the technology association. Finally, to fully use the patent technology characteristics, we develop a heterogeneous graph embedding algorithm to embed this information into the patent representation, and the experiments done on non-perfluorinated proton exchange membrane patent data show that our method produces better patent representation than the comparable models. Furthermore, we utilize the patent representation to perform case study to confirm the method's reliability and practicability. • Propose an overall methodology to learn the patent representation. • Design a patent ontology that considers the patent technology's characteristics. • Combine patent ontology and heterogeneous network embedding algorithm. • Conduct various experiments to validate the merits of our method. • Generate patent representation to serve multiple patent analysis tasks. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

13. Prediction of core cancer genes using multi-task classification framework

Author: Gao, Shan, Xu, Shuo, Fang, Yaping, and Fang, Jianwen
Subjects: *CANCER genes, *CARCINOGENESIS, *GENE expression, *MACHINE learning, *SYSTEMS biology, *PREDICTION theory, *COMPUTER multitasking
Abstract: Cancer is deemed as a highly heterogeneous disease specific to cell type and tissue origin. All cancers, however, share a common pathogenesis. Therefore, it is widely believed that cancers may share common mechanisms. In this study, we introduce a novel strategy based on multi-tasking learning methods to predict core cancer genes shared by multiple cancers in the hope of elucidating common cancer mechanisms. Our strategy uses two multi-tasking learning algorithms, one for feature selection and the other for validation of selected features. The combined use of two methods results in more robust classifiers and reliable selected features. The top 73 significant features, mapped to 72 genes, are selected as core cancer genes. The effectiveness of the 73 features is further demonstrated in a blind test conducted on an independent test data. The biological significance of these genes is evaluated using systems biology analyses. Extensive functional, pathway and network analysis confirms findings in previous studies and brings new insights into common cancer mechanisms. Our strategy can be used as a general method to find important genes from large gene expression datasets on the genomic level. The selected genes can be used to predict cancers. [Copyright Elsevier]
Published: 2013
Full Text: View/download PDF

14. Emerging research topics detection with multiple machine learning models.

Author: Xu, Shuo, Hao, Liyuan, An, Xin, Yang, Guancan, and Wang, Feifei
Subjects: MACHINE learning, GIBBS sampling, GENOME editing
Abstract: • Several machine learning models are together used to detect and foresight the emerging research topics. • The following indicators are operationalized: radical novelty, relatively fast growth, coherence and scientific impact. • As for the CIM model, the collapsed Gibbs sampling is done separately for the cited and citing publication parts. • Experimental results on gene editing dataset show that it is feasible to identify emerging research topics with our framework. Emerging research topic detection can benefit the research foundations and policy-makers. With the long-term and recent interest in detecting emerging research topics, various approaches are proposed in the literature. Though, there is still a lack of well-established linkages between the clear conceptual definition of emerging research topics and the proposed indicators for operationalization. This work follows the definition by Wang (2018) , and several machine learning models are together used to detect and foresight the emerging research topics. Finally, experimental results on gene editing dataset discover three emerging research topics, which make clear that it is feasible to identify emerging research topics with our framework. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

15. Multisource domain factorization network for cross-domain fault diagnosis of rotating machinery: An unsupervised multisource domain adaptation method.

Author: Shi, Yaowei, Deng, Aidong, Ding, Xue, Zhang, Shun, Xu, Shuo, and Li, Jing
Subjects: *FAULT diagnosis, *ROTATING machinery, *FACTORIZATION, *MACHINE learning, *DIAGNOSIS, *ENTROPY (Information theory)
Abstract: • A novel MDFN is proposed for cross-domain fault diagnosis of rotating machinery. • The domain factorization strategy is elaborated to learn domain-invariant features. • The IET loss term is designed to avoid the interference of "bad samples". • Significant mitigation of negative transfer. Unsupervised domain adaptation (DA) provides a promising approach for tackling fault diagnosis tasks of target datasets without labeled data and has been actively studied in recent years. Most of them focus only on single-source DA, compared to multisource DA (MDA), which has remarkable advantages in generalized knowledge learning and generalization performance. Nevertheless, there are very few fault diagnosis studies based on MDA, and it remains challenging to reduce multiple domain shifts to improve diagnostic performance and mitigate negative transfer during learning. To this end, a novel unsupervised MDA-based transfer learning approach called multisource domain factorization network (MDFN) is proposed in this paper, where the generalized diagnosis knowledge is learned from multiple sources and then used for diagnosing the target task. The highlights of MDFN are that the shared-space component analysis and transferability-based entropy penalty strategy are employed to significantly mitigate negative transfer from the two levels of feature representation and instance transferability and effectively learn shared feature representation. Therefore, the MDFN can extract shared features that combine domain-invariance and discriminability, thereby performing better. The results of two experimental cases on six datasets, including cross-operating-condition and cross-component diagnosis tasks, validate the effectiveness and superiority of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
