101 results on '"Li, Fuyi"'
Search Results
2. Sign-changing solutions to critical Schrödinger equation with Hartree-type nonlinearity.
- Author
- Zhang, Cui and Li, Fuyi
- Subjects
- *SCHRODINGER equation, *RICCI flow, *INVARIANT sets, *GENERALIZATION
- Abstract
In this paper, we are interested in the following Schrödinger equation $-\Delta u + \lambda V(x) u = (I_\alpha * |u|^p)|u|^{p-2}u + |u|^4 u$ in $\mathbb{R}^3$, where $\lambda$ is a positive parameter, $V \in C(\mathbb{R}, \mathbb{R}^+)$, $\alpha \in (0,3)$, and $p \in (2+\alpha, 3+\alpha)$. Under some reasonable conditions on the potential function $V$, in particular allowing $V$ to have nonisolated zeros, we first establish the existence of a positive ground-state solution and the corresponding energy estimates based on the Nehari manifold. Subsequently, with the help of the quantitative deformation lemma and invariant sets of descending flow, we also obtain the existence of a ground-state sign-changing solution by adopting constrained minimization arguments on the sign-changing Nehari manifold. In this process, a new existence result for zeros, which can be regarded as a generalization of Miranda's theorem, plays an essential role. Besides, the asymptotic behavior of sign-changing solutions is also studied when $\lambda$ tends to infinity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
3. Ground‐state solutions of Schrödinger‐type equation with magnetic field.
- Author
- Li, Fuyi, Zhang, Cui, and Liang, Zhanping
- Subjects
- *MAGNETIC fields, *NONLINEAR equations, *EQUATIONS
- Abstract
In this paper, the nonlinear Schrödinger‐type equation $-(\nabla + iA)^2 u + u + \lambda\left[I_\alpha * (K|u|^2)\right]Ku = a\frac{f(|u|)}{|u|}u$ in $\mathbb{R}^3$ is considered in the presence of magnetic field, where $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$, $\alpha \in (0,3)$, $I_\alpha$ denotes the Riesz potential, $K \in L^p(\mathbb{R}^3)$ is a positive potential for some $p \in (6/(1+\alpha), \infty]$, $a \in L^q(\mathbb{R}^3)\setminus\{0\}$ is a nonnegative potential for some $q \in (3/2, \infty]$, and $f \in C(\mathbb{R}, [0,\infty))$ is assumed to be asymptotically linear at infinity. Under suitable assumptions regarding $A$, $K$, $a$, and $f$, variational methods are used to establish the existence of ground‐state solutions of the above equation for sufficiently small values of the parameter $\lambda$. [ABSTRACT FROM AUTHOR]
- Published
- 2023
4. ProsperousPlus: a one-stop and comprehensive platform for accurate protease-specific substrate cleavage prediction and machine-learning model construction.
- Author
- Li, Fuyi, Wang, Cong, Guo, Xudong, Akutsu, Tatsuya, Webb, Geoffrey I, Coin, Lachlan J M, Kurgan, Lukasz, and Song, Jiangning
- Subjects
- *PREDICTION models, *CELL physiology, *INTEGRATED software, *SELF-efficacy, *BIOINFORMATICS, *PROTEOLYTIC enzymes
- Abstract
Proteases contribute to a broad spectrum of cellular functions. Given a relatively limited amount of experimental data, developing accurate sequence-based predictors of substrate cleavage sites facilitates a better understanding of protease functions and substrate specificity. While many protease-specific predictors of substrate cleavage sites were developed, these efforts are outpaced by the growth of the protease substrate cleavage data. In particular, since data for 100+ protease types are available and this number continues to grow, it becomes impractical to publish predictors for new protease types, and instead it might be better to provide a computational platform that helps users to quickly and efficiently build predictors that address their specific needs. To this end, we conceptualized, developed, tested and released a versatile bioinformatics platform, ProsperousPlus, that empowers users, even those with no programming or little bioinformatics background, to build fast and accurate predictors of substrate cleavage sites. ProsperousPlus facilitates the use of the rapidly accumulating substrate cleavage data to train, empirically assess and deploy predictive models for user-selected substrate types. Benchmarking tests on test datasets show that our platform produces predictors that on average exceed the predictive performance of current state-of-the-art approaches. ProsperousPlus is available as a webserver and a stand-alone software package at http://prosperousplus.unimelb-biotools.cloud.edu.au/. [ABSTRACT FROM AUTHOR]
- Published
- 2023
5. Normalized Solutions to the Critical Choquard-type Equations with Weakly Attractive Potential and Nonlocal Perturbation.
- Author
- Long, Lei, Li, Fuyi, and Rong, Ting
- Subjects
- *LAGRANGE multiplier, *EQUATIONS
- Abstract
In this paper, we look for solutions to the following Choquard-type equation $-\Delta u + (V+\lambda)u = (I_\alpha * |u|^p)|u|^{p-2}u + \mu(I_\alpha * |u|^q)|u|^{q-2}u$ in $\mathbb{R}^N$, having a prescribed mass $\int_{\mathbb{R}^N} u^2 = a > 0$, where $\lambda \in \mathbb{R}$ will arise as a Lagrange multiplier, $N \geqslant 3$, $I_\alpha$ is the Riesz potential, $\alpha \in (0,N)$, $p \in (\bar{\alpha}, 2_\alpha^*]$, $q \in (\bar{\alpha}, 2_\alpha^*)$, $\bar{\alpha} = (N+\alpha+2)/N$ is the mass critical exponent, $2_\alpha^* = (N+\alpha)/(N-2)$ is the Hardy–Littlewood–Sobolev upper critical exponent and $\mu > 0$ is a constant. Under suitable conditions on the potential $V$, the above Choquard-type equation admits a positive ground state normalized solution by comparison arguments; in particular, when $p = 2_\alpha^*$, $\mu$ needs to be larger and the Hardy–Littlewood–Sobolev subcritical approximation method is used. At the end of this paper, a new result on the regularity of solutions and Pohozaev identity to a more general Choquard-type equation is established. [ABSTRACT FROM AUTHOR]
- Published
- 2023
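As a worked instance of the two exponents defined in the abstract above, the following sketch assumes the lowest admissible dimension $N = 3$; that choice is ours, made only for illustration.

```latex
\[
\bar{\alpha} = \left.\frac{N+\alpha+2}{N}\right|_{N=3} = \frac{5+\alpha}{3},
\qquad
2_\alpha^{*} = \left.\frac{N+\alpha}{N-2}\right|_{N=3} = 3+\alpha ,
\]
so for $N = 3$ and $\alpha \in (0,3)$ the admissible ranges read
\[
p \in \left(\tfrac{5+\alpha}{3},\, 3+\alpha\right],
\qquad
q \in \left(\tfrac{5+\alpha}{3},\, 3+\alpha\right).
\]
```

The interval is never empty, since $(5+\alpha)/3 < 3+\alpha$ is equivalent to $-4 < 2\alpha$.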
6. Normalized solutions to the mass supercritical Kirchhoff-type equation with non-trapping potential.
- Author
- Rong, Ting and Li, Fuyi
- Subjects
- *LAGRANGE multiplier, *EQUATIONS
- Abstract
This paper is concerned with the existence of solutions to the Kirchhoff-type equation $-\left(a + b\int_{\mathbb{R}^3}|\nabla u|^2\right)\Delta u + (V+\lambda)u = |u|^{p-2}u + \mu|u|^{q-2}u$ in $\mathbb{R}^3$ under the normalized constraint $\int_{\mathbb{R}^3} u^2 = \rho^2$, where $a, b, \rho > 0$, $14/3 < q < p \leqslant 6$, $\mu > 0$ is a constant, and $\lambda \in \mathbb{R}$ appears as a Lagrange multiplier. Under an explicit assumption on $V$, we can prove the existence of positive ground state solutions to the above equation. A new concentration compactness type result is established to recover compactness in the Sobolev critical case. [ABSTRACT FROM AUTHOR]
- Published
- 2023
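The bounds $14/3$ and $6$ in the abstract above coincide with two standard exponent values in dimension $3$. Identifying $14/3$ with the mass-critical exponent $2 + 8/N$ of the Kirchhoff problem is our reading of the title ("mass supercritical"), not a statement made in the abstract.

```latex
\[
\left. 2 + \frac{8}{N} \right|_{N=3} = \frac{14}{3},
\qquad
2^{*} = \left.\frac{2N}{N-2}\right|_{N=3} = 6 .
\]
```

Under that reading, $14/3 < q < p \leqslant 6$ places both powers strictly above the mass-critical exponent, and $p = 6$ is the Sobolev critical case mentioned at the end of the abstract.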
7. iAMPCN: a deep-learning approach for identifying antimicrobial peptides and their functional activities.
- Author
- Xu, Jing, Li, Fuyi, Li, Chen, Guo, Xudong, Landersdorfer, Cornelia, Shen, Hsin-Hui, Peleg, Anton Y, Li, Jian, Imoto, Seiya, Yao, Jianhua, Akutsu, Tatsuya, and Song, Jiangning
- Subjects
- *ANTIMICROBIAL peptides, *MACHINE learning, *DEEP learning, *CONVOLUTIONAL neural networks, *AMINO acids, *SOURCE code, *PEPTIDE antibiotics
- Abstract
Antimicrobial peptides (AMPs) are short peptides that play crucial roles in diverse biological processes and have various functional activities against target organisms. Due to the abuse of chemical antibiotics and microbial pathogens' increasing resistance to antibiotics, AMPs have the potential to be alternatives to antibiotics. As such, the identification of AMPs has become a widely discussed topic. A variety of computational approaches have been developed to identify AMPs based on machine learning algorithms. However, most of them are not capable of predicting the functional activities of AMPs, and those predictors that can specify activities only focus on a few of them. In this study, we first surveyed 10 predictors that can identify AMPs and their functional activities in terms of the features they employed and the algorithms they utilized. Then, we constructed comprehensive AMP datasets and proposed a new deep learning-based framework, iAMPCN (identification of AMPs based on CNNs), to identify AMPs and their related 22 functional activities. Our experiments demonstrate that iAMPCN significantly improved the prediction performance of AMPs and their corresponding functional activities based on four types of sequence features. Benchmarking experiments on the independent test datasets showed that iAMPCN outperformed a number of state-of-the-art approaches for predicting AMPs and their functional activities. Furthermore, we analyzed the amino acid preferences of different AMP activities and evaluated the model on datasets of varying sequence redundancy thresholds. To facilitate the community-wide identification of AMPs and their corresponding functional types, we have made the source codes of iAMPCN publicly available at https://github.com/joy50706/iAMPCN/tree/master. We anticipate that iAMPCN can be explored as a valuable tool for identifying potential AMPs with specific functional activities for further experimental validation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
8. TIMER is a Siamese neural network-based framework for identifying both general and species-specific bacterial promoters.
- Author
- Zhu, Yan, Li, Fuyi, Guo, Xudong, Wang, Xiaoyu, Coin, Lachlan J M, Webb, Geoffrey I, Song, Jiangning, and Jia, Cangzhi
- Subjects
- *ARTIFICIAL neural networks, *INTERNET servers, *RNA polymerases, *DNA sequencing
- Abstract
Background: Promoters are DNA regions that initiate the transcription of specific genes near the transcription start sites. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for synthesizing the gene-encoded products by bacteria to grow and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed specifically for a particular species. To date, only a few predictors are available for identifying general bacterial promoters with limited predictive performance. Results: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER uses DNA sequences as the input and employs three Siamese neural networks with the attention layers to train and optimize the models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves a competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the web server of TIMER is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/. [ABSTRACT FROM AUTHOR]
- Published
- 2023
9. ATTIC is an integrated approach for predicting A-to-I RNA editing sites in three species.
- Author
- Chen, Ruyi, Li, Fuyi, Guo, Xudong, Bi, Yue, Li, Chen, Pan, Shirui, Coin, Lachlan J M, and Song, Jiangning
- Subjects
- *RNA editing, *INTERNET servers, *MACHINE learning, *FEATURE selection, *DOUBLE-stranded RNA, *DROSOPHILA melanogaster
- Abstract
A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, various computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 RNA sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was developed based on the optimal models improved by the feature selection strategy for specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I RNA editing events and help characterize their roles in post-transcriptional regulation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
10. Clarion is a multi-label problem transformation method for identifying mRNA subcellular localizations.
- Author
- Bi, Yue, Li, Fuyi, Guo, Xudong, Wang, Zhikang, Pan, Tong, Guo, Yuming, Webb, Geoffrey I, Yao, Jianhua, Jia, Cangzhi, and Song, Jiangning
- Subjects
- *GENE regulatory networks, *GENETIC regulation
- Abstract
Subcellular localization of messenger RNAs (mRNAs) plays a key role in the spatial regulation of gene activity. The functions of mRNAs have been shown to be closely linked with their localizations. As such, understanding of the subcellular localizations of mRNAs can help elucidate gene regulatory networks. Despite several computational methods that have been developed to predict mRNA localizations within cells, there is still much room for improvement in predictive performance, especially for the multiple-location prediction. In this study, we proposed a novel multi-label multi-class predictor, termed Clarion, for mRNA subcellular localization prediction. Clarion was developed based on a manually curated benchmark dataset and leveraged the weighted series method for multi-label transformation. Extensive benchmarking tests demonstrated Clarion achieved competitive predictive performance and the weighted series method plays a crucial role in securing superior performance of Clarion. In addition, the independent test results indicate that Clarion outperformed the state-of-the-art methods and can secure accuracy of 81.47, 91.29, 79.77, 92.10, 89.15, 83.74, 80.74, 79.23 and 84.74% for chromatin, cytoplasm, cytosol, exosome, membrane, nucleolus, nucleoplasm, nucleus and ribosome, respectively. The webserver and local stand-alone tool of Clarion is freely available at http://monash.bioweb.cloud.edu.au/Clarion/. [ABSTRACT FROM AUTHOR]
- Published
- 2022
11. ASPIRER: a new computational approach for identifying non-classical secreted proteins based on deep learning.
- Author
- Wang, Xiaoyu, Li, Fuyi, Xu, Jing, Rong, Jia, Webb, Geoffrey I, Ge, Zongyuan, Li, Jian, and Song, Jiangning
- Subjects
- *DEEP learning, *ARTIFICIAL neural networks, *AMINO acid sequence, *CONVOLUTIONAL neural networks, *SIGNAL peptides, *CELL communication
- Abstract
Protein secretion has a pivotal role in many biological processes and is particularly important for intercellular communication, from the cytoplasm to the host or external environment. Gram-positive bacteria can secrete proteins through multiple secretion pathways. The non-classical secretion pathway has recently received increasing attention among these secretion pathways, but its exact mechanism remains unclear. Non-classical secreted proteins (NCSPs) are a class of secreted proteins lacking signal peptides and motifs. Several NCSP predictors have been proposed to identify NCSPs and most of them employed the whole amino acid sequence of NCSPs to construct the model. However, the sequence length of different proteins varies greatly. In addition, not all regions of the protein are equally important and some local regions are not relevant to the secretion. The functional regions of the protein, particularly in the N- and C-terminal regions, contain important determinants for secretion. In this study, we propose a new hybrid deep learning-based framework, referred to as ASPIRER, which improves the prediction of NCSPs from amino acid sequences. More specifically, it combines a whole sequence-based XGBoost model and an N-terminal sequence-based convolutional neural network model; 5-fold cross-validation and independent tests demonstrate that ASPIRER achieves performance superior to existing state-of-the-art approaches. The source code and curated datasets of ASPIRER are publicly available at https://github.com/yanwu20/ASPIRER/. ASPIRER is anticipated to be a useful tool for improved prediction of novel putative NCSPs from sequence information and prioritization of candidate proteins for follow-up experimental validation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
12. Positive-unlabeled learning in bioinformatics and computational biology: a brief review.
- Author
- Li, Fuyi, Dong, Shuangyu, Leier, André, Han, Meiya, Guo, Xudong, Xu, Jing, Wang, Xiaoyu, Pan, Shirui, Jia, Cangzhi, Zhang, Yang, Webb, Geoffrey I, Coin, Lachlan J M, Li, Chen, and Song, Jiangning
- Subjects
- *COMPUTATIONAL biology, *CLASSIFICATION algorithms, *BIOINFORMATICS, *MACHINE learning, *SUPERVISED learning
- Abstract
Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be potentially mislabeled due to the limited sensitivity of the experimental equipment. The positive unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from limited positive samples and a large number of unlabeled samples (i.e. a mixture of positive or negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications to address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work serves as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and further developing next-generation PU learning frameworks for critical biological applications. [ABSTRACT FROM AUTHOR]
- Published
- 2022
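To make the positive-unlabeled setting in the abstract above concrete, here is a minimal sketch of one classic correction, in the style of Elkan and Noto. It is our choice for illustration and is not named in the abstract. It assumes labeled positives are selected completely at random from all positives, and that some classifier (not shown) has already produced scores estimating the probability that a sample is labeled. All function names and score values are hypothetical.

```python
from typing import List, Sequence


def estimate_label_frequency(scores_on_labeled_positives: Sequence[float]) -> float:
    """Estimate c = P(labeled | positive) as the mean score on held-out labeled positives."""
    if not scores_on_labeled_positives:
        raise ValueError("need at least one held-out labeled positive")
    return sum(scores_on_labeled_positives) / len(scores_on_labeled_positives)


def correct_scores(scores: Sequence[float], c: float) -> List[float]:
    """Convert P(labeled | x) into an estimate of P(positive | x) by dividing by c."""
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    # Clip at 1.0 because a noisy estimate of c can push the ratio above 1.
    return [min(1.0, s / c) for s in scores]


# Hypothetical scores from a labeled-vs-unlabeled classifier (assumed, not real data).
held_out_positive_scores = [0.25, 0.5, 0.75]
unlabeled_scores = [0.125, 0.25, 0.75]

c = estimate_label_frequency(held_out_positive_scores)
positive_probabilities = correct_scores(unlabeled_scores, c)
```

The division by `c` is what lets a model trained only on "labeled versus unlabeled" data be read as a "positive versus negative" predictor, provided the random-selection assumption holds.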
13. Porpoise: a new approach for accurate prediction of RNA pseudouridine sites.
- Author
- Li, Fuyi, Guo, Xudong, Jin, Peipei, Chen, Jinxiang, Xiang, Dongxu, Song, Jiangning, and Coin, Lachlan J M
- Subjects
- *PSEUDOURIDINE, *PORPOISES, *URIDINE, *NUCLEOTIDE sequence, *RNA, *RNA modification & restriction
- Abstract
Pseudouridine is a ubiquitous RNA modification type present in eukaryotes and prokaryotes, which plays a vital role in various biological processes. Almost all kinds of RNAs are subject to this modification. However, it remains a great challenge to identify pseudouridine sites via experimental approaches, requiring expensive and time-consuming experimental research. Therefore, computational approaches that can be used to perform accurate in silico identification of pseudouridine sites from the large amount of RNA sequence data are highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes based on the selection of four types of features, including binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into the stacked ensemble learning framework to enable the construction of an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves predictive performance superior to that of several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses. [ABSTRACT FROM AUTHOR]
- Published
- 2021
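The abstract above lists "binary features" among the encodings but does not spell them out. A common reading is one-hot encoding of each nucleotide, sketched below. The base order A, C, G, U and the function name are assumptions made for illustration, not details taken from Porpoise.

```python
from typing import Dict, List

# Assumed one-hot layout: one 4-bit vector per RNA base, in the order A, C, G, U.
ONE_HOT: Dict[str, List[int]] = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "U": [0, 0, 0, 1],
}


def binary_encode(sequence: str) -> List[int]:
    """Concatenate the one-hot vectors of every base into one flat feature vector."""
    encoded: List[int] = []
    for base in sequence.upper():
        if base not in ONE_HOT:
            raise ValueError(f"unsupported base: {base!r}")
        encoded.extend(ONE_HOT[base])
    return encoded


# A hypothetical 4-nucleotide window; real windows around a candidate site are longer.
features = binary_encode("GAUC")
```

A window of length n yields a vector of length 4n, which can then be passed to any downstream classifier.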
14. Assessing the performance of computational predictors for estimating protein stability changes upon missense mutations.
- Author
- Iqbal, Shahid, Li, Fuyi, Akutsu, Tatsuya, Ascher, David B, Webb, Geoffrey I, and Song, Jiangning
- Subjects
- *PROTEIN stability, *MISSENSE mutation, *PROTEIN engineering, *LIVER proteins, *MACHINE learning, *GENETIC disorders
- Abstract
Understanding how a mutation might affect protein stability is of significant importance to protein engineering and for understanding protein evolution and genetic diseases. While a number of computational tools have been developed to predict the effect of missense mutations on protein stability, they are known to exhibit large biases imparted in part by the data used to train and evaluate them. Here, we provide a comprehensive overview of predictive tools, which has provided an evolving insight into the importance and relevance of features that can discern the effects of mutations on protein stability. A diverse selection of these freely available tools was benchmarked using a large mutation-level blind dataset of 1342 experimentally characterised mutations across 130 proteins from ThermoMutDB, a second test dataset encompassing 630 experimentally characterised mutations across 39 proteins from iStable2.0 and a third blind test dataset consisting of 268 mutations in 27 proteins from the newly published ProThermDB. The performance of the methods was further evaluated with respect to the site of mutation, type of mutant residue and by varying the pH and temperature. Additionally, the classification performance was also evaluated by classifying the mutations as stabilizing (ΔΔG ≥ 0) or destabilizing (ΔΔG < 0). The results reveal that the performance of the predictors is affected by the site of mutation and the type of mutant residue. Further, the results show very low performance for pH values 6–8 and temperature higher than 65 for all predictors except iStable2.0 on the S630 dataset. To illustrate how stability and structure change upon single point mutation, we considered four stabilizing, two destabilizing and two stabilizing mutations from two proteins, namely the toxin protein and bovine liver cytochrome. Overall, the results on S268, S630 and S1342 datasets show that the performance of the integrated predictors is better than the mechanistic or individual machine learning predictors. We expect that this paper will provide useful guidance for the design and development of next-generation bioinformatic tools for predicting protein stability changes upon mutations. [ABSTRACT FROM AUTHOR]
- Published
- 2021
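The sign convention in the abstract above (stabilizing when ΔΔG ≥ 0, destabilizing when ΔΔG < 0) can be turned into a classification check as in the sketch below. The function names and the ΔΔG values are hypothetical and are not taken from the benchmark datasets.

```python
from typing import List, Sequence


def classify_ddg(ddg: float) -> str:
    """Label a mutation by the sign of its stability change, as stated in the abstract."""
    return "stabilizing" if ddg >= 0 else "destabilizing"


def classification_accuracy(experimental: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of mutations whose predicted class matches the experimental class."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    exp_labels: List[str] = [classify_ddg(v) for v in experimental]
    pred_labels: List[str] = [classify_ddg(v) for v in predicted]
    matches = sum(e == p for e, p in zip(exp_labels, pred_labels))
    return matches / len(exp_labels)


# Hypothetical ΔΔG values, for illustration only.
experimental_ddg = [1.2, -0.5, 0.0, -2.1]
predicted_ddg = [0.8, 0.3, -0.1, -1.7]
accuracy = classification_accuracy(experimental_ddg, predicted_ddg)
```

Note that a prediction can be numerically close to the experimental value and still fall in the wrong class when the value sits near zero, as the third pair shows.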
15. Normalized solutions of a transmission problem of Kirchhoff type.
- Author
- Zhu, Xiaoli, Li, Fuyi, and Liang, Zhanping
- Subjects
- *EIGENVALUES, *ELLIPTIC equations
- Abstract
In this paper, we study solutions with a prescribed $L^2$ mass of a transmission problem of Kirchhoff type, which is an interface problem for elliptic operators in bounded domains of $\mathbb{R}^3$ and arises in some physical systems in different connected media. Because of the presence of transmission/interface conditions, we first deduce a modified Gagliardo–Nirenberg inequality applicable to the Kirchhoff-type transmission problem, based on which the $L^2$ critical exponent is defined. Subsequently, we prove the existence of normalized solutions no matter whether the nonlinearity is $L^2$ subcritical, critical, or supercritical. In particular, in the supercritical case, we use an eigenvalue of the transmission problem to characterize the conditions for the existence of solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
16. Comprehensive assessment of machine learning-based methods for predicting antimicrobial peptides.
- Author
- Xu, Jing, Li, Fuyi, Leier, André, Xiang, Dongxu, Shen, Hsin-Hui, Lago, Tatiana T Marquez, Li, Jian, Yu, Dong-Jun, and Song, Jiangning
- Subjects
- *ANTIMICROBIAL peptides, *FEATURE selection, *MACHINE learning, *DEEP learning, *CELL physiology, *SUPPORT vector machines, *PEPTIDE antibiotics
- Abstract
Antimicrobial peptides (AMPs) are a unique and diverse group of molecules that play a crucial role in a myriad of biological processes and cellular functions. AMP-related studies have become increasingly popular in recent years due to antimicrobial resistance, which is becoming an emerging global concern. Systematic experimental identification of AMPs faces many difficulties due to the limitations of current methods. Given its significance, more than 30 computational methods have been developed for accurate prediction of AMPs. These approaches show high diversity in their data set size, data quality, core algorithms, feature extraction, feature selection techniques and evaluation strategies. Here, we provide a comprehensive survey on a variety of current approaches for AMP identification and point at the differences between these methods. In addition, we evaluate the predictive performance of the surveyed tools based on an independent test data set containing 1536 AMPs and 1536 non-AMPs. Furthermore, we construct six validation data sets based on six different common AMP databases and compare different computational methods based on these data sets. The results indicate that amPEPpy achieves the best predictive performance and outperforms the other compared methods. As the predictive performances are affected by the different data sets used by different methods, we additionally perform the 5-fold cross-validation test to benchmark different traditional machine learning methods on the same data set. These cross-validation results indicate that random forest, support vector machine and eXtreme Gradient Boosting achieve comparatively better performances than other machine learning methods and are often the algorithms of choice of multiple AMP prediction tools. [ABSTRACT FROM AUTHOR]
- Published
- 2021
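The 5-fold cross-validation mentioned in the abstract above can be sketched as an index splitter. The abstract does not say how folds were drawn; shuffling with a fixed seed and ignoring class stratification are assumptions made here, and `k_fold_indices` is a hypothetical helper name.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train_idx, test_idx))
    return splits


# Toy size for illustration; the independent test set in the abstract is far larger.
splits = k_fold_indices(n_samples=20, k=5)
```

Each model is trained on four folds and scored on the fifth, and the five scores are averaged, which is what makes comparisons on the same data set less dependent on a single split.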
17. Anthem: a user customised tool for fast and accurate prediction of binding between peptides and HLA class I molecules.
- Author
- Mei, Shutao, Li, Fuyi, Xiang, Dongxu, Ayala, Rochelle, Faridi, Pouya, Webb, Geoffrey I, Illing, Patricia T, Rossjohn, Jamie, Akutsu, Tatsuya, Croft, Nathan P, Purcell, Anthony W, and Song, Jiangning
- Subjects
- *HISTOCOMPATIBILITY class I antigens, *T cells, *PEPTIDES
- Abstract
Neopeptide-based immunotherapy has been recognised as a promising approach for the treatment of cancers. For neopeptides to be recognised by CD8+ T cells and induce an immune response, their binding to human leukocyte antigen class I (HLA-I) molecules is a necessary first step. Most epitope prediction tools thus rely on the prediction of such binding. With the use of mass spectrometry, the scale of naturally presented HLA ligands that could be used to develop such predictors has been expanded. However, there are rarely efforts that focus on the integration of these experimental data with computational algorithms to efficiently develop up-to-date predictors. Here, we present Anthem for accurate HLA-I binding prediction. In particular, we have developed a user-friendly framework to support the development of customisable HLA-I binding prediction models to meet challenges associated with the rapidly increasing availability of large amounts of immunopeptidomic data. Our extensive evaluation, using both independent and experimental datasets shows that Anthem achieves an overall similar or higher area under curve value compared with other contemporary tools. It is anticipated that Anthem will provide a unique opportunity for the non-expert user to analyse and interpret their own in-house or publicly deposited datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2021
18. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification.
- Author
- Liang, Xiao, Li, Fuyi, Chen, Jinxiang, Li, Junlong, Wu, Hao, Li, Shuqin, Song, Jiangning, and Liu, Quanzhong
- Subjects
- *FEATURE selection, *MACHINE learning, *SOURCE code, *RESEARCH methodology, *KEY performance indicators (Management), *SCALABILITY
- Abstract
Anti-cancer peptides (ACPs) are known as potential therapeutics for cancer. Due to their unique ability to target cancer cells without affecting healthy cells directly, they have been extensively studied. Many peptide-based drugs are currently evaluated in the preclinical and clinical trials. Accurate identification of ACPs has received considerable attention in recent years; as such, a number of machine learning-based methods for in silico identification of ACPs have been developed. These methods promote the research on the mechanism of ACPs therapeutics against cancer to some extent. There is a vast difference in these methods in terms of their training/testing datasets, machine learning algorithms, feature encoding schemes, feature selection methods and evaluation strategies used. Therefore, it is desirable to summarize the advantages and disadvantages of the existing methods, and to provide useful insights and suggestions for the development and improvement of novel computational tools to characterize and identify ACPs. With this in mind, we firstly comprehensively investigate 16 state-of-the-art predictors for ACPs in terms of their core algorithms, feature encoding schemes, performance evaluation metrics and webserver/software usability. Then, comprehensive performance assessment is conducted to evaluate the robustness and scalability of the existing predictors using a well-prepared benchmark dataset. We provide potential strategies for the model performance improvement. Moreover, we propose a novel ensemble learning framework, termed ACPredStackL, for the accurate identification of ACPs. ACPredStackL is developed based on the stacking ensemble strategy combined with SVM, Naïve Bayesian, lightGBM and KNN. Empirical benchmarking experiments against the state-of-the-art methods demonstrate that ACPredStackL achieves a comparative performance for predicting ACPs. The webserver and source code of ACPredStackL are freely available at http://bigdata.biocie.cn/ACPredStackL/ and https://github.com/liangxiaoq/ACPredStackL, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2021
19. DeepBL: a deep learning-based approach for in silico discovery of beta-lactamases.
- Author
-
Wang, Yanan, Li, Fuyi, Bharathwaj, Manasa, Rosas, Natalia C, Leier, André, Akutsu, Tatsuya, Webb, Geoffrey I, Marquez-Lago, Tatiana T, Li, Jian, Lithgow, Trevor, and Song, Jiangning
- Subjects
- *
DEEP learning , *BACTERIAL proteins , *AMINO acid sequence , *BETA lactam antibiotics , *ANTIBIOTICS , *BETA lactamases , *DRUG resistance in bacteria , *DRUG resistance in microorganisms - Abstract
Beta-lactamases (BLs) are enzymes localized in the periplasmic space of bacterial pathogens, where they confer resistance to beta-lactam antibiotics. Experimental identification of BLs is costly yet crucial for understanding beta-lactam resistance mechanisms. To address this issue, we present DeepBL, a deep learning-based approach that incorporates sequence-derived features to enable high-throughput prediction of BLs. Specifically, DeepBL is implemented based on the Small VGGNet architecture and the TensorFlow deep learning library. Furthermore, the performance of DeepBL models is investigated in relation to the sequence redundancy level and negative sample selection in the benchmark dataset. The models are trained on datasets of varying sequence redundancy thresholds, and the model performance is evaluated by extensive benchmarking tests. Using the optimized DeepBL model, we perform proteome-wide screening for all reviewed bacterial protein sequences available from the UniProt database. These results are freely accessible at the DeepBL webserver at http://deepbl.erc.monash.edu.au/. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
20. Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks.
- Author
-
Zhu, Yan, Li, Fuyi, Xiang, Dongxu, Akutsu, Tatsuya, Song, Jiangning, and Jia, Cangzhi
- Subjects
- *
CAPSULE neural networks , *CONVOLUTIONAL neural networks , *DEEP learning , *DNA sequencing , *RNA polymerases , *DROSOPHILA melanogaster - Abstract
A promoter is a region in the DNA sequence that defines where the transcription of a gene by RNA polymerase initiates, and is typically located proximal to the transcription start site (TSS). Correctly identifying the gene TSS and the core promoter is essential for our understanding of the transcriptional regulation of genes. As a complement to conventional experimental methods, computational techniques with easy-to-use platforms as essential bioinformatics tools can be effectively applied to annotate the functions and physiological roles of promoters. In this work, we propose a deep learning-based method termed Depicter (Deep learning for predicting promoter), for identifying three specific types of promoters, i.e. promoter sequences with the TATA-box (TATA model), promoter sequences without the TATA-box (non-TATA model), and indistinguishable promoters (TATA and non-TATA model). Depicter is developed based on an up-to-date, species-specific dataset which includes Homo sapiens, Mus musculus, Drosophila melanogaster and Arabidopsis thaliana promoters. A convolutional neural network coupled with capsule layers is proposed to train and optimize the prediction model of Depicter. Extensive benchmarking and independent tests demonstrate that Depicter achieves an improved predictive performance compared with several state-of-the-art methods. The webserver of Depicter is implemented and freely accessible at https://depicter.erc.monash.edu/. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
21. Systematic evaluation of machine learning methods for identifying human–pathogen protein–protein interactions.
- Author
-
Chen, Huaming, Li, Fuyi, Wang, Lei, Jin, Yaochu, Chi, Chi-Hung, Kurgan, Lukasz, Song, Jiangning, and Shen, Jun
- Subjects
- *
MACHINE learning , *PROTEIN-protein interactions , *MACHINE performance , *SOCIAL interaction , *SEQUENTIAL analysis - Abstract
In recent years, high-throughput experimental techniques have significantly enhanced the accuracy and coverage of protein–protein interaction identification, including human–pathogen protein–protein interactions (HP-PPIs). Despite this progress, experimental methods are, in general, expensive in terms of both time and labour costs, especially considering the enormous number of potential protein-interacting partners. Developing computational methods to predict interactions between humans and bacterial pathogens has thus become critical and meaningful, both for facilitating the detection of interactions and for mining incomplete interaction maps. In this paper, we present a systematic evaluation of machine learning-based computational methods for human–bacterium protein–protein interactions (HB-PPIs). We first reviewed a vast number of publicly available databases of HP-PPIs and then critically evaluated the availability of these databases. Benefitting from the well-structured nature of the data, we subsequently preprocessed the data and identified six bacterial pathogens that could be used to study bacterial subjects in which a human was the host. Additionally, we thoroughly reviewed the literature on 'host–pathogen interactions' and summarized the existing models, which we used to jointly study the impact of different feature representation algorithms and to evaluate the performance of existing machine learning computational models. Owing to the abundance of sequence information and the limited scale of other protein-related information, we adopted the primary protocol from the literature and dedicated our analysis to a comprehensive assessment of sequence information and machine learning models. A systematic evaluation of machine learning models and a wide range of feature representation algorithms based on sequence information is presented as a comparison survey towards the prediction performance evaluation of HB-PPIs. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
22. Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework.
- Author
-
Li, Fuyi, Chen, Jinxiang, Ge, Zongyuan, Wen, Ya, Yue, Yanwei, Hayashida, Morihiro, Baggag, Abdelkader, Bensmail, Halima, and Song, Jiangning
- Subjects
- *
ESCHERICHIA coli , *PROBLEM solving , *GENES , *MACHINE learning , *NUCLEIC acids - Abstract
Promoters are short consensus sequences of DNA, which are responsible for transcription activation or the repression of all genes. There are many types of promoters in bacteria with important roles in initiating gene transcription. Therefore, solving promoter-identification problems has important implications for improving the understanding of their functions. To this end, computational methods targeting promoter classification have been established; however, their performance remains unsatisfactory. In this study, we present a novel stacked-ensemble approach (termed SELECTOR) for identifying both promoters and their respective classification. SELECTOR combines the composition of k-spaced nucleic acid pairs, parallel correlation pseudo-dinucleotide composition, position-specific trinucleotide propensity based on single-strand, and DNA strand features, and uses five popular tree-based ensemble learning algorithms to build a stacked model. Both 5-fold cross-validation tests using benchmark datasets and independent tests using the newly collected independent test dataset showed that SELECTOR outperformed state-of-the-art methods in both general and specific types of promoter prediction in Escherichia coli. Furthermore, this novel framework provides essential interpretations that aid understanding of model success by leveraging the powerful Shapley Additive exPlanation algorithm, thereby highlighting the most important features relevant for predicting both general and specific types of promoters and overcoming the limitations of existing 'Black-box' approaches that are unable to reveal causal relationships from large amounts of initially encoded features. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
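The composition of k-spaced nucleic acid pairs named in the SELECTOR abstract above is not defined in this listing. As a minimal sketch, assuming the commonly used definition (for each gap k, the normalized frequency of each of the 16 nucleotide pairs whose members are separated by k positions), it could be computed as follows. The function name `cksnap`, the `k_max` parameter and the feature-key format are hypothetical and are not taken from the SELECTOR source code.

```python
from itertools import product


def cksnap(sequence, k_max=2):
    """Composition of k-spaced nucleic acid pairs for gaps 0..k_max.

    Returns a dict mapping keys such as "AC.gap1" to the fraction of
    valid pairs at that gap that equal the given pair. Pairs containing
    characters other than A, C, G, T are skipped (an assumption).
    """
    alphabet = "ACGT"
    pairs = ["".join(p) for p in product(alphabet, repeat=2)]
    seq = sequence.upper()
    features = {}
    for gap in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        # Pair the nucleotide at position i with the one gap+1 positions later.
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair in pairs:
            features[f"{pair}.gap{gap}"] = counts[pair] / total if total else 0.0
    return features
```

Calling `cksnap("ACGT", k_max=1)` gives 32 features (16 pairs for each of the two gaps); normalizing per gap keeps sequences of different lengths comparable.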
23. A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction.
- Author
-
Mei, Shutao, Li, Fuyi, Leier, André, Marquez-Lago, Tatiana T, Giam, Kailin, Croft, Nathan P, Akutsu, Tatsuya, Smith, A Ian, Li, Jian, Rossjohn, Jamie, Purcell, Anthony W, and Song, Jiangning
- Subjects
- *
T cell receptors , *HISTOCOMPATIBILITY class I antigens , *FORECASTING , *CYTOTOXIC T cells , *MAJOR histocompatibility complex , *PROTEOLYSIS - Abstract
Human leukocyte antigen class I (HLA-I) molecules are encoded by major histocompatibility complex (MHC) class I loci in humans. The binding and interaction between HLA-I molecules and intracellular peptides derived from a variety of proteolytic mechanisms play a crucial role in subsequent T-cell recognition of target cells and the specificity of the immune response. In this context, tools that predict the likelihood for a peptide to bind to specific HLA class I allotypes are important for selecting the most promising antigenic targets for immunotherapy. In this article, we comprehensively review a variety of currently available tools for predicting the binding of peptides to a selection of HLA-I allomorphs. Specifically, we compare their calculation methods for the prediction score, employed algorithms, evaluation strategies and software functionalities. In addition, we have evaluated the prediction performance of the reviewed tools based on an independent validation data set, containing 21 101 experimentally verified ligands across 19 HLA-I allotypes. The benchmarking results show that MixMHCpred 2.0.1 achieves the best performance for predicting peptides binding to most of the HLA-I allomorphs studied, while NetMHCpan 4.0 and NetMHCcons 1.1 outperform the other machine learning-based and consensus-based tools, respectively. Importantly, it should be noted that a peptide predicted with a higher binding score for a specific HLA allotype does not necessarily imply it will be immunogenic. That said, peptide-binding predictors are still very useful in that they can help to significantly reduce the large number of epitope candidates that need to be experimentally verified. Several other factors, including susceptibility to proteasome cleavage, peptide transport into the endoplasmic reticulum and T-cell receptor repertoire, also contribute to the immunogenicity of peptide antigens, and some of them can be considered by some predictors. 
Therefore, integrating features derived from these additional factors together with HLA-binding properties by using machine-learning algorithms may increase the prediction accuracy of immunogenic peptides. As such, we anticipate that this review and benchmarking survey will assist researchers in selecting appropriate prediction tools that best suit their purposes and provide useful guidelines for the development of improved antigen predictors in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
24. PRISMOID: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact.
- Author
-
Li, Fuyi, Fan, Cunshuo, Marquez-Lago, Tatiana T, Leier, André, Revote, Jerico, Jia, Cangzhi, Zhu, Yan, Smith, A Ian, Webb, Geoffrey I, Liu, Quanzhong, Wei, Leyi, Li, Jian, and Song, Jiangning
- Subjects
- *
INTERNET servers , *JAVASCRIPT programming language , *WEB databases , *AMINO acid sequence , *DATA visualization , *MORPHOLOGY , *DATABASES - Abstract
Post-translational modifications (PTMs) play very important roles in various cell signaling pathways and biological processes. Due to PTMs' extremely important roles, many major PTMs have been studied, and the functional and mechanical characterization of major PTMs is well documented in several databases. However, most currently available databases mainly focus on protein sequences, while the real 3D structures of PTMs have been largely ignored. Therefore, studies of PTMs' 3D structural signatures have been severely limited by the deficiency of the data. Here, we develop PRISMOID, a novel publicly available and free 3D structure database for a wide range of PTMs. PRISMOID represents an up-to-date and interactive online knowledge base with a specific focus on the 3D structural contexts of PTM sites and of mutations that occur on PTMs and in the close proximity of PTM sites with functional impact. The first version of PRISMOID encompasses 17 145 non-redundant modification sites on 3919 related protein 3D structure entries pertaining to 37 different types of PTMs. Our entry web page is organized in a comprehensive manner, including detailed PTM annotation on the 3D structure and biological information in terms of mutations affecting PTMs, secondary structure features and per-residue solvent accessibility features of PTM sites, domain context, predicted natively disordered regions and sequence alignments. In addition, high-definition JavaScript packages are employed to enhance information visualization in PRISMOID. PRISMOID is equipped with a variety of interactive and customizable search options and data browsing functions; these capabilities allow users to access data via keyword, ID and advanced options combination search in an efficient and user-friendly way. A download page is also provided to enable users to download the SQL file, computational structural features and PTM sites' data. 
We anticipate PRISMOID will swiftly become an invaluable online resource, assisting both biologists and bioinformaticians to conduct experiments and develop applications supporting discovery efforts in the sequence–structural–functional relationship of PTMs and providing important insight into mutations and PTM sites interaction mechanisms. The PRISMOID database is freely accessible at http://prismoid.erc.monash.edu/. The database and web interface are implemented in MySQL, JSP, JavaScript and HTML with all major browsers supported. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
25. DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites.
- Author
-
Li, Fuyi, Chen, Jinxiang, Leier, André, Marquez-Lago, Tatiana, Liu, Quanzhong, Wang, Yanze, Revote, Jerico, Smith, A Ian, Akutsu, Tatsuya, Webb, Geoffrey I, Kurgan, Lukasz, and Song, Jiangning
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning , *PEPTIDE bonds , *DNA-binding proteins , *AMINO acid sequence , *PROTEOLYTIC enzymes - Abstract
Motivation: Proteases are enzymes that cleave target substrate proteins by catalyzing the hydrolysis of peptide bonds between specific amino acids. While the functional proteolysis regulated by proteases plays a central role in the 'life and death' cellular processes, many of the corresponding substrates and their cleavage sites have not yet been found. Availability of accurate predictors of the substrates and cleavage sites would facilitate understanding of proteases' functions and physiological roles. Deep learning is a promising approach for the development of accurate predictors of substrate cleavage events. Results: We propose DeepCleave, the first deep learning-based predictor of protease-specific substrates and cleavage sites. DeepCleave uses protein substrate sequence data as input and employs convolutional neural networks with transfer learning to train accurate predictive models. High predictive performance of our models stems from the use of high-quality cleavage site features extracted from the substrate sequences through the deep learning process, and the application of transfer learning, multiple kernels and an attention layer in the design of the deep network. Empirical tests against several related state-of-the-art methods demonstrate that DeepCleave outperforms these methods in predicting caspase and matrix metalloprotease substrate-cleavage sites. Availability and implementation: The DeepCleave webserver and source code are freely available at http://deepcleave.erc.monash.edu/. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
26. Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods.
- Author
-
Li, Fuyi, Wang, Yanan, Li, Chen, Marquez-Lago, Tatiana T, Leier, André, Rawlings, Neil D, Haffari, Gholamreza, Revote, Jerico, Akutsu, Tatsuya, Chou, Kuo-Chen, Purcell, Anthony W, Pike, Robert N, Webb, Geoffrey I, Smith, A Ian, Lithgow, Trevor, Daly, Roger J, Whisstock, James C, and Song, Jiangning
- Subjects
- *
CHEMICAL processes , *CALPAIN , *PARALLEL programming , *BENCHMARKING (Management) , *DEEP learning - Abstract
The roles of proteolytic cleavage have been intensively investigated and discussed during the past two decades. This irreversible chemical process has been frequently reported to influence a number of crucial biological processes (BPs), such as cell cycle, protein regulation and inflammation. A number of advanced studies have been published aiming at deciphering the mechanisms of proteolytic cleavage. Given its significance and the large number of functionally enriched substrates targeted by specific proteases, many computational approaches have been established for accurate prediction of protease-specific substrates and their cleavage sites. Consequently, there is an urgent need to systematically assess the state-of-the-art computational approaches for protease-specific cleavage site prediction to further advance the existing methodologies and to improve the prediction performance. With this goal in mind, in this article, we carefully evaluated a total of 19 computational methods (including 8 scoring function-based methods and 11 machine learning-based methods) in terms of their underlying algorithm, calculated features, performance evaluation and software usability. Then, extensive independent tests were performed to assess the robustness and scalability of the reviewed methods using our carefully prepared independent test data sets with 3641 cleavage sites (specific to 10 proteases). The comparative experimental results demonstrate that PROSPERous is the most accurate generic method for predicting eight protease-specific cleavage sites, while GPS-CCD and LabCaS outperformed other predictors for calpain-specific cleavage sites. Based on our review, we then outlined some potential ways to improve the prediction performance and ease the computational burden by applying ensemble learning, deep learning, positive unlabeled learning and parallel and distributed computing techniques. 
We anticipate that our study will serve as a practical and useful guide for interested readers to further advance next-generation bioinformatics tools for protease-specific cleavage site prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
27. MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters.
- Author
-
Zhang, Meng, Li, Fuyi, Marquez-Lago, Tatiana T, Leier, André, Fan, Cunshuo, Kwoh, Chee Keong, Chou, Kuo-Chen, Song, Jiangning, and Jia, Cangzhi
- Subjects
- *
INTERNET servers , *PROMOTERS (Genetics) , *FEATURE selection , *NUCLEOTIDE sequence , *GENE enhancers - Abstract
Motivation: Promoters are short DNA consensus sequences that are localized proximal to the transcription start sites of genes, allowing transcription initiation of particular genes. However, the precise prediction of promoters remains a challenging task because individual promoters often differ from the consensus at one or more positions. Results: In this study, we present a new multi-layer computational approach, called MULTiPly, for recognizing promoters and their specific types. MULTiPly took into account the sequences themselves, including both local information, such as k-tuple nucleotide composition and dinucleotide-based auto covariance, and global information of the entire samples based on bi-profile Bayes and k-nearest neighbour feature encodings. Specifically, the F-score feature selection method was applied to identify the best unique type of feature prediction results, in combination with other types of features that were subsequently added to further improve the prediction performance of MULTiPly. Benchmarking experiments on the benchmark dataset and comparisons with five state-of-the-art tools show that MULTiPly can achieve a better prediction performance on 5-fold cross-validation and jackknife tests. Moreover, the superiority of MULTiPly was also validated on a newly constructed independent test dataset. MULTiPly is expected to be used as a useful tool that will facilitate the discovery of both general and specific types of promoters in the post-genomic era. Availability and implementation: The MULTiPly webserver and curated datasets are freely available at http://flagshipnt.erc.monash.edu/MULTiPly/. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
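The MULTiPly abstract above mentions F-score feature selection without giving the formula. A minimal sketch, assuming the widely used two-class F-score (the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances), is shown below. The function names `f_score` and `rank_features` are illustrative only and do not come from MULTiPly.

```python
def f_score(pos, neg):
    """Two-class F-score of a single feature.

    pos and neg hold the feature's values in the positive and negative
    class; each needs at least two values for the sample variance.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    # Numerator: how far each class mean sits from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Denominator: unbiased sample variance within each class.
    var_pos = sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    within = var_pos + var_neg
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return (feature_index, F-score) pairs, most discriminative first.

    rows is a list of equal-length feature vectors; labels uses 1 for
    the positive class and 0 for the negative class.
    """
    scores = []
    for j in range(len(rows[0])):
        pos = [row[j] for row, y in zip(rows, labels) if y == 1]
        neg = [row[j] for row, y in zip(rows, labels) if y == 0]
        scores.append((j, f_score(pos, neg)))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A larger score indicates a feature whose values separate the two classes more cleanly; because each feature is scored independently, the ranking ignores interactions between features.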
28. Existence of nontrivial solutions for Kirchhoff-type problems with jumping nonlinearities.
- Author
-
Rong, Ting, Li, Fuyi, and Liang, Zhanping
- Subjects
- *
INFINITY (Mathematics) - Abstract
In this study, the Fučik spectrum is used to investigate Kirchhoff-type problems with jumping nonlinearities at infinity, and a new result on the existence of nontrivial solutions is obtained. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
29. Normalized solutions to nonlinear scalar field equations with doubly nonlocal terms and critical exponent.
- Author
-
Long, Lei, Li, Fuyi, and Zhu, Xiaoli
- Published
- 2023
- Full Text
- View/download PDF
30. Fučik spectrum for the Kirchhoff-type problem and applications.
- Author
-
Li, Fuyi, Rong, Ting, and Liang, Zhanping
- Subjects
- *
KIRCHHOFF'S approximation , *MULTIPLICITY (Mathematics) , *CONVEX sets , *LIPSCHITZ spaces , *INFINITY (Mathematics) , *NONLINEAR theories - Abstract
In this study, we focus on the Fučik spectrum for the Kirchhoff-type problem, which is defined as a set $\Sigma$ comprising those $(\alpha,\beta)\in\mathbb{R}^2$ such that (0.1) $-\left(\int_\Omega|\nabla u|^2\right)\Delta u=\alpha(u^+)^3+\beta(u^-)^3$ in $\Omega$, $u=0$ on $\partial\Omega$ has a nontrivial solution, where $\Omega$ is an open ball in $\mathbb{R}^N$ for $N=1,2,3$; or $\Omega\subset\mathbb{R}^2$ is symmetric in $x$ and $y$, and convex in the $x$ and $y$ directions, $u^+=\max\{u,0\}$, $u^-=\min\{u,0\}$, and $u=u^++u^-$. First, we prove that the curves $\{\mu_1\}\times\mathbb{R}$, $\mathbb{R}\times\{\mu_1\}$, and $C:=\{(s+c(s),c(s)):s\in\mathbb{R}\}$ belong to $\Sigma$, where $c(s)=\min\{\beta:(s+\beta,\beta)\in\Sigma_0\}$ and $\Sigma_0$ comprises those $(\alpha,\beta)\in\mathbb{R}^2$ such that (0.1) has a sign changing solution. We refer to $\{\mu_1\}\times\mathbb{R}$ and $\mathbb{R}\times\{\mu_1\}$ as trivial curves in $\Sigma$ in the sense that any solution of (0.1) with $(\alpha,\beta)\in\{\mu_1\}\times\mathbb{R}$ or $\mathbb{R}\times\{\mu_1\}$ is signed. We denote $C$ as the first nontrivial curve in $\Sigma$ in the sense that any solution of (0.1) with $(\alpha,\beta)\in C$ is sign changing and for each $s\in\mathbb{R}$, we consider the line that passes through $(s,0)$ with a slope of 1 in the $\alpha O\beta$ plane $\mathbb{R}^2$, then the first point on this line that intersects with $\Sigma_0$ is simply $(s+c(s),c(s))\in C$. Second, we investigate some properties of the function $c$ and the curve $C$. In particular, $c$ is Lipschitz continuous, decreasing on $\mathbb{R}$ and $c(s)\to\mu_1$ as $s\to\infty$, and $C$ is asymptotic to the broken line $\mathcal{L}_2:=\{\mu_1\}\times[\mu_1,\infty)\cup[\mu_1,\infty)\times\{\mu_1\}$. Furthermore, we show that the point $(\alpha,\beta)$ corresponding to the signed solution of (0.1) is from $\mathcal{L}:=(\{\mu_1\}\times\mathbb{R})\cup(\mathbb{R}\times\{\mu_1\})$, the point $(\alpha,\beta)$ corresponding to the sign changing solution of (0.1) is on the upper right of $\mathcal{L}_2$, and no nontrivial solution of (0.1) exists when $(\alpha,\beta)$ is between $\mathcal{L}_2$ and $C$. 
Finally, as an application, we establish the multiplicity of solutions to the following Kirchhoff-type problem: $-\left(1+\int_\Omega|\nabla u|^2\right)\Delta u=f(x,u)$ in $\Omega$, $u=0$ on $\partial\Omega$, where the nonlinearity $f$ is asymptotically linear at zero and asymptotically 3-linear at infinity. To the best of our knowledge, this is the first study to consider that the nonlinearity has an extension property at both the zero and infinity points. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
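As a short worked step for the Fučik spectrum abstract above, the following sketch uses only the definitions $u^+=\max\{u,0\}$ and $u^-=\min\{u,0\}$ stated there. It illustrates why a signed solution constrains only one of the two parameters, which is consistent with the trivial curves being a vertical and a horizontal line. The abstract does not define $\mu_1$, so nothing is assumed about its value here.

```latex
% Problem (0.1) from the abstract:
\[
  -\Bigl(\int_\Omega |\nabla u|^2\Bigr)\Delta u
    = \alpha\,(u^+)^3 + \beta\,(u^-)^3 \quad \text{in } \Omega,
  \qquad u = 0 \quad \text{on } \partial\Omega .
\]
% Case u >= 0: then u^+ = u and u^- = 0, so beta drops out.
\[
  u \ge 0 \;\Longrightarrow\;
  -\Bigl(\int_\Omega |\nabla u|^2\Bigr)\Delta u = \alpha\,u^3 .
\]
% Case u <= 0: then u^+ = 0 and u^- = u, so alpha drops out.
\[
  u \le 0 \;\Longrightarrow\;
  -\Bigl(\int_\Omega |\nabla u|^2\Bigr)\Delta u = \beta\,u^3 .
\]
% Scaling: for t > 0 we have (tu)^\pm = t\,u^\pm, so both sides of (0.1)
% are multiplied by t^3 and tu is again a solution.
\[
  -\Bigl(\int_\Omega |\nabla (tu)|^2\Bigr)\Delta (tu)
    = t^3\Bigl[-\Bigl(\int_\Omega |\nabla u|^2\Bigr)\Delta u\Bigr],
  \qquad
  \alpha\,((tu)^+)^3 + \beta\,((tu)^-)^3
    = t^3\bigl[\alpha\,(u^+)^3 + \beta\,(u^-)^3\bigr].
\]
```

The scaling identity shows that (0.1) is homogeneous of degree 3, so whether a nontrivial solution exists depends only on the pair $(\alpha,\beta)$ and not on the size of $u$.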
31. Positive-unlabelled learning of glycosylation sites in the human proteome.
- Author
-
Li, Fuyi, Zhang, Yang, Purcell, Anthony W., Webb, Geoffrey I., Chou, Kuo-Chen, Lithgow, Trevor, Li, Chen, and Song, Jiangning
- Subjects
- *
GLYCOSYLATION , *POST-translational modification , *PROTEOMICS , *SUPERVISED learning , *PROTEIN stability - Abstract
Background: As an important type of post-translational modification (PTM), protein glycosylation plays a crucial role in protein stability and protein function. The abundance and ubiquity of protein glycosylation across three domains of life involving Eukarya, Bacteria and Archaea demonstrate its roles in regulating a variety of signalling and metabolic pathways. Mutations on and in the proximity of glycosylation sites are highly associated with human diseases. Accordingly, accurate prediction of glycosylation can complement laboratory-based methods and greatly benefit experimental efforts for characterization and understanding of functional roles of glycosylation. For this purpose, a number of supervised-learning approaches have been proposed to identify glycosylation sites, demonstrating a promising predictive performance. To train a conventional supervised-learning model, both reliable positive and negative samples are required. However, in practice, a large portion of negative samples (i.e. non-glycosylation sites) are mislabelled due to the limitation of current experimental technologies. Moreover, supervised algorithms often fail to take advantage of large volumes of unlabelled data, which can aid in model learning in conjunction with positive samples (i.e. experimentally verified glycosylation sites). Results: In this study, we propose a positive unlabelled (PU) learning-based method, PA2DE (V2.0), based on the AlphaMax algorithm for protein glycosylation site prediction. The predictive performance of this proposed method was evaluated by a range of glycosylation data collected over a ten-year period based on an interval of three years. Experiments using both benchmarking and independent tests show that our method outperformed the representative supervised-learning algorithms (including support vector machines and random forests) and one-class learners, as well as currently available prediction methods in terms of F1 score, accuracy and AUC measures. 
In addition, we developed an online web server as an implementation of the optimized model (available at http://glycomine.erc.monash.edu/Lab/GlycoMine%5fPU/) to facilitate community-wide efforts for accurate prediction of protein glycosylation sites. Conclusion: The proposed PU learning approach achieved a competitive predictive performance compared with currently available methods. This PU learning schema may also be effectively employed and applied to address the prediction problems of other important types of protein PTM site and functional sites. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
32. E-Commerce and Industrial Upgrading in the Chinese Apparel Value Chain.
- Author
-
Li, Fuyi, Frederick, Stacey, and Gereffi, Gary
- Subjects
- *
ELECTRONIC commerce , *CLOTHING industry , *TECHNOLOGICAL innovations , *ECONOMIC development , *ECONOMIC reform , *CONSUMER behavior - Abstract
The economic and social gains from electronic commerce (e-commerce) that promote innovation, industry upgrading and economic growth have been widely discussed. China's successful experience with e-commerce has had a positive effect in transforming consumer-goods sectors of the economy and motivating economic reform. This article looks at how e-commerce reduces barriers to entry and enables firms to move up the value chain by using the global value chain framework to analyse the impact of e-commerce on the upgrading trajectories and governance structures of China's apparel industry. For large Chinese brands, e-commerce has enabled end-market diversification. For small- and medium-sized enterprises, e-commerce has facilitated entry with functional upgrading as well as end-market upgrading. In the "two-sided markets" created by platform companies, the "engaged consumers" are the demand side of this market, and "e-commerce focused apparel firms" are the supply side of the new market. Consumers and platforms are more directly involved in value creation within this emerging internet-based structure. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
33. Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome.
- Author
-
Li, Fuyi, Purcell, Anthony W, Daly, Roger J, Li, Chen, Marquez-Lago, Tatiana T, Leier, André, Akutsu, Tatsuya, Smith, A Ian, Lithgow, Trevor, Song, Jiangning, and Chou, Kuo-Chen
- Subjects
- *
PHOSPHORYLATION , *KINASES , *DISEASES , *EXPERIMENTS , *INTERNET servers , *BIOINFORMATICS - Abstract
Motivation: Kinase-regulated phosphorylation is a ubiquitous type of post-translational modification (PTM) in both eukaryotic and prokaryotic cells. Phosphorylation plays fundamental roles in many signalling pathways and biological processes, such as protein degradation and protein-protein interactions. Experimental studies have revealed that signalling defects caused by aberrant phosphorylation are highly associated with a variety of human diseases, especially cancers. In light of this, a number of computational methods aiming to accurately predict protein kinase family-specific or kinase-specific phosphorylation sites have been established, thereby facilitating phosphoproteomic data analysis. Results: In this work, we present Quokka, a novel bioinformatics tool that allows users to rapidly and accurately identify human kinase family-regulated phosphorylation sites. Quokka was developed by using a variety of sequence scoring functions combined with an optimized logistic regression algorithm. We evaluated Quokka based on well-prepared up-to-date benchmark and independent test datasets, curated from the Phospho.ELM and UniProt databases, respectively. The independent test demonstrates that Quokka improves the prediction performance compared with state-of-the-art computational tools for phosphorylation prediction. In summary, our tool provides users with high-quality predicted human phosphorylation sites for hypothesis generation and biological validation. Availability and implementation: The Quokka webserver and datasets are freely available at http://quokka.erc.monash.edu/. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
34. Multiple positive solutions for a class of (2, p)-Laplacian equation.
- Author
-
Li, Fuyi, Rong, Ting, and Liang, Zhanping
- Subjects
- *
SET theory , *LAPLACIAN operator , *BOUNDARY value problems , *INFINITY (Mathematics) , *VARIATIONAL approach (Mathematics) - Abstract
In this work, we investigate the (2, p)-Laplacian equation $-\Delta u-\Delta_p u=f(x,u)$ in $\Omega$ with the boundary condition $u=0$ on $\partial\Omega$, where $\Omega$ is a smooth bounded domain in $\mathbb{R}^N$, $p>2$, and the nonlinearity $f$ has an extension property at both the zero and infinity points. We show that the above equation admits at least two positive solutions by means of the mountain pass theorem and Ekeland's variational principle. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
35. Global solutions and blow up solutions to a class of pseudo-parabolic equations with nonlocal term.
- Author
-
Zhu, Xiaoli, Li, Fuyi, and Li, Yuhua
- Subjects
- *
PARABOLIC differential equations , *BOUNDARY value problems , *MATHEMATICAL models of Newtonian fluids , *MANIFOLDS (Mathematics) , *GROUND state (Quantum mechanics) - Abstract
In this paper, we investigate an initial boundary value problem for a class of pseudo-parabolic partial differential equations with a Newtonian nonlocal term. First, the local existence and uniqueness of a weak solution is established. By virtue of the energy functional and the related Nehari manifold, we also describe the exponential decay behavior and the blow-up phenomenon of weak solutions with different kinds of initial data. Our second conclusion states that some solutions starting in a potential well exist globally, whereas solutions with suitable initial data outside the potential well must blow up. Furthermore, the instability of a ground state equilibrium solution is studied. [ABSTRACT FROM AUTHOR]
- Published
- 2018
36. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework.
- Author
-
Song, Jiangning, Li, Fuyi, Takemoto, Kazuhiro, Haffari, Gholamreza, Akutsu, Tatsuya, Chou, Kuo-Chen, and Webb, Geoffrey I.
- Subjects
- *
ENZYME inhibitors , *AMINO acid sequence , *STRUCTURE-activity relationships , *MACHINE learning , *PROTEIN structure , *CATALYTIC activity - Abstract
Determining the catalytic residues in an enzyme is critical to our understanding of the relationship between protein sequence, structure, and function, and to enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods used for identifying catalytic residues are resource- and labor-intensive, computational approaches have considerable value and are highly desirable for their ability to complement experimental studies in identifying catalytic residues and helping to bridge the sequence–structure–function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with an area under the receiver operating characteristic curve and area under the precision-recall curve ranging from 0.896 to 0.973 and from 0.294 to 0.523, respectively. We demonstrated that this method was able to capture useful signals arising from different levels, leveraging such differential but useful types of features and allowing us to significantly improve the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool for both understanding the complex sequence–structure–function relationships of proteins and facilitating the characterization of novel enzymes lacking functional annotations. [ABSTRACT FROM AUTHOR]
- Published
- 2018
37. PROSPERous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy.
- Author
-
Song, Jiangning, Li, Fuyi, Leier, André, Marquez-Lago, Tatiana T, Akutsu, Tatsuya, Haffari, Gholamreza, Chou, Kuo-Chen, Webb, Geoffrey I, and Pike, Robert N
- Subjects
- *
PROTEOLYTIC enzymes , *PEPTIDES - Abstract
Proteases are enzymes that specifically cleave the peptide backbone of their target proteins. As an important type of irreversible post-translational modification, protein cleavage underlies many key physiological processes. When dysregulated, proteases' actions are associated with numerous diseases. Many proteases are highly specific, cleaving only those target substrates that present certain particular amino acid sequence patterns. Therefore, tools that successfully identify potential target substrates for proteases may also identify previously unknown, physiologically relevant cleavage sites, thus providing insights into biological processes and guiding hypothesis-driven experiments aimed at verifying protease-substrate interaction. In this work, we present PROSPERous, a tool for rapid in silico prediction of protease-specific cleavage sites in substrate sequences. Our tool is based on logistic regression models and uses different scoring functions and their pairwise combinations to subsequently predict potential cleavage sites. PROSPERous represents a state-of-the-art tool that enables fast, accurate and high-throughput prediction of substrate cleavage sites for 90 proteases. [ABSTRACT FROM AUTHOR]
- Published
- 2018
38. Existence and uniqueness results for Kirchhoff–Schrödinger–Poisson system with general singularity.
- Author
-
Li, Fuyi, Song, Zhaoxia, and Zhang, Qi
- Subjects
- *
MATHEMATICS theorems , *EIGENFUNCTIONS , *MATHEMATICAL equivalence , *UNIQUENESS (Mathematics) , *LIPSCHITZ spaces - Abstract
In this paper, under general singular assumptions on f, we discuss the existence and uniqueness of solutions to a class of Kirchhoff–Schrödinger–Poisson systems. For the existence of a solution, we assume that f is nonincreasing and satisfies a general singularity at zero and sublinear growth at infinity. Furthermore, we also obtain a uniqueness result assuming that f satisfies a one-sided Lipschitz condition. The variational method is employed to discuss the existence and uniqueness of solutions to this system. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
39. Existence and multiplicity of positive solutions to Schrödinger-Poisson type systems with critical nonlocal term.
- Author
-
Li, Yuhua, Li, Fuyi, and Shi, Junping
- Subjects
- *
GROUND state energy , *SCHRODINGER equation , *POISSON processes , *NONLINEAR theories , *MONOTONIC functions - Abstract
The existence, nonexistence and multiplicity of positive radially symmetric solutions to a class of Schrödinger-Poisson type systems with critical nonlocal term are studied with variational methods. The existence of both the ground state solution and mountain pass type solutions is proved. It is shown that the parameter ranges of existence and nonexistence of positive solutions for the critical nonlocal case are completely different from the ones for the subcritical nonlocal system. [ABSTRACT FROM AUTHOR]
- Published
- 2017
40. Existence of solutions to a class of Schrödinger–Poisson systems with indefinite nonlinearity.
- Author
-
Li, Fuyi, Chang, Caihong, and Feng, Xiaojing
- Subjects
- *
SEMICONDUCTORS , *SCHRODINGER equation , *MANIFOLDS (Mathematics) , *QUANTUM mechanics , *MATHEMATICAL analysis - Published
- 2017
41. Existence and concentration of sign-changing solutions to Kirchhoff-type system with Hartree-type nonlinearity.
- Author
-
Li, Fuyi, Gao, Chunjuan, and Zhu, Xiaoli
- Subjects
- *
EXISTENCE theorems , *KIRCHHOFF'S theory of diffraction , *HARTREE-Fock approximation , *NONLINEAR theories , *COEFFICIENTS (Statistics) , *SCHRODINGER equation - Abstract
In this paper, we discuss the existence and the concentration of sign-changing solutions to a class of Kirchhoff-type systems with Hartree-type nonlinearity in ℝ³. By the minimization argument on the sign-changing Nehari manifold and a quantitative deformation lemma, we prove that the system has a sign-changing solution. Moreover, concentration behaviors of sign-changing solutions are obtained when the coefficient of the potential function tends to infinity. In particular, our results cover general Schrödinger equations, Kirchhoff equations and Schrödinger–Poisson systems. [ABSTRACT FROM AUTHOR]
- Published
- 2017
42. Multiple solutions to a class of generalized quasilinear Schrödinger equations with a Kirchhoff-type perturbation.
- Author
-
Li, Fuyi, Zhu, Xiaoli, and Liang, Zhanping
- Subjects
- *
SCHRODINGER equation , *NUMERICAL solutions to partial differential equations , *QUASILINEARIZATION , *PERTURBATION theory , *EXISTENCE theorems , *MATHEMATICAL analysis - Abstract
In this paper, we consider a class of generalized quasilinear Schrödinger equations with a Kirchhoff-type perturbation. Under the assumption that the potential may be vanishing at infinity, the existence of both the ground state and the ground state sign-changing solutions is established. Furthermore, the behavior of these solutions is studied when the perturbation vanishes. As a byproduct, we find a surprising monotonicity phenomenon for the quotient function. [ABSTRACT FROM AUTHOR]
- Published
- 2016
43. A sufficient condition for blowup of solutions to a class of pseudo-parabolic equations with a nonlocal term.
- Author
-
Zhu, Xiaoli, Li, Fuyi, Liang, Zhanping, and Rong, Ting
- Subjects
- *
ENERGY function , *GROUND state energy , *MATHEMATICAL bounds , *BLOWING up (Algebraic geometry) , *PARTIAL differential equations - Abstract
A sufficient condition for blowup of solutions to a class of pseudo-parabolic equations with a nonlocal term is established in this paper. By virtue of the potential wells method, we first extend the results obtained by Xu and Su in [J. Funct. Anal., 264 (12): 2732-2763, 2013] to the nonlocal case and describe successfully the behavior of solutions by using the energy functional, Nehari functional, and the ground state energy of the stationary equation. Subsequently, we study the boundedness and convergence of any global solution. Finally, we achieve a criterion to guarantee the blowup of solutions without any limit of the initial energy. Copyright © 2015 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
- Published
- 2016
44. Multiple positive radial solutions to some Kirchhoff equations.
- Author
-
Li, Fuyi, Guan, Chen, and Feng, Xiaojing
- Subjects
- *
KIRCHHOFF'S theory of diffraction , *NUMERICAL solutions to partial differential equations , *EXISTENCE theorems , *MULTIPLICITY (Mathematics) , *ELLIPTIC differential equations , *CONTINUOUS functions , *MATHEMATICAL inequalities - Abstract
In this paper, we discuss the existence and multiplicity of positive radial solutions to some Kirchhoff equations and elliptic equations. By using the Leggett–Williams three solutions theorem skillfully, we obtain two positive solutions to Kirchhoff equations under some appropriate conditions. As a direct corollary, we also obtain the same result for elliptic equations. In order to facilitate the proof of the main results, we first establish a new simple three solutions theorem for the equation [a + bφ(x)]x = Ax, where A is a completely continuous operator, a > 0, b ⩾ 0 and φ(x) ⩽ k(‖x‖) for all x ∈ P, where k is a nondecreasing nonnegative continuous function, and then apply it to prove the main results. In addition, we discover and prove a new inequality in the article. At the end of the paper, we present an example for which the elliptic equation has infinitely many positive radial solutions and give the approximate images of the nonlinear term. [ABSTRACT FROM AUTHOR]
- Published
- 2016
45. A new sufficient condition for blow-up of solutions to a class of parabolic equations.
- Author
-
Zhu, Xiaoli, Li, Fuyi, and Li, Yuhua
- Subjects
- *
BLOWING up (Algebraic geometry) , *DEGENERATE parabolic equations , *FUNCTIONALS , *ENERGY function , *MATHEMATICAL singularities - Abstract
In this paper, a new sufficient condition under which the solutions to a classical parabolic equation blow up is established. During this process, through constructing a new functional which falls between the energy and Nehari functionals, we also get a larger blow-up set. [ABSTRACT FROM AUTHOR]
- Published
- 2016
46. Influences of longitudinal gradients on methane-driven membrane biofilm reactor for complete nitrogen removal: A model-based investigation.
- Author
-
Chen, Xueming, Li, Fuyi, Huo, Pengfei, Liu, Jinzhong, Yang, Linyan, Li, Xianhui, Wei, Wei, and Ni, Bing-Jie
- Subjects
- *
MEMBRANE reactors , *TUBULAR reactors , *LIQUEFIED gases , *NITROGEN - Abstract
• Insensitive TN removal to longitudinal heterogeneity at non-excessive CH₄ supply.
• Longitudinal gradient in liquid phase affected biomass stratification/CH₄ utilization.
• Recommended plug flow operation for MBfR with co-current flow of wastewater and CH₄.
Integrating anammox with denitrifying anaerobic methane oxidation (DAMO) in the membrane biofilm reactor (MBfR) is a promising technology capable of achieving complete nitrogen removal from wastewater. However, it remains unknown whether reactor configurations featuring longitudinal gradients parallel to the membrane surface would affect the performance of the CH₄-driven MBfR. To this end, this work aims to study the impacts of longitudinal heterogeneity potentially present in the gas and liquid phases on a representative CH₄-driven MBfR performing anammox/DAMO by applying the reported modified compartmental modeling approach. Through comparing the modeling results of different reactor configurations, this work not only offered important guidance for better design, operation and monitoring of the CH₄-driven MBfR, but also revealed important implications for prospective related modeling research. The total nitrogen removal efficiency of the MBfR at non-excessive CH₄ supply (e.g., surface loading of ≤0.064 g-COD m⁻² d⁻¹ in this work) was found to be insensitive to both longitudinal gradients in the liquid and gas phases. Comparatively, the longitudinal gradient in the liquid phase led to distinct longitudinal biomass stratification and therefore played an influential role in the effective CH₄ utilization efficiency, which was also related to the extent of reactor compartmentation considered in modeling. When supplied with non-excessive CH₄, the MBfR is recommended to be designed/operated with both the biofilm reactor and the membrane lumen as plug flow reactors (PFRs) with co-current flow of wastewater and CH₄, which could mitigate dissolved CH₄ discharge in the effluent. For the reactor configurations with the biofilm reactor designed/operated as a PFR, multi-spot sampling in the longitudinal direction is needed to obtain a correct representation of the microbial composition of the MBfR. [ABSTRACT FROM AUTHOR]
- Published
- 2022
47. Existence of multiple positive solutions to nonhomogeneous Schrödinger–Poisson system.
- Author
-
Zhang, Qi, Li, Fuyi, and Liang, Zhanping
- Subjects
- *
EXISTENCE theorems , *SCHRODINGER equation , *POISSON'S ratio , *GENERALIZATION , *ASYMPTOTIC efficiencies - Abstract
In this paper, we consider the existence of multiple solutions to the following nonhomogeneous generalized Schrödinger–Poisson system: −Δu + Ku + qϕf(u) = g(u) + h(x) in ℝ³, −Δϕ = 2qF(u) in ℝ³, where q ⩾ 0 is a parameter, 0 ≠ h(x) = h(|x|) ∈ L²(ℝ³), and g is asymptotically linear or superlinear at infinity. We show that there exists q₀ > 0 such that the system has at least two positive radial solutions for q ∈ [0, q₀). [ABSTRACT FROM AUTHOR]
- Published
- 2015
48. Positive solutions to some equations with homogeneous operator.
- Author
-
Li, Fuyi, Guan, Chen, and Li, Yuhua
- Subjects
- *
OPERATOR theory , *NUMERICAL solutions to equations , *CONTINUOUS functions , *LINEAR operators , *BANACH spaces , *FIXED point theory - Abstract
In this paper, we discuss the positive solutions to the equation φ(u)u = λaAu + Bu + u₀, where A is a positive linear completely continuous operator, B is an α-homogeneous operator defined on a cone in a real Banach space and φ(u) = a + b‖u‖^β. By using the fixed point index theory, when u₀ is sufficiently small, the spectral radius λr(A) < 1 and α − γβ > 1, where γ = sgn b, we obtain a positive solution to the above equation under some appropriate conditions. The new results generalize the previous research about the homogeneous operator equation. As an application, by using our main theorem we can obtain a symmetrical positive solution to the one dimensional Kirchhoff equation. [ABSTRACT FROM AUTHOR]
- Published
- 2015
49. Existence of positive solutions to Schrödinger-Poisson type systems with critical exponent.
- Author
-
Li, Fuyi, Li, Yuhua, and Shi, Junping
- Subjects
- *
EXISTENCE theorems , *SCHRODINGER equation , *NUMERICAL solutions to Poisson's equation , *CRITICAL exponents , *MATHEMATICAL proofs - Abstract
The existence of positive solutions to Schrödinger-Poisson type systems in ℝ³ with critically growing nonlocal term is proved by using a variational method which does not require the usual compactness conditions. A key ingredient of the proof is a new Brézis-Lieb type convergence result. [ABSTRACT FROM AUTHOR]
- Published
- 2014
50. ResNetKhib: a novel cell type-specific tool for predicting lysine 2-hydroxyisobutylation sites via transfer learning.
- Author
-
Jia, Xiaoti, Zhao, Pei, Li, Fuyi, Qin, Zhaohui, Ren, Haoran, Li, Junzhou, Miao, Chunbo, Zhao, Quanzhi, Akutsu, Tatsuya, Dou, Gensheng, Chen, Zhen, and Song, Jiangning
- Subjects
- *
INTERNET servers , *LIQUID chromatography-mass spectrometry , *RECEIVER operating characteristic curves , *PENTOSE phosphate pathway , *LYSINE , *GLYCOLYSIS , *FEATURE selection - Abstract
Lysine 2-hydroxyisobutylation (Khib), which was first reported in 2014, has been shown to play vital roles in a myriad of biological processes including gene transcription, regulation of chromatin functions, purine metabolism, pentose phosphate pathway and glycolysis/gluconeogenesis. Identification of Khib sites in protein substrates represents an initial but crucial step in elucidating the molecular mechanisms underlying protein 2-hydroxyisobutylation. Experimental identification of Khib sites mainly depends on the combination of liquid chromatography and mass spectrometry. However, experimental approaches for identifying Khib sites are often time-consuming and expensive compared with computational approaches. Previous studies have shown that Khib sites may have distinct characteristics for different cell types of the same species. Several tools have been developed to identify Khib sites, which exhibit high diversity in their algorithms, encoding schemes and feature selection techniques. However, to date, there are no tools designed for predicting cell type-specific Khib sites. Therefore, it is highly desirable to develop an effective predictor for cell type-specific Khib site prediction. Inspired by the residual connection of ResNet, we develop a deep learning-based approach, termed ResNetKhib, which leverages both the one-dimensional convolution and transfer learning to enable and improve the prediction of cell type-specific 2-hydroxyisobutylation sites. ResNetKhib is capable of predicting Khib sites for four human cell types, mouse liver cell and three rice cell types. Its performance is benchmarked against the commonly used random forest (RF) predictor on both 10-fold cross-validation and independent tests. 
The results show that ResNetKhib achieves the area under the receiver operating characteristic curve values ranging from 0.807 to 0.901, depending on the cell type and species, which performs better than RF-based predictors and other currently available Khib site prediction tools. We also implement an online web server of the proposed ResNetKhib algorithm together with all the curated datasets and trained model for the wider research community to use, which is publicly accessible at https://resnetkhib.erc.monash.edu/. [ABSTRACT FROM AUTHOR]
- Published
- 2023