Identification of compound–protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds

Chen, Lei ; Zhang, Yu-Hang ; Zheng, Mingyue ; Huang, Tao ; Cai, Yu-Dong

Molecular genetics and genomics : MGG, 2016-12, Vol.291 (6), p.2065-2079 [Peer-reviewed journal]

Berlin/Heidelberg: Springer Berlin Heidelberg

  • Subjects: Algorithms ; Animal Genetics and Genomics ; Biochemistry ; Biomedical and Life Sciences ; Computational Biology - methods ; Databases, Genetic ; Gene Ontology ; Human Genetics ; Life Sciences ; Microbial Genetics and Genomics ; Original Article ; Plant Genetics and Genomics ; Proteins - chemistry ; Proteins - metabolism ; Small Molecule Libraries - pharmacology
  • Description: Compound–protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound–protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound–protein interactions. Compound–protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound–protein interactions. This work provides new clues to understanding the system of compound–protein interactions by analyzing extracted core features. Our results indicate that compound–protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
  • Language: English
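
The abstract in this record names incremental feature selection and the Matthews correlation coefficient (MCC) but gives no implementation details. The following is only a minimal sketch of how those two pieces are commonly combined, not the authors' code. The function names `mcc` and `incremental_feature_selection` and the `evaluate` callback are hypothetical; in a real pipeline `evaluate` would stand in for something like a cross-validated random forest scored by MCC, and the ranked feature list would come from a ranking method such as minimum redundancy maximum relevance.

```python
import math
from typing import Callable, List, Sequence, Tuple


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ... top-N ranked features and keep the best.

    `evaluate` is a hypothetical callback returning a score (for example
    an MCC) for a classifier trained on the given feature subset.
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = evaluate(subset)
        if score > best_score:  # strict ">" keeps the smallest best subset
            best_subset, best_score = subset, score
    return best_subset, best_score
```

On this scale, an MCC reported as 77.55 % corresponds to a value of 0.7755. The strict comparison in the loop is a design choice of this sketch: on ties it prefers the smaller feature subset.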