Semi-supervised learning approaches with applications in Medicinal Chemistry

Gertrudes, Jadson Castro

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação 2019-05-20

Online access. The library also holds print copies.

  • Title:
    Semi-supervised learning approaches with applications in Medicinal Chemistry
  • Author: Gertrudes, Jadson Castro
  • Advisor: Campello, Ricardo José Gabrielli Barreto
  • Subjects: Density-Based Clustering; Semi-Supervised Classification; Semi-Supervised Clustering; Structure-Activity Relationship
  • Notes: Thesis (Doctorate)
  • Description: Semi-supervised learning is drawing increasing attention in the era of big data, as the gap between the abundance of cheap, automatically collected unlabeled data and the scarcity of labeled data, which are laborious and expensive to obtain, is increasing dramatically. In this thesis, we first introduce a unified view of density-based clustering algorithms. We then build upon this view and bridge the areas of semi-supervised clustering and classification under a common umbrella of density-based techniques. We show that there are close relations between density-based clustering algorithms and the graph-based approach for transductive classification. These relations are then used as a basis for a new framework for semi-supervised classification based on building blocks from density-based clustering. This framework is not only efficient and effective, but also statistically sound. We also generalize HDBSCAN*, the core algorithm of the framework, so that it can perform semi-supervised clustering by directly taking advantage of any fraction of labeled data that may be available, rather than relying on instance-level pairwise constraints. Experimental results on a large collection of datasets show the advantages of the proposed approach for both semi-supervised classification and semi-supervised clustering. In addition, we evaluate the semi-supervised learning algorithms to determine relationships between chemical structure and biological activity in datasets from Medicinal Chemistry. The datasets evaluated in this area are characterized by a low number of labeled examples and high dimensionality, and in some cases do not have a clear relationship between chemical structure and biological activity, which makes it difficult to use classification techniques and analyze chemical phenomena. We implement and validate semi-supervised classification approaches that are appropriate for data analysis in Medicinal Chemistry.
  • DOI: 10.11606/T.55.2019.tde-22082019-105334
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação
  • Creation/publication date: 2019-05-20
  • Format: Adobe PDF
  • Language: English
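
As a rough illustration of the general idea summarized in the description (labels spreading through a density-based structure), the sketch below computes the mutual reachability distance commonly used as a building block in density-based clustering and then greedily extends a few known labels to unlabeled points. It is not the thesis's framework or its HDBSCAN* generalization. The function names, the `min_pts` parameter, the `-1` convention for unlabeled points, and the greedy nearest-labeled expansion are all assumptions made for this example.

```python
import numpy as np


def mutual_reachability(X, min_pts):
    """Pairwise mutual reachability distances (illustrative helper).

    The core distance of a point is taken here as the distance to its
    min_pts-th nearest neighbor, counting the point itself.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    core = np.sort(dist, axis=1)[:, min_pts - 1]
    return np.maximum(dist, np.maximum(core[:, None], core[None, :]))


def propagate_labels(X, labels, min_pts=3):
    """Greedy, Prim-style label expansion (an assumed simplification).

    labels: integer array where -1 marks an unlabeled point.
    At each step, the unlabeled point closest (in mutual reachability
    distance) to any labeled point inherits that point's label.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels).copy()
    labeled = labels >= 0
    if not labeled.any():
        raise ValueError("at least one labeled point is required")

    mreach = mutual_reachability(X, min_pts)
    while not labeled.all():
        unl_idx = np.flatnonzero(~labeled)
        lab_idx = np.flatnonzero(labeled)
        # Distances from every unlabeled point to every labeled point.
        sub = mreach[np.ix_(unl_idx, lab_idx)]
        r, c = np.unravel_index(np.argmin(sub), sub.shape)
        labels[unl_idx[r]] = labels[lab_idx[c]]
        labeled[unl_idx[r]] = True
    return labels
```

The loop recomputes the candidate distances at every step, which keeps the sketch short but makes it cubic in the number of points; it is meant for small toy inputs only.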