
Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines

Oliveira, Artur Andre Almeida De Macedo

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2023-08-11

Online access. The library also holds print copies.

  • Title:
    Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines
  • Author: Oliveira, Artur Andre Almeida De Macedo
  • Advisor: Hirata Junior, Roberto
  • Subjects: Aprendizagem Profunda; Dificuldade De Instância; Imagens Urbanas; Visão Computacional; Computer Vision; Deep Learning; Instance Hardness; Urban Images
  • Notes: Thesis (Doctorate)
  • Description: Urban image classification at the street-level poses significant challenges due to the presence of diverse elements, varying appearances, and complex poses. Factors such as occlusion, background clutter, environmental conditions, and camera viewpoints further complicate the classification process. In this study, we leverage the capabilities of state-of-the-art Deep Learning Networks (DLNs), including MobileNets, ResNets, DenseNets, and EfficientNets, to tackle these challenges head-on. We aim to evaluate the performance of these DLNs, identify limitations, and propose innovative techniques for overcoming them. Our research focuses on the specific task of classifying urban images with or without trees near overhead powerlines. Through an extensive exploration, we provide methods and insights that not only address this classification problem but also offer generalizable solutions applicable to a range of classification tasks. Two major contributions are introduced in our work. Firstly, we extend the INvestigate and Analyze a CITY (INACITY) platform by integrating a graph-oriented database, improving the performance and coverage of urban image collection from Google Street View. Secondly, we develop the Street-Level Image Labeler (SLIL) tool, which efficiently mitigates the manual labeling burden, facilitating dataset creation. With the help of INACITY and SLIL, we curate a comprehensive labeled dataset comprising 8,800 street-level urban images. Human evaluation of the dataset reveals the presence of challenging images that perplex even experienced classifiers. For example, distinguishing whether powerlines intersect or pass behind tree canopies can be difficult depending on the perspective. The comparison of state-of-the-art DLNs on this dataset reveals that the highest accuracy achieved by plain DLNs is 74.6%.
However, by introducing a new class distinct from positive or negative, and by employing the noisy student training protocol and focal loss, we raise the recall for the positive class from 66.5% to 83.7% and for the negative class from 63.7% to 78.8%. This approach enables us to better identify and classify images that were previously prone to misclassification.
  • DOI: 10.11606/T.45.2023.tde-22032024-184659
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Creation/publication date: 2023-08-11
  • Format: Adobe PDF
  • Language: English
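
The description above names focal loss as one of the ingredients that improved recall. As a rough illustration only, and not the thesis's own implementation, the sketch below computes the commonly used form of focal loss, -(1 - p_t)^gamma * log(p_t), for a single example with three classes. The class indices, the logits, and the value gamma = 2 are assumptions chosen for the example; the record does not state the settings actually used.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class. With gamma = 0
    this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical class indices: 0 = negative, 1 = positive, 2 = the additional class.
easy = focal_loss([0.0, 4.0, 0.0], target=1)  # confident, correct prediction
hard = focal_loss([1.0, 0.5, 1.0], target=1)  # uncertain prediction

print(f"easy example loss: {easy:.4f}")
print(f"hard example loss: {hard:.4f}")
```

The factor (1 - p_t)^gamma shrinks the contribution of examples the model already classifies confidently, so training concentrates on the harder images.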
