The distance function effect on k-nearest neighbor classification for medical datasets

Hu, Li-Yu ; Huang, Min-Wei ; Ke, Shih-Wen ; Tsai, Chih-Fong

SpringerPlus, 2016-08, Vol.5 (1), p.1304-1304, Article 1304 [Peer-reviewed journal]

Cham: Springer International Publishing

Title:
The distance function effect on k-nearest neighbor classification for medical datasets
Authors: Hu, Li-Yu ; Huang, Min-Wei ; Ke, Shih-Wen ; Tsai, Chih-Fong
Subjects: Case Study ; Computer Science ; Humanities and Social Sciences ; multidisciplinary ; Science ; Science (multidisciplinary)
Is part of: SpringerPlus, 2016-08, Vol.5 (1), p.1304-1304, Article 1304
Description: Introduction: K-nearest neighbor (k-NN) classification is a conventional non-parametric classifier that has been used as the baseline classifier in many pattern classification problems. It decides the final classification output by measuring the distances between the test data and each of the training data. Case description: Because the Euclidean distance function is the most widely used distance metric in k-NN, no study has examined the classification performance of k-NN under different distance functions, especially for medical domain problems. The aim of this paper is therefore to investigate whether the distance function affects k-NN performance over different medical datasets. The experiments are based on three types of medical datasets, containing categorical, numerical, and mixed types of data, and four distance functions (Euclidean, cosine, chi-square, and Minkowski) are each used individually in k-NN classification. Discussion and evaluation: The experimental results show that the chi-square distance function is the best choice for the three types of datasets. However, the cosine, Euclidean, and Minkowski distance functions perform the worst over the mixed type of datasets. Conclusions: This paper demonstrates that the chosen distance function can affect the classification accuracy of the k-NN classifier. For medical domain datasets including categorical, numerical, and mixed types of data, k-NN based on the chi-square distance function performs the best.
Publisher: Cham: Springer International Publishing
Language: English
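
Illustrative sketch (not part of the catalog record or the article): the description above says k-NN classifies a test point by measuring its distance to every training point, and that the choice of distance function can change the result. The Python below is a minimal sketch of that idea, not the authors' implementation. The function names, the toy data, the Minkowski order p=3, and the particular chi-square form (sum of squared differences divided by the sum of the two values, for non-negative features) are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(x, y):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def minkowski(x, y, p=3):
    # Generalises Euclidean distance (p=2); p=3 is an arbitrary choice here.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def cosine(x, y):
    # One minus the cosine similarity; depends on direction, not magnitude.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # assumed convention for zero vectors
    return 1.0 - dot / (norm_x * norm_y)


def chi_square(x, y):
    # Assumed form: sum of (a - b)^2 / (a + b) over non-negative features,
    # skipping terms whose denominator is zero.
    total = 0.0
    for a, b in zip(x, y):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def knn_predict(train_X, train_y, query, k, distance):
    # Rank training points by distance to the query, then take a majority
    # vote among the k nearest labels.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, invented purely for this example.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
           (6.0, 5.0), (7.0, 7.0), (6.5, 6.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
query = (2.0, 2.0)

for name, fn in [("euclidean", euclidean), ("minkowski", minkowski),
                 ("cosine", cosine), ("chi_square", chi_square)]:
    print(name, knn_predict(train_X, train_y, query, k=3, distance=fn))
```

On this toy data the cosine distance votes for a different class than the other three functions, because it compares only the direction of the vectors. That is a small-scale instance of the effect the description reports, and it says nothing about which function performs best on real medical datasets.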
