Leveraging Naive Bayes Classification for Early Detection of Breast Cancer: A Data-Centric Diagnostic Approach

Baik Budi; Refki Budiman; Queen Hesti Ramadhamy

doi:10.25077/jarpet.v5i1.114

Published: Jun 1, 2025

Keywords:

Naïve Bayes, K-Means, Cancer, Machine Learning

Baik Budi

Universitas Andalas, Fakultas Teknik, Departemen Teknik Elektro

Refki Budiman

Universitas Andalas, Fakultas Teknik, Departemen Teknik Elektro

Queen Hesti Ramadhamy

Universitas Andalas, Fakultas Teknik, Departemen Teknik Elektro

Abstract

This study develops a breast cancer detection model using two distinct approaches: the Naive Bayes algorithm for classification and the K-Means algorithm for clustering. The methodology comprises collecting diagnostic clinical feature data, normalizing the data during preprocessing, and training and evaluating each model separately. Naive Bayes classifies breast cancer as malignant or benign using training and testing datasets, while K-Means is applied to unlabeled data as a complementary analysis. The Naive Bayes classifier is assessed with a confusion matrix, and the K-Means clustering results are evaluated with cluster validity metrics. Naive Bayes achieves 93% accuracy in breast cancer classification, while K-Means offers additional insight by grouping patterns in the data. Together, these approaches show potential to support the medical diagnostic process.
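The classification steps named in the abstract (normalization, Naive Bayes training, confusion-matrix evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian variant of Naive Bayes and min-max normalization, neither of which the abstract specifies, and it uses a tiny made-up two-feature dataset rather than the study's data. All function names and feature values are hypothetical. The K-Means step is not covered here.

```python
import math
from collections import defaultdict

def fit_min_max(rows):
    """Learn per-feature min and max from the training rows only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def apply_min_max(rows, lo, hi):
    """Scale features with the training min/max (assumed normalization choice)."""
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate the prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(c) / n for c in cols]
        # var_floor keeps the variance strictly positive
        variances = [
            sum((v - m) ** 2 for v in c) / n + var_floor
            for c, m in zip(cols, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score
    return max(model, key=log_posterior)

def confusion_matrix(y_true, y_pred, positive="malignant"):
    """Count true/false positives and negatives, treating `positive` as the target."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts

# Hypothetical toy data: two made-up features per sample, NOT the study's dataset.
X_train = [[12.0, 15.0], [11.5, 14.0], [13.0, 16.5], [12.5, 15.5],
           [20.0, 25.0], [19.0, 27.0], [21.5, 24.0], [18.5, 26.0]]
y_train = ["benign"] * 4 + ["malignant"] * 4
X_test = [[12.2, 15.2], [20.5, 25.5], [11.8, 16.0], [19.5, 24.5]]
y_test = ["benign", "malignant", "benign", "malignant"]

lo, hi = fit_min_max(X_train)
model = fit_gaussian_nb(apply_min_max(X_train, lo, hi), y_train)
y_pred = [predict(model, row) for row in apply_min_max(X_test, lo, hi)]

cm = confusion_matrix(y_test, y_pred)
acc = (cm["tp"] + cm["tn"]) / sum(cm.values())
print(cm, acc)
```

Because the toy classes are well separated, the sketch classifies all four test samples correctly. That says nothing about the 93% accuracy reported above, which comes from the study's own data. Learning the scaling parameters from the training split only is a design choice made here to avoid leaking test information into preprocessing.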

How to Cite

Budi, B., Budiman, R., & Ramadhamy, Q. H. (2025). Leveraging Naive Bayes Classification for Early Detection of Breast Cancer: A Data-Centric Diagnostic Approach. Jurnal Andalas: Rekayasa Dan Penerapan Teknologi, 5(1), 18–22. https://doi.org/10.25077/jarpet.v5i1.114