Comparison of Lung Cancer Classification Using Support Vector Machine and K-Nearest Neighbor

Authors

  • Anita Desiani, Universitas Sriwijaya
  • Sri Indra Maiyanti, Universitas Sriwijaya
  • Yuli Andriani, Universitas Sriwijaya
  • Bambang Suprihatin, Universitas Sriwijaya
  • Ali Amran, Universitas Sriwijaya
  • Nyanyu Chika Marselina, Universitas Sriwijaya
  • Aulia Salsabila, Universitas Sriwijaya

DOI:

https://doi.org/10.33998/processor.2023.18.1.700

Keywords:

K-Fold Cross Validation, K-Nearest Neighbor, Lung Cancer, Percentage Split, Support Vector Machine

Abstract

Lung cancer is the leading cause of death among men and the second leading cause of death among women. One way to reduce lung cancer mortality is early detection, which can be supported by classification. Classification is the process of identifying objects and grouping those with the same features or characteristics into a set of predefined classes. Two algorithms widely used for classification are Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). SVM has the advantage of identifying a separating hyperplane that maximizes the margin between two or more classes, but it is difficult to apply to large datasets, whereas KNN can separate large-scale data and is robust to noise in the data. This study aims to build models using the SVM and KNN algorithms for lung cancer classification. The lung cancer dataset contains 309 records, which were partitioned using the percentage split and k-fold cross-validation methods for each algorithm. The models were evaluated on accuracy, precision, and recall. The highest accuracy, precision, and recall were obtained by SVM with the percentage split method, at 95.16%, 88%, and 82.5%, respectively. This indicates that SVM with the percentage split method performs better at classifying lung cancer than the other algorithm and method combinations tested.
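The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (only the record count of 309 comes from the abstract), and the 80/20 split ratio, the 10 folds, the 5 neighbors, and every function name are assumptions chosen for illustration. Only KNN is written out, using the standard library. An SVM from a library such as scikit-learn could be evaluated the same way by replacing `knn_predict`. Precision and recall are computed here for a single positive class, and the abstract does not say how the study averaged them.

```python
import math
import random
from collections import Counter


def percentage_split(n, train_fraction, rng):
    """Shuffle indices 0..n-1 and cut them once into train and test parts."""
    indices = list(range(n))
    rng.shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def k_fold_splits(n, k, rng):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def scores(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def evaluate(X, y, train_idx, test_idx, k_neighbors):
    """Fit KNN on the training indices and score it on the test indices."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    preds = [knn_predict(train_X, train_y, X[i], k_neighbors) for i in test_idx]
    return scores([y[i] for i in test_idx], preds)


def make_toy_data(n, rng):
    """Synthetic two-feature data (assumption): class 1 is shifted from class 0."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append((rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)))
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(309, rng)  # 309 matches the record count in the abstract

# Percentage split: a single 80/20 partition (the ratio is an assumption).
train_idx, test_idx = percentage_split(len(X), 0.8, rng)
split_scores = evaluate(X, y, train_idx, test_idx, k_neighbors=5)

# K-fold cross-validation: average the three metrics over 10 folds (assumption).
fold_scores = [evaluate(X, y, tr, te, k_neighbors=5)
               for tr, te in k_fold_splits(len(X), 10, rng)]
cv_scores = tuple(sum(col) / len(col) for col in zip(*fold_scores))

print("percentage split (accuracy, precision, recall):", split_scores)
print("10-fold CV mean  (accuracy, precision, recall):", cv_scores)
```

A percentage split scores the model on one held-out partition, so its result depends on which records land in the test set. K-fold cross-validation tests every record exactly once and averages over the folds. The numbers this sketch prints describe the synthetic data only and say nothing about the results reported in the abstract.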

Published

2023-04-30

How to Cite

Desiani, A., Indra Maiyanti, S., Andriani, Y., Suprihatin, B., Amran, A., Marselina, N. C., & Salsabila, A. (2023). Perbandingan Klasifikasi Penyakit Kanker Paru-Paru menggunakan Support Vector Machine dan K-Nearest Neighbor [Comparison of lung cancer classification using support vector machine and k-nearest neighbor]. Jurnal PROCESSOR, 18(1). https://doi.org/10.33998/processor.2023.18.1.700