Comparison of Multiple Linear Regression and Random Forest Regression Algorithms for Prediction of Particulate Matter 2.5 (PM2.5) Concentration in Jambi City
DOI:
https://doi.org/10.33998/jms.2025.5.2.2273
Keywords:
Particulate Matter 2.5 (PM2.5), Data Mining, Prediction, Multiple Linear Regression, Random Forest Regression
Abstract
Air pollution is increasing globally, and Indonesia, including Jambi Province, is no exception. Polluted air carries dust particles, among them Particulate Matter 2.5 (PM2.5). The PM2.5 concentration in the air is influenced by an area's meteorological conditions and by surrounding events, whether natural or caused by human activity. This study predicts PM2.5 concentration in Jambi City using the Multiple Linear Regression and Random Forest Regression algorithms, with air temperature, air humidity, wind speed, rainfall, and hot spots as independent variables. It compares the two algorithms and assesses the accuracy of each. The Multiple Linear Regression algorithm produces a model that describes the relationship between PM2.5 concentration and air temperature, air humidity, wind speed, rainfall, and hot spots, although its error is larger than that of the Random Forest Regression algorithm. The Random Forest Regression model has an RMSE that is 0.033 μg/m³ smaller than that of the Multiple Linear Regression model. In the MAPE-based accuracy test, the Random Forest Regression algorithm scores 74.0% against 73.0% for Multiple Linear Regression, so the Random Forest Regression algorithm predicts PM2.5 concentrations with higher accuracy.
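The abstract compares the two models by RMSE and MAPE but does not spell out the formulas. The following is a minimal sketch of the standard definitions of both metrics, not the authors' code. The function names and all sample values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical PM2.5 concentrations in µg/m³ (illustrative only).
observed = [20.0, 40.0, 50.0, 80.0]
mlr_pred = [25.0, 30.0, 55.0, 60.0]  # stand-in for Multiple Linear Regression output
rf_pred = [22.0, 36.0, 50.0, 72.0]   # stand-in for Random Forest Regression output

for name, pred in (("MLR", mlr_pred), ("RF", rf_pred)):
    print(f"{name}: RMSE = {rmse(observed, pred):.3f}, MAPE = {mape(observed, pred):.2f}%")
```

A lower RMSE or MAPE indicates a closer fit. How the accuracy percentages reported in the abstract were derived from MAPE is not stated there, so the sketch reports MAPE directly and makes no assumption about that conversion.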


