
(2) * Ida Nurhaida

*Corresponding author
Abstract
Scientific writing requires precision and clarity to uphold credibility and effective communication. Errors such as spelling mistakes and typos can compromise the quality and reliability of scientific texts. This study proposes a Long Short-Term Memory (LSTM)-based approach to detect and correct spelling errors, enhancing text accuracy and readability. The dataset comprises 45,698 standard words, supplemented with typo variations to improve model performance. Data is sourced from the Indonesian Dictionary (KBBI) and undergoes normalization and preprocessing to capture diverse error patterns. The model’s performance is evaluated using a confusion matrix, achieving 93% accuracy and high precision, recall, and F1-score metrics. These results demonstrate that the proposed NLP-based LSTM model offers an effective and reliable solution for identifying and correcting spelling errors. This approach significantly enhances the quality of scientific writing, ensuring more transparent and credible communication.
Keywords: Scientific Writing; Natural Language Processing; Text Correction System; LSTM
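The abstract reports accuracy, precision, recall, and F1-score derived from a confusion matrix. As a minimal sketch of how those four metrics follow from confusion-matrix counts, the Python below assumes a binary framing (a word is either flagged as a typo or not), which the abstract does not confirm. The function name `binary_metrics` and all counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp: typos correctly flagged        fp: correct words wrongly flagged
    fn: typos missed                   tn: correct words left alone
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can be misleading when correct words far outnumber typos, which is presumably why the abstract reports precision, recall, and F1-score alongside it.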
DOI: https://doi.org/10.33122/ejeset.v5i1.309
Copyright (c) 2024 Yeru Dwi Pratama Halim, Ida Nurhaida

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
























