Psychometric Characteristics of the Kalam Cognitive Test (KCT): An Andrich Rating Scale Analysis

Kriswantoro
Rizki Nor Amelia

Abstract

The science of Kalam (Islamic theology) is a compulsory course for prospective Islamic religious education teachers, and mastery of its material is essential. The measurement instruments used must therefore have sound psychometric characteristics if they are to reflect these prospective teachers' abilities accurately. This study analyzes the psychometric characteristics of the Kalam Cognitive Test (KCT) under the Andrich Rating Scale Model, with estimation carried out in the Winsteps program. The research, conducted in the spring semester of the 2022–2023 academic year, involved 44 prospective teachers in the Islamic Religious Education Study Program at Jambi Ma'arif Islamic College, selected through cluster random sampling. The analysis shows that the KCT has good psychometric characteristics: it satisfies the assumptions of the analysis and exhibits adequate item reliability, person reliability, and item difficulty. With respect to the rating scale analysis, all response categories met the required criteria and therefore functioned properly.
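For reference, the abstract does not reproduce the Andrich rating scale formulation, so the following is the standard statement of the model rather than the authors' own notation. The probability that person n with ability \theta_n responds in category x of item i with difficulty \delta_i is

P(X_{ni} = x \mid \theta_n) = \frac{\exp\!\left[\sum_{k=1}^{x} (\theta_n - \delta_i - \tau_k)\right]}{\sum_{j=0}^{m} \exp\!\left[\sum_{k=1}^{j} (\theta_n - \delta_i - \tau_k)\right]}, \qquad x \in \{0, 1, \ldots, m\},

where the empty sum for x = 0 is defined as zero and the thresholds \tau_1, \ldots, \tau_m are shared by all items. This sharing makes the rating scale categories a property of the instrument as a whole, which is why conclusions can be drawn about "all rating scale categories" at once.

As a minimal sketch of how the model turns parameter estimates into category probabilities (the study itself used Winsteps; the parameter values below are hypothetical illustrations, not KCT estimates):

import math

def rsm_category_probs(theta, delta, taus):
    """Andrich rating scale model category probabilities.

    theta : person ability in logits
    delta : item difficulty in logits
    taus  : thresholds tau_1..tau_m, shared across all items
    Returns the probabilities of responding in categories 0..m.
    """
    numerators = [1.0]                 # category 0: empty sum, so exp(0) = 1
    cum = 0.0
    for tau in taus:
        cum += theta - delta - tau     # running sum_{k=1}^{x} (theta - delta - tau_k)
        numerators.append(math.exp(cum))
    total = sum(numerators)
    return [num / total for num in numerators]

# Hypothetical values for illustration only:
probs = rsm_category_probs(theta=0.5, delta=-0.2, taus=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])                        # P(category 0..3)
print(round(sum(x * p for x, p in enumerate(probs)), 3))   # expected item score

In diagnostics of this kind, estimated thresholds that are not ordered (tau_k not increasing with k), or categories that are never the most probable response at any ability level, would typically signal malfunctioning categories; the abstract's conclusion that all categories met the required criteria corresponds to checks of this sort.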

How to Cite
Kriswantoro, & Amelia, R. N. (2023). Psychometric Characteristics of the Kalam Cognitive Test (KCT): An Andrich Rating Scale Analysis. Jurnal Penelitian, 20(1), 55–66. https://doi.org/10.28918/jupe.v20i1.1098
