Comparison between students’ perception and examination validity, reliability and item difficulty: a cross-sectional study


Introduction: Students' perception of an examination reflects their feelings about it, while item analysis is a statistical analysis of students’ responses to examination items. This study was conducted to compare students’ perception of an examination with its item analysis. Material and methods: This cross-sectional study was conducted at the College of Medicine from January to April 2019, using a structured questionnaire and standardized item analysis of students’ examinations. All students registered for semester two (2018-2019) were included. Students who refused to participate or who did not complete the questionnaire were excluded.

Results: The KR-20 of the examination was 0.906, and its average difficulty index was 69.4. The questionnaire response rate was 88.9% (40/45). Most students (70.4%) considered the examination easy. A significant correlation was found between students’ perception of examination difficulty and the standard examination difficulty.
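Neither formula is given in the abstract, but the two statistics reported above are conventionally computed from a students-by-items matrix of dichotomous (0/1) scores. The sketch below, assuming such a matrix as input, shows the standard Kuder-Richardson 20 formula and the per-item difficulty index (percentage of correct responses); it is illustrative, not the authors' actual analysis pipeline.

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson 20 reliability for a students x items 0/1 score matrix."""
    responses = np.asarray(responses, dtype=float)
    k = responses.shape[1]                 # number of items
    p = responses.mean(axis=0)             # proportion answering each item correctly
    q = 1 - p                              # proportion answering incorrectly
    total_scores = responses.sum(axis=1)   # each student's total score
    var_total = total_scores.var(ddof=1)   # sample variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / var_total)

def difficulty_index(responses):
    """Difficulty index per item: percentage of students answering correctly."""
    return 100 * np.asarray(responses, dtype=float).mean(axis=0)
```

A KR-20 of 0.906, as reported here, indicates high internal consistency; values above about 0.9 are generally considered adequate for high-stakes examinations.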

Discussion: Students’ perceptions support the evidence of examination validity. Students were able to estimate examination difficulty.

Keywords: Student perception, Item analysis, Assessment, Validity, Reliability.
