Medical Students' Performances Using Different Assessment Methods during the Final Examination in Internal Medicine at the University of Benghazi, Libya

Abstract
Background: Different evaluation tools assess different domains of learning, which considerably affects the learning process.
Objective: To compare and correlate the performances of final-year undergraduate medical students in written, clinical, and viva examinations in internal medicine.
Methods: This is a retrospective study. After approval from the authorities, data were collected from the final-year examination results for 2019 to 2020. All students of the medical school at the University of Benghazi were included. Students' gender and their written, clinical, viva, and total scores were recorded. Data were coded and transferred from Excel to SPSS version 24 and expressed as frequencies and percentages. Chi-squared analysis was performed to test for differences in the proportions of categorical variables between two or more groups. The odds ratio (OR) was used to calculate the odds of passing the subject based on scores in the different types of exams. Pearson's correlation (R) was used to evaluate the consistency of students' performances across examinations. A p-value of less than 0.05 was considered statistically significant.
Results: The total number of students was 679, of whom 499 (73.5%) were female and 180 (26.5%) were male. A total of 422 students (62%) passed the course, with no significant difference between males and females. A significantly (p < 0.001) greater percentage of students achieved a passing score in the clinical assessment (502 [73.9%]), followed by the viva assessment (458 [67.5%]). Students performed worst in the written examination, with only 291/679 (43%) passing, and with no gender-based differences. There was a highly significant association between the total score of students who passed the subject and their scores in the written examination, with an OR of 2.3 (p < 0.001). The OR for the viva examination and the total score was 0.79, with no significant differences between males and females. In contrast, there was a statistically significant negative association between clinical exams and total scores of students who passed the subject (OR = 0.58). There was a highly significant correlation (p < 0.001) between written and viva examinations (R = 0.638), between written and clinical examinations (R = 0.629), and between clinical and viva examinations (R = 0.763).
Conclusion: Students performed better in clinical and viva exams than in written exams. There were no notable differences in results between male and female students in any of the three exam types. The written exam was the most reliable indicator of a student's success in the subject. The data also showed a positive correlation between scores on the different exam formats, indicating that students performed consistently across all modes of evaluation.


Introduction
Assessment is the most important factor that drives students' learning, as students tend to study the material that will be assessed. Bloom's taxonomy was originally proposed by Benjamin Bloom in 1956 and has since been revised. The taxonomy consists of six levels: remembering, understanding, applying, analyzing, evaluating, and creating. Each level builds upon the previous one and requires a higher level of cognitive skill. 1 Different methods of assessment examine different domains of Bloom's taxonomy. Theory essays test knowledge (level 1); at this stage, the assessment addresses how well the student has acquired new knowledge. Questions that contain verbs such as explain and compare test comprehension (level 2). Exams that instruct students to apply and compare represent level 3, while those that test analysis and synthesis represent levels 4 and 5. Finally, level 6 tests evaluation and conclusion. Written examinations usually test levels 1 to 3, while clinical examinations test levels 2 to 6. 2 In addition, to create a competent graduate, other skills should be evaluated, such as communication, analytical skills, teamwork skills, and evidence-based medical care. 3,4,6
Students' assessments can be performed by many methods, including short essay questions, students' projects, short and long case [...] 9 The choice of assessment methods depends on the domains being tested. Different learning outcomes should be tested by suitable assessment tools. Usually, a combination of assessment methods is required to test different learning outcomes, and good assessment methods will ultimately promote students' learning. 10,11 In this retrospective analysis, we examine the academic achievement of students of the medical school at the University of Benghazi in internal medicine across three distinct forms of examination.

Methods
This is a retrospective study. After approval from the authorities, data were collected from the final-year examination results for the year 2019-2020. All students were included in the study. Students' gender and their written, clinical, viva, and total scores in the subject were recorded. The written examination comprised two papers of 50 questions each: paper 1 consisted of 50 case scenarios with multiple choice questions, and paper 2 consisted of 50 multiple choice questions. The clinical examination comprised five stations: four clinical stations and one viva examination station.
The total score for the final-year examination was 300. Scores were distributed as follows: 100 marks for the written examination, 150 marks for the clinical examination, and 50 marks for the viva examination. The required pass score was 60%, which corresponds to 180 marks for the total score, 60 marks for the written examination, 90 marks for the clinical examination, and 30 marks for the viva examination.
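The pass marks above follow directly from the 60% threshold. As a minimal sketch of that arithmetic (the names `MAX_MARKS`, `PASS_PERCENT`, and `pass_marks` are illustrative and do not come from the study):

```python
# Maximum marks per component, as described in the Methods.
MAX_MARKS = {"written": 100, "clinical": 150, "viva": 50}
PASS_PERCENT = 60  # required pass score, in percent


def pass_marks(max_marks, pass_percent=PASS_PERCENT):
    """Return the minimum passing mark for each component and for the total."""
    # Integer arithmetic avoids floating-point rounding surprises.
    marks = {name: m * pass_percent // 100 for name, m in max_marks.items()}
    marks["total"] = sum(max_marks.values()) * pass_percent // 100
    return marks


thresholds = pass_marks(MAX_MARKS)
print(thresholds)
```

Under these assumptions the computed thresholds match the 60, 90, 30, and 180 marks stated in the text.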

Statistical Methods
Data were coded and transferred from Excel to the Statistical Package for the Social Sciences (SPSS) version 24 (Chicago, IL, United States). The data included the number of students who passed or failed each of the three types of exams according to gender. Data were expressed as frequencies (percentages). Chi-square analysis was performed to test for differences between two or more groups. The odds ratio (OR) was used to calculate the odds of passing the subject based on scores in the different types of exams. Pearson's correlation (R) was used to evaluate the consistency of students' performances across the different exams. A p-value of less than 0.05 was considered statistically significant.
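For reference, the three statistics named above have the following standard textbook forms. The notation is an assumption introduced here, not taken from the study: a, b, c, and d are the cell counts of a generic 2 x 2 table, O_i and E_i are observed and expected cell counts, and x_i and y_i are a student's scores in two exams. The source does not state how the tables entered into SPSS were constructed.

```latex
\[
\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc},
\qquad
\chi^{2} = \sum_{i} \frac{(O_i - E_i)^{2}}{E_i},
\qquad
R = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}
\]
```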

General Characteristics
The total number of students was 679, of whom 499 (73.5%) were female and 180 (26.5%) were male. A total of 422 students passed the subject (62.2% of all students): 314 females (62.9% of all females) and 108 males (60% of all males), with no significant difference between male and female students (►Table 1).
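The gender comparison can be approximately rechecked from the counts reported above. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction on the 2 x 2 table of gender by pass/fail; the function name `chi_square_2x2` is illustrative, and the exact options the authors used in SPSS are not stated in the source.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Reported counts: 314 of 499 females and 108 of 180 males passed the subject.
chi2, p_value = chi_square_2x2(314, 499 - 314, 108, 180 - 108)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value is well above 0.05, in line with the reported absence of a significant gender difference.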

Comparison of Students' Performance in the Written, Clinical, and Viva Examinations
Overall, students performed better in the clinical examination (73.9%) and the viva examination (67.5%) than in the written examination (43%). This difference was statistically significant (p < 0.001) and applied to both male and female students. However, there were no significant differences in performance between male and female students in any of the three types of exams.

The Odds Ratio of the Relationship between the Total Score and Scores in Different Exams
►Table 2 shows the OR between the total score of students who passed the subject and their scores in the written examination, indicating a statistically significant positive association (p < 0.001). This means that the odds of passing the final exam were about two times higher than the odds of passing the written exam, with a high degree of significance for both males and females. In contrast, the OR for the viva examination and the total score was 0.79, with no significant differences between males and females. We also found a statistically significant negative association between total scores and clinical exam scores of students who passed the subject (OR = 0.58).
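One possible reading of these ORs, offered only as an assumption because the source does not describe the calculation, is that each OR compares the odds of passing the subject overall with the odds of passing a given exam. The sketch below applies that reading to the pass counts reported in the abstract and text (422 overall, 291 written, 502 clinical, 458 viva, out of 679 students); the helper `odds` and the dictionary names are illustrative.

```python
def odds(passed, total):
    """Odds of passing: number who passed divided by number who did not."""
    return passed / (total - passed)


N_STUDENTS = 679
# Pass counts as reported in the abstract and text.
PASSED = {"total": 422, "written": 291, "clinical": 502, "viva": 458}

total_odds = odds(PASSED["total"], N_STUDENTS)
# Assumed definition: odds of passing overall / odds of passing each exam.
odds_ratios = {
    exam: total_odds / odds(PASSED[exam], N_STUDENTS)
    for exam in ("written", "viva", "clinical")
}
for exam, ratio in odds_ratios.items():
    print(f"{exam}: OR = {ratio:.2f}")
```

Under this reading the viva and clinical ratios round to the reported 0.79 and 0.58, and the written ratio comes to about 2.2, close to the reported 2.3. The agreement is suggestive only and does not establish that this is how the values in ►Table 2 were obtained.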

Correlations between Students' Performances in the Different Types of Exams
The relationship between the different types of exams used to evaluate students' performance was calculated by using Pearson's correlation (►Table 3). It shows that there is a highly significant correlation between the different types of examinations, with the lowest between clinical and written exams (R = 0.629) and the highest between viva and clinical exams (R = 0.763).
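As a rough illustration of what these coefficients measure, the sketch below implements Pearson's correlation from its textbook definition and then squares the reported coefficients to give the proportion of variance two exams share. The function `pearson_r` and the synthetic check values are assumptions for illustration only; individual student scores are not available in the source.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, perfectly linear values (not student data) as a sanity check.
check = pearson_r([1, 2, 3], [2, 4, 6])

# Shared variance (R squared) implied by the coefficients reported in the text.
REPORTED_R = {"written-viva": 0.638, "written-clinical": 0.629, "clinical-viva": 0.763}
shared_variance = {pair: r ** 2 for pair, r in REPORTED_R.items()}
for pair, r2 in shared_variance.items():
    print(f"{pair}: R^2 = {r2:.3f}")
```

Squaring the reported values suggests that roughly 40% to 58% of the variance in one exam's scores is shared with another's, so the exams overlap substantially without being interchangeable.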

Discussion
[...] 12,13 Different assessment tools evaluate different domains of learning. 14 In this study, we report the results of students in internal medicine across three different types of exams, each evaluating different domains of learning (refer to the Introduction). Overall, 62.2% of students passed the subject, with nearly equal percentages for females (62.9%) and males (60%; ►Table 1), indicating that gender had no effect on students' overall performance in this study. Similar results have been reported, with no significant differences between males and females in performance on preclerkship OSCEs or Essentials of Clinical Medicine semester final exams. 15 However, our results differ from those of other studies that found gender differences in students' performance; one study showed better male performance, 16 while two others showed better female performance. 17,18
When we looked at the performance of students in the different types of exams (►Table 4), we found that both male and female students performed better in clinical and viva examinations than in written examinations, with no significant differences in performance between male and female students in any of the three types of exams. Similar results were reported in another study, in which clinical examination scores were significantly higher than written examination scores. 19
There could be several reasons why students performed differently in the three types of exams. Clinical and viva examinations are typically more interactive and require students to apply their knowledge in practical situations, which may better reflect their understanding of the material. Written exams, on the other hand, may focus more on testing memorization and recall of information, which may not necessarily reflect a student's ability to apply that knowledge. The format of the exams and the types of questions asked may also contribute to differences in performance. Another explanation is that the clinical assessment tools used in our study might not be completely objective, and examiner factors could play a role. [...],21
The OR statistics were used to calculate the odds of passing the subject based on scores in the different types of exams, and to identify which type of examination best predicts passing the subject of internal medicine. ►Table 2 shows a statistically significant positive association between the total marks of students and their marks in the written exams. Specifically, the odds of passing the final exam were about two times higher than the odds of passing the written exam, and this difference was highly significant (p < 0.001). This suggests that students who perform well on the written exam are more likely to pass the subject, and this relationship holds true for both male and female students. In other words, doing well on the written exam is a good predictor of success in the subject. On the other hand, the OR between the viva examination score and the total score was 0.79, which suggests a weaker relationship than that between the written exam and the total score. This means that performing well on the viva examination is not as good a predictor of success as performing well on the written exam. In contrast, the study found a statistically significant negative association between the total score and clinical exam scores of students who passed the subject (OR = 0.58). This finding contrasts with the positive association found between the written exam and the total score. It suggests that clinical exams may be a weaker predictor of success on the total score than the written exam.
Interestingly, our study found no significant differences in performance between male and female students in any of the three types of examinations. This means that gender did not have a significant effect on student performance, contrary to the "gender gap" reported in some literature. 15
To evaluate the consistency of students' performance, we used Pearson's correlation to calculate the relationship between their scores in the various types of exams (►Table 3). The results indicate a strong correlation between the different exams, and therefore consistency in student performance across exams. This means that a student who performs well in one exam is likely to perform well in the others, and a student who performs poorly in one exam is likely to perform poorly in the others.

Conclusion
This study found that students performed better in the clinical and viva examinations than in the written examination. There was no difference in performance between male and female students in any of the three types of exams. The written exam was the strongest predictor of student success in the subject. Students' performance was consistent across the three types of exams and was not affected by gender.

Limitations of This Study
The study only included students from one subject and one batch, which may limit the generalizability of the findings to other contexts and domains. Moreover, the study did not control for other factors that may affect student performance, such as motivation, prior knowledge, learning styles, or instructor quality. Additionally, the study did not use specific outcomes or competencies to evaluate student performance, but rather a general score that may not capture the nuances of student learning and achievement.

Table 1
Frequency and passing rates by gender in internal medicine

Table 2
Odds ratio of total score compared with scores of different types of examinations with gender-based analysis
Abbreviations: CI, confidence intervals; OR, odds ratio.

Table 4
Comparison of student performance in written, clinical, and viva examinations with gender-based analysis
a Percentage calculated from the total number of students in the respective gender.

Table 3
Correlation between different types of examinations used to evaluate students' performance in internal medicine