Global Rating Scores & Retrospective Standard Setting

What are Global Rating Scores? 

In clinical skills and other observational assessments, Global Rating Scores (GRS) are used alongside item score sheets. A GRS reflects the professional opinion of the examiner once they have completed the item score list, which records what they observed or marked during the scenario. In most cases a 5- or 6-point Likert scale is used, ranging from 0 = Fail, 1 = Borderline and 2 = Pass through 3 = Good to 4 = Excellent. Sometimes, when examiners cannot choose between Borderline and Fail, or between Borderline and Pass, an extra option is added to the GRS. Examiners then choose between 0 = Fail, 1 = Borderline fail, 2 = Borderline pass and 3 = Pass, with 4 and 5 being Good and Excellent, respectively.
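For readers who like to see this as data, the extended 6-option scale could be encoded as follows. This is a minimal Python sketch: the labels and numeric values come from the description above, while the enum itself is purely illustrative, not part of any system described here.

```python
from enum import IntEnum

class GlobalRating(IntEnum):
    """The 6-option GRS with split borderline grades, as described above.
    The class name and encoding are illustrative assumptions."""
    FAIL = 0
    BORDERLINE_FAIL = 1
    BORDERLINE_PASS = 2
    PASS = 3
    GOOD = 4
    EXCELLENT = 5

# An examiner's overall judgement, recorded alongside the checklist score
rating = GlobalRating.BORDERLINE_PASS
print(rating.name, int(rating))  # BORDERLINE_PASS 2
```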

How is Data Analysis used in Educational Decision Making?

“Current data-based methods of analysis and predictive models are insufficient – big data is able to remedy this.”

– Dr. Alexander Lenk 

Pass/fail decisions by faculty are mostly based on criteria developed before the exam (e.g. Angoff) or after it (e.g. Borderline Regression Analysis). Standard setting is a critical part of educational, licensing and certification testing, but outside the cadre of practitioners this aspect of test development is not well understood. Standard setting is the methodology used to define levels of achievement or proficiency and the cut-scores corresponding to those levels. A cut-score is simply the score that classifies students scoring below it into one level and students scoring at or above it into the next, higher level.
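To make the retrospective approach concrete, here is a minimal sketch of the Borderline Regression Method with toy numbers (not data from any of the studies discussed here): checklist scores at a station are regressed on the examiners' global ratings, and the cut-score is the checklist score the regression predicts at the borderline grade. The function name, the choice of 1.5 as the borderline point and the toy data are all assumptions for illustration.

```python
import numpy as np

def borderline_regression_cutscore(checklist_scores, global_ratings,
                                   borderline_point=1.5):
    """Sketch of the Borderline Regression Method: fit a straight line
    of checklist score against global rating, then read off the score
    predicted at the borderline grade. With the 6-option GRS above,
    1.5 sits between Borderline fail (1) and Borderline pass (2);
    the right point depends on the scale actually in use."""
    slope, intercept = np.polyfit(global_ratings, checklist_scores, deg=1)
    return slope * borderline_point + intercept

# Toy data: one station, eight students
checklist = np.array([45, 52, 58, 61, 66, 72, 80, 88])  # % scores
ratings   = np.array([ 0,  1,  1,  2,  3,  3,  4,  5])  # GRS values

cut = borderline_regression_cutscore(checklist, ratings)
passed = checklist >= cut  # at or above the cut-score -> higher level
```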

Back to the future 2: An online OSCE Management Information System for nursing OSCEs

A few words about this paper…

It was not long after we implemented our OSCE Management Information System within the School of Medicine at the National University of Ireland, Galway, that our colleagues from the School of Nursing started to use the system. Moreover, as nurses embed solid practice in their research, the purpose of this practice-based investigation was to run a first user acceptance test. We were very pleased with their initiative, as all nurse examiners (n=18) appeared to be very satisfied with the electronic system and its embedded functionalities. With a University/National cut-score in nursing of (only) 40%, it was surprising to see that when the difficulty of the stations was incorporated using Borderline Regression Analysis, the cut-score rose towards 60%. Nurses are very well trained in their skill sets. Although not formally researched to date, approximately thirty hours of administration time was saved. In contrast to the paper-based approach used before 2014, results and feedback could be released immediately after the exam finished, according to Pauline, Eimear and Evelyn.

Controversies: Standard Error of Measurement and Borderline Regression Method in an OSCE Management System

The Standard Error of Measurement (SEM) indicates the amount of error around an observed score. The observed score, the score we retrieve, store and analyse from an OSCE, is in fact the result of the true score plus error around that true score. If we want a reliable decision about passing or failing a station, or an OSCE as a whole, we need to incorporate the SEM into that decision.

The observed score is the true ability (true score) of the student plus random error around that true score. The error is associated with the reliability, or internal consistency, of the score sheets used in OSCEs. Within our system, Qpercom calculates Cronbach's alpha as a reliability score indicating how consistently scores are measured, and the Intraclass Correlation Coefficient, which indicates how reliable scores are between the different stations (Silva et al., 2017). These classical psychometric measures of the data can be used to calculate the SEM. An observed score +/- the SEM means that with 68% certainty the 'true score' of that station lies somewhere between the observed score minus the SEM and the observed score plus the SEM. In principle, one should consider the 95% Confidence Interval, which is the observed score plus or minus 1.96 * SEM (Zimmerman & Williams, 1966).
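A minimal sketch of these calculations, assuming the standard classical test theory formulas (alpha from item variances, SEM = SD * sqrt(1 - reliability)); the toy score matrix and function names are illustrative, not Qpercom's implementation:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_students x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(totals))."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def sem(total_scores, reliability):
    """Standard Error of Measurement: SD of totals * sqrt(1 - reliability)."""
    return np.std(total_scores, ddof=1) * np.sqrt(1 - reliability)

# Toy station: 5 students x 4 checklist items
scores = np.array([[2, 3, 2, 3],
                   [1, 2, 2, 2],
                   [3, 3, 3, 3],
                   [0, 1, 1, 2],
                   [2, 2, 3, 3]])
totals = scores.sum(axis=1)
alpha = cronbach_alpha(scores)
e = sem(totals, alpha)

observed = totals[0]
band_68 = (observed - e, observed + e)                 # 68% certainty
band_95 = (observed - 1.96 * e, observed + 1.96 * e)   # 95% CI
```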

An Online Management Information System for Objective Structured Clinical Examinations

A few words about this paper…

During 2006 – 2008, David Cunningham, as an intern, and myself, as a lecturer, were engaged with teaching and learning in the School of Medicine at the National University of Ireland, Galway (Medical Informatics & Medical Education in those days). Our OSCE procedures, from planning to execution of the examination, were typically laborious, as they are for this kind of exam. Planning was one thing, but what about results? We encountered issues with forms and results: the study recorded one typical OSCE exam with 30% errors in the paper results, alongside a high cost of administration. With Cusimano's €4.70 staff cost per student, per station and our estimate of €2.80 administration costs per submitted paper form, the total cost of an OSCE could be estimated at €7.50 per student, per station.
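The cost arithmetic is simple enough to reproduce. The per-unit figures come from the paragraph above; the cohort size and station count below are hypothetical, purely to show how the estimate scales:

```python
# Per-student, per-station cost estimate from the paper
staff_cost = 4.70   # Cusimano's staff cost per student, per station (EUR)
admin_cost = 2.80   # our estimated administration cost per paper form (EUR)
cost_per_student_station = staff_cost + admin_cost  # EUR 7.50

# Hypothetical cohort, for illustration only
students, stations = 150, 10
total_cost = cost_per_student_station * students * stations  # EUR 11,250.0
```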

Back to the Future 1: Electronic Marking of Objective Structured Clinical Examinations and Admission Interviews Using an Online Management Information System in Schools of Health Sciences

A few words on this paper…

‘Back to the Future’ refers to the 1985 American science fiction film directed by Robert Zemeckis, featuring Michael J. Fox as teenager Marty McFly. Marty, a 17-year-old high school student, is accidentally sent thirty years into the past in a time-travelling DeLorean invented by his close friend, the maverick scientist Doc Brown.

We looked back 44 years, to when Professor R.M. Harden invented the paper-based OSCE in 1974. The future would have to be about the actual results. Facing 30% errors in our paper assessment results, we had a problem to solve. With incomplete forms and failures in adding up results, we decided to automate the OSCE procedure. Planning, form submission and data analysis are all done electronically, and this paper provides insights into the automated features.

Reliability and validity of OSCE checklists used to assess the communication skills of undergraduate medical students: A systematic review

A few words about this paper…

In 2011, Winny from Indonesia approached me to ask whether he could join us for a PhD track. It would be an opportunity to investigate the wide range of communication stations used within our School of Medicine, with data collected using our OSCE Management Information System. A systematic review was carried out to find out where the flaws in practice were, and it succeeded in doing so. If a clinical skills trainer states that he or she is responsible for a communication skills station, I ask: which of the 18 domains of communication skills are you going to assess? Silence usually follows, and a low Cronbach's alpha (the internal consistency of the assessment form) at a later stage is very likely. Winny's paper has (to date, November 2018) been cited 17 times by other researchers.

Calibration of Communication Skills Items in OSCE Checklists according to the MAAS-Global

A few words about this paper…

After the discovery that about 17 different styles of communication skills are used in the field of communication skills training in medical education, it was apparent that we needed to validate the communication skills items included in OSCE checklists. Within our own School of Medicine, in the College of Medicine, Nursing and Health Sciences of the National University of Ireland, Galway, about 280 OSCE station assessment forms, spanning 4 years of the curriculum and 4 different medical specialties, contained a variety of communication skills items. None of these had ever been validated against existing reliable and valid communication skills questionnaires.

The fairness, predictive validity and acceptability of multiple mini interview in an internationally diverse student population- a mixed methods study

A few words about this paper…

This is an important paper for the Irish, and moreover the international, context, written by my colleague in the School of Medicine at the National University of Ireland, Galway, Dr Maureen Kelly. Once multiple mini interviews were made available electronically, data retrieval, storage and analysis proved more accessible than collecting all data from paper score sheets. International medical students, those attending medical school outside their country of citizenship, account for a growing proportion of medical undergraduates worldwide. This study aimed to establish the fairness, predictive validity and acceptability of the Multiple Mini Interview (MMI) in an internationally diverse student population. The MMI appears to be a welcome addition to the assessment armamentarium for selection, particularly with regard to stakeholder acceptability. Understanding the mediating and moderating influences on differences in the performance of international candidates is essential to ensure that the MMI complies with the metrics of good assessment practice and the principles of both distributive and procedural justice for all applicants, irrespective of nationality and cultural background.

True communication skills assessment in interdepartmental OSCE stations: Standard setting using the MAAS-Global and EduG

A few words about this paper…

In medical education it is extremely helpful to compare outcomes, yet comparing communication skills outcomes between students, years of study or institutions is very challenging. If the measurement of particular learning outcomes is not standardised, just as a standardised measuring tape standardises the measurement of length, you cannot trust the outcome. In this study we attempted to compare communication skills outcomes between groups of students.

Since communication skills assessment forms are not standardised at our School of Medicine within the College of Medicine, Nursing and Health Sciences of the National University of Ireland, Galway, we developed the MAAS-Global proportion (MG-P) as a result of one of our previous studies. If we know how large the MG-P of an assessment form is, we may be able to compare different students, groups of students or years of the curriculum. We therefore introduced the MAAS-Global score, followed by the MAAS-Global proportion and section percentage.
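As a rough illustration only: assuming the MG-P can be read as the share of a form's available marks carried by MAAS-Global-calibrated communication items (an assumed definition for this sketch, not the paper's formal one), the calculation might look like this in Python. The function name and the numbers are hypothetical.

```python
def maas_global_proportion(comm_item_max_marks, total_max_marks):
    """Illustrative MG-P: the fraction of a station form's total
    available marks allocated to MAAS-Global-calibrated
    communication skills items (assumed definition)."""
    return comm_item_max_marks / total_max_marks

# Hypothetical station form: 12 of 40 available marks sit on
# MAAS-Global communication items
mgp = maas_global_proportion(12, 40)  # 0.30 -> 30% of the form
```

Forms with comparable MG-P values would then, on this reading, put a comparable weight on communication skills, which is what makes comparisons across students, groups or curriculum years plausible.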