Keywords
ophthalmology residency match - success rate - outcomes analysis
Candidates for ophthalmology residency apply through the ophthalmology residency matching
program, administered by the San Francisco Residency and Fellowship Matching Services
(SF Match). A universal application is submitted to SF Match, which in turn distributes
it to the programs specified by the applicant. After completion of the interview process,
the programs and the applicants each submit rank lists to SF Match, which uses an
algorithm based on preferences submitted to place candidates in a training program.
Ophthalmology continues to be perceived as one of the most competitive medical specialties.
This is demonstrated by the steady increase in the mean number of applications submitted
per applicant over the past 10 years, from 48 in 2008, to 68 in 2017 based on SF Match
tracking data. Despite this trend, the overall match rate has remained relatively
stable (mean 74%, range 70–78%, for the period 2008–2017). This continued rise in
the number of applications submitted per applicant places a significant financial
burden on students as well as a tremendous administrative load on residency programs.
The cost structure consists of a $100 registration fee and a $60 base fee for application
to 10 programs, with incremental cost increases of $100 to $350 for each additional
10 programs. Thus, the total fee for applying to 40 programs is $610, which increases
to $2,010 for 80 programs and $2,710 for 100 programs.
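The totals quoted above imply an average increment per additional block of 10 programs in each range. The following is a short arithmetic check in Python that uses only the figures stated in the text. The fee boundaries inside each range are not given in the source and are not assumed here; the function name is illustrative.

```python
# Fee totals quoted in the text, keyed by number of programs applied to.
# The 10-program total is the $100 registration fee plus the $60 base fee.
totals = {10: 100 + 60, 40: 610, 80: 2010, 100: 2710}

def average_increment_per_10(lo, hi):
    """Average added cost per extra block of 10 programs between two quoted totals."""
    blocks = (hi - lo) // 10
    return (totals[hi] - totals[lo]) / blocks

increments = {
    (10, 40): average_increment_per_10(10, 40),
    (40, 80): average_increment_per_10(40, 80),
    (80, 100): average_increment_per_10(80, 100),
}
for (lo, hi), inc in increments.items():
    print(f"{lo}-{hi} programs: ${inc:.0f} per additional 10 programs")
```

The implied averages fall within the stated $100 to $350 range, with the marginal cost rising once an applicant goes beyond 40 programs.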
Due to the perceived difficulty of the ophthalmology match, applicants may feel the
need to apply to more programs to gain more interview invitations and increase the
likelihood of matching successfully. However, this assumption may be based on incomplete
or inaccurate information from peers or advisors, given that the match is a complex
process based on both quantitative and qualitative characteristics.
Although the National Residency Matching Program (NRMP) provides an annual report
detailing match outcomes for other specialties, this information is not available
for ophthalmology. For example, NRMP data for otolaryngology (found at https://students-residents.aamc.org/applying-residency/article/apply-smart-data-consider/) demonstrate that candidates with United States Medical Licensing Examination (USMLE)
step 1 scores > 250 have a mean match rate of 88% with diminishing chances of matching
for students who applied to more than 39 programs; for USMLE scores of 237–249
and < 237, the corresponding match rates are 80% and 66%, and the corresponding thresholds are 43 and 44 programs, respectively. Providing similar
data to candidates for ophthalmology residency programs will assist candidates in
effectively allocating effort and investment in the application process, while at
the same time maximizing the successful match rate among qualified applicants.
The goal of this study was to develop a model to predict the probability of matching
to an ophthalmology residency program based on applicant characteristics and number
of application submissions by analyzing SF Match data.
Methods
Data Source: Deidentified application and ophthalmology residency matching data for the 2013,
2014 and 2015 match cycles collected by the Association of University Professors of
Ophthalmology were used in the study. Independent variables include the number of
program applications, the number of invitations for interview, USMLE step 1 score,
international medical graduate (IMG) status, Alpha Omega Alpha membership, presence
of research activity as stated by the applicant, number of times that the applicant
applied for an ophthalmology residency, number of programs ranked by the applicant,
the total number of programs that ranked the applicant, the matched position on the
program's rank list, and the applicant's rank of the matched program. Data on individual
interview invitations offered and interviews completed are provided by programs and
applicants, respectively, through a self-reporting process, which may result in inadvertent
errors; this limitation is discussed in more detail later.
Statistical Analysis Methods: Descriptive statistics were estimated to summarize the characteristics of the applicants
across the three application cycles. Summaries were presented after stratifying by
program matching success, first versus repeat attempt, and IMG status. An independent
sample t-test was used to compare the means of continuous variables, and a chi-square test
was used to compare categorical variables between independent groups. The Wilcoxon
rank sum test and Fisher's exact test were used in the case of skewed distributions
for continuous variables or small expected frequencies for categorical variables,
respectively. Linear and segmented regression modeling[1] was used to quantify the association between number of applications submitted and
the number of interview invitations. SAS 9.4 was used to perform the descriptive summary
and regression modeling (SAS Institute, Cary, NC).
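As an illustration of the categorical comparison described above, the following Python sketch computes a chi-square test for a 2 × 2 table using the IMG counts reported in Table 2 (49 of 1,391 matched applicants and 198 of 568 unmatched applicants). It is not the SAS procedure used in the study; the use of the Pearson statistic without continuity correction is an assumption, and the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: matched, did not match. Columns: IMG, non-IMG (counts from Table 2).
stat, p = chi_square_2x2(49, 1391 - 49, 198, 568 - 198)
print(f"chi-square = {stat:.1f}, p = {p:.3g}")
```

Under these assumptions the p-value is far below 0.0001, in line with the value reported for IMG status.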
To provide an intuitive model to predict the probability of matching, a recursive
partitioning method was used in the multivariable analysis.[2] The recursive partitioning analysis used USMLE score and number of program applications
as the two predictors of matching status. The analysis was first stratified by IMG
status and then by first-attempt status as a separate stratification factor. Models
were estimated separately within each resulting subgroup. The minimum terminal node
size was set to 20. A fivefold cross-validation method was used to identify a best-fit
model. A fivefold cross-validation R² value was presented, which is the proportion of the variability in the response
that was explained by the model. The resulting models were presented as decision trees.
Subgroups with similar probabilities of matching in the decision trees were combined.
JMP software was used to fit the decision trees (JMP version 11.2.0, 2013 SAS Institute).[3]
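To make the partitioning step concrete, the following Python sketch performs a single split search on one predictor subject to a minimum node size. It is a simplified illustration only: the Gini impurity criterion is an assumption (the splitting criterion used by JMP is not stated here), the toy data are hypothetical and are not study data, and cross-validation is not shown. A full tree would repeat this search within each child node and across both predictors.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels, min_node_size):
    """Find the threshold t that minimizes the weighted Gini impurity of the
    split (value < t) versus (value >= t), subject to a minimum node size.

    Returns (threshold, impurity), or (None, parent impurity) if no split
    satisfies the node-size constraint or improves on the parent node.
    """
    n = len(values)
    best_t, best_imp = None, gini(labels)
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        if len(left) < min_node_size or len(right) < min_node_size:
            continue
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp

# Hypothetical toy data (NOT study data): exam score and outcome (1 = matched).
scores = [210, 215, 220, 225, 230, 235, 240, 245, 250, 255]
matched = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
# The study used a minimum terminal node size of 20; the toy example uses 2.
threshold, impurity = best_split(scores, matched, min_node_size=2)
print(threshold, round(impurity, 3))
```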
Results
Across the 2013, 2014, and 2015 application cycles, 1,959 unique individuals were
identified. [Table 1] includes a summary of characteristics for all candidates. On average, applicants
submitted 64 applications (standard deviation [SD] = 27, 95% confidence interval [CI]:
62.9–65.3), received 9 invitations for interview (SD = 6, 95% CI: 9.0–9.6), and had
a score of 238 in USMLE step 1 (SD = 16, 95% CI: 237.6–239.0). Among applicants, 13%
were IMGs (95% CI: 11–14%), 88% were first-time submissions (95% CI: 87–90%), and
71% matched successfully (95% CI: 69–73%). For applicants with more than one
application attempt (n = 226, 12%), the most recent attempt was retained in the study. The most recent attempt
was selected to reflect the overall status of each applicant during the study time
period (2013–2015) and was necessary to ensure independent applicants, avoiding correlated
measures, in the analysis set. The approach results in a slightly increased probability
of matching (71%), compared with analysis of first attempts only (68%).
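The confidence intervals quoted above can be approximately reproduced from the summary statistics. The Python sketch below assumes normal-approximation (Wald) intervals; the method actually used to compute the intervals is not stated in the text, and the function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def mean_ci(mean, sd, n, z=Z_95):
    """Normal-approximation confidence interval for a mean from summary statistics."""
    half = z * sd / math.sqrt(n)
    return mean - half, mean + half

def proportion_ci(successes, n, z=Z_95):
    """Wald (normal-approximation) confidence interval for a proportion."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

n = 1959
apps_lo, apps_hi = mean_ci(64.06, 26.91, n)   # applications per applicant (Table 1)
match_lo, match_hi = proportion_ci(1391, n)   # successful matches (Table 1)
print(f"applications: {apps_lo:.1f} to {apps_hi:.1f}")
print(f"match rate: {100 * match_lo:.0f}% to {100 * match_hi:.0f}%")
```

Under this assumption the results agree with the reported intervals for the mean number of applications (62.9–65.3) and the match rate (69–73%).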
Table 1
Descriptive summary for all applicants (n = 1,959)
| Variable | Mean | SD | Min | Max | Median | Q1 | Q3 | IQR |
| No. of program applications | 64.06 | 26.91 | 1 | 113 | 63 | 45 | 83 | 38 |
| No. of invites for interview | 9.29 | 6.3 | 0 | 29 | 9 | 4 | 14 | 10 |
| USMLE step 1 score[a] | 238.3 | 16.42 | 182 | 275 | 241 | 229 | 250 | 21 |
| No. of attempts | 1.16 | 0.49 | 1 | 5 | 1 | 1 | 1 | 0 |
| No. of institutions ranked by the applicant[b] | 10.45 | 5.04 | 1 | 113 | 11 | 8 | 13 | 5 |
| Total no. of programs that ranked the applicant[b] | 9.85 | 4.19 | 1 | 14 | 10 | 1 | 4 | 3 |
| Matched position on the program's rank list[b] | 12.49 | 9.52 | 1 | 57 | 11 | 5 | 18 | 13 |
| Applicant's rank of the matched program[b] | 2.92 | 2.41 | 1 | 14 | 2 | 1 | 1 | 0 |

| Variable | N | % |
| IMG | 247 | 13 |
| Successful match | 1391 | 71 |
| AOA member[a] | 422 | 44 |
| Published research | 1948 | 99 |
| Invited for ≥ 1 interview | 1802 | 92 |
| First attempt | 1733 | 88 |
Abbreviations: AOA, Alpha Omega Alpha Honor Medical Society; IMG, international medical
graduate; IQR, interquartile range (75th percentile–25th percentile); Max, maximum;
Min, minimum; Q1, 25th percentile; Q3, 75th percentile; SD, standard deviation; USMLE,
United States Medical Licensing Examination.
a Number of missing observations for USMLE = 46, AOA status = 992.
b For these four variables, summaries are restricted to applicants who matched (n = 1,391).
[Table 2] includes a summary of characteristics according to matching status (matched or did
not match). Data analyzed included number of interviews offered, number of programs
ranked by the candidate, number of programs that ranked the applicant, and rank position
of program by candidate and candidate by program. Applicants who successfully matched
submitted 66 applications (SD 24) on average, whereas
those who failed to match submitted 60 applications (SD 33) (p = 0.0004). Matched applicants also received more interview invitations (median 12, IQR 7) than unmatched applicants (median 2, IQR 4; p < 0.0001). In addition, matched applicants performed better on USMLE step 1, with
a mean score of 243 (SD 13), compared with a mean score of 226 (SD 18) for unmatched
applicants (p < 0.0001). Furthermore, 95% of matched applicants were applying on their first attempt,
whereas only 72% of unmatched applicants were on their first attempt (p < 0.0001). IMGs made up a significantly smaller share of the matched group than of the unmatched group (4% vs
35%, p < 0.0001).
Table 2
Descriptive summary of characteristics according to matching status
| Variable | Matched (n = 1,391), mean | Matched, SD | Did not match (n = 568), mean | Did not match, SD | p-Value[a] |
| No. of program applications | 65.63 | 23.64 | 60.21 | 33.31 | 0.0004 |
| No. of invites for interview[b] | 12 | 7 | 2 | 4 | <0.0001[b] |
| USMLE step 1 score[c] | 242.99 | 13.28 | 226.23 | 17.54 | <0.0001 |
| No. of attempts | 1.06 | 0.28 | 1.4 | 0.74 | NP[d] |

| Variable | Matched, count | Matched, % | Did not match, count | Did not match, % | p-Value[e] |
| IMG | 49 | 4 | 198 | 35 | <0.0001 |
| AOA member[c] | 391 | 55 | 31 | 12 | <0.0001 |
| Published research | 1390 | 99 | 558 | 98 | <0.0001 |
| Invited for ≥ 1 interview | 1391 | 100 | 411 | 72 | NP[f] |
| First attempt | 1325 | 95 | 408 | 72 | <0.0001 |
Abbreviations: AOA, Alpha Omega Alpha Honor Medical Society; IMG, international medical
graduate; IQR, interquartile range (75th percentile–25th percentile); Max, maximum;
Min, minimum; NP, not performed; Q1, 25th percentile; Q3, 75th percentile; SD, standard
deviation; USMLE, United States Medical Licensing Examination.
a p-Values are based on t-test.
b Descriptive summary based on median and interquartile range. p-Value based on Wilcoxon rank sum test.
c Number of missing observations for USMLE (matched) = 13, AOA status (matched) = 679,
USMLE (not matched) = 33, AOA status (not matched) = 313.
d Given the skewed nature of the variable indicating the number of application attempts,
hypothesis testing is based on the dichotomous variables indicating the first application
attempt.
e p-Value for published research is based on Fisher's exact test; other p-values are based on chi-square test.
f Hypothesis testing was not performed given that, by definition, those who match to
a program had at least one interview.
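The comparison of the number of program applications in Table 2 can be recomputed from the reported summary statistics. The Python sketch below assumes the unequal-variance (Welch) form of the t-test and uses the standard normal distribution in place of the t distribution, which is a close approximation at these sample sizes; neither choice is stated in the table footnotes, and the function name is illustrative.

```python
import math

def welch_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch t statistic and approximate two-sided p-value from summary statistics.

    The p-value uses the standard normal distribution in place of the t
    distribution, which is a close approximation for samples this large.
    """
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (m1 - m2) / se
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal tail probability
    return t, p

# Number of program applications: matched vs did not match (Table 2).
t, p = welch_t_from_summary(65.63, 23.64, 1391, 60.21, 33.31, 568)
print(f"t = {t:.2f}, p = {p:.4f}")
```

Under these assumptions the p-value is close to the reported value of 0.0004.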
[Figs. 1] and [2] illustrate the flow of applicants through the application, interview, and matching
process after stratifying by IMG status and the number of attempts. The overall matching
rate was higher among non-IMGs compared with IMGs (78% vs 20%, p < 0.0001) and among those on a first attempt compared with a repeat attempt (76%
vs 29%, p < 0.0001).
Fig. 1 Flow of applicants (stratified by international medical graduate [IMG] status) through
the application, interview, and matching process. Note: percentage was calculated
based on the number of applicants in the previous level.
Fig. 2 Flow of applicants (stratified by first/repeat attempts) through the application,
interview, and matching process. Note: percentage was calculated based on the number
of applicants in the previous level.
The association between the number of applications submitted and the number of interviews
offered at programs ranked is summarized in [Fig. 3] after stratifying by IMG status. The red line corresponds to the best-fit simple
linear association. The blue line allows for a nonlinear association and indicates
the change point for the slope of the best-fit line. Based on the estimated segmented
regression line, the number of invitations for interviews generally increases with the number of applications
among IMGs. In contrast, for non-IMGs, the change point of the association between
number of applications and number of interviews is 39. This indicates that the association
was positive for individuals who submitted up to 39 applications, after which point,
the association became negative (p < 0.0001). It is important to note that among the 1,712 non-IMGs, 1,484 (87%) submitted
more than 39 applications. When all applicants are considered, regardless of IMG status,
the change point of the association between number of applications and number of interviews
is 48, indicating that when the number of applications is less than 48, the association
is positive; when the number of applications is greater than 48, the association is
negative. Among the 1,959 applicants, 1,398 (71%) submitted more than 48 applications.
Fig. 3 Plot of the number of interviews (y-axis) versus the number of applications (x-axis) for (A) international medical graduates (IMGs) (n = 247) and (B) non-IMGs (n = 1,712).
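The idea of a change point in a segmented regression can be sketched as follows. This Python example fits a continuous two-segment line by grid search over candidate change points, solving an ordinary least squares problem at each candidate. It is an illustration only and not the SAS procedure used in the study: the grid-search approach is an assumption, the data are hypothetical and noise-free (not study data), and all function names are illustrative.

```python
def solve_3x3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def fit_segmented(xs, ys, candidates):
    """Grid-search a single change point c for y = b0 + b1*x + b2*max(0, x - c).

    Returns (c, [b0, b1, b2], sse) for the candidate with the smallest
    residual sum of squares.
    """
    best = None
    for c in candidates:
        rows = [[1.0, x, max(0.0, x - c)] for x in xs]
        # Normal equations: (X'X) beta = X'y
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
        beta = solve_3x3(xtx, xty)
        sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
                  for r, y in zip(rows, ys))
        if best is None or sse < best[2]:
            best = (c, beta, sse)
    return best

# Hypothetical noise-free data (NOT study data) with a slope change at x = 39.
xs = list(range(5, 101, 2))
ys = [0.3 * x if x <= 39 else 0.3 * 39 - 0.05 * (x - 39) for x in xs]
c, beta, sse = fit_segmented(xs, ys, candidates=range(20, 81))
slope_before, slope_after = beta[1], beta[1] + beta[2]
print(c, round(slope_before, 3), round(slope_after, 3))
```

The slope after the change point is the sum of the two slope coefficients, so a positive slope before the change point combined with a negative sum afterward corresponds to the pattern described for non-IMG applicants.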
[Fig. 4] demonstrates that the probability of matching peaks at 81% in the group
that submitted 41 to 60 applications and decreases thereafter. Additionally, 538 (39%)
applicants matched with their top-ranked program, while 268 (19%) applicants matched
at their second-ranked program. A majority (87%) of the applicants who matched did
so within their top five choices.
Fig. 4 Summary of matching characteristics. (A) Probability of matching by categories of number of applications (n = 1,959) and (B) cumulative percentage of applicants' rank of the matching program for applicants
who matched (n = 1,391).
After univariate data analysis, a multivariable, recursive partitioning algorithm
was created to identify three different groups based on their predicted probability
of matching, resulting in those with a low probability of matching (probability < 0.4),
a moderate probability of matching (probability between 0.4 and 0.8), and a high probability
of matching (probability > 0.8). The analysis used USMLE score and number of program
applications as the two predictors because these were nonmissing for a large percentage
of applicants; 1,913 applicants had nonmissing observations for both variables.
The analysis was first stratified by IMG status and then by first-attempt status as
a separate stratification factor. Models were estimated separately within each resulting
subgroup. [Figs. 5] and [6] present the prediction models based on stratification factors. In summary, among
the non-IMG applicants, those with a “high” probability of matching were those with
USMLE ≥ 244 or those with a USMLE between 231 and 243 who submitted at least 30 applications.
Similarly, among those at their first attempt, individuals with a USMLE ≥ 233 were
predicted to have a “high” probability of matching. None of the IMG applicants and
none of those at a second attempt were categorically predicted to have a “high” probability
of matching. Each model explained no more than 18% of the variability in the probability
of matching (fivefold cross-validation R² ≤ 0.18).
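The probability bands and the "high" probability rules summarized above can be written as simple functions. The Python sketch below encodes only the thresholds stated in the text; the remaining branches of the trees are shown in the figures and are not reproduced here. Function names and the example inputs are illustrative.

```python
def probability_band(p):
    """Map a predicted probability of matching to the three bands used in the text."""
    if p > 0.8:
        return "high"
    if p >= 0.4:
        return "moderate"
    return "low"

def high_probability_non_img(usmle, n_applications):
    """Rule stated in the text for non-IMG applicants predicted to be 'high'."""
    return usmle >= 244 or (231 <= usmle <= 243 and n_applications >= 30)

def high_probability_first_attempt(usmle):
    """Rule stated in the text for first-attempt applicants predicted to be 'high'."""
    return usmle >= 233

# Hypothetical example inputs.
print(probability_band(0.93), probability_band(0.66), probability_band(0.2))
print(high_probability_non_img(238, 35), high_probability_non_img(238, 20))
```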
Fig. 5 Classification tree summary of factors predictive of matching for (A) non-IMG (international medical graduate) applicants (n = 1,671). The resulting decision tree explains 18% of the variability in the probability
of matching (5-fold cross-validation R² = 0.18). (B) IMG applicants (n = 242). The resulting decision tree explains 7% of the variability in the probability
of matching (5-fold cross-validation R² = 0.07). Green, high probability of matching (>0.8); NPA, number of program applications;
P, predicted probability of matching; Red, low probability of matching (<0.4); USMLE,
United States Medical Licensing Examination; Yellow, moderate probability of matching
(0.4–0.8).
Fig. 6 Classification tree summary of factors predictive of matching for (A) first attempt applicants (n = 1,690). The resulting decision tree explains 14% of the variability in the probability
of matching (5-fold cross-validation R² = 0.14). (B) second or higher attempt applicants (n = 233). The resulting decision tree explains 14% of the variability in the probability
of matching (5-fold cross-validation R² = 0.14). Green, high probability of matching (>0.8); NPA, number of program applications;
P, predicted probability of matching; Red, low probability of matching (<0.4); USMLE,
United States Medical Licensing Examination; Yellow, moderate probability of matching
(0.4–0.8).
Discussion
Roughly 70% of candidates match in ophthalmology, and 70% match at one of their top
three choices. Yet a majority (53%) apply to more than 60 programs and over one-quarter
(27%) apply to more than 80. Clearly, program directors and faculty advisors need
more data to assist candidates in the application and matching process.
The recursive partitioning model that we have described is extremely useful in this
regard. Since the match rates are so disparate for IMG versus US graduates and first-
versus nonfirst attempts, these are logical stratification points. The group with
the highest match rate (93%) is US graduates, regardless of first or repeated attempt,
with a USMLE step 1 score of 244 or greater. The second highest match rate (87%) was
among all first-time candidates with a USMLE score of 233 or greater. For these two
groups, the number of programs applied to or ranked was not a factor in matching success.
The third highest match rate (83%) occurred among US graduates with USMLE scores from
231 to 243 who applied to 30 or more programs. Although our data do not allow specific
recommendations on exact numbers, it would seem unlikely that many candidates in these
groups should be advised to apply to more than 40 programs, especially as among all
applicants overall, applying to more than 48 programs is not associated with an increasing
number of interviews; this number decreases to 39 when confined to US graduates alone.
These suggested thresholds are well below the average number of applications submitted
by all applicants (mean 64, SD 27), US graduates (mean 65, SD 25), and IMG applicants
(mean 56, SD 36).
The fourth highest success rate (66%) was among US graduates with USMLE scores of
217 to 230 who applied to 43 or more programs, and the next highest (61%) among first-time
applicants with USMLE < 233 who applied to 47 or more programs. However, those with
scores 229 to 232 who applied to fewer than 47 programs had a similar match rate (60%).
These groups seem the most likely to benefit from applying to more than 40 programs.
For candidates who have previously not matched, the single most important factor in
matching was a USMLE score > 235 (55%), independent of number of programs applied
to.
For candidates not in the groups described above, there were no subgroups with match
rates > 50%. US graduates and first-time applicants with USMLE scores < 231 should
be advised to apply to more than 45 programs. Non-US graduates, regardless of USMLE
score, and previously unmatched candidates with USMLE score 220–235 did not achieve
a moderate probability of matching until they had applied to more than 100 programs
(43 and 42%, respectively).
While we did find that AOA (Alpha Omega Alpha Honor Medical Society) membership conferred
an increased probability of matching (55% vs 12%, p < 0.0001), this finding must be interpreted cautiously. AOA membership status was
missing for 992 (51%) of applicants and, therefore, could not accurately be evaluated
as an independent predictive factor of matching, relative to USMLE score or number
of submitted applications in our analysis. Some US medical schools, and many non-US
schools, may not have an AOA chapter, and some offer both junior and senior admission,
while others offer only senior admission. Additionally, differences among schools,
timing of application submissions, and the fact that individual chapters have nonstandard
admission criteria and protocols are significant confounders which limit interpretation
of results.
Similarly, research activity was common in both matched and unmatched groups. However,
the analysis simply used the presence or absence of any research activity; we did
not distinguish among the many relevant criteria regarding research, for example, whether the
research activity was print publication(s), the impact factor of any journals, presentation
at local versus regional versus national meetings, or whether the activity was merely
participation of some sort in an uncompleted project. Finally, in recent years the
Gold Humanism Honor Society has grown across the medical community and is increasingly
associated with many of the character traits felt to be important for competent physicians.
It is certainly possible that any of these factors could be significantly correlated
with probability of matching, and revision of the application form to clarify and
standardize reporting of these factors would offer further assistance in advising
candidates. In a similar vein, a standardized letter of recommendation format has
been recommended for ophthalmology candidates, although to date it has not been widely
adopted.
A final set of limitations to this study surrounds the fact that data on interview
invitations and completions are self-reported by both programs and applicants. In the
match system, programs must mark an applicant as invited for an interview before they
can view candidate photos or add them to a rank list. Similarly, applicants are asked
to mark an interview as having been completed, but in some cases may erroneously make
or omit this designation. As an example of the consequences of this methodology,
in the 2015 application cycle there were 726 candidates who submitted 634 rank lists.
Of these, 624 applicants were ranked. Programs reported offering a total of 6,594
interview invitations with 5,503 interviews completed. However, in the same
cycle, applicants reported 6,655 interview invitations with 5,749 interviews completed.
The newly formed SF Match Oversight Committee of the AUPO intends to review this design
system and consider alternatives to minimize such discrepancies.
Although this analysis yields the ability to provide overall advice to various categories
of candidates, it does not permit more detailed candidate-specific guidance that accounts for
individual portfolios and “good fits” for either candidates or programs in creating
their rank lists. The focus of this study may make many candidates and advisors more
comfortable with a smaller number of programs applied to, but does not assist with
determining which candidates should apply to which programs. In this regard, programs
should consider providing candidates with standardized data regarding their residents
(e.g., program mean USMLE scores, grade point average (GPA), AOA status, research
participation in medical school) to help them determine their best application strategy.
In summary, these data indicate that many candidates for ophthalmology residency (first-time
applicants/US grads with USMLE scores >243) need not apply to 60 or 80 programs to
successfully match; for most of them, 40 to 45 applications should suffice. The 45
to 60 range may be indicated for first-time US graduate applicants with USMLE scores
in the 217 to 243 range. A relatively small number of candidates, especially international
graduates and previously unmatched candidates (13% and 12% of the entire applicant pool, respectively),
should consider applying to more than 80 programs, particularly if their USMLE score
is <236.
Finally, we recognize that these recommendations must always be individualized, as
there are many additional factors that cannot be easily standardized or quantified
which nevertheless play important roles in the likelihood of candidates matching.
These include variable and qualitative grading rubrics, communication skills, strength
of letters of recommendation, extracurricular and employment experiences, and interview
performance. In the end, applicants for ophthalmology residency are ranked, and thus
matched, by a holistic consideration of their entire candidate portfolio.