Keywords: intersession experiences, psychotherapy process, routine outcome monitoring
Keywords (German): Intersession-Prozesse, Psychotherapieprozess, Routine Outcome Monitoring
Introduction
The systematic assessment of therapeutic processes and outcomes is increasingly
recognized as a cornerstone of evidence-based clinical practice, as routine outcome
monitoring (ROM) and feedback systems have shown benefits for the course and success
of psychotherapy [1]. Economical measurement tools can help increase the already
good acceptance of progress tracking. Such tracking facilitates treatment
personalization by enabling therapists to make timely, evidence-informed decisions.
However, brief instruments suitable for repeated measurements are not yet available
for all relevant dimensions of psychotherapy. One factor that lends itself to
high-frequency monitoring and feedback systems, but is scarcely represented in the
international research landscape, is intersession experiences.
Intersession experiences are thoughts, feelings, memories, and fantasies about
therapy or the therapist that occur between sessions [2]. The concept is grounded in
attachment theory, object relations theory, and mentalization/reflective
functioning, but it is relevant across theoretical orientations and diagnoses to
understand the translation of therapy contents into daily life. This aligns with the
Generic Model of Psychotherapy [3], which emphasizes that lasting change arises from
ongoing micro-outcomes that extend beyond sessions: as patients internalize
therapeutic insights and apply them to real-life situations, they gain new
experiences that contribute to personal growth and problem-solving capacities.
Intersession experiences have shown associations with patient characteristics, the
current in-session process, the intensity of the therapeutic process, and the
patient’s emotional involvement [4]. Positive intersession experiences are
correlated with the therapeutic relationship [5] [6] [7] as well as treatment
success [4] [5] [8]. While frequent assessments and longitudinal investigations are
still scarce [8] [9], emerging evidence indicates that the emotional quality of
intersession experiences is decisive for psychotherapeutic progress and outcome,
also in the sense of tightly coupled temporal dynamics [10]. Questionnaires suitable
for repeated measurements have been developed [2] [11] [12], among which the
Intersession Experience Questionnaire (IEQ) [2] is the most widely used instrument.
The original English version has been revised multiple times and includes 42 items.
Hartmann, Orlinsky, Geller, et al. [12] validated a German translation that
encompasses 52 items.
A briefer assessment would not only reduce patient burden but also enable rapid
evaluations (to inform ROM), the inclusion of intersession processes in denser
ambulatory assessment protocols, and their use as a process variable, a
perspective that is currently underrepresented [13].
To this end, we examine the psychometric properties of a new, shorter instrument,
the Intersession Experience Questionnaire-Short (IEQ-S; German:
Intersession-Fragebogen-kurz, ISF-K). To support convergent validity, we test
expected positive associations with the longer, original IEQ and with other
constructs such as the working alliance, defense mechanisms, and symptoms. In terms
of discriminant validity, we expect comparatively weak associations with global
assessments of mental and physical health. Furthermore, we explore the instrument's
sensitivity to change over time, a prerequisite for effective monitoring, and how it
relates to symptoms in a longitudinal assessment.
Methods
Design and Participants
Samples 1–5 were collected to support the initial validation of the IEQ-S.
Participants filled out the full IEQ, the shorter IEQ-S, and further
questionnaires once, approximately three weeks after the start of therapy. This
timing was chosen to ensure comparability between patients, as intersession
experiences may shift over the course of psychotherapy [8]. Sample 6 differs in
design: here, only the IEQ-S and a symptom measure were assessed longitudinally,
session by session.
Across samples and settings, patients were included in the study if they were at
least 18 years old, had an ICD diagnosis F.XX (mental and behavioral
disorders), and had participated in at least one past individual/group
psychotherapy session. Patients were excluded if they had acute psychotic symptoms
or did not speak German.
The Ethics Committee of the University of Klagenfurt approved the study in which
samples 1–5 were collected (no. 2018–084). The collection and use of the
data for sample 6 for research purposes were approved by the ethics committee of
the German Psychological Society (DGPs) in 2022. All assessments were conducted
in accordance with the principles of the Declaration of Helsinki, and all
participants provided informed consent.
Instruments
The German version of the IEQ [12] was the basis for the development of the IEQ-S.
Its items are organized into five item groups, three of which have subfactors
(Suppl. Table 1). The items are rated on a five-point Likert scale from 0=not at
all to 4=very often. The authors of the original IEQ suggest that the analysis be
performed at the factor level.
For the IEQ-S item selection, we chose from each factor the item with the highest
factor loading, based on the published results of the factor analyses [6] [12].
The original item group “significant others, sharing intersession experiences”
showed poor psychometric properties. In addition, it was considered less directly
relevant to the therapeutic relationship than other intersession experiences and
could be confounded by the quality and quantity of patients' relationships
outside of psychotherapy, so it was omitted. As a result, the IEQ-S encompasses
eight items (Suppl. Table 2).
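The selection rule described above (one item per factor, chosen by the highest published loading, with one item group omitted) can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_top_items`, the item identifiers, and all loading values are hypothetical and are not taken from the published factor analyses.

```python
def select_top_items(loadings_by_factor, excluded_factors=()):
    """Pick, for every retained factor, the item with the highest loading.

    loadings_by_factor maps a factor label to a dict {item_id: loading}.
    Factors listed in excluded_factors are skipped entirely.
    """
    selected = {}
    for factor, loadings in loadings_by_factor.items():
        if factor in excluded_factors:
            continue
        if not loadings:
            raise ValueError(f"factor {factor!r} has no items")
        # max over item ids, ranked by their loading on this factor
        selected[factor] = max(loadings, key=loadings.get)
    return selected


# Hypothetical loadings for illustration only (not the published values).
example_loadings = {
    "A": {"a1": 0.61, "a2": 0.78},
    "B1": {"b1": 0.55, "b2": 0.49, "b3": 0.70},
    "E": {"e1": 0.40, "e2": 0.35},  # stands in for the omitted item group
}
selected = select_top_items(example_loadings, excluded_factors=("E",))
```

With eight retained factors, a rule of this kind yields exactly eight items, one per factor.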
In addition to the IEQ and IEQ-S, patients in samples 1–5 filled out the
following questionnaires:
The Working Alliance Inventory – Short Revised (WAI-SR) [14] assesses the
therapeutic alliance with a total of 12 items, which are answered on a five-point
Likert scale (1–5). The WAI-SR covers three dimensions with four items each:
agreement on the tasks of therapy, agreement on the goals of therapy, and
development of an affective bond. Scale scores are computed as the mean of the
respective items.
The Symptom-Checklist Short 9 (SCL-K-9) [15] is a one-dimensional 9-item
short version of the SCL-90-R and assesses the symptom burden during the past 14
days. Items are answered on a five-point Likert scale (0–4), reflecting the
extent to which individuals were burdened by the respective symptom, and then
summed up. The sum score thus ranges from 0 to 36.
The Short Form 12 (SF-12) is a self-report outcome measure and the short
version of the SF-36. It captures health on eight dimensions: physical
functioning, role-physical, bodily pain, general health, vitality, social
functioning, role-emotional, and mental health. It has been validated in a
representative German population sample [16]. In the present work, we concentrated
on the scales physical and mental health (to investigate the discriminant validity
of the IEQ-S), which each contain two items. The scale values can range from 0 to
100 points, with lower values reflecting poorer health and higher values
reflecting better health.
The Defense Style Questionnaire (DSQ-40) [17] is a self-report measure for the
assessment of a person's own defense mechanisms. It comprises a total of 40
items, which are answered on a nine-point Likert scale and load on three factors:
maladaptive defense, intermediate neurotic defense, and adaptive defense.
In sample 6, patients completed the depression module of the Patient Health
Questionnaire (PHQ-9) [18], which assesses the DSM-IV diagnostic criteria for
major depressive disorder. Respondents indicate how often they have been bothered
by each symptom over the past two weeks, using a four-point Likert scale (0=not at
all to 3=nearly every day), yielding a total score between 0 and 27.
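The sum-score rules given for the SCL-K-9 (nine items, 0–4, total 0–36) and the PHQ-9 (nine items, 0–3, total 0–27) can be written as a small scoring sketch. The function names are hypothetical, and rejecting incomplete or out-of-range responses is an assumption made here for illustration; the text does not describe how missing items were handled at the scoring stage.

```python
def sum_score(responses, n_items, low, high):
    """Sum item responses after checking their count and permitted range."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    for value in responses:
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"response {value!r} is outside {low}-{high}")
    return sum(responses)


def phq9_total(responses):
    """PHQ-9 total: nine items rated 0-3, so the total lies in 0-27."""
    return sum_score(responses, n_items=9, low=0, high=3)


def sclk9_total(responses):
    """SCL-K-9 total: nine items rated 0-4, so the total lies in 0-36."""
    return sum_score(responses, n_items=9, low=0, high=4)


# Hypothetical response vectors for one session.
phq_example = phq9_total([1, 0, 2, 3, 1, 0, 0, 1, 2])
scl_example = sclk9_total([4, 4, 4, 4, 4, 4, 4, 4, 4])
```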
Statistical Analyses
For the investigation of the psychometric properties of the IEQ-S, we first used
the cross-sectional data of samples 1–5. Exploratory factor analyses (EFA) were
conducted on the combined data. All information regarding the EFA is provided in
the Supplemental Material (Suppl. Table 3).
Internal consistency estimates were calculated as McDonald’s omega (ω), which we
preferred over Cronbach’s α as it does not assume tau-equivalence and provides a
more accurate reliability estimate. Values of ≥ .70 are generally interpreted as
acceptable [19].
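For orientation, the coefficient can be written out for the simplest case. This is a sketch that assumes a single-factor model with uncorrelated residuals; the text does not state which omega variant was computed, so the symbols are illustrative: \lambda_i is the loading of item i, \theta_i its residual variance, and k the number of items.

```latex
\[
\omega \;=\;
\frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}
     {\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \theta_i}
\]
```

Under this model, Cronbach's α equals ω only when all loadings are equal (tau-equivalence) and is otherwise a lower bound, which is why ω is the more general choice when items differ in how strongly they reflect the construct.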
Based on sample 6 that included longitudinal assessments, we calculated
intraclass correlations (ICCs) to quantify the proportion of variance
attributable to between-person differences versus within-person fluctuations
over time, using unconditional random intercept models. Higher ICC values
indicate that a larger proportion of the variance is stable across persons,
whereas lower ICC values suggest greater within-person variability (i.e.,
state-like fluctuations). To further capture temporal dynamics, we calculated
mean squared successive differences (MSSD), which quantify the magnitude of
short-term variability by examining the squared differences between consecutive
data points. MSSD values were first computed at the person level and then
averaged across the sample. Higher MSSD values suggest comparatively more
dynamic variation in these intersession experiences, whereas lower values
indicate more temporal stability. We also calculated repeated-measures
correlations of the IEQ-S and the PHQ-9 sum score. The repeated-measures
correlation coefficient quantifies this association while accounting for the
non-independence of observations within individuals.
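The MSSD computation described above (person-level values first, then the sample average) can be sketched as follows. The function names and the example ratings are hypothetical, and the sketch assumes each person's series is ordered by session and contains no gaps; how missed sessions were treated is not stated in the text.

```python
from statistics import mean


def mssd(series):
    """Mean squared successive difference of one ordered time series."""
    if len(series) < 2:
        raise ValueError("MSSD needs at least two consecutive observations")
    squared_diffs = [(later - earlier) ** 2
                     for earlier, later in zip(series, series[1:])]
    return sum(squared_diffs) / len(squared_diffs)


def sample_mssd(series_by_person):
    """Compute MSSD per person, then average the person-level values."""
    per_person = {pid: mssd(values)
                  for pid, values in series_by_person.items()
                  if len(values) >= 2}  # persons with one observation are skipped
    return per_person, mean(per_person.values())


# Hypothetical session-by-session ratings of one item (scale 0-4).
ratings = {
    "p01": [2, 2, 3, 1],   # moderate fluctuation
    "p02": [0, 4, 0, 4],   # strong session-to-session swings
    "p03": [1, 1, 1, 1],   # perfectly stable
}
per_person, average = sample_mssd(ratings)
```

Unlike the within-person variance, the MSSD depends on the order of observations: the alternating series of `p02` yields a much larger value than the same ratings would if they were sorted.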
Data were analyzed in R version 4.5.0 [20] with the packages dplyr,
esmpack, ggplot2, nlme, psych, remotes, and rmcorr.
Results
Sample characteristics
The combined sample for the psychometric analyses included 237 patients from six
clinical settings in Austria and Germany. [Table 1] describes the separate
samples. Sample 1 comprised n = 37 inpatients from an Austrian private
clinic, where patients are admitted for six weeks and receive multimodal treatment.
The items were answered in relation to the individual therapy. Sample 2
(n = 58) was collected at an Austrian inpatient rehabilitation clinic,
where patients are also admitted for six weeks and undergo a multimodal
therapy program. The questionnaires again referred to the individual therapy.
Sample 3 comprised n = 27 outpatients from an Austrian outpatient
psychotherapy center.
Table 1 Characteristics of the combined sample and the six subsamples.

| Sample | Combined samples 1–6 | 1: Private Clinic | 2: Rehabilitation Center | 3: Psychotherapy Center I | 4: Student Counseling Center | 5: Day Clinic | 6: Psychotherapy Center II |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Country | | Austria | Austria | Austria | Austria | Germany | Austria |
| Setting | | inpatients | inpatients | outpatients | outpatients | outpatients | outpatients |
| Therapy | | individual | individual | individual | individual | group | individual |
| n | 237 | 37 | 58 | 27 | 20 | 23 | 72 |
| Age, M (SD) | 43.59 (11.62) | 51 (9.00) | 46.65 (9.71) | 42.39 (8.23) | 28.75 (7.51) | 47.17 (12.54) | 40.69 (13.21) |
| Age, Minimum – Maximum | 18–81 | 32–68 | 25–64 | 27–55 | 21–53 | 23–65 | 18–81 |
| Gender (n, %): Women | 133 (56.12) | 15 (40.54) | 37 (63.80) | 20 (74.07) | – | 13 (56.52) | 48 (66.7) |
| Gender (n, %): Men | 75 (31.65) | 18 (48.65) | 21 (36.20) | 5 (18.52) | – | 10 (43.84) | 21 (29.2) |
| Primary Diagnosis (ICD-10) (n, %): F3 | 129 (54.4) | 25 (67.57) | 40 (68.97) | 19 (70.37) | 2 (10.00) | 21 (91.30) | 22 (30.6) |
| Primary Diagnosis (ICD-10) (n, %): F4 | 52 (21.9) | 7 (18.92) | 15 (25.86) | 2 (7.41) | 13 (65.00) | 2 (8.70) | 13 (18.1) |
| Primary Diagnosis (ICD-10) (n, %): Other | 40 (16.9) | 1 (2.70) | 1 (1.72) | 2 (7.41) | – | – | 36 (50.0) |
| Primary Diagnosis (ICD-10) (n, %): Not available | 16 (6.8) | 4 (10.81) | 2 (3.50) | 4 (14.81) | 5 (25.00) | – | 1 (1.4) |

Note. Diagnoses of International Statistical Classification of
Diseases and Related Health Problems 10th Revision (ICD-10):
F3: affective disorders; F4: neurotic, stress-related, and somatoform
disorders.
Sample 4 comprised n = 20 outpatients from a psychological
counseling center for students. On average, students are in individual treatment
for six months. Sample 5 comprised n = 23 patients from a German day
clinic, who answered the items with reference to their group therapy.
Sample 6, which was collected longitudinally, is based on routine clinical data
from the Psychotherapeutic Research and Teaching Outpatient Centre
(Psychotherapeutische Forschungs- und Lehrambulanz der Universität Klagenfurt,
PUK) of the University of Klagenfurt, which offers psychodynamic psychotherapy
to adults free of charge. The dataset included 2,119 sessions from 72 patients
(M = 29.43, SD = 27.35 sessions per patient).
IEQ and IEQ-S descriptives
For each variable within the cross-sectional dataset (samples 1–5), the rate of
missing values was between 0% and 4.2%. The missing completely at random (MCAR)
test indicated that the values were missing completely at random,
χ²(5287) = 4484.34, p = 0.99. Therefore, we used an
iterative maximum-likelihood method, the expectation maximization (EM)
algorithm, to impute missing values. The descriptive data for each questionnaire
are shown in Suppl. Table 4. The responses to the IEQ-S items are
shown in [Fig. 1], indicating variation in how frequently different intersession
experiences were reported.
Fig. 1 Participants’ responses to the eight items of the Intersession
Experience Questionnaire-Short (IEQ-S). Legend: Each box represents the
distribution of responses for a single item. The horizontal line inside
each box indicates the median, while the box boundaries reflect the
interquartile range (IQR). Whiskers extend to 1.5×IQR. Items 1, 2, 5,
and 7 exhibit relatively high median values and narrow interquartile
ranges, indicating that these types of experiences were reported more
frequently and consistently. In contrast, items 3, 4, 6, and 8
show lower median scores and broader or more skewed distributions,
suggesting that they occurred less frequently or were more variable between
individuals. Responses to items 3 and 6 are strongly skewed toward the
lower end of the scale, with the majority of participants reporting that they
occurred rarely or never.
Factor analyses and internal consistency
Exploratory factor analyses supported a three-factor solution that explained 42%
of the variance and showed excellent fit, χ²(7) = 5.18, p = 0.639. A
scree plot also supported a one-factor solution, but this explained less variance
(28%) and showed worse fit, χ²(20) = 38.72, p = 0.007.
The correlations between the IEQ-S items are shown in Suppl. Table 5. The
internal consistency of the IEQ-S is ω = 0.72, which can be seen as acceptable and
allows for the calculation of a sum score. We also calculated the internal
consistencies of the longer, original IEQ factors (Suppl. Table 1).
Convergent Validity
The overall mean scores of the IEQ and the IEQ-S were correlated at
r = 0.70, p < 0.001 (Suppl. Figure 1). A Bland-Altman
plot visualizes their agreement as well (Suppl. Figure 2). All IEQ-S
items were correlated with their original factors (item 1 with item group A:
r = 0.592, p < 0.001; item 2 with B1: r = 0.638,
p < 0.001; item 3 with B2: r = 0.489, p < 0.001; item 4
with C1: r = 0.561, p < 0.001; item 5 with C2: r = 0.671,
p < 0.001; item 6 with C3: r = 0.755, p < 0.001; item 7
with D1: r = 0.530, p < 0.001; and item 8 with D2: r = 0.551,
p < 0.001). The correlation pattern showed that each item correlated
most strongly with its conceptually intended factor (Suppl. Table 6).
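The quantities behind a Bland-Altman plot are the pairwise differences between the two scores, their mean (the bias), and the limits of agreement. The following is a minimal sketch that assumes the conventional limits of bias ± 1.96 standard deviations of the differences and that both mean scores are on the same 0–4 scale; the function name and the example values are hypothetical and do not reproduce the study data.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return bias, limits of agreement, and the points of a Bland-Altman plot."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must have the same length")
    if len(scores_a) < 2:
        raise ValueError("at least two paired scores are required")
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    pair_means = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation (n - 1)
    return {
        "bias": bias,
        "lower": bias - 1.96 * spread,
        "upper": bias + 1.96 * spread,
        # x = mean of the pair, y = difference of the pair
        "points": list(zip(pair_means, differences)),
    }


# Hypothetical mean scores of the long and short form for four patients.
long_form = [1.0, 2.0, 3.0, 4.0]
short_form = [1.5, 1.5, 3.5, 3.5]
agreement = bland_altman(long_form, short_form)
```

A bias near zero with narrow limits would indicate that the short form can stand in for the long form at the level of individual patients, which a correlation coefficient alone does not show.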
The sum score of the IEQ-S showed correlations with the WAI-SR subscales (bond:
r = 0.26, p = 0.002; task: r = 0.57, p < 0.001; goal:
r = 0.45, p < 0.001), the SCL-K-9 (r = 0.23,
p = 0.002), and neurotic defense mechanisms (r = 0.26,
p = 0.002).
Discriminant Validity
The IEQ-S was not significantly correlated with patients' health (physical
health: r = −0.03, p = 0.76; mental health: r = 0.002,
p = 0.98).
Longitudinal analyses including the association with depression
symptoms
The ICCs indicated that a substantial proportion of variance in the IEQ-S sum
score and items was due to within-person changes over time, but the proportion
varied (Suppl. Table 7).
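For reference, the ICC from an unconditional random intercept model can be written as follows. The notation is an assumption chosen for illustration: y_{ti} is the score of person i at session t, u_{0i} the person-specific deviation from the grand mean \gamma_{00}, and e_{ti} the session-level residual.

```latex
\[
y_{ti} = \gamma_{00} + u_{0i} + e_{ti},
\qquad u_{0i} \sim \mathcal{N}(0, \tau_{00}^{2}),
\qquad e_{ti} \sim \mathcal{N}(0, \sigma^{2})
\]
\[
\mathrm{ICC} = \frac{\tau_{00}^{2}}{\tau_{00}^{2} + \sigma^{2}}
\]
```

The complement, 1 − ICC, is the share of variance attributable to within-person fluctuation over time, which is the quantity of interest for a monitoring instrument.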
Lastly, we examined repeated-measures correlations of the PHQ-9 sum score with the
IEQ-S sum score (r = 0.130, p < 0.001) and the IEQ-S single items
(item 1: r = 0.192, p < 0.001; item 2: r = 0.111,
p < 0.001; item 3: r = 0.086, p < 0.001; item 4:
r = 0.105, p < 0.001; item 5: r = −0.109, p < 0.001;
item 6: r = 0.131, p < 0.001; item 7: r = −0.171,
p < 0.001; item 8: r = 0.210, p < 0.001). The largest
correlation is visualized in Suppl. Figure 3.
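The logic of a repeated-measures correlation can be illustrated with a short sketch. It assumes the common-slope formulation, in which the coefficient equals the Pearson correlation of the two variables after each has been centered on the person's own mean. This is not the code of the rmcorr package used in the study; the function name and data are hypothetical, and p-values and confidence intervals are omitted.

```python
from math import sqrt


def repeated_measures_r(records):
    """Within-person correlation from (person_id, x, y) records.

    Both variables are centered on each person's own mean, so stable
    between-person differences do not contribute to the coefficient.
    """
    by_person = {}
    for person_id, x, y in records:
        by_person.setdefault(person_id, []).append((x, y))

    ss_x = ss_y = ss_xy = 0.0
    for observations in by_person.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            ss_x += (x - mean_x) ** 2
            ss_y += (y - mean_y) ** 2
            ss_xy += (x - mean_x) * (y - mean_y)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("no within-person variation in at least one variable")
    return ss_xy / sqrt(ss_x * ss_y)


# Hypothetical data: two patients who differ strongly in level, but whose
# scores move together from session to session within each patient.
records = [
    ("p01", 1, 10), ("p01", 2, 11), ("p01", 3, 12),
    ("p02", 6, 2), ("p02", 7, 3), ("p02", 8, 4),
]
r_within = repeated_measures_r(records)
```

In this example a correlation pooled across all six observations would be negative, because the patient with higher x values has lower y values, whereas the within-person association is perfectly positive. Separating these two sources is the reason for using a repeated-measures coefficient with session-by-session data.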
Discussion
This study aimed to evaluate a short form of the Intersession Experience
Questionnaire, the IEQ-S, for use in both psychotherapy research and routine
clinical monitoring. The goal was to create a time-efficient instrument while
preserving the breadth of the original IEQ.
Accordingly, the present results show that the items capture related but not
redundant aspects of intersession activity. For example, some processes happened
less often than others. Especially scarce were dreams about therapy or therapists
(item 3). While they can be an indicator of how engaged a patient is with therapy,
dreams are unconscious or at least preconscious, which is why they might not be
accessible to conscious recollection. Further, in adapting the item for the short
version, the original wording (“thinking about therapy/therapist in a dream”) was
simplified to “dreaming of therapy or the therapist”, which changes the scope of the
item.
Item 6 is about wondering whether the therapist is thinking about the patient, which
involves reflective functioning, the capacity to consider the mental states of
others [21]. This ability, however, may be limited, especially during times of
acute distress, making such reflections more difficult or even inaccessible. Item 8
has a negative connotation and thus differs from the other items in polarity. We
opted to include it, as information about negative intersession experiences is also
relevant, and uncomfortable feelings can be a natural part of the therapeutic
process, which often involves working through painful experiences and tolerating
temporary distress. Indeed, psychotherapy tends to be most effective when it
maintains a balance between emotional challenge and a sense of safety [22] [23].
Items 7 and 8, which capture emotionally positive and negative experiences, were
only weakly correlated, indicating that they are not just two sides of the same
coin. This perspective is supported by the longitudinal findings from sample 6: in
particular, items capturing negative emotions or disruptions covaried more strongly
with symptom severity, suggesting that the clinical relevance of the single items
is not uniform [8] [10]. Additionally, the ICCs for the IEQ-S sum score and
individual items indicated both between- and within-person variability, and MSSD
values showed item-level variation in within-person dynamics, meaning that some
intersession experiences are more volatile than others.
Taking into account the comparisons with the full IEQ, the IEQ-S can be regarded as
a pragmatic compromise that balances efficiency and conceptual coverage. Although it
might not replace the long version in all respects, in many clinical settings,
including those represented by samples 1–5 with their diverse inpatient and
outpatient contexts, the short version may be particularly attractive, as it can be
more easily embedded into routine clinical processes.
However, we still recommend focusing on the eight individual items rather than
computing a sum score, as collapsing across conceptually distinct processes risks
obscuring critical patterns of variation. Intersession experiences are, by
definition, multidimensional; they include affective, cognitive, imaginative, and
interpersonal phenomena that vary in form, function, and clinical relevance [2]
[12]. In future research, item-/profile-based approaches may yield more
fine-grained insights, particularly in applied settings where early detection of
change is essential. One key direction is thus to examine the predictive validity
of specific items regarding outcomes such as symptom reduction, alliance, or
dropout. If only a subset consistently shows clinical relevance, further shortening
may be warranted.
Limitations
The first and largest limitation of this study is the sample. Although the
different settings and patient groups add to the external validity of the
investigation, individual subsamples were too small and too different to
systematically investigate the effects of their characteristics, and there was
very little information about patients and therapists. In future studies, a
sample with more homogeneous patients and therapists should be examined. Second,
it is necessary to consider recall biases, as intersession experiences might
be fleeting and therefore not be (accurately) remembered. Third, it is difficult
to identify an appropriate measurement point. Patients in samples 1–5
participated about three weeks after admission, when the therapeutic
relationship may not be as evolved as at the end of therapy. This could explain
the low correlations with the WAI-SR. Future studies should add more systematic
comparisons of therapy phases.
Conclusion
The IEQ-S appears to be a psychometrically adequate and clinically promising tool
for capturing the rich, dynamic field of intersession experiences. By preserving the
heterogeneity of the original measure in a concise format, the IEQ-S provides both
a practical solution for repeated assessment and an informative, complementary
perspective on the therapy process. We suggest that clinical applications and future
studies move beyond a sum-score-centric approach and leverage the diagnostic and
prognostic value embedded in the diversity of intersession experiences.
Clinical implications
The IEQ-S offers a brief, reliable, and valid tool to monitor patients’
intersession experiences session-by-session. Tracking these thoughts and
feelings between sessions can enhance understanding of patient engagement,
detect ruptures or negative dynamics early, and inform timely therapeutic
interventions. Its brevity makes it feasible for routine use, supporting
feedback-informed care without adding undue burden.