Keywords PSQI - reliability - sleep quality - young adults - psychometric property
Introduction
Good sleep quality in all age groups is essential for physical, mental, and cognitive
functioning, which could impact quality of life.[1] Poor sleep quality has become a public health concern that has been associated
with several serious medical conditions.[1] This problem is becoming increasingly recognized for adolescents and young adults,[2][3]
and it is linked to obesity,[4] depression, musculoskeletal pain,[5][6][7][8] cognitive impairments and increased risk-taking medications.[1][9][10][11]
Consequently, a reduction in physical functioning, psychological well-being, self-care,
activities of daily living, ability to work, and interpersonal relationships has been
reported to be associated with reduced sleep quality in adolescents and young adults.[3][4][5][6][7][10][12][13][14][15][16][17]
Emphasizing the importance of sleep quality and including sleep assessment as a
major part of the routine clinical practice are necessary. Therefore, a better understanding
of sleep quality assessment tools can help in the identification and management of
sleep problems before the patients suffer the long-term consequences associated with
poor sleep.
Sleep quality can be assessed objectively and subjectively. Self-reported questionnaires
were found to be one of the most used methods to assess subjective sleep quality among
different populations, including adolescents and young adults. They are also known
to be the most cost-effective measurement method.[18] The Pittsburgh Sleep Quality Index (PSQI), the Epworth Sleepiness Scale (ESS), and
the Functional Outcomes of Sleep Questionnaire (FOSQ) are often used to assess an
individual's subjective sleep quality.[19][20] The PSQI contains 19 self-rated items and 5 questions rated by a bed partner or roommate; only the self-rated items are scored, and they cover 7 domains of subjective
sleep quality (sleep duration, disturbances, latency, daytime dysfunction,
habitual sleep efficiency, sleep quality, and use of sleeping medications) during
the previous month. This tool has been designed to identify and differentiate good
and poor sleepers.[21] The global PSQI score ranges from 0 to 21, and higher scores are indicative of worse
quality of sleep (scores > 5 indicate poorer sleep quality). The PSQI has been widely
used and found to have adequate psychometric properties and sensitivity of 89.6% and
specificity of 86.5% (kappa = 0.75; p < 0.001) in differentiating good from poor sleepers.[22]
The PSQI has been translated into 51 languages.[23] A systematic review revealed an acceptable internal consistency and construct validity[24] for the English,[25] Chinese,[11] Korean,[26] and Portuguese[27] versions of the PSQI. The original PSQI has been validated among different populations,
such as healthy individuals,[28][29][30] and patients with sleep disorders,[31] psychiatric disorders,[32] neurological diseases, and chronic conditions.[33][34] The internal consistency of the original PSQI has been found to be fair to good,
with a Cronbach alpha coefficient ranging from 0.64 to 0.83.[22][24] The original PSQI total score test-retest reliability estimates have been found
to be moderate (r = 0.56) in pregnant women[35] and strong (r = 0.87) in individuals with primary insomnia.[36]
In 2010, the PSQI was translated into Arabic and tested by Suleiman et al.[37] on 35 healthy bilingual Arabic-English-speaking individuals. It correlated
strongly with the Insomnia Severity Index (ISI; r = 0.76), and moderately with the
vitality subscale of the Medical Outcome Study Short Form-36 (r = -0.33).[37] However, the Arabic translation of the PSQI demonstrated borderline/minimal
acceptability (Cronbach alpha = 0.65).[37] Another study[38] examined the internal consistency of the Arabic PSQI in cancer patients and found
a Cronbach alpha coefficient of 0.77, demonstrating very good acceptability. According
to a systematic review by Al Maqbali et al.[23] (2020), the Arabic PSQI meets the quality assessment criteria for content validity, construct
validity, and internal consistency; however, criterion validity, agreement, reliability,
responsiveness, floor and ceiling effects, and interpretability have not been reported.
No study has yet specifically investigated the test-retest reliability of the Arabic version
of the PSQI.[39] Moreover, (self-administered) bilingual versions of questionnaires/tests may be
more useful and applicable than monolingual ones for bilingual individuals.[40][41][42]
Therefore, in the present study we examined the test-retest reliability of
a bilingual Arabic-English PSQI (AE-PSQI) among healthy bilingual Arabic-English-speaking
adolescents and young adults of the United Arab Emirates (UAE).
Materials and Methods
Participants
In the present cross-sectional study, 50 participants of both sexes, aged between
14 and 26 years, with either poor or good sleep quality at the baseline assessment,
were recruited from schools and universities in the UAE. Participants were excluded
if they had a medical condition or had recently undergone surgeries that affected
their sleep. Ethical approval for this study was granted by the Research Ethics
Committee of the University of Sharjah (REC-22-02-23-01-S).
Procedure
Body weight was measured to the nearest 100 g using a standard portable digital weighing
scale. Height was measured to the nearest 1 cm using a portable stadiometer. The Body
Mass Index (BMI) was calculated for each participant as body weight (kg) divided by
height in meters squared.
Pittsburgh Sleep Quality Index (PSQI)
The overall sleep quality over the preceding month[22] in adolescents and young adults was assessed using the PSQI,[33] considering that this questionnaire is used among adolescents and young adults as
a reliable and valid tool.[43][44] It consists of 19 items divided into 7 sleep-related variables: 1) sleep quality;
2) sleep latency; 3) sleep duration; 4) sleep efficiency; 5) sleep disturbance; 6)
medication use; and 7) daytime dysfunction. Every item is rated on a 4-point Likert
scale in terms of frequency or severity. The sum of the component scores yields a
global PSQI score ranging from 0 to 21, with higher scores indicating poorer sleep quality.
Scores > 5 indicate poor sleep quality, while those ≤ 5 indicate good sleep quality.[45]
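The scoring rule described above can be illustrated with a minimal Python sketch. It assumes that the seven component scores (each from 0 to 3) have already been derived from the 19 items; the component names, function names, and example values are hypothetical and are not taken from the study data.

```python
# Hypothetical component labels; the order follows the list in the text.
COMPONENTS = (
    "sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "sleep_efficiency",
    "sleep_disturbance",
    "medication_use",
    "daytime_dysfunction",
)


def global_psqi(component_scores):
    """Sum the seven component scores (each 0-3) into a global score (0-21)."""
    total = 0
    for name in COMPONENTS:
        if name not in component_scores:
            raise ValueError(f"missing component: {name}")
        score = component_scores[name]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2 or 3, got {score!r}")
        total += score
    return total


def classify_sleeper(global_score):
    """Apply the cutoff from the text: > 5 is poor, <= 5 is good."""
    return "poor" if global_score > 5 else "good"


# Illustrative respondent (made-up values).
example = {
    "sleep_quality": 1,
    "sleep_latency": 2,
    "sleep_duration": 1,
    "sleep_efficiency": 0,
    "sleep_disturbance": 1,
    "medication_use": 0,
    "daytime_dysfunction": 1,
}
score = global_psqi(example)
label = classify_sleeper(score)
```

The derivation of each component from the raw items follows the original PSQI scoring instructions and is deliberately left out of this sketch.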
Sample Size Estimation
Considering a minimum acceptable reliability (intraclass correlation coefficient,
ICC) of 0.60, an expected reliability (ICC) of 0.80, a significance level of 0.05,
and a power of 0.80, the number of participants required is 49 for 2 measurements
(test [baseline] versus retest [after 7 days]).[46] Therefore, 50 participants were recruited for the present study.
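The text does not state which formula underlies this estimate. As an illustration only, the sketch below assumes the widely used approximation of Walter, Eliasziw and Donner for reliability studies with a two-sided significance level; under that assumption the stated inputs give 49 participants. The function name and the `two_sided` switch are hypothetical.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho0, rho1, n_measurements, alpha=0.05, power=0.80,
                    two_sided=True):
    """Approximate number of participants for an ICC reliability study.

    rho0: minimum acceptable ICC, rho1: expected ICC,
    n_measurements: number of measurements per participant.
    Assumed formula: k = 1 + 2 (z_a + z_b)^2 n / ((ln C0)^2 (n - 1)),
    with C0 = (1 + n*theta0) / (1 + n*theta1) and theta = rho / (1 - rho).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    n = n_measurements
    c0 = (1 + n * theta0) / (1 + n * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * n) / (math.log(c0) ** 2 * (n - 1))
    return math.ceil(k)


required = icc_sample_size(0.60, 0.80, 2)
```

With a one-sided significance level the same approximation gives a smaller number, so the choice of sidedness is part of the assumption.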
Procedure
The subjects were invited to participate in the study through social media adverts,
university/school notice boards, and word of mouth. The study procedures were explained
to the interested participants. Prior to being enrolled in the study, participants
and/or their parents read the information sheet and informed consent was provided
by them (in the case of adults) or by their parents/guardians (in the case of adolescents).
We provided both English and the corresponding Arabic translations of each item of
the PSQI together to all participants. The participants were asked to fill the AE-PSQI
twice, seven days apart.
Statistical Analysis
Descriptive characteristics of the participants were presented as mean and standard
deviation (SD) values. Data were tested for normal distribution using the Shapiro-Wilk
test and histograms. As the data were not normally distributed, log and square-root
transformations were applied, but the transformed data still did not meet the assumption
of normality. However, the sampling distribution of the mean of a skewed variable can be considered
nearly normal when the number of participants is large enough (n > 30).[47] Therefore, we used parametric tests for the statistical analysis. The McNemar test
was used to compare the proportion of participants with good and poor sleep quality
regarding the baseline and retest global scores on the AE-PSQI. The IBM SPSS Statistics
for Windows (IBM Corp., Armonk, NY, United States) software, version 28.0, was used
for the statistical analysis, and values of p < 0.05 were set as the threshold for statistical significance.
Floor and Ceiling Effects
The floor and ceiling effects were assessed with the percentage of participants who
scored the lowest (0) and highest (21) possible scores, respectively. If more than 15% of the participants
obtained the lowest or the highest score, floor or ceiling effects were considered to
exist.[48]
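The 15% rule above can be expressed in a few lines of Python. This is a minimal sketch; the function name and the example scores are hypothetical and are not study data.

```python
def floor_ceiling_effects(scores, minimum=0, maximum=21, threshold_pct=15.0):
    """Percentage of respondents at the lowest and highest possible score,
    flagged as an effect when the percentage exceeds the threshold."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor_pct = 100.0 * sum(1 for s in scores if s == minimum) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == maximum) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold_pct,
        "ceiling_effect": ceiling_pct > threshold_pct,
    }


# Made-up global scores for ten respondents.
result = floor_ceiling_effects([0, 0, 3, 5, 6, 7, 8, 9, 10, 21])
```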
Internal Consistency
Internal consistency refers to the degree of consistency among all internal items
of the questionnaire. The internal consistency of the AE-PSQI was assessed using the
Cronbach alpha; in addition, the item-to-total correlation was assessed using the
Pearson correlation coefficient, and the alpha value of the tool if each item was
deleted was reported. The item-to-total correlation refers to the correlation between
each item/component and the global score on the PSQI. The alpha score was interpreted
according to the following criteria: lower than 0.60: “unacceptable”; 0.60 to 0.65:
“undesirable”; 0.65 to 0.70: “minimally acceptable”; 0.70 to 0.80: “respectable”; 0.80
to 0.90: “very good”; and much higher than 0.90: “consider shortening the scale”.[49]
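A minimal Python sketch of these internal-consistency calculations follows. It uses the standard Cronbach alpha formula with sample variances. Because adjacent bands in the interpretation criteria share a boundary value, the sketch assumes that each band includes its lower bound; that choice, the function names, and the example values are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item (component) scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows, index):
    """Cronbach alpha recomputed after dropping one item."""
    reduced = [row[:index] + row[index + 1:] for row in rows]
    return cronbach_alpha(reduced)


def interpret_alpha(alpha):
    """Labels from the text; lower bounds are treated as inclusive (assumption)."""
    if alpha < 0.60:
        return "unacceptable"
    if alpha < 0.65:
        return "undesirable"
    if alpha < 0.70:
        return "minimally acceptable"
    if alpha < 0.80:
        return "respectable"
    if alpha <= 0.90:
        return "very good"
    return "consider shortening the scale"


# Two perfectly consistent made-up items give alpha = 1.
alpha_identical = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```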
Test-Retest Reliability
We compared the Pearson correlation coefficient, the Spearman correlation coefficient,
and the ICC ([3,1]; two-way mixed effects, consistency, single measurements, agreement)
for the test-retest reliability analysis of the baseline PSQI global and component scores
and the seven-day retest scores. As the reliability estimates were almost the same for
all three analyses for all comparisons, and there were 50 participants, the ICC(3,1)
was used for further interpretation; ICC values > 0.75 were considered strong, those
from 0.40 to 0.75 moderate, and those < 0.40 poor.[50] The standard error of measurement (SEM), as a measure of agreement, was calculated
using the following equation: SEM = Sp × √(1 − r), in which Sp is the pooled standard deviation of the test-retest measures and r is the reliability coefficient (ICC).[51][52]
Additionally, the smallest real difference (SRD), the threshold to detect a “real”
change beyond the measurement error, was calculated using the formula SRD = 1.96 × SEM × √2.[52]
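The following Python sketch illustrates these quantities. It computes the consistency form of ICC(3,1) from two-way ANOVA mean squares, and the SEM and SRD from the two formulas given in the text. The pooled SD is taken as the square root of the mean of the two variances, which assumes equal group sizes; the text does not specify the pooling method. Function names are hypothetical. Applied to the rounded global-score values in Table 3 (SDs of 3.27 and 3.11, ICC of 0.77), the formulas give roughly 1.5 and 4.2, slightly below the reported values, which may reflect rounding of the published inputs.

```python
import math
from statistics import mean


def icc_3_1(data):
    """Consistency ICC(3,1) for data[subject][occasion] (complete data)."""
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    row_means = [mean(row) for row in data]
    col_means = [mean(row[j] for row in data) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def pooled_sd(sd_test, sd_retest):
    """Pooled SD for two equally sized sets of scores (assumed form)."""
    return math.sqrt((sd_test ** 2 + sd_retest ** 2) / 2)


def standard_error_of_measurement(sp, r):
    """SEM = Sp * sqrt(1 - r)."""
    return sp * math.sqrt(1 - r)


def smallest_real_difference(sem):
    """SRD = 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem * math.sqrt(2)


# Rounded global-score values from Table 3.
sem_value = standard_error_of_measurement(pooled_sd(3.27, 3.11), 0.77)
srd_value = smallest_real_difference(sem_value)
```

Because the consistency form ignores a constant shift between occasions, a retest that differs from the test by the same amount for every participant yields an ICC of 1.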
Bland-Altman Plots
To further explore the agreement of test-retest AE-PSQI scores, the Bland-Altman plot
was used. The plots with mean values against differences of global PSQI scores between
baseline (1) and retest (2) with 95% limits of agreement (mean bias ± 1.96 × SD)
were used. Here, mean bias and SD are the mean and the SD of the differences, respectively.
A significance level of p < 0.05 was set for all analyses. While assessing the test-retest
agreement in the plot, the differences between the tests were arbitrarily considered
high if they were ≥ 1.5 SDs, moderate if they ranged from 1.0 to 1.49 SDs,
and low if they were < 1.0 SD.[53]
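The quantities behind the plot can be sketched in Python as follows. The bias and limits of agreement follow the definition given above. The size classification is only one possible reading of the thresholds: it assumes that the absolute difference is expressed in units of the SD of the differences. The function names and example scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test, retest):
    """Points, mean bias, and 95% limits of agreement for paired scores."""
    diffs = [a - b for a, b in zip(test, retest)]
    means = [(a + b) / 2 for a, b in zip(test, retest)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,   # x-axis of the plot
        "diffs": diffs,   # y-axis of the plot
        "bias": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


def difference_size(diff, sd):
    """Assumed reading: absolute difference in SD units, classified per the text."""
    ratio = abs(diff) / sd
    if ratio >= 1.5:
        return "high"
    if ratio >= 1.0:
        return "moderate"
    return "low"


# Made-up baseline and retest global scores for four respondents.
ba = bland_altman([5, 7, 9, 4], [4, 7, 10, 6])
```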
Results
Participant Characteristics
This study included 50 participants. The mean age of the sample was 20.82 ± 2.7
years, and it included 34 female (68%) and 16 male (32%) subjects. Participant characteristics
are shown in Table 1. The proportion of poor sleepers (PSQI > 5) at baseline was 60% (n = 30/50),
which was significantly different (p = 0.039) from that of the retest (46%; n = 23/50).
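The paired 2 × 2 classification table behind this comparison is not reported. The Python sketch below shows an exact (binomial) McNemar test, which depends only on the discordant pairs. The counts used in the example (8 participants changing from poor to good and 1 from good to poor) are an assumption: they are one split consistent with the reported marginal counts of 30 and 23 poor sleepers out of 50, and with this variant of the test they give p ≈ 0.039. Whether the exact variant was the one used in the analysis is not stated.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs classified poor at baseline and good at retest,
    c: pairs classified good at baseline and poor at retest.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) probability of a result at least as extreme in one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed discordant counts (not reported in the text).
p_value = mcnemar_exact(8, 1)
```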
Table 1
Characteristics of the study participants (n = 50).
Characteristics | Mean ± SD
Age (in years) | 20.82 ± 2.7
Sex: n (female/male) | 50 (34/16)
Body mass (in kg) | 59.28 ± 15.9
Height (in cm) | 164.58 ± 7.7
BMI (in kg/m²) | 21.80 ± 5.3
Abbreviations: BMI, Body Mass Index; SD, standard deviation.
Floor and Ceiling Effects
None of the included participants showed floor or ceiling effects based on the AE-PSQI
global score calculated with test-retest responses.
Internal Consistency
A Cronbach alpha score of 0.65 was obtained, which met the “minimally acceptable”
criterion for the internal consistency of the AE-PSQI. The alpha scores were nearly
the same for both the baseline and retest scores. The item-to-total correlation coefficients
ranged from 0.31 to 0.74; the smallest component-total correlation coefficient
was found for the use of sleep medications, while the largest was found for sleep
latency (Table 2).
Test-retest Reliability
The ICC(3,1) values revealed strong relative reliability for the global PSQI score.
Except for the sleep efficiency component (ICC = 0.26), all other subcomponents showed
moderate to strong reliability estimates. There were no statistically significant
differences in paired t-tests comparing the test-retest scores (p > 0.05), and there was no statistically significant systematic bias in the data (Table 3). The SEM for the global AE-PSQI score was 1.6 and the SRD was found to be
4.5.
Table 2
Item-to-total correlation.
Items | Item-to-total correlation* | Alpha if item was deleted
Subjective sleep quality | 0.71 | 0.56
Sleep latency | 0.74 | 0.54
Sleep duration | 0.70 | 0.58
Sleep efficiency | 0.46 | 0.68
Sleep disturbances | 0.44 | 0.64
Use of sleeping medicine | 0.31 | 0.66
Daytime dysfunction | 0.55 | 0.63
Note: * Pearson correlation coefficient.
Table 3
PSQI item characteristics, paired t-test (test 1 versus test 2) p-values, and intraclass correlation coefficients (ICC[3,1]) with 95% confidence intervals
and p-values.
PSQI items | Test 1: mean ± SD | Test 2: mean ± SD | Paired t-test: p-value | ICC(3,1) | 95%CI lower bound | 95%CI upper bound | ICC p-value
Subjective sleep quality | 1.16 ± 0.91 | 1.16 ± 0.79 | 1.000 | 0.66 | 0.48 | 0.79 | < 0.001
Sleep latency | 0.78 ± 0.86 | 0.80 ± 0.86 | 0.785 | 0.82 | 0.70 | 0.89 | < 0.001
Sleep duration | 1.28 ± 1.03 | 1.36 ± 1.05 | 0.552 | 0.59 | 0.37 | 0.74 | < 0.001
Sleep efficiency | 0.56 ± 0.97 | 0.52 ± 0.84 | 0.799 | 0.26 | 0 | 0.50 | 0.034
Sleep disturbance | 1.04 ± 0.53 | 0.94 ± 0.62 | 0.168 | 0.62 | 0.41 | 0.76 | < 0.001
Use of sleep medication | 0.12 ± 0.39 | 0.08 ± 0.34 | 0.322 | 0.70 | 0.52 | 0.82 | < 0.001
Daytime dysfunction | 1.22 ± 0.84 | 1.10 ± 0.79 | 0.182 | 0.70 | 0.53 | 0.82 | < 0.001
Global PSQI score | 6.16 ± 3.27 | 5.96 ± 3.11 | 0.510 | 0.77 | 0.63 | 0.87 | < 0.001
Abbreviations: 95%CI, 95% confidence interval; ICC, intraclass correlation coefficient; PSQI, Pittsburgh
Sleep Quality Index; SD, standard deviation.
Bland-Altman Plot
The Bland-Altman plot depicting the test (1) versus retest (2) mean global AE-PSQI
scores is shown in Figure 1. An analysis of the plot revealed moderate agreement between both tests because
nearly all differences fell within 1.0 to 1.49 SDs. Only one outlier was present
in the plot. Nevertheless, this interpretation is based on the arbitrary thresholds suggested
by Jensen et al.[53] (2016).
Fig. 1 Bland-Altman plot showing agreement between baseline (T1) and retest (T2) global
PSQI scores.
Discussion
In the present study, healthy bilingual Arabic-English-speaking adolescents and young
adults were recruited to assess the test-retest reliability of the AE-PSQI. Even though
previous studies[54][55] have used the Arabic version of the PSQI for data collection in young healthy individuals,
none of them has assessed the test-retest reliability of the Arabic version or of
the AE-PSQI.
The global AE-PSQI score showed neither floor nor ceiling effects, as fewer than 15% of the participants
obtained the lowest or the highest possible score, which supports the content validity of the AE-PSQI. The
internal consistency of the AE-PSQI was minimally acceptable (Cronbach alpha = 0.65)
among our healthy adolescent and young adult participants. Our results are comparable
with those of previous studies validating the Italian PSQI in healthy children[56] and the Arabic PSQI in healthy Arab Americans.[37] The Italian and Brazilian versions of the PSQI presented good internal consistency
values, of 0.72 and 0.71 respectively; however, each version focused on a single
age group: either children or adolescents.[56][57] The Arabic PSQI has been reported to meet the quality assessment criteria for internal
consistency.[23] The internal consistency of the original PSQI was found to be fair to good, with
a Cronbach alpha value ranging from 0.64 to 0.83.[22][24]
Overall, the test-retest reliability estimate for the AE-PSQI global score was strong
(ICC = 0.77), while the reliability estimates of other subcomponents, except the sleep
efficiency, ranged from moderate to strong. Previous studies have found test-retest
reliability estimates to be moderate (r = 0.65) for the Brazilian PSQI version in
healthy adolescents[57] and strong (r = 0.83) for the Italian PSQI version in healthy children.[56] Furthermore, two other studies including both healthy and symptomatic participants
(with sleep problems) revealed a high internal consistency (Cronbach α = 0.84) and
moderate reliability (r = 0.65) for the Korean PSQI,[26] and good internal consistency (Cronbach α = 0.70) and strong reliability (r = 0.83)
for the Kurdish PSQI.[58] Therefore, the PSQI has been found to have acceptable internal consistency and reliability,
irrespective of the language used.
The SEM of the AE-PSQI global score was 1.6. In comparison, the SEM of the Brazilian
PSQI has been reported to be 1.1 for healthy adolescents.[57] As the previous study[37] investigating the psychometric properties of the Arabic PSQI in healthy adults did
not report agreement measures, the SEM/SRD values of the AE-PSQI could not be compared with
those of the Arabic PSQI.
Strengths and Limitations of the Study
To our knowledge, the present study is the first of its kind investigating the floor
and ceiling effects, internal consistency, and test-retest reliability of the AE-PSQI,
and positive findings were observed for the AE-PSQI using multiple reliability and
agreement estimates. As only healthy adolescents and young adults with good or poor
sleep quality were included, the results cannot be generalized to individuals with
clinical conditions affecting sleep.
Future Recommendation
Future research is essential to explore various population groups, as a valid and
reliable AE-PSQI is needed to support clinical decision-making for interventions that
can improve sleep quality. This is particularly relevant for bilingual individuals
who speak both Arabic and English and present with conditions such as insomnia, other sleep disorders,
chronic pain, fibromyalgia, or multiple sclerosis. Moreover, other psychometric properties
(such as validity and responsiveness) of the AE-PSQI should be investigated further.
Conclusion
The AE-PSQI was found to be a reliable instrument to assess sleep quality in bilingual
Arabic-English-speaking adolescents and young adults with good or poor sleep quality.
The AE-PSQI demonstrated no floor or ceiling effects, minimally acceptable internal
consistency, and moderate to strong test-retest reliability estimates.