Key-words:
Altered perception - chronic - Indian - low back pain - modified Fremantle questionnaire
Introduction
Patients with chronic low back pain (LBP) develop altered body perceptions specific to the back.
These misperceptions are known to contribute to the condition and, if measured, might offer a
target for treatment.[[1]],[[2]] Psychometrics, the measurement of mental abilities and capacities,
provides the means to assess these altered perceptions.[[3]] The Fremantle Back Awareness Questionnaire (FreBAQ) was developed as a reliable
tool for measuring back-specific body perception.[[4]],[[5]],[[6]] However, it has not been evaluated or validated in the Indian context.
Therefore, the present study was planned to establish the psychometric properties of
the questionnaire in the Indian context.
Materials and Methods
Translation of the questionnaire
The English edition of the FreBAQ was translated into Odiya, a classical Indian language,
using a forward–backward process.[[4]] Two native Odiya speakers translated the original questionnaire into
Odiya, and differences were resolved by discussion between them. This version was
back-translated by a person well versed in both Odiya and English.
The back-translated text was then sent to the developer of the original
questionnaire, who checked and approved it. The provisional Odiya version of the
questionnaire was then administered to ten native Odiya speakers attending a pain clinic
for LBP. Their inputs were incorporated to form the final version (FreBAQ-I),
which was evaluated in this study. The study was registered
with the Indian clinical trials registry (CTRI/2018/02/011772 dated February 8, 2018).
Assessment of the questionnaire
Participants
Patients for the study were recruited from four outpatient departments, namely
pain clinic, orthopedics, neurosurgery, and physical medicine. Patients with chronic
low backache (duration >3 months) aged between 18 and 70 years were included.
Patients with red flag signs or known psychiatric
illness and those who refused to participate were excluded. The study was approved by the institute
ethics committee. Written consent was obtained from all participants for inclusion in
the study.
Procedure
Demographic parameters such as age, sex, body mass index, marital status, and profession
and clinical characteristics such as duration and severity of pain, Oswestry disability
index,[[7]] and presence of depression (Beck's depression inventory)[[8]] were assessed. Pain intensity was measured on a visual analog scale (VAS)
of 0–10 points, where “0 = no pain” and “10 = worst pain imaginable.” In addition,
all participants completed the FreBAQ-I.
Sample size
As previously recommended, a sample of 100 participants was chosen for Rasch analysis
(RA) to ensure stable item calibration within ±0.5 logits with 95% confidence.[[9]]
Rasch analysis
The translated version of the questionnaire was analyzed under the following headings.
Targeting
RA is a probabilistic model in which targeting refers to how well the questionnaire
items match the level of perceptual disturbance in the study population.
Persons with greater perceptual disturbance should endorse items more readily than those
with less disturbance. Similarly, items indicating a greater disturbance
should be harder to endorse than those indicating a smaller disturbance.[[10]]
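The targeting idea can be illustrated with the simplest (dichotomous) form of the Rasch model, in which the probability of endorsing an item depends only on the difference between the person's measure and the item's difficulty, both in logits. The sketch below is an illustration under that simplifying assumption, not the polytomous model fitted to the five-category FreBAQ-I; the function name `endorsement_probability` and all example values are hypothetical.

```python
import math

def endorsement_probability(person_measure, item_difficulty):
    """Dichotomous Rasch model: P(endorse), with both inputs in logits."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

# Hypothetical values: two persons and two items, all in logits.
low_person, high_person = -1.0, 1.0
easy_item, hard_item = -0.5, 0.5

for person in (low_person, high_person):
    for item in (easy_item, hard_item):
        p = endorsement_probability(person, item)
        print(f"person {person:+.1f}, item {item:+.1f}: P = {p:.2f}")
```

A well-targeted scale has item difficulties spread over the same range of logits as the person measures, so that most respondents meet items they endorse with a probability near 0.5.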
Category order
There were five response categories: never, rarely, occasionally, often,
and always. Category probability curves were drawn to examine how the scale functioned.
Each curve was expected to have a distinct, separate peak and a clear threshold, that is,
the point at which the probability of endorsing one category equals that of
endorsing the adjacent category. Disordered thresholds can occur when a category is
underutilized or when respondents use the categories inconsistently (e.g., participants find
it difficult to differentiate between two categories).
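Category probability curves and thresholds can be sketched with a rating scale formulation. The article does not state which polytomous Rasch model was fitted, so the Andrich rating scale form used here is an assumption, and the threshold values, function names, and person and item measures are hypothetical.

```python
import math

def category_probabilities(person_measure, item_difficulty, thresholds):
    """Rating scale model: probability of each response category (0..m)."""
    # Category k accumulates (person - item - threshold_j) for j = 1..k;
    # category 0 has an empty sum, i.e. 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (person_measure - item_difficulty - tau))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def thresholds_are_ordered(thresholds):
    """True when each threshold is strictly higher than the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))

# Hypothetical thresholds separating never/rarely/occasionally/often/always.
taus = [-1.5, -0.5, 0.5, 1.5]
probs = category_probabilities(-1.5, 0.0, taus)
print([round(p, 3) for p in probs])
print(thresholds_are_ordered(taus))
```

In this formulation, a person located exactly at a threshold is equally likely to choose either of the two adjacent categories, which is the crossing point of the two curves.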
Uni-dimensionality
An advantage of RA is that it tests the dimensionality of the scale rather
than only the instrument as a whole.[[11]] In addition, RA indicates the “item difficulty” ordering of the questionnaire.
We used principal component analysis (PCA) of Rasch residuals to evaluate the unidimensionality
of the scale. The PCA permits assessment of the primary Rasch dimension. Unidimensionality
of a scale is supported when the Rasch dimension explains at least 40% of the variance in the
data and the eigenvalue of the first contrast of the Rasch residuals
is ≤2.0.[[11]]
Item fit statistics were also used to examine unidimensionality. These are chi-square-based
statistics reported as mean squares (in logits), with an expected value of 1
logit. Excessively large fit residuals (>1.4 logits) suggest a major difference between
the observed and expected performance. In contrast, excessively small fit residuals
(<0.6 logits) imply that the item is behaving too predictably.[[12]]
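The fit statistics described above can be sketched as follows, assuming the standard definitions of the infit and outfit mean squares: each is built from squared residuals between observed and model-expected responses, and each has an expected value of 1. The function names and the toy data are hypothetical; the cut-offs of 0.6 and 1.4 are those quoted in the text.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item across respondents."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted, residual sum of squares over total variance.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit

def flag_item(mean_square, lower=0.6, upper=1.4):
    """Classify a mean square against the cut-offs quoted in the text."""
    if mean_square > upper:
        return "underfit"   # responses less predictable than the model expects
    if mean_square < lower:
        return "overfit"    # responses more predictable than the model expects
    return "acceptable"

# Hypothetical data: two respondents, model expectation 0.5 for each.
infit, outfit = fit_mean_squares([1, 0], [0.5, 0.5], [0.25, 0.25])
print(infit, outfit, flag_item(infit))
```

Outfit is more sensitive to unexpected responses from persons far from the item's difficulty, whereas infit weights responses from well-targeted persons more heavily.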
The PCA residual correlation matrix was visually scrutinized to identify groups of
items that would suggest a second dimension. An eigenvalue >2.0 for the first contrast in the
PCA of residuals was considered to point toward a second dimension.[[13]],[[14]]
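The eigenvalue criterion can be illustrated on a small residual correlation matrix. Rasch software normally reports the first-contrast eigenvalue directly; the sketch below only shows the computation with a plain power iteration, which is adequate here because a correlation matrix has no negative eigenvalues. The matrix values and the function name `largest_eigenvalue` are hypothetical.

```python
import math

def largest_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    # Start from unequal entries to avoid an unlucky start that is
    # orthogonal to the leading eigenvector.
    vector = [1.0 + i / n for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0
        vector = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector.
        eigenvalue = sum(
            vector[i] * sum(matrix[i][j] * vector[j] for j in range(n))
            for i in range(n)
        )
    return eigenvalue

# Hypothetical residual correlations among three items.
residual_correlations = [
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
first_contrast = largest_eigenvalue(residual_correlations)
print(round(first_contrast, 3), first_contrast > 2.0)
```

An eigenvalue of 2.0 corresponds to the strength of about two items, which is why values above it are read as a hint of a second dimension.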
Internal consistency
Cronbach's alpha and person reliability are the two measures used to evaluate consistency
or reliability in RA.[[15]],[[16]] Person reliability describes the ability of the scale to discriminate between persons at different
levels; the higher the value, the better the discrimination, and it is
independent of sample size. A minimum of 0.7 is recommended for group-level use,
and a minimum of 0.85 is advised for individual-level use. Cronbach's
alpha was also calculated for comparison with the findings of the original study.[[5]]
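Both reliability indices can be sketched in a few lines. Cronbach's alpha follows its usual formula; person reliability is written here as the share of observed variance in person measures that is not measurement error. The use of the population variance for person measures is an assumption of this sketch, and the function names and toy data are hypothetical.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's score."""
    k = len(item_scores)
    item_variances = [statistics.variance(scores) for scores in item_scores]
    totals = [sum(person) for person in zip(*item_scores)]
    return k / (k - 1) * (1 - sum(item_variances) / statistics.variance(totals))

def person_reliability(person_measures, standard_errors):
    """Share of observed person variance not attributable to measurement error."""
    observed = statistics.pvariance(person_measures)
    error = statistics.fmean([se ** 2 for se in standard_errors])
    return (observed - error) / observed

# Hypothetical data: two items answered identically by three respondents.
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))
# Hypothetical person measures (logits) with their standard errors.
print(person_reliability([-1.0, 0.0, 1.0], [0.5, 0.5, 0.5]))
```

The two indices can diverge in the same data set because alpha is computed from raw scores, whereas person reliability is computed from logit measures and their standard errors.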
Person fit
Patients with outfit residuals >1.5 logits were considered to fit the model poorly. Fisher's
exact test and Student's t-test were used for each item of the FreBAQ-I to compare
poorly fitting with better fitting participants.[[17]]
Item functioning
The questionnaire items are expected to work similarly for all participants of comparable
agreeability. Differential item functioning (DIF) analysis identifies items that are
biased by confounding factors other than the construct. We examined DIF across six
grouping variables: sex, age (18–60 years vs. >60 years), job status (no work vs. at work), pain
during motion (VAS ≤5 vs. >5), duration of pain (≤1 year vs. >1 year), and disability
(≤5 vs. >5). DIF was tested using a Mantel–Haenszel chi-square test (significance level P = 0.01
for each item). DIF was explored further if an item showed a statistically
significant difference of >0.5 logits between the subgroups.[[18]]
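The Mantel–Haenszel test can be sketched for the simplest case, in which each item response is dichotomized (endorsed or not) and respondents are stratified by overall score. This is a simplification of what Rasch software does for five-category items; the function name, the table layout, and the counts are hypothetical. The critical value 6.635 is the chi-square cut-off for P = 0.01 with one degree of freedom.

```python
CHI2_CRITICAL_P01_DF1 = 6.635  # chi-square critical value, 1 df, P = 0.01

def mantel_haenszel_chi2(strata, continuity_correction=True):
    """Mantel-Haenszel chi-square (1 df) over a list of 2x2 tables.

    Each stratum is (a, b, c, d):
        a = reference group endorsing,  b = reference group not endorsing,
        c = focal group endorsing,      d = focal group not endorsing.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0
    deviation = abs(observed - expected)
    if continuity_correction:
        deviation = max(0.0, deviation - 0.5)
    return deviation ** 2 / variance

# Hypothetical single stratum: 8 of 10 reference vs. 2 of 10 focal respondents endorse.
chi2 = mantel_haenszel_chi2([(8, 2, 2, 8)])
print(round(chi2, 2), chi2 > CHI2_CRITICAL_P01_DF1)
```

Under the procedure in the text, an item would be examined further only if the test were significant and the difficulty difference between subgroups also exceeded 0.5 logits.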
A “logit” scale is used to express the individual item difficulty on a linear scale,
which extends from negative to positive infinity.
Results
Sample characteristics
A total of 100 participants were recruited over a period of 6 months (February 2018 to July
2018). The demographic profile of patients is shown in [[Table 1]] and [[Figure 1]]. The frequency of responses of the study participants to each of the nine items of the
questionnaire is presented in [[Table 2]].
Figure 1: Item-person threshold map showing the relationship between Fremantle Back Awareness
Questionnaire-I items and person logit ratings
Table 1: Demographic and clinical status of the study participants
Table 2: Frequency of responses to each item
Relationship to clinical status
The FreBAQ-I score correlated significantly with pain intensity (r = −0.19, P = 0.04), duration
of the low backache (r = 0.35, P < 0.001), and depression score (r = 0.25, P = 0.012),
but not with disability (r = 0.06, P = 0.49) [[Table 3]].
Table 3: Correlation between total Fremantle Back Awareness Questionnaire I score and clinical
variables
Rasch analysis (Fremantle Back Awareness Questionnaire-I)
Targeting
The relationship between the questionnaire items and person logit ratings is
depicted in [[Figure 2]], and the endorsability thresholds for each of the items are shown in Table 4.
The mean person endorsability was −0.83 ± 0.49 (range −2.24 to 0.16) logits compared to a
default mean item endorsability of 0 ± 0.43 (range −1.02 to 0.42) logits. Person agreeability
was shifted to the left relative to item endorsability, which indicates that persons with
low scores were not well addressed by the scale. Item 2 was the easiest to endorse,
followed by items 4 and 9. Item 3 was the most difficult to endorse.
Figure 2: Category order showing average agreeability measures of the respondents resulting
in neither excessive positive nor negative fit statistics, suggesting the category
structure is adequate
Ten participants of a hundred (10%) scored 0 for all the items, but none scored full
points on all the items of the questionnaire.
Category order
The fit statistics were neither excessively positive nor negative, and the average
agreeability measure of study participants progressed as expected across the
rating categories. Hence, the category structure was found to be adequate, although
the “rarely” category was used less often, probably because of the
difficulty in differentiating “rarely” from “occasionally” [[Figure 3]].
Figure 3: Graphical representation showing item characteristic curve
Unidimensionality
[[Table 4]] shows the fit statistics for all nine items in the questionnaire. Item 9
had a slightly excessive positive infit statistic (1.50), and the curve
suggested that the misfit was probably due to low scores given by individuals with
a high level of perceptual impairment. PCA of residuals revealed that the first
contrast had an eigenvalue of 2.24; 67.2% of the raw variance was explained
by the measures. Visual inspection of the PCA correlation matrix suggested that items
5, 2, and 4 could constitute a second dimension. Two of these items (items
5 and 2) address reduced proprioception, and item 4 addresses body size and shape.
Items 4 and 5 were also found to be interdependent, with a positive correlation
in the local dependence assessment (r = 0.42).
Table 4: Average item endorsability thresholds, including fit statistics
Internal consistency
Internal consistency was found to be suitable, with a person reliability of 0.54 and a Cronbach's alpha
of 0.91.
Person fit
No association was found with age (P = 0.99), gender (P = 0.99), response to therapy
(P = 0.035), pain severity (P = 0.30), functional disability (P = 0.09), or depression
(P = 0.76) between participants who fit the Rasch model and those who did not;
hence, no further analysis was required.
Differential item functioning
We did not detect DIF for age, sex, job, VAS, or duration of pain.
Discussion
We evaluated the psychometric properties of the FreBAQ-I in a sample of the
Indian population with chronic LBP. We found it to possess suitable internal consistency
with a minor deviation from unidimensionality. Ten of the hundred participants scored
0 on all items, reflecting a floor effect in our population. As there was no DIF
for age, sex, job, VAS, or duration of pain, no meaningful impact is expected
on the practical application of the translated Indian version of the questionnaire.
We found a positive association of the FreBAQ-I score with depression and with the duration of
the illness but a negative association with the intensity of pain. This probably reflects
that a participant in severe pain is unable to attend to the altered perception
and appreciates it only when the pain reduces; other issues are unmasked only when
pain is reduced. This may also have implications for the timing
of the evaluation of altered body perception. By comparison, Janssens et al. did not
observe any significant relationship between VAS pain intensity and the score on the Dutch
version of the questionnaire.[[19]]
Depression has been found to be a common accompaniment of chronic LBP, even more so
in patients with altered body perception. Further, psychosocial factors have been
found to be associated with the onset, maintenance, and treatment of chronic LBP.[[20]] In line with these findings, we observed an association between depression and FreBAQ-I scores in
participants with chronic LBP.
Unlike the Japanese version, both the English version and the Indian version showed
a direct relationship between questionnaire scores and the duration of LBP. In all
probability, this implies that the chances of altered back perception rise as the
duration of the disease increases.
Surprisingly, we did not observe any significant relationship with disability, which probably
indicates that altered body perception is not simply a function of disability. The major
cause of disability in patients with chronic low backache without
red flags is the pain itself. Our findings suggest that participants with higher scores
for altered body perception were not experiencing much pain. This might explain the
lack of relationship between questionnaire scores and disability.
The Rasch model rests on the premise that, to measure on the basis of a test
item, a researcher must consider both the difficulty of each item along a
variable and the ability level of the test taker (here, the participant) with
respect to that variable. The model proposed by Rasch specifies that when a respondent
answers an item, there are two possibilities: answering correctly and not answering
correctly.[[21]] This relationship is expressed as the natural log of the probability of a participant
answering the test item correctly divided by the probability of the same respondent
not answering it correctly. The Rasch model therefore places the respondent
and each of the test items on a single variable. In the current context, we were
interested in evaluating how each item of the translated questionnaire (FreBAQ-I)
performed in defining the variable, i.e., altered body perception.[[10]] The RA suggested some limitations of the questionnaire. It showed that persons with
low scores were not addressed well by the questionnaire, which probably speaks in favor of the
tool and its validity. Item 2 (“I need to focus all my attention on my back
to make it move the way I wish to”) was the easiest to endorse, followed
by item 4 (“when performing everyday tasks, I don't know how much my back
is moving”) and item 9 (“my back feels lopsided or asymmetrical”). On the other hand,
item 3 (“I feel as if my back moves involuntarily without my control”) was
the most difficult to endorse.
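The log-odds relationship described in this paragraph can be written compactly. In the notation below, which is an assumption of this sketch rather than the article's own, P_ni is the probability that person n endorses item i, θ_n is the person's position on the variable, and δ_i is the item's position (difficulty), both in logits:

```latex
\ln\!\left(\frac{P_{ni}}{1 - P_{ni}}\right) = \theta_n - \delta_i
\qquad\Longleftrightarrow\qquad
P_{ni} = \frac{e^{\theta_n - \delta_i}}{1 + e^{\theta_n - \delta_i}}
```

When a person's position equals an item's position, the log-odds are 0 and the probability of endorsement is 0.5; an item that is "difficult to endorse," such as item 3 here, has a higher δ_i and therefore a lower probability of endorsement for the same person.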
By convention, a questionnaire has two ends: an easy end where it starts and a difficult
end that it reaches through gradually increasing difficulty. Similarly, of the nine items in the FreBAQ-I, some
are at the easy end and some at the difficult end of the continuum. It is normally
expected that, regardless of the ability of the respondents, this ordering of easy and difficult
items holds true for all participants. When items do not fit this assumption,
they tend to measure different variables rather than one. As the intention is to address
only one variable, items that misfit the model must be removed and replaced with
appropriate questions. In RA, items that do not contribute to
the assessment for which the questionnaire is meant can be identified using “fit
statistics.”[[22]]
There are five categories in the questionnaire, namely “never,” “rarely,” “occasionally,”
“often,” and “always.” When evaluating the categories of a survey or scale, the
categories must be clearly ordered so that respondents are clear about the response
to give. However, as people might respond to the same question differently, the
category order and fit statistics are used to assess how close we are to the intended
ordering. In the present study, we found neither excessively positive nor excessively negative
fit statistics, indicating an adequate agreeability measure of the study participants.
However, participants found it hard to differentiate “rarely” from “occasionally.”
Item 9 showed a slightly excessive positive infit statistic (1.50),
and the item-specific curve suggested that the misfit was probably due to low scores
given by individuals with a high level of perceptual impairment.
The evaluation of the questionnaire also requires that all nine items fit
the scale individually and independently. In contrast to other researchers, we found
item 9 to have a more positive infit statistic. Although we agree with the notion
that a feeling of back enlargement is more common than a feeling of shrinkage, we did not
find any difference between items 7 and 8, probably because our population understood
the difference clearly. In contrast, item 9 did not appear to fit well because, although
back pain may be unilateral, participants less often felt their back to be asymmetrical. Items
probably perform differently in different populations, and the role of the duration of backache
and the underlying pathology cannot be ruled out. Midline pathology is less likely than
unilateral pathology to create a lopsided back perception. The majority
of our study population had axial pathology such as a prolapsed disc, spondylosis,
and internal disc disruption rather than facet arthropathy or sacroiliitis, which
are often one-sided in nature [[Figure 3]]. This is one area that still needs to be explored.
Similarly, there should not be any interdependence between the questionnaire items,
so that they do not affect each other. In contrast to other study findings, we found
that items 4 and 5 were interdependent (r = 0.42) and very likely influenced each other.
This is plausible, as both items address proprioceptive acuity. However, as
suggested by Nishigami et al., although these items are dependent, they address different
aspects of a similar perceptual problem and hence must be retained.[[6]]
Internal consistency is typically expressed as Cronbach's alpha (α), which ranges
from 0 to 1. We observed a Cronbach's alpha of 0.91, indicating
adequate internal consistency and optimal reliability.[[23]]
Item hierarchy of the FreBAQ-I revealed item 2 to be the easiest and item 3 the
most difficult, unlike the Japanese version, in which item 7 was among the harder
items and items 6 and 8 were among the easier ones. As suggested, these differences are probably
the result of translational, cultural, and population differences.[[6]]
The current study has some limitations. We could not evaluate test–retest reliability,
as it was not possible to reassess the same patients. In addition, we did not record
educational level or examine the correlation between altered perception and response to therapy.
Conclusions
The FreBAQ-I has acceptable psychometric properties, including adequate internal consistency,
and is suitable for use in patients with chronic LBP in the Indian population. Participants
with higher levels of disturbed body perception are addressed more appropriately by the questionnaire
than those with lower levels of altered perception. All nine items are essential
and adequate, which makes the questionnaire complete. Even though items 4 and 5 were
found to be locally dependent and might influence each other, both address
different aspects of proprioceptive acuity and deserve to be retained.