Cervical Pessary Plus Progesterone for Twin Pregnancy with Short Cervix Compared to Unselected and Non-Treated Twin Pregnancy: A Historical Equivalence Cohort Study (EPM Twin Pessary Study)

Abstract Objective The present study aims to determine whether the rate of preterm births (PBs) with the use of cervical pessary plus progesterone in short-cervix (≤ 25 mm) dichorionic-diamniotic (DC-DA) twin pregnancies is equivalent to the rate of PBs with no intervention in unselected DC-DA twin pregnancies. Methods A historical cohort study was performed between 2010 and 2018, including a total of 57 pregnant women with DC-DA twin pregnancies. The women admitted from 2010 to 2012 (n = 32) received no treatment and were not selected by cervical length (Non-Treated group, NTG), whereas those admitted from 2013 to 2018 (n = 25) routinely received cervical pessary plus progesterone after the diagnosis of short cervix between the 18th and the 27th weeks of gestation (Pessary-Progesterone group, PPG). The primary outcome analyzed was the rate of PBs before 34 weeks. Results There were no statistically significant differences between the NTG and the PPG regarding PB < 34 weeks (18.8% versus 40.0% respectively; p = 0.07) and the mean birthweight of the smallest twin (2,037 ± 425 g versus 2,195 ± 665 g; p = 0.327). A Kaplan-Meier survival analysis was performed, and there were no differences between the groups before 31.5 weeks. Logistic regression showed that a previous PB (< 37 weeks) presented an odds ratio (OR) of 15.951 (95% confidence interval [95%CI]: 1.294–196.557; p = 0.031*) for PB < 34 weeks in the PPG. Conclusion In DC-DA twin pregnancies with a short cervix (which means a higher risk of PB), treatment with cervical pessary plus progesterone could be considered equivalent, in several aspects related to PB, to no treatment in the NTG, despite the large difference between these groups.


Introduction
Despite the low prevalence of twin pregnancies (only 2%), they are responsible for 15% of all spontaneous early preterm births (PBs) < 32 weeks. The higher number of PBs probably occurs due to uterine overdistension. The rate of PBs < 37 weeks in twin pregnancies is around 50%, and the mean gestational age at delivery is around 36.5 weeks. [1][2][3] A cervical length < 25 mm measured between 20 and 24 weeks in twin gestations is accepted as a good predictor for PB. A short cervix increases the risk of preterm birth before 28 weeks of gestation from 3.5% to 25.8%, and before 37 weeks of gestation from 41.2% to 75.5%. 4,5 Although the prediction is relatively well determined with short cervical length, the intervention is still a challenge in twin pregnancies. Different strategies for prevention of preterm delivery in twin pregnancies have been considered, such as vaginal progesterone, cervical pessary, and cervical cerclage. [6][7][8] A recent meta-analysis 9 of individual data concluded that vaginal progesterone in twin gestations with short cervix (< 25 mm) reduced the risk of PB before 33 weeks from 43.1% to 31.4% (relative risk [RR]: 0.69; 95% confidence interval [95%CI]: 0.51-0.93), and reduced the risk of composite neonatal morbidity and mortality from 40% to 27.4% 10 (RR: 0.61; 95%CI: 0.34-0.98), when compared with no treatment, but these results are not a consensus in the literature. 11,12 Another multicenter randomized controlled trial (RCT) in twins demonstrated that the prophylactic use of the cervical pessary could reduce the rate of early PB in the subgroup with a short cervix. Despite this evidence, the largest study 13 using the cervical pessary in twin pregnancies did not demonstrate the benefits of its use.
In 2016, in New Jersey, a retrospective study 8 compared the use of cervical cerclage to no treatment in twin pregnancies with short cervix (< 25 mm), and significant results were obtained in favor of cerclage (odds ratio [OR]: 0.22; 95%CI: 0.058-0.835), despite the fact that previous studies 14 did not corroborate these data.
Neither cerclage, the cervical pessary, nor progesterone can be considered the better choice for intervention in twin pregnancies with a short cervix, nor have they been discarded as an option for this particular type of pregnancy. 15 However, some studies have observed favorable results in the pessary group in comparison with progesterone (regarding PB and morbidity) in twin pregnancies with short cervix. 16,17 Moreover, an economic analysis 18 was published recently with positive results for the pessary group in twin pregnancies with a short cervix. At present, there is no publication comparing the use of the cervical pessary in twin gestations with a short cervix to low-risk dichorionic-diamniotic (DC-DA) twin pregnancies. The objective of the present study was to determine the equivalence of the use of the cervical pessary associated with progesterone in DC-DA twin gestations with a short cervix compared with no intervention in unselected twin pregnancies.

Methods
The present historical equivalence cohort study in asymptomatic DC-DA twin pregnancies was performed from January 2010 to July 2018 in Escola Paulista de Medicina, Universidade Federal de São Paulo, a public quaternary service in Brazil; it was approved by the Ethics Committee (under CAAE number 30873613.8.0000.5505; http://plataformabrasil.saude.gov.br/login.jsf), and was called the EPM Twin Pessary Study. From January 2013 to July 2018, after obtaining informed consent, we included in the study 25 women with cervical length ≤ 25 mm measured by transvaginal scan (Samsung Ultrasound System WS80A, Seongnam-si, Gyeonggi-do, South Korea) at a gestational age between 18 weeks and 27 weeks and 6 days (Pessary-Progesterone group, PPG). The PPG received 200-mg daily doses of vaginal micronized progesterone, and the Ingámed (Maringá, PR, Brazil) AM cervical pessary was placed (►Figure B, Addendum); this device is registered in the Brazilian Medical Regulatory Agency (Agência Nacional de Vigilância Sanitária, ANVISA, in Portuguese), under number 80086720036. 19 Baseline characteristics and outcomes were compared with 32 DC-DA twin pregnancies followed at the same university from January 2010 to December 2012, neither selected by cervical length nor treated (Non-Treated group, NTG). The exclusion criteria for both groups were fetal malformation, selective fetal growth restriction, or refusal to sign the informed consent form. The additional exclusion criteria for the PPG were exposed membranes, rupture of membranes, or labor.
For a description of the technique of the transvaginal cervical ultrasonography and the cervical pessary insertion, see the Addendum.
The primary outcome was defined as PB < 34 weeks. The secondary outcomes were defined as the mean gestational age at delivery (± standard deviation), the mean weights of the biggest and smallest newborns, the comparison of the rate of PB < 37, 35, 32 and 28 weeks, the performance of consecutive deliveries during the study, the Kaplan-Meier survival analysis, and the backward stepwise logistic regression for PB < 34 weeks for the PPG.
The risk of PB < 37 weeks in both study groups was assessed using the Kaplan-Meier survival analysis. The gestational ages at delivery of consecutive cases were plotted to compare both groups and to evaluate their performance and learning curve during the study.
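As an illustration of the Kaplan-Meier product-limit estimate used for this analysis, the following minimal Python sketch treats delivery before 37 weeks as the event and censors pregnancies that reach 37 weeks. The function name `kaplan_meier` and the gestational ages in the example are hypothetical and are not the study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the probability of remaining undelivered.

    times  -- gestational age (weeks) at delivery or censoring
    events -- 1 if the event (preterm delivery) occurred, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_leaving = Counter(times)  # events and censored cases both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving[t]
    return curve


# Hypothetical gestational ages at delivery (weeks); NOT the study data.
# Pregnancies reaching 37 weeks are censored (event = 0).
ages = [30, 32, 34, 35, 36, 37, 37, 37]
preterm = [1, 1, 1, 1, 1, 0, 0, 0]
curve = kaplan_meier(ages, preterm)
for week, s in curve:
    print(f"{week} weeks: {s:.3f} still undelivered")
```

Computing one such curve per group and comparing them (for example with a log-rank test) would correspond to the group comparison described above; the comparison test itself is not shown here.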
The continuous variables were expressed as medians and standard deviations, and the categorical variables were presented in numbers and percentages (%). The comparison between the outcome groups was made using the chi-squared (χ²) test or the Mann-Whitney U test for the categorical variables, and the Student t-test for the continuous variables. Significance was set at a p-value < 0.05, two-tailed, and marked with an asterisk (*).
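To illustrate the chi-squared comparison of a categorical outcome, the sketch below applies a Pearson chi-squared test without continuity correction to the primary outcome. The counts (6 of 32 in the NTG and 10 of 25 in the PPG) are an assumption back-calculated from the 18.8% and 40.0% reported in the abstract; the source does not state whether a continuity correction was applied, and the helper name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p-value) with 1 degree of freedom."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts: PB < 34 weeks vs. not, for the NTG (6/32) and the PPG (10/25).
stat, p_value = chi2_2x2(6, 26, 10, 15)
print(f"chi2 = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumed counts, the p-value comes out close to the reported p = 0.07.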
Using the primary outcome measure of PB < 34 weeks of gestation, with an effect size of 40% and an error level of α = 0.5, a sample size of 60 women (30 in each group) achieved a power of 72%. For the analyses of the data, we used the Statistical Package for the Social Sciences (SPSS, IBM Corp., Armonk, NY, US) software, version 23.0, and Statplus (Mac v5 for Excel, AnalystSoft, Inc., Walnut, CA, US).

Results

Demographic Characteristics
In total, 57 women with DC-DA twin pregnancies participated in the study. Considering both groups, 40 (70.2%) women were primigravidas, and 17 (29.8%) were multiparous; 32 (56.1%) were white, and 25 (43.9%) were non-white. The differences between the groups are expressed in ►Table 1. For the PPG, the gestational age at the diagnosis varied between 18 weeks and 27 weeks and 4 days (mean age of 24 weeks and 1 day ± 2.4 weeks). The mean cervical length of these gestations at the time of the pessary placement was 14.3 ± 7.1 mm.

Parametric Comparison
In our consecutive (n = 32) DC-DA twin pregnancy NTG, the mean gestational age at delivery was 35.83 ± 8.7 weeks, and in the PPG (n = 25), it was 34.59 ± 2.72 weeks (p = 0.11), a difference of only 1.24 weeks, despite the big difference between the groups regarding the risk of PB due to the short cervix. The mean interval of permanence of the cervical pessary was 10.18 ± 3.6 weeks.

Comparison of Birthweight
Regarding birthweight for the smallest twin, the findings for the PPG and NTG were, respectively, 2,038 ± 426 g versus 2,195 ± 665 g, and the difference was not statistically significant (p = 0.327). For the heaviest twins, the difference was statistically significant (2,148 ± 434 g versus 2,493 ± 643 g; p = 0.028*). Furthermore, the use of the cervical pessary did not influence the birthweight difference between the bigger and smaller fetuses in each group. For the NTG, the mean difference was 12 ± 6%; for the PPG, it was 11 ± 2% (p = 0.375).

Logistic Regression
Unadjusted univariate logistic regression was performed for the PPG, and it considered maternal age (≥ 35 years), ethnicity (white and non-white), BMI (> 30), smoking, week of inclusion in the study (< 23 weeks), IVF, previous PB (< 37 weeks), and multiparity, with delivery < 34 weeks as the dependent variable. A statistically significant association was observed only for previous PB (OR: 15.951; 95%CI: 1.294-196.557; p = 0.031*), and none of the other variables analyzed could be classified as relevant to determine PB < 34 weeks (Hosmer-Lemeshow Test; p = 0.08) (►Table 2).
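As a consistency check on the reported odds ratio, its standard error on the log scale and a Wald p-value can be recovered from the confidence limits. The sketch below assumes the interval is a Wald interval (log OR ± 1.96 standard errors), which the source does not state; the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Recover SE(log OR), the Wald z statistic and the two-sided p-value
    from an odds ratio and its confidence limits (assumed Wald interval)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p_value


# Values reported for previous PB (< 37 weeks) in the PPG.
se, z, p_value = wald_from_ci(15.951, 1.294, 196.557)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

With the reported values, the recovered p-value agrees with the reported p = 0.031 to within rounding, and the very wide interval is consistent with the small size of the PPG (n = 25).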

Cumulative Outcome and Learning Curve
To analyze the performance of the PPG, the gestational age at delivery of each case was plotted consecutively in the chart in ►Fig. 1. The performance of the PPG decreased in the middle of the consecutive analysis and recovered toward the end (red curve). In comparison, the cervical length decreased continuously without a similar recovery (green curve). In contrast, the performance of the NTG was homogeneous, and overlapped the 36th week during the whole period of the analysis (blue curve).

Kaplan-Meier Survival Analysis
The cumulative percentage of participants who did not give birth spontaneously before 37 weeks differed significantly between the two groups in the Kaplan-Meier analysis. The median gestational age at delivery for the PPG was 35.14 (95%CI: 33.88-36.40) weeks, that is, slightly lower than that of the NTG (36.86 weeks; 95%CI: 35.90-37.80), and it was

Discussion
The short cervix is a rare complication in human pregnancy, and only 1% to 2% of women have a cervix shorter than 25 mm. Considering the prevalence of twin pregnancies as 1% to 2%, we can speculate that short cervices in twin pregnancies are extremely rare, ≈ 1 to 2/10,000 gestations. For that reason, it is very probable that in the NTG there was a small number of women with short cervices, justifying the difference regarding this maternal characteristic between the groups. 20,21 An important finding of our study was the absence of difference in birthweight regarding the smaller twins of patients with short cervices when compared with the twin pregnancies at low risk for PB. For the heavier twins, the difference was statistically significant, but they are less susceptible to an unfavorable outcome. The study sample had enough power to demonstrate the statistical difference for the biggest newborn (p = 0.028*), but that difference was not sustained for the smallest newborn (p = 0.327), although the sample had theoretical power to do so. This observation enables us to assume that, regardless of the important difference between the groups regarding the risk of PB, both groups are equivalent considering perinatal results for the smallest dichorionic twin.
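The prevalence estimate at the start of the preceding paragraph can be checked with simple arithmetic. The sketch below multiplies the two quoted prevalence ranges, assuming (hypothetically, since the source does not say so) that a short cervix and twin pregnancy occur independently; the function name `joint_per_10000` is illustrative.

```python
def joint_per_10000(p_short_cervix, p_twin):
    """Expected cases per 10,000 gestations if the two conditions are independent."""
    return p_short_cervix * p_twin * 10_000


low = joint_per_10000(0.01, 0.01)   # both prevalences at 1%
high = joint_per_10000(0.02, 0.02)  # both prevalences at 2%
print(f"{low:.0f} to {high:.0f} per 10,000 gestations")
```

Under the independence assumption, the product spans roughly 1 to 4 per 10,000 gestations, the same order of magnitude as the figure quoted in the text.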
In the same way, there was only one variable with a statistically significant difference between the groups. In the univariate logistic regression, we could classify the variables relevant to PB, and previous PB was the only important variable to determine PB < 34 weeks in the PPG. The OR indicates odds of delivery < 34 weeks almost 16 times higher if a previous PB (< 37 weeks) was present in the clinical history (p = 0.031*). So, if previous PB is associated with a short cervix and a twin pregnancy, the risk of preterm delivery is so high that neither the cervical pessary nor vaginal progesterone will be enough to prevent it.
The Kaplan-Meier analyses of the present study suggest that the differences between the groups were significant (p = 0.025*); however, analyzing the risk until 31.5 weeks, there was no difference in the cumulative risk in the survival analysis, and the biggest difference between the groups was only 0.2 at 35 weeks, an important landmark in gestational age for twins (►Fig. 2). These numbers are encouraging when we consider the results expected for twin pregnancies with short cervices left without treatment, and they suggest that the cervical pessary with progesterone could be an alternative in the rare cases with this association to reach better outcomes in twin pregnancies, corroborating the data of the ProTwin Study. 22,23 In a recent randomized clinical trial 24 from Egypt, El-Refaie et al analyzed the outcome of twin pregnancies with short cervices after the administration of placebo or vaginal progesterone. We could compare the results of the aforementioned study (progesterone and controls) with those of the present study. The rate of PB < 34 weeks was 40% (PPG) versus 35% (El-Refaie et al progesterone group) versus 52.8% (El-Refaie et al controls). The percentages of PB are similar for the PPG and the El-Refaie et al progesterone group, but when compared with the El-Refaie et al controls, the performance of the PPG was superior in both gestational ages. We must consider that these results from the El-Refaie et al progesterone group were not reproduced in the most relevant studies in twin pregnancies treated with isolated progesterone, and some articles even demonstrated that intramuscular progesterone could increase PB in twin pregnancies, more intensely < 32 weeks. 10,[25][26][27][28][29][30][31] We can consider that the cervical pessary plus progesterone may have a better performance in the protection against PB in twin pregnancies compared with isolated progesterone, especially < 32 weeks, as our data indicated a similar performance in the Kaplan-Meier survival analysis (< 31.5 weeks).
A weakness of the methodology employed in the present study lies in the demographic characteristics: the groups were comparable except for maternal age and gestational age at inclusion, which is a potential risk of bias. Older maternal age is frequently pointed out as a variable associated with PB, and in our cohort this characteristic differed significantly between the groups. There were more older pregnant women in the PPG, which clearly increases the risk in that population. All of these factors clearly increased the risk of PB in the PPG. However, surprisingly, the PPG demonstrated similar results in the prevention of prematurity, and the intervention was probably the reason for that.
Another potential risk of bias is that the gestational age at inclusion differed significantly between the groups, by 6.28 weeks. As an inclusion criterion, all patients in the sample were between the 18th and 27th weeks of gestation. This difference is not so relevant as a risk of bias, because the NTG received no intervention; therefore, had the inclusion occurred at any other gestational age, there would have been no difference in the final results of the NTG.
Another weakness of this methodology is the difference regarding the cervical length, because the cervical length of the PPG must be much shorter than that of the NTG, and this characteristic was probably responsible for the worse performance of the PPG regarding PB. 4 One strength of the present study could be the change in the rates of PB using a mechanical device, which has been demonstrated to be safe, with a low rate of fetal and maternal complications; with the device in place, the patients reported feeling safer because of its attachment to the cervix itself, and these data had not yet been described in the literature. 32 Another strength of the methods herein employed was the application of transvaginal ultrasound after the insertion of the pessary. It was only because of the ultrasound that we were able to recognize patients with the pessary in a bad position; after the diagnosis, the device was repositioned in all of those cases. 33,34 Our data also suggest that repositioning pessaries found in a bad position was responsible for the better performance in individual rates of PB (red line) during the study, as demonstrated by the learning curve, despite the reduction in cervical length (green line) (►Fig. 1). This is important information, because medical experts could become more efficient in using this device. The learning curve can be enhanced through better prediction of the risk of PB during pregnancy, and sometimes by increasing rest, by treating with antibiotics if the cases are associated with amniotic fluid sludge, or by repositioning the cervical pessary when it is not correctly placed around the cervix. 35,36 Due to the expertise of the team involved, there was a lower rate of unnecessary interventions, such as the unreasonable removal of the cervical pessary, probably because of fear of the unknown, which was very common in the first years of the study of the device in hospitals that were not familiar with it.
The present study is one of the first published with this new pessary developed in Brazil; it is very similar in shape to the Arabin (Dr. Arabin GmbH & Co., Witten, Germany) pessary, and our staff started the research, in a single center, in 2012 after the study by Goya et al. 33 The team involved in the present study is headed by two senior medical researchers who acquired, over the course of 7 years, a lot of experience in cervical pessaries by using them and analyzing our perinatal results (this was how our team acquired experience: with practice and aligned to the literature). 33 The Brazilian pessary has three differences in comparison with the Arabin pessary: 1) the surface of the internal ring is not soft, producing a "grasping" effect on the cervix; during the entire study (which involved around 200 cervical pessaries in women at different gestational ages and in singleton pregnancies), we did not have any problems removing this pessary, and probably because of this structure we did not have any escape of the pessary after 1 week of insertion; all re-insertions or maneuvers for pessary reposition (when necessary) occurred in the first week after insertion; 2) it is made with a harder silicone compared with the Arabin, but it is completely malleable and adaptable to the vagina. This non-soft silicone, with tighter adherence to the cervix and placed over the perineal muscles, can improve resistance against the pressure exerted on the cervix by the uterus; and 3) it is a single-size cervical pessary, and sometimes the adjustment is not easy, especially in multiparous women, in whom the cervix is commonly larger than that of primigravidas.
More studies are necessary to evaluate the real efficacy of the cervical pessary plus progesterone on PB in DC-DA twin pregnancies, and new trials must be designed with this purpose. For the success of the new studies, it is relevant to consider the appropriate training of the researchers regarding insertion and ultrasound evaluation of the correct position of the pessary, which should completely involve the cervix, as well as the development of a protocol for performing transvaginal ultrasound during routine prenatal appointments to ensure a better performance in the prevention of PB.
In conclusion, the comparable birthweight of the smallest twin, the similar risk of preterm birth < 31.5 weeks (by the Kaplan-Meier survival analysis), the absence of statistical difference regarding important variables in the logistic regression, and the absence of statistical difference in the rate of PB < 28, 32, 34 and 35 weeks suggest an equivalence between the NTG and the PPG concerning some important aspects, despite the big difference between these groups.

Contributors
All of the authors contributed to the project and the interpretation of the data, the writing of the article, the critical review of the intellectual content, and the final approval of the version to be published.

Conflict of Interests
The authors have no conflict of interests to declare.

The technique of transvaginal ultrasound for cervical measurement
The transvaginal ultrasound cervical measurement for inclusion was performed during the routine second-trimester anomaly scan appointment and was used, by itself, to determine whether to place the pessary, with the cervical length between 0 and 25 mm. Each transvaginal scan was performed over a period of about 10 minutes, and the shortest of three measurements was considered. The exclusion criteria for the PPG were exposed membranes, rupture of membranes, or labor.
All sonographers involved in this study obtained the appropriate Certificate of Competence for Cervical Assessment of The Fetal Medicine Foundation (https://courses.fetalmedicine.com).
After pessary insertion, we performed transvaginal ultrasound during the regular monthly prenatal appointments to check the position of the pessary and to evaluate funneling inside the pessary or suspected protrusion of the membranes.

The technique of pessary insertion
With the patient in the gynecological position, a sterile cervical pessary was folded in the middle and inserted in the vagina; it was unfolded after the insertion, with the minor orifice involving the entire cervix and the major orifice supported by the posterior vaginal wall. This insertion did not require any other equipment besides sterile gloves and lidocaine gel to lubricate the distal part of the vagina and minimize the discomfort of the insertion. A digital vaginal examination was performed after the insertion to check the correct position of the device, and immediately after the clinical exam, a transvaginal ultrasound was performed with a similar intention. If the pessary was considered to be in a questionable position (►Figure C, Addendum), some maneuvers to relocate the device were performed. These maneuvers consisted of pushing the pessary against the cervix, rotating the pessary to involve the entire cervix, pushing the anterior vaginal wall to put the posterior lip of the cervix inside the pessary, or a combination of these maneuvers (►Figure C/D, Addendum).
One week after pessary placement, we performed transvaginal ultrasonography to identify the position of the cervix in relation to the internal aperture of the pessary. This gave us feedback on pessary placement, which trained us to place the device as high as feasible and reduced inter-operator variability. This information was used to determine whether to reposition the pessary if it was not completely involving the cervix (►Figure D, Addendum).

Figure legend (Addendum): Note that the CGA is centralized (blue area); our team considered the pessary well positioned, and no further procedure was required after this image. Both images are from the same patient, taken about 5 minutes apart.