Introduction

Vaginal pessaries are widely used as a conservative treatment option in the management of pelvic organ prolapse (POP) [1, 2] and have proven effective in relieving POP symptoms [3–5]. However, multiple attempts with different pessaries are sometimes required before an adequate fit is obtained [6]. Additionally, pessary fitting is reported as unsuccessful in up to 59% of women [7], the most common reasons being pessary dislodgment, discomfort/pain, de novo urinary symptoms and failure to relieve POP symptoms [8]. Many studies have been published on the factors associated with (un)successful pessary fitting for POP [7–39]. Among other potential predictors, age, body mass index (BMI), prior surgeries, predominant POP compartments and advanced POP have been assessed, but results differ across studies. It is thus necessary to clarify which parameters are associated with unsuccessful pessary fitting. This knowledge could improve the clinical practice of physicians dealing with POP: counselling for pessary treatment would be more effective and more targeted, and potential parameters associated with failure would be known and discussed with the patient. In addition, modifiable factors could be addressed to increase the probability of success.

The aim of the current review and meta-analysis is to clarify which clinical, demographic and anatomical (assessed by clinical examination or imaging techniques) parameters are associated with unsuccessful pessary fitting for POP up to 3 months follow-up. A maximum of 3 months follow-up was chosen to focus on the pessary fitting process rather than on long-term pessary use.

Methods

Sources

The first author searched Emtree/MeSH terms and keywords related to prolapse, pessary and the exposures (i.e. parameters associated with unsuccessful pessary fitting) through Embase, PubMed and the Cochrane CENTRAL library. The outcome, i.e. unsuccessful pessary fitting, was not included in the search to avoid the risk of missing relevant records. The terms searched through Embase are reported in Table 1 (the same search strategy was translated to PubMed and the Cochrane CENTRAL library). The final search was made on 8 May 2020. No time restrictions were applied; the search was restricted to records in English. All results were exported to RefWorks (Legacy version), and duplicates were removed. If an abstract and a paper reporting the same data were retrieved, the abstract was considered a duplicate and removed.

Table 1 Embase search strategy

Eligibility criteria

Studies were included in which (1) pessary fitting was attempted in women with symptomatic POP (at least 80% of the study population had to have symptomatic POP), (2) one of the assessed outcomes was the success of “initial fitting” and/or “fitting process” with a maximal follow-up of 3 months (in the case of a longer follow-up, at least 80% of the unsuccessful group had to have discontinued the pessary within 3 months from the initial fitting) and (3) baseline parameters (i.e. clinical, demographic and anatomical parameters) were compared between the successful and unsuccessful group. Study design was not a selection criterion and studies reported only in conference abstracts were not excluded. In the following, “initial fitting” will refer to the first visit, which is considered successful if the patient leaves the clinic with a pessary that stays comfortably in place. “Fitting process” will refer to pessary use from initial fitting until a defined follow-up time. It is considered successful if the patient is still using the pessary at follow-up. “Pessary fitting” will refer to both initial fitting and fitting process, if no distinction between the two is needed.

Study selection

To select records eligible for full text assessment, title and abstract were screened by the first and second author, independently of each other. Any disagreement was resolved by discussion and the opinion of a third party (last author). The full text of the selected records was independently assessed by the same two authors. Disagreements were again resolved by discussion and the opinion of a third party (last author). The authors of a record were contacted if the full text of their paper was not accessible either online or at our institutional library, or if some relevant parts of the record were unclear [e.g. definition of pessary fitting (un)success, time to follow-up, statistical significance of the observed differences or incorrect numbers].

Data extraction

A standardized data extraction form was created to retrieve the information relevant to the research question. The following data were extracted: reference (first author, year, journal citation), study design type, study setting, inclusion and exclusion criteria, sample size, prolapse assessment (i.e. Pelvic Organ Prolapse Quantification system or Baden-Walker), pessary types used, assessment of initial fitting and/or fitting process, definition of successful fitting, success rate, time to follow-up, parameters compared between successful and unsuccessful group, significant parameters on univariate analysis and significant parameters on multivariate analysis (if performed). In case a record reported follow-ups beyond 3 months, only the parameters relating to the follow-ups of the first 3 months were extracted.

Assessment of risk of bias

The Newcastle-Ottawa Scale (NOS) for case-control studies was used to assess the risk of bias of the included full-text articles [40]. Records only available as abstracts (i.e. no full text available) were not assessed because of the limited amount of information they can provide. The NOS is specifically designed for non-randomized studies. It consists of three domains: Selection, Comparability and Exposure. The maximum total score is nine (four for the Selection domain, two for the Comparability domain and three for the Exposure domain). The first item assessed in the Selection domain is the adequacy of case definition and requires an independent validation. Since the success of pessary fitting is mostly patient self-reported, and no independent validation is applicable, no points could be given to this item. Therefore, the maximum score for the Selection domain was 3 and the maximum attainable total score was 8. A standard criterion for what constitutes a high-quality study based on the NOS has not yet been established. Generally, a study scoring ≥ 7 is considered high quality [41]. However, since no study could obtain the maximum score on the Selection domain, we used a score of ≥ 6 as the definition of a high-quality study.

Data synthesis

To produce a qualitative synthesis of the results, all parameters assessed on their association with unsuccessful pessary fitting were clustered in a limited number of domains. For each domain one table was produced enumerating all studies in which a specific parameter was assessed on univariate and/or multivariate analysis.

To assess pessary fitting success rate, the weighted success rate at different times to follow-up was calculated. Sub-analyses were made for those studies which excluded and included women with unsuccessful initial fitting.
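
As an illustration of the weighted success rate, the following minimal Python sketch pools study-level success rates using sample-size weights. The weighting scheme is an assumption made for illustration (the text above does not state which weights were used, and meta-analysis software may weight differently), and the function name and input numbers are hypothetical.

```python
def weighted_success_rate(studies):
    """Sample-size weighted mean of study-level success rates.

    `studies` is a list of (success_rate, sample_size) pairs, with the
    success rate expressed as a proportion between 0 and 1.
    Assumption: each study is weighted by its sample size.
    """
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(rate * n for rate, n in studies) / total_n


# Hypothetical studies reporting success at the same follow-up time.
example = [(0.80, 100), (0.60, 50)]
pooled = weighted_success_rate(example)
print(f"Weighted success rate: {pooled:.1%}")
```

With sample-size weights, the result equals the overall proportion of successful fittings across the pooled women, so larger studies contribute proportionally more.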

A meta-analysis was performed of the parameters compared between the successful and unsuccessful groups in at least two records. All available studies were combined without making any distinction based on the time to follow-up. A study was not included in the meta-analysis if the necessary input data were not reported and the authors, after being contacted, did not provide the requested data. In case of overlap between the study populations of two records, the record with the largest sample size reporting the parameter of interest was included in the analysis. The meta-analysis was done with the Comprehensive Meta-Analysis (CMA) version 3 software. Input data for dichotomous variables were the number of exposed (i.e. number of patients with a specific parameter, e.g. prior hysterectomy) and the sample size of the unsuccessful and successful groups, when available, or the odds ratio (OR) and confidence intervals. In the latter case, unadjusted ORs were used in the meta-analysis. For continuous variables, input data were the mean, standard deviation (SD) and sample size of the unsuccessful and successful groups or, if a t-test was run to compare the two groups, the p value and sample size of the two groups. If the data were reported as median and range (minimum-maximum) or interquartile range (IQR), the authors were contacted and asked for mean and SD. In case of no response, mean and SD would have to be imputed to include the study in the meta-analysis. At first, the meta-analysis was run excluding the studies that required data imputation. To test whether the imputed data would have influenced the results, the meta-analysis was also run after data imputation. If the data were reported as median and range, the mean was imputed using the method described by Hozo et al. [42] and the SD was imputed using the method described by Wan et al. [43]. If the data were reported as median and IQR, mean and SD were derived using Wan’s method.
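
The imputation step can be sketched as follows. This is a minimal Python illustration, assuming the commonly cited closed-form approximations from the two references: the Hozo et al. mean estimate (minimum + 2 × median + maximum) / 4 and the Wan et al. estimates based on normal quantiles. The function names and example values are hypothetical, and the exact variants applied in the analysis are not stated in the text.

```python
from statistics import NormalDist


def hozo_mean_from_range(minimum, median, maximum):
    """Approximate the mean from median and range (Hozo et al. formula)."""
    return (minimum + 2 * median + maximum) / 4


def wan_sd_from_range(minimum, maximum, n):
    """Approximate the SD from the range and sample size (Wan et al.)."""
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    return (maximum - minimum) / (2 * z)


def wan_mean_sd_from_iqr(q1, median, q3, n):
    """Approximate mean and SD from median and IQR (Wan et al.)."""
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


# Hypothetical group of 50 women with age reported as median (range).
mean_age = hozo_mean_from_range(40, 60, 90)
sd_age = wan_sd_from_range(40, 90, 50)
print(f"Imputed mean {mean_age:.1f}, imputed SD {sd_age:.1f}")
```

These approximations assume roughly normally distributed data, which is one reason to run the analysis both with and without the imputed studies.
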
Authors were also contacted if they reported a parameter as significant or not significant without providing quantitative data. A random-effects model was applied for the analysis. The summary measure used was the OR. Heterogeneity was assessed with the Q test and I-squared. For the significant parameters, the risk of publication bias was assessed with the trim and fill procedure [44]. The meta-analysis without data imputation is presented in the Results section, while the meta-analysis with data imputation is reported in Appendix E.
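
For readers unfamiliar with the pooling step, the sketch below shows one way to combine odds ratios from 2 × 2 counts under a random-effects model and to derive Q and I-squared. It assumes the DerSimonian-Laird estimator of the between-study variance, which the text does not name, and it does not reproduce the CMA software. The function names and counts are hypothetical, and studies with a zero cell would need a continuity correction that is not handled here.

```python
import math


def log_odds_ratio(exposed_unsucc, n_unsucc, exposed_succ, n_succ):
    """Log OR and its variance from exposed counts and group sizes.

    Assumes no zero cells (no continuity correction is applied).
    """
    a, b = exposed_unsucc, n_unsucc - exposed_unsucc
    c, d = exposed_succ, n_succ - exposed_succ
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def random_effects_pool(effects):
    """DerSimonian-Laird pooling of (log_or, variance) pairs."""
    y = [e for e, _ in effects]
    w = [1 / v for _, v in effects]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q and its degrees of freedom
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(effects) - 1
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance
    w_re = [1 / (v + tau2) for _, v in effects]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts: (exposed unsuccessful, n unsuccessful,
#                       exposed successful, n successful)
studies = [(20, 50, 30, 150), (15, 40, 25, 100), (10, 60, 20, 140)]
result = random_effects_pool([log_odds_ratio(*s) for s in studies])
print(result)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the inverse-variance weights of a fixed-effect analysis.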

The review was conducted in adherence to the PRISMA and MOOSE guidelines. The protocol of the review was not registered before implementation.

Results

Study selection

Using the search strategy described, 1084 unique records were identified. The screening of title and abstract left 151 records. Of these, 119 were excluded after full text assessment and are reported in Appendix A. Thirty-two records (27 papers and five conference abstracts) were included in the qualitative synthesis and 24 in the meta-analysis (Fig. 1).

Fig. 1 Records identification, inclusions and exclusions with reasons

Study characteristics

The characteristics of the 32 included records are enumerated in Table 2. In the following, the included records will be referred to according to the numbers reported in Table 2 and a superscript number will be used in the text. It has to be noted that there is an overlap between the study populations of Cheung et al. (2017) and Cheung et al. (2018) and Manchana (2011) and Manchana et al. (2012). In Appendix B the list of the authors contacted during the review process is reported.

Table 2 Characteristics of the included records

Risk of bias

In Table 3 the Newcastle-Ottawa Scale scores for the three domains and the total scores are reported. Mean total score was 6.

Table 3 Newcastle-Ottawa Scale scores

Synthesis of results: success rate

Pessary fitting success rate ranged from 41% (study 17) to 96% (study 19). In Table 4 the weighted means at different times to follow-up are shown. Sub-analyses were made for those studies which excluded and included women with unsuccessful initial fitting. When the unsuccessful initial fitting was included, the success rates were overall lower (data at 3–4 weeks and 3 months). No sub-analysis was run for studies assessing fitting process success rate at 1/2 weeks, because only one study (study 2) excluded women with unsuccessful initial fitting.

Table 4 Weighted mean of pessary fitting success rate at different times to follow-up. Study reference refers to Table 2

Synthesis of results: parameters

The parameters assessed for their association with unsuccessful pessary fitting by different authors were clustered into nine domains: (1) Demographics, (2) Obstetric history, (3) (Uro)gynaecological symptoms and medications, (4) Prior surgeries, (5) General history, (6) Questionnaires, (7) POP and pelvic floor assessment, (8) Pessary and (9) Imaging. Appendix C shows the domain tables enumerating all studies in which a specific parameter was assessed on univariate and/or multivariate analysis. The results of the meta-analysis excluding imputed data are shown in Table 5 and the corresponding forest plots in Figs. 2–14 (significant parameters) and Appendix D (non-significant parameters).

Table 5 Results of the meta-analysis (imputed data excluded)

Fig. 2 Forest plots of the significant parameters (results of the meta-analysis excluding imputed data)

Fig. 3 Forest plot for the association of age with successful pessary fitting up to 3-month follow-up (N = 2901)

Fig. 4 Forest plot for the association of BMI with unsuccessful pessary fitting up to 3-month follow-up (N = 2244)

Fig. 5 Forest plot for the association of menopausal status with successful pessary fitting up to 3-month follow-up (N = 1338)

Fig. 6 Forest plot for the association of stress urinary incontinence (SUI) (i.e. pre-existing or de novo SUI) with unsuccessful pessary fitting up to 3-month follow-up (N = 1065)

Fig. 7 Forest plot for the association of prior hysterectomy with unsuccessful pessary fitting up to 3-month follow-up (N = 3431)

Fig. 8 Forest plot for the association of prior prolapse surgery with unsuccessful pessary fitting up to 3-month follow-up (N = 2330)

Fig. 9 Forest plot for the association of prior pelvic surgery with unsuccessful pessary fitting up to 3-month follow-up (N = 230)

Fig. 10 Forest plot for the association of prior incontinence surgery with unsuccessful pessary fitting up to 3-month follow-up (N = 497)

Fig. 11 Forest plot for the association of “CRADI-8” (i.e. Colorectal-Anal Distress Inventory-8) scores with unsuccessful pessary fitting up to 3-month follow-up (N = 401)

Fig. 12 Forest plot for the association of TVL (i.e. total vaginal length) with successful pessary fitting up to 3-month follow-up (N = 1135)

Fig. 13 Forest plot for the association of wide introitus (i.e. ≥ 4 fingerbreadths) with unsuccessful pessary fitting up to 3-month follow-up (N = 200)

Fig. 14 Forest plot for the association of levator ani muscle avulsion with unsuccessful pessary fitting up to 3-month follow-up (N = 339)

Parameters associated with unsuccessful pessary fitting are: younger age, higher BMI, pre-menopausal status, stress urinary incontinence (SUI), prior surgery (i.e. hysterectomy, POP surgery, pelvic surgery and incontinence surgery), higher Colorectal-Anal Distress Inventory-8 (CRADI-8) scores (which assess symptoms of obstructive defecation, anal incontinence, pain during defecation, faecal urgency and rectal bulging), shorter total vaginal length (TVL), wide introitus, levator ani avulsion and larger hiatal area on maximum Valsalva. The heterogeneity between studies and the risk of publication bias are low for age, BMI, menopausal status, prior hysterectomy, prior pelvic surgery and prior incontinence surgery. SUI, prior POP surgery and TVL show a low risk of publication bias, but a relatively high heterogeneity between studies. For CRADI-8 scores, wide introitus, levator ani avulsion and hiatal area on Valsalva, the heterogeneity between studies is low, but the impact of publication bias could not be quantified because only two studies could be included in the analysis.

In Appendix E the results of the meta-analysis including imputed data are shown and in Appendix F the corresponding forest plots. Running the analysis without and with the imputed data did not qualitatively change the results: significant parameters remained significant and non-significant parameters remained non-significant. Sub-analyses were made for the parameters SUI and predominant posterior compartment. SUI is associated with unsuccessful pessary fitting (OR 2.06, 95% CI 1.15–3.66, z-value 2.45, p value 0.01). However, grouping the studies into those which assessed pre-existing SUI only and those which also assessed de novo SUI (alone or in combination with pre-existing SUI), de novo SUI remains significant (OR 5.59, 95% CI 2.24–13.99, z-value 3.68, p value 0.00), while pre-existing SUI does not (OR 1.44, 95% CI 0.88–2.36, z-value 1.45, p value 0.15) with small heterogeneity within groups (Q-value 11.17, p value 0.13).

Predominant posterior compartment is not associated with unsuccessful pessary fitting (OR 1.78, 95% CI 0.98–3.24, z-value 1.88, p value 0.06). However, in case of multiple predominant compartments (e.g. maximum POP stage in both the apical and posterior compartment), the patient was included in all relevant groups (e.g. predominant apical compartment POP and predominant posterior compartment POP). When solitary predominant posterior compartment POP is analysed (i.e. excluding women with multiple predominant compartments), a significant association with unsuccessful fitting is observed (OR 1.59, 95% CI 1.08–2.35, z-value 2.37, p value 0.02, Q-value 4.51, df (Q) 5, Q-test p value 0.48, I-squared 0.00) with low risk of publication bias (trim and fill procedure: OR 1.75, 95% CI 1.21–2.53, Q-value 7.04).
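
The relation between the reported ORs, confidence intervals, z-values and I-squared can be illustrated with a short Python sketch. It assumes a normal approximation on the log-odds scale with a 95% interval of ±1.96 standard errors and the usual definition I-squared = max(0, (Q − df) / Q); the function names are hypothetical. Because the published ORs and intervals are rounded, recomputed z-values agree only approximately with the reported ones.

```python
import math
from statistics import NormalDist


def z_and_p_from_ci(odds_ratio, lower, upper):
    """Recover the z-value and two-sided p value from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def i_squared(q, df):
    """I-squared (as a percentage) from Cochran's Q and its degrees of freedom."""
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


# Overall SUI estimate reported above: OR 2.06, 95% CI 1.15-3.66
z, p = z_and_p_from_ci(2.06, 1.15, 3.66)
print(f"z = {z:.2f}, p = {p:.2f}")

# Solitary predominant posterior compartment: Q 4.51 with 5 degrees of freedom
print(f"I-squared = {i_squared(4.51, 5):.2f}")
```

For the overall SUI estimate this gives a z-value of about 2.45, in line with the reported value, and a Q smaller than its degrees of freedom gives an I-squared of zero, as reported for solitary predominant posterior compartment POP.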

Discussion

The aim of the current review and meta-analysis was to clarify which clinical, demographic and anatomical parameters are associated with unsuccessful pessary fitting for POP up to 3 months follow-up.

Main findings: success rate

In the current review the success rate of pessary fitting ranged from 41% to 96%. However, these differences become smaller if sub-analyses are made based on the follow-up time. From initial fitting to 3 to 4 weeks follow-up, the mean success rate decreased from 86% (95% CI 78%–92%) to 65% (95% CI 54%–75%). Interestingly, after 4 weeks the success rate remained substantially stable [success rate of 63% (95% CI 53%–72%) at 3 months follow-up]. This suggests that planning a follow-up at 4 weeks after initial fitting would ensure the vast majority of the unsuccessful fittings were identified (as also reported by Lone et al. [45]). Studies in which only women with successful initial fitting were included reported higher success rates compared to studies in which also women with unsuccessful initial fitting were included. Therefore, our suggestion for future research is to clearly report whether this selection is made or not.

Main findings: parameters

Parameters associated with unsuccessful pessary fitting include: younger age, higher BMI, pre-menopausal status, SUI, prior surgery (i.e. hysterectomy, POP surgery, pelvic surgery and incontinence surgery), higher CRADI-8 scores, shorter TVL, wide introitus, levator ani avulsion and larger hiatal area on maximum Valsalva.

In the case of SUI and prior POP surgery, the risk of publication bias is small, but the heterogeneity is relatively high. With respect to SUI, analysing separately the studies which assessed pre-existing SUI only, and those which also assessed de novo SUI, the heterogeneity within groups becomes smaller. Interestingly, de novo SUI remains significant, while pre-existing SUI does not. This suggests that pre-existing SUI alone is not associated with failure. Therefore, when counselling a patient for pessary treatment for POP, presence of pre-existing SUI should not be considered a reason for advising a different treatment. With respect to prior POP surgery, a possible explanation for the relatively high heterogeneity is that all women of the unsuccessful group in the study of Nemeth et al. (2017) had prior POP surgery with consequent extremely high OR in this study compared to the others.

Some parameters that are significant in the meta-analysis have to be taken with caution. First, TVL shows high heterogeneity between studies. Second, the impact of publication bias could not be quantified for CRADI-8, wide introitus, levator ani avulsion and hiatal area on Valsalva because only two studies could be included in the analysis. In addition, levator avulsion shows moderate heterogeneity, which can be explained by the different definitions of unsuccessful pessary fitting: pessary expulsion in the study of Cheung et al. and pessary discontinuation within 3 months follow-up in the study of Turel et al. The same explanation can be given to the moderate heterogeneity of other non-significant parameters, i.e. predominant apical compartment, advanced POP and GH. These parameters were associated with pessary dislodgment in the study of Cheung et al. but were not associated with unsuccessful pessary fitting when no distinction was made between different reasons for unsuccessful pessary fitting. The reasons for unsuccessful pessary fitting are numerous, e.g. dislodgment, discomfort/pain, de novo urinary symptoms and failure to relieve POP symptoms [8]. Some parameters could be associated only with specific reasons for pessary fitting failure, but not others; future research should analyse the association between anatomical parameters and individual causes of pessary fitting failure.

Parameters related to obstetric history, e.g. number of pregnancies, deliveries and vaginal deliveries, were not found to be associated with unsuccessful pessary fitting. However, no study assessed the influence of prior vaginal delivery vs no prior vaginal delivery on pessary fitting failure. If pessaries are supported by the pelvic floor muscles, prior vaginal delivery (which can cause pelvic floor muscle damage [46]) could be a risk factor for failure, even if POP mostly occurs in parous women. Being sexually active and using hormone replacement therapy (HRT) are not associated with (un)successful pessary fitting. Therefore, a sexually active woman with POP can be encouraged to try this treatment option, and prescribing HRT only when indicated is confirmed to be good practice.

Interestingly, advanced POP stage (3–4) is not associated with unsuccessful fitting. Therefore, pessary treatment can be advised to women with any stage of POP. Predominant anterior, apical or posterior compartment POPs are also not associated with unsuccessful fitting. However, higher CRADI-8 scores (which assess colorectal symptoms) and solitary predominant posterior compartment POP (i.e. maximum POP stage only in the posterior compartment, with women with multiple predominant compartments excluded) are associated with unsuccessful fitting. These results are consistent with the report that pessary treatment is less effective in relieving colorectal symptoms [47].

Recently, a systematic review and meta-analysis has been published on the factors associated with unsuccessful pessary fitting in women with symptomatic POP [48]. Differences between their work and ours are the following. First, the follow-up for pessary fitting was 1 to 3 weeks in their work, while we included studies with a maximal follow-up of 3 months. Second, our search was performed in Embase, PubMed and Cochrane CENTRAL library, while theirs was performed in PubMed, and we screened 1084 records, while they screened 350. Third, they only included prospective studies, while we also included retrospective studies. Fourth, we assessed the weighted success rate of pessary fitting at different times to follow-up, which was not assessed in their work, while they assessed the reasons for pessary discontinuation after successful insertion, which we did not assess. Fifth, in our meta-analysis 24 studies were included, while 21 studies were included in theirs. Sixth, we performed a meta-analysis of 29 parameters, while they performed a meta-analysis of seven parameters. Seventh, we performed the analysis without and with data imputation, while they did not specify if imputed data were also included. With respect to the results, BMI and prior POP surgery were associated with pessary fitting failure in both works. In addition, GH was consistently not associated with pessary fitting failure. Different results were obtained for age, TVL, prior hysterectomy and advanced POP, which can be partially due to the differences described above. Furthermore, more studies were included in our meta-analysis, which should make our results more solid. Only three studies were included in the meta-analysis of the parameter “advanced POP” in their work. The one with the highest relative weight was the study of Cheung et al. in which the definition of failure was pessary dislodgment. 
It might be that advanced POP is a predictor of pessary dislodgment but not a predictor of other reasons for failure. Lastly, since we analysed more parameters, we also observed that menopausal status, de novo SUI, solitary predominant posterior compartment POP, higher CRADI-8 score, wide introitus, levator ani avulsion and larger hiatal area on maximum Valsalva are associated with unsuccessful pessary fitting.

Strengths and limitations

The current review and meta-analysis has several strengths. It was conducted according to the PRISMA and MOOSE guidelines. Multiple databases were searched. Study selection was made independently by two authors. The included papers were, on average, high-quality studies with a low risk of bias, as assessed by the Newcastle-Ottawa Scale. Moreover, authors were contacted in the case of missing information. Some limitations have to be acknowledged. Meta-analyses have the limitation that the interaction between different parameters cannot be assessed. For example, it is highly probable that younger age and pre-menopausal status are correlated. However, we cannot establish whether one of the two is a confounder or both are independently associated with unsuccessful pessary fitting. In addition, mean and SD of continuous variables are needed to perform a meta-analysis, but some authors reported only median and range or median and IQR. To include these studies in the meta-analysis, mean and SD would have to be imputed. While we decided to exclude these studies from the main meta-analysis to avoid any possible bias due to data imputation, we note that imputing mean and SD in these studies and including them does not qualitatively change the results: significant parameters remain significant and non-significant parameters remain non-significant. This suggests that our conclusions are robust.

Conclusions

In women with symptomatic POP, younger age, higher BMI, pre-menopausal status, de novo SUI, prior surgery (i.e. hysterectomy, POP surgery, pelvic surgery or incontinence surgery), solitary predominant posterior compartment POP, presence of colorectal symptoms, shorter TVL, wide introitus, levator ani avulsion and larger hiatal area on maximum Valsalva are associated with unsuccessful pessary fitting up to 3 months follow-up.

These results do not imply that an alternative treatment should always be recommended to women with these characteristics, but rather that the higher risk of failure should be acknowledged and discussed during counselling for pessary treatment. Women at high risk of unsuccessful fitting because of, among other factors, a high BMI could work on this modifiable parameter to increase their probability of success, especially if they do not have many other treatment options (e.g. women who wish to have more children or those unwilling or not suitable to undergo surgery [49]). If pessary treatment is chosen, being aware of the higher risk of failure would relieve some of the frustration related to an unsuccessful pessary fitting process. One might object that such counselling could lower women’s expectations, thus increasing the risk of failure. However, to be ethical, any counselling should be evidence based and should allow women to make informed decisions. In addition, the risk of pessary fitting failure should be weighed against the risks related to other treatments (e.g. surgery), which in many cases would encourage women to try pessary treatment.

Ethnicity, obstetric history, pre-existing SUI, sexual activity, use of HRT, smoking, predominant anterior, apical or multiple compartment POP, and advanced POP are not associated with unsuccessful pessary fitting. Therefore, women with these characteristics can be reassured that they do not have an increased risk of failure and can be encouraged to try pessary treatment.

With respect to the anatomical parameters (assessed by clinical examination or imaging techniques), more research is needed to investigate their association with specific reasons for unsuccessful pessary fitting, i.e. whether it is dislodgment, discomfort/pain or other reasons. In addition, only two studies included in the meta-analysis assessed the association between transperineal ultrasound (TPUS) parameters and unsuccessful pessary fitting. Therefore, the added value of TPUS in the pessary fitting process should be further investigated.