Validation and factor structure of the Italian version of the Birth Satisfaction Scale-Revised (BSS-R)

ABSTRACT Objective To validate the Italian-language version of the Birth Satisfaction Scale-Revised (BSS-R) and report key measurement properties of the tool. To evaluate the impact of antenatal class attendance on BSS-R assessed birth satisfaction. Background Maternal satisfaction is one of the standards of care defined by the World Health Organisation (WHO) to improve the quality of services. The BSS-R is a multi-dimensional self-report measure of the experience of labour and birth. Methods Cross-sectional instrument evaluation design examining factor structure and key aspects of validity and reliability. Embedded between-subjects design to examine known-group discriminant validity and the impact of antenatal class attendance on BSS-R sub-scale and total scores as dependent variables. After giving birth, 297 women provided data for analysis. Results The Italian version of the BSS-R (I-BSS-R) was the key study measure. The established three-factor and bi-factor models of the BSS-R were found to offer an excellent fit to the data. Comparison of the tri-dimensional measurement model and the bi-factor model of the BSS-R found no significant differences between models. Women who attended antenatal classes had significantly lower stress experienced during childbearing sub-scale scores (I-BSS-R SE), compared to those who did not. Good convergent, divergent validity and known-groups discriminant validity were established for the I-BSS-R. Internal consistency observations were found to be sub-optimal in this population. Conclusions On all key psychometric indices, with the exception of internal consistency that requires further investigation, the I-BSS-R was found to be a valid translation of the original BSS-R. The impact of antenatal classes on birth satisfaction warrants further research.


Introduction
Giving birth is a complex psychological individual experience, with elements of universal physiological processes and life event significance (Larkin et al., 2009). Evidence suggests that the experience of labour and birth is complex and subjective (Larkin et al., 2009).
Positive perceptions of, and satisfaction with, the birth experience can be influenced by the fulfilment of expectations, staff characteristics including quality of care and support, involvement in decision making, woman-centred care and women's perception of control (Bayes et al., 2008; DeLuca & Lobel, 2014; Hildingsson, 2015; Hollander et al., 2017; Lewis et al., 2016). Furthermore, women's experience of birth could have long-term implications for the health of both woman and baby, physically and emotionally (Karlström et al., 2015).
The WHO (World Health Organisation, 2016b) reported that satisfaction reflects the extent to which expectations of service standards have been met. There is consensus that satisfaction with care (Christiaens & Bracke, 2007; Larkin et al., 2009) is a complex psychological individual experience, with elements of universal physiological processes and life event significance, and is therefore influenced by a variety of factors. Satisfaction has been defined as a 'positive feeling' or 'affective response' to an event (Bramadat & Driedger, 1993). Understanding women's perception of care and satisfaction with services is important, as perceived quality is a key determinant of service utilisation (Srivastava et al., 2015). Healthcare systems could be more effective if they considered women's experiences, with the aim of providing quality care and meeting families' needs and expectations (Chief Nursing Officers of England, Northern Ireland, Scotland and Wales, 2010; Rao et al., 2006).
Maternal satisfaction is one of the standards of care defined by the World Health Organisation (WHO) to improve the quality of maternity services and to evaluate the organisation of Health Care Systems (World Health Organisation, 2016b); it should be considered as one of the most relevant indicators within both the midwifery and obstetric fields. Childbirth is one of the most common reasons for accessing health facilities; therefore, planners, managers and healthcare providers should assess women's satisfaction with care to evaluate services (Goodman et al., 2004;Hodnett, 2002).
A relatively under-explored though critically important element of care of relevance to the birth experience and birth satisfaction concerns the impact of antenatal classes. Contemporary practice and evidence advocate the use of antenatal classes to optimise birth preparedness and so enhance the birth experience (Ricchi et al., 2020); however, this position is equivocal, with observations that the content of classes is not consistent with maternal expectations (Pålsson et al., 2019) and that the content of antenatal classes varies widely (Barimani et al., 2018).
Women's satisfaction with childbirth is an important measure of the quality of maternity care services, and its assessment should be conducted using validated self-completion questionnaires due to their high reliability and low cost (Romero-Gonzalez et al., 2019). There are several instruments specifically developed to assess maternal satisfaction with care received during labour and birth (Alfaro Blazquez et al., 2017; Nilvér et al., 2017).
A systematic review (Nilvér et al., 2017) conducted with the aim of identifying and presenting validated instruments measuring women's childbirth experience collected 36 tools. Among these, two scales have been used within the Italian context. 'The childbirth perception questionnaire' by Bertucci et al. (2012) does not present testing of psychometric properties and should be further evaluated. The scale 'Women's delivery experience measures' by Mannarini et al. (2013) aimed to evaluate birth experiences after both spontaneous and medically assisted pregnancy, focusing on indices considering the type of conception. This tool comprises a high number of items (18) and does not focus only on the intrapartum care experience.
Considering the Italian birth context and the model of midwifery care provided, the BSS-R was evaluated as the most appropriate instrument to be culturally validated in order to evaluate maternal satisfaction with birth. Italian maternal care is quite medicalised (Euro-Peristat Project, 2018): obstetricians are the primary providers of all antenatal care, with the majority of women having a private doctor who will not be present at their birth. Although the Italian National Healthcare System is free of charge at the point of use, 44.7% of Italian women choose to pay for a private obstetrician (Lauria et al., 2012), even though they have the opportunity to attend a free public community or hospital service. The lack of continuity of maternity care (Lauria et al., 2012), with the majority of women getting to know a midwife only at the time of labour and birth, focused our choice on a scale that concentrates on the intrapartum aspects of care. This could give the opportunity to assess the quality of intrapartum midwifery care and to evaluate and implement quality improvement programmes, in order to offer maternity services based on women's needs.
The Birth Satisfaction Scale-Revised (BSS-R; Hollins Martin & Martin, 2014) is a validated 10-item self-report measure that was developed in the United Kingdom to evaluate women's satisfaction with birth. It comprises three sub-scales: (i.) quality of care provided, (ii.) personal attributes of women and (iii.) stress experienced during childbirth. The BSS-R is a short, valid, reliable and theoretically anchored measure to assess mothers' satisfaction with birth and has recently been recommended as the key outcome measure for assessing birth experience globally (Nijagal et al., 2018). Widely translated and validated, the BSS-R has been shown to demonstrate generally excellent psychometric properties and conceptual alignment to the original UK version (Barbosa-Leiker et al., 2015; Burduli et al., 2017; Fleming et al., 2016; Jefford et al., 2018; Martin et al., 2017; Romero-Gonzalez et al., 2019; Vardavaki et al., 2015). The Italian version of the BSS-R has been recently developed following an extensive translation process to ensure congruence with the original version's conceptual and theoretical alignment, as well as emphasising contextual anchoring in an Italian childbearing context (Nespoli et al., 2018). To date, however, the Italian version of the BSS-R (I-BSS-R) has yet to be evaluated in a clinical population to determine and establish key psychometric properties; thus, the current investigation sought to validate the I-BSS-R in relation to these measurement parameters. Consistent with the approach taken with other translation/validation studies of the BSS-R (e.g. Romero-Gonzalez et al., 2019), our objectives were to: (1) replicate the established tri-dimensional measurement model of the BSS-R in the I-BSS-R; (2) assess the psychometric properties of the Italian version of the BSS-R; and (3) evaluate the impact of antenatal class attendance on I-BSS-R total and sub-scale scores.

Design
A cross-sectional instrument evaluation design was used, utilising convenience sampling and incorporating an embedded between-subjects design for known-groups discriminant validity testing. The Italian birth context has a classification system for levels of maternal care for Obstetric Units, comprising Level I Maternity Units providing care to women with low-risk pregnancies or with minor complications and Level II Maternity Units dedicated to high-risk conditions. Participants were recruited from an Obstetric Unit of a Level I Italian Maternity Hospital. Participating mothers signed a consent form, which informed them of the voluntary nature of their participation, the aim of the study, the procedures and the confidentiality of data (anonymous codification). Participants who consented to take part in the study completed the I-BSS-R prior to discharge from hospital and within 72 hours after birth.

Ethical approval
Ethical approval was obtained from the Hospitals' Ethical Review Board. Written informed consent was gained from all the participants.

Measures
The BSS-R comprises 10 items scored using a 5-point Likert-type response format ranging from strongly agree to strongly disagree for each item. Higher scores on both the total score and the sub-scales equate to comparatively greater birth satisfaction. The three sub-scales, stress experienced during childbearing (SE; 4 items), quality of care (QC; 4 items), and women's attributes (WA; 2 items), assess distinct aspects of birth satisfaction and a total score can also be calculated to indicate overall birth satisfaction. Several translation/validation studies have indicated both the robust measurement characteristics of the BSS-R and conceptual alignment with the tri-dimensional measurement model of the tool (Barbosa-Leiker et al., 2015; Burduli et al., 2017; Jefford et al., 2018; Martin et al., 2017; Romero-Gonzalez et al., 2019).

A brief review of the translation process of the BSS-R into Italian
The exhaustive process of translating the BSS-R into Italian is described in detail by Nespoli et al. (2018). The Italian version of the BSS-R was developed to achieve cross-cultural and conceptual equivalence with the English instrument (World Health Organisation, 2016). Briefly, the process reported by Nespoli et al. (2018) comprised five steps: forward translation, expert panel translation, back translation, pre-testing and cognitive interviewing, to derive the final agreed version. A sample of 100 women was recruited to the study to determine the understandability of the measure during the pre-testing and cognitive interviewing stages. This translation study required two BSS-R items to be modified; specifically, item 1, 'I came through my childbirth unscathed', was changed to 'I came through my childbirth without physical or psychological consequences' and item 9, 'I was not distressed at all during labour', was changed to 'I was not struggling at all during labour'.

Confirmatory factor analysis
Evaluation of the tri-dimensional measurement model of the BSS-R (Objective 1) was conducted using Confirmatory Factor Analysis (CFA; Brown, 2015). CFA is the standard statistical approach for evaluating a measurement model and has been applied in several BSS-R validation studies (Barbosa-Leiker et al., 2015; Göncü Serhatlıoğlu et al., 2018; Jefford et al., 2018; Vardavaki et al., 2015). The bi-factor model was also evaluated using CFA. The generally benign distributional characteristics of the BSS-R observed in previous studies (normal distribution, absence of skew and kurtosis) support a maximum-likelihood approach to model estimation (R. B. Kline, 2011a). Consistent with conventional practice (Brown, 2015), multiple measures of model fit were applied, these being the comparative fit index (CFI: Bentler, 1990), the root-mean-squared error of approximation (RMSEA: Byrne, 2010) and the standardised root mean square residual (SRMR: Hu & Bentler, 1999), this also being consistent with contemporary BSS-R validation studies (e.g. Jefford et al., 2018; Romero-Gonzalez et al., 2019).
It should be noted that in terms of model evaluation, χ² is influenced by both sample size and data variation, and contemporary practice is, therefore, to evaluate models using the fit indices above rather than χ², since a large sample size and trivial variations in data can lead to a significant χ² even in the context of a well-fitting model (Byrne, 2010).
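As a minimal sketch of how two of these fit indices relate to χ², the Python functions below compute RMSEA and CFI from the χ² of the fitted model and of the baseline (independence) model, using their commonly cited formulas. The function names and all numerical inputs are hypothetical illustrations and are not values from this study; note also that some software uses N rather than N − 1 in the RMSEA denominator.

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA from the model chi-square, its degrees of freedom and sample size."""
    # Misfit beyond what is expected by chance (chi2 - df), floored at zero,
    # scaled by degrees of freedom and sample size.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """CFI: relative improvement of the model over the baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    # If neither model shows misfit beyond chance, fit is treated as perfect.
    return 1.0 if d_base == 0.0 else 1.0 - d_model / d_base

# Hypothetical inputs only (NOT results from this study).
example_rmsea = rmsea(chi2=40.0, df=32, n=297)
example_cfi = cfi(chi2_model=40.0, df_model=32, chi2_base=500.0, df_base=45)
print(round(example_rmsea, 3), round(example_cfi, 3))
```

Because both indices work from the excess of χ² over its degrees of freedom rather than from the raw χ² value, they are less sensitive than the χ² significance test to the sample-size effect described above.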

Divergent validity
Following the approach of Romero-Gonzalez et al. (2019), divergent validity was assessed by correlation between the age of participants and I-BSS-R sub-scale and total scores, with no statistically significant relationships between sub-scales/total score and age being anticipated.

Convergent validity
Taking a more sophisticated approach to convergent validity that has been seen in recent BSS-R validation studies to differentiate BSS-R sub-scales (e.g. Romero-Gonzalez et al., 2019), convergent validity was assessed by examining the correlation between I-BSS-R sub-scales and the total I-BSS-R score and the number of days of gestation. It is predicted that the correlation between the I-BSS-R SE sub-scale and number of days of gestation would be statistically significant and positive (thus greater satisfaction with longer duration of pregnancy). It is also predicted that there would be no statistically significant correlation between the I-BSS-R WA and I-BSS-R QC sub-scales and number of days of gestation. No specific prediction is made regarding the relationship between the I-BSS-R total score and duration of pregnancy since this represents a composite of all three sub-scales.

Known-groups discriminant validity
A standard approach taken with most BSS-R studies (e.g. Jefford et al., 2018; Romero-Gonzalez et al., 2019; Vardavaki et al., 2015) to determine known-groups discriminant validity is to compare BSS-R sub-scale and total scores between those who have (i.) an unassisted vaginal delivery (UVD) and those who have (ii.) an assisted delivery (Intervention). An advantage of this approach to discriminant validity evaluation is the unambiguous discrete dichotomisation of the between-groups variable, which allows clarity in the evaluation of the discriminability of the measure and sub-scales, along with the generation of effect sizes that can be used for comparison with observations from previous studies and the planning of future studies, for example, in terms of sample size calculations. No instrument deliveries are offered within the obstetric practice at the research site; therefore, the intervention group comprises delivery by suction cap, elective Caesarean section or emergency Caesarean section. It is anticipated that those in the UVD group would have significantly higher I-BSS-R SE and WA sub-scale scores and I-BSS-R total scores compared to those in the Intervention group. BSS-R QC sub-scale scores vary between studies, with no difference on this sub-scale reported between groups in the original validation study (Hollins Martin & Martin, 2014), in contrast to large differences being observed in Greek, Spanish and United States BSS-R validation studies (Romero-Gonzalez et al., 2019; Vardavaki et al., 2015). Therefore, no specific prediction either in terms of statistical significance or directionality is made in relation to this I-BSS-R QC group comparison.
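The link between a two-group comparison, its effect size and a subsequent sample size calculation can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: it uses Cohen's d with a pooled standard deviation and a normal-approximation formula for the per-group sample size of a two-sided comparison. The function names and the scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison
    (normal approximation; exact t-based methods give slightly larger n)."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * (z_total / d) ** 2)

# Hypothetical sub-scale scores for illustration only (NOT study data).
uvd_scores = [2, 4, 6]
intervention_scores = [1, 3, 5]

d = cohens_d(uvd_scores, intervention_scores)
print(d, n_per_group(d))
```

An effect size taken from a previous validation study could be passed to `n_per_group` in the same way when planning a future study.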

Antenatal class attendance
Antenatal classes aim to guide parents towards childbirth and parenthood, providing evidence-based information. Classes start between 25 and 28 weeks of gestation; women decide freely whether or not to attend them, and they are suitable for first, second and third (plus) time parents. Classes are based either in the hospital or in the community; in both cases, they are run by a midwife and parents-to-be can choose which one they want to attend. Sometimes, during the hospital-based antenatal classes, parents also have a meeting with an anaesthetist, a paediatrician and an obstetrician. Classes are normally held once a week, for around 2 hours, for a total of nine sessions during pregnancy and one following birth; usually, these are classes for pregnant women only, with just a couple of partner sessions in which the midwife covers topics such as when to go to the hospital, what happens in labour and caring for the baby. Classes usually combine pregnancy-specific exercise (pelvic floor exercises, relaxation, positions during labour) with information on different topics: physical and emotional changes during pregnancy, the development of the baby during the three trimesters, the stages of labour, coping strategies during childbirth, including different kinds of pain relief, feelings about birth, caring for a newborn baby, information on breastfeeding and parenthood.
Antenatal class attendance, dichotomised into (i.) Attended (A) and (ii.) Did Not Attend (DNA), was used as a categorical variable in a further evaluation of known-groups discriminant validity. No specific predictions are made regarding differences in I-BSS-R sub-scales and the I-BSS-R total score between groups, nor is directionality intimated.

Internal consistency
Internal consistency was evaluated using Cronbach's coefficient alpha (Cronbach, 1951) with I-BSS-R sub-scale and total score acceptability determined by threshold values of 0.70 or higher (P. Kline, 2000).
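For readers less familiar with the statistic, coefficient alpha can be computed directly from item-level responses as k/(k − 1) × (1 − Σ item variances / variance of the summed score), where k is the number of items. The sketch below is a minimal illustration; the function name and the response data are hypothetical and are not drawn from this study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses to a two-item sub-scale (NOT study data).
responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha >= 0.70)  # compared against the 0.70 threshold
```

The formula makes clear why very short sub-scales tend to yield lower alpha values: with few items, the item variances make up a larger share of the total score variance unless the items are very strongly correlated.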

Participants
Three hundred Italian-speaking women took part in the study. Those who consented to take part had given birth and completed the I-BSS-R within 72 hours postpartum. Three multivariate outliers were identified by reference to Mahalanobis distances exceeding the cutoff criterion of χ² > 29.59 for a ten-item measure, leaving a total sample size of N = 297 for analysis. The mean age of participants was 33.12 (SD 4.96), range = 19-47 years. The mean duration of pregnancy was 39.36 (SD 1.16, range 37-41) weeks. One hundred and sixty-seven (56%) participants had had their first baby. Comparison with the mean BSS-R sub-scale and total scores reported by Hollins Martin and Martin (2014) revealed all I-BSS-R sub-scales and the I-BSS-R total score to be significantly lower (Table 1). Mean I-BSS-R sub-scale and total scores are summarised in Table 1 and reveal an absence of significant skew or kurtosis based on Kline's (R. B. Kline, 2011b) limits of 3 (skew) and 10 (kurtosis), a finding mirrored in the distributional characteristics of individual I-BSS-R items (Table 2).
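The Mahalanobis cutoff of 29.59 is consistent with the χ² critical value at p = 0.001 with 10 degrees of freedom (one per item). As a hedged check, the sketch below evaluates the upper-tail probability of χ² at that cutoff using the closed-form expression that holds for even degrees of freedom; the function name is a hypothetical illustration.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variate.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid for positive even degrees of freedom only.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires positive even df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Tail probability at the reported cutoff for a ten-item measure.
p_at_cutoff = chi2_sf_even_df(29.59, 10)
print(round(p_at_cutoff, 4))
```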
I-BSS-R SE, WA, and QC sub-scales were all significantly correlated with the I-BSS-R total score, r = 0.89, p < 0.001, r = 0.73, p < 0.001 and r = 0.47, p < 0.001, respectively. The SE sub-scale was observed to be significantly correlated with the WA (r = 0.51, p < 0.001) and QC (r = 0.19, p = 0.001) sub-scales. No statistically significant correlation was observed between WA and QC sub-scales (r = 0.05, p = 0.44). Replicating the approach taken previously (Romero-Gonzalez et al., 2019), I-BSS-R sub-scale/total score correlations were compared with those reported by Hollins Martin and Martin (2014) in the original BSS-R development study. Adopting the statistical approach of Diedenhofen and Musch (2015), statistically significant differences between studies were observed on two scale combinations: (i.) WA and QC sub-scales and (ii.) BSS-R total score and QC sub-scale. No other statistically significant differences were observed (Table 3).
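One standard way of comparing a correlation across two independent samples is Fisher's r-to-z test, sketched below. It is offered as an illustration of the general technique and is not claimed to be the exact computation performed in this study. The function name is hypothetical, and the second correlation and its sample size are invented for the example.

```python
import math
from statistics import NormalDist

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and the two-sided p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# r1 and n1 are the WA-QC correlation and sample size reported above;
# r2 and n2 are HYPOTHETICAL stand-ins for a comparison study.
z_stat, p_value = compare_independent_correlations(0.05, 297, 0.50, 200)
print(round(z_stat, 2), p_value < 0.001)
```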

Convergent validity
Correlations between the I-BSS-R total score and the SE sub-scale score and number of days of gestation were both positive and statistically significant, r = 0.17, p = 0.004, r = 0.21, p < 0.001, respectively. No statistically significant correlations were observed between WA and QC sub-scales and number of days of gestation, r = 0.02, p = 0.72, r = 0.08, p = 0.16, respectively.

Known-groups discriminant validity
Highly statistically significant differences were observed on I-BSS-R total score and SE and QC sub-scales in favour (higher scores) of the UVD group compared to the intervention group. No difference was observed between groups on WA sub-scale scores (Table 4). Participants who attended antenatal classes were observed to have significantly lower I-BSS-R SE sub-scale scores compared to those who did not attend such classes. No other statistically significant differences were observed (Table 5).

Internal consistency
Cronbach's alphas for the I-BSS-R total scale and sub-scales are summarised in Table 6. Significantly lower Cronbach's alphas were observed on the I-BSS-R total scale and SE and QC sub-scales compared to those reported by Hollins Martin and Martin (2014). No significant difference was observed between the I-BSS-R WA sub-scale and that reported by Hollins Martin and Martin (2014).

Discussion
The current investigation sought to validate the recently developed Italian version of the BSS-R in an Italian clinical population. Across a range of psychometric criteria, the I-BSS-R appears to perform well and is consistent with the original BSS-R; however, on indices of internal consistency, findings are inconsistent with previous translation/validation studies of this increasingly used measure and therefore, overall, the current findings are equivocal. To understand the findings in both a theoretical and a clinically applied manner, each of the key observations from the psychometric evaluation of the I-BSS-R will be examined on a point by point basis. Finally, the observation of significant differences in I-BSS-R SE sub-scale scores as a function of antenatal class attendance will also be explored.
Starting with the measurement model, the confirmatory factor analysis revealed an excellent fit to the tri-dimensional model of the BSS-R (Hollins Martin & Martin, 2014), which is a finding not only consistent with the original measure but also consistent with several other validation studies (Göncü Serhatlıoğlu et al., 2018;Jefford et al., 2018;Martin et al., 2017;Romero-Gonzalez et al., 2019;Vardavaki et al., 2015). The bi-factor model of the BSS-R was also found to offer an excellent fit to data and thus our findings are consistent with the bi-factor model of the BSS-R evaluated by Martin et al. (2018) evidence of which supports the use of BSS-R sub-scale scores and the BSS-R total score, or both, depending on the clinical context of practice application of the measure and/or clinical research. Thus, in terms of routine clinical outcome monitoring the use of the BSS-R total score can be utilised with confidence, this being consistent with the recommended use of the measure and total item scoring approach recommended by The International Consortium for Health Outcome Measurement Pregnancy and childbirth standard set guidelines. Drilling down further into the three sub-scales within the instrument, application to research in understanding the relationship of these domains of stress, attributes and quality of care offers the opportunity to explore these factors in terms of their relative sensitivity to discrete aspects of clinical context in terms of elucidating the specific areas of birth experience that impact on birth satisfaction and utilising such observations to improve care and the woman's individual birth experience. Our study was also the first to empirically compare the established three-factor measurement model of the BSS-R to the bi-factor model and we found no statistically significant difference between models thus further supporting the assertion of Martin et al. (2018) regarding equivalence of scoring approach.
Further validity evidence for the BSS-R was found in the divergent validity observations which, consistent with prediction, confirmed the absence of significant relationships with participants' age, a finding consistent with the recent investigation by Romero-Gonzalez et al. (2019) and thus indicating that no age-related adjustment to I-BSS-R scores is required.
In terms of novel insights from the application of the BSS-R in this population, the finding that the BSS-R SE sub-scale was significantly and positively correlated with duration of pregnancy (as was the BSS-R total score) offers a useful insight into the little explored relationship between length of gestation and stress experienced during childbearing. Traditional focus has been on the issues of pre-term birth and primarily the health and well-being of the baby rather than the mother (for example, Maslow et al., 2016). Our findings suggest that further research is desirable into this relationship at term and specifically into conceptualising the relationship as a continuum, thus illuminating other areas of clinical wellbeing of the mother herself, such as the timing of interventions at term, which are also underexplored in the literature. The findings are not intended to promote induction of labour at term, which has the potential to do more harm than good and has staggering resource implications (Menticoglou & Hall, 2002; World Health Organization, 2011). Moreover, induction of labour could also be perceived as another intrapartum intervention that leads to a high level of stress, with a potentially negative impact on maternal satisfaction. Instead, we suggest that women should be well informed and supported as they approach their due date; indeed, they would like to be, and should be, involved in deciding how to proceed once term is reached (Schwarz et al., 2016). These strategies could improve women's satisfaction with the entire birth process.
The finding that women experiencing an intervention delivery compared to an unassisted vaginal delivery have significantly lower I-BSS-R SE sub-scale and I-BSS-R total scores is entirely consistent with several BSS-R translation/validation studies that confirm the negative impact of intervention on birth satisfaction in relation to the BSS-R SE sub-scale and total BSS-R score (Hollins Martin & Martin, 2014; Jefford et al., 2018; Martin et al., 2017; Romero-Gonzalez et al., 2019; Vardavaki et al., 2015). Clearly, this observation is not a critique of interventions per se; however, it does highlight their negative impact on birth satisfaction and is a pertinent reminder of the need to consider carefully the actual clinical need for an intervention (is the intervention really necessary?) and women's informed choice in the decision-making matrix regarding optimising their birth experience.
The finding that women who attended antenatal classes had significantly lower BSS-R SE sub-scale scores compared to those who did not attend classes is striking in terms of both directionality and level of statistical significance. Antenatal classes are generally perceived as being of universal benefit for preparing women for birth and its potential consequences. Our study notes a negative impact of attending antenatal classes upon maternal birth satisfaction. In accordance with Soriano-Vidal et al. (2018), women who receive antenatal education are more likely to gain evidence-based information, which in turn increases their awareness and empowers them in relation to understanding birth processes. As such, receiving education could change or increase women's expectations of their birth experience, with mothers who have higher expectations experiencing lower fulfilment in relation to personal requests (Christiaens et al., 2008; Mei et al., 2016), which will affect overall birth satisfaction (Mei et al., 2016). Furthermore, we should consider the Italian context, where there is a high-risk culture surrounding childbirth (Euro-Peristat Project, 2018; Rota et al., 2017), which could further widen the discrepancy between perceived best practice and the actual midwifery care provided during labour and birth. Nonetheless, it is right that midwifery services should strive to be evidence-based in their approach towards educating childbearing women, with the knowledge imparted inevitably shaping women's birth experience.
The internal consistency observations represent one aspect of the current validation study that raises an area of concern and is inconsistent with the psychometric evaluation outlined thus far, which generally indicates exemplary measurement characteristics. The Cronbach alpha findings of the I-BSS-R sub-scales and the total measure were all sub-optimal according to the conventional criterion of 0.70 (P. Kline, 2000). This represents a surprising finding, not only in terms of the exhaustive translation process undertaken by Nespoli et al. (2018) to ensure equivalence of items to the original BSS-R but also in terms of previous validation studies, which generally find acceptable alpha for the BSS-R total scale (Hollins Martin & Martin, 2014; Jefford et al., 2018; Romero-Gonzalez et al., 2019; Vardavaki et al., 2015). However, with the exception of Burduli et al. (2017), the BSS-R WA sub-scale is invariably observed to have an alpha of <0.70, largely interpreted as a function of the small number of items in this particular BSS-R sub-scale (N = 2). There have also been occasions in validation studies of the BSS-R where the BSS-R QC sub-scale (Romero-Gonzalez et al., 2019) and the BSS-R SE sub-scale (Göncü Serhatlıoğlu et al., 2018) have been observed to have alpha <0.70. The observation of a sub-optimal alpha for a particular BSS-R sub-scale in these studies has generally been considered against a backdrop of generally excellent psychometric properties of the measure on other tests of validity and reliability, and has thus been accepted as a relatively minor shortcoming where it occurs. Further, given that the BSS-R can be used as a total score measure, the consistent observations prior to the current study that alpha for the total BSS-R scale is always >0.70 have minimised the potential issue of sub-optimal alpha in a specific BSS-R sub-scale.
Thus, our observations in terms of internal consistency represent a departure from previous observations and suggest that further investigation of this aspect of reliability is required; possibly the most appropriate method is a replication study in a further Italian-speaking population, rather than further revision of the scale.
A potential limitation of the study was that a comparison between the I-BSS-R and another measure of birth satisfaction was not made. Given that this could offer a valuable insight into both the I-BSS-R and indeed the measure to which it is compared in terms of evaluating the degree of overlap and congruence, it is suggested that future research with the I-BSS-R compare the degree of association of the measure with another birth satisfaction scale. The findings in relation to the relationship of antenatal classes and education to birth satisfaction have clearly been identified within the current study as an area to be 'unpacked' through further research. Given the variability in content and format in antenatal class provision both within and across countries, this would likely represent a complex research agenda to develop; however, the implications for care would indicate such effort is both worthwhile and indeed necessary.

Conclusion
The current investigation found the I-BSS-R to be a valid and reliable version of the BSS-R, and for the most part consistent in terms of measurement characteristics and factor structure with the original English-language version. It was observed that women who attended antenatal classes had comparatively lower I-BSS-R SE sub-scale scores than women who did not attend, a perhaps counterintuitive observation and one that clearly circumscribes an important agenda for future research.