Improving health-related quality of life instrument translation into South African languages

Background: Most health-related quality of life (HRQoL) instruments have been created in English, which can influence their reliability and validity in non-English speaking populations. This study assessed the translation methodology of HRQoL instruments that have been used and translated into South African languages and which could be applied in cost-utility analyses (CUAs).
Methods: A 2019 systematic review was updated with searches conducted in Medline, the Web of Science™ (WoS™) Core Collection and the South African SciELO collection via the WoS™ platform. Additional searches in Sabinet’s African Journals database and on instrument developers’ webpages were performed. Only HRQoL instruments suitable for CUAs were included. Articles reporting at least one element of the translation methods were included. Established good practice principles were used to evaluate the translation methodology.
Results: Within the 39 publications identified, a dozen translated instruments have been used in South Africa. All instruments used were translated from English and none had originally been created in South Africa. Instrument developers’ translations were used more than study investigators’ translations. Almost all instrument developer versions met the full translation criteria. No investigator translated instrument met the full translation criteria primarily because recommendations on forward and back translations were not followed. However, this analysis was hampered by a lack of methodological reporting details. The most used instruments, which also had the most translated versions available, were the EQ-5D-3L, SF-36 version 2 and EORTC QLQ-C30.
Conclusion: Instrument developers’ translations more often met recommended translation methodology compared with investigators’ versions. The EQ-5D-3L may be best suited for South African economic evaluations and for use in clinical practice, but further work may be needed.


Introduction
Patient reported outcomes measures (PROMs) can be applied in diverse settings to guide the choice of healthcare interventions. They do so not only by providing data on the clinical outcomes achieved with a health technology or on the disease course, but also because these data can be combined with cost data in economic evaluations to determine the technology's cost-effectiveness. Measuring provider performance and quality of services using PROMs, the use of health technology assessment (HTA) and economic evaluations will play an increasingly important role in South African private healthcare provision and national public health policy decisions. 1,2,3,4 Specifically, under National Health Insurance (NHI), economic evaluations as part of a more formal HTA programme could be used to determine the cost-effectiveness of treatments provided as part of the NHI fund. On this point the 2019 NHI Bill stated that: "The Ministerial Advisory Committee on Health Technology Assessment for National Health Insurance, which must be established to advise the Minister on Health Technology Assessment … must regularly review the range of health interventions and technology by using the best available evidence on cost-effectiveness, allocative, productive and technical efficiency and Health Technology Assessment." 1 (p. 29) Such cost-effectiveness evaluations frequently use health-related quality of life (HRQoL) data in cost-utility analyses (CUAs) because these data allow comparisons across health situations and programmes. These analyses can subsequently be employed for making resource allocation decisions. Indeed, this is the intention of the 2021 draft HTA guideline issued for comment by the National Department of Health as part of the Essential Drugs Programme's appraisal of medicines to inform their inclusion on the National Essential Medicines List (NEML) 5 : the guideline proposes that a CUA be conducted if a comprehensive cost-effectiveness evaluation is required.
However, not all HRQoL data are suitable for CUAs. HRQoL data should be generated by a generic PROM, because such an instrument assesses treatment outcomes across a range of populations and interventions. Such instruments should also measure attributes across multiple health domains and have a health state classification system based on the combination of responses, and each health state generated by the instrument should have an associated numeric value that represents the patient or general public preference for it. 6 For this reason, they are also called multi-attribute utility instruments (MAUIs) or preference-based measures. It is possible to convert the results from some instruments that are not MAUIs or preference-based measures into generic HRQoL data that can be used in CUAs, but this should be avoided if possible because of the limitations of such conversions. 6,7 Whether for clinical decisions, economic evaluation or health policy decisions, it is important that PROMs are reliable and valid in the target population. However, reliability and validity can be influenced by, amongst other factors, the instrument language: if English language instruments are used in non-English speaking populations, respondents' understanding of the questions and the responses they give may not correspond to the intended concepts. 8 The importance of adequate translations into South African languages is emphasised by the findings of studies that evaluated the validity, reliability and cultural adaptation of translated HRQoL instruments. Researchers found that reliability and validity were influenced by socioeconomic factors such as education, literacy and rural or urban living, which were often associated with populations' cultural background and historical racial inequalities. 9,10,11,12 In addition, understanding could be impacted when no equivalent word existed in the South African language, 13 or because of difficulty in transferring English concepts into African culture. 11,14
Consequently, the issues identified required the researchers to make semantic and conceptual changes to the instruments. Changes in the delivery of some instruments, such as an oral explanation of the nature of the instrument and questions, or requesting permission to ask questions considered sensitive in a particular cultural group, were also necessary to alleviate participants' discomfort in completing the questionnaires. Therefore, when an inadequate translation is used, the interpretation of the results and the conclusions drawn from them may be limited. Moreover, inadequately translated instruments impact a study's generalisability to the wider population. This in turn may cause uncertainty when such results are needed to inform the priority setting process, thereby potentially delaying patients' access to new technologies. 15 Therefore, to optimise patient outcomes and support patients' access to technologies, HRQoL instruments should be administered in the target population's home language(s). If such a language version is not available, the instruments should be translated according to good translation principles as this will give greater certainty in the results obtained. Unsurprisingly, the South African Guidelines for Pharmacoeconomic Submissions (SAGPS) point out that all HRQoL instruments should be validated using South African data and request detailed information supporting the validity of the tool in the South African context. 16 Similarly, the draft HTA guideline for the NEML proposes that HRQoL for a CUA should be measured from a representative sample of the South African population using a validated instrument, and the effects should be valued with a South African-based value set 5 (a value set assigns a single score to each health state possible with a MAUI, based on the South African general population's preferences).
However, whilst a systematic review found that a body of HRQoL data exists in South Africa, details on the instrument language versions used or the translation methods employed were reported infrequently. 17 This study therefore sought to (1) provide a list of HRQoL instruments suitable for CUAs that have been used in South Africa using one or more South African languages (other than English), (2) critique the methods used to produce the translated versions, and (3) make recommendations on which instruments may be suitable for future HRQoL measurement within the context of conducting CUAs in South Africa as part of national HTA.

Study design
The study forms part of a broader project that aimed to identify evidence and research gaps in HRQoL data in South Africa against the background of national HTA and CUA; and for which a systematic review was conducted in January 2019. The methods and results of this bigger project have previously been published. 17,18 The current analyses build on the earlier systematic review through updated and new searches and new qualitative and quantitative analyses focussing on the translation methods used in the identified studies.

Search strategy and inclusion criteria
Briefly, the 2019 systematic review included full text articles and abstracts that reported South African-based HRQoL research using HRQoL instruments suitable for CUAs, namely MAUIs and instruments that can be mapped to them. There were no publication time restrictions and studies of any design were included. Multiple literature databases were searched, combining keywords and free text in the title and abstract and subject heading terms. The 2019 review was revisited to identify multi-country studies that were originally excluded because the publications reported the use of a MAUI, but did not report South African results separately. This allowed the inclusion of all publications that reported the use of translated instruments in the current analysis, regardless of whether the South African cohort results were reported.

Data sources
Searches covering 2019 onwards were conducted in the Web of Science™ (WoS™) platform on 11 April 2021 as per the 2019 review: the databases searched were Medline, the WoS™ Core Collection and the South African SciELO collection. As reference list searching in the original review identified a small number of articles from South African specific journals not indexed on any of the databases included in the WoS™ platform, additional keyword and free-text searches were conducted on Sabinet's African Journals database to account for non-indexed South African journals.

Screening, inclusion criteria and information extraction
In the 2019 systematic review, two reviewers were responsible for first and second pass screening and data extraction into Excel®, whereas only one reviewer (the first author) screened articles and extracted the data from the updated searches. In addition to the criteria used in the 2019 review, articles were excluded if they did not contain information that identified the instrument as a South African language version. Instruments created in South African languages were included, but studies and their associated publications were excluded if only English language instruments were used. Furthermore, translated instruments available from, or endorsed by, the original instrument developer were considered for inclusion, as were instruments translated by investigators for their specific study without input from the original instrument developer. However, only publications reporting that the instrument developer's translated and endorsed version was used, and publications by investigators that reported at least one of their translation stages, were selected (see Table 1). Finally, multiple publications of the same study were marked as 'duplicate' and only the one reporting the most detailed translation methodology was retained. For instrument developer versions, their webpages and the Mapi Research Trust's PROQOLID™ database were searched for information on the translation methods. In some instances, written requests were sent to the instrument developers or their appointed representatives to confirm availability of translated South African language instruments and to request information on translation methods.

Assessment criteria
The final set of publications was evaluated against the guidelines for cross-cultural adaptation of HRQoL measurements first outlined by Guillemin et al., 8 and the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Principles of Good Practice for the translation and cultural adaptation process for PROMs. 19 These best practice recommendations were selected because the guidelines by Guillemin et al. have been used most often in the reported literature, and the ISPOR publication by Wild et al. is frequently referenced as a recognised methodology within the pharmaceutical industry and by the United States Food and Drug Administration. Each stage outlined in Table 1 was scored as: positive (+): procedure performed according to the quality criteria used; negative (-): procedure not performed as recommended; uncertain (?): insufficient information available to rate the stage; unknown (0): no information available to rate the stage.
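The rating scheme lends itself to a simple tabulation. The Python sketch below is illustrative only. The stage names follow Table 1 and the four symbols follow the text above, but the rule that a language version "fully meets" the criteria only when every stage is rated '+' is an assumption about how the ratings were combined, and the function names are hypothetical.

```python
from collections import Counter

# Stage names as listed in Table 1.
STAGES = [
    "forward translation",
    "reconciliation and consensus",
    "back translation",
    "review and harmonisation",
    "pilot and cognitive debriefing",
    "finalisation",
]

# Rating symbols as defined in the assessment criteria.
RATINGS = {
    "+": "performed according to the quality criteria",
    "-": "not performed as recommended",
    "?": "insufficient information to rate the stage",
    "0": "no information to rate the stage",
}


def meets_full_criteria(ratings):
    """Assumed rule: a version meets the full criteria only if all stages are '+'."""
    unknown = set(ratings.values()) - set(RATINGS)
    if unknown:
        raise ValueError(f"unrecognised rating symbols: {sorted(unknown)}")
    return all(ratings.get(stage) == "+" for stage in STAGES)


def tally_by_stage(versions):
    """Count how often each rating occurs per stage across language versions.

    A stage missing from a version's ratings is counted as '0' (no information).
    """
    return {
        stage: Counter(version.get(stage, "0") for version in versions)
        for stage in STAGES
    }
```

Applied to a list of rated language versions, `tally_by_stage` would give the kind of per-stage counts that a summary table of ratings reports, while `meets_full_criteria` flags the versions that pass every stage.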

Ethical considerations
The Faculty Postgraduate Studies Committee at the Nelson Mandela University reviewed the study proposal and granted ethics approval (ethics clearance reference number: H18-HEA-PHA-009).

Literature retrieved
The updated literature searches identified an additional 614 publications for screening, of which 32 were retained. The 2019 review contributed 123 articles which, together with a further 12 articles that were originally excluded because no South African study results were reported separately, constituted the bulk of the articles. Thus, 167 publications were assessed for their instrument translation methodology. Of these, 128 were excluded because they did not report the instrument language (n = 76), reported on the same study as another publication (n = 23), used only English language instruments (n = 15), or contained no description of the translation methods or information on whether the translated version was obtained from the instrument developer (n = 14). Thus, the remaining 39 studies were critiqued. The flow of articles identified and included in the analysis is illustrated in Figure 1.
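The counts in this paragraph can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers reported in the text. The variable names are hypothetical, and reading the 144 studies mentioned in the Discussion as the 167 publications minus the 23 duplicate reports is an inference from the figures rather than something the text states.

```python
# Publications assessed for translation methodology, as reported in the text.
from_2019_review = 123
multi_country_previously_excluded = 12
retained_from_updated_searches = 32
assessed = (from_2019_review
            + multi_country_previously_excluded
            + retained_from_updated_searches)

# Reasons for exclusion and their reported counts.
exclusions = {
    "instrument language not reported": 76,
    "reported on the same study": 23,
    "English language instrument only": 15,
    "no translation methods or developer source reported": 14,
}
excluded = sum(exclusions.values())
included = assessed - excluded

# Inferred: distinct studies once duplicate reports are removed.
distinct_studies = assessed - exclusions["reported on the same study"]
share_translated = 100 * included / distinct_studies

print(assessed, excluded, included, distinct_studies, round(share_translated, 1))
# prints: 167 128 39 144 27.1
```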

Instruments identified
Three new instruments were found that had not been identified in the 2019 review: the Asthma Quality of Life Questionnaire, Women's Health Questionnaire and Parkinson's Disease Questionnaire. However, only the Parkinson's Disease Questionnaire provided information on the language version used. This instrument, and the other included instruments are listed in Table 2. The table describes the translated HRQoL instruments suitable for CUA that have been used in South Africa, the language versions used in the studies and the language versions currently available from the instrument developer. It shows that in some instances where there was an absence of instrument developer translated versions, the investigators created their own versions. In addition, currently most instruments listed have at least one South African language version available from the developer (other than English). However, none are available in all South African languages but the EQ-5D-3L has the most translated versions available from the developer. In more than a third of the studies using generic instruments (which are suitable for a range of diseases), people living with the human immunodeficiency virus (HIV) formed the target population. The remainder of the studies using generic instruments included people with gastroenterological conditions, musculoskeletal conditions, unspecified or multiple chronic conditions and tuberculosis.

Table 1
Forward translation: Professional translators, who are native speakers of the target language and fluent in the instrument source language, independently conduct at least two parallel translations of the original instrument into the target language. This enables detection of errors and divergence of conceptual meaning.
Reconciliation and consensus: Translations are synthesised by comparing and merging the forward translations into a single translation, creating a consensus version. Approaches may differ, but ideally this should be performed by a committee, an independent native speaker of the target language not previously involved in forward translation or in-country investigator who may have prepared one of the forward translations.
Back translation: A quality control step whereby professional translators who are native speakers of the source language and fluent in the target language conduct at least two independent translations of the reconciled consensus version back into the instrument's source language. This ensures that the same meanings have been derived in the translated version and avoids having a different conceptual basis to the source measure.
Review and harmonisation: Another quality control step whereby the back translated versions are reviewed by a committee, or the project manager and the back translators, or the project manager and key in-country consultants, against the original document. This aims to detect and deal with translation discrepancies between the different language versions and supports production of a conceptually equivalent version. Thereafter, the pre-final version is produced.
Pilot and cognitive debriefing: Lay people or a sample of the target population test the comprehensibility and equivalence of the pre-final version through soliciting feedback on the understandability, interpretation and cultural relevance of the translated instrument.
Finalisation: Results from the cognitive debriefing are incorporated into the translation, which is proofread and finalised by a committee or the project manager and a key in-country person.
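As a rough illustration of how the forward and back translation criteria in Table 1 could be turned into one of the four rating symbols, the Python sketch below checks two of the reported details: the number of independent translations and whether professional translators were used. It is a simplification under stated assumptions. The function name and the decision order are hypothetical, and other requirements in Table 1 (such as the translators' native language) are not modelled.

```python
from typing import Optional

MIN_PARALLEL_TRANSLATIONS = 2  # Table 1 asks for "at least two" translations


def rate_translation_step(n_translations: Optional[int],
                          professional_translators: Optional[bool]) -> str:
    """Return '+', '-', '?' or '0' for a forward or back translation step.

    None means the publication did not report that detail.
    """
    if n_translations is None and professional_translators is None:
        return "0"  # no information available to rate the stage
    # Assumption: any reported detail that violates a criterion gives a negative rating.
    if n_translations is not None and n_translations < MIN_PARALLEL_TRANSLATIONS:
        return "-"
    if professional_translators is False:
        return "-"
    # Nothing reported contradicts the criteria, but one detail is missing.
    if n_translations is None or professional_translators is None:
        return "?"
    return "+"
```

Under this sketch, a single forward translation, or translation by bilingual staff who are not professional translators, would be rated '-' regardless of what else was reported.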

Translation of instruments
None of the measures were originally created in South African languages; all were originally English language instruments.
Nearly all studies reported the use of HRQoL instruments to measure health outcomes; only six publications were methodological articles providing the results of testing an instrument's reliability and validity or issues with its cultural adaptation.
Most studies (25/39) used the instrument developer's translated version or created a version based on the developer's translation manual and 14 studies used versions translated by the investigators. In three instances the researchers translated the instrument despite the availability of a translated version from the developer. The EQ-5D-3L was the most translated instrument (n = 15), followed by the SF-36 version 2 (n = 8) and the EORTC QLQ-C30 (n = 5). The most translated languages were isiXhosa (n = 21), Afrikaans (n = 21) and isiZulu (n = 17).
When translated by the investigators, no instrument met all the translation methodology criteria, but this analysis was often hampered by a lack of detailed reporting. In contrast, almost all the instrument developer versions met the complete set of recommended translation criteria used in this study. It was observed that translations produced by, or on behalf of, the European Organisation for Research and Treatment of Cancer do not require professional translators for the forward translation step 20,21 and that the PedsQL™ only required one backward translation. 22 These instruments were therefore judged to not meet our study's translation criteria. Only the instrument developers' versions of the EQ-5D-3L (Afrikaans, isiZulu, isiXhosa, Sesotho and Setswana) and -5L (isiXhosa), SF-36 version 2 (Afrikaans, isiZulu, Sesotho) and SF-12 (isiXhosa) fully met the translation criteria.
As reported in Table 3, the stages reported most often in accordance with the translation criteria (i.e. a rating of '+') were the pilot and cognitive debriefing stage (n = 32), review and harmonisation stage (n = 28), and the reconciliation and consensus stage (n = 25). Forward and back translations were often not performed in accordance with the recommendations (i.e. a rating of '-') (both n = 8). Lack of any reporting details (i.e. a rating of '0') occurred most often for the reconciliation and consensus stage after forward translation and the finalisation of the instrument through proofreading (both n = 11).
Pitfalls common to all investigators who conducted their own translations were producing only one forward or backward translation, or using bilingual or native language speaking academics, research assistants or healthcare workers rather than professional translators. In addition, none reported the use of a reconciliation and consensus stage and only one reported how the instrument was finalised. Half of the investigators did, however, report the use of a pilot testing phase.

Discussion
To the best of the authors' knowledge, this is the first review and assessment of the translation quality of HRQoL instruments … (Table 2). However, few have been translated for use in the South African population according to good practice guidelines (Table 3). Encouragingly, where reported, most studies used a language version in one of the three most spoken home languages, namely isiXhosa, Afrikaans and isiZulu, which together represent just over 50% of the population. 56 Of the studies included in this review, only the instrument developers' language versions of the EQ-5D-3L and -5L, SF-36 version 2 and SF-12 fully met the translation criteria used in the analysis. Of these, the EQ-5D-3L is currently the best placed of the existing HRQoL instruments suitable for CUAs for use in South Africa. This is based on the following factors: (1) It allows use and comparisons of results across multiple diseases (and is therefore preferred by many HTA agencies supporting healthcare priority setting and funding decisions 6 ), (2) It was shown in this study to meet the standards for translation of PROMs, (3) It was found to have the most translations available of those studies included in the current analysis and (4) It has been used in a wide range of settings, populations and diseases in South Africa. 17 The SF-36 and the EORTC QLQ-C30 were also frequently used but the EORTC QLQ-C30 did not meet the review's translation criteria.

Table 2 (fragment). Breast: The instrument consists of the FACT-General plus a breast cancer subscale, becoming a 37-item questionnaire focussing on five domains of HRQoL in breast cancer patients.
Within the context of using the studies for clinical and priority setting decisions in a multi-cultural country, it was concerning that only 39 out of the initial 144 studies included (27.1%) reported the use of translated instrument versions. However, this is higher than that reported by Bello et al., 57 who found that only 14.0% of the studies in their systematic review on the properties of stroke quality of life outcomes measures in Africa used translated instruments. The remainder of the findings is consistent with the reported literature. A 2003 systematic review of the translation and adaptation of generic HRQoL measures in Africa, Asia, Eastern Europe, the Middle East and South America highlighted that in the 1990s several publications raised concern about the quality of translated versions of HRQoL instruments, focusing on the quality of the methodology. 58 They found very few studies that reported that the instruments were assessed for equivalence during translation and none considered it in any detail. The authors concluded that more work needs to be performed to improve translations. Yet despite several decades of experience and numerous translation and adaptation frameworks, guidelines and good practice principles, 59,60,61 problems remain with instrument translation into non-English languages. For example, more recently Al Sayah et al. 62 also observed that reporting details were lacking in Arabic translations and cross-cultural adaptations of HRQoL instruments. Consequently, the authors could not evaluate the translation quality of most of the instruments identified. Reporting details were also lacking for Brazilian Portuguese generic and cancer specific PROMs, and most problems were with the forward translation, back translation and expert review steps. 63
This study's finding that there was a lack of a pilot and cognitive debriefing step by half of investigators who created their own translations is also not unique: a systematic review of childhood HRQoL measures in sub-Saharan Africa found only two instruments (out of 10 identified) that attempted to establish cross-cultural adaptation through linguistic and conceptual equivalence testing. 64 Although not explored in these analyses, it is worth reflecting on the possible reasons for the lack of improvement in translation quality. For instance, the importance of high-quality translations, and the impact of poor translations and adaptations of instruments outside the countries in which most HRQoL research is conducted, may not receive the necessary attention in the literature read by researchers and clinicians interested in HRQoL. Certainly, HRQoL research between 2000 and 2019 was predominantly conducted in North America and Europe, and the studies were published in English language journals 65 (which are owned by publishing companies based in these regions). Furthermore, researchers in low- and middle-income countries may also not have the resources available to conduct translations according to the guidelines and the guidelines themselves may not account for the complexities. These last two points were identified and discussed in detail by De Wet et al. in a case study of their study's research documents that were translated into isiXhosa, 66 and were also highlighted in some of the publications included in this review. 11,13 Finally, the studies included in this review were mainly in one disease area (HIV). The 2019 review also found a strong focus on HIV and showed that HRQoL data in chronic conditions that contributed the most to the country's burden of disease were lacking. 17
Given the existing requirement in the SAGPS and the draft NEML HTA guidelines to use a South African validated HRQoL instrument for economic evaluations and the gaps identified in this and the 2019 review, further work may be needed if CUA is to form the basis of HTA under NHI. For example, if the EQ-5D were to be used for HTA under NHI, a South African value set will be needed. Until such a value set is created, one from another country must be used, which increases uncertainty in CUA results and limits health policy decisions.
Measuring patient outcomes is the basis for decisions about the most appropriate treatment for a patient, tracking healthcare quality, evaluating service performance, and priority setting such as funding of cost-effective treatments.

Table 4

Researchers and research organisations/clinicians and clinical organisations
What: Use HRQoL measurements with existing translated versions available from instrument developers regardless of the purpose.
Explanation: Evidence from this study suggests that such measurements have likely been created according to best practices that reflect standardised and tested approaches, thus ensuring the instrument is valid in the target population.

Researchers and research organisations/clinicians and clinical organisations
What: In the absence of a translated version from the developer, retrieve and employ the translation methods suggested by the developer and, ideally, obtain their support or input into the process.
Explanation: This could maintain the validity of the adapted measure because the evidence from this review suggests that developers' methods are most likely to conform to current best practice. Instrument developers are also likely to have access to individuals or organisations who can contribute towards the process.

Researchers and research organisations/clinicians and clinical organisations
What: When translating an existing instrument, consult and follow guidelines and best practice documents that report recommended PROM translation methods.
Explanation: These documents reflect consensus views on acceptable, standardised methods 19 and are likely to meet regulatory and HTA agencies' requirements for validated PROMs relevant to the local context.

Researchers and research organisations/clinicians and clinical organisations
What: For generating HRQoL data for CUAs choose generic HRQoL MAUIs.
Explanation: Health technology assessment agencies and funding decision-makers prefer generic MAUIs over disease specific instruments. 6 In addition, disease specific instruments or non-MAUIs will require mapping to MAUIs, which increases uncertainty in the CUA results. 7

Instrument developers
What: Provide user-support guides such as translation manuals and/or provide contact information to alert researchers intending to translate instruments to the developer's interest in, and availability for, collaborative work.
Explanation: This would support the creation of valid instruments and, as suggested by this review, is likely to increase the use of the instrument in populations speaking the target language compared with instruments that require translation by the study investigator.

Health policy and funding decision makers
What: Require that the HRQoL instruments be valid in the local context and provide guidance on how to establish validity.
Explanation: Local data will be needed for economic evaluations, 16 but guidance on what constitutes a validated instrument for such purposes is currently lacking from the NHI and National Department of Health. Such guidance will support the generation of HRQoL data for HTA and strengthen the level of confidence that decisions are evidence based and relevant to the local context.

Health policy and funding decision makers
What: Provide clear guidance on which HRQoL instruments will be needed for decision-making and how to create such instruments if they do not currently exist in a suitable form.
Explanation: Evidence from this review showed that there is a range of HRQoL instruments suitable for CUAs in South Africa, but not all may be valid in the local context because of the translation methods used. Whilst most translation methods for PROMs recommended by guidelines would achieve comparable results, 59 South African specific guidance on the choice of HRQoL instrument for HTA and how to create such an instrument would provide clarity to researchers, avoid wasteful research and prevent any decision that is not evidence based or relevant to the local context.

HRQoL, health-related quality of life; PROM, patient reported outcomes measure; HTA, health technology assessment; CUA, cost-utility analysis; MAUI, multi-attribute utility instrument; NHI, National Health Insurance.
Consequently, this work has several implications for clinical practice, health policy and HRQoL research. Unless HRQoL measures are used in the native language of the target patients and population, or instruments are translated using sound methodologies, it cannot be certain that the instruments accurately evaluate the patients' views about their health and subsequent health outcomes. This brings into question the validity of the results obtained and compromises the instrument's use in clinical practice and for health policy decisions. Using poorly translated instruments also contributes to wasteful and inefficient resource use when time and money are spent on administering an instrument that may produce unreliable results. This review therefore makes several recommendations based on the study's findings and existing literature (Table 4) specific to HRQoL measurement in any multi-cultural setting.
For South Africa, where national HTA and economic evaluations are going to play a more important role in sustainable patient access to health technologies, both patients and funders would benefit from incorporating the recommendations presented in Table 4. This is because providing high quality HRQoL data for new technologies could give patients faster access to those technologies, whilst both private and public funders would have more certainty that the technologies introduced offer proven value for money.

Strengths and limitations
The analysis was based on a comprehensive literature review, the foundation of which was a 2019 systematic review. However, only one researcher was involved in extracting the translation methods from the publications, which may have resulted in errors. To mitigate this, the extraction form used in Excel® served as a checklist, and the articles were re-evaluated on more than three separate occasions. It is acknowledged that various translation frameworks exist and that the choice of Guillemin et al. and Wild et al. may be considered arbitrary. However, evidence is lacking on the best methods for translation and cross-cultural adaptation and most translation methods recommended by guidelines would achieve comparable results. 59 It is therefore unlikely that the results of this review would have been much different if other translation criteria were used. In addition, the use of publications to measure the quality of research is limited by the information provided in the article and it is therefore possible that the methodology in the studies did follow recommended translation methods but was simply poorly reported. Lastly, the analysis focussed on HRQoL instruments suitable for CUAs, thus it is not a comprehensive review of all PROMs used in South Africa.

Conclusion
The EQ-5D-3L may be best suited for use in South Africa where data are needed for a CUA. However, further work and detailed guidance from the National Department of Health on the most suitable HRQoL instrument for the South African context will be needed if CUAs are to be required as part of the HTA process under NHI. Acting upon the recommendations from this study could result in more robust measurement of HRQoL in South Africa and more informed decisions on the introduction and use of health technologies from both a patient and national healthcare policy standpoint. The recommendations on translation methodology quality are also relevant to clinicians wanting to obtain reliable health outcomes data from the patients' perspective.