Recommendations for the guideline were formulated based on the GRADE Evidence-to-Decision (EtD) approach for public health interventions (64).
Familiarizing the GDG with the evidence
Prior to the GDG proposing draft recommendations in its second meeting series (September 2022), the GDG received an EtD summary document for each intervention, including GRADE evidence profile tables for each comparator, with detailed subpopulation analyses. In addition to evidence profile tables, each summary document provided the qualitative evidence (including confidence in the evidence) for the EtD domains of values and preferences, resource implications, equity, acceptability and feasibility of the interventions and their outcomes from the perspective of older people and/or their carers. Definitions of the EtD domains considered are summarized in (7).
EtD domains that informed the direction and strength of a recommendation.
Proposing recommendations
The GRADE recommendation formulation approach was guided by the independent methodologist using a semi-structured facilitation process. The GDG was supported to interpret the evidence of benefits and harms for each intervention by reviewing GRADE evidence profile tables with reference to the GRADE certainty-of-evidence assessments and considering what a clinically worthwhile effect would be.
Determination of what is a clinically worthwhile effect for an intervention is context-specific and should consider what is meaningful to people with CPLBP (78, 79). The GDG was responsible for determining worthwhile benefit and risk of harm for an intervention, considering the effect size estimates from systematic reviews and other factors related to the delivery and accessibility of an intervention and what might constitute a meaningful benefit for a person with CPLBP. To assist the GDG, suggested benchmarks for effects that might be clinically worthwhile were provided by systematic review teams as a guide, rather than a definitive threshold, for the GDG to consider. This approach avoids forcing a decision based on a single (arbitrary) effect size. For example, the following guiding benchmarks on clinical relevance were provided to the GDG:
Small: mean difference (MD) 0.5 to 1.0 point on a scale of 0 to 10, or equivalent; standardized mean difference (SMD) 0.2 to 0.5; or risk ratio (RR) 1.2 to 1.4. Effects were considered “trivial” where the estimates were statistically significant, but of a magnitude smaller than the thresholds for a “small” effect.
Moderate: MD > 1 to 2 points on a scale of 0 to 10, or equivalent; SMD > 0.5 to 0.8; or RR 1.5 to < 2.0.
Large: MD > 2 points on a scale of 0 to 10, or equivalent; SMD > 0.8; or RR ≥ 2.0.
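As an illustration only, the guiding benchmarks above can be written as a simple lookup. The sketch below is not part of the guideline methods. The function name `classify_effect`, the use of absolute values for MD and SMD, the assumption that RR is expressed so that values above 1 indicate the effect of interest, and the assignment of RR values between 1.4 and 1.5 to the “small” band are all assumptions made here. The label returned for non-significant estimates below the “small” benchmark is likewise hypothetical. Consistent with the text, the result is a guide for discussion rather than a definitive threshold.

```python
def classify_effect(measure: str, value: float, significant: bool = True) -> str:
    """Map a point estimate to the guiding benchmark bands (illustrative only).

    measure: "MD" (0 to 10 scale or equivalent), "SMD", or "RR".
    value: the point estimate; RR is assumed to be oriented so that > 1
           is the effect of interest (an assumption, not from the source).
    significant: whether the estimate is statistically significant.
    """
    if measure in ("MD", "SMD"):
        # (lower bound of small, upper bound of small, upper bound of moderate)
        small_from, moderate_above, large_above = (
            (0.5, 1.0, 2.0) if measure == "MD" else (0.2, 0.5, 0.8)
        )
        size = abs(value)  # assumption: direction of effect is ignored
        if size > large_above:
            return "large"
        if size > moderate_above:
            return "moderate"
        if size >= small_from:
            return "small"
    elif measure == "RR":
        if value >= 2.0:
            return "large"
        if value >= 1.5:
            return "moderate"
        # Assumption: the unlisted range between 1.4 and 1.5 is treated as small.
        if value >= 1.2:
            return "small"
    else:
        raise ValueError(f"unknown measure: {measure!r}")

    # Below the "small" benchmark: "trivial" only if statistically significant.
    # The alternative label is a hypothetical placeholder.
    return "trivial" if significant else "below small benchmark"


if __name__ == "__main__":
    print(classify_effect("MD", 1.5))
    print(classify_effect("RR", 1.45))
```

Keeping the cut points in one place makes it easy to see that the bands are conventions supplied by the review teams, which the GDG could weigh against delivery, accessibility and what is meaningful to people with CPLBP.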
The GDG was guided in making a judgement on the benefits, harms, and balance of benefits and harms for each intervention by considering the quantitative evidence and the certainty of the evidence, as presented in GRADE evidence profile tables and as summarized by systematic review suppliers in the meeting. The GDG was then guided in making judgements on the other EtD domains. Judgements regarding values related to the outcomes and preferences related to the interventions were based on GDG members’ experience, knowledge and observations of their own contexts, as were judgements related to acceptability and feasibility. Discussion and judgements regarding equity and human rights during the GDG meeting focused on vulnerable populations, such as older people. GDG members with lived experience and/or particular expertise in gender, equity and human rights were expressly asked to comment on the potential consequences of implementing or not implementing an intervention with respect to this domain.
For the EtD domain “resource implications”, the GDG considered up to three information sources, including:
evidence from the qualitative evidence synthesis, where data were available;
evidence of resource burden from the included trials, such as the number and duration of treatment sessions; and
GDG members’ own knowledge and experience relating to treatment costs within their setting/region, while acknowledging that subsidies for treatment would vary between health systems.
Judgements relevant to values and preferences, acceptability, feasibility, and equity and human rights pertaining to older people specifically were informed by the qualitative evidence synthesis. Where qualitative evidence was lacking for a particular domain or intervention, judgements were determined based on the experience and observation of the GDG, as for all adults.
Guided by the methodologist, initial draft recommendations were proposed by means of a process which was designed to achieve consensus among GDG members regarding whether to make a recommendation (or not make a recommendation), and the direction and strength of that recommendation, when proposed. Consensus is defined as the situation in which a pre-specified threshold of GDG members (as determined by the GDG) can agree to, or can “live with”, the recommendation as proposed (direction and strength), and includes the option to make “no recommendation”, where appropriate. Consensus is still considered to have been achieved where there are minor disagreements concerning the supporting remarks or key considerations, or their supporting rationale statement.
Principles and thresholds for decisions on recommendations
Principles and thresholds for agreeing on recommendations and other decisions were established at the beginning of the second GDG meeting series (September 2022). The GDG decided that a quorum of at least two thirds of GDG members needed to participate in any proposal of a recommendation, and that at least 60% of GDG members in attendance needed to agree on the proposal to formulate (or not) a recommendation, along with its direction and strength. All proposals for recommendations (or not to make a recommendation) were finally presented to the entire GDG in its fourth meeting in April 2023 (see Section 2.6), allowing for further discussion and out-of-session voting to ensure all members could vote on and/or approve a final recommendation decision for each intervention. Where a decision about the strength of a recommendation in a particular direction was needed (i.e. conditional versus strong), at least 70% of GDG members were required to endorse it as strong, which is consistent with published criteria when using the GRADE approach in guideline development (80).
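The decision rules described above amount to three proportion checks. The following minimal sketch illustrates that arithmetic. The function name `recommendation_outcome`, the returned labels and the membership numbers in the usage lines are hypothetical, and the sketch assumes that the 70% threshold for a strong recommendation is calculated among members in attendance, which the text does not specify. It does not model the subsequent out-of-session voting by the entire GDG.

```python
def recommendation_outcome(total_members: int, attending: int,
                           agree: int, endorse_strong: int) -> str:
    """Apply the quorum, agreement and strength thresholds (illustrative only).

    total_members:  size of the GDG
    attending:      members participating in the proposal
    agree:          attending members who agree with the proposal
    endorse_strong: attending members who endorse a strong recommendation
                    (assumption: counted among those attending)
    """
    if not (0 < attending <= total_members):
        raise ValueError("attending must be between 1 and total_members")
    if not (0 <= agree <= attending and 0 <= endorse_strong <= attending):
        raise ValueError("vote counts cannot exceed the number attending")

    # Integer arithmetic avoids floating-point rounding at the boundaries.
    if 3 * attending < 2 * total_members:    # quorum: at least two thirds
        return "no quorum"
    if 100 * agree < 60 * attending:         # agreement: at least 60%
        return "no consensus"
    if 100 * endorse_strong >= 70 * attending:  # strong: at least 70%
        return "strong"
    return "conditional"


if __name__ == "__main__":
    # Hypothetical GDG of 24 members with 16 in attendance.
    print(recommendation_outcome(24, 16, 12, 12))
    print(recommendation_outcome(24, 16, 12, 11))
```

With 16 of 24 members attending, the quorum is met exactly, so the boundary cases show why the comparisons are written as "at least" rather than "more than".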
The GDG considered the direction and strength of a recommendation using the criteria outlined in the table of EtD domains that informed the direction and strength of a recommendation. The GDG also considered related recommendations for older people, where relevant, and formulated accompanying remarks that should be considered in conjunction with the recommendation. In general, strong recommendations were limited to those recommendations where there was moderate or high certainty evidence in support of a balance in favour of benefits over harms (or vice versa), and/or when interventions were judged to be highly acceptable, feasible, would increase equity and where people with CPLBP placed a high value on the outcomes of the intervention. Conditional recommendations were made when overall certainty was low or very low, and/or when the judgements in other domains indicated variability or uncertainty. In most situations, guidance from WHO about an intervention is needed, although WHO processes allow for no recommendation to be made in some circumstances. A good practice statement was formulated to reflect a body of indirect evidence that is difficult to summarize and indicates that the desirable consequences of the intervention far outweigh its undesirable consequences, and that as such the intervention is recommended. The table of operational definitions for developing recommendations defines these categories (7).
Operational definitions for developing recommendations.