Jonathan Green is a reader in child and adolescent psychiatry at the University of Manchester and an honorary consultant child psychiatrist for Manchester Children's Hospitals Trust. His clinical and research interests have focused on disorders of social development and the study of complex treatment interventions in child and adolescent mental health services (CAMHS). He has led cohort studies and four large randomised trials in CAMHS, including most recently the Medical Research Council's Preschool Autism Communication Trial (PACT) of a new psychosocial intervention for autism. He has a particular interest in the measurement of process in clinical trials and has written on the therapeutic alliance.
However, protecting vulnerable populations from research activity can also exclude them from its benefits. For instance, the recent debate over the prescription of selective serotonin reuptake inhibitors (SSRIs) to children and adolescents highlighted how sparse adequate academic research and treatment trials of medication have been in those under 18. Pharmaceutical companies do not need child studies to obtain product licences and it has also often been felt appropriate to protect the young from this and other academic research (Green, 2004). This issue applies to a wide range of treatment research in the paediatric population, arguably to the detriment of children's healthcare in general.
There is a more radical alternative view. If systematic open enquiry is the contemporary guarantor of robust and stable social knowledge within a plethora of easily accessible opinion (Theodosiou & Green, 2003), it can be argued that engaging in research to help generate such knowledge is a social duty rather than something from which to be protected (Harris, 2005). It follows that it would be a professional duty of clinicians to advocate for more research for their patients. This changing ethical perspective coincides with recent changes in theory and practice in trials themselves and their potential place in mental health practice.
Seminal publications from the Medical Research Council (MRC) have presented an approach to the design of trials for complex interventions in health, outlining with useful clarity the steps that should be considered in designing them (Medical Research Council, 2000). A subsequent MRC document (Medical Research Council, 2003) further emphasised the need to adapt trials to test complex interventions used in practice, particularly in the area of psychological and psychosocial treatments. So-called platform funding for trials was introduced, which is intended to support initial proof-of-concept studies, tests of feasibility and development of designs appropriate in complexity. In addition, there was a new emphasis on a collaborative ethos in relation to both clinicians and service users, with a drive to engage the public in the methodology and practice of trials. This can be seen as part of a cultural shift linked to the views articulated by Harris (2005), which aims to put the generation of robust science-based knowledge at the centre of cultural concern rather than as a peripheral specialised issue.
Of course, apparently simpler treatments may also contain hidden complexity. The impact of so-called placebo- (non-drug- or process-) related variance in drug trials is apparently increasing (Fava et al, 2003) and commonly outweighs the effect of the drug itself. And the issue of treatment complexity is applicable not just to mental health interventions: many preventive programmes or complex medical interventions will have the same characteristics.
Why use randomised trials for evaluating complex interventions?
On the face of it, the rigours of the RCT design might seem to be ill-suited to studying situations of high treatment complexity. There has often been an assumption that more qualitative methods are more suitable for use in such situations. But, paradoxically, there are a number of ways in which the randomised design approach is particularly suited to testing complex treatments. This is because it is in the nature of complex treatments to have multiple potential factors, both known and unknown, that have a bearing on outcome. Only an adequately powered randomised design technique allows for these variables to be properly controlled: those that are known and also those that are not known.
In addition, RCTs are the gold standard of treatment trial methodology, and to deprive complex (often psychosocial) interventions of their imprimatur is potentially to undervalue these areas in an evidence-based climate (Medical Research Council, 2003).
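The claim that random allocation balances unknown as well as known factors can be illustrated with a small simulation. This is only a sketch under stated assumptions: the hidden prognostic factor is modelled as a standard normal variable, and the function name `simulate_imbalance` and its parameters are hypothetical, not taken from any cited study. It shows that the average gap between arms on an unmeasured factor shrinks as the trial grows, which is why adequate power matters.

```python
import random
from statistics import fmean


def simulate_imbalance(n_patients, n_trials=2000, seed=0):
    """Average absolute between-arm difference in the mean of a hidden factor.

    Assumption for illustration: the unmeasured prognostic factor is
    standard normal and patients are split 1:1 by simple randomisation.
    """
    rng = random.Random(seed)
    half = n_patients // 2
    gaps = []
    for _ in range(n_trials):
        hidden = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        rng.shuffle(hidden)  # random allocation: first half arm A, rest arm B
        arm_a, arm_b = hidden[:half], hidden[half:]
        gaps.append(abs(fmean(arm_a) - fmean(arm_b)))
    return fmean(gaps)


if __name__ == "__main__":
    for n in (20, 100, 400):
        print(n, round(simulate_imbalance(n), 3))
```

The researcher never needs to observe the hidden factor for this balancing to occur; in a small trial, however, chance imbalance can still be substantial.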
Steps to the development of randomised trials for complex interventions
In the new conceptualisation of trial methodology for complex interventions (Medical Research Council, 2000) there are three broad stages, with the actual randomised trial itself coming only at the end of important development work. These stages offer a series of fascinating intellectual and practical challenges for clinicians and researchers alike. Steps need to be taken that essentially involve a reflection on exactly what the treatment in question involves and what the active ingredients are likely to be. In this sense clinicians are being asked to address questions of the utmost interest about their practice, and questions that should logically be a prerequisite of professional activity: what exactly am I doing and how can I best model its effects?
Modelling the treatment
The first, pre-trial, phase is in many ways the most intellectually stimulating. It is a phase of deconstructing and modelling the treatment to be studied into researchable questions. In this crucial phase there are obviously dangers of either oversimplified reductionism in the modelling which misses key aspects, or an undersimplified re-description that does not allow research questions to be framed. This is where qualitative investigation, pencil and paper or more sophisticated modelling, and user and clinician consultation may be of the greatest usefulness. For a particular treatment this phase may last for years while the experience is gathered and the intervention modelled in various ways. Jump too quickly past this phase and salient aspects of the intervention may be missed and thus not tested for in terms of outcome measures.
Example 1: Modelling in-patient CAMHS treatment
A sequence of clinical modelling can be illustrated in the development of a series of studies in relation to in-patient treatment in child and adolescent mental health services (CAMHS). As a group of practising clinicians in in-patient CAMHS we began to describe and model the different potential components of this highly complex intervention (Table 1). What exactly does the admission experience involve and what might be its key aspects? Which components might be essential to the treatment effect and which incidental? We considered the experience of admission itself and the fact of removal from local family and social environment; then the impact of the general ward environment or milieu, including the effect of other young people and relationships with staff. These general effects were distinguished from specific treatment programmes which might look more like out-patient work but use the ward as a base. We considered the effect of relocation to a unit school. We explored all these aspects in a descriptive way, drawing on the experience of colleagues in the discipline as well as reviewing extant models of their operation and their evidence base. This work culminated in a book that synthesised and extended these discussions (Green & Jacobs, 1998).
Table 1 Example of the modelling of a complex intervention: components of child and adolescent in-patient admission
We were then able to proceed to more specific consideration of the separate components: it is possible to model how familiar out-patient treatments might look within the in-patient setting and be affected by some of the more non-specific aspects of milieu (Green, 2004). The ward milieu itself was studied through application of existing measures and consultation with ward staff (Imrie & Green, 1998), and a new measure was generated to try to capture ward atmosphere so as to be able to test it as a variable in outcome studies. Similarly, for the interpersonal relationships on the ward, we generated a new measure to try to capture the complexity of the therapeutic alliance within the in-patient unit (Kroll & Green, 1997; Green et al, 2001). This alliance would seem at first sight a particularly difficult phenomenon to model: the young person has relationships with the in-patient team as well as with other patients, and the parents have a largely separate set of contacts. However, it proved to be usefully measurable: in fact, the child's alliance proved to be the most powerful independent predictor of health gain during treatment (Green et al, 2001), which emphasised the importance of measuring process variables in treatment studies. Finally, we were able to use health needs assessment methods based largely on interviews with young people to understand the actual experience of the intervention received during admission (and how different this might be from the formal management plans made by the team) and to evaluate how effective the different interventions were.
This modelling of the nature of the intervention and the process of treatment progressed in relation to a series of cohort studies (Green et al, 2001). These studies both stimulated the need for the modelling and were made possible by it. We used them to test hypotheses about the relative impact and effectiveness of the different components of care and what best predicted the outcomes of treatment. Ideally, data from these studies should now be fed back into adjustments to practice and further refinement of the modelling of the treatment and its measurement. We will then be ready to mount a systematic randomised trial against alternative interventions. To date, the process has taken 10 years of collaborative work.
The pre-trial phase must end with a robust operationalisation of what the treatment in question involves, which will usually form the basis of a manual. Such a process raises concerns about a cook-book approach to therapy; a rigidity which diminishes the capacity to respond to patient individuality. But this results only if the modelling is unsophisticated. It is possible to operationalise process as much as content, and build flexibility of response into the protocol. Detailed process modelling may be more relevant for exploratory/efficacy studies; in the classic pragmatic study (see below) some of the detailed elements of the intervention can be left undefined as long as the overall approach is defined well enough to be replicable across the different intervention sites.
Confidence that a variety of treatments can be modelled in this way will be an important counterweight to a predictable tendency otherwise for the design of new psychological treatments to follow lines most easily testable in trials rather than those most adapted to patient need.
Constructing measurement
Another key purpose of the pre-trial phase is to define the parameters against which the treatment should be judged, strategies for deciding how to test it and, crucially, what comparison group should be chosen. Further tasks for the pre-trial phase are the testing and development of relevant measures both for process and outcome and preliminary observational studies to test various working hypotheses.
Co-construction
From a previous position where measures were solely chosen on the basis of theoretical or researcher decision, we are now entering a phase where measurement is likely to become more and more co-constructed with both fellow clinicians and service users. This is both a major challenge and an exciting opportunity. Involving users in this way should increase the face validity and external validity of trial designs, as well as form a step in the process of integrating trials into the general culture. However, clearly moves in this direction must not compromise the essential rigour of a trial. Measures must be fit for the purpose and designed to answer the primary hypothesis of the study. Measurement selection is critical and often underplayed: inadequate or superficially pragmatic measures may lead to the effort of a trial being wasted.
Example 2: Collaborative development of measures
In a new trial of an intervention for preschool children with autism, we are using initial focus groups with parents to identify what aspects of family and child functioning they think would be most relevant for a treatment to change, i.e. what are the key aspects of functioning that matter? The output from these groups is then refined by a process of iteration into a set of likely parameters for consideration. These will be posted on the users' website for a more extensive internet-mediated consultation, before further refinement into a new quantitative measure of family functioning, which will be used in the main trial.
Related questions to professionals and service users can also inform the power calculation for necessary sample size by establishing a clinically relevant number needed to treat (NNT) figure. The question to professionals could be: 'For you to decide to include this new treatment in your service, what clinical effect size would be necessary, i.e. how many cases treated to achieve one positive outcome?' Such dialogue with professionals and service users (and commissioners) will be a key feature in integrating trials within the mainstream of clinical planning and evidence-based medicine. Clinicians will be more influenced by trials if they are involved in their design and if they see that the trial is measuring things that are relevant to them. Increasingly, the major funding bodies are requiring that such consultations have taken place to convince them of the feasibility of a new trial.
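The link between a clinically agreed NNT and the power calculation can be sketched as follows. This is an illustrative calculation only: the 30% control response rate and the target NNT of 5 are invented figures, the function names are hypothetical, and the sample size uses a standard normal-approximation formula for comparing two proportions rather than any method specified in the sources cited here.

```python
import math
from statistics import NormalDist


def nnt(control_rate, treated_rate):
    """Number needed to treat: reciprocal of the absolute rate difference."""
    return 1.0 / abs(treated_rate - control_rate)


def n_per_arm(control_rate, treated_rate, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect the given difference in
    proportions (two-sided test, normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    diff = treated_rate - control_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / diff ** 2)


if __name__ == "__main__":
    control = 0.30           # assumed response rate under usual treatment
    target_nnt = 5           # assumed answer from the consultation
    treated = control + 1 / target_nnt
    print(nnt(control, treated), n_per_arm(control, treated))
```

A more demanding consultation answer (a smaller NNT) implies a larger difference to detect and so a smaller trial; a modest acceptable benefit (a larger NNT) pushes the required sample up sharply, and attrition would need to be added on top.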
The exploratory pilot trial
Before the fully powered RCT receives funding it is usually necessary to run a preliminary exploratory or pilot trial. Here the operationalisation and manualisation of the treatment will be tested and practical matters focused on: can the treatment be delivered reliably in different sites to high enough standard? Can sufficient patient numbers be collected and how much attrition can be expected during the trial? Will the idea of the trial and its measurement be accepted by patients and practitioners? What are the correct dosage effects and should trials of different dosages of the intervention be tried? (This does not necessarily apply just to medications: dosage might be the frequency of a psychological intervention.) What are the effect sizes shown in the measures of change and how well do they reflect the functioning of the treatment?
The main study design variations
Exploratory v. pragmatic trial design
Exploratory (efficacy) trials and pragmatic (effectiveness) trials are often contrasted (Jadad, 1998; Harrington et al, 2002; Box 1). This distinction derives from the classic procedure of first validating a useful treatment in controlled conditions and then studying whether such efficacious treatment will generalise effectively into routine clinical practice. This is a model suited to much pharmaceutical or laboratory-based treatment development, but may be of less conceptual value in developing and testing complex mental health interventions, where the context may be part of the object of study and where the high costs of trials may mean that it is impracticable to plan such separate stages. Nevertheless, the distinction does help clarify the issue of fitness of trial design to purpose.
Box 1 The spectrum of pragmatic and explanatory trials
Explanatory (efficacy) trials
Pragmatic (effectiveness) trials
Efficacy trials are organised to test mode of action as well as outcome of treatment. The design must have high internal validity; that is, it must treat the most homogeneous population group possible (to restrict variance in sampling) and must try to restrict comorbidity. From the treatment perspective it must ensure the highest level of fidelity and consistency of treatment administration that is possible. It must address precise questions with a priori sub-analysis. Difficulties with efficacy designs of this kind are that they are extremely difficult to achieve in everyday psychiatric practice, as in the real world it is difficult to obtain such a pure sample or to ensure such consistency of treatment intervention. Even if such things are achieved, the efficacy trial is often compromised in terms of answering practical questions because of the lack of external validity: what is being studied in the trial bears little relationship to what happens in everyday clinical practice.
At the other end of the spectrum, effectiveness trials should have high external validity. That is, they should test as far as possible the way treatments are actually delivered in clinical practice. This is their great strength. The reciprocal weakness is the variation in trial population and details of treatment, particularly since the definition of the treatment often has to be more flexible for a pragmatic trial, for instance including patient preference (see below). There are various ways of dealing with these problems within the trial design, but the end result often is that they need larger sample sizes to maintain statistical power to identify treatment effects. Thus, pragmatic trials tend to need large samples with well-targeted and broad outcome measures in order to detect moderate treatment effects in practice (Hotopf, 2002; Harrington et al, 2002).
However, the idea is now becoming more accepted that large trials in mental health should be trying to address both pragmatic questions (Does it work? Is it cost effective?) and explanatory ones (How does it work? What components are responsible for efficacy, costs and patient-related outcomes? Can it be tailored to work more effectively or cost-effectively with particular types of patient?). The view is gaining ground that there is no reason why improving both the design and analysis of a trial to answer the explanatory questions of scientific interest should compromise its ability to answer the management-oriented pragmatic one. At its best the complex intervention trial will be a sophisticated clinical experiment designed to test the theories motivating the intervention and also help understand the underlying nature of the clinical problem being treated, in the context of patient- and service-level characteristics. It is important that these trials explicitly consider how and why the treatments work clinically and have their impact on economic outcomes (Kraemer et al, 2002; Kazdin & Nock, 2003; Oakley et al, 2006). The virtues of explanatory-type designs can be supplemented by prior consultation and co-construction of measurement; the virtues of pragmatic designs by the addition of process measurement (see below).
One outcome or many?
The classic trial discipline is to have one pre-specified primary outcome measure which tests the key hypothesis of the trial (what Bradford Hill called the essential precisely framed question; Hill, 1955). Secondary outcome and intermediate measures do give an opportunity for testing wider aspects of outcome but they are used only sparingly. Although this kind of trial discipline acts against the construction of post hoc fishing expeditions in the data, there is increasing questioning whether this kind of rigour is really appropriate for testing complex interventions where the outcomes of relevance are unlikely to be unitary or totally simple and where the intermediate effects are similarly complex. Such ambitions imply that more than just single simple outcome measures may be needed. However, the power of the study has to be adequate to carry these more complex measures.
Patient preference
Anticipated patient resistance to random allocation has led to the use of preference trial designs, which allow patients to opt for a preferred treatment rather than be randomly allocated. This can result in a cohort study with an RCT embedded in it (Brewin & Bradley, 1989). Variations on preference trials include Zelen's design (Zelen, 1979). Here, an identified patient group is randomised before consent is sought. Those allocated to treatment as usual never know that they are in a trial. Those allocated to the experimental intervention are approached for consent. Patients who decline to participate are given the standard intervention but analysed under intention to treat as if they had had the experimental intervention. Quite apart from the (significant) ethical issues about undertaking randomisation prior to consent, it is not possible for such trials to be masked. The ethical concerns of not telling patients that they have been randomised can be met by telling each participant, after randomisation, to which group they have been allocated. They can then choose to swap to the other treatment if they wish, but are considered in the original treatment arm for the purposes of the intention-to-treat analysis. Researchers disagree about the value of preference designs. Relatively larger samples are needed to allow for statistical modelling of the outcome, and this may well make the trial impracticable. It may be that, if randomised designs gain more cultural acceptance, the need for these preference variations will disappear.
Supplementing intention-to-treat analysis
Intention-to-treat analysis is typical of pragmatic trial designs. Data on all participants recruited are analysed, whether or not they completed the trial. This is in keeping with the philosophy that the trial tests the effect of the offer of an intervention to a patient group and avoids the potential bias of only studying patients who adhere to the treatment. This analysis can be supplemented by modelling such as the complier average causal effect (CACE) analysis (Angrist et al, 1996), which allows for the real-life situation where a proportion of patients switch arms of the trial during the treatment phase. However, such analysis generally needs larger sample sizes.
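The relationship between the intention-to-treat estimate and a complier average causal effect can be sketched with the simplest ratio form of the estimator: the intention-to-treat difference in outcome divided by the between-arm difference in treatment actually received. This sketch assumes the conditions usually attached to the approach of Angrist et al (1996), namely random allocation, no effect of allocation on outcome except through treatment received, and no patients who systematically do the opposite of their allocation. The function names and toy data are hypothetical.

```python
from statistics import fmean


def arm_difference(assigned, values):
    """Mean of `values` in the arm allocated to treatment (1) minus the
    mean in the arm allocated to control (0)."""
    active = [v for z, v in zip(assigned, values) if z == 1]
    control = [v for z, v in zip(assigned, values) if z == 0]
    return fmean(active) - fmean(control)


def cace(assigned, received, outcome):
    """Ratio estimate of the complier average causal effect:
    ITT effect on outcome / ITT effect on treatment receipt."""
    itt_outcome = arm_difference(assigned, outcome)
    itt_receipt = arm_difference(assigned, received)
    if itt_receipt == 0:
        raise ValueError("allocation did not change treatment receipt")
    return itt_outcome / itt_receipt


if __name__ == "__main__":
    assigned = [1, 1, 1, 1, 0, 0, 0, 0]   # randomised allocation
    received = [1, 1, 1, 0, 0, 0, 0, 0]   # one allocated patient declined
    outcome = [6, 6, 6, 2, 2, 2, 2, 2]    # toy outcome scores
    print(arm_difference(assigned, outcome), cace(assigned, received, outcome))
```

In the toy data the intention-to-treat effect is diluted by the patient who did not take up the treatment, whereas the ratio estimate rescales it to those who complied. Because the denominator is itself estimated, the result is less precise than the intention-to-treat estimate, which is one way of seeing why larger samples are needed.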
Using trials to study development
Hybrid trial designs have been suggested that combine the virtues of an explanatory randomised intervention trial with a longitudinal developmental study to form a potentially powerful way of investigating the development of disorders over time (Howe et al, 2002). In essence, the active intervention is seen as a controlled perturbation of the development of the disorder. In so far as the intervention changes variables thought to be central in the evolution of a disorder, then by comparing the longitudinal development of each arm using repeated measures the trial can act as a natural experiment to test developmental hypotheses. For such a design to work, the active intervention has to be able to make discrete changes in key developmental (mediating) variables as well as affecting target outcomes.
Control groups
Various interesting problems arise in relation to choosing control groups. First, should the control be treatment as usual, no treatment or a contact condition in which the additional therapist time in the active arm is balanced by non-specific additional therapist time in the contact arm? In the study of complex interventions it will be necessary to decide what key variables should be considered when constructing the control group condition. In wholly pragmatic trials, there is a strong argument that the best control condition is treatment as currently practised, since this reflects the practical question at issue: Does the test treatment confer additional benefit over best current practice treatment? (Harrington et al, 2002). However, the limitation of such a design is that one cannot be sure whether the treatment effect found is due to the specific properties of the actual intervention or to some other, more non-specific, therapeutic effect. It is for this reason that the study of treatment process variables has become of increasing interest.
Second, they may take the form of more general factors that have an impact on the effectiveness of a treatment, for instance the patient's pre-treatment functioning, their relationship with the therapist, their motivation or the therapist's fidelity to the treatment model. These factors are often called moderators.
The terms mediation and moderation have been used in varying ways. An early formulation (Baron & Kenny, 1986) suggested that a mediator directly influences the treatment outcome, whereas a moderator affects the relationship between treatment and outcome. Kraemer et al (2002) add clarity and rigour to this definition (Box 2). Here, a moderator must be a baseline or pre-randomisation characteristic which can be shown to interact with treatment to affect outcome. A mediator of treatment has to be a change occurring during treatment which is correlated with the specific treatment chosen and has a main or interactive effect on outcome. A moderator must therefore precede the intervention in time, and be independent of an association with treatment. For instance, lack of social support before treatment is not the same as change of social support during treatment. Moderators cannot explain the overall effect of treatment but can indicate individual characteristics or circumstances associated with greater treatment effects. Mediators identify possible mechanisms through which the treatment might achieve its effect. By this strict definition, treatment alliance would not qualify as a moderator, although some baseline social competency in the patient that might be a factor in generating alliance could be a moderator.
Box 2 Mediators and moderators of treatment outcomes
Moderator: A baseline (pre-treatment) characteristic that shows statistically an interactive effect with treatment on outcome
Mediator: An event or change occurring during treatment, altering with treatment and showing statistically a main or interactive effect on outcome
(Adapted from Kraemer et al, 2002)
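The moderator definition in Box 2 can be made concrete with a toy calculation. The sketch below assumes a binary baseline characteristic and compares the treatment effect in patients with and without it; in a two-by-two layout this difference of differences equals the treatment-by-moderator interaction term. The field names (`treated`, `outcome`, `baseline_competence`) and the data are invented for illustration, and a real analysis would also need a standard error for the contrast.

```python
from statistics import fmean


def treatment_effect(rows):
    """Mean outcome in treated patients minus mean outcome in controls."""
    treated = [r["outcome"] for r in rows if r["treated"]]
    control = [r["outcome"] for r in rows if not r["treated"]]
    return fmean(treated) - fmean(control)


def moderation_contrast(rows, moderator):
    """Treatment effect where the baseline moderator is present minus the
    effect where it is absent (the interaction in a 2x2 layout)."""
    present = [r for r in rows if r[moderator]]
    absent = [r for r in rows if not r[moderator]]
    return treatment_effect(present) - treatment_effect(absent)


if __name__ == "__main__":
    # Hypothetical trial data; baseline_competence is measured before
    # randomisation, as the definition of a moderator requires.
    rows = [
        {"treated": True, "baseline_competence": True, "outcome": 8},
        {"treated": True, "baseline_competence": True, "outcome": 10},
        {"treated": False, "baseline_competence": True, "outcome": 4},
        {"treated": False, "baseline_competence": True, "outcome": 4},
        {"treated": True, "baseline_competence": False, "outcome": 5},
        {"treated": True, "baseline_competence": False, "outcome": 5},
        {"treated": False, "baseline_competence": False, "outcome": 4},
        {"treated": False, "baseline_competence": False, "outcome": 4},
    ]
    print(moderation_contrast(rows, "baseline_competence"))
```

A contrast near zero would suggest the characteristic does not moderate the treatment effect, even if it predicts outcome in both arms.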
Table 2 categorises variables as mediators, moderators or neither on the basis of the stage at which they are measured and their relationships with treatment and outcome.
Table 2 Identification of a variable as a mediator, a moderator or neither
There are a number of reasons for studying these process variables.
Shortcomings in current research on treatment process
There has been a great deal of study of certain process measures, particularly in psychotherapy research, but methodology has often been weak (Kazdin & Nock, 2003; Hill & Lambert, 2004). Typical problems include the following.
In response to these shortcomings, Kazdin & Nock (2003) propose criteria for the more rigorous establishment of the validity of process variables in a study (Box 3).
Box 3 Rigorous criteria for identifying process variables
(Adapted from Kazdin & Nock, 2003)
Testing for moderating and mediating effects
A number of steps are recommended to test statistically for mediating effects (Box 4). One of the advantages of the RCT design is that it allows a more powerful version of this type of analysis using linear modelling to test differences in process between the treatment group and the control group (Kraemer et al, 2002).
Box 4 Testing a variable for statistical mediation
(Adapted from Baron & Kenny, 1986)
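A minimal sketch of a regression-based mediation test follows. It uses the sequence of regressions commonly attributed to Baron & Kenny (1986): treatment predicting outcome, treatment predicting the candidate mediator, and the mediator predicting outcome with treatment held constant, after which the remaining direct effect of treatment is inspected. The function names and data are hypothetical, the slopes are ordinary least squares point estimates only, and no significance testing is attempted.

```python
from statistics import fmean


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def slope(x, y):
    """Least-squares slope of y regressed on the single predictor x."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y regressed on x1 and x2 together."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s12 * s2y) / det, (s2y * s11 - s12 * s1y) / det


def mediation_steps(treatment, mediator, outcome):
    total = slope(treatment, outcome)       # treatment -> outcome
    a = slope(treatment, mediator)          # treatment -> mediator
    # mediator -> outcome with treatment held constant, plus the
    # direct treatment effect that remains once the mediator is included
    direct, b = two_predictor_slopes(treatment, mediator, outcome)
    return {"total": total, "a": a, "b": b,
            "direct": direct, "indirect": a * b}


if __name__ == "__main__":
    treatment = [0, 0, 0, 0, 1, 1, 1, 1]
    alliance = [1, 2, 3, 2, 3, 4, 5, 4]     # hypothetical process measure
    outcome = [2 * m for m in alliance]     # outcome driven by alliance only
    print(mediation_steps(treatment, alliance, outcome))
```

In this invented data set the whole treatment effect runs through the process measure, so the direct effect falls to zero once it is included. Real data would rarely be so clean, and the process variable must be shown to change before the outcome does for the interpretation to hold.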
An example of a process measure: the therapeutic alliance
The therapeutic alliance (Hougaard, 1994; Green, 2006b) refers to a variety of interactional and relational factors operating between therapist and client in the delivery of treatment. Although therapeutic alliance is traditionally thought of in the context of psychodynamic therapies, there is no reason why it should be confined to this form of treatment. Nor should its measurement in a trial be taken as implying a psychogenic aetiology of the condition treated. The quality of the therapeutic alliance may be part of an effective psychological treatment for disorders of all aetiologies, including organic ones (Green, 2006b).
The importance of alliance relates partly to its face validity: clinicians consistently rate the therapeutic relationship as crucial to outcome (Kazdin et al, 1990). But there is also strong empirical evidence that the quality of alliance predicts outcome independent of other factors. Meta-analysis of studies in both adult (Martin et al, 2000) and child (Shirk & Karver, 2003) mental health treatment shows a consistent overall correlation of alliance with treatment outcome of about 0.2. More detailed studies within randomised designs have tended to suggest that the quality of alliance is not specific to a particular treatment style and that it is a powerful independent predictor of outcome.
For example, in one randomised trial (Krupnick et al, 1996) three interventions (cognitive therapy, interpersonal therapy and pharmacotherapy), along with placebo control, were studied in the treatment of adult depression. Therapeutic alliance was measured through structured observations at three time points during the treatment. Results showed that the quality of patient alliance was similar across all arms of the trial and independent of baseline symptoms. Alliance showed a strong independent effect on outcome in all arms (r = 0.46), explaining 19% of the outcome variance. Controlling for pre-treatment severity, patients with a good therapeutic alliance were 17.2 times more likely to show post-treatment remission of their depression. These effect sizes were larger than those associated with the specifics of each treatment. The therapist's component of the alliance did not show much predictive effect, but this is probably because the structured protocol in the trial had trained the therapists to an extent that there was little variance in therapist effectiveness.
Similarly, the meta-analysis of trials involving children (Shirk & Karver, 2003) found important effects of alliance on outcome, particularly in externalising disorder and when the alliance was measured later in treatment by professionals (who also, however, often rated the outcome in question).
Steps to improve the study of alliance
In line with the suggestions in Box 3, Kazdin & Nock (2003) have made a number of recommendations for better testing of the therapeutic alliance in treatment trials:
One study testing for the direction of effects between the alliance process and outcome has been undertaken in adults (Barber et al, 2000). This did show some ongoing reciprocal effect of early symptom change on evolving alliance. However, when this was controlled for there was still a remaining overall effect of early alliance on eventual treatment-term outcome.
Furthermore, inclusion of process measures immediately increases the face validity and reality of treatment trials for clinicians and other practical consumers of the research. Process measures usually tap the clinical feel of what a study is testing, reducing the sense that an RCT is a rather artificial design.
Enthusiasts who have promoted the values of the RCT within mental health research have long felt that it has particular qualities to illuminate the complex processes involved in mental health interventions. These modern developments in RCT design, including the measurement of process, may make it more likely that clinicians will agree.