Introduction
Research design
DARP evaluation results
Implications of the DARP follow-up study results
Author: S. B. SELLS, D. D. SIMPSON
Pages: 1 to 11
Creation Date: 1979/01/01
In 1968 the Institute of Behavioral Research (IBR), Texas Christian University, initiated a systematic prospective research programme on the effectiveness of drug abuse treatment. The programme, which is still in progress, has come to be known by the acronym DARP, the initial letters of the name of the reporting system in which the data were collected, the Drug Abuse Reporting Programme. The DARP data base involves records on approximately 44,000 clients at 52 widely dispersed, federally supported treatment centres in the United States and Puerto Rico. Data were obtained for all clients admitted to treatment at these centres between 1 June, 1969 and 31 March 1973; clients were divided for purposes of the research into three cohorts, admitted in 1969-1971, 1971-1972, and 1972-1973. This report is based on results for the first two cohorts, for which post-treatment follow-up studies, approximately five years after admission, have been completed.
The data collected on all admissions included (1) at admission - client demographic and background data and baseline measures on the criteria representing in most cases the two-month period prior to admission, and (2) during treatment - client status and progress, recorded for successive two-month periods for performance in the following outcome areas: opiate drugs used, non-opiate drugs used, alcohol consumption, work on legitimate (as opposed to illegal) jobs, sources of income, residence, and criminality (arrests and time in jail). The admission (intake) and status evaluation (progress) reports were completed by trained counsellors at each treatment centre and forwarded to the IBR, identified only by code numbers to protect the anonymity of the clients. IBR editors received and processed the reports for entry into the computer files. In addition, post-treatment follow-up interviews have been completed on samples of the first two cohorts and the field data collection for cohort 3 was scheduled to be completed in May 1979. These samples represent about 15 per cent of each cohort, randomly selected after stratification by treatment modality, treatment centre, sex, race, and time in treatment. Clients in the follow-up samples were located and interviewed by trained professional interviewers following a standard interview protocol that covered behavioural outcome variables similar in content to those recorded during treatment.
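The stratified selection of the follow-up samples described above can be illustrated with a minimal sketch. It is not the procedure actually used by the IBR; the record fields (`modality`, `sex`), the function name, and the rule of taking at least one client per stratum are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(clients, strata_keys, fraction=0.15, seed=0):
    """Draw roughly `fraction` of the clients from every stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for client in clients:
        # A stratum is one combination of the stratification variables.
        strata[tuple(client[k] for k in strata_keys)].append(client)
    sample = []
    for members in strata.values():
        # Assumption of this sketch: at least one client per non-empty stratum.
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical records: 400 clients, 100 in each modality-by-sex stratum.
clients = [
    {"id": i, "modality": ("MM", "TC")[i % 2], "sex": ("M", "F")[(i // 2) % 2]}
    for i in range(400)
]
sample = stratified_sample(clients, ["modality", "sex"])
```

Sampling within each stratum keeps the follow-up sample proportional to the cohort on the stratification variables, which simple random sampling would only achieve on average.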
The central question frequently asked in relation to the DARP studies is whether or not the data examined indicate that "treatment works". A qualified answer is given in this report, taking into consideration the research design and the limitations of the empirical data. The results of the examination of available evidence suggest that the question involves, not a unidirectional causal model of something (i.e. treatment) that is done to certain persons (clients), but rather a reciprocal model in which treatments (including treatment staff) and clients interact in a therapeutic situation. In this framework, major attention has been directed to the identification of subgroups of clients in each type of treatment who attained favourable and unfavourable results.
* This is a condensed version of a paper scheduled for publication in The British Journal of Addiction in late 1979.
The limitations of the DARP data that critics have mentioned involve the representativeness of the samples of clients, the decision to accept the prevailing clinical practice as a basis for assignment of clients to treatments rather than random assignment, and the use of client self-report for most of the data collected. These criticisms have been answered in detail elsewhere (e.g., Sells, Demaree, Simpson, Joe, and Gorsuch, 1977; Sells and Simpson, in press) and only brief comment is included here. With respect to sampling, it should be noted that at the time the DARP was initiated it included most of the treatment programmes receiving federal support, and even after the federal treatment programme expanded during the early 1970s the DARP coverage also expanded and included large numbers of clients by age, race, sex, type of treatment, region of the country, and major cities in proportion to the total. There appears to be no contraindication, therefore, to using the DARP for research on treatment effectiveness, even though generalization of specific effectiveness rates might be problematic, particularly for the current federal treatment system.
On the matter of random assignment of clients to treatments, the DARP position is that in addition to being unfeasible this would have been inappropriate, since randomization prevents realistic assessment of treatment as it occurs in the field at treatment centres operating in the community (and not in a laboratory). In the field setting, clinical judgment at intake, client preference, and the facilities available play an important role and these must be reflected in the data collected. Thus, the violence to experimental statistical methods must be weighed against the violence to realistic data if evaluation is to be meaningful. The DARP investigators have relied on the use of robust analytic methods and replication of studies by different methods to ensure consistency of results. The issue of self-report reflects an unavoidable constraint in the study of illegal behaviour, particularly in an environment in which privacy and confidentiality are overriding requirements. However, even though some underreporting of compromising information is expected, a respectable body of empirical data supports the general reliability and validity of self-report data from drug users. Reliability and consistency checks within DARP have confirmed these findings and validity checks on self-reported treatment experience and incarceration for criminal acts have reflected high agreement (Simpson, Lloyd, and Gent, 1976).
Full details on treatments, client samples, data collection, and analysis of the DARP data have been published elsewhere (see Sells, 1974a, b; Sells and Simpson, 1976a, b, c; Sells et al., 1977). Brief mention is made here of the treatment provided by the DARP programmes and the analytic methods followed.
In 1969 methadone maintenance was a new treatment, but five of the first six treatment centres in the DARP offered methadone maintenance (MM) on an out-patient basis. These and other MM programmes that entered the DARP network during the next two years varied in many respects, as for example, in intake procedures, dosage levels, ancillary treatment components, discipline, intended duration of maintenance, and intended outcomes (e.g. abstinence or continued maintenance). Nevertheless, for the major analyses published thus far, they were all considered as representing the MM modality. The DARP centres also operated therapeutic communities (TC), which can be described as drug-free residential programmes in which efforts were made using group dynamics to alter clients' life styles and value systems to an abstinent, self-sufficient, normal pattern of living. These also varied, mainly in facilities, staffing patterns, intended duration, and certain procedures, but were analysed as a homogeneous treatment modality. Another DARP treatment was out-patient drug-free (DF), which provided group and individual therapy and other services in an out-patient, drug-free environment. This modality also showed variation in facilities, staffing patterns, intended duration, and procedures. Finally, the DARP included short-term (21 days) in-patient and slightly longer (up to 3 months) out-patient or ambulatory detoxification units (DT), in which some therapy was frequently included.
Based on 27,461 clients in the final research sample with complete data, the assignment of clients to treatment was not only not random, but significantly biased. For example, the MM treatment group, which comprised 40 per cent of the DARP sample, was older and had proportionately more males, blacks, and daily opiate users than any other treatment, while the much smaller DF group (21 per cent) was youngest and had most females, whites, and users of non-opiates only at admission. The DT group was closest to MM in composition while the TC group was in between MM and DF (table 1). These differences, which reflect both the effects of clinical judgment of intake officers and compliance with federal guidelines (particularly in the exclusion of youth under 18 and other than long-time addicts from MM) indicated that the treatment samples differed significantly in prognosis for favourable outcomes in treatment. It was therefore of the utmost importance to avoid confounding of the outcome results by failure to control these and other factors that varied between treatment groups. The analytic approach is discussed next.
In the DARP studies it was recognized that a treatment episode is usually only a brief interval in the life of a client and that behaviours selected as outcomes are likely to be accounted for by many factors in addition to treatment. The factors that have been analysed systematically in this regard include (a) demographic classification variables, e.g., sex, age, and race-ethnic group, (b) developmental background factors, including family structure, education, employment history, health history, drug use history, and criminal history, (c) prior treatment for drug use, (d) baseline levels on criteria prior to admission to DARP treatment, and (e) community context factors that might have impact on clients, directly or indirectly (see Sells et al., 1977).
The information recorded at bi-monthly intervals from baseline to treatment termination (as well as for monthly intervals from termination to post-DARP follow-up interview) was restricted to behavioural indicators of conformity to societal norms with respect to use of illicit drugs and alcohol, work on legitimate jobs or time in school or home-making, and involvement in criminal activities. Because of limited resources and reservations concerning the validity of measures available, no systematic effort was made in the DARP to measure motivation, adjustment, personality factors, value orientations, or other intrapsychic factors. The behavioural outcome measures were obtained longitudinally at bi-monthly intervals from baseline (admission) to termination of DARP treatment for all clients, and retrospectively by monthly intervals in the follow-up interview. As a standard feature of data management, all measures were adjusted for time at risk and measures compared across time were defined to represent comparable behaviours and comparable band width for the time periods covered. Time in treatment was also computed from admission to termination of the first DARP treatment episode and this was used as an outcome measure for evaluation of during-treatment performance as well as a predictor of post-DARP outcomes.
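The adjustment for time at risk can be illustrated with a short sketch. The text does not spell out the adjustment, so the rule used here (days spent in jail or hospital are removed from the denominator) is an assumption, as are the function and parameter names.

```python
def percent_of_time_at_risk(days_with_behaviour, days_in_period, days_not_at_risk=0):
    """Express a behaviour as a percentage of the days the client was at risk.

    Assumption of this sketch: days_not_at_risk counts days (for example in
    jail or hospital) on which the behaviour could not have occurred.
    """
    at_risk = days_in_period - days_not_at_risk
    if at_risk <= 0:
        return None  # no time at risk, so the measure is undefined
    return 100.0 * days_with_behaviour / at_risk

# Hypothetical client: employed 30 days of a 60-day period, 20 days in jail.
unadjusted = percent_of_time_at_risk(30, 60)      # 50.0
adjusted = percent_of_time_at_risk(30, 60, 20)    # 75.0
```

Expressing every measure relative to time at risk is what allows a two-month during-treatment report and a multi-year follow-up period to be compared on the same scale.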
Recording of treatment received. Each bi-monthly during-treatment status report indicated whether or not the client had received treatment during the period and provided information for classification by modality. A field study by DARP staff members also described every treatment programme in terms of programme goals, organization, structure, policies, procedures, and processes and resulted in a taxonomic classification of treatment types within modalities. As mentioned earlier, no major DARP follow-up reports have yet been published on outcomes by treatment type. The results reported below are by modality and reflect the treatment received by each client classified, using information reported in the bi-monthly status reports.
Evaluation designs within modalities. The four treatment modalities (MM, TC, DF, and DT) are well defined, despite variations in planned duration, goals, policies, procedures, staffing, and other aspects that have been the basis for efforts in the DARP research to delineate treatment types within modalities. Studies within each modality, exemplified by that of Gorsuch, Butler, and Sells (1974) based on end-of-treatment outcomes and by Simpson, Savage, Lloyd, and Sells (1978) based on outcomes for the first year after DARP termination, have involved two stages. In the first stage, gross differences from baseline to end of treatment (as in Gorsuch et al.) or to a post-treatment time, such as the first year post-DARP (as in Simpson et al.), have been measured for each criterion variable. The second stage involved linear model analyses of variance and covariance or hierarchical multiple regression analyses for each criterion (outcome) measure, using the pre-DARP demographic, background, prior treatment, and baseline measures (and in the Simpson et al. study, during-DARP measures as well) as independent measures to identify factors other than treatment that predict outcomes. Assuming that the gross differences represent the mean for all clients in a treatment, the significant predictors could be used to identify subgroups of clients that had differential outcomes.
This type of two-stage design can answer two important questions. The first stage determines whether or not any significant change occurred for a client cohort, and if so, the magnitude of change on each criterion measure. The second stage is relevant to the question of whether or not such change (even if not significant for the total cohort) differs between subgroups of clients. Demographic, background, baseline, and prior treatment variables (and during-treatment outcomes predictive of post-treatment outcomes) are viewed as definers of meaningful subgroups that have differential outcomes.
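The logic of the two stages can be sketched in a few lines. This is a deliberately simplified illustration: plain subgroup means stand in for the analyses of covariance and hierarchical regressions used in the DARP studies, and the records (days of opiate use per week) are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def gross_change(records):
    """Stage 1: mean change from baseline to follow-up for the whole cohort."""
    return mean(r["follow_up"] - r["baseline"] for r in records)

def change_by_subgroup(records, key):
    """Stage 2 (simplified): mean change within each subgroup defined by `key`."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["follow_up"] - r["baseline"])
    return {label: mean(changes) for label, changes in groups.items()}

# Hypothetical records: days of opiate use per week at baseline and follow-up.
records = [
    {"group": "addict", "baseline": 7, "follow_up": 1},
    {"group": "addict", "baseline": 7, "follow_up": 3},
    {"group": "non-addict", "baseline": 2, "follow_up": 2},
    {"group": "non-addict", "baseline": 1, "follow_up": 3},
]
overall = gross_change(records)                    # -2
by_group = change_by_subgroup(records, "group")    # addict: -5, non-addict: 1
```

In this made-up example the cohort-level change of -2 hides a large reduction in one subgroup and a small increase in the other, which is the kind of differential outcome the second stage is meant to expose.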
Evaluation designs for comparison of treatments. The comparison of treatments across modalities involves several important problems relating to differences in approach and client composition of the modalities compared. For example, MM is normally recommended for established narcotic addicts who have had prior unsuccessful treatment and in most cases criminal records, while DF has been the treatment of choice for youthful (until recently), middle-class non-addicts.
In addition, differences have been observed in respect to certain norms and attitudes toward expected client behaviour. Without necessarily being moralistic the drug-free treatments have taken a harder line in relation to drug use than MM, although there has been a general tolerance of alcohol and marijuana use in most treatment programmes. Many MM programmes have also advocated indefinite maintenance or return to maintenance treatment, while TC and DF programmes have not. These differences imply that occasional drug use is not incompatible with successful outcome as defined in many MM programmes, but that abstinence was generally associated with success in the drug-free programmes. They also reflect opposite positions with respect to return to treatment. Client differences between treatments have additional implications for the weights assigned to various criteria. For instance, reduction of opiate use and criminality is most relevant for addicts, who have predominated in MM, and to a lesser extent, in TC. However, for the younger and less criminal polydrug users, more typical of DF, these are not the most focal criteria; non-opiate use is usually more important, along with general improvement on the other criteria.
There are therefore ideological and client differences between treatments. In the present research it was decided to address the client differences by statistical methods and to deal with the ideological differences in the interpretation of results. The study by Simpson, Savage, Lloyd, and Sells (1978) compared follow-up samples from MM, TC, DF, DT, and a no-treatment comparison group designated as IO (Intake Only) on individual criterion measures for the first year post-DARP and also on a composite measure (developed by Hornick, Demaree, Sells and Neman, 1977). For the comparisons between treatment groups, they adjusted post-DARP measures for population differences by means of analysis of covariance. Client characteristics and performance in DARP treatment were also examined in this study as predictors of post-treatment outcomes by using hierarchical stepdown regression analysis.
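The idea of adjusting group outcomes for population differences can be shown with a minimal one-covariate analysis of covariance. This is a sketch only: the DARP analyses used many covariates, and the data, group labels and function name below are hypothetical. Each group mean is corrected using the pooled within-group slope of the outcome on the covariate.

```python
from statistics import mean

def adjusted_means(groups):
    """One-covariate ANCOVA adjustment.

    `groups` maps a label to a list of (covariate, outcome) pairs.
    Returns each group's outcome mean, adjusted to the grand covariate mean.
    """
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    sxx = sxy = 0.0
    stats = {}
    for label, pairs in groups.items():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        # Accumulate within-group sums of squares and cross-products.
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        stats[label] = (xbar, ybar)
    slope = sxy / sxx  # pooled within-group regression slope
    return {
        label: ybar - slope * (xbar - grand_x)
        for label, (xbar, ybar) in stats.items()
    }

# Hypothetical data: (baseline score, post-treatment score) per client.
data = {
    "A": [(1, 3), (2, 4), (3, 5)],
    "B": [(3, 4), (4, 5), (5, 6)],
}
raw = {g: mean(y for _, y in pairs) for g, pairs in data.items()}  # A: 4, B: 5
adjusted = adjusted_means(data)                                    # A: 5.0, B: 4.0
```

In this constructed example the raw means rank group B above group A, while the adjusted means reverse the order once the groups' different baseline levels are taken into account. This is the kind of confounding the covariance adjustment is intended to remove.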
Using a different approach, Sells, Demaree, and Hornick (1978) employed behaviourally defined outcome groups for the first three years after DARP. These outcome groups were developed using cluster analysis, by Hornick, Demaree, Sells, and Neman (1977). Distinctive profiles based on post-DARP scores for six criteria (employment, opiate use, non-opiate use, alcohol use, criminality, and return to treatment) were categorized and arrayed on a scale of favourableness of outcome. In a companion study, Neman, Demaree, Hornick, and Sells (1977) identified pre-DARP factors that discriminated outcome groups, by means of multiple discriminant analysis. Computations were made of both the percentages of each treatment sample in each outcome group and also the percentages expected, based on the significant variables defining the first discriminant function. Examination of two ratios, including (1) actual percentages for each treatment group to the total for all treatments combined, and (2) actual to expected percentages (by outcome group), enabled comparative judgements of treatment effectiveness.
Detailed accounts of the DARP evaluation results have been published in reports cited earlier. This brief summary focuses mainly on the outcome group analyses by Sells, Demaree, and Hornick (1978), based on outcome profiles for the first three years after DARP of all black males and white males in the Cohort 1-2 follow-up sample (N=1,923). These results were highly consistent with those reported by Simpson, Savage, Lloyd, and Sells (1978) for the first year after DARP, using a different methodology.
The outcome groups employed by Sells et al. were defined by profiles of behaviour-based outcome variables for the first three years after DARP termination. The profiles were composed of the following six criterion variables, scored in accordance with the definitions shown below:
| Criterion | Score | Definition |
|---|---|---|
| E (employment) | 0 | employed over 67 per cent of time at risk |
| | 1 | employed 1 to 67 per cent of time at risk |
| | 2 | employed 0 per cent of time at risk |
| O (opiate use) | 0 | no use over time at risk |
| | 1 | mean use from less-than-weekly to 4 days per week |
| | 2 | mean use 5 days per week or greater |
| N (non-opiate use, but not including marijuana) | 0 | no use over time at risk |
| | 1 | mean use from less-than-weekly to 4 days per week |
| | 2 | mean use 5 days per week or greater |
| A (alcohol use) | 0 | mean daily intake 8 oz. 80°-proof liquor equivalent or less |
| | 1 | mean daily intake over 8 oz. 80°-proof liquor equivalent |
| C (criminality) | 0 | no arrests and no time in jail |
| | 1 | not more than 1 arrest and not more than 30 days in jail |
| | 2 | more than 1 arrest and more than 30 days in jail |
| T (return to treatment) | 0 | no treatment after DARP |
| | 1 | any treatment after DARP |
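The scoring definitions above can be expressed as a small function. This is a sketch rather than the original scoring program. The parameter names are hypothetical, and three boundary cases that the definitions leave open are resolved by assumption: employment above 0 but below 1 per cent scores 1, mean drug use between 4 and 5 days per week scores 1, and any criminality record that meets neither the 0 nor the 1 definition scores 2.

```python
def score_profile(pct_employed, opiate_days_per_week, nonopiate_days_per_week,
                  alcohol_oz_per_day, arrests, jail_days, returned_to_treatment):
    """Return the six-element outcome profile (E, O, N, A, C, T) for one client."""

    def drug_use(days_per_week):
        if days_per_week == 0:
            return 0
        # Assumption: mean use between 4 and 5 days per week is scored 1.
        return 1 if days_per_week < 5 else 2

    if pct_employed > 67:
        employment = 0
    elif pct_employed > 0:   # assumption: any employment up to 67 per cent
        employment = 1
    else:
        employment = 2

    if arrests == 0 and jail_days == 0:
        criminality = 0
    elif arrests <= 1 and jail_days <= 30:
        criminality = 1
    else:                    # assumption: every remaining case is scored 2
        criminality = 2

    return {
        "E": employment,
        "O": drug_use(opiate_days_per_week),
        "N": drug_use(nonopiate_days_per_week),
        "A": 0 if alcohol_oz_per_day <= 8 else 1,
        "C": criminality,
        "T": 1 if returned_to_treatment else 0,
    }

# Hypothetical clients at the two extremes of the scale.
best = score_profile(100, 0, 0, 0, 0, 0, False)
worst = score_profile(0, 7, 7, 16, 3, 90, True)
```

Here `best` scores zero on all six elements and `worst` scores two on E, O, N and C and one on A and T.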
There were 11 outcome groups with distinct profiles and these were arrayed from the most favourable, with zero on all six profile elements, to the least favourable, with maximum scores of two on E, O, N, and C and of one on A and T. The 11 outcome groups, with their defining profiles, are listed below in order of over-all favourableness of outcome. In the present discussion, as in some of the analyses reported, the 11 outcome groups were combined into four outcome levels, as indicated:
The actual and expected percentages for each treatment group at each of the four outcome levels are shown in table 2, which also includes the total sample (average for all treatments) percentage for each level. These data can be interpreted using two comparisons: (1) Comparison of each actual percentage with the corresponding total percentage for the respective level. This comparison indicates the degree of favourableness or unfavourableness of a treatment outcome in comparison to the total sample. In this comparison favourableness is shown when the actual exceeds the total, for levels I and II, and when the reverse is true, for levels III and IV; and (2) Comparison of each actual percentage with its corresponding expected percentage. This comparison indicates the degree to which the actual outcome exceeds or falls short of expectation and thus the degree to which a treatment effect beyond what could be predicted on the basis of client characteristics and background occurred.
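The two comparisons can be written down as a short sketch. The function names are hypothetical, and the level IV figures in the usage lines are invented for illustration. The level I figures are those quoted in the text for TC clients (36.9 per cent actual against 34.4 per cent expected).

```python
# Levels I and II are favourable outcomes, so a higher percentage is better;
# levels III and IV are unfavourable outcomes, so a lower percentage is better.
HIGHER_IS_BETTER = {"I": True, "II": True, "III": False, "IV": False}

def is_favourable(level, actual_pct, reference_pct):
    """Compare an actual percentage with a reference percentage.

    The reference is either the total-sample percentage (comparison 1)
    or the expected percentage (comparison 2) for the same outcome level.
    """
    if HIGHER_IS_BETTER[level]:
        return actual_pct > reference_pct
    return actual_pct < reference_pct

def ratio(actual_pct, reference_pct):
    """Ratio of actual to reference percentage (above 1 means an excess)."""
    return actual_pct / reference_pct

# TC at level I, using the figures quoted in the text.
tc_level_1 = is_favourable("I", 36.9, 34.4)
# Hypothetical level IV figures: fewer clients than expected is favourable.
example_level_4 = is_favourable("IV", 20.0, 25.0)
```

The same function serves both comparisons; only the reference percentage changes.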
Applying the two comparisons described above to table 2, the results are interpreted qualitatively in table 3; more detailed quantitative analyses of these data have been published in the original study report (Sells, Demaree, and Hornick, 1978). Based on tables 2 and 3, it is apparent that there were major outcome differences between the generally favourable outcomes of the three major modalities, MM, TC, and DF, and the highly unfavourable outcomes of the short-term DT and no-treatment IO groups. In the DARP follow-up studies, DT and IO were considered as comparison groups with low or zero time in treatment, but not as control groups. Simpson (in press) showed that MM, TC, and DF clients who remained in treatment for only a comparably short time (less than three months) had unfavourable outcomes similar to those in DT and IO. Thus, clients who received no treatment and those who were in treatment less than three months, in any modality, had the poorest over-all outcomes, while those in MM, TC, and DF who remained in treatment over three months did well in proportion to their length of stay in treatment during the first DARP treatment episode. Simpson et al. (1978) and Neman et al. (1977) also found that post-DARP results were favourable in proportion to performance on opiate use, criminality, and employment during DARP treatment.
Over-all, the results summarized in tables 2 and 3 separated the short-term DT and zero-time no-treatment IO groups from MM, TC, and DF by considerable differences. Compared with expectancy, the results for DT (involving minimal treatment) and IO (no treatment) can be interpreted as very poor by comparison with the other three groups, which represent the major modalities in which substantial treatment effects should be expected. The results for MM showed favourable differences between actual and expected frequencies at all four outcome levels. However, it is noted that the strength of the case for MM depends to a considerable extent on whether or not level II outcomes are accepted as favourable. The results for TC were distinctively favourable only at level I, with 36.9 per cent of TC clients in this category, compared to 34.4 per cent expected. At level II, the TC results were poor, and at levels III and IV they were about par. Except for level I, where they were not discriminably above expectation, the results for DF were disappointing, although clearly not at the low level of DT and IO.
If these analyses had gone no further, positive indications would have been justified only for MM and TC. However, an additional analysis was conducted separately for addicts (defined as daily opiate users) and non-addicts (all other clients). The results for the addict groups in MM and TC were approximately the same as for the total sample. In contrast to their favourable results in MM and TC, however, addicts had distinctly unfavourable results in DF. On the other hand, the TC results for non-addicts were also poor, but the DF results for non-addicts were good, at least with respect to level I. Unfortunately, the non-addict group in DF (and also in TC) was small and only tentative conclusions could be drawn.
Using level I, the most rigorous criterion, 29.5 per cent of MM clients, 36.9 per cent of TC clients, and 34.4 per cent of DF clients were abstinent from illicit drugs and demonstrated generally favourable profiles of outcomes for the first three years after DARP termination; these percentages exceeded those expected on the basis of pre-DARP, non-treatment predictors. By contrast, only 19.6 per cent of DT and 21.0 per cent of IO clients were in the same outcome level, and these percentages were well below expectation. By the rigorous level I criterion, then, around one-third or more of MM, TC, and DF clients were successful for three years after DARP termination.
In the analysis of outcome group profiles it was concluded that those for level II could be considered moderately favourable and also that these, particularly the profile involving moderate opiate use and 100 per cent return to treatment (group 4), were considered reasonable for clients in methadone maintenance. The two level II profiles involved moderate drug use, no criminality, and generally conforming behaviour. Only the MM sample had an excess of clients beyond expectation for level II. When level II was included (along with level I) in the category of favourable outcomes, the percentages of clients in this category were: MM 55.1 per cent, TC 52.8 per cent, DF 54.3 per cent, DT 35.2 per cent, and IO 37.1 per cent. In the total sample, these results exceeded expectation only for MM and TC.
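The level I percentages and the combined level I and II percentages quoted above imply the share of each treatment group at level II alone. The short sketch below derives those shares by subtraction. They are a by-product of the two sets of published figures and are not reported directly in this text.

```python
# Percentages quoted in the text for each treatment group.
level_1 = {"MM": 29.5, "TC": 36.9, "DF": 34.4, "DT": 19.6, "IO": 21.0}
levels_1_and_2 = {"MM": 55.1, "TC": 52.8, "DF": 54.3, "DT": 35.2, "IO": 37.1}

# Level II share = combined favourable share minus level I share,
# rounded to one decimal place to match the precision of the inputs.
level_2 = {
    group: round(levels_1_and_2[group] - level_1[group], 1)
    for group in level_1
}
# MM: 25.6, TC: 15.9, DF: 19.9, DT: 15.6, IO: 16.1
```

The derived figures show why the case for MM depends on how level II is judged: MM has the largest level II share (25.6 per cent) and the lowest level I share of the three major modalities.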
The combined evidence of differential performance after DARP leaves little doubt that the MM, TC, and DF programmes were of major significance in the rehabilitation of substantial percentages of DARP clients. The limitations of the field research situation do not permit the causal demonstration of a specific treatment effect, but the data presented suggest that significant, consistent, and specifiable changes occurred in certain treatment situations that did not occur in other treatment or non-treatment situations. In that sense, the DARP studies described have shown that "treatment works".
Obviously, there is much more to be learned. The data for cohort 3 are eagerly awaited for the opportunity to enlarge the samples and to cross-validate the results reported in cohorts 1 and 2. It is also hoped to study long-term changes and "addict careers" in the DARP samples over additional years; such long-term information on community based drug treatment programmes is not available for any comparably-sized sample with such an extensive client background and behavioural data base. Career studies of drug users are considered to be an important step for the drug abuse field, and carry a high priority in future plans involving the DARP data.
Gorsuch, R.L. and M.C. Butler. Toward developmental models of non-medical drug use. July, 1974, IBR Report No. 74-13. In S.B. Sells and D.D. Simpson (Eds.). Effectiveness of drug abuse treatment (Vol. 3). Further studies of drug users' treatment typologies and assessment of outcomes during treatment in the DARP. Cambridge, Mass.: Ballinger, 1976. Reprinted in part in Psychological Bulletin, 83(1), 120-127 (1976).
Hornick, C.W., R.G. Demaree, S.B. Sells and J.F. Neman. Measurement of post-DARP outcomes: The definition of composite and differential outcome groups. National followup study of admissions to drug abuse treatments in the DARP during 1969-1972. IBR Report No. 77-17, 1977.
Neman, J.F., R.G. Demaree, C.W. Hornick and S.B. Sells. Client characteristics and other variables associated with differential post DARP outcome groups. December, 1977, IBR Report No. 77-16.
Sells, S.B. (Ed.). Studies of the effectiveness of treatment for drug abuse (Vol. 1). Evaluation of treatments. Cambridge, Mass.: Ballinger, 1974.
Sells, S.B. (Ed.). Studies of the effectiveness of treatment for drug abuse (Vol. 2). Research on patients, treatments, and outcomes. Cambridge, Mass.: Ballinger, 1974.
Sells, S.B., R.G. Demaree and C.W. Hornick. The comparative effectiveness of methadone maintenance, therapeutic community, outpatient drug-free, and outpatient detoxification treatments for drug users in the DARP. Cohort 1-2 followup study. IBR Report No. 78-4. June, 1978.
Sells, S.B., R.G. Demaree, D.D. Simpson, G.W. Joe and R.L. Gorsuch. Issues in evaluation of drug abuse treatment. Professional Psychology, 8(4), 609-640 (1977).
Sells, S.B. and D.D. Simpson (Eds.). Effectiveness of drug abuse treatment (Vol. 3). Further studies of drug users' treatment typologies and assessment of outcome during treatment in the DARP. Cambridge, Mass.: Ballinger, 1976.
Sells, S.B. and D.D. Simpson (Eds.). Effectiveness of drug abuse treatment (Vol. 4). Evaluation of treatment outcomes for the 1971-1972 admission cohort. Cambridge, Mass.: Ballinger, 1976.
Sells, S.B. and D.D. Simpson (Eds.). Effectiveness of drug abuse treatment (Vol. 5). Evaluation of treatment outcomes for the 1972-1973 admission cohort. Cambridge, Mass.: Ballinger, 1976.
Sells, S.B. and D.D. Simpson. The case for drug abuse treatment effectiveness based on the DARP research programme. British Journal of Addiction, in press.
Simpson, D.D. The relation of time in drug abuse treatment to post-treatment outcomes. American Journal of Psychiatry, in press.
Simpson, D.D., L.J. Savage, M.R. Lloyd and S.B. Sells. Evaluation of drug abuse treatments based on first year follow-up. IBR Report No. 77-14. Reprinted by National Institute on Drug Abuse, Services Research Monograph Series, DHEW Publication No. (ADM) 78-701 (1978).
Simpson, D.D., M.R. Lloyd and M.J. Gent. Reliability and validity of data. National followup study of admission to drug abuse treatments in the DARP during 1969-1971. November, 1976, IBR Report No. 76-18.