Kathleen M. Holehan, MA, BCBA, LBA
Thomas Zane, PhD, BCBA-D
Department of Applied Behavioral Science, University of Kansas

We want to extend our appreciation to Vivanti, Vismara, Dawson, and Rogers for their careful review of our recent column on the state of the evidence behind the Early Start Denver Model (ESDM), published in the March 2019 issue of Science in Autism Treatment. They provided a thoughtful response to our evaluation. Those researchers join us in firmly believing in the scientific approach, in the important role of high-quality evidence in autism treatment, and in public discourse regarding research, findings, and conclusions.

The purpose of the current column is to respond to the points made by Vivanti et al. in this issue. Readers can then make their own decisions about the weight of the evidence and their confidence in the extent to which ESDM could be considered to have valid research support to claim effectiveness.

Our reply will consist of our responses to several of their points, namely:

  1. Vivanti et al.: Holehan & Zane did not clarify how they selected the ESDM research to review and, in fact, failed to review several important and recent studies about ESDM;
  2. Vivanti et al.: Holehan & Zane did not clarify how they rated the ‘risk of bias’ (i.e., what criteria did we consider when reviewing the quality of the research behind ESDM);
  3. Vivanti et al.: Holehan & Zane were wrong in their claim that the studies lacked ‘demonstrated objective measurement of treatment effect’;
  4. Vivanti et al.: Holehan & Zane were wrong in their assertion that in each study evaluating ESDM one of the ESDM developers was a coauthor;
  5. Vivanti et al.: Holehan & Zane were wrong to criticize the fact that participants (in many of the ESDM studies) in both the control and experimental groups received treatments outside of the ESDM protocol, and how that fact lowers confidence in internal and external validity;
  6. Vivanti et al.: Holehan & Zane failed to recognize the strength of the group designs used by all of the ESDM studies, and how those designs can confidently establish cause and effect.

Here begins our response:

  1. Vivanti et al.: Holehan & Zane did not clarify how they selected the ESDM research to review and, in fact, failed to review several important and recent studies about ESDM.

Holehan & Zane: The criteria for selecting research to review were simple – we searched for published studies evaluating ESDM or some component of it. We did not include published literature that simply described ESDM, or that compared ESDM to other treatments, unless the publication included a controlled comparison. Apparently, we failed to locate some studies of ESDM, particularly very recent ones (one of which was published at the same time as our initial column). We appreciate Vivanti et al. pointing that out, since reviews of the literature should strive to cover all of the literature on the topic. We have since reviewed those studies as well, and found them to be methodologically similar to the previously published ones. That is, they are group designs, using experimental (ESDM) and control groups of some sort, with dependent variables consisting of standardized scores and indirect measures, and with statistical analyses conducted to determine the significance of the results. We discuss the limitations of this methodology in later sections.

  2. Vivanti et al.: Holehan & Zane did not clarify how they rated the ‘risk of bias’ (i.e., what criteria did we consider when reviewing the quality of the research behind ESDM).

Holehan & Zane: We now provide further information on the criteria we used for evaluating the empirical soundness of the ESDM research. Our criteria are based upon common research standards used in behavioral research (see Iversen, 2013, and Lattal, 2013, for reviews of this approach). Generally speaking, our perspective was that of experimental control, in the form of a confident demonstration that the independent variable (in this case, ESDM) was the sole reason for changes in the dependent measures (in this case, the various measurements collected on the participants in these studies). Thus, we attempted to ascertain the extent to which no other variables, uncontrolled in the experiment, could have contributed to the changes in the dependent measures. We attempted to assess the extent to which there was confidence that the ESDM procedures were implemented as described. We sought evidence of a viable experimental design that would control for threats to internal and external validity (e.g., Kazdin, 1999). We were also interested in the extent to which data were discussed for individual participants; that is, did an experiment describe the impact of ESDM on individual performance? We would argue that these fundamental criteria for proper experimental design embody a well-established systematic approach.

When evaluating each study on ESDM, we critiqued the extent to which the methodology of the study – the makeup of ESDM, the measurement of the dependent variables, the extent to which the ESDM treatment was implemented without deviation – was constructed, arranged, and implemented in such a way that, if there were positive results, we could be highly confident that the changes in the dependent variables were due solely to the ESDM treatment. That is the essence of our task as scientists and researchers – establishing strong confidence in causal relationships.

  3. Vivanti et al.: Holehan & Zane were wrong in their claim of a lack of ‘demonstrated objective measurement of treatment effect.’

Holehan & Zane: We stand by our original statement. To us, objective measurement refers to direct observation of behavior, measuring specific dimensions of that behavior (e.g., frequency, duration, latency) that are relevant to the experimental question (e.g., Barlow & Hersen, 1984). Behavioral research demands the specification of discrete responses, defined clearly enough to be measured with confidence. Essentially, the focus is on the extent to which the experimental procedures influence some dimension of the response being measured (e.g., Johnston & Pennypacker, 1986).

Direct observation and measurement of operationally defined behavior is qualitatively different from standardized test scores. A great majority (if not the preponderance) of the dependent measures in the ESDM literature take the form of standardized scores based upon rating scales or surveys. Let us give the reader two examples. Dawson et al. (2010) used five instruments, four of which were standardized tests or questionnaires. Estes et al. (2015) used seven measures on which to collect data on their participants; of these seven, none involved direct measurement of behavior change. We assert that questionnaires, surveys, and standardized tests provide a different and lesser quality of data (e.g., Singh, Matson, Cooper, Dixon, & Sturmey, 2005). Particularly in autism treatment, in which it is imperative to show that a child with a diagnosis responds positively to treatment, that response is better demonstrated through direct measurement of individual behavior change than through changes in standardized scores or in answers to a questionnaire.

  4. Vivanti et al.: Holehan & Zane were wrong in their assertion that in each study evaluating ESDM, one of the ESDM developers was a coauthor.

Holehan & Zane: We stand corrected. A few of the published studies do not involve one of the originators of ESDM, and we are grateful for that clarification. The point we were making is that results may be biased when researchers, who conduct studies on issues of importance to them, consciously or unconsciously arrange the experimental conditions (or the analysis of the data) in ways that favor their own position. This is a longstanding problem in scientific research; researchers must be aware of it so that they do not arrange experiments that preordain the results, and readers must be aware of it so that they can review the evidence more objectively. Although Vivanti et al. describe ways of minimizing this potential threat, the point we made – and continue to make – is that independent replications are the best assurance of removing bias from experimentation. Only a few of the ESDM studies could be considered independent replications (e.g., Eapen, Crncec, & Walter, 2013; Waddington, van der Meer, Sigafoos, & Ogilvie, 2019).

  5. Vivanti et al.: Holehan & Zane were wrong to criticize the fact that participants (in many of the ESDM studies) in both the control and experimental groups received treatments outside of the ESDM protocol, and how that fact (if true) lowers confidence in internal and external validity.

Holehan & Zane: We stand by our original statement. If the goal of a study is to evaluate the success of ESDM, but throughout the experiment participants in the ESDM condition are also exposed to other treatments, then even if the results show that participants in the ESDM-plus-additional-treatments condition did better than participants who did not receive ESDM, one cannot claim that ESDM was responsible for the improvement (Smith, 2012). Multiple treatment interference is a real phenomenon, well known as a potential bias in experimental research. It lowers confidence that the independent variable (in this case, ESDM) was solely responsible for the change in the participants. For example, if a person is at risk of a heart attack, a doctor might prescribe both a heart medication and a strict diet regimen. If the risk of heart attack then lessens, we remain unsure as to which of the two treatments had a positive impact, since both treatments were administered together.

In their reply to our initial column, Vivanti et al. argue that this multiple-treatment exposure is not a weakness. They argue that ESDM is designed to work in an interdisciplinary context, with participants exposed to multiple treatments. Proponents of ESDM assert that, in their studies, participants in both the experimental and control groups were exposed to multiple treatments, and that this potential bias is therefore standardized across groups, minimizing or reducing its impact.

We argue that this is a statistical solution, not a real solution. In experimental control-group studies, one can argue statistically that since both groups received multiple treatments, that particular variable is controlled and washes out. However, because of the treatments delivered alongside ESDM, the ESDM model has never been tested on its own. In the spirit of Smith (2012) and others, we advocate that the proponents of ESDM conduct a carefully controlled study that attempts to isolate the effects of the critical components of ESDM, without the potential added benefit of treatments unrelated to ESDM.

  6. Vivanti et al.: Holehan & Zane failed to recognize the strength of the group designs used by all of the ESDM studies, and how those designs can confidently establish cause and effect.

Holehan & Zane: We are guilty of this charge. Group research designs have long been a standard in psychological research. We understand that randomized controlled trials are considered the state of the art in some circles. Vivanti et al. rely on this method of experimentation and claim that such research can sufficiently demonstrate causal relations.

The limitations of group designs and of the statistical massaging of data are extensively covered elsewhere (e.g., Baer, 1977; Chiesa, 1994; Johnston & Pennypacker, 1980, 1986; Sidman, 1960). One limitation is that the results of a group study may or may not apply to any specific participant in that study, since averages of the data are often calculated instead of examining individual data patterns before and after the intervention is applied. In the case of ESDM, if we look for clear experimental control – clear evidence that ESDM itself is solely responsible for behavior change – we do not have complete confidence. The effectiveness (or not) of ESDM could be better tested using a within-subject research design, in which the intervention is applied and individual data are analyzed to determine an effect. Unfortunately, no such data apparently exist. Vismara and Rogers (2008), Vismara, Colombi, and Rogers (2009), and Fulton, Eapen, Crncec, Walter, and Rogers (2014) used what would be called an “AB” design to look at the impact of ESDM on their respective dependent measures. Their phases included a pretest (A), then the application of ESDM (B), and then a follow-up. Although the dependent measures showed improvement over the course of the studies, this particular type of design does not allow one to conclude that a causal relationship exists between ESDM and the improvements.
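
To make the averaging concern concrete, here is a minimal, hypothetical sketch in Python. The numbers are entirely invented for illustration and are not drawn from any ESDM dataset; the sketch simply shows how a group can post a sizable mean gain even when half of its members show no improvement at all.

```python
# Hypothetical pre/post scores for ten participants in a treatment group.
# Invented numbers for illustration only -- not from any ESDM study.
import statistics

pre =  [50, 52, 48, 51, 49, 50, 53, 47, 50, 52]
post = [70, 71, 69, 72, 68, 50, 52, 46, 49, 51]  # five improve, five do not

# Per-participant change from pretest to posttest.
gains = [b - a for a, b in zip(pre, post)]

print(f"Mean gain: {statistics.mean(gains):.1f} points")        # 9.6
print(f"Improved: {sum(g > 0 for g in gains)} of {len(gains)}")  # 5 of 10
print(f"Individual gains: {gains}")
```

A group-level summary of these invented scores reports a mean gain of nearly ten points, yet the individual data show that half of the participants did not benefit at all. This is exactly the information that within-subject analyses preserve and that group averages obscure.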

What seems to be the case is that the three subsequent publications have all been randomized controlled trial group designs. We are therefore still looking for studies showing that improvements in individual participants were due solely to the ESDM procedures. By emphasizing randomized controlled trials, the ESDM researchers appear to have “put the cart before the horse” (Smith, 2012, p. 105) by jumping to large group designs without first demonstrating tight experimental control of their procedures over actual changes in individual participant outcomes. The effects of ESDM would be best displayed by using within-subject designs to demonstrate clearly the impact of ESDM on individual behavior change. If a series of replications can do that, then the group designs – used exclusively by the ESDM authors – would lead to stronger conclusions. At present, however, a series of group designs, fraught with less-than-desirable control over confounding factors, does not allow us to isolate the variable responsible for change at the individual level.

Conclusion

We hope that this response to Vivanti et al. further clarifies our initial review and adequately addresses the major concerns raised by those proponents of ESDM. Readers of Science in Autism Treatment who have followed this series of columns should benefit from the dialogue and can come to their own conclusions about the strength of the evidence behind the Early Start Denver Model.

References

Baer, D. M. (1977). Perhaps it would be better not to know everything. Journal of Applied Behavior Analysis, 10(1), 167-172. doi:10.1901/jaba.1977.10-167

Barlow, D. H., & Hersen, M. (1984). Single case experimental designs: Strategies for studying behavior change. New York, NY: Pergamon Press.

Chiesa, M. (1994). Radical behaviorism: The philosophy and the science. Sarasota, FL: Authors Cooperative, Publishers.

Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., Donaldson, A., & Varley, J. (2010). Randomized, controlled trial of an intervention for toddlers with autism: The Early Start Denver Model. Pediatrics, 125(1), e17-e23. doi:10.1542/peds.2009-0958

Eapen, V., Crncec, R., & Walter, A. (2013). Clinical outcomes of an early intervention program for preschool children with autism spectrum disorder in a community group setting. BMC Pediatrics, 13(1), Article 3. doi:10.1186/1471-2431-13-3

Estes, A., Munson, J., Rogers, S. J., Greenson, J., Winter, J., & Dawson, G. (2015). Long-term outcomes of early intervention in 6-year-old children with autism spectrum disorder. Journal of the American Academy of Child and Adolescent Psychiatry, 54(7), 580-587. doi:10.1016/j.jaac.2015.04.005

Fulton, E., Eapen, V., Crncec, R., Walter, A., & Rogers, S. (2014). Reducing maladaptive behaviors in preschool-aged children with autism spectrum disorder using the Early Start Denver Model. Frontiers in Pediatrics, 2, Article 40. doi:10.3389/fped.2014.00040

Iversen, I. H. (2013). Single-case research methods: An overview. In G. J. Madden, W. V. Dube, T. D. Hackenberg, G. P. Hanley, and K. A. Lattal (Eds.), APA handbook of behavior analysis. Washington, DC: American Psychological Association.

Johnston, J. M., & Pennypacker, H. S. (1980). Strategies and tactics of human behavioral research. Hillsdale, NJ: Erlbaum.

Johnston, J. M., & Pennypacker, H. S. (1986). Pure versus quasi-behavioral research. In A. Poling and R. W. Fuqua (Eds.), Research methods in applied behavior analysis. New York, NY: Plenum Press.

Kazdin, A. E. (1999). Drawing valid inferences I: Internal and external validity. In A. E. Kazdin (Ed.), Research design in clinical psychology (3rd ed., pp. 15-39). New York, NY: Allyn and Bacon.

Lattal, K. A. (2013). The five pillars of the experimental analysis of behavior. In G. J. Madden, W. V. Dube, T. D. Hackenberg, G. P. Hanley, and K. A. Lattal (Eds.), APA handbook of behavior analysis. Washington, DC: American Psychological Association.

Sidman, M. (1960). Tactics of scientific research: Evaluating experimental data in psychology. New York, NY: Basic Books, Inc.

Singh, A. N., Matson, J. L., Cooper, C. L., Dixon, D., & Sturmey, P. (2005). The use of risperidone among individuals with mental retardation: Clinically supported or not? Research in Developmental Disabilities, 26, 203-218. doi:10.1016/j.ridd.2004.07.001

Smith, T. (2012). Evolution of research on interventions for individuals with autism spectrum disorder: Implications for behavior analysts. The Behavior Analyst, 35, 101-113. doi:10.1007/bf03392269

Vismara, L. A., Colombi, C., & Rogers, S. J. (2009). Can one hour per week of therapy lead to lasting changes in young children with autism? Autism, 13, 93-115. doi:10.1177/1362361307098516

Vismara, L. A., & Rogers, S. J. (2008). The Early Start Denver Model: A case study of an innovative practice. Journal of Early Intervention, 31(1), 91-108. doi:10.1177/1053815108325578

Waddington, H., van der Meer, L., Sigafoos, J., & Ogilvie, E. (2019). Evaluation of a low-intensity version of the Early Start Denver Model with four preschool-aged boys with autism spectrum disorder. International Journal of Developmental Disabilities, 1-13. doi:10.1080/20473869.2019.1569360

Citation for this article:

Holehan, K., & Zane, T. (2019). A response to Vivanti et al. Science in Autism Treatment, 16(10).
