Evaluations of scientific research need to be based on scientific approaches. A Reply to “Is There Science Behind That?: The Early Start Denver Model” by Holehan & Zane, 2019.

Giacomo Vivanti, PhD
AJ Drexel Autism Institute, Drexel University

Laurie Vismara, PhD
Author of “An Early Start for Your Child with Autism: Using Everyday Activities to Help Kids Connect, Communicate, and Learn,” Guilford Press.

Geraldine Dawson, PhD
Duke Center for Autism and Brain Development, Duke School of Medicine

Sally J. Rogers, PhD
MIND Institute, UC Davis Medical Center

The goal of the article “Is There Science Behind That?: The Early Start Denver Model,” published by Holehan and Zane in the ASAT newsletter, is laudable. As the Early Start Denver Model (ESDM) becomes increasingly popular, it is important to critically examine its empirical support. However, we believe that the arguments put forward by Holehan and Zane contending that “the experimental rigor of the available research [on ESDM] is weak” are based on an evaluation that does not fully include the available evidence. Below we provide additional evidence that we believe was not included in the authors’ review, as well as suggestions for enhancing the evaluation approach used in the article.

The process for scientific evaluation of research evidence should include operationally defined criteria about (1) how the literature to be examined is selected and (2) how the risk of bias is evaluated. Different criteria are available to evaluate the rigor of intervention research, such as the Grading of Recommendations Assessment, Development, and Evaluation (GRADE), used by the World Health Organization (Guyatt et al., 2011), or the guidelines established by the American Evaluation Association (2008). In a literature review, these criteria should be identified a priori. The Holehan and Zane review does not use an established systematic approach and does not provide operationally defined criteria for how articles are selected and evaluated. As a result, key articles that evaluated the efficacy of ESDM are not examined, including four Randomized Controlled Trials (RCTs), a quasi-experimental study, and a follow-up study (Dawson et al., 2010, which is mentioned but not evaluated; Colombi et al., 2018; Estes et al., 2015; Vismara et al., 2018; Vivanti et al., 2018). The lack of operationally defined criteria for selecting and evaluating relevant literature and the incomplete selection of articles included in the review result in non-replicable results and arguably misleading conclusions.

We also wish to point out that a large-scale RCT (Rogers et al., 2019) was published after this review was written, and consequently was not included in the Holehan and Zane review. The primary analysis for the Rogers et al. (2019) study was a generalized linear model that accounted for the repeated measures structure, and site was included as a covariate to control for potential site differences. Results showed a significant group-by-time interaction, indicating that children who received ESDM achieved better receptive/expressive language outcomes (the primary outcome measure), based on the Mullen Scales of Early Learning, than children in the treatment-as-usual control group. This effect held regardless of the child’s initial IQ, language ability, and symptom severity. Thus, the review does not include the full literature on the efficacy of ESDM.
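
For readers unfamiliar with the term, a group-by-time interaction asks whether scores changed more over time in one group than in the other. The sketch below illustrates only that idea, as a simple difference in mean change between two groups. It is not the generalized linear model used by Rogers et al. (2019); the function names and all scores are hypothetical and were invented for illustration.

```python
from statistics import mean


def mean_change(baseline, followup):
    """Average within-child change from baseline to follow-up."""
    return mean([f - b for b, f in zip(baseline, followup)])


def group_by_time_contrast(treat_base, treat_follow, ctrl_base, ctrl_follow):
    """Difference in mean change between groups.

    A value above zero means the treatment group gained more over time
    than the control group; zero means both groups changed equally.
    """
    return mean_change(treat_base, treat_follow) - mean_change(ctrl_base, ctrl_follow)


# Hypothetical language scores, invented for illustration (not trial data).
esdm_baseline = [20, 22, 24]
esdm_followup = [30, 33, 33]
control_baseline = [21, 23, 25]
control_followup = [26, 28, 30]

contrast = group_by_time_contrast(
    esdm_baseline, esdm_followup, control_baseline, control_followup
)
print("Difference in mean change (treatment minus control):", contrast)
```

A full analysis would also model the repeated measures within each child, include covariates such as site, and test whether the contrast is statistically distinguishable from zero.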

We disagree with the statement in the Holehan and Zane review that “current [ESDM] studies have not demonstrated objective measurement of treatment effects.” Intervention outcomes in the efficacy and effectiveness trials of ESDM (Colombi et al., 2018; Dawson et al., 2010; Estes et al., 2015; Rogers et al., 2012, 2019; Vismara et al., 2018; Vivanti et al., 2014) were measured by clinicians who were blind to treatment status, a standard approach to ensure objective measurement of treatment effects. Similarly, we wish to point out inaccuracies in the statement that “[all ESDM research] included at least one of the original developers of the ESDM which raises questions about biases and the ESDMs (sic) ability to be replicated.” The statement is inaccurate, as the Eapen et al. (2013) study cited in the review does not involve any authors associated with ESDM development. Additional ESDM research that does not involve “at least one of the original developers” includes the Vismara et al. (2018) RCT and the Waddington, van der Meer, Sigafoos and Ogilvie (2019) study (although this was only recently published). Moreover, when the developers of ESDM are part of an efficacy trial, research funded by the National Institutes of Health (NIH; such as Dawson et al., 2010; Estes et al., 2015; Rogers et al., 2012, 2019) addresses the issue of potential bias through the use of Data Management and Analysis Centers or, in multisite trials, Data Coordinating Centers, whereby data from trials are managed and analyzed by evaluators who are independent from the study investigators. The developers of the intervention do not have access to the data and do not conduct the analyses. It should also be noted that, while we agree that independent replication is important, it is inevitable that the initial research on a new approach is conducted by those who developed the approach.
This is an issue common to other interventions. For example, most initial work on Discrete-Trial ABA-based Early Intensive Behavioral Intervention (EIBI) was conducted by the author of the manualized procedures implemented in the trial (Lovaas) or individuals who were originally trained by Lovaas (e.g., Lovaas, 1987; Smith et al., 2000, 2015). This happens because initial studies are conducted in settings where expertise and training opportunities for the intervention are available. This, however, is not a fatal threat to the replicability or validity of efficacy trials when blind assessors and/or Data Management/Coordinating Centers are used, as was the case for the NIH-funded ESDM efficacy and effectiveness studies.

Another rationale put forth by the authors to support the argument that evidence for ESDM efficacy is insufficient is that “a majority of studies have allowed children either in the test or control group to receive additional community treatment while participating in the current study confounding results.” We do not view this as a weakness in ESDM research studies. It would clearly be unethical to prevent families in an intervention trial from receiving other interventions. Most, if not all, Institutional Review Boards (IRBs) or ethics committees would not approve a study in which families are “not allowed” to receive additional services, especially for studies that last for many months or years. In fact, ESDM is designed to be implemented in an interdisciplinary context in which ESDM providers work in collaboration with other providers, including occupational therapists, physicians, and others. Thus, preventing children from receiving other services would not allow ESDM to be tested in the manner in which it is intended to be delivered. Furthermore, one wonders how “not allowing children from receiving additional treatment” would be enforced. To be sure, it is possible that factors other than the tested intervention contribute to variance in child outcomes, including additional interventions. However, this is not a threat to validity in RCTs, as the risk of additional treatment contributing to variance in intervention outcomes is equally distributed across the control and treatment groups. That is the point of RCTs: the RCT design does not eliminate external factors that contribute to variance, but it eliminates the risk of a systematic bias in favor of one group through randomization.
In fact, the most stringent approach to evaluating treatments in RCTs according to the NIH is the “intention-to-treat” statistical model, whereby participants are compared within the groups to which they were initially randomized, independently of variations such as participants dropping out or receiving additional services. As a detailed explanation of this conservative statistical approach is beyond the scope of this reply, we suggest that readers familiarize themselves with the abundant literature on current methodological standards for RCTs and “intention-to-treat” analyses (e.g., Moher et al., 2010; Schulz et al., 2010).
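
The logic of randomization and intention-to-treat comparison can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not an analysis of any ESDM trial: the effect sizes, the proportion of children receiving outside services, and all function names are hypothetical. It uses simple (coin-flip) randomization, whereas real trials typically use blocked or stratified schemes and formal statistical models.

```python
import random
from statistics import mean


def simulate_trial(n=2000, true_effect=5.0, extra_effect=3.0, p_extra=0.4, seed=0):
    """Simulate a two-arm trial in which some children in BOTH arms
    also receive outside community services (all values are hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        arm = rng.choice(["treatment", "control"])  # random assignment
        extra = rng.random() < p_extra  # outside services, unrelated to arm
        outcome = rng.gauss(50, 10)  # baseline variability between children
        if arm == "treatment":
            outcome += true_effect
        if extra:
            outcome += extra_effect
        rows.append((arm, extra, outcome))
    return rows


def extra_service_rate(rows, arm):
    """Share of children in the given arm who received outside services."""
    flags = [extra for a, extra, _ in rows if a == arm]
    return sum(flags) / len(flags)


def itt_estimate(rows):
    """Intention-to-treat contrast: compare children by the arm they were
    randomized to, regardless of any additional services they received."""
    treated = [y for a, _, y in rows if a == "treatment"]
    control = [y for a, _, y in rows if a == "control"]
    return mean(treated) - mean(control)


rows = simulate_trial()
print("Outside services, treatment arm:", round(extra_service_rate(rows, "treatment"), 3))
print("Outside services, control arm:  ", round(extra_service_rate(rows, "control"), 3))
print("ITT estimate of treatment effect:", round(itt_estimate(rows), 2))
```

Because outside services are spread across both arms by chance, they add variability to outcomes but do not systematically favor either arm, so the between-arm difference should stay close to the assumed treatment effect as the sample grows.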

We argue that the review examines the limitations of current studies without considering the stated purpose of each study. The scientific merit of a research article depends on what the research is designed to achieve. As indicated by guidelines established by Lord et al. (2005), Smith et al. (2007), and the International Society for Autism Research (INSAR) Special Interest Group on early intervention in 2018 (Vivanti et al., 2018), implementation and evaluation of novel interventions follow a progression characterized by several steps. The first is proof-of-concept work to ascertain preliminary evidence in support of the approach, including evidence of feasible delivery and safety/acceptability to key stakeholders. This is typically based on single-subject designs or case series, and it is what the Vismara and Rogers (2008) paper was designed to achieve. Therefore, the criticism that the study lacked a control group is unrelated to the study's purpose. Unfortunately, the authors based their criticisms primarily on proof-of-concept studies while failing to examine several RCTs and group-based studies.

To elaborate further, the authors’ conclusion that initial ESDM studies “lacked control group (sic)” and had a “small number of participants” also requires context. Importantly, the sample size chosen for a given study depends on what the study is designed to achieve. As noted above, initial proof-of-principle and pilot studies (which include small samples and might not include a control group) are a critical first step to provide justification to proceed to full-scale controlled trials. Initial studies of ESDM did not include a control group and had small samples because they were designed to test not efficacy but the feasibility/proof of concept of novel techniques or applications (see T. Smith et al., 2007, and Vivanti et al., 2018, for details on the differences between pilot RCTs versus full-scale RCTs, as well as efficacy versus effectiveness studies, and implications for sample sizes and study designs). Based on the initial results of proof-of-principle and pilot studies, ESDM evaluation proceeded to studies of efficacy and effectiveness that did include a control group and adequately powered samples (e.g., Colombi et al., 2018; Dawson et al., 2010; Estes et al., 2015; Rogers et al., 2012, 2019; Vismara et al., 2018; Vivanti et al., 2014). Overall, sample sizes in ESDM trials are similar to or larger than those used in RCTs that have evaluated other interventions such as EIBI.

The authors also mentioned unspecified “issues with feasibility” in initial ESDM research. It is unclear to which “issues with feasibility” they refer. Feasibility of implementation for ESDM was documented in the effectiveness trials by Vivanti et al. (2014) and Colombi et al. (2018), following the systematic approach outlined in Bowen et al. (2009), which includes the following feasibility indicators: acceptability (how the individuals involved in the program react to the intervention), demand (the likelihood of the program being chosen by potential end-users), implementation (the degree of execution of the program against manualized procedures/guidelines), practicality (the extent to which delivery of the program is attainable within the situational constraints), and adaptation and integration (the level of system change needed to deliver the program within the existing infrastructure). Therefore, feasibility was systematically evaluated and documented, and the statement that ESDM has “issues with feasibility” is not based on data.

Finally, we wholeheartedly agree with the authors that “more studies involving randomized control groups” are needed in ESDM research. For example, additional research is needed to better understand moderators and mediators of treatment response. This, however, needs to be contextualized, as the need for additional full-scale RCT research extends to all other early interventions for ASD. For example, the Cochrane review conducted by Reichow et al. (2018) on the scientific support for EIBI, which is based on stringent and operationally defined criteria, concluded that evidence that EIBI may be an effective behavioral treatment for some children with ASD is “weak” and that “the strength of the evidence in this review is limited because it mostly comes from small studies that are not of the optimum design. Due to the inclusion of non‐randomized studies, there is a high risk of bias and we rated the overall quality of evidence as ‘low’ or ‘very low’ using the GRADE system” (Reichow et al., 2018, https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD009260.pub3/full). The annual research review by Green and Garg (2018) came to similar conclusions, as only one small-scale RCT is available in support of EIBI (Smith et al., 2000; incidentally, that study does not report whether participants in the experimental or control groups received other interventions).

In conclusion, we believe that the evidence supporting the efficacy of ESDM should be critically evaluated. However, we believe that the current review and evaluation is undermined by the lack of a systematic approach and by analyses and conclusions that are not based on the full weight of available evidence. The Holehan and Zane review is weakened by: (1) a lack of operationally defined criteria to select the current ESDM literature and evaluate its risk of bias, (2) a failure to include the full scope of available literature, (3) inaccurate reporting of the evidence, (4) a failure to consider what the evaluated studies are designed to achieve (i.e., proof-of-concept studies versus pilot RCTs, efficacy versus effectiveness), (5) ethically questionable and logistically unfeasible parameters (“not allowing children in trials to receive other interventions”) that are inconsistent with current recommendations in the field (e.g., intention-to-treat analyses), and (6) a failure to understand the study design controls, such as the use of independent Data Management/Coordinating Centers for data analyses, that are required in NIH-funded trials, among others. We agree that the evidence base for ESDM is not complete, and independent replications based on larger samples are needed. We believe that ASAT should insist on evaluations of autism treatments that use a scientifically informed and contextualized approach to analyzing the literature. Such an approach will ensure that the evaluation and its conclusions are informative to end-users.

References

American Evaluation Association. (2008). Guiding principles for evaluators. American Journal of Evaluation, 29, 233–234.

Bowen, D. J., Kreuter, M., Spring, B., Cofta-Woerpel, L., Linnan, L., Weiner, D., et al. (2009). How we design feasibility studies. American Journal of Preventive Medicine, 36(5), 452-457.

Colombi, C., Narzisi, A., Ruta, L., Cigala, V., Gagliano, A., Pioggia, G., … & Prima Pietra Team. (2018). Implementation of the early start Denver model in an Italian community. Autism, 22(2), 126-133.

Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., Donaldson, A., & Varley, J. (2010). Randomized controlled trial of an intervention for toddlers with autism: The Early Start Denver Model. Pediatrics, 125(1), 17-23.

Eapen, V., Crncec, R., & Walter, A. (2013). Clinical outcomes of an early intervention program for preschool children with autism spectrum disorder in a community group setting. BMC Pediatrics, 13(1), Article 3. doi:10.1186/1471-2431-13-3

Estes, A., Munson, J., Rogers, S. J., Greenson, J., Winter, J., & Dawson, G. (2015). Long-term outcomes of early intervention in 6-year-old children with autism spectrum disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 54(7), 580-587.

Green, J., & Garg, S. (2018). Annual Research Review: The state of autism intervention science: progress, target psychological and biological mechanisms and future prospects. Journal of Child Psychology and Psychiatry, 59(4), 424-443.

Guyatt, G., Oxman, A. D., Akl, E. A., Kunz, R., Vist, G., Brozek, J., … & Rind, D. (2011). GRADE guidelines: 1. Introduction – GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology, 64(4), 383-394.

Holehan, K. M., & Zane, T. (2019). Is there science behind that?: The Early Start Denver Model. Science in Autism Treatment, 16(2).

Lord, C., Wagner, A., Rogers, S., Szatmari, P., Aman, M., Charman, T., … & Harris, S. (2005). Challenges in evaluating psychosocial interventions for autistic spectrum disorders. Journal of Autism and Developmental Disorders, 35(6), 695-708.

Lovaas, O. I. (1987). Behavioral treatment and normal educational and intellectual functioning in young autistic children. Journal of Consulting and Clinical Psychology, 55(1), 3.

Moher, D., Hopewell, S., Schulz, K. F., Montori, V., Gøtzsche, P. C., Devereaux, P. J., … & Altman, D. G. (2010). CONSORT 2010 explanation and elaboration: Updated guidelines for reporting parallel group randomised trials. Journal of Clinical Epidemiology, 63(8), e1-e37.

Reichow, B., Hume, K., Barton, E. E., & Boyd, B. A. (2018). Early intensive behavioral intervention (EIBI) for young children with autism spectrum disorders (ASD). Cochrane Database of Systematic Reviews, 2018(5). doi:10.1002/14651858.CD009260.pub3

Rogers, S. J., Estes, A., Lord, C., Munson, J., Rocha, M., Winter, J., … & Sugar, C. A. (2019). A Multisite Randomized Controlled Two-Phase Trial of the Early Start Denver Model Compared to Treatment as Usual. Journal of the American Academy of Child & Adolescent Psychiatry, 58(9), 853-865.

Rogers, S. J., Estes, A., Lord, C., Vismara, L., Winter, J., Fitzpatrick, A., … & Dawson, G. (2012). Effects of a brief Early Start Denver Model (ESDM)-based parent intervention on toddlers at risk for autism spectrum disorders: A randomized controlled trial. Journal of the American Academy of Child & Adolescent Psychiatry, 51, 1052-1065. doi:10.1016/j.jaac.2012.08.003

Schulz, K. F., Altman, D. G., & Moher, D. (2010). CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMC Medicine, 8(1), 18.

Smith, T., Groen, A. D., & Wynn, J. W. (2000). Randomized trial of intensive early intervention for children with pervasive developmental disorder. American Journal on Mental Retardation, 105(4), 269-285.

Smith, T., Klorman, R., & Mruzek, D. W. (2015). Predicting outcome of community-based early intensive behavioral intervention for children with autism. Journal of Abnormal Child Psychology, 43(7), 1271-1282.

Smith, T., Scahill, L., Dawson, G., Guthrie, D., Lord, C., Odom, S., … & Wagner, A. (2007). Designing research studies on psychosocial interventions in autism. Journal of Autism and Developmental Disorders, 37(2), 354-366.

Vivanti, G., Dissanayake, C., Duncan, E., Feary, J., Capes, K., Upson, S., … & Hudry, K. (2018). Outcomes of children receiving Group-Early Start Denver Model in an inclusive versus autism-specific setting: A pilot randomized controlled trial. Autism, 1362361318801341.

Vivanti, G., Kasari, C., Green, J., Mandell, D., Maye, M., & Hudry, K. (2018). Implementing and evaluating early intervention for children with autism: Where are the gaps and what should we do?. Autism Research, 11(1), 16-23.

Vismara, L. A., McCormick, C. E., Wagner, A. L., Monlux, K., Nadhan, A., & Young, G. S. (2018). Telehealth parent training in the Early Start Denver Model: Results from a randomized controlled study. Focus on Autism and Other Developmental Disabilities, 33(2), 67-79.

Vivanti, G., Paynter, J., Duncan, E., Fothergill, H., Dissanayake, C., Rogers, S., & Victorian ASELCC Team. (2014). Effectiveness and feasibility of the Early Start Denver Model implemented in a group-based community child care setting. Journal of Autism and Developmental Disorders, 44, 3140-3153. doi:10.1007/s10803-014-2168-9

Vismara, L. A., & Rogers, S. J. (2008). The Early Start Denver model: A case study of an innovative practice. Journal of Early Intervention, 31, 91-108. doi:10.1177/1053815108325578

Waddington, H., van der Meer, L., Sigafoos, J., & Ogilvie, E. (2019). Evaluation of a low-intensity version of the early start Denver model with four preschool-aged boys with autism spectrum disorder. International Journal of Developmental Disabilities, 1-13.

Citation for this article:

Vivanti, G., Vismara, L., Dawson, G., & Rogers, S. J. (2019). Evaluations of scientific research need to be based on scientific approaches: A Reply to “Is There Science Behind That?: The Early Start Denver Model” by Holehan & Zane, 2019. Science in Autism Treatment, 16(10).
