Developmental disabilities (DD) are defined as a group of conditions resulting from impairments in a broad range of domains (e.g., physical, learning, or behavioral areas) with onset in the developmental period. DD include autism spectrum disorder (ASD), intellectual disability (ID), and language disorder (LD). ASD, ID, and LD are classified as neurodevelopmental disorders in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) [1]. ASD, ID, and LD are among the most common developmental disorders in preschool children [2,3]. In children with ASD, social interactions and communication are predominantly impaired, with a broad range of language and intellectual functioning [4,5]. Children with ID show overall limitations in both intellectual and adaptive functioning, whereas those with LD have prominent impairments in language functioning, with possible significant discrepancies between their verbal and nonverbal abilities [1,5].
Differential diagnosis of DD in preschool children can be challenging. Each child has a different rate of development, and the rate of development varies across developmental domains within a child [6]. In very young children, the symptoms of one DD may overlap with those of other DD [7,8], and these disorders commonly co-occur [2,3]. Additionally, diagnostic changes may occur in a few preschool children with DD, whereas the diagnosis of DD after school age tends to remain stable throughout one’s lifetime [9-11]. Therefore, in preschool children, it is imperative to determine not only whether a child actually has DD but also which specific disorder is present.
Children with DD have problems with language, executive functioning, and memory, leading to challenges in instrumental activities of daily living, academic achievement, self-care, and social interaction [12,13]. The prevalence of challenging behaviors, such as self-injury or aggression, in children with DD ranges from 48% to 60% [14]. Hospitalization and emergency department care among children with DD were 1.8 times higher than in the general population [15]. Early diagnosis and intervention have been emphasized in many studies to alleviate various problems associated with cognitive development in children with DD. The human brain develops rapidly during the first two years of life, and the formation and density of synaptic connections peak at the age of three years [16]. The brain is most sensitive to stimulation during early childhood and neuroplasticity plays a critical role in brain development [17]. Early childhood interventions for DD have shown improvements in intellectual functioning and challenging behaviors [18]. Zwaigenbaum et al. [19] reported that initiating interventions for children with DD before three years of age may have more positive effects than those initiated after five years of age. Therefore, early identification and intensive interventions for DD are recommended to improve the prognosis.
Various assessment tools have been administered to evaluate the level of development and identify DD in preschool children. The Psychoeducational Profile (PEP), Wechsler Preschool and Primary Scale of Intelligence (WPPSI), and Vineland Adaptive Behavior Scales (VABS) are among the most commonly used tests in South Korea. PEP was designed to identify idiosyncratic learning patterns among children with ASD and to assist professionals in planning individualized educational programs [20,21]; however, studies on the utility of the PEP-Revised (PEP-R) for evaluating children’s development have focused on children with ASD rather than other disorders of DD [22,23]. Although WPPSI provides comprehensive information on the cognitive function of preschoolers, several studies have questioned its diagnostic ability in preschool children, especially those with limited verbal abilities [24,25]. On the other hand, the VABS has been widely used to assess adaptive behavior [26] and is considered a valid measure of ID in South Korea if a child is too young to undergo an intelligence test, such as the WPPSI [27]. However, previous studies using VABS Second Edition (VABS-II) have reported mixed results regarding the correlation between adaptive behavior and cognitive functioning [26,28].
Considering these findings, we believe that there may be differences between the scales for identifying DD across a range of ages, diagnoses, and levels of cognitive function. However, to our knowledge, only a few studies have compared the utility of these scales in distinguishing DD from typical development (TD) and assessing the development of preschoolers with DD. Therefore, we aimed to compare the utility of the PEP-R, Korean WPPSI Fourth Edition (K-WPPSI-IV), and VABS-II in preschoolers. In addition, we investigated their correlation.
The present study is part of research on the efficacy of a mobile-based cognitive training program in preschool children with or without DD [29]. Between May 2020 and July 2020, preschool children were prospectively recruited from community-based childcare centers, kindergartens, and special education service centers. Children were excluded if they had underlying problems and were unable to complete the psychometric test owing to: 1) any sensory disturbances, 2) neurological diseases such as cerebral palsy, or 3) severe gross or fine motor difficulty. A total of 164 preschool children aged 37–84 months participated in this study. Participants were classified into two groups: children with TD and children with DD. DD in this study included ASD, ID, and LD. Two board-certified child and adolescent psychiatrists confirmed the diagnosis based on DSM-5 diagnostic criteria, relevant psychometric tests, and additional interviews with caregivers and children. A simultaneous diagnosis of LD was not made once the child was diagnosed with ID. This study was approved by the Institutional Review Board (no. 2020-0386). Caregivers of the children consented to participate in the study.
The PEP-R, K-WPPSI-IV, and VABS-II were administered to assess children’s development. The Preschool Receptive-Expressive Language Scale (PRES) and Korean Childhood Autism Rating Scale, Second Edition (K-CARS 2) were used to obtain additional information related to DD.
The PEP-R [20,21] is used to assess the level of development in 6-month- to 7-year-old children. PEP-R comprises a developmental scale with 131 items and a behavioral scale with 43 items. The developmental scale consists of seven domains: imitation, perception, fine motor, gross motor, eye-hand coordination, cognitive performance, and cognitive-verbal performance. The developmental age equivalent is based on the number of items answered correctly in each domain. The developmental quotient (DQ) indicates the overall developmental level ([developmental age/chronological age]×100).
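The DQ formula above can be illustrated with a minimal sketch. The function name, the use of months as the age unit, and the example child are assumptions made for illustration only; they are not taken from the PEP-R manual or the study data.

```python
def developmental_quotient(developmental_age_months: float,
                           chronological_age_months: float) -> float:
    """Return DQ = (developmental age / chronological age) x 100.

    Both ages must be in the same unit (months are assumed here).
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return developmental_age_months / chronological_age_months * 100


# Hypothetical child: developmental age of 36 months at 48 months of age.
print(developmental_quotient(36, 48))  # 75.0
```

A DQ of 100 therefore corresponds to a developmental age equal to the chronological age, and lower values indicate a larger developmental delay.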
The K-WPPSI-IV [28,30] is a scale for evaluating the intellectual functioning of children aged between 2 years and 6 months and 7 years and 7 months. The K-WPPSI-IV provides a Full-Scale Intelligence Quotient (FSIQ) that represents overall intelligence, five primary index scores (verbal comprehension, visuospatial, fluid reasoning, working memory, and processing speed), and ancillary index scores (vocabulary acquisition index, nonverbal index, general ability index, and cognitive proficiency index).
The VABS-II [26,27] measures the adaptive functioning of individuals from birth to 90 years of age and comprises five domains, including communication, daily living skills, socialization, motor skills, and maladaptive behavior, which is optional. The domain scores from four domains (communication, daily living skills, socialization, and motor skills) are combined to yield an Adaptive Behavior Composite (ABC) score for children from birth to 6 years and 11 months of age.
The K-CARS 2 [31] was designed to identify the presence and severity of ASD symptoms and has been used to distinguish children aged >24 months with ASD from those with other developmental disorders. A clinician completed the scale by observing the children and interviewing their caregivers. The scores for each item ranged from 1 (age-appropriate behavior) to 4 (severely abnormal behavior), according to the severity of the child’s behavior.
The PRES [32] was developed to assess the developmental age of language in children aged 2–6 years. Receptive language age and expressive language age were obtained, and the receptive language quotient (RLQ) and expressive language quotient (ELQ) were calculated by dividing the corresponding language age by the chronological age and expressing the result as a percentage.
To compare the demographic and clinical characteristics between the two groups, independent t-tests for continuous variables and chi-squared tests for categorical variables were used. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were determined for all participants. Pairwise comparisons of ROC curves were performed using the DeLong method to identify differences between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores. Although the K-WPPSI-IV FSIQ and VABS-II ABC scores have been standardized and categorized according to the level of functioning, the cutoff values for the PEP-R DQ have not been determined. Youden’s index analysis was used to estimate the optimal cutoff value for the PEP-R DQ. McNemar’s exact test was used to identify statistical differences in the sensitivity of the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores. The Pearson’s correlation coefficient (r) was calculated to evaluate the correlation between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores. The partial correlation coefficient was calculated to measure the degree of correlation between the PEP-R, K-WPPSI-IV, and VABS-II, with age, sex, and socioeconomic status as controls.
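As an illustration of the ROC-based steps described above, the following sketch computes an AUC and a Youden-optimal cutoff in plain Python. It is not the SPSS, MedCalc, or R procedure used in the study; the function names and the two small score lists are hypothetical, and lower scores are assumed to indicate DD.

```python
def auc_lower_is_positive(pos, neg):
    """AUC as the probability that a random positive (DD) case scores lower
    than a random negative (TD) case; ties count as one half."""
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(pos, neg):
    """Return (cutoff, J, sensitivity, specificity) maximizing
    J = sensitivity + specificity - 1, calling a score <= cutoff positive."""
    best = None
    for c in sorted(set(pos) | set(neg)):
        sn = sum(p <= c for p in pos) / len(pos)   # DD correctly flagged
        sp = sum(n > c for n in neg) / len(neg)    # TD correctly cleared
        j = sn + sp - 1
        if best is None or j > best[1]:            # first maximum wins on ties
            best = (c, j, sn, sp)
    return best


# Hypothetical scores, not study data.
dd_scores = [55, 62, 70, 78, 85]
td_scores = [80, 92, 98, 105, 110]

print(round(auc_lower_is_positive(dd_scores, td_scores), 2))  # 0.96
cutoff, j, sn, sp = youden_cutoff(dd_scores, td_scores)
print(cutoff, round(j, 2), sn, sp)
```

When several cutoffs share the same maximal J, this sketch keeps the lowest one; statistical packages may resolve such ties differently.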
Data were analyzed using IBM SPSS version 21.0 (IBM Corp., Armonk, NY, USA), MedCalc statistical software version 20.118 (MedCalc, Ostend, Belgium), and R statistical software, version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). A p<0.05 was considered statistically significant. The strength of the correlation was described using Evans’ classification. Diagnostic accuracy was classified based on the AUC with reference to Šimundić [33].
Of the 164 children, 103 had TD and 61 had DD (Table 1). Comorbid ASD and ID and comorbid ASD and LD were reported in 33 and 4 children, respectively. ASD alone, ID alone, and LD alone were observed in 5, 12, and 7 children, respectively. Children with DD were older than those with TD (p<0.001). Sex and socioeconomic status (SES) showed statistically significant differences between the two groups (p=0.026 and p=0.002, respectively). Therefore, further analyses were performed, controlling for age, sex, and SES when possible.
Table 2 summarizes the clinical characteristics of the TD and DD groups. The mean PEP-R DQ, K-WPPSI-IV FSIQ, VABS-II ABC scores, PRES RLQ, and ELQ were significantly higher in the TD group than in the DD group (all p<0.001). In addition, they were significantly lower in the ASD, ID, and LD groups than in the TD group. The mean K-CARS 2 scores in the TD group were significantly lower than those in the ASD, ID, and LD groups, as well as in the DD group (all p<0.001).
The estimated AUCs of the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores for all participants were 0.953 (95% confidence interval [CI], 0.915–0.992; p<0.001), 0.955 (95% CI, 0.914–0.996; p<0.001), and 0.961 (95% CI, 0.932–0.991; p<0.001), respectively (Fig. 1). Based on pairwise comparisons of the ROC curves in all participants, no significant difference was observed between the PEP-R DQ and K-WPPSI-IV FSIQ (p=0.918), PEP-R DQ and VABS-II ABC scores (p=0.646), or K-WPPSI-IV FSIQ and VABS-II ABC scores (p=0.777). The ROC curves and AUCs for the ASD, ID, and LD groups are shown in Fig. 1. No significant differences in the AUCs were observed between the scales in each group. When the participants were divided into three groups based on their ages, children younger than 56 months formed the youngest of the three groups and were called the first tertile group in this study. The AUCs of the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores in the first tertile group were 0.994 (95% CI, 0.978–1.000; p<0.001), 0.918 (95% CI, 0.765–1.000; p<0.001), and 0.986 (95% CI, 0.960–1.000; p<0.001), respectively. The AUCs of the scales did not differ significantly between the groups.
The sensitivity (SN), specificity (SP), and positive and negative predictive values (PPV and NPV, respectively) of the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores are presented in Table 3. Based on the results of Youden’s index analysis, the cutoff value for the PEP-R DQ was estimated to be 81. The SN of the PEP-R DQ was significantly higher than that of the VABS-II ABC scores in all participants (p=0.013) and in the ID and TD groups (p=0.004). The SN between the K-WPPSI-IV FSIQ and VABS-II ABC scores was not significantly different, except between the ID and TD groups (p=0.039). However, a comparison of the SN between the PEP-R DQ and K-WPPSI-IV FSIQ in all DD groups was not available because the sum of the number of children in the discordant pairs was insufficient to calculate the exact McNemar’s test. We also determined the SN, SP, PPV, and NPV of the scales in the first tertile. The SN of the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores were 87.5%, 75.0%, and 50.0%, respectively. However, the SN was not compared among the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores in the first tertile group because of the insufficient number of discordant cells.
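The role of discordant pairs in these comparisons can be sketched as follows. This is a generic two-sided exact McNemar test based on the binomial distribution, not necessarily the routine used by the study software; the function name and the example counts are hypothetical. Here `b` is the number of children with DD detected by the first scale only and `c` the number detected by the second scale only.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis, the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not study data.
print(mcnemar_exact_p(8, 0))  # 0.0078125
print(mcnemar_exact_p(2, 1))  # 1.0
```

With only a handful of discordant pairs, as in the second call, the test has essentially no power to detect a difference in sensitivity, which is why such comparisons are uninformative in small subgroups.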
The correlations between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores are shown in Fig. 2. The PEP-R DQ demonstrated strong positive correlations with the K-WPPSI-IV FSIQ (r=0.90, p<0.001) and the VABS-II ABC scores (r=0.89, p<0.001). The K-WPPSI-IV FSIQ and VABS-II ABC scores also had a significant positive correlation (r=0.84, p<0.001). Even when controlled for the effects of age, sex, and SES, the partial correlation coefficients between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores were strong, ranging from 0.83 to 0.90 (p<0.001). In the ASD group, the partial correlations between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores were strong, ranging from 0.71 to 0.88. Moreover, strong partial correlations were identified in the ID group (range: 0.66–0.76) and the LD group (range: 0.67–0.91). Based on the K-WPPSI-IV FSIQ scores (<70 or >70), participants were divided into two groups for additional analysis. The group with K-WPPSI-IV FSIQ scores <70 showed strong partial correlations between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores (range: 0.67–0.75, p<0.001). In the group with K-WPPSI-IV FSIQ scores >70, the partial correlations between the PEP-R DQ and K-WPPSI-IV FSIQ and between the PEP-R DQ and VABS-II ABC scores were strong (r=0.686, p<0.001; r=0.614, p<0.001). A moderate partial correlation was identified between the K-WPPSI-IV FSIQ and VABS-II ABC scores (r=0.446, p<0.001).
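The idea of a partial correlation can be illustrated with a simplified sketch that controls for a single covariate using the standard first-order formula. This is a simplification: the analyses reported here controlled for age, sex, and SES together, which requires a regression-based or higher-order approach. The function names and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z only."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical scores and ages in months, not study data.
dq = [62, 75, 88, 95, 104, 110]
fsiq = [58, 70, 85, 99, 101, 112]
age_months = [40, 66, 52, 71, 45, 60]

print(round(pearson_r(dq, fsiq), 3))
print(round(partial_r(dq, fsiq, age_months), 3))
```

If the covariate is uncorrelated with both scores, the partial correlation equals the simple Pearson correlation; the two diverge as the covariate explains more of the shared variance.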
Partial correlations between the PEP-R DQ and developmental domains, K-WPPSI-IV FSIQ and primary index scores, and VABS-II ABC scores and domains are summarized in Fig. 3. The partial correlation coefficients were generally strong, ranging from 0.49 (between the PEP-R gross motor domain and the K-WPPSI-IV visuospatial index) to 0.90 (between the K-WPPSI-IV FSIQ and PEP-R DQ). All p-values of the partial correlations were less than 0.001.
This study found that the PEP-R, K-WPPSI-IV, and VABS-II effectively distinguished DD from TD in preschool children. Moreover, all three scales had excellent discriminative ability and did not differ significantly based on the ROC curves and AUC. In previous studies using PEP-R [34] and PEP-3 [35,36], the developmental scores of children with ASD were significantly lower than those of children with TD. In a study of children aged 12–36 months, the standard scores for communication, daily living skills, and socialization measured by the VABS-II in ASD were significantly lower than in TD [37]. Additionally, the diagnostic accuracy for ASD can be improved when VABS is used in conjunction with the Autism Diagnostic Interview-Revised and Autism Diagnostic Observation Schedule [38]. Our study findings were consistent with those of previous studies. Moreover, our study included preschoolers with ID, LD, and various levels of language measured by PRES, and our findings suggest that all three scales may be effective in identifying DD across diagnoses and a wide range of language levels.
Furthermore, our study aimed to examine the correlations between the PEP-R, K-WPPSI-IV, and VABS-II. Our results showed strong correlations between the PEP-R DQ, K-WPPSI-IV FSIQ, and VABS-II ABC scores in all participants. We conducted an additional analysis by dividing the participants according to their diagnoses. Strong partial correlations were identified among the ASD, ID, and LD groups. Previous studies on children and adolescents with ASD have reported a moderate-to-strong correlation between PEP and VABS [22,39]. The PEP-R also exhibited a strong correlation with the Hong Kong-based Adaptive Behavior Scale, which was modeled after the VABS [40]. Despite the paucity of studies on the correlation between PEP-R and WPPSI, some studies have shown that PEP-R in preschool children with ASD strongly correlates with levels of cognitive function, as determined using the Stanford-Binet Intelligence scales [41], Leiter-R [42], and Snijders-Oomen Nonverbal Intelligence test [34]. Our findings were consistent with those of previous studies. Consequently, we suggest that the PEP-R, K-WPPSI-IV, and VABS-II may be beneficial for evaluating cognitive function in preschoolers with DD.
Despite the presence of a strong partial correlation between the K-WPPSI-IV FSIQ and VABS-II ABC scores in the group with a K-WPPSI-IV FSIQ score <70, a moderate correlation was observed in the other group with a K-WPPSI-IV FSIQ score >70. A previous study showed that the correlations between the WPPSI and VABS-II ranged broadly from weak to strong depending on the level of cognitive function [28]. Strong correlations were reported between the WPPSI-IV FSIQ and VABS-II domain scores in ID, ranging from 0.58 to 0.73 [28]. Furthermore, the strength of the correlation between VABS-II and WPPSI-IV varied within the ASD group; the correlations in autistic disorder ranged from 0.23 to 0.59, and those in Asperger’s disorder ranged from 0.13 to 0.42 [28]. The socialization domain of the VABS in ASD is more impaired than other domains at different levels of cognitive functioning [43,44]. Additionally, when comparing ASD patients with and without ID, Alvares et al. [45] identified the largest difference in the communication domain of the VABS and the smallest difference in the socialization domain. These findings from previous studies may explain the mixed results found in our study regarding the correlation between the K-WPPSI-IV and VABS-II.
Our study had several limitations. First, there were demographic differences in age, sex, and SES between the TD and DD groups. Although we controlled for these variables where possible, not all analyses could be statistically adjusted for them. Second, our study focused on ASD, ID, and LD among the disorders of DD. Children with certain health conditions such as cerebral palsy, blindness, and deafness were excluded. Third, the proportion of patients with ASD or ID was high, whereas LD accounted for a small proportion of patients with DD. Additionally, approximately half of the children in the DD group had ASD comorbid with ID or LD. We did not calculate the correlations between the scales in each of the diagnostic groups without comorbidities because of the small sample size of children with single disorders. Therefore, the results should be interpreted with caution. Further research is recommended to include other disorders of DD and control for comorbidities.
Our study suggests that the PEP-R, K-WPPSI-IV, and VABS-II may be effective in evaluating the level of development in preschool children with DD across a range of ages or diagnoses. Further research is required to extend these findings to a broader population of children with DD.
The datasets generated and analyzed during the study are not publicly available due to specific restrictions in the informed consent agreements obtained from the participants but can be obtained from the corresponding author upon reasonable request.
The authors have no potential conflicts of interest to disclose.
Conceptualization: Sumi Ryu, Hyo-Won Kim. Data curation: Taeyeop Lee, Hyo-Won Kim. Formal analysis: Sumi Ryu, Taeyeop Lee, Seonok Kim, Hyo-Won Kim. Investigation: Sumi Ryu, Yunshin Lim, Haejin Kim, Go-eun Yu, Hyo-Won Kim. Methodology: Sumi Ryu, Taeyeop Lee, Seonok Kim, Hyo-Won Kim. Project administration: Hyo-Won Kim. Supervision: Hyo-Won Kim. Writing—original draft: Sumi Ryu. Writing—review & editing: Hyo-Won Kim.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (NRF-2020R1A5A8017671).