Matches in SemOpenAlex for { <https://semopenalex.org/work/W1971944430> ?p ?o ?g. }
- W1971944430 endingPage "993" @default.
- W1971944430 startingPage "987" @default.
- W1971944430 abstract "Objective: Assessment of child development often results in a multitude of binary outcome data. There is no agreed way to use them to score the developmental status of children. Conventional methods include age-standardized Z-scores and a simple sum of the number of passes. Recently two approaches based on the Rasch model and the concept of ‘developmental age’ have been proposed. This study aims to compare the performance of the four approaches. Methods: In a longitudinal study, 473 Malawian children were measured for growth status at age 36 months and administered a new test of developmental milestones between age 3 and 6 years. The test consisted of four domains: gross motor (GM), fine motor (FM), social and language development. The four approaches were used to score the developmental level of each child in each domain, and the results were compared. Results: In this sample, the approach based on the Rasch model provided development scores that were more normally distributed than the other approaches did. The four sets of scores were highly correlated with each other. They gave similar estimates of the effect of height-for-age on GM, social and language development. In FM development, the maximum difference in the effect size estimates was only 0.04 standard deviations despite its statistical significance (P = 0.009). Conclusion: The four approaches were practically equivalent in the context of the estimation of an intervention effect or association. Their relative advantages and disadvantages are discussed. None of them can be universally recommended. Objective: Assessment of child development often yields a multitude of binary outcome data. There is no agreed method of using them to classify the developmental stage of children. Conventional methods include age-standardized Z-scores and a simple sum of the number of passes. 
Recently, two approaches based on the Rasch model and the concept of ‘developmental age’ have been proposed. Our objective was to compare the performance of the four approaches. Methods: In a longitudinal study, 473 Malawian children were assessed for growth status at age 36 months. They were assessed again between the ages of three and six years. The test consisted of four domains: gross motor, fine motor, social and language development. The four approaches were used to score the developmental level of each child in each domain, and the results were compared. Results: In this sample, the approach based on the Rasch model provided development scores that were more normally distributed than those of the other approaches. The four sets of scores were highly correlated with one another. They gave similar estimates (P > 0.05) of the effect of height-for-age on gross motor, social and language development. In fine motor development, the maximum difference in the effect size estimates was only 0.04 standard deviations despite its statistical significance (P = 0.009). Conclusion: The four approaches were practically equivalent in the context of estimating an intervention effect or an association. None of them can be universally recommended. Objective: Assessing child development often yields a multitude of binary data. There is no agreed way to use them to score the developmental level of children. Conventional methods include the age-standardized Z-score and the simple sum of the number of passes. Recently, two new approaches based on the Rasch model and the concept of ‘developmental age’ have been proposed. We compared the performance of the four approaches. 
Methods: In a longitudinal study, 473 children in Malawi were measured for growth status at 36 months and later, between three and six years of age, assessed with a new test. The test covered four domains: gross motor, fine motor, social and language development. The four approaches were used to score the developmental level of each child in each domain, and the results were compared. Results: In this sample, the approach based on the Rasch model yielded the development scores with the most normal distribution compared with the other approaches. The four sets of scores were highly correlated with one another. They gave similar estimates (each P > 0.05) of the effect of height-for-age on gross motor, social and language development. In fine motor development, the maximum difference in the effect size estimates was only 0.04 SD despite its statistical significance (P = 0.009). Conclusion: The four approaches were practically equivalent in the context of estimating an association or the effect of an intervention. None of them can be universally recommended. Child development is an important aspect of child health and an important step towards reaching the Millennium Development Goals, but many children in developing countries are failing to achieve their developmental potential (Grantham-McGregor et al. 2007). In epidemiological and intervention studies, an association with child development may be quantified as an effect size, indicating the degree of change in the response (in standard deviations, SD) per unit change in the exposure variable or in the presence of an intervention (Machin et al. 1997). Child development can be assessed by inventories of milestones. A tester assesses the child’s ability to perform a series of tasks. 
A child may succeed or fail to pass each of these items, resulting in a multitude of binary outcomes. Some examples of commonly used inventories include the Bayley Scales of Infant Development (Bayley 1993) and the Griffiths Scale of Mental Development (Griffiths 1970). Many of the commonly used development tests were originally developed in Western Europe or Northern America. Cultural adjustment and appropriate modification are needed before they can be used in other societies. One such recent work is a test developed for use in children up to age 6 years in Malawi, southern Africa (Gladstone et al. 2008). The test consists of four domains: gross motor (GM), fine motor (FM), social and language development, and 110 items in all. Having administered a developmental test, the challenge is to use the multitude of binary data to score the developmental status of a child. How to use these data in epidemiological and intervention studies has remained debatable (Jacobusse et al. 2006; Drachler et al. 2007; Jacobusse & Buuren 2007). One simple approach is to count the total number of successes. This approach has two limitations. First, it does not adjust for variation in age at assessment. So, the same score may represent different developmental statuses for children at different ages. Consequently, it is quite common to use an approach that standardizes the scores according to age-specific reference data and results in an age-standardized Z-score. The Bayley Scales, for example, use this approach (Bayley 1993). Similarly, some instruments adjust for age and create percentile scores instead of Z-scores. These two methods share a similar basis but the percentile scores tend to give a non-normal, flat distribution. Here we will only discuss the Z-scores. Second, the counting of successes allows all items to contribute equally to the raw scores and Z-scores regardless of the items’ difficulty level. Drachler et al. 
(2007) recently proposed an approach to score child development that takes into account the difficulty level of each item. The resultant developmental score is the natural logarithm of the ratio of the child’s ‘ability age’ to actual age. A negative (positive) value means developmental delay (advance). One drawback of this score is that it cannot be estimated for children who either pass or fail all test items. Another recently proposed approach offers a quantitative developmental score based on the Rasch model (Jacobusse et al. 2006; Jacobusse & Buuren 2007). This approach does not standardize for age, but it has the advantage that the scores are on the same metric at all ages and can be compared across ages. Under this approach, children who have the same number of passes will have the same score even if they pass different items with different levels of difficulty. There is no consensus on the most appropriate way to score child development. Both the simple count (SC) and Z-scores are seen in the literature. Some studies analysed the items separately, leading to a problem of multiple testing. To our knowledge, there has been no empirical comparison of the four scoring approaches. This article aims to compare the four approaches to the scoring of child development in the context of paediatric epidemiology and intervention studies by drawing on data from the Malawian study (Gladstone et al. 2008). In this context, the focus of analysis and the motivation for scoring child development usually concern the estimation of an association between child development (response variable) and a risk factor or intervention (exposure variable), or the effect size. This article does not aim to study how best to use developmental tests in clinical services. The Lungwena Child Survival Study is an ongoing prospective cohort study of children born in 1995 and 1996 in Lungwena, a rural area in southern Malawi. 
Anthropometric measurements were collected at monthly intervals from birth up to the age of 18 months, 3-month intervals from 18 to 60 months and wider intervals thereafter. Details of the cohort study have been described previously (Maleta et al. 2003). A sub-study was conducted to develop an inventory for the assessment of child development. The sample for this sub-study was the cohort of survivors aged between 3 and 6 years and their younger siblings, excluding those who had significant disability, severe malnutrition, ≤34 weeks of gestation or twin births. The children were assessed on one occasion in 2000 to 2001 on a home visit by research assistants trained by a paediatrician in the study team (MG). The assessment took approximately 35 min to complete. Items were scored as either pass or fail, or ‘don’t know’ if the child was uncooperative or unwell. Items were administered until the child failed seven consecutive items in the same domain. After this, it was assumed that the child would not attain any further milestones in that domain. Details of the sub-study on child development have been described earlier (Gladstone et al. 2008). The present work is a secondary analysis of the previously reported data. Only the cohort members were included in this work. The study was approved by the National Health Science Research Committee, Malawi. All procedures deal with each of the four domains of development separately, giving four domain scores for each child. The first approach simply counts the total number of items passed. In this article we will call this the SC. The second approach standardizes the SC for age. For the present study, we do not have a separate population reference. Hence the standardization is internal. The children assessed were arranged into quintiles according to age. 
Within each quintile the mean and SD were calculated and a Z-score was obtained for each child by subtracting the age-specific mean from the child’s SC and then dividing the difference by the age-specific SD. We compared the four scores in three respects. First, we described and contrasted their distributions and examined which of them tended to be closer to normal, using the V-statistic described in Royston (1991). V > 1.4 represents a statistically significant deviation from normality (P < 0.05); a larger V indicates a larger degree of deviation from normality. Second, we estimated the pair-wise Pearson’s and Spearman’s correlation coefficients between the scores in order to see how similar they were. Third, we calculated the height-for-age Z-score (HAZ) at age 36 months using the reference of the WHO Multicentre Growth Reference Study Group (2006) and estimated and compared the effect size (in standard deviations) per unit change in HAZ by ordinary least squares regression, with gender and age at assessment as covariates. Testing for equality of effect sizes across the four scores was based on Zellner’s seemingly unrelated regression method (Greene 2003). The present sample was assessed at mean age 4.2 years (SD = 0.7) and so the items for infants and younger children were irrelevant (i.e. all passes). Only items with a pass rate below 100% in this sample were retained for the analysis. The developmental scores for each domain were estimated and analysed for subjects with non-missing values in HAZ and items in that domain. For the purpose of comparison with the LAR, which is not estimable if a child passes or fails all items, the analysis excluded these subjects, but they were included in the processes of deriving the other three scores. The 473 cohort members (233 males) were assessed, with median age 4.05 and range 3.13–6.15 years. 
The number of subjects with no missing values in HAZ at 36 months and the milestones for inclusion in the studies of GM, FM, social and language development were 384, 360, 399 and 389, respectively. Table 1 is a descriptive summary of the four scores for each of the four domains of child development. There were 20, 48, 31 and 36 children, respectively, who passed or failed all GM, FM, social and language items and therefore had no LAR. They were excluded from the analysis and comparison here. Most scores were negatively skewed (skewness < 0) in this age range and more pointed (kurtosis > 3) than a normal distribution would be. None of the score distributions shows evidence of normality. However, in the GM, FM and language domains, the RS were closer to normality than the other scores (V = 4.1, 13.4 and 3.2, respectively). Especially in GM and language development, the RS’s degree of skewness (-0.5 and -0.4) and excess kurtosis (3.2 and 3.3) was mild. The LAR of social development was closer to normality than the other three scores (V = 5.6). Table 2 presents the correlations between the scores. The Spearman’s correlation coefficients show that the two scores without age adjustment, that is SC and RS, had exact agreement in ranks (coefficients = 1.00). The two scores with age adjustment, that is Z-score and LAR, were also strongly correlated (from 0.80 to 0.92). Correlation between the scores with and without age adjustment was more modest (from 0.61 to 0.83). Pearson’s correlation coefficients gave similar results. Table 3 shows the effect size per unit increase in height-for-age at 36 months, with sex and age as covariates in the regression analysis. In GM development, the four scores increased by about a quarter of a standard deviation (0.23–0.28) per SD increase in HAZ. There was no statistically significant difference between the four effect size estimates (P = 0.070). 
The Z-statistics (effect size divided by standard error of effect size) were also similar (from 5.62 to 6.46). In FM development, the effect size estimates for SC, Z-score and RS were all 0.28; that for LAR was 0.32. The difference between these effect size estimates was only 0.32 − 0.28 = 0.04 SD, although it was statistically significant (P = 0.009). In the social domain, the effect size estimates were slightly below 0.2 SD (0.17–0.20). There was no significant difference between the four estimates (P = 0.428). The Z-statistics of all the aforementioned effect sizes were larger than 2.58 and so all were statistically significant at the 1% level (P < 0.01), indicating associations between HAZ and the three aspects of child development. HAZ was not associated with language development scores. The four effect size estimates were all close to zero and not statistically significant. Again, there was no significant difference between the effect size estimates (P = 0.320). Tests of child development typically result in a large array of binary data. To avoid the inflated risk of type I error resulting from multiple testing and to avoid being overwhelmed by a multitude of data, it is desirable to develop a summary measure of child development. There is no universally agreed way to do this. One common concern is whether the outcome scores follow a normal distribution. This concern is often overemphasized. The t-test and ordinary least squares regression are robust to deviations from normality, especially when the sample size is large (Heeren & D’Agostino 1987; Gujarati 1995; Cheung et al. 2008). This concern is more relevant when the sample size is small, though there is no fixed rule on how small that is. In the present comparison in a Malawian sample, none of the four methods provided developmental scores that closely followed a normal distribution. 
Nevertheless, the RS performed better in this regard, providing scores that had a smaller V, skewness closer to zero and kurtosis closer to three than the others in several domains. Despite very different conceptual frameworks and technical procedures, the four scores are strongly correlated. The relatively modest correlation between the age-adjusted and unadjusted scores is a result of the children being measured at variable ages. The strong correlation coefficients suggest that, in research practice, the practical difference in employing the different scoring methods may not be important. In paediatric epidemiology and intervention studies, the primary objective is often to estimate an association, or effect size. Related to this is the testing of the null hypothesis of effect size 0. In the context of behavioural sciences, Cohen suggested that an effect size of about 0.2 SD is a small effect (Machin et al. 1997). Stunting is an established predictor of a range of developmental and educational outcomes (Grantham-McGregor et al. 2007). We have compared the effect sizes in relation to a one-unit increase in HAZ at age 36 months. The use of different scoring algorithms produced only minor variations in the effect size estimates. In three of the four domains, the variations were not statistically significant. In FM development, where the variation in effect size estimates was statistically significant (P = 0.009), three methods gave an identical estimate of 0.28 whereas the LAR gave an estimate of 0.32. The difference between these high and low estimates was only 0.04 SD per unit increase in HAZ, which is substantially smaller than the ‘small’ effect suggested by Cohen. It takes a difference of five units in HAZ (5 × 0.04 = 0.2) for the difference in effect size between LAR and the other three scores to accumulate to a ‘small’ level of 0.2 SD. Such a minor difference between scoring algorithms is unlikely to be of scientific significance in paediatric research. 
We maintain that for many epidemiological and intervention studies, where the purpose is to estimate an effect size, there is no practical difference between the methods. Nevertheless, other research purposes may arise from time to time and they may be better served by one of the approaches. For example, the RS is not standardized for age and the values are comparable across age groups. If one’s purpose is to study the acceleration and deceleration of development in relation to age, this is likely to be the method of choice. One may also consider ease and meaning in the presentation and interpretation of data. For example, the mean RS is meaningless and it depends on how one scales it. In the proposal of Jacobusse and colleagues, it is scaled to have a mean of 50 (Jacobusse et al. 2006; Jacobusse & Buuren 2007). In contrast, the LAR (or its exponent) has a nice interpretation of whether a child’s ability age is above or below his/her actual age. One drawback of the LAR as proposed by Drachler et al. (2007) is that it is not defined if a child passes or fails all test items, when the estimation of ‘ability age’ will not converge. This may be a minor issue if the test items vary substantially in difficulty for a sample of children in a particular age range. In such situations the number of children with all passes or all failures would be small. Otherwise, the missing values in LAR would mean not only a reduced sample size but also a possibility of bias. Hence, the other three methods may be preferred. One way to deal with this problem of the LAR is to set the missing values to a high (or low) score for cases with all passes (or failures) and analyse them as right (or left) censored values. The calculable highest and lowest ‘ability age’ may be used to form the censoring thresholds. 
The possible statistical procedures for the analysis of censored data include Tobit (Greene 2003), censored least absolute deviations (Powell 1984) and the more typical survival analysis techniques (Machin et al. 2006). However, some of these techniques to deal with censoring, for example Tobit, are not robust to the violation of distributional assumption. In conclusion, this empirical comparison of the four approaches to the scoring of child development suggests that the four methods provide scores that are highly correlated and equivalent for the purpose of estimation of effect size. The simplest approach of counting the total number of successes is as useful as the much more statistically advanced methods in this context. For studies with smaller sample size, where normality in the data is a concern, there is some sign that the RS is preferable. It may be that at other times the research purposes and consideration of interpretability may require a particular method. One needs to consider the drawback of the LAR in not providing scores for all subjects. An approach to deal with this as a censoring problem is proposed. The study was funded by grants from the Academy of Finland (grants 200720 and 109796), the Foundation for Paediatric Research in Finland, and the Medical Research Fund of Tampere University Hospital. XD received a scholarship from the Finnish Centre for International Mobility. The funders played no role in the study’s implementation, analysis or reporting." @default.
- W1971944430 created "2016-06-24" @default.
- W1971944430 creator A5004624020 @default.
- W1971944430 creator A5019453659 @default.
- W1971944430 creator A5023203369 @default.
- W1971944430 creator A5063772250 @default.
- W1971944430 creator A5064869077 @default.
- W1971944430 date "2008-08-01" @default.
- W1971944430 modified "2023-09-25" @default.
- W1971944430 title "Comparison of four statistical approaches to score child development: a study of Malawian children" @default.
- W1971944430 cites W1483693356 @default.
- W1971944430 cites W1600501516 @default.
- W1971944430 cites W1651797523 @default.
- W1971944430 cites W1972901764 @default.
- W1971944430 cites W2055527405 @default.
- W1971944430 cites W2061836320 @default.
- W1971944430 cites W2071937436 @default.
- W1971944430 cites W2110806731 @default.
- W1971944430 cites W2138056489 @default.
- W1971944430 cites W2139255721 @default.
- W1971944430 cites W2153221666 @default.
- W1971944430 cites W2172096768 @default.
- W1971944430 cites W4236153054 @default.
- W1971944430 doi "https://doi.org/10.1111/j.1365-3156.2008.02104.x" @default.
- W1971944430 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/18554248" @default.
- W1971944430 hasPublicationYear "2008" @default.
- W1971944430 type Work @default.
- W1971944430 sameAs 1971944430 @default.
- W1971944430 citedByCount "8" @default.
- W1971944430 countsByYear W19719444302012 @default.
- W1971944430 countsByYear W19719444302013 @default.
- W1971944430 countsByYear W19719444302015 @default.
- W1971944430 countsByYear W19719444302017 @default.
- W1971944430 countsByYear W19719444302021 @default.
- W1971944430 crossrefType "journal-article" @default.
- W1971944430 hasAuthorship W1971944430A5004624020 @default.
- W1971944430 hasAuthorship W1971944430A5019453659 @default.
- W1971944430 hasAuthorship W1971944430A5023203369 @default.
- W1971944430 hasAuthorship W1971944430A5063772250 @default.
- W1971944430 hasAuthorship W1971944430A5064869077 @default.
- W1971944430 hasBestOaLocation W19719444301 @default.
- W1971944430 hasConcept C105795698 @default.
- W1971944430 hasConcept C187212893 @default.
- W1971944430 hasConcept C2986587452 @default.
- W1971944430 hasConcept C33923547 @default.
- W1971944430 hasConcept C71924100 @default.
- W1971944430 hasConceptScore W1971944430C105795698 @default.
- W1971944430 hasConceptScore W1971944430C187212893 @default.
- W1971944430 hasConceptScore W1971944430C2986587452 @default.
- W1971944430 hasConceptScore W1971944430C33923547 @default.
- W1971944430 hasConceptScore W1971944430C71924100 @default.
- W1971944430 hasIssue "8" @default.
- W1971944430 hasLocation W19719444301 @default.
- W1971944430 hasLocation W19719444302 @default.
- W1971944430 hasOpenAccess W1971944430 @default.
- W1971944430 hasPrimaryLocation W19719444301 @default.
- W1971944430 hasRelatedWork W2039318446 @default.
- W1971944430 hasRelatedWork W2080531066 @default.
- W1971944430 hasRelatedWork W2410491650 @default.
- W1971944430 hasRelatedWork W2465156443 @default.
- W1971944430 hasRelatedWork W2604682584 @default.
- W1971944430 hasRelatedWork W2748952813 @default.
- W1971944430 hasRelatedWork W2899084033 @default.
- W1971944430 hasRelatedWork W3032375762 @default.
- W1971944430 hasRelatedWork W3108674512 @default.
- W1971944430 hasRelatedWork W4242858705 @default.
- W1971944430 hasVolume "13" @default.
- W1971944430 isParatext "false" @default.
- W1971944430 isRetracted "false" @default.
- W1971944430 magId "1971944430" @default.
- W1971944430 workType "article" @default.
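
The abstract text of this record describes two conventional scores, the simple count (SC) and an internally age-standardized Z-score built from age quintiles, plus a score defined as the natural logarithm of the ratio of 'ability age' to actual age. The Python below is a minimal sketch of those three definitions and is not the authors' code. Several things here are assumptions: all function names are hypothetical, quintiles are formed by age rank with ties broken by input order, the sample SD (n - 1 denominator) is used, and the 'ability age' is taken as an input because its estimation is not described in the text.

```python
import math
from statistics import mean, stdev


def simple_count(items):
    """Simple count (SC): number of items passed (1 = pass, 0 = fail)."""
    return sum(items)


def age_quintile_z_scores(ages, counts):
    """Internally age-standardized Z-scores.

    Children are ranked by age and split into five groups of nearly equal
    size (an assumption about how the quintiles are formed). Each child's
    count is centred on the group mean and divided by the group SD.
    """
    n = len(ages)
    if n != len(counts):
        raise ValueError("ages and counts must have the same length")
    order = sorted(range(n), key=lambda i: ages[i])
    z = [0.0] * n
    for q in range(5):
        # Group q holds the age ranks in [q*n//5, (q+1)*n//5).
        group = order[q * n // 5:(q + 1) * n // 5]
        if len(group) < 2:
            raise ValueError("need at least two children per age quintile")
        scores = [counts[i] for i in group]
        m, s = mean(scores), stdev(scores)  # sample SD: an assumption
        if s == 0:
            raise ValueError("zero SD within an age quintile")
        for i in group:
            z[i] = (counts[i] - m) / s
    return z


def log_age_ratio(ability_age, actual_age):
    """Natural log of ability age over actual age; negative means delay."""
    return math.log(ability_age / actual_age)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    ages = [3.1, 3.3, 3.6, 3.8, 4.0, 4.2, 4.5, 4.9, 5.4, 6.0]
    counts = [3, 5, 4, 8, 6, 7, 9, 10, 11, 12]
    for a, c, z in zip(ages, counts, age_quintile_z_scores(ages, counts)):
        print(f"age {a:.1f}  SC {c:2d}  Z {z:+.2f}")
    print(log_age_ratio(3.5, 4.0))
```

Because the standardization is internal, the Z-scores within each age quintile have mean zero by construction, so they describe a child's standing relative to same-age peers in the sample rather than relative to an external reference population.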