Re: Caregiver Survey Data: Scoring Instructions

Scoring Instructions, 4.23.13

Revised: 5.10.13

To: NPLH Evaluation Team

From: Lyscha

Re: Caregiver Survey Data: Scoring Instructions

Date: April 23, 2013

This document provides scoring instructions for the variables included in the Caregiver Survey Dataset. As scoring decisions are made, this document will be updated accordingly.

BITSEA Externalizing

The BITSEA Examiner’s Manual only provides norming, percentile rankings, and clinical cut-offs for its scales (i.e., Competence and Problem), not subscales. With respect to missing or unanswered items, the instructions vary by scale type. For the Problems Total Scale score, if five or more items are unanswered, this scale should not be summed. Likewise, if two or more Competence items are unanswered, this scale should not be summed (p. 8).

Because the manual does not contain scoring information for subscales such as our externalizing measure, I recommend constructing an aggregate variable based on mean scores if 75% of the subscale items have been answered (at least 5 out of 7) because mean scores provide greater validity in the presence of missing data. The team agreed.
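The rule above can be sketched in Python as follows. This is only an illustration: the function name `mean_if_enough` and the use of `None` for an unanswered item are assumptions, not part of the team's actual syntax.

```python
from typing import Optional, Sequence


def mean_if_enough(items: Sequence[Optional[float]],
                   min_answered: int = 5) -> Optional[float]:
    """Return the mean of the answered items, or None if too few were answered.

    `items` holds one respondent's item scores; None marks an unanswered item
    (an assumed convention for this sketch).
    """
    answered = [x for x in items if x is not None]
    if len(answered) < min_answered:
        return None  # too much missing data: leave the aggregate missing
    return sum(answered) / len(answered)
```

For example, a respondent who answered 5 of the 7 items receives a mean of those 5 items, while a respondent who answered only 4 receives a missing value.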

Infant Externalizing Scale (IES)

The IES was created by Katherine Casillas and was derived from two existing measures: Rothbart’s IBQ and Bates’ ICQ. The NPLH evaluation team decided to use a 7-point scale so that the measure would be similar to the Protective Factors Survey.

The IES does not yet have scoring protocols. I recommend constructing an aggregate variable based on mean scores if 75% of the scale items have been answered (at least 5 out of 7). The team agreed.

Scoring notes:

V51, “Easy to soothe/comfort,” is reverse coded.
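As an illustration of reverse coding on a 7-point scale, the sketch below assumes that responses are stored as the integers 1 through 7 (the memo does not state the stored values) and that `None` marks an unanswered item. The name `reverse_code` is hypothetical.

```python
from typing import Optional

SCALE_MIN = 1  # assumed lowest response value
SCALE_MAX = 7  # assumed highest response value (7-point scale)


def reverse_code(value: Optional[int],
                 low: int = SCALE_MIN,
                 high: int = SCALE_MAX) -> Optional[int]:
    """Flip a response so that low becomes high and vice versa."""
    if value is None:
        return None  # unanswered items stay missing
    if not low <= value <= high:
        raise ValueError(f"response {value} is outside {low}-{high}")
    return low + high - value
```

Under these assumptions a response of 1 becomes 7, 2 becomes 6, and the midpoint 4 is unchanged. Reverse coding would be applied to V51 before the mean score is computed.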

SDQ Conduct Problems[1]

According to the scoring materials, the SDQ items have values of 0, 1, or 2 that correspond to “not true,” “somewhat true,” and “certainly true.” In the most current dataset, the SDQ items are coded with 1, 2, or 3 values. If we want to make comparisons with the norming data, I would recommend using the values specified in the instructions. It was decided that Casey will recode the 1, 2, or 3 values to 0, 1, or 2.
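A minimal sketch of this recode is shown below. The names `SDQ_RECODE` and `recode_sdq` are hypothetical, and `None` is assumed to mark an unanswered item.

```python
from typing import Optional

# Dataset value -> value used in the SDQ scoring materials
SDQ_RECODE = {1: 0, 2: 1, 3: 2}


def recode_sdq(value: Optional[int]) -> Optional[int]:
    """Map a dataset value of 1, 2, or 3 onto the 0, 1, 2 metric."""
    if value is None:
        return None  # unanswered items stay missing
    if value not in SDQ_RECODE:
        raise ValueError(f"unexpected SDQ value: {value}")
    return SDQ_RECODE[value]
```

Raising an error on any other value is a design choice for this sketch: it flags items that were already recoded or mis-entered rather than silently passing them through.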

For documentation purposes, the materials I reviewed indicated that the SDQ typically contains 25 items in total, comprising 5 scales of 5 items each. The SDQ Conduct Problems scale used in the NPLH evaluation contains 7 items. Thus, if norming data are used, only the 5 specified items should be included (denoted by checkmarks in Table 1). In addition, there appear to be two sources for the 7 items that were selected for inclusion in our survey. As shown in Table 1, five items are from the “Parent/Teacher 3 or 4-year old” version and five items (3 of which overlap) are from the “Parent 4-to-10-year old” version.

Because both versions of the SDQ can be used with 4-year-olds, Erin and Lyscha decided to use the “4-to-10-year old” version for this age group, which ensures that the same items will be used at follow-up.

As a team we decided to:

Use a 5-item conduct problems variable (instead of a 7-item version) based on child age.
Use the “4-to-10-year old” version for children greater than 10.

Table 1. SDQ items by version

NPLH Measure / Parent or Teacher, 3 or 4 year-olds / Parent 4-to-10 year-olds
sdq63 ‘Often loses temper’ / ✓ / ✓
sdq64 ‘Generally well behaved’ (reverse scored) / ✓ / ✓
sdq65 ‘Often fights with other children’ / ✓ / ✓
sdq66 ‘Often lies or cheats’ / no / ✓
sdq67 ‘Steals’ / no / ✓
sdq68 ‘Often argumentative’ / ✓ / no
sdq69 ‘Can be spiteful’ / ✓ / no

Note. In the NPLH evaluation, the SDQ is used for children 37 months or greater.

The footnoted source below suggests that the scale items should be summed. Referring to the standard 5-item scale, the instructions state that the “scale scores can be prorated if at least 3 items are completed” (60% complete). Following the developer’s guidelines, the team decided to compute a summed aggregate variable if at least 3 of the 5 items were answered.

If missing data are present (i.e., 1 or 2 items left unanswered), mean substitution will be used to account for these missing values. This step is necessary because the Conduct Problems scale score is based on a summed (rather than a mean) value.
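The sketch below illustrates one way to compute such a sum with mean substitution. It assumes the items are already on the 0, 1, 2 metric, that `None` marks an unanswered item, and that each missing item is replaced by the respondent's own mean across the items he or she did answer. The function name `prorated_sum` is hypothetical, and the result is left unrounded because no rounding rule is specified here.

```python
from typing import Optional, Sequence


def prorated_sum(items: Sequence[Optional[float]],
                 n_items: int = 5,
                 min_answered: int = 3) -> Optional[float]:
    """Sum the scale items, filling missing items with the respondent's mean."""
    if len(items) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(items)}")
    answered = [x for x in items if x is not None]
    if len(answered) < min_answered:
        return None  # fewer than 3 of 5 answered: no scale score
    item_mean = sum(answered) / len(answered)
    n_missing = n_items - len(answered)
    # Each missing item contributes the mean of the answered items
    return sum(answered) + item_mean * n_missing
```

For example, answered scores of 2, 1, and 0 with two items missing give an item mean of 1.0, so the prorated sum is 3 + 2 × 1.0 = 5.0. This is equivalent to multiplying the item mean by 5.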

Cut-off scores. “Caseness from symptom scores” are used to “identify likely cases with mental health disorders.” These scores do not vary by age or gender.[2] For the Conduct Problems scale, the cut-off scoring is: normal = 0 to 2, borderline = 3, abnormal = 4 to 10. A 3-level categorical variable was constructed to reflect these cut-offs. In addition, a dichotomous variable (“normal” vs. “borderline” or “abnormal”) was constructed.
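The two derived variables can be sketched as follows, assuming the normal band covers scores of 0 to 2 and that the scale score is a whole number. How a fractional prorated score should be rounded before categorizing is not specified in this memo, so the sketch rejects fractional input rather than guess. The function names are hypothetical.

```python
from typing import Optional


def conduct_category(score: Optional[float]) -> Optional[str]:
    """Return 'normal', 'borderline', or 'abnormal' for a Conduct Problems score."""
    if score is None:
        return None  # no scale score, so no category
    if score != int(score):
        raise ValueError("round fractional scores before categorizing")
    if not 0 <= score <= 10:
        raise ValueError(f"score {score} is outside 0-10")
    if score <= 2:
        return "normal"
    if score == 3:
        return "borderline"
    return "abnormal"


def conduct_elevated(score: Optional[float]) -> Optional[bool]:
    """Dichotomous version: True for borderline or abnormal, False for normal."""
    category = conduct_category(score)
    if category is None:
        return None
    return category != "normal"
```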

Protective Factors Survey

The Protective Factors Survey Scoring Manual[3] discourages the construction of a Child Dev/Knowledge of Parenting Subscale “because of the nature of these items” (p. 24). However, we still recommend creating a mean subscale score, with 4 items as the minimum. Note that the scoring info directly below does not pertain to the Child Dev/Knowledge of Parenting Subscale as this information was not provided.

An average score is computed for all subscales. The minimum number of required responses for each subscale is shown in Table 2.

Table 2. Minimum number of required responses for the PFS Subscales

PFS Subscale / Minimum number of required responses
Family Functioning/Resiliency / 4
Social Support / 2
Concrete Support / 2
Nurturing and Attachment / 3
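The minimums in Table 2 can be applied with a small lookup, sketched below. The dictionary and function names are hypothetical, `None` is assumed to mark an unanswered item, and the responses passed in are assumed to have been reverse scored already where required.

```python
from typing import Optional, Sequence

# Minimum number of answered items per subscale, copied from Table 2
PFS_MIN_RESPONSES = {
    "Family Functioning/Resiliency": 4,
    "Social Support": 2,
    "Concrete Support": 2,
    "Nurturing and Attachment": 3,
}


def pfs_subscale_mean(subscale: str,
                      responses: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the answered items, or None if the Table 2 minimum is not met."""
    minimum = PFS_MIN_RESPONSES[subscale]  # KeyError for an unknown subscale
    answered = [x for x in responses if x is not None]
    if len(answered) < minimum:
        return None
    return sum(answered) / len(answered)
```

Keeping the minimums in one dictionary means that a later scoring decision only needs to be changed in one place.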

Scoring notes:

Six items are reverse-scored so that a higher score reflects a higher level of protective factors (see Table 3).

Table 3. PFS items that are reverse scored

1 / PF8. I would have no idea where to turn...
2 / PF9. I wouldn't know where to go for help...
3 / PF11. If I needed help finding a job...
4 / PF12. There are many times when I...
5 / PF14. My child misbehaves just to...
6 / PF16. When I discipline my child...

Economic Hardship

The economic hardship scale is based on two variables. Erin and I both recommend using a mean value that allows for one missing variable to minimize missing values. The team decided to use a mean summary variable; however, a latent or factor score variable might warrant exploration if the magnitude of the correlation between the two items is small.

Scoring notes:

V18, “difficulty paying bills,” is reverse coded so that higher values correspond to greater difficulty.

[1] Scoring information retrieved from This same information is also available at To obtain the info, select “Instructions in English for scoring informant-rated SDQs by hand.”

[2] In my correspondence with Robert Goodman (4.17.13), the scale developer, he explained his rationale for not providing scores by age and gender: “We don't have age- and gender- specific cut-off scores because our experience (based particularly on analysis of UK data) is that the same cut-offs seem to make clinical sense across age and gender. Thus boys in the general population have higher mean scores for externalising problems, but boys and girls with comparable behavioral or ADHD disorders tend to have similar SDQ scores. Hence a single cutoff for boys and girls makes sense. Similar issues apply to age.”

[3] Scoring information retrieved from: