He Matai Matatupu - Kia Ata Mai Educational Trust



Standardised tests do not measure slow progress well
It is difficult to design a good reading assessment instrument which can be used close to the onset of instruction. Standardised tests sample from all behaviours and they do not discriminate well until considerable progress has been made by many of the children (Clay, 1991, page 204). Yet teachers can identify the children making slow progress before standardised tests can do this effectively. In my own research 20 to 25 percent of beginning readers were showing some confusions and difficulties one year to 18 months before good assessments could be obtained by standardised tests of reading for children in the tail end of the distribution of test scores. We should try to use systematic observation by teachers as one way to achieve early identification of children who need supplementary help.
I have come to place less emphasis on assessments which yield an age or grade level score in the first years of school. A programme of assessment will give me checkpoints on the general level of performance of children but I would want to have, in addition, records of progress on individual children - where they were at various points during the year, what products they could produce and what processes they could control on what texts.
To be acceptable as evidence of children’s progress, observational data would have to be as reliable as test data. Running records have shown high reliability, with scores for accuracy and error having reliabilities of 0.90. Observers find self-correction behaviour harder to agree upon, and the reliability for that score can drop to 0.70.
Running records of text reading have face and content validity. You cannot get closer to the valid measure of oral reading than to be able to say the child can read the book you want him to be reading at this or that level with this or that kind of processing behaviour. Little or nothing is inferred. You can count the number of correct words to get an accuracy score. The record does not give a measure of comprehension but you can tell from the child’s responses to the story and from the analysis of error and self-correction behaviour how well the child works for meaning. And you can gauge his understanding of the story in the discussion you have with him about the story. You do not get a score on letters known, but you can see whether the child uses letter knowledge on the run in this reading.
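The accuracy score mentioned above is simple arithmetic, and a minimal sketch may make it concrete. The function name and the example figures are illustrative assumptions and are not taken from any published scoring procedure; the sketch only expresses correct words as a percentage of the words in the passage.

```python
def accuracy_percent(correct_words: int, running_words: int) -> float:
    """Return the percentage of words in the passage that were read correctly.

    Both parameter names are hypothetical labels for this sketch:
    correct_words is the count of words read correctly, and
    running_words is the total number of words in the passage.
    """
    if running_words <= 0:
        raise ValueError("running_words must be a positive count")
    if not 0 <= correct_words <= running_words:
        raise ValueError("correct_words must lie between 0 and running_words")
    return 100.0 * correct_words / running_words


# Invented example: a child reads 135 of 150 words correctly.
example_score = accuracy_percent(135, 150)
```

A score of this kind says nothing by itself about how the child worked for meaning; that still has to come from the analysis of errors and self-corrections and from the discussion of the story.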
In summary, standardised tests are indirect ways of observing children’s progress. They are suitable for reporting the behaviours of groups but cannot compare with the observation of learners at work for providing the information needed to design sound instruction.

Systematic observation
Educators have done a great deal of systematic testing and relatively little systematic observation of learning. One could argue that educators need to give most of their attention to the systematic observation of learners who are on the way to those final scores on tests.
Systematic observations have four characteristics in common with good measurement instruments. They provide:

  • a standard task
  • a standard way of setting up the task (administration)
  • ways of knowing when we can rely on our observations and make reliable comparisons
  • a task that is like a real world task as a guarantee that the observations will relate to what the child is likely to do in the real world (for this establishes the validity of the observation).

The standard task and administration provide sound measurement conditions. Otherwise we would be evaluating with a piece of elastic instead of using an instrument that behaves in the same way on every occasion. Two measurements with a piece of elastic cannot be compared, and comparability is often important not only at the national, state and district level but also at the individual level, because we often want to compare two performances by the same student. A standard task, which is administered and scored in a standard way, gives one kind of guarantee of reliability in comparisons.
Not all of our observations have to be on standard tasks but those used to demonstrate change over time should be. The problem with observations is that they can have many sources of error. One of these sources of ‘error’ is that what you ‘know’ about reading and writing will determine what you observe in children’s literacy development. You bring to the observation what you already believe.
We need to design procedures that limit the possibilities of being in error or being misled by our observations. One way we can do this is to make certain that a wide range of measures or observations is used. Probably no one technique is reliable on its own. When important decisions are to be made we should increase the range of observations we make in order to decrease the risk that we will make errors in our interpretations.
For example, a word test should never be used in isolation because it assesses only one aspect of early reading behaviours. So does retelling. The child is learning more about letters, and about how print is written down, and how to form letters and write words, and something about letter-sound relationships, and teachers need to know how learning is proceeding in each of these areas. That is why the observation tasks described in this Survey range across each of these areas of knowledge.
It is imperative, also, that we attend to the reliability of our observations. An unreliable test score means that if you took other measures, at around the same time or at another time, you might get very different results. We have to be concerned with whether our assessments are reliable because we do not want to alter our teaching, or decide on a child’s placement, on the basis of a flawed judgement. We need to be able to rely on the data from which we make our judgements.
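One common way to put a number on this kind of reliability is to correlate two sets of scores for the same children, for example scores from two occasions close together or from two observers. The sketch below assumes a Pearson correlation is the coefficient of interest; the text does not say which coefficient lay behind the reliabilities reported earlier, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(first_scores, second_scores):
    """Correlate two equal-length lists of scores for the same children."""
    if len(first_scores) != len(second_scores) or len(first_scores) < 2:
        raise ValueError("need two lists of the same length, with at least 2 scores")
    n = len(first_scores)
    mean_first = sum(first_scores) / n
    mean_second = sum(second_scores) / n
    # Sum of cross-products and sums of squares of deviations from each mean.
    sxy = sum((x - mean_first) * (y - mean_second)
              for x, y in zip(first_scores, second_scores))
    sxx = sum((x - mean_first) ** 2 for x in first_scores)
    syy = sum((y - mean_second) ** 2 for y in second_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)
```

A value near 1 suggests the two sets of scores rank the children in much the same way, and a low value warns that a decision based on either set alone could rest on a flawed judgement.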
It is important that we use tasks that are authentic. The word authentic has arisen among educators because many tests of reading and writing and spelling are being challenged as not valid measures of real world literacy activities. One of the current criticisms of the multiple choice type of test items is that they are a special type of task not found in real life; they are a test device with no real world reference. It will be better if we can find sound assessment procedures which reflect what the learner is mastering or struggling with. (Concepts About Print was designed to have such authenticity 20 years before the word appeared in the assessment field.)

Characteristics of observation tasks
All the observation tasks which I will discuss were developed in research studies. I like to call them observation tasks but they do have the qualities of sound assessment instruments with reliabilities and validities and discrimination indices established in research studies.
These observation tasks can be justified not only by theories of measurement: other theories are taken into account, from the psychology of learning, from developmental psychology, from studies of individual differences, and from theories about social factors and the influences of contexts on learning.
The observation tasks were not designed to produce samples of work which go into portfolios; they were designed to make a teacher attend to how children work at learning in the classroom. It is useful to supplement our observations of children’s portfolio work by systematic observation tasks, because portfolio products are often channelled by the teacher’s ways of teaching or expectations, and sometimes a different kind of observation task will confront the teacher with a new kind of evidence of a child’s strengths or problems.
The observation tasks in this Survey do not simplify the learning challenge. They are designed to allow children to work with the complexities of written language.
They do not measure children’s general abilities, and they do not look for the outcomes of a particular programme. They tell teachers something about how the learner searches for information in printed texts and how that learner works with that information.*

*To help teachers attend to features of oral language one could recommend Clay et al. (1983) and Cazden (1988). A standard story retelling task (McKenzie, 1986; Morrow, 1989) is also helpful to sensitise teachers to individual differences in the child’s growing control over constructing stories.

 
