Please also note that the Balanced Assessment Primary & Elementary Tasks have been published by Corwin Press. The Balanced Assessment Transition & Middle School Tasks have been published by Teachers' College Press. These tasks may still be viewed in .pdf format on this website but they may not be copied or printed.

An Interim Report

of the

Harvard Group

Balanced Assessment in Mathematics Project

September, 1995

Educational Technology Center

Harvard Graduate School of Education

This research was supported by a subcontract from the University of California at Berkeley, under National Science Foundation grant MDR‑9252902.

Let early education be a sort of amusement; you will then be better able to find out the natural bent.

Plato, The Republic, bk. VII, 537

Introduction

Assessing the mathematical performance of our students and the effectiveness of our mathematics instructional programs has become a major concern of a large part of the mathematics education community as well as a concern of several larger publics. The National Council of Teachers of Mathematics (NCTM) has addressed that concern in its recently released Assessment Standards for School Mathematics, which provides a set of six standards to guide the development of assessment instruments for school mathematics.

This NCTM document makes clear, however, that it is a guide and not a “how-to” document. Guides are necessary but not sufficient. One actually needs different models of assessment that instantiate the principles set down in guidelines such as those offered by the NCTM. Balanced Assessment in Mathematics (BA) is a National Science Foundation Project charged with developing new approaches to the assessment of mathematical competence in the elementary and secondary grades. The principal grantee is the University of California at Berkeley with subcontracts to Michigan State University, the Shell Mathematics Centre of the University of Nottingham, and the Educational Technology Center of the Harvard University Graduate School of Education. The Principal Investigator for the entire project is Alan Schoenfeld of the University of California at Berkeley.

The main goal of Balanced Assessment is to produce assessments that can be used in classrooms throughout the nation — assessments that reflect the values of the mathematics reform movement as articulated in the National Council of Teachers of Mathematics Curriculum and Evaluation Standards. The assessments created by Balanced Assessment are designed to provide students, teachers, schools and parents with useful information about how students and programs are doing with respect to those standards.

This document is an interim report of two years of work by the Harvard Group of the project. It is intended to both complement and supplement other reports issued by the project. This report addresses the following questions:

· What is mathematics about?

· What are the purposes of assessment?

· How should assessment in mathematics be done?

· What is Balanced Assessment in Mathematics about?

· What is the Harvard Group of Balanced Assessment in Mathematics about?

Our report also includes a complete archive of the work of the Harvard Group. It contains a packet of on-demand tasks and scoring rubrics at the elementary level, several packets of on-demand tasks and scoring rubrics at the secondary level, a secondary level portfolio packet containing problems and projects, and a technology resource package from which teachers may draw materials to supplement other on-demand assessment. Included with this report is a CD-ROM containing all of these materials in a form that can be used by anyone who has access to Microsoft Word on either a PC-compatible or Macintosh computer with a CD-ROM drive.

The work of the Balanced Assessment Project has been influenced in no small measure by the efforts of those who preceded it. We have been helped enormously by being able to draw on these efforts. A selected bibliography of the most important of these is included as Appendix C.

This document owes much to the work of many hundreds of students and dozens of teachers. We are indebted to all of them. We are particularly grateful to Joel Hillel for helping us think through many knotty issues. In addition we want to thank Walter Stroup and the Boston teachers and students with whom he worked for many of the tasks involving graphing calculators. Finally, we wish to thank our colleagues at the other project sites.

Judah L. Schwartz, Director

Joan M. Kenney, Coordinator

Kevin A. Kelly

Teresa Sienkiewicz

Yesha Sivan

Victor Steinbok

Michal Yerushalmy

Table of Contents

1. What is Mathematics About? — the way we see the structure of the subject

The Objects of Mathematics

The Actions of Mathematics

What Can/Should be Expected of Students at the Elementary Level

What Can/Should be Expected of Students at the Secondary Level

2. What are the Purposes of Assessment?

3. How should Assessment in Mathematics be Done?

4. What is Balanced Assessment in Mathematics About?

5. What is the Harvard Group of Balanced Assessment About?

Why this Report?

Task Design

New Task Types

The “-Ness” tasks

Fermi tasks

Example generation

Weighting of Tasks

Writing Rubrics for Tasks

Scoring Student Performance

Balancing Assessment Packets

Appendix A: Mathematical Content Matrix for the Elementary Grades

Appendix B: An Analysis of the “Square-Ness” Task

Appendix C: Selected Assessment Bibliography

Appendix D: Balanced Assessment Packets

A Balanced Assessment Packet for the Elementary Grades

Balanced Assessment Packets for the Secondary Grades

The Technology Resource Packet

1. What is Mathematics About? — the way we see the structure of the subject

As in many subjects, it is possible to identify both content and process dimensions in mathematics. Unlike many subjects, where most of the process dimension refers to general reasoning, problem-formulating and problem-solving skills, the process dimension in mathematics refers to many skills that are mathematics specific. As a result, many people tend to lump content and process together when speaking about mathematics, calling it all mathematics content.

We believe it is important to maintain the distinction between content and process. In part we say this because we believe that this distinction reflects something very deep about the way humans approach mental activity of all sorts. All human languages have grammatical structures that distinguish between noun phrases and verb phrases. They use these structures to express the distinction between objects, and the actions carried out by or on these objects.

We believe that the content-process distinction in mathematics is best described by the words object and action. What are the mathematical objects we wish to deal with? What are the mathematical actions that we carry out with these objects? We will try to answer these questions in a way that makes clear the continuity of the subject from the earliest grades through post-secondary mathematics. Seen in the proper light there are really very few kinds of mathematical objects and actions.

The Objects of Mathematics

The first set of mathematical objects we need to consider are number and quantity. Indeed, elementary mathematics is largely about these objects and the actions we carry out with and on them.

Number and Quantity

integers (positive and negative whole numbers and zero)

rationals (fractions, decimals and all the integers)

measures (length, area, volume, time, weight)

reals (π, e, etc. and all the rationals)

complex numbers

vectors and matrices

Along with number and quantity we introduce very early a concern for another kind of mathematical object, namely shape and space.

Shape and Space

topological spaces (concepts of connectedness and enclosure)

metric spaces (with such shapes as lines/segments, polygons, circles, conic sections, etc.)

From the beginning we try to make students aware of pattern in the worlds of number and shape. Pattern as a mathematical object matures into function which is the central mathematical object of the subjects we call algebra and calculus.

Pattern and Function

functions on real numbers (linear, quadratic, power, rational, periodic, transcendental)

functions on shapes

There are several other kinds of mathematical objects that have less prominent roles in the mathematics we expect our youngsters to study. These include Chance and Data, and Arrangement:

Chance and Data

relative frequency and probability

discrete and continuous data

Some aspects of data collection, organization and presentation can be done in the earliest grades but little, if any, data analysis. Notions of probability are not realistically addressable until late middle school.

Arrangement

permutations, combinations, graphs, networks, trees, counting schemes

At the youngest grades, these topics tend to blend with the study of patterns of numbers and shapes.

The following table describes the kinds of mathematical objects in more detail, along with their properties, operations that can be performed on them, and their pragmatic uses.

types of objects	properties of objects	operations on objects	semantics of pragmatic use
Number and Quantity: integers; rationals; reals; measures (length, area, volume, time, weight)	order; between-ness; part-whole relationships; units; dimensions	arithmetic operations (addition, subtraction, multiplication, division, exponentiation)	counting or measuring anything in the world around us
Shape and Space: topological; metric (lines/segments, polygons, circles, conic sections, other, e.g. spherical geometry)	connectedness; enclosure; distance; location; symmetry; similarity	scaling; projection; translation; rotation; reflection; inversion; conformal mapping; homotopies/deformations	covering, packing and tessellating; designing and building objects; mapping and traveling
Pattern and Function: linear; quadratic; power; rational; periodic; transcendental	domain/range; continuity; boundedness; rate of change, curvature, etc.; maxima and minima; rate of accumulation; linear: root, slope/intercept; quadratic: roots, axis of symmetry; power: roots, asymptotic behavior; rational: roots, singularities, asymptotic behavior; periodic: frequency, phase; transcendental: “growth” constant	arithmetic operations (functions on R_n); comparison; equations; inequalities; identities; composition; translation; reflection; dilation/contraction	expressing how something depends on one or more other things; resolving constraints (solving equations and inequalities)
Chance and Data: discrete; continuous	determinism; randomness; relative frequency; distribution; moments	sampling (by counts and/or measures); composing; representing	dealing with uncertainty; dealing with lack of precision
Arrangement: permutations; combinations; graphs/networks; trees	adjacency	enumeration; vertices and edges of graphs/networks	organizing discrete information

The Actions of Mathematics

As previously mentioned, the process dimension of mathematics has many actions that are mathematics specific. It also involves actions that are properly regarded as general problem-formulating, problem-solving and reasoning skills. We divide these skills into four categories.

Modeling/Formulating

Transforming/Manipulating

Inferring/Drawing Conclusions

Communicating

With the exception of communication, each of these actions has aspects that are specific to mathematics and aspects that are not specific to mathematics but that are quite general in nature. We list below some of these aspects.

Modeling/Formulating

domain-general

observation and evidence gathering

necessary and/but not sufficient conditions

analogy and contrast

deciding, with awareness, what is important and what can be ignored

domain-specific

deciding, with awareness, what can be mathematized and then doing so

formally expressing dependencies, relationships and constraints

Transforming/Manipulating

domain-general

understanding “the rules of the game”

understanding the nature of equivalence and identity

domain-specific

arithmetic computation

symbolic manipulation in algebra and calculus

formal proofs in geometry

Inferring/Drawing Conclusions

domain-general

shifting point of view

testing conjectures

domain-specific

exploitation of limiting cases

exploitation of symmetry and invariance

exploitation of “between-ness”

Communicating

making a clear argument orally and in writing (using both prose and images)

It is evident that there is no reasonable way to separate, nor should there be any interest in separating, the domain-specific and the domain-general aspects of the process dimension of mathematics. We therefore come to the conclusion that it is better to parse the domain of mathematics as

object (Number and Quantity, Shape and Space, Pattern and Function, Chance and Data, Arrangement)

action (including both domain-specific and domain-general actions)

rather than by

content (usually defined by “topics” — an undifferentiated mixture of objects and domain-specific actions)

and

process (i.e. domain-general actions)

which is the usual procedure in mathematics education.

What Can/Should be Expected of Students at the Elementary Level

We view elementary mathematics (at about the 4th grade level) as being concerned with the following mathematical objects:

number and quantity

shape and space

pattern

data

Number and Quantity

We expect youngsters to be able to demonstrate

· a robust understanding of the conceptual meaning of addition and subtraction of whole numbers and integers

Sarah has 3 apples and Joe gave her 2 more. How many apples does Sarah have now?
Sarah has 3 apples and Joe has 2 more apples than Sarah. How many apples do they have altogether?

· a growing understanding of the various meanings of both multiplication and division of whole numbers and integers

At a party 20 bags of candy were given out. Each bag contained 5 candies. How many candies were given out altogether?
Thelma has 5 skirts and 3 blouses. How many different outfits can Thelma put together?

· a reasonable degree of computational facility with the four arithmetic operations on whole numbers and integers

· an ability to make reasonable approximations for the results of arithmetic computations (this expectation is not currently realizable in most US fourth grade classrooms)

To the nearest hundred, what is 38 times 42?
To the nearest hundred, what is the sum of 716 and 879?

· a growing understanding of the order properties of decimals and other rational fractions (this expectation is not currently realizable in most US fourth grade classrooms)

Write a fraction that is larger than 1/3 and smaller than 1/2.
Write a decimal that is larger than 0.083 and smaller than 0.15.

· an ability to identify and measure continuous quantity such as length, area, weight and time

· an ability to make reasonable estimates of lengths, areas, weights and time in one’s environment

How much does a gallon of milk weigh?
How much time does it take you to say your name?
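The arithmetic behind several of the sample tasks above can be checked in a few lines. What follows is a minimal sketch in Python and is not part of the original report: the helper name `round_to_hundred` and the candidate answers 2/5 and 0.1 are illustrative assumptions, many other answers are equally valid, and the second estimation task is read here as a sum.

```python
from fractions import Fraction


def round_to_hundred(n: int) -> int:
    """Round a non-negative integer to the nearest hundred (halves round up)."""
    return (n + 50) // 100 * 100


# Multiplication as equal groups: 20 bags with 5 candies in each bag.
candies = 20 * 5

# Multiplication as a count of pairings: 5 skirts combined with 3 blouses.
outfits = len([(skirt, blouse) for skirt in range(5) for blouse in range(3)])

# Estimation tasks: compute exactly, then round to the nearest hundred.
product_estimate = round_to_hundred(38 * 42)  # 38 * 42 = 1596
sum_estimate = round_to_hundred(716 + 879)    # 716 + 879 = 1595 (assumed to be a sum)

# Order properties: one possible answer to each "between" task.
fraction_ok = Fraction(1, 3) < Fraction(2, 5) < Fraction(1, 2)
decimal_ok = Fraction("0.083") < Fraction("0.1") < Fraction("0.15")

print(candies, outfits, product_estimate, sum_estimate)
print(fraction_ok, decimal_ok)
```

The two multiplication tasks give different readings of the same operation, which is the point of the expectation they illustrate: one counts equal groups, the other counts pairings.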

Shape and Space

We expect youngsters to be able to demonstrate

· an ability to distinguish and name a variety of two- and three-dimensional shapes

Draw three different kinds of closed figures that have four straight lines.

· an understanding of the symmetries of these shapes

Find all the lines along which you can fold a paper hexagon so that the two parts lie exactly on top of each other.

· an ability to read and interpret simple maps

Which two rooms in your school are furthest apart? Figure out three different routes that go from one to the other and tell how you would decide which is the shortest route.

Pattern

We expect youngsters to be able to demonstrate

· an ability to recognize and generate numerical patterns

What numerical pattern could continue the sequence 1, 4, 7, 10, 13, ....?
What numerical pattern could continue the sequence 1, 2, 4, 8, ....?

· an ability to recognize and generate spatial patterns

Can you tile a floor with tiles like this so that the pattern is “regular”?

· an ability to enumerate and organize simple combinations and permutations

How many different ways can you seat four people at a square table so that there is one person on each side of the table?
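The pattern and seating tasks above are deliberately open: each asks what "could" continue a sequence, and the seating question does not say whether turning the table produces a new arrangement. The following Python sketch is our illustration, not part of the original report. The function names are hypothetical, and the cubic rule is just one assumed alternative that happens to agree with doubling on the first four terms.

```python
from itertools import permutations
from math import comb


def add_three(count):
    """1, 4, 7, 10, 13, ... continued by repeatedly adding 3."""
    return [1 + 3 * i for i in range(count)]


def doubling(count):
    """1, 2, 4, 8, ... continued by repeated doubling."""
    return [2 ** i for i in range(count)]


def cubic_rule(count):
    """A different rule that also begins 1, 2, 4, 8 but then departs from doubling."""
    return [sum(comb(i, k) for k in range(4)) for i in range(count)]


def canonical(seating):
    """Pick one representative among all rotations of a seating."""
    return min(seating[i:] + seating[:i] for i in range(len(seating)))


people = ("A", "B", "C", "D")  # hypothetical names
seatings = list(permutations(people))
seatings_up_to_rotation = {canonical(s) for s in seatings}

print(add_three(7))
print(doubling(6), cubic_rule(6))
print(len(seatings), len(seatings_up_to_rotation))
```

Both counts for the seating task are defensible. Which one a student reports matters less than whether the student can say what was counted as "different".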

Data

We expect youngsters to be able to demonstrate

· an ability to collect, organize and display simple data sets

Make a presentation of all the kinds of pets owned by the students in your class. Include their weights, ages, and length from nose to tip of tail (where appropriate).

In Appendix A, the interested reader can see how these expectations of elementary school mathematical competence relate to the earlier discussion of the structure of the subject of mathematics as a whole.

Before leaving the issue of mathematical expectations of elementary students, we wish to comment on traditional mathematics instruction, new technology, and their interaction.

The traditional mathematics curriculum at the elementary levels concentrates on the acquisition of computational skills, specifically getting students to master with some degree of automaticity the algorithms for adding, subtracting, multiplying and dividing whole numbers, fractions, and decimals. We believe it is time to think carefully about that enterprise.

We live in an age when a simple four-function calculator can be bought for less than the cost of a weekly newsmagazine. With the exception of the elementary grades of the schools of our country, almost all the calculation done in the country is done electronically. Thus the schools, in preparing students to calculate “by hand,” are not preparing our students for the world they will encounter.

The counter-argument is often made that students need to understand the conceptual underpinnings of the computations that are done in the world around them. Indeed they do! We claim that such conceptual understanding does not flow from mindless repetition of un-understood mathematical ceremonies, but rather from a direct addressing of the conceptual issues involved in computation with whole numbers, fractions and decimals. Thus, at the youngest levels, the reader will find that we have stressed the importance of the order properties of numbers and estimation much more than is normally done in the traditional curriculum. At more advanced levels there are other interesting and subtle conceptual issues about numbers, the differences between the way in which they have been traditionally treated, and the way in which they are treated electronically.

Repetitive computational exercises are often performed without understanding. For example, how many educated adults understand why the procedures for long division or for multiplication and division of fractions work? Filling school and homework time with tiresome computational drill

· does not prepare students for the kinds of applications of mathematics that they are likely to encounter

· deadens the students’ interest and curiosity about mathematics

· uses up time that might better be spent in helping students develop a conceptual understanding of, and appreciation for, the subject of mathematics

Accordingly, we would be well advised to reconsider what we think is important mathematics in the elementary grades.

What Can/Should be Expected of Students at the Secondary Level

By the end of secondary school we ought to have a much more reasonable set of expectations of the mathematical capabilities of our students than we now do. In particular, there is a set of expectations we ought to have of students going directly into the world of work as well as of students going on to further education in subjects that are not mathematically demanding. We expect any school-leaving young adult, no matter what their formal mathematical training at secondary level, to be able to meet these expectations.

In our analyses of this question we have relied heavily on the work of some of our secondary school teacher colleagues who regularly bring “blue-collar” people from their community into their algebra classes to talk with students about the mathematics they use in their work.

We expect all students to be able to generate and enumerate simple permutations and combinations.

advanced

We expect some students to be comfortable with iterative and recursive algorithms, discrete modeling, and optimization.

2. What are the Purposes of Assessment?

As a society and as educators, we assess both performance and competence in education in a variety of ways and for a variety of purposes. Broadly speaking the purposes are

serving instruction

accountability

selection

licensure

Assessing student performance in order to inform instruction is something that all teachers do. It is often the case that an external agency of some sort gets involved in assessment, nominally to serve instruction. The time lapse between the administration of the tests and the reporting of “scores” to teachers who might be able to use the information is such that there is little reason to assume that any such testing by an external agency has much to contribute to assessment for instruction.

Assessing for the purpose of saying how well a student, or a class, or a school, or an instructional program is doing is the primary purpose of assessment for accountability. Traditionally such information has been presented in one of two quite different forms, norm-referenced and criterion-referenced. Norm-referenced accountability statements involve comparing students’ performance (or classes or schools) to one another and then presenting the results of those comparisons in rank order. It should be noted that this can only be done if the performance of the students can be encoded in a unidimensional measure.

Criterion-referenced accountability statements involve comparing students’ performance (or classes or schools) to some predetermined set of performance criteria without regard to how they compare to one another. It should be noted that this can only be done if one has a clearly defined set of performance criteria that reflect one’s theory of competence in the domain being assessed.
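The difference between the two forms of accountability statement can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not a description of any scoring procedure used by the project: the student names, scores, criterion names and required levels are all hypothetical.

```python
# Hypothetical data for illustration only; none of it comes from the project.
single_scores = {"Ana": 72, "Ben": 88, "Cai": 65, "Dee": 88}


def norm_referenced(scores):
    """Rank students against one another; equal scores share a rank."""
    ordered = sorted(set(scores.values()), reverse=True)
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


# Hypothetical criteria: the minimum level required on each dimension.
criteria = {"modeling": 3, "manipulating": 3, "communicating": 2}
profiles = {
    "Ana": {"modeling": 4, "manipulating": 2, "communicating": 3},
    "Ben": {"modeling": 3, "manipulating": 4, "communicating": 2},
}


def criterion_referenced(profile, required):
    """Report, criterion by criterion, whether the required level was met."""
    return {name: profile[name] >= level for name, level in required.items()}


ranks = norm_referenced(single_scores)
met = {student: criterion_referenced(p, criteria) for student, p in profiles.items()}

print(ranks)
print(met)
```

The ranking function needs one number per student in order to sort, which is the unidimensional encoding the text refers to. The criterion check never looks at another student's result, but it cannot run at all until the criteria have been written down.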

Assessing for selection is normally done for the purpose of helping to ascertain whether a student will have access to limited resources. Such assessment is often employed in order to inform decisions about access to select universities, programs for gifted music students, special education programs, etc.

Assessing for the purposes of licensure is normally done in order to ascertain whether the people being assessed have exceeded some threshold of minimal competence and are thus permitted to practice in an unsupervised fashion the skill that they have demonstrated. Such skills include driving automobiles, swimming in the deep part of the pool, barbering, butchering, working as an electrician or a plumber, etc.

Although it has never clearly articulated its stance with respect to these purposes, the Balanced Assessment project has focused its attention primarily on assessment to serve instruction and assessment for accountability, largely through the mechanism of assessing the performance of students on collections of tasks that the BA sites devised or adapted.

3. How should Assessment in Mathematics be Done?

In 1992 the National Council of Teachers of Mathematics undertook the development of a report on assessment to complement its earlier Curriculum and Evaluation Standards for School Mathematics. At the end of that report is a table summarizing the shifts in assessment practice that the NCTM is calling for. We cite that table here.

Major Shifts in Assessment Practice

toward	away from
assessing students’ full mathematical power	assessing only students’ knowledge of specific facts and isolated skills
comparing students’ performance with established criteria	comparing students’ performance with that of other students
giving support to teachers and credence to their informed judgment	designing “teacher-proof” assessment systems
making the assessment process public, participatory and dynamic	making the assessment process secret, exclusive and fixed
providing students multiple opportunities to demonstrate their full mathematical power	restricting students to a single way for demonstrating mathematical knowledge
developing a shared vision of what to assess and how to do it	developing assessment by oneself
using assessment results to ensure that all students have the opportunity to achieve their potential	using assessment to filter and select students out of the opportunities to learn mathematics
aligning assessment with curriculum and instruction	treating assessment as independent of curriculum or instruction
basing inferences on multiple sources of evidence	basing inferences on restricted or single sources of evidence
viewing students as active participants in the assessment process	viewing students as the objects of assessment
regarding assessment as continual and recursive	regarding assessment as sporadic and conclusive
holding all concerned with mathematics learning accountable for assessment results	holding only a few accountable for assessment results

This summary of the past and desired future of assessment in mathematics is as clear a set of guidelines as one could ask for in designing mathematics assessment. However, there is little, if anything, in this summary that could not have been written, with appropriate changes of adjective, by a task force of the National Council of Teachers of English. The central problem of changing the nature of assessment in mathematics must be faced in the design of actual mathematics assessments that reflect these guidelines. Roughly speaking, that is what the Balanced Assessment in Mathematics Project is about.

4. What is Balanced Assessment in Mathematics About?

Balanced Assessment in Mathematics (BA) is a National Science Foundation project charged with developing new approaches to the assessment of mathematical competence in the elementary and secondary grades. The project is being carried out at four sites: the University of California at Berkeley, Michigan State University, the Shell Mathematics Centre of the University of Nottingham and the Educational Technology Center of the Harvard University Graduate School of Education. Support for the Berkeley and Nottingham sites of the project began in July of 1992; support for the Michigan State and Harvard sites began in the fall of 1993.

By the end of 1995, BA will have completed the piloting of packages of assessment at each of three levels, elementary, middle school, and high school. The packages have assessment tasks and suggestions for longer projects to be done throughout the school year. Teachers and students use a package to create a set of selected works for each student which can be scored and used to document that student’s mathematical achievement, as well as to provide a balanced picture of that student as a learner of mathematics.

The types of assessment that BA is creating contrast sharply with traditional forms of testing, which rely primarily on multiple-choice questions. On standardized tests students are expected to answer each item in a minute or two. Such tests make no claim to assess a student’s problem-solving abilities, nor do these tests provide information about how a student reasons, communicates mathematically or makes connections across mathematical content.

BA’s focus is on rich, mathematically complex work that requires students to create a plan, make a decision or solve a problem — and then justify their thinking. The contents of each assessment package range from short tasks to extended investigations and projects that involve a week or more of work and include evidence of student collaboration, reflection and growth.

Further, the project believes that assessment that is worthwhile to teachers, students, and others with a valid interest in what students can do mathematically, must also have the following characteristics:

· Assessment focuses on important, grade-level appropriate mathematics. Since assessment can only sample from all that is learned, it must sample as effectively as possible — by concentrating on the most important and useful mathematics taught and learned at that grade level, as defined by the NCTM Standards.

· Assessments are worthwhile learning activities — not digressions from learning. For the student, assessment is a tool that helps further the understanding of important mathematical ideas. For the teacher, assessment is student work that informs and augments instruction. Worthwhile assessment is not something students and teachers “stop and do,” but a way to further what they are already doing.

· The assessment maintains a focus on accessibility and equity for all students. The student must have — and the teacher and student must perceive that the student has — a fair opportunity to do his or her best. Assessments are designed to provide a student of either gender and of any cultural, linguistic and socio-economic background with the means to do his or her strongest mathematical work.

· Assessment elicits scorable, informative student work. The assessments are designed to elicit more than just an answer from the student. Rather, students are asked to solve a problem, show their thinking, create a product. The information in the student’s response, and the features of the student’s work that are evaluated, give a picture of his or her understanding of mathematical concepts, strategies, tools and procedures.

Clearly, the stated intent of the Balanced Assessment in Mathematics Project is entirely consonant with the directions in which the NCTM would like to see assessment in mathematics move. More to the point, the products of its efforts are also consonant with these directions and, in our view, represent a major step forward.

5. What is the Harvard Group of Balanced Assessment About?

Why this Report?

While there are no differences between the members of the Harvard Group and the BA project as a whole with respect to overall strategic goals, there are several areas in which the work of the Harvard Group of BA differs from that of the other project sites. These include

task design and new task types

“weighting” of tasks

writing rubrics for tasks

scoring students’ performance

balancing assessment packets

The Harvard Group went about the process of task design in a way that differed from the other groups and was largely informed by the object × action analysis of the domain that it had made at the outset. This led to a particular kind of analysis of task demands, and a particular strategy for writing scoring rubrics for tasks. It also led to an explicit procedure for the balancing of tasks.

In addition to these procedural differences, there are two philosophical differences:

1. We believe that human performance in any cognitive domain of interest, including mathematics, is too complex to be reduced to unidimensional measures. Scoring performance of students should reflect this complexity. In keeping with this position, we do not accept the idea of giving a student a single score on a task.

In addition to the trivialization of complexity that accompanies unidimensional measures, the use of such measures opens the door to a great deal of social mischief by making it easier to compare students to one another rather than to established criteria as called for by the NCTM and others.

2. Also in keeping with the NCTM Assessment Standards, we believe in making assessment public rather than secret. Well-crafted tasks aimed toward assessing well-defined skills and understandings need not be kept secret, either before or after they are administered. Indeed, if one wishes, as the draft NCTM Assessment Standards call for, to have assessment “...aligned with curriculum and instruction” and to be understood as “...continual and recursive” and for the community of mathematics educators and the public to have a “...shared vision of what to assess and how to do it,” then keeping the tasks secret is counterproductive.

We turn now to a detailed description of some of the ways the Harvard Group of the Balanced Assessment Project has gone about doing its work for the past two years.

Task Design

For most of the period of the grant the primary responsibility of each of the sites of the BA project was to design assessment tasks, to try them in both classroom and clinical settings, and to revise them in the light of student reaction to those trials. The BA project undertook to design three quite different sorts of tasks. They are

skills: tasks that primarily test the ability to manipulate and compute

problems: tasks that primarily test the ability to model, infer, and generalize

projects: tasks that test the ability to analyze, organize, and manage complexity

The Harvard Group of BA approached the problem of Task Design within the framework of the Mathematical Content Matrix presented earlier in this document. First we decided what mathematical objects and actions we expect students to have mastered; this gave us a reasonably focused view of the mathematical playing fields within which we needed to design tasks.

If one needs to generate a large number of assessment tasks, it is clear that thinking in terms of task types, rather than in terms of individual tasks, is a useful strategy. Wherever possible we strove to make clusters of tasks that were linked by context, or mathematical structure, or both. By way of illustrating this strategy as well as demonstrating what we mean by each of the three categories of task, here is an example of each.

Skills task

Here is a diagram of a new kind of race track. What is the total length of the track?

The combined length of all the curved sections is 100 meters.

The length of each straight section is 100 meters.

Problem

Two joggers set out at the same time and from the same place and in the same direction to jog on a circular track. Jogger A jogs at a constant speed which is exactly twice the speed of jogger B. They jog for the same period of time and stop after A has completed 6 laps around the track. (You may ignore the time it takes for the joggers to get up to speed at the outset and to slow down at the end.)

An observer at the geometric center of the track monitors the angle between the two joggers as a function of time. Sketch a graph of this observer’s data.

How would the graph of the observer’s data differ if the two runners had started off in opposite directions at the outset?
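
One way to check a sketched graph for this problem is to tabulate the angle numerically. The following is a minimal sketch and not part of the task itself; the function name `angle_between` and the choice to measure progress in laps completed by jogger B are assumptions made for illustration. It takes the angle seen from the center to be the smaller of the two arcs between the joggers, so it always lies between 0 and 180 degrees.

```python
def angle_between(b_laps, opposite=False):
    """Angle in degrees (0 to 180) between the joggers, seen from the center.

    b_laps   -- laps completed so far by jogger B (0 to 3, since A runs
                6 laps at twice B's speed)
    opposite -- True if the joggers set off in opposite directions
    """
    a_laps = 2 * b_laps
    # Same direction: the gap grows at the difference of the speeds.
    # Opposite directions: the gap grows at the sum of the speeds.
    gap_laps = a_laps + b_laps if opposite else a_laps - b_laps
    gap_degrees = (360 * gap_laps) % 360
    # The observer sees the smaller of the two arcs between the joggers.
    return min(gap_degrees, 360 - gap_degrees)


if __name__ == "__main__":
    for quarter in range(0, 13):      # B's progress in quarter laps
        b = quarter / 4
        print(b, angle_between(b), angle_between(b, opposite=True))
```

Under these assumptions the same-direction graph rises to 180° and falls back to 0° three times over the run, while the opposite-direction graph does so nine times, because the gap then grows at three times B's speed rather than at B's speed.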

Project

Track of Dreams

The Situation

In the last twenty years, science has improved the conditions of competition in many sports. Some things, however, have not changed in ages. The dimensions of playing fields and courts in team sports, tennis and some other court sports have remained the same, largely to preserve the tradition of the sport. However, in track and field these unchanged dimensions are primarily due to the fact that the track and other venues are often associated with football or soccer fields and, therefore, must rely on their dimensions. If an average football field is about 140 by 65 yards, it allows for a track oval around it with the length of the inside lane of about 400 meters. A soccer field, which is shorter but wider, produces roughly the same length of the track.

For international competition, 400 meters is the standard length of the inside lane on the track. In addition, the following conditions are required for international competition:

· The width of each lane must accommodate the runners in such a way that two runners running side by side in adjacent lanes would not interfere with each other.

· The track must have 8 lanes.

· The 100 meter race must be run along a straight path.

· The finish line must be at the end of a straight part of the track, continuous across all lanes and perpendicular to the track.

· The starting line must also be perpendicular to the track, but need not be in the same place for all lanes (this is a “staggered” start necessitated by the different lengths of the lanes).

There are also restrictions about the type and quality of the surface and some other conditions necessary for accreditation, but these are not important here.

The Problem

You have been hired by the Santa Monica Track Club (the most prestigious in the country) to design a new track. Your job is to analyze a number of possible shapes, present the club with three alternative shapes for the new track, and present arguments, pro and con, for each of these. The track is to be built solely for track-and-field purposes, so there will be no external constraints with respect to the dimensions or directions of the track.

Some of the international rules are a matter of tradition and could be relaxed for a new scientifically designed track. However, all the conditions listed above (except for the length of the inside lane) must be met. Furthermore, there are other physical and esthetic constraints that further limit the design possibilities:

· The track must accommodate races for 100, 200, 400, 800, 1500, 3000, 5000 and 10000 meters. Some other races, such as 1 mile or 50 meters, may be run there as well, but are not a priority, so they should not figure in the computations.

· The length of the shortest lane of the new track must be some multiple of 100 meters in length.

· For construction reasons, all parts of the track must be designed as parts of circles or straight lines; at the transition points from one part to another, these circles and lines must be tangent to each other.

· The lanes cannot separate at any point, that is, crossing the track at any point in the direction perpendicular to it must cut across all eight lanes in succession with no space between them.

· The faster you run the harder it is to turn, so races from 50 to 400 meters must be run in such a way that no one makes sharp turns; for longer races the turns could be made tighter; overall, no part of the course for a specific race can contain a part of a circle with a diameter (measured in meters) smaller than where 9,000 is a constant measured in square meters which was derived by observation. This constraint is designed to prevent runners from falling on turns or one runner having an advantage over another.

· An attempt must be made to minimize the overall dimensions of the track, for two reasons: Costs should be considered, although a superior track might be chosen despite its higher price. Spectators should be seated so as to allow them to see as much of the race as possible; for this reason a straight track would be out of the question.

1. Several designs have already been submitted by different parties. As the official design consultant you must sift through these and explain why some of them must be rejected. Even though you reject some or all of these, they may give you ideas about possible designs. Write a letter to your assistant, explaining which of these proposals are rejected and why, and point out some possible modifications which could make similar designs acceptable. (In all instances the distances are approximate and measured along the track between the marked points.)

The combined length of all the curved pieces is 100 meters.

2. Suppose now that you are the assistant who received the above letter. You must respond to the letter with proposed modifications of these designs. Note that in proposal E the combined length of all the curved parts could not only be 100 meters, but also 200, 300 meters or some other multiple of 100 meters. In your response, you will need to analyze several possible variations, including changing the length in the case of design E. Write such a letter with detailed mathematical analysis.

3. Most of the tracks under consideration have the unfortunate property that they require a staggered start. This happens because the lanes are not all of equal length and the athletes on the inside lanes are required to make a sharper turn than the athletes on the outside. Can you generate some possible designs that would satisfy all the conditions and would not require a staggered start for at least some of the races?

4. Having returned to your supervisory capacity, now is the time to find other possible shapes and write the final report on the proposal. If you believe that some of the conditions should be relaxed in favor of a specific design, you will need to convince a committee composed of athletes, administrators, architects and mathematicians. Therefore, your arguments must be clear, succinct and precise.

This third category of task, projects, deserves special comment because of the growing interest in the use of portfolio assessment. We found that many users of portfolio assessment took the position that the essential feature of such assessment was the fact that the student chose, with or without guidance from the teacher, what pieces of work to include in his or her portfolio. While this in itself is desirable, it can lead to portfolio content that is minimally demanding and hardly able to exhibit the student’s strengths and weaknesses. The situation may be compared to that of a student who applies for admission to a music conservatory and submits a tape of playing scales.

Harvard’s Balanced Assessment team thinks of projects as intellectual undertakings that require students to make an effort over an extended period of time to structure and formulate a problem, and then to analyze the problem as they have formulated it. Projects are not problems with unique, correct solutions. Projects are not long and complicated versions of problems that one normally assigns in the context of a classroom assignment. Projects are not problems that require tricky insights or inventions to solve. The essence of a Balanced Assessment project is that it is a task that requires a student to ruminate and reflect about a rich web of complexity, and to sort out some main threads that can serve as the basis for structuring a response.

In the course of addressing the problem that forms the core of the project as they have formulated it, students will have to perform a wide range of traditionally taught mathematical actions that might include manipulating algebraic symbols, plotting graphs, making geometric constructions, compiling tables, and performing numerical computations. The accurate performance of these actions, as important as they are, is only a part of doing a Balanced Assessment project. Students are also asked to make inferences, draw conclusions, and present their work in both written and oral form.

In this section we describe some considerations that teachers should keep in mind as their students work on projects and as they, the teachers, assess the products of their students’ efforts.

How shall I organize the class for project work?

Mathematics has traditionally been a subject in which we have insisted that students work alone. Whatever the merits of that viewpoint might be with respect to covering the content of the syllabus, it seems to us that project work is different. The essential issue in project work is development of desirable “habits of mind” about organizing and analyzing complexity. Adults, when confronted with tasks of this sort, often address them in groups. The reason for doing so is the intellectual resonance and symbiosis that leads groups of people to fashion far better solutions to complex problems when they work cooperatively than when they work in isolation. We suggest, therefore, that students undertake project work in small groups. You may want to give some thought to how groups ought to be composed and whether or not to juggle the composition of the small groups at various times during the school year.

How much time shall I allocate to project work during the year?

This will vary with individual teachers. Some teachers will try to build a whole year’s work around projects. We find that it is often difficult to do this — there is always the nagging feeling that the curriculum is not being “covered.” On the other hand we feel that the kind of intellectual development that coping with a project offers is of sufficient importance that students ought to spend no less than 10% of their time, and probably as much as 25% of their time on such work.

One could imagine students spending one quarter of their time every week throughout the school year on their continuing project work. Alternatively, one can imagine intensive two week project periods distributed throughout the year. During these periods students would use all of their mathematics time for project work. Other time arrangements are also possible. Ultimately, it will be individual teachers who make this decision in light of their understanding of what best fits the needs of their classes.

How should project work be presented?

We think that project work should be presented both in writing and orally. The written presentation should describe the contextual setting and how, within that setting, the problem is defined. It should present explicit arguments for why the project omits consideration of some factors and includes others. It should show clearly how the solution was approached. It should indicate clearly where there are further issues to investigate.

The oral presentation should be made publicly to the entire class after the teacher and at least some of the students have read the written presentation. Following a brief outlining of the written document, the presenting students should entertain questions and comments.

Here is one possible specific way you might organize the presenting of project work. Have each group submit a written report of its work. In addition, ask each group to read the written reports of two other groups. On the day of the oral presentations, have each group present its work — and then serve as a panel to answer questions put to them by the other groups that have read their work, as well as by the teacher and other students.

How do I grade project work?

All Balanced Assessment projects ask the student to prepare a document with an intended purpose for an intended audience. Consequently, scoring of projects comes down to an analysis of two questions:

Is the document suitable for the specified audience?

Does the document fulfill the requested purpose?

To aid in evaluating student project work in the light of these two criteria, we suggest the following seven perspectives; each perspective may allow you to reach conclusions about some aspect of the student’s effort.

Organizing the Subject: How well does the student structure a large collection of interrelated issues and identify possible problematic areas? Are the constraints described in the problem made clear at the outset? How well does the student argue the relative importance of factors that are taken into consideration in addressing the problem, and the relative unimportance of factors that are ignored? Does the report proceed in a logical manner?

Analyzing the Problem: How well does the student define the problem? Is the student clear about all the resources, both tools and information, that will be needed to address the problem? Does the student draw appropriate implications about the given data?

Accuracy and Appropriateness of Computation/Manipulation: Are the symbolic manipulations carried out accurately? Are the graphs plotted correctly? Are the axes labeled sensibly? Is the scale reasonable? Are the geometric constructions “constructable”? Are the table columns properly labeled? Are the computations that underlie computed columns clearly defined? Are the numerical computations done accurately? Are the graphs, charts, tables used appropriately? Are they pertinent to and do they strengthen the argument?

Thoroughness of Inquiry: Is the work perfunctory or thorough? Is the reader left to fill in many missing steps? Are all the implications of the complexity of the problem followed up and examined? Is the reporting level of detail adequate?

Clarity of Communication: Can the student’s work be read by a colleague who is previously unacquainted with the work? Is the student’s written presentation intelligible to another teacher? to the principal? to a group of parents?

Drawing Conclusions: Do students draw reasonable conclusions from their work? Do they clearly explore the ways in which their work answers the questions they have posed?

Extending the Inquiry: Has the student identified interesting aspects of the problem that lend themselves to further exploration? Has the student noted related problems that might be approached in similar ways?

Finally, it should be said that we are mindful that teachers’ constraints and opportunities vary from place to place. Not everything we suggest will be desirable, or even possible, at every location. However, we are certain that challenging the students with a significant amount of project work will be rewarding and engaging to both teacher and student.

New Task Types

One of the strategies the Harvard Group of BA employed in order to design tasks that are fresh and engaging to students is to try to find new task types that students may not have encountered before. There are several such types that we found and/or developed. For the most part these are tasks that do not have unique correct answers. They are tasks that promote discussion, and occasionally debate, among students. We believe these to be important features of successful tasks.

The “-Ness” tasks

The purpose of these tasks is to see how well students can mathematize a relationship that they are aware of perceptually but probably have never attempted to describe in any formal, not to mention quantitative, fashion.

Each of these tasks requires students to identify and describe formally a geometric property of some two or three dimensional shape. It is important to stress that properties such as “squareness” or “bumpyness” are not formal geometric properties. There are no formally correct and universally accepted answers to these questions. On the other hand, there are sensible (and non-sensible) answers.

For the sake of specificity, the following is an example.

Below is a collection of rectangles.

1. Which of the rectangles is the “squarest”?

2. Arrange the rectangles in order of “square-ness” from most to least square.

3. Devise a measure of “square-ness,” expressed algebraically, that allows you to order any collection of rectangles in order of “squareness.”

4. Devise a second measure of “square-ness” and discuss the advantages and disadvantages of each of your measures.

The elements of performance on these tasks are as follows:

a. choosing the most and least square, sharp, etc. figure

b. a verbal description of the geometric property being modeled

c. identifying the geometric elements that combine to form the measure

d. forming an algebraic relationship among these elements

e. computing values of the measure for various figures

f. discussing the advantages and disadvantages of the measure

The weight that should be assigned to the successful completion of these elements of the task increases as one goes down the list. In our view successful completion involves satisfactory completion of at least parts a through d.

Problems of this sort have several important virtues. Even very weak students can get started on them. At the same time, they provide an opportunity for strong students to display a great deal of sophistication. In addition, they exercise several quite important mathematical muscles that are rarely called upon in school mathematics. We refer here to the need for defining quantitative constructs — a central feature of the application of mathematics to both the natural and social sciences.

Appendix B contains a discussion of some of the directions one might go with the “square-ness” problem.
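
To make the idea of a quantitative “square-ness” measure concrete, here is a minimal sketch of two candidate measures of the kind a student might propose. Both measures, the function names, and the sample rectangles are assumptions chosen for illustration; they are not taken from Appendix B, and other sensible measures exist.

```python
def squareness_ratio(width, height):
    """Ratio of the shorter side to the longer side; 1.0 for a square."""
    return min(width, height) / max(width, height)


def squareness_area(width, height):
    """Area divided by the area of the square with the same perimeter."""
    # Same perimeter means 4 * side = 2 * (width + height).
    side_of_square = (width + height) / 2
    return (width * height) / side_of_square ** 2


# Hypothetical (width, height) pairs, for illustration only.
rectangles = [(2, 8), (4, 4), (1, 10), (3, 5)]

by_ratio = sorted(rectangles, key=lambda r: squareness_ratio(*r), reverse=True)
by_area = sorted(rectangles, key=lambda r: squareness_area(*r), reverse=True)
print(by_ratio)
print(by_area)
```

Both measures equal 1 for a square and fall toward 0 for long, thin rectangles. They always rank rectangles in the same order, since the second is an increasing function of the first, but they spread the values differently. That difference is the kind of point a student might raise when comparing the advantages and disadvantages of two measures.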

Fermi tasks

Another type of problem that we have introduced is known in some circles as “Fermi problems.” These problems ask for the estimation of quantities such as number, length, area, volume, weight, and time. Many of these quantities are measures of things that you are likely to encounter in everyday life, although they may not be quite in the form that you normally think of them.

These problems are called “Fermi problems” after Enrico Fermi, one of the great physicists of the twentieth century. Fermi taught for many years at the University of Chicago, where he used to ask his beginning graduate students “...to estimate the number of piano tuners in Chicago.”

In order to make the kinds of estimates that respond to Fermi problems, one must often make use of information that is not contained in the statement of the problem. Sometimes the necessary information will be the sort of thing you might already know. Sometimes you may have to use reference materials in order to find the necessary pieces of information.

Let’s explore some ways of thinking about this sort of problem. Suppose we are asked “How many words are there in all the books in the school library?” How could we go about estimating this number?

If we knew

      the number of shelves of books in the library, and
      the average number of books on a shelf, and
      the average number of pages in a book, and
      the average number of words on a page

then we could estimate the number of words. Suppose a school library has

600 shelves

and that on the average there are

16 books on each shelf

Suppose further, that there are, on the average,

250 pages in each book, and
400 words on each page.

We can estimate the number of words by multiplying

600 shelves × 16 books per shelf × 250 pages per book × 400 words per page

The size of this product is 960,000,000 words. How good is this answer? How well does it apply to your school library?
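
The arithmetic of this estimate can be written out as a short sketch. The four figures are the ones quoted above; the dictionary layout and the variable names are assumptions made for illustration.

```python
from math import prod

# Figures quoted in the text for the school-library example.
factors = {
    "shelves": 600,
    "books per shelf": 16,
    "pages per book": 250,
    "words per page": 400,
}

# A Fermi estimate of this kind is simply the product of the factors.
words_in_library = prod(factors.values())
print(f"{words_in_library:,} words")   # prints "960,000,000 words"
```

Keeping the factors in one place makes it easy to substitute figures for a particular school library and see how the estimate changes.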

People are not accustomed to seeing problems in mathematics that do not have exactly one correct answer. How, then, are they to think about these problems? In particular, when is the answer to such a problem good enough? A rough guide is the following:

Devise two different strategies for arriving at an estimate of the desired quantity. If the larger of the two estimates is no more than 10 times the smaller estimate then there is a strong likelihood that the estimate(s) you have made is (are) reasonable.

For example, consider the illustrative problem that we worked on before, i.e. the number of words in all the books in the school library. Suppose we had approached the problem differently. Suppose we said that the library, on the average, buys 50 books a month during each of the 10 months of the school year. Suppose, further, that the library has been doing this for the 20 years that the school has been in existence.

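We can now compute the number of words in all the books in the library in a quite different way. The library has acquired 50 × 10 × 20 = 10,000 books; using the same averages of 250 pages per book and 400 words per page as before, we multiply

10,000 books × 250 pages per book × 400 words per page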

This computation produces an estimate of 1,000,000,000 words.
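
The rough guide stated above (two independent estimates should differ by no more than a factor of 10) can be applied to the two library estimates. The following is a minimal sketch: the function name `estimates_agree` is hypothetical, and the second estimate is assumed to reuse the earlier averages of 250 pages per book and 400 words per page, which is consistent with the stated result of 1,000,000,000 words.

```python
def estimates_agree(first, second, factor=10):
    """Rough guide from the text: the larger estimate is at most
    `factor` times the smaller one."""
    smaller, larger = sorted((first, second))
    return larger <= factor * smaller


by_shelves = 600 * 16 * 250 * 400        # shelves * books * pages * words
books_bought = 50 * 10 * 20              # books per month * months * years
by_purchases = books_bought * 250 * 400  # assumed: same per-book averages

print(by_shelves, by_purchases, estimates_agree(by_shelves, by_purchases))
```

Here the two estimates differ by only about 4 percent, well inside the factor-of-10 guide.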

Here are three quite different ways of arriving at an estimate of the total distance a person walks in a day. One estimate attempts to take account of the total waking time of a person during the day, assumes that they are moving about a certain fraction of that time and that when they move, they do so at an assumed average speed.

Another estimate relies on the information that comes from a shoemaker who says that a pair of sneakers wear out after some number of miles. If you know how often you replace your sneakers then you can calculate an average distance walked (or run) in a day.

Finally, a third method of estimating the total distance a person walks in a day relies on following the person about and simply adding up the estimated distances they walk, including to and from school, store, ball field, around the house and school, and so on.

To be sure these estimates may not all yield the same number. Although they differ from one another, they are all reasonable ways of approaching the problem. If you estimate the distance a person walks by each of these three methods and they turn out to give numbers that are not very different from one another you may assume that you have made a reasonable estimate.

Example generation

An effective way of probing a student’s understanding of the application of a concept in context is to ask the student to generate an example of the application of that concept. Typically, such questions do not have unique answers. For example, one might ask a student to give an example of a number that is an integer power of both 2 and 4, or an even function of x that is not constant but never exceeds 1 in absolute value.

This type of problem, i.e., of providing examples of a mathematical object that has a specified set of properties, is in our view underutilized in the assessment of mathematical competence. Here are some examples.

Number and Quantity

Give three examples of numbers that are evenly divisible by 2, 3, 4, and 5.
Write a number whose value is between 1/4 and 1/5.

Shape and Space

Draw a 4-sided figure with two pairs of sides of equal length that is not a parallelogram.
Draw a triangle whose circumscribing circle’s center lies outside the triangle.

Pattern and Function

For each of the following pairs of functions, write a function which is everywhere at least as large as the smaller and no larger than the larger of the two functions

Chance and Data

Design a red and blue painted dart board such that the chance of landing on red is three times the chance of landing on blue.

Arrangement

Devise two different techniques for alphabetizing a large list of names, for example all the students in your school. Contrast the efficiency of your techniques with one another.
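
Because example-generation tasks have many acceptable answers, a response is judged by checking that it has the required properties rather than by matching it against a key. The following is a minimal sketch of such checks for some of the Number and Quantity examples above; the function names are hypothetical, and “integer power” is taken here to mean a non-negative integer exponent.

```python
from fractions import Fraction


def divisible_by_all(n, divisors=(2, 3, 4, 5)):
    """True if n is evenly divisible by every divisor."""
    return all(n % d == 0 for d in divisors)


def strictly_between(x, low=Fraction(1, 5), high=Fraction(1, 4)):
    """True if x lies strictly between 1/5 and 1/4."""
    return low < Fraction(x) < high


def is_power_of(n, base):
    """True if n equals base**k for some integer k >= 0."""
    if n < 1:
        return False
    while n % base == 0:
        n //= base
    return n == 1


# Sample student answers (hypothetical).
print([divisible_by_all(n) for n in (60, 120, 30)])
print(strictly_between(Fraction(9, 40)))
print(is_power_of(16, 2) and is_power_of(16, 4))
```

Any multiple of 60 passes the first check, and any power of 4 passes the last, which shows how many different correct responses such a task admits.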

Weighting of Tasks

In order to approach the problem of designing balanced assessment packages in mathematics one must have a clear view of the kinds of understandings and skills that we wish to assess in our students and the ways in which the tasks we design elicit demonstrable evidence of those skills and understandings. In what follows we shall describe how our view of the subject of mathematics, its objects and its actions, informs the design of tasks and the balancing of assessment packages.

Each task is classified according to domain, i.e. the mathematical objects that are prominent in the accomplishment of the task. Most of our tasks deal predominantly with a single sort of mathematical object although some deal with two. Each task offers students an opportunity to demonstrate a variety of kinds of skill and understanding.

In order to score student performance on a task one has to first analyze the task and decide on the nature of the demands that the task makes on the student. We considered the following four kinds of skill and understanding.

Modeling/Formulating: How well does the student take the presenting statement and formulate the mathematical problem to be solved? Some tasks make minimal demands along these lines. For example, a problem that asks students to calculate the length of the hypotenuse of a right triangle given the lengths of the two legs does not make serious demands along these lines. On the other hand, the problem of how many 3 inch diameter tennis balls can fit in a (rectangular parallelepiped) box that is 3" × 4" × 10", while exercising the same Pythagorean muscles in the solution, is rather different in the demands it makes on students’ ability to formulate problems.

Transforming/Manipulating: How well does the student manipulate the mathematical formalism in which the problem is expressed? This may mean dividing one fraction by another, making a geometric construction, solving an equation or inequality, plotting graphs, or finding the derivative of a function. Most tasks will make some demands along these lines. Indeed most traditional mathematics assessment consists of problems whose demands are primarily of this sort.

Inferring/Drawing Conclusions: How well does the student apply the results of his or her manipulation of the formalism to the problem situation that spawned the problem? Traditional assessments often pose problems that make little demand of this sort. For example, students may well be asked to demonstrate that they can multiply the polynomials (x+1) and (x–1) but not be expected to notice (or understand) that the numbers one cell away from the main diagonal of a multiplication table always differ from perfect squares by exactly 1.

Communicating: How well do students communicate to others what they have done in formulating the problem, manipulating the formalism, and drawing conclusions about the implications of their results?
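
The multiplication-table observation mentioned under Inferring/Drawing Conclusions follows from the identity (x+1)(x-1) = x² - 1. The following is a minimal sketch that checks it numerically. It reads “one cell away from the main diagonal” as the diagonally adjacent cell in row n-1 and column n+1; that reading, and the function names, are assumptions.

```python
def table_entry(row, column):
    """Entry of the multiplication table at the given row and column."""
    return row * column


def gap_from_square(n):
    """Difference between n*n on the main diagonal and the entry
    at row n-1, column n+1."""
    return table_entry(n, n) - table_entry(n - 1, n + 1)


# Every gap is 1, as the identity predicts.
print([gap_from_square(n) for n in range(2, 13)])
```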

Since we do not expect each task to make the same kinds of demands on students in each of the four skills/understandings areas, we assign a single-digit measure of the prominence of that skill/understanding in the problem according to the following scale of weighting codes:

Weighting codes

0 not present at all

1 present in small measure

2 present in moderate measure, and affects solution

3 a prominent presence

4 a dominant presence

Note that these numbers are not measures of student performance but measures of the demands of the task for a given performance action.

Most tasks will involve these skills and understandings in some combination. Needless to say, different tasks will call differently on these actions. Therefore, it is necessary in designing tasks to pay particular attention to the nature of the demands on performance that the tasks make.

What does all of this mean for the fashioning of both tasks and balanced assessment packages?

For each task one decides on the basis of experience, taste and judgment, ideology and philosophy, how the task’s demands should be distributed among the content domains:

Content domain weighting

Number and Quantity
Shape and Space
Pattern and Function
Chance and Data
Arrangement

[Entries must sum to 1.]

and among the various sorts of performance actions:

Process weighting

Modeling/ Formulating	Transforming/ Manipulating	Inferring/Drawing Conclusions	Communicating

[Each entry is on a 0-4 scale.]

This is precisely how we went about the design of our assessment packages. Each of the tasks we designed was weighted in this fashion. Balanced packets of assessments at different grade levels were then assembled by combining tasks into collections whose aggregated weights reflect our thoughts about appropriate demands for that grade level.
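The two weighting tables can be thought of as a small record attached to each task. The following Python sketch is only an illustration of that idea: the class name, field names, and the example task with its weights are hypothetical and do not come from the project's materials. It checks the two rules stated above, namely that content domain entries sum to 1 and that each process entry is a weighting code from 0 to 4.

```python
from dataclasses import dataclass

DOMAINS = ("Number and Quantity", "Shape and Space", "Pattern and Function",
           "Chance and Data", "Arrangement")
ACTIONS = ("Modeling/Formulating", "Transforming/Manipulating",
           "Inferring/Drawing Conclusions", "Communicating")


@dataclass
class TaskWeighting:
    """Hypothetical record of one task's weights (names are illustrative)."""
    name: str
    domain_weights: dict  # domain name -> share of the task; must sum to 1
    action_weights: dict  # action name -> weighting code, 0 through 4

    def validate(self):
        unknown = set(self.domain_weights) - set(DOMAINS)
        if unknown:
            raise ValueError(f"unknown content domains: {sorted(unknown)}")
        if abs(sum(self.domain_weights.values()) - 1.0) > 1e-9:
            raise ValueError("content domain weights must sum to 1")
        for action in ACTIONS:
            if self.action_weights.get(action) not in (0, 1, 2, 3, 4):
                raise ValueError(f"{action}: weighting code must be 0-4")
        return self


# A made-up task, split evenly between two content domains.
example = TaskWeighting(
    name="example task",
    domain_weights={"Number and Quantity": 0.5, "Shape and Space": 0.5},
    action_weights={"Modeling/Formulating": 2, "Transforming/Manipulating": 3,
                    "Inferring/Drawing Conclusions": 1, "Communicating": 2},
).validate()
```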

At the youngest grades we placed great emphasis on Number and Quantity and Shape and Space objects. We tried to stress the Modeling/Formulating and Inferring/Drawing Conclusions actions much more than do traditional assessment programs that tend to concentrate on assessing students’ ability to manipulate numbers and shapes. In addition, we tried to devise fresh and interesting ways of having students communicate with one another as well as report to us on their efforts.

At the other end of the grade spectrum, the distribution of emphasis shifted. The Number and Quantity and Shape and Space objects are still present but the Chance and Data and Arrangement objects are now a significant portion of the assessment, and the Pattern and Function object assumes great importance. The demands on students are more subtle and nuanced. Apart from technical questions designed to ensure that transforming and manipulating skill exceeds some reasonable threshold, strong emphasis is placed on tasks that make extensive Modeling/Formulating and Inferring/Drawing Conclusions demands. Here too we have attempted to devise a wide variety of ways for students to communicate with one another and with us about their efforts.

The particular distribution of emphasis for each of the packages we designed is discussed in the introduction to that package.

Writing Rubrics for Tasks

After a task has been assigned weights that reflect the different demands it makes on a student, it is possible to write scoring rubrics. Here, for example, is a task appropriate to the secondary level, along with the weighting of the task and the scoring rubric written for it. The rubric is specific to the task and analyzed along the lines of the performance actions we have defined.

No one can write rubrics that exhaustively anticipate the richness and variety of student responses. Teachers who use the rubrics we write are urged to use them as a guide where they are helpful, and to use their own good judgment when they find our rubrics not shedding light on their students’ efforts.

Melons and Melon Juice

1. An empty fruit crate weighs 5 pounds. When filled with 10 melons it weighs 35 pounds. Plot the weight of the fruit crate as a function of the number of melons it contains.

2. An empty 10 gallon can weighs 5 pounds. When filled with melon juice it weighs 35 pounds. Plot the weight of the can and juice as a function of the volume of melon juice in the can.

3. Discuss the similarities and differences of your two graphs.

Melons and Melon Juice L016 scoring rubric

Math Domain

Number/Quantity	Shape/Space	Pattern/Function

Chance/Data	Arrangement

Math Actions (possible weights: 0 through 4)

3	Modeling/Formulating	2	Manipulating/Transforming

2	Inferring/Drawing Conclusions	2	Communicating

Math Big Ideas

Scale	Reference Frame	Representation

Continuity	Boundedness	Invariance/Symmetry

Equivalence	General/Particular	Contradiction

Use of Limits	Approximation	Other

The essential point in this problem is for students to display an understanding of the difference between a discrete linear function and a continuous linear function. Graphs for these functions are shown on the following page.

The two graphs are similar in the respect that they both begin at the point (0,5) and end at the point (10,35), with a constant rate of change.

The symbolic form of both of these functions is Weight = 5 + 3n.

The functions differ in the following respects:

In the case of the melons, the domain is the integers 0-10; in the case of the melon juice the domain is continuous between 0 and 10.

In the case of the melons, the range is the integers 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35; in the case of the melon juice the range is continuous between 5 and 35 pounds.
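A minimal Python sketch of the two functions may help make the distinction concrete. It assumes 3 pounds per melon and 3 pounds per gallon, as implied by the figures in the task (35 - 5 pounds spread over 10 melons or 10 gallons). The function and variable names are illustrative and are not part of the rubric.

```python
def weight(n):
    """Weight in pounds: 5 lb empty container plus 3 lb per melon (or gallon)."""
    return 5 + 3 * n


# Melons: the domain is the whole numbers 0 through 10, so the range is a list.
melon_domain = range(0, 11)
melon_range = [weight(n) for n in melon_domain]

# Juice: the domain is every volume from 0 to 10 gallons, so the range is
# an interval, described here by its two endpoints.
juice_domain = (0.0, 10.0)
juice_range = (weight(juice_domain[0]), weight(juice_domain[1]))

print(melon_range)   # eleven isolated values
print(juice_range)   # endpoints of a continuous interval
print(weight(2.5))   # meaningful for juice, not for melons
```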

	partial level	full level
Modeling/ Formulating (weight: 3)	Set up a weight vs. object (melons or gallons of juice) graph. Give some evidence of linear dependence.	Make a clear distinction between linear discrete and linear continuous dependence.
Transforming/ Manipulating (weight: 2)	Plot the linear functions with reasonable accuracy.	Show the plotted functions with a clear distinction between the discrete and continuous dependence.
Inferring/ Drawing Conclusions (weight: 2)	Realize that both graphs have the same angle of upward trend.	Realize that both graphs start at (0,5), but that the range of the melon graph will be discrete values satisfying the equation W=5+3n.
Communicating (weight: 2)	Provide only graphs with no prose description of similarities and differences.	Give a prose description of each graph in detail, and present an argument for the graphs being similar or distinct.

Scoring Student Performance

What does all this mean for the scoring of students who use balanced assessment packages? There are several premises that underlie our views on scoring. These are:

· The BA assessment packages are designed to allow their users to make informed judgments about both the success of individual students and the success of instructional programs as a whole.

· All BA tasks are inherently multidimensional and thus notions of unidimensional ranking of students, independent of purpose, are inappropriate.

· Scoring along any single dimension is at best ordinal and thus notions of ratio or even interval scales are inappropriate.

Assessing student performance on a single task

For each task we expect the person scoring the student’s performance to use one of the following performance icons for each of four different kinds of skill and understanding. These performance icons may be thought of as ordinal measures with roughly the following values:

Performance icons

the student shows little evidence of skill or understanding

[internal code 0]

the student shows a fragile skill or understanding

[internal code 1]

the student shows an adequate level of skill or understanding

[internal code 2]

the student shows a deep and robust level of skill or understanding

[internal code 3]

Here is a copy of a sheet recording the performance of one student on the “Melons and Melon Juice” task.

Content domain weighting

Number and Quantity	0
Shape and Space	0
Pattern and Function	1
Chance and Data	0
Arrangement	0

[Entries must sum to 1.]

Process weighting

Modeling/ Formulating	Transforming/ Manipulating	Inferring/Drawing Conclusions	Communicating
3	2	2	2

[Each entry is on a 0-4 scale.]

Student performance

	Modeling/ Formulating	Transforming/ Manipulating	Inferring/Drawing Conclusions	Communicating
score (internal code)	2	2	1	2
wtd. score	6/9	4/6	2/6	4/6

Since realistically any single task is unlikely to involve more than two mathematical domains or kinds of mathematical object, the content domain weighting table is unlikely to have more than two rows with entries. Indeed, most tasks will have entries in only one row. The assessor enters into each of the pertinent cells one of the four performance icons (or their internal codes) described above.

In this instance the task is deemed to be primarily about function as the mathematical object; it makes prominent demands on a student’s ability to model, and moderate demands on the student’s ability to manipulate, infer and communicate. These demands are indicated by weights of 3, 2, 2, and 2 respectively in the process weighting table.

The student received high partial credit, i.e. an internal code of 2, for Modeling/Formulating. Since the weight in this area is 3, the student received a weighted score on Modeling/Formulating of 2 × 3 = 6 out of 9 possible points. Similarly, the student received a weighted score of 2 × 2 = 4 out of 6 for Transforming/Manipulating. On Inferring/Drawing Conclusions, the student received low partial credit, i.e. an internal code of 1, for a weighted score of 1 × 2 = 2 out of 6. For Communicating, the weighted score was 2 × 2 = 4 out of 6.
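The arithmetic for this one task can be sketched in a few lines of Python. The weights and internal codes are the ones quoted in the text for this student; the variable and function names are illustrative only.

```python
MAX_CODE = 3  # highest internal performance code ("deep and robust")

# Process weights for "Melons and Melon Juice" and this student's codes.
weights = {"Modeling/Formulating": 3, "Transforming/Manipulating": 2,
           "Inferring/Drawing Conclusions": 2, "Communicating": 2}
codes = {"Modeling/Formulating": 2, "Transforming/Manipulating": 2,
         "Inferring/Drawing Conclusions": 1, "Communicating": 2}


def weighted_scores(weights, codes):
    """Return {action: (points earned, points possible)} for one task."""
    return {action: (codes[action] * w, MAX_CODE * w)
            for action, w in weights.items()}


result = weighted_scores(weights, codes)
for action, (earned, possible) in result.items():
    print(f"{action}: {earned}/{possible}")
```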

It is important to stress that the numbers used in computing weighted scores from raw scores all vanish in the final presentation of the student’s record.

Assessing overall student performance in mathematics

Assessing the overall performance of individual students in mathematics requires us to record their performance on individual tasks and to aggregate their performance across a large number of individual tasks while preserving, to as large an extent as possible, the richness of the information yielded by the observations on individual tasks.

One could imagine reporting the complete student record of performance on each task. This volume of information is likely to overwhelm whoever looks at it, and, except in the case of the clinician specifically interested in a particular student, to be of little use to anyone. For example, a complete student record might have this structure.

Complete student record

student name	class/date
action task/domain	Modeling/ Formulating	Transforming/ Manipulating	Inferring/Drawing Conclusions	Communicating
task 1
Number and Quantity
Shape and Space
Pattern and Function
Chance and Data
Arrangement
task 2
Number and Quantity
Shape and Space
Pattern and Function
Chance and Data
Arrangement
*etc.*

Each cell in this table contains the weighting code of that task on the skill/understanding in question for a given kind of mathematical object. Each cell corresponding to a task the student has tried also contains a performance icon (or its internal code) denoting the quality of his or her performance on that aspect of the task.

Such a body of information might well be useful to teachers for informing instructional decisions and, in addition, can be thought of as a cumulative record of a student’s mathematics activities throughout the school years. However, it may be overwhelming for purposes of accountability. For such purposes it becomes necessary to aggregate student performance over several, possibly many, tasks. Let us consider a specific example.

Here is a comparison of the work of two students on a collection of tasks. The students were asked to choose five tasks. Their choices were to be governed by the following constraints:

no more than 2 tasks from any single content domain

no more than 3 skills tasks

at least 2 problems
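The three constraints above, together with the requirement of five tasks, can be checked mechanically. The Python sketch below is a hypothetical illustration: the representation of a task as a (domain, kind) pair, the labels "skill" and "problem", and the sample selections are assumptions made for the example, not data from the study.

```python
from collections import Counter


def selection_is_valid(tasks):
    """tasks: list of (content domain, kind) pairs, kind being e.g.
    'skill' or 'problem'. Returns True if the selection obeys the rules."""
    if len(tasks) != 5:
        return False
    per_domain = Counter(domain for domain, _ in tasks)
    per_kind = Counter(kind for _, kind in tasks)
    return (max(per_domain.values()) <= 2   # at most 2 tasks per domain
            and per_kind["skill"] <= 3      # at most 3 skills tasks
            and per_kind["problem"] >= 2)   # at least 2 problems


# Two made-up selections: the first obeys the rules, the second takes
# three tasks from a single content domain.
ok = [("Number and Quantity", "skill"), ("Number and Quantity", "skill"),
      ("Pattern and Function", "problem"), ("Pattern and Function", "problem"),
      ("Shape and Space", "skill")]
too_many = [("Pattern and Function", "problem")] * 3 + \
           [("Chance and Data", "skill"), ("Arrangement", "skill")]

print(selection_is_valid(ok), selection_is_valid(too_many))
```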

Here are the results of the students’ efforts:

The leftmost column indicates the name of the task. The next column indicates the predominant mathematical content domain (n - Number and Quantity, ss - Shape and Space, f - Pattern and Function, cd - Chance and Data, a - Arrangement). The bold figures are the weights given to each of the mathematical action categories for each of the problems. The figures in italics are the scores (0, 1 or 2 for partial level, 3 for full level) that each student received on that section of each of the problems.

Note that a column sum of the scores at this stage would record the fact that the students were equally competent at modeling, transforming, and communicating, although a teacher might intuitively feel that there were distinctions to be made on the performance of these two students.

To calculate the weighted sum of the first student’s performance on modeling in the domain of Function we note that the first function problem was assigned an M/F weight of 4 and that the student made 3 marks (i.e. full level) on it. The second problem in the domain of function was assigned an M/F weight of 3 and the student made partial level on it, receiving 2 marks. Thus the student received 18 M/F marks (3×4 + 2×3) out of a possible 21 (3×4 + 3×3) in the domain of function, resulting in a decimal score of 0.86.
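The calculation for a single cell can be sketched in Python as follows. The function name and the (weight, score) pair representation are illustrative assumptions; the numbers are those of the worked example above.

```python
MAX_SCORE = 3  # marks awarded for full-level performance


def aggregate(results):
    """results: list of (weight, score) pairs for one action in one domain.
    Returns (marks earned, marks possible, decimal score or None)."""
    earned = sum(weight * score for weight, score in results)
    possible = sum(weight * MAX_SCORE for weight, _ in results)
    ratio = round(earned / possible, 2) if possible else None
    return earned, possible, ratio


# First student, Modeling/Formulating, domain of function:
# weight 4 with 3 marks, then weight 3 with 2 marks.
print(aggregate([(4, 3), (3, 2)]))  # (18, 21, 0.86)
```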

Similar calculations are done for each cell. Performance icons are then assigned as follows.