[SBA] Determining Grades

In this series:

  1. Writing Learning Standards
  2. Constructing Proficiency Scales
  3. Designing Assessment Items
  4. Determining Grades

Determining Grades

It’s time to report out. How would you translate the following into a proficiency level, letter grade, or percentage? What would you assign to Aaron, Blake, and Denise?

Gradebook at time of first Learning Update

If your reporting policy requires a proficiency level (e.g., Grades K-9 in BC), analyze the data and make a judgement. To me, Aaron has demonstrated Extending, Denise Developing. Blake has also demonstrated Developing. Or Proficient. I’m waffling.

What if this were your gradebook for Math 10? In BC, you may use proficiency scales but must provide letter grades and percentages. In this post, I’ll propose a solution–admittedly flawed–to this problem. But first, a bit about why this is a problematic practice…

Percentage Problems

Think of a student who has achieved 80% in Math 10. Describe their level of performance.

Got it? Great! Now do 79% and 81%.

Don’t stop! Finish the Bs.

A letter grade and percentage mandate suggests a difference between 73% and 85%–both Bs in BC. Quantitatively? Sure. In the point-gathering paradigm, 73% leaves almost twice as many points on the table as 85% (i.e., the “Lo-B, Hi-B” refrain).

73% vs. 85%, with apologies to 99pi

But qualitatively? Not really. See the Ministry of Education’s letter grade definitions:

F (0–49): The student has not demonstrated, or is not demonstrating, minimally acceptable performance in relation to the learning outcomes for the course or subject and grade.
Policy Development Background & Rationale Document (PDF)

There are not thirteen (85 − 73 + 1) variations on very good. Three is a stretch:

NB: pretty good < good

Extend the table. Write distinctly different descriptors of all levels, from 86% up to 100%, 72% down to 0%.

0-36 didn’t fit.

You can’t. Whereas letter grades differentiate six levels of performance, percentages differentiate one hundred one. No teacher can be that precise (or accurate). Like objectivity and consistency, precision is a myth.

Standards-based assessment is not designed to produce percentages. Proficiency scales are not numbers! Still, teachers–of Grades 10-12 only–are required to report out a number. So, holding my nose…

Imperfect Solutions

🔴 1-2-3-4

To turn the data into a number, values need to be assigned to proficiency levels (e.g., Emerging = 1, Developing = 2, Proficient = 3, Extending = 4). Students receive a value on each outcome. The numerator sums these values across all of the outcomes; the denominator is the greatest possible sum. Aaron, Blake, and Denise receive 83% (B), 63% (C), and 48% (F), respectively.

Student Navigation Tool

This feels… off. Denise demonstrated partial (Developing) or complete (Proficient) understanding of seven of ten learning outcomes. Nevertheless, she is failing. This is because a 1-2-3-4 scale is harsh. One-out-of-four (i.e., 25%) for Emerging isn’t just a failing grade; it’s an unforgiving one. Also, two-out-of-four (i.e., 50%) for Developing leaves no wiggle room. Developing is more than a minimal pass.

🟡 2-3-4-5

A 2-3-4-5 scale feels more accurate. Aaron, Blake, and Denise now receive 86% (A), 70% (C+), and 58% (C-), respectively.

Student Navigation Tool

Note that Denise is now passing. I really like the example of Aaron since it illustrates that Extending is not “the new A.” To achieve an A, Aaron demonstrated Proficient in all, Extending in (just) a few. Further, Blake’s C+ feels fair. To “award” Blake a B, I’d want to see additional evidence of their proficiency (i.e., new data points at Developing in outcomes 2 or 6 or Proficient in outcomes 1, 7, or 10).

If 2-3-4-5 doesn’t work, play with 3-4-5-6. Or 46-64-85-100. And if you want to give some outcomes more weight than others, do so. For example, you can double values from solve systems algebraically.
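For those who want to tinker, here is a minimal Python sketch of the conversion described above. The function name, the level values, the optional weights, and the sample gradebook are all placeholders of my own, to adjust as you see fit (they are not Aaron’s, Blake’s, or Denise’s actual data, and this is not the Student Navigation Tool):

```python
def percentage(levels, values, weights=None):
    """Convert per-outcome proficiency levels to a percentage.

    levels:  most recent level demonstrated on each outcome, e.g. ["P", "E", "D"]
    values:  points assigned to each level, e.g. {"Em": 2, "D": 3, "P": 4, "E": 5}
    weights: optional per-outcome weights (defaults to 1 for every outcome)
    """
    if weights is None:
        weights = [1] * len(levels)
    top = max(values.values())                # value of the highest level
    earned = sum(w * values[level] for w, level in zip(weights, levels))
    possible = top * sum(weights)             # greatest possible (weighted) sum
    return round(100 * earned / possible)

# A hypothetical ten-outcome gradebook.
student = ["P", "P", "D", "P", "E", "D", "P", "P", "D", "P"]

print(percentage(student, {"Em": 1, "D": 2, "P": 3, "E": 4}))  # harsher 1-2-3-4 scale
print(percentage(student, {"Em": 2, "D": 3, "P": 4, "E": 5}))  # gentler 2-3-4-5 scale

# Double the weight of one outcome (e.g., solve systems algebraically):
print(percentage(student, {"Em": 2, "D": 3, "P": 4, "E": 5},
                 weights=[1, 1, 1, 2, 1, 1, 1, 1, 1, 1]))
```

Swapping in 3-4-5-6 or 46-64-85-100 is a one-line change to the values dictionary.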

Averaging

Conversations about averaging do not always offer nuance. The takeaway can be that averaging is just… wait for it… mean. Averaging across different outcomes–see above–is more than okay. It’s averaging within the same outcome that can be punitive. Let’s revisit the gradebook:

Gradebook at time of first Learning Update

For the sake of simplicity, I skipped a crucial step. These letters are not single data points. For example, prior to “it’s time to report out,” Denise’s “P” on the third learning outcome might have been “Em, Em, D, P, P.” Averaging would drag Denise down to Developing; she’d be anchored to her initial struggles. In the end, Denise demonstrated–successively–a Proficient level of understanding in relation to this learning outcome. That’s what matters; that’s what counts.
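To see the arithmetic behind that paragraph, here is a tiny Python sketch (my own illustration, reusing the 1-2-3-4 values from the previous section) contrasting averaging within an outcome with taking the most recent evidence:

```python
values = {"Em": 1, "D": 2, "P": 3, "E": 4}
history = ["Em", "Em", "D", "P", "P"]   # Denise's hypothetical record on one outcome

scores = [values[level] for level in history]
print(sum(scores) / len(scores))  # 2.0 -> the average drags her back to Developing
print(scores[-1])                 # 3   -> her most recent evidence is Proficient
```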

The fact that she didn’t know how to do something in the beginning is expected–she is learning, not learned, and she shouldn’t be punished for her early-not-knowing.

Peter Liljedahl, 2020, p. 258

* * ** *** ***** ******** *************

Marc has extended my understanding of assessment and this blog series reflects our collective thinking. Check out his assessment video from BCAMT!

[SBA] Designing Assessment Items

In this series:

  1. Writing Learning Standards
  2. Constructing Proficiency Scales
  3. Designing Assessment Items
  4. Determining Grades

Designing Assessment Items

There is a sentiment in BC that using tests and quizzes is an outdated assessment practice. However, these are straightforward tools for finding out what students know and can do. So long as students face learning standards like solve systems of linear equations algebraically, test items like “Solve: 5x + 4y = 13; 8x + 3y + 3 = 0” are authentic. Rather than eliminate unit tests, teachers can look at them through different lenses; a points-gathering perspective shifts to a data-gathering one. Evidence of student learning can take multiple forms (i.e., products, observations, conversations). In this post I will focus on products, specifically unit tests, in part to push back against the sentiment above.
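For reference, one elimination path through that sample item (a worked sketch, not the only acceptable method): rewrite the second equation as 8x + 3y = −3, then multiply the first equation by 3 and the second by 4 to get 15x + 12y = 39 and 32x + 12y = −12. Subtracting gives −17x = 51, so x = −3, and substituting back into 5x + 4y = 13 gives y = 7. The solution is (−3, 7).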

In the previous post, I constructed proficiency scales that describe what students know at each level. These instruments direct the next standards-based assessment practice: designing assessment items. Items can (1) target what students know at each proficiency level or (2) allow for responses at all levels.

Target What Students Know at Each Level

Recall that I attached specific questions to my descriptors to help students understand the proficiency scales:

This helps teachers too. Teachers can populate a test with similar questions that reflect the appropriate complexity at each level of a proficiency scale. Keep in mind that these instruments are intended to be descriptive, not prescriptive. Sticking too close to sample questions can emphasize answer-getting over sense-making. Questions that look different but require the same depth of knowledge are “fair game.” For example:

Prompts like “How do you know?” and “Convince me!” also prioritize conceptual understanding.

Allow For Responses at All Levels

Students can demonstrate what they know through questions that allow for responses at all levels. For example, a single open question such as “How are 23 × 14 and (2x + 3)(x + 4) the same? How are they different?” can elicit evidence of student learning from Emerging to Extending.
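One way a response might draw the connection (my sketch, not the only acceptable answer):

23 × 14 = (20 + 3)(10 + 4) = 200 + 80 + 30 + 12 = 322
(2x + 3)(x + 4) = 2x² + 8x + 3x + 12 = 2x² + 11x + 12

Both rest on the distributive property and four partial products; evaluating the second at x = 10 even reproduces the first (200 + 110 + 12 = 322). A difference: the base-ten partial products can be regrouped into place values, while the polynomial terms remain separate.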

Nat Banting’s Menu Math task from the first post in this series is an example of a non-traditional assessment item that provides both access (i.e., a “low-threshold” of building a different quadratic function to satisfy each constraint) and challenge (i.e., a “high-ceiling” of using as few quadratic functions as possible). A student who knows that two negative x-intercepts pairs nicely with vertex in quadrant II but not with never enters quadrant III demonstrates a sophisticated knowledge of quadratic functions. These items blur the line between assessment and instruction.
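To make that concrete, one example of my own: y = −(x + 2)² + 1 has its vertex at (−2, 1) in quadrant II and x-intercepts at −3 and −1, both negative, so those two constraints pair nicely. But no quadratic with two negative x-intercepts can avoid quadrant III: whether it opens up (dipping below the x-axis between the roots) or down (falling below it beyond the roots), it takes negative values at negative x-values.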

Note that both of these items combine content (“operations with fractions” and “quadratic functions”) and competencies (i.e., “connect mathematical concepts to one another” and “analyze and apply mathematical ideas using reason”). Assessing content is my focus in this series. Still, I wanted to point out the potential to assess competencies.

Unit Tests

Teachers can arrange these items in two ways: (1) by proficiency level then learning outcome or (2) by learning outcome then proficiency level. A side-by-side comparison of the two arrangements:

Teachers prefer the second layout–the one that places the learning outcome above the proficiency levels. I do too. Evidence of learning relevant to a specific standard is right there on a single page–no page flipping is required to reach a decision. An open question can come before or after this set. The proficiency-level-above-learning-outcome layout works only if students demonstrate the same proficiency level across different learning outcomes. They don’t. And shouldn’t.

There’s room to include a separate page to assess competency learning standards. Take a moment to think about the following task:

What could equations ① and ② have been? What else? How do you know?

Initially, I designed this task to elicit Extending-level knowledge of solve systems of linear equations algebraically. In order to successfully “go backwards,” a student must recognize what happened: equivalent equations having opposite terms were made. The p-terms could have been built from 5p and 2p. This gives 5p + 3q + 19 = 0 for ① and 2p - 5q - 11 = 0 for ②. (I’m second-guessing that this targets only Extending; 10p + 6q + 38 = 0 for ① and 10p - 25q - 55 = 0 for ② works too.) This task also elicits evidence of students’ capacities to reason and to communicate–two of the curricular competencies.
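A quick check of that reconstruction (my arithmetic, not part of the original task): 2 × ① gives 10p + 6q + 38 = 0 and −5 × ② gives −10p + 25q + 55 = 0; adding eliminates p, leaving 31q + 93 = 0, so q = −3 and, substituting back into ①, p = −2. Both proposed equations check: 5(−2) + 3(−3) + 19 = 0 and 2(−2) − 5(−3) − 11 = 0.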

Teacher Reflections

Many of the teachers I work with experimented with providing choice. Students self-assessed their level of understanding and decided what evidence to provide. Most of these teachers asked students to demonstrate two proficiency levels (e.g., the most recent level achieved and one higher). Blank responses no longer stood for lost points.

Teachers analyzed their past unit tests. They discovered that progressions from Emerging to Proficient (and sometimes Extending) were already in place. Standards-based assessment just made them visible to students. Some shortened their summative assessments (e.g., Why ask a dozen Developing-level solve-by-elimination questions when two will do?).

The shift to grading based on data, not points, empowered teachers to consider multiple forms (i.e., conversations, observations, products) and sources (e.g., individual interviews, collaborative problem solving, performance tasks) of evidence.

In my next post, I’ll describe the last practice: Determining Grades (and Percentages). Again, a sneak peek:

Update

Here’s a sample unit test populated with questions similar to those from a sample proficiency scale:

Note that Question 18 addresses two content learning standards: (1) solve systems of linear equations graphically and (2) solve systems of linear equations algebraically. Further, this question addresses competency learning standards such as Reasoning (“analyze and apply mathematical ideas using reason”) and Communicating (“explain and justify mathematical ideas and decisions”). The learning standard cells are intentionally left blank; teachers have the flexibility to fill them in for themselves.

Note that Question 19 also addresses competencies. The unfamiliar context can make it a problematic problem that calls for (Problem) Solving. “Which window has been given an incorrect price?” is a novel prompt that requires Reasoning.

These two questions also set up the possibility of a unit test containing a collaborative portion.

[E]valuation is a double edged sword. When we evaluate our students, they evaluate us–for what we choose to evaluate tells our students what we value. So, if we value perseverance, we need to find a way to evaluate it. If we value collaboration, we need to find a way to evaluate it. No amount of talking about how important and valuable these competencies are is going to convince students about our conviction around them if we choose only to evaluate their abilities to individually answer closed skill math questions. We need to put our evaluation where our mouth is. We need to start evaluating what we value.

Liljedahl, P. (2021). Building thinking classrooms in mathematics, grades K-12: 14 teaching practices for enhancing learning. Corwin.

[SBA] Constructing Proficiency Scales

In this series:

  1. Writing Learning Standards
  2. Constructing Proficiency Scales
  3. Designing Assessment Items
  4. Determining Grades

Constructing Proficiency Scales

BC’s reporting order requires teachers of Grades K-9 to use proficiency scales with four levels: Emerging, Developing, Proficient, and Extending. Teachers of Grades 10-12 may use proficiency scales but must provide letter grades and percentages. Proficiency scales help communicate to students where they are and where they are going in their learning. But many don’t. When constructing these instruments, I keep three qualities in mind…

Descriptive, Positive, Progressive and Additive

Descriptive

BC’s Ministry of Education defines Emerging, Developing, Proficient, and Extending as demonstrating initial, partial, complete, and sophisticated knowledge, respectively. Great. A set of synonyms. It is proficiency scales that describe these depths with respect to specific learning standards; they answer “No, really, what does Emerging, or initial, knowledge of operations with fractions look like?” Populating each category with examples of questions can help students–and teachers–make sense of the descriptors.

Positive

Most scales or rubrics are single-point posing as four. Their authors describe Proficient, that’s it. The text for Proficient is copied and pasted to the Emerging and Developing (or Novice and Apprentice) columns. Then, words such as support, some, and seldom are added. Errors, minor (Developing) and major (Emerging), too. These phrases convey to students how they come up short of Proficient; they do not tell students what they know and can do at the Emerging and Developing levels.

Progressive and Additive

BC’s Ministry of Education uses this phrase to describe profiles of core competencies: “[Profiles] are progressive and additive, and they emphasize the concept of expanding and growing. As students move through the profiles, they maintain and enhance competencies from previous profiles while developing new skills.”

I have borrowed this idea and applied it to content learning standards. It was foreshadowed by the graphic organizer at the end of my previous post: Extending contains Proficient, Proficient contains Developing, and Developing contains Emerging. (Peter Liljedahl calls this backward compatible.) For example, if a student can determine whole number percents of a number (Proficient), then it is assumed that they can also determine benchmark percents (i.e., 50%, 10%) of a number (Emerging). A move from Emerging to Proficient reflects new, more complex, knowledge, not greater independence or fewer mistakes. Students level up against a learning standard.
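For example (my numbers, simply to illustrate the containment): a student who finds 60% of 80 by reasoning “half of 80 is 40, a tenth of 80 is 8, so 60% of 80 is 48” is exercising the benchmark facts (50%, 10%) on the way to the whole-number percent.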

Emerging and Extending

The meanings of two levels–Emerging to the left and Extending to the right–are open to debate. Emerging is ambiguous, Extending less so. Some interpretations of Extending require rethinking.

Emerging

“Is Emerging a pass?” Some see Emerging as a minimal pass; others interpret “initial understanding” as not yet passing. The MoE equivocates: “Every student needs to find a place on the scale. As such, the Emerging indicator includes both students at the lower end of grade level expectations, as well as those before grade level expectations. […] Students who are not yet passing a given course or learning area can be placed in the Emerging category.” Before teachers can construct proficiency scales that describe Emerging performance, they must land on a meaning of Emerging for themselves. This decision impacts, in turn, the third practice of a standards-based approach, designing assessment items.

Extending

A flawed framing of Extending persists: above and beyond. Above and beyond can refer to a teacher’s expectations. The result: I-know-it-when-I-see-it rubrics. “Wow me!” isn’t descriptive.

Above and beyond can also refer to a student’s grade level. Take a closer look at the MoE’s definition of Extending: “The student demonstrates a sophisticated understanding of the concepts and competencies relevant to the expected learning [emphasis added].” It is Math 6 standards, not Math 8 standards, that set forth the expected learning in Math 6. When reaching a decision about proficiency in relation to a Math 6 outcome, it is unreasonable–and nonsensical–to expect knowledge of Math 8 content.

Characterizing Extending as I can teach others is also problematic. Explaining does not ensure depth; it doesn’t raise a complete understanding of a concept to a sophisticated understanding. Further, I can teach others is not limited to one level. A student may teach others at a basic complexity level. For example, a student demonstrates an initial understanding of add and subtract fractions when they explain how to add proper fractions with the same denominator.

Example: Systems of Linear Equations

In my previous post, I delineated systems of linear equations as solve graphically, solve algebraically, and model and solve contextual problems. Below, I will construct a proficiency scale for each subtopic.

Note that I’ve attached specific questions to my descriptors. My text makes sense to me; it needs to make sense to students. Linear, systems, model, slope-intercept form, general form, substitution, elimination–all of these terms are clear to teachers but may be hazy to the intended audience. (Both logarithmic and sinusoidal appear alongside panendermic and ambifacient in the description of the turbo-encabulator. Substitute nofer trunnions for trigonometric identities in your Math 12 course outline and see if a student calls you on it on Day 1.) The sample questions help students understand the proficiency scales: “Oh yeah, I got this!”

Some of these terms may not make sense to my colleagues. Combination, parts-whole, catch-up, and mixture are my made-up categories of applications of systems. Tees and hoodies are representative of hamburgers and hot dogs or number of wafers and layers of stuf. Adult and child tickets can be swapped out for dimes and quarters or movie sales and rentals. The total cost of a gas vehicle surpassing that of an electric vehicle is similar to the total cost of one gym membership or (dated) cell phone plan overtaking another. Of course, runner, racing car and candle problems fall into the catch-up category, too. Textbooks are chock full o’ mixed nut, alloy, and investment problems. I can’t list every context that students might come across; I can ask “What does this remind you of?”

My descriptors are positive; they describe what students know, not what they don’t know, at each level. They are progressive and additive. Take a moment to look at my solve-by-elimination questions. They are akin to adding and subtracting quarters and quarters, then halves and quarters, then quarters and thirds (or fifths and eighths) in Math 8. Knowing \frac{8}{3} - \frac{5}{4} implies knowing \frac{7}{4} - \frac{3}{4}.
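To make that analogy concrete (my examples, not pulled from a Math 8 scale): quarters and quarters is \frac{3}{4} + \frac{3}{4} = \frac{6}{4}; halves and quarters is \frac{1}{2} + \frac{3}{4} = \frac{2}{4} + \frac{3}{4} = \frac{5}{4}; quarters and thirds is \frac{2}{3} + \frac{3}{4} = \frac{8}{12} + \frac{9}{12} = \frac{17}{12}. Likewise, \frac{8}{3} - \frac{5}{4} = \frac{32}{12} - \frac{15}{12} = \frac{17}{12} folds in the like-denominator skill that \frac{7}{4} - \frac{3}{4} = \frac{4}{4} asks for on its own.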

Emerging is always the most difficult category for me to describe. My Emerging, like the Ministry’s, includes not yet passing. I would welcome your feedback!

Describing the Extending category can be challenging, too. I’m happy with my solve graphically description and questions. I often lean on create–or create alongside constraints–for this level. I’m leery of verb taxonomies; these pyramids and wheels can oversimplify complexity levels. Go backwards might be better. Open Middle problems populate my Extending columns across all grades and topics.

My solve algebraically… am I assessing content (i.e., systems of linear equations) or competency (i.e., “Explain and justify mathematical ideas and decisions”)? By the way, selecting and defending an approach is behind my choice to not split (👋, Marc!) substitution and elimination. I want to emphasize similarities among methods that derive equivalent systems versus differences between step-by-step procedures. I want to bring in procedural fluency:

Procedural fluency is the ability to apply procedures accurately, efficiently, and flexibly; to transfer procedures to different problems and contexts; to build or modify procedures from other procedures; and to recognize when one strategy or procedure is more appropriate to apply than another.

NCTM

But have I narrowed procedural fluency to one level?

And what about something like:

\frac{x}{3} + \frac{y}{2} = 3
\frac{x+3}{2} + \frac{y+1}{5} = 4?

More complicated? Yep. More complex? Probably not.
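To see why (a worked sketch of my own): multiplying the first equation by 6 gives 2x + 3y = 18, and multiplying the second by 10 gives 5(x + 3) + 2(y + 1) = 40, or 5x + 2y = 23. From there it is the same elimination as before: doubling the first and tripling the second gives 4x + 6y = 36 and 15x + 6y = 69; subtracting, 11x = 33, so x = 3 and y = 4. The extra fractions add steps, not depth.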

Note that my model and solve contextual problems is described at all levels. Apply does not guarantee depth of knowledge. Separating problem solving–and listing it last–might suggest that problem solving follows building substitution and elimination methods. It doesn’t. They are interwoven. To see my problem-based approach, watch my Systems of Linear Equations videos from Surrey School’s video series for parents.

Next up, designing assessment items… and constructing proficiency scales has done a lot of the heavy lifting!

[SBA] Writing Learning Standards

For several years, standards-based assessment (SBA) has been the focus of much of my work with Surrey teachers. Simply put, SBA connects evidence of student learning with learning standards (e.g., “use ratios and rates to make comparisons between quantities”) rather than events (“Quiz 2.3”). The change from gathering points to gathering data represents a paradigm shift.

In this traditional system, experience has trained students to play the game of school. Schools dangle the carrot (the academic grade) in front of their faces and encourage students to chase it. With these practices, schools have created a culture of compliance. Becoming standards based is about changing to a culture of learning. “Complete this assignment to get these points” changes to “Complete this assignment to improve your learning.” […] Educators have trained learners to focus on the academic grade; they can coach them out of this assumption.

Schimmer et al., 2018, p. 12

In this series, I’ll describe four practices of a standards-based approach:

  1. Writing Learning Standards
  2. Constructing Proficiency Scales
  3. Designing Assessment Items
  4. Determining Grades

Writing Learning Standards

In BC, content learning standards describe what students know and curricular competency learning standards describe what students can do. Describe is generous–more like list. In any mathematical experience a student might “bump into” both content and competency learning standards. Consider Nat Banting’s Quadratic Functions Menu Math task:

Think about the following ten “design specifications” of quadratic functions:

  A. Two negative x-intercepts
  B. Vertex in quadrant II
  C. Never enters quadrant III
  D. Vertex on the y-axis
  E. Positive y-intercept
  F. No x-intercepts
  G. Never enters quadrant I
  H. Has a minimum value
  I. Horizontally stretched
  J. Line of symmetry enters quadrant IV

You could build ten different quadratic functions to satisfy these ten different constraints.

Instead, build a set of as few quadratic functions as possible to satisfy each constraint at least once. Write your functions in the form y = a(x − p)² + q.

Which constraints pair nicely? Which constraints cannot be paired?

Is it possible to satisfy all ten constraints using four, three, or two functions?

Describe how and why you built each function. Be sure to identify which functions satisfy which constraints.

Students activate their knowledge of quadratic functions. In addition, they engage in several curricular competencies: “analyze and apply mathematical ideas using reason” and “explain and justify mathematical ideas and decisions,” among others. Since the two are interwoven, combining competencies and content (i.e., “reason about characteristics of quadratic functions”) is natural when thinking about a task as a learning activity. However, from an assessment standpoint, it might be helpful to separate the two. In this series, I will focus on assessing content.

The content learning standard quadratic functions and equations is too broad to inform learning. Quadratic functions–never mind functions and equations–is still too big. A student might demonstrate Extending knowledge of quadratic functions in the form y = a(x − p)² + q but Emerging knowledge of completing the square, attain Proficient when graphing parabolas but Developing when writing equations.

Operations with fractions names an entire unit in Mathematics 8. Such standards need to be divided into subtopics, or outcomes. For example, operations with fractions might become:

  1. add and subtract fractions
  2. multiply and divide fractions
  3. evaluate expressions with two or more operations on fractions
  4. solve contextual problems involving fractions

Teachers can get carried away breaking down learning standards, differentiating proper from improper fractions, same from different denominators, and so on. These differences point to proficiency levels, not new outcomes. Having too many subtopics risks atomizing curriculum. Further, having as many standards as days in the course is incompatible with gathering data over time. I aim for two to four (content) outcomes per unit.

In Foundations of Mathematics and Pre-calculus 10, systems of linear equations can be delineated as:

  1. solve graphically
  2. solve algebraically
  3. model and solve contextual problems

My solve algebraically includes both substitution and elimination. Some of my colleagues object to this. No worries, separate them.

In my next post, I’ll describe constructing proficiency scales to differentiate complexity levels within these learning standards. Here’s a sneak peek:

What do you notice?