Pass or Fail: Multiple Assessments to Determine True Learning

In this multi-part series, I provide a dissection of the phenomenon of retention and social promotion. Also, I describe the many different methods that would improve student instruction in classrooms and eliminate the need for retention and social promotion if combined effectively.

While reading this series, periodically ask yourself this question: Why are educators, parents and the American public complicit in a practice that does demonstrable harm to children and the competitive future of the country?

When it comes to getting rid of our current pass-fail system, I have developed six strategies. Developing a system with varied assessments is one of them.

Many educators view standardized testing as a necessary evil, and some see it as a completely useless process that never reflects what students know. Proponents of K-12 assessments, on the other hand, contend that there is no adequate way to enforce educator accountability without them.

The majority of states and school districts rely on large-scale assessments when it comes to student grade progression, but this should only be a small piece of a larger analysis of individual students. Multiple sources of information about a student should be used in determining his or her readiness for the next grade, and teachers should make use of them.

Compared to the first two stages of change, the idea of creating multiple assessment measures is very easy. To some extent, public schools already make use of multiple assessment measures. For instance, multiple assessment measures are standard for students with IEPs, and IEPs are not usually changed without making reference to multiple assessment measures. The real key to implementing this stage is not so much the employment of multiple measures as it is the actual selection of those measures and the way they should be administered and interpreted.

The use of multiple assessments, including some that do not entail tests, makes allowance for the considerable proportion of the student body that does not perform well on tests. Multiple assessments also allow for the possibility that a student simply had a bad day on the day of the test. Finally, the inclusion of some assessment elements that do not consist of rigid, multiple-choice tests reduces the likelihood of students “overthinking” higher-level questions and inadvertently providing the right answer to the wrong question.

A combination of assessments is best both for simple assessment of learning and for making decisions about retention. The decision to hold a student back, if made at all, should be made on the basis of multiple measures of performance, and never strictly by a standardized test.

Pass or Fail: How Did We Get to This Assessment Place?

When it comes to assessing what students have learned, educators give mixed reviews. A few view the current way we assess students as on point, while a few others feel it’s detrimental. Most educators fall somewhere in the middle: they understand that current assessments are needed but feel that testing has perhaps gone too far, to the detriment of students.

How can we improve student assessment?

Let’s begin with a review of the current status of assessments, identifying exactly what types of assessments are currently used, and their strengths and weaknesses. There are numerous types of assessments employed throughout the world to assess knowledge, skills, and even fundamental intelligence. Often, though, the advertised purpose of a test is not consistent with its capacity for assessment. The Intelligence Quotient or IQ test, for instance, is supposed to gauge individual intelligence.

Proponents of the IQ test stress that it is a viable model for gauging what someone’s academic or even professional potential might be. Culturally, this is widely accepted: a high IQ score is the equivalent of confirmation that an individual will succeed academically and professionally. The reality, however, is different. The IQ test itself is not an absolute gauge of a person’s intelligence and is far from being a perfect measure of intelligence. At best, it assesses a specific kind of intelligence with a reasonable degree of accuracy: a problem-solving ability, really, or the ability to recognize patterns in problems.

Like the IQ test, many other standardized assessments set out to gauge a particular knowledge or skill. For instance, SAT and ACT tests gauge individual capacity for verbal and mathematical reasoning. At least, that is how most of us view them. However, while the SATs and ACTs test analytical skills and comprehension skills, the alignment of these elements is not necessarily effective. As most seasoned educators are aware, the limitations established by the format of these tests and the reliance upon multiple-choice responses mean that there is minimal scope for assessment. Many students with excellent academic potential may not score well on these tests. Others who have a more limited aptitude may score well and find themselves in academic settings that are too demanding for their ability level.

Since different types of abilities and many different skills play a role in academic success, the focus on analytical skills measured through multiple-choice questions and the focus on math and reading comprehension are inherently limiting. These assessments often overlook written and oral communication skills. They also overlook fundamental reasoning and argumentation skills, since students are limited in the scope of their answers. Finding a solution to that, of course, is not easy, but it is something we’ll explore in more depth in future posts in this series.

Pass or Fail: Multiple Assessments to Determine True Learning

Student assessment is a necessary evil of the teaching profession, but what kind of assessment is actually most effective?

Ankur Singh, formerly a student at the University of Missouri–Columbia, took an English class in his junior year of high school that influenced him profoundly.

“It was the only class I’ve ever taken where the lessons I learned will carry with me for the rest of my life, and after completion, I felt ten times smarter,” he says. The teacher focused on developing the students’ critical-thinking skills and ensured that they were able to analyze poems and essays. The teacher was also keen to allow each student to form his or her own opinions.

Because Singh loved the junior-year English class so much, he expected the college-prep AP English course he enrolled in during his senior year to be equally enjoyable. However, it turned out to be an awful experience. The critical-thinking skills he had honed the previous year were of no use in the new class; instead, the classes focused solely on preparing students for the inevitable exam. “It frustrated me to no avail, and I ended up doing very poor in AP English,” Singh says. “And I found the same thing in all of my other AP classes, which seemed more focused on college preparation and standardized tests rather than genuine learning.”

Singh began to wonder what the real purpose of education was. “All around me were students studying diligently, stressing out about their grades, homework, the ACT, college essays, AP tests. And here I was not caring about any of those things. Were there no students in this school who wanted anything more than just a college degree and a job?” He began to feel lonely, and then angry. Finally, during an AP French exam, he used the time to write a furious letter to the College Board, expressing his misgivings.

Though he expected to be reprimanded by his French teacher for writing a letter rather than taking the exam, she listened sympathetically and told him that she felt the same frustrations with the system. Though she had wanted to take the French students on field trips to a French bakery or show them a French film, she was forced to teach to the test. “Maybe if the students themselves spoke out against it,” she said, “it could all change.”

As Ankur Singh’s story demonstrates, the current model of assessments can lead to frustration in students and teachers alike. In a previous article, we outlined ways in which administrators in education might manage the hiring of qualified teachers and how they might also use the availability of qualified teachers to promote student success in the classroom. In the following articles, we will look at the use of multiple assessment measures in determining a student’s abilities and academic potential.

The basic premise of this strategy is as follows: Many states and school districts rely on large-scale assessments when making decisions about student grade progression. They continue to do so despite the evidence that such assessments are not always an accurate reflection of a student’s academic abilities, and despite warnings from most testing experts that high-stakes retention or promotion decisions should never be made on the basis of a single assessment.

How can we change the way we look at student assessments – and how can they benefit our students as a result?

Pass or Fail: Why High-Stakes Tests for Retention Decisions Fails

As an education community, do we put too much stock in standardized testing? In other words – are we unfairly retaining students based on a testing system that is flawed?

High-stakes tests in retention decisions have added another layer of controversy to the debate over retention. Test-based retention raises two questions: whether retention is itself an educationally beneficial placement, and whether the chosen tests support valid inferences about student knowledge and about which placements would be educationally beneficial. As we said above, most of the time, the inferences are not valid at all.

In the wealth of academic research on this particular subject, we find Penfield discussing the extent to which research has confirmed test-based grade retention as a particularly problematic approach to education. Teachers initiate the majority of retention decisions, but an increasing number of states and districts have taken to using high-stakes tests to make retention decisions.

To be specific, Texas, Florida, and Louisiana are among the states that retain children in gateway grades primarily based on standardized test performance. Several large school districts, including New York and Chicago, also employ standardized tests as key criteria for grade retention decisions.

Common Core Standards are heavily reliant on assessments to achieve their aims. As most educators and parents are already aware, the principal objective of the Common Core Standards is to provide “a consistent, clear understanding of what students are expected to learn, so teachers and parents know what they need to do to help them.”

The Common Core Standards seek to make every child learn at the same pace, with teachers and parents roped into the process of standardizing the learning experience. Kindergarteners must learn one set of things to advance to first grade. First graders must learn another set of things, and so on, all the way through the system.

Because they are intended to be nationwide, the Common Core Standards lead to assessments that are as standardized as possible. Students must submit to testing with even more regularity than they have in the past, and must demonstrate, in these test scenarios, that they have acquired all of the knowledge and skills that the standards demand of them.

Numerous factors appear to influence the validity of assessments, including the opportunities that students have had to learn the content of the test, whether the test measures the intended constructs, whether the test leads to the intended educational goals, whether the scores are reflective of high-quality instruction, and whether the test has afforded students sufficient opportunities to demonstrate their knowledge, skills, and achievements.

The American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME) jointly publish widely used standards for fair and appropriate testing. These standards also play a central role in determining the appropriate use of tests, but the reality falls far short of the ideal they describe.

The use of tests in making retention decisions is complicated by the disproportionate impact that test-based retention policies have on historically disadvantaged groups, including ethnic minority groups, racial minority groups, and English language learners. Numerous studies indicate that large achievement gaps exist between the majority and protected student populations.

These gaps point to the possibility that students from protected populations are at risk of displaying disproportionately low passing rates on tests used to make retention decisions. Recent reports have pointed to disproportionately high retention rates for students in Florida and Texas, which have high proportions of minorities among their students. Penfield also questions the validity of scores obtained from high-stakes tests, which does not appear to be supported for all protected populations.

Penfield’s research alone seems grounds to take a long, hard look at assessments and how we use them when it comes to promoting or retaining students. If there is a big gap between the groups of students who pass assessments and those who fail them, then it seems fair to assess the assessments themselves.

Pass or Fail: Test-Based Retention Practices and Education Standards

Is testing an accurate portrayal of what students actually know and does it help them progress from one level of mastery to the next?

Today, retention occurs primarily or exclusively based on test results, often without due consideration of the fairness or appropriateness of the test itself.

Some researchers have argued that test-based retention may have a net benefit to society. Contradictory as it might seem, given what we have just discussed, proposed reasons for the benefit include the following: test-based retention can (a) create a more homogeneous class environment that can facilitate instruction, (b) provide motivation for all students to obtain the requisite knowledge, and (c) provide motivation for all teachers and school officials to deliver adequate learning opportunities for all students. Each of these outcomes is itself a potential benefit.

According to the theories of classical utilitarianism, what matters is the aggregated benefit across all individuals. On this view, the benefit outweighs the costs because the majority of students can thrive under a system that promotes or retains them based on their test scores.

As Levin has identified, supported by Xia and Glennie, the costs of test-based retention are numerous. They include loss of income and lost tax revenues, increased reliance on government-subsidized health coverage by those who are impacted by these policies, increased criminal activity, higher reliance on welfare benefits, and added instructional resources required for each additional year of schooling generated by the retention.

From a purely economic perspective, the costs associated with test-based retention rival the resulting benefits of these policies to promoted students. Although there has yet to be a formal weighing of costs and benefits of retention policies, the overall net economic benefits of test-based retention policies appear to be negligible.

The economic costs generate an educational disadvantage large enough to have a dramatic adverse impact on the life chances of the retained students. We must also factor in ethical issues: testing heavily infringes on the life chances of low-performing students, constituting a significant violation of fairness.

Even if a net economic benefit resulted from a test-based retention policy for society as a whole, the acceptance of these benefits demands the educational disenfranchisement of so many minority and poor students as to be unconscionable.

Test-based retention is also problematic from a purely assessment-based perspective, regarding how it assesses and how these assessments measure up to basic parameters of fairness. Most forms of test-based retention, considered against criteria for fair and valid testing, fall short. The first problem is measurement validity.

Regarding measurement validity, test-based retention leads to an evaluation of each specific test used in retention decisions. No one can assess validity in a general way because scores are not rigorously applied when retention decisions are made. A school may retain a child who achieves a score within a certain range, based on the determination of relevant education professionals. A different group of education professionals could promote another child who has the same score or even a lower score. There’s little evidence of consistency in scoring.

The effectiveness of treatment is perhaps the policy most prone to consistent violation by test-based retention. Since grade retention is an educational placement, the standards for testing should result in educational placements that are educationally beneficial to the student. Indeed, if retention is to be a test-based decision, educators should evaluate grade retention per se to determine whether it is ever educationally beneficial.

Of course, retention does have a certain intuitive appeal, which we should not entirely discount. Students who have not adequately mastered certain material should be offered a second attempt to master it. They should undertake that attempt before they graduate to the next grade, where there are new demands to contend with and where the material becomes more difficult.

There are limitations to this rationale, though. Among other things, it clearly ignores the mediating issues. Grade retention inevitably reestablishes students in the same learning environment in which, on their first attempt at knowledge and skills mastery, they had little success. Retention not only becomes pointless but often takes on the character of punishment.

The embarrassing stigma associated with grade retention is, as we have already shown, intense. There is also the anxiety that most students feel with respect to the retention experience. These negative attributes make retention unlikely to engender any real educational benefits. Students may be worse off in terms of academic and cognitive growth than if they had never experienced retention.

Some studies demonstrate that retention puts most students in a worse position than they would be if they had not been retained, meaning that the placement has no educational benefits at all and thus that it is also contrary to standards for test-based placements.

If, collectively as educators, we push back against testing culture as a form of retention measurement, perhaps we can start to find real solutions for students.

Pass or Fail: Alternative Assessment Measures

If we know that students who are retained based on assessment scores continue to perform poorly in school, statistically speaking, then what can we do to turn the tide?

With standardized testing occupying such a prominent position on the educational landscape of the United States, it is not easy to specify a viable alternative. What alternatives to tests are available and, more to the point, which alternatives might help eliminate test-based retention policies in education?

One of the first and most obvious alternatives to test-based assessments, specifically for retention policies, is a teacher-based assessment. Because of their intimate connection with students, teachers have a real potential to intervene and change the system. Indeed, teacher-based assessments ought to be readily manageable and academically viable.

Administrators could supply teachers with criteria for assessment and give guidance on how to apply those criteria in assessing student knowledge and skill. Hypothetically, this kind of system would represent only a relatively minor alteration of the existing education system, although problems may arise in enacting it.

Because of the scrutiny of the consequences of test-based retention policies, some researchers have already chosen to focus on the relationship between teacher-initiated retention and poor educational outcomes. Although there is often little distinction between retention decisions based on high-stakes tests and those based on teacher assessment, most studies have still associated teacher-initiated and test-based retention policies with differential educational outcomes.

Several factors come into play, however, if we are to seriously consider teacher-based assessments as an alternative to testing-based retention. There are important differences in selection criteria between the methods as well as differences in the interventions and available resources for retained students. These factors also tend to vary quite significantly across districts and states as a reflection of the established retention criteria, mandated interventions, and budget limitations.

One might, however, question whether there is a significant difference between retention based on grades on a test and retention based on grades meted out as the considered opinion of a teacher. It is clear that retention is an inevitable consequence of a graded school system, and retention is no more viable when triggered by teachers than by test scores.

Beyond high-stakes testing and teacher assessments, however, there are many alternative methods for assessing a student’s mastery of materials. Multiple-choice questions, which are what most standardized tests use, are also only one of the available testing formats. The format of tests themselves should be revised to allow for a fairer and more accurate assessment of the knowledge and skills outlined in a curriculum.

These suggestions just scratch the surface of what is possible when it comes to alternatives to the testing that exists today – so why aren’t we doing more to explore these avenues?

Pass or Fail: When Assessments are Used for Retention – The Fallout

Retaining a student due to low assessment scores doesn’t help much, if at all.

When tests are used to make retention decisions, retained students are likely to receive a low-quality educational placement because many of the causes of their poor test performance will simply be repeated. Most tests used in retention decisions produce scores that are partly attributable to low-quality instruction and unintended linguistic and cultural factors. Whenever this is the case, students who are already at a socioeconomic and cultural disadvantage find themselves educationally disenfranchised for the second time.

This problem also raises the question of whether graded learning structures are viable at all. With neither retention nor social promotion offering a positive educational placement for struggling students, the structure of the system itself comes into question.

Grissom and Shepard’s study demonstrates that retained students drop out at rates higher than non-retained students. The study is a path analysis of samples ranging in size from 10,000 to 40,000 drawn from different geographical regions. Across these various samples, retention was associated with an increase in dropout rates of between 14 and 29 percentage points.

Alexander et al. used a logistic regression framework and found that the odds of dropout were approximately four times higher for students who had been retained than for comparable non-retained students.

Jimerson also determined that retained students had a 50 percent higher chance of dropout by age nineteen than students of a matched comparison group who were never retained.
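The three dropout figures above are stated in different units (percentage points, odds, and relative chance), so they are not directly comparable. As a rough sketch, the Python below converts each into an implied dropout rate for retained students. It assumes a hypothetical 10 percent baseline dropout rate for non-retained students; that baseline and the function names are illustrative assumptions, not figures or methods taken from any of the studies.

```python
# Hypothetical baseline: dropout rate for non-retained students.
# This 10% figure is an assumption for illustration only.
BASELINE = 0.10


def from_percentage_points(baseline, points):
    """Add an absolute increase given in percentage points."""
    return baseline + points / 100


def from_odds_ratio(baseline, odds_ratio):
    """Scale the baseline odds, then convert the odds back to a probability."""
    odds = baseline / (1 - baseline) * odds_ratio
    return odds / (1 + odds)


def from_relative_increase(baseline, percent_higher):
    """Apply a relative increase, such as a 50 percent higher chance."""
    return baseline * (1 + percent_higher / 100)


low = from_percentage_points(BASELINE, 14)
high = from_percentage_points(BASELINE, 29)
odds_based = from_odds_ratio(BASELINE, 4)
relative = from_relative_increase(BASELINE, 50)

print(f"14 to 29 points higher: {low:.0%} to {high:.0%}")
print(f"Four times the odds:    {odds_based:.0%}")
print(f"50 percent higher:      {relative:.0%}")
```

Under this assumed baseline, "four times the odds" works out to roughly three times the dropout rate rather than four, which is why odds and probabilities should not be read interchangeably. A different baseline would change every number here, so the output only illustrates how the units differ.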

What’s more, many researchers in the field consider grade retention to be among the best predictors of later school dropout. In this case, rating students based on assessments hurts the student in both the short term and the long term, showing that relying on tests alone is not a sound way to determine what is right for the student.

Accountability: Just One Piece of the School Reform Puzzle

School reform can no longer rely mostly on inputs—that is, giving schools more resources and more support. In order for schools to really help the students on hand, the past must play a role and so must the individual needs of the school.

Do standards and accountability work?

Time has shown that inputs have no real impact on student performance. Federal edicts, such as NCLB, have enforced protocols based on standards, testing, and accountability. Standards emphasize performance objectives and require high levels of accountability from educators.

Required reform and accountability, particularly those which impose sanctions similar to those imposed by NCLB, often create much stress and anxiety. This certainly has been the case since NCLB went into effect. Many educators ask whether it is fair to hold schools accountable for student achievement. And, even if it is “fair,” how are we to measure such achievement? What testing and evaluation formulas will be used? The answers to questions like these are not easy. Obviously, achievement can only be guaranteed if we assess it in some way. However, current assessment models are flawed.

Research exists to suggest that standards and accountability may improve learning for some disadvantaged students, particularly those with disabilities. When some schools implemented accountability guidelines, they promoted increased collaboration among educators and created an environment where teachers expected disabled students to perform better, which in turn encouraged better learning outcomes.

Some countries have been able to show effective and useful outcomes based on their use of certain accountability policies. However, American policy-makers and researchers still do not have any real evidence that these latest accountability reforms are working to improve the performance of the vast majority of students.

What’s the argument surrounding accountability?

Conversations around school accountability have been polarized. Politicians and parents often want to hold schools and teachers completely responsible for student achievement. Teachers point to disinterested students and uninvolved parents, saying that there is only so much they can do. But studies have shown that if teachers and students work together, and schools hold themselves accountable, great strides can be made. All of this discussion of accountability and standards is intended to bring us to a place where schools are performing better and our children are learning.

Researchers at Sam Houston State University in Huntsville, Texas observed positive strides toward improved learning outcomes among a variety of middle schools. The researchers believed that improvement strategies must not only improve learning, but also develop responsiveness and social equity. While studying middle schools, they found that teachers at high-performing schools were using teaching strategies that required students to think critically, and strategies that involved the use of real-world problems.

These teachers were not simply teaching abstract ideas or teaching to the test. The researchers noted that student achievement can be improved when students receive recognition for efforts such as note-taking and doing homework, and when they have opportunities to work collaboratively in groups and engage in active learning, such as the testing of hypotheses.

These findings show that the type of assessment or accountability that NCLB brings is not the be-all and end-all of the teaching equation. Rather, the quality of instruction is the biggest part of learning. It is paramount that we continue to work toward a more balanced solution, finding ways to encourage quality instruction while also monitoring results.

Inputs alone cannot properly reform a school or district; it takes constant monitoring and understanding of the student population to effect change that will positively impact the students it is meant to serve.