Building an Assessment that Makes Itself Obsolete

Ross Markle
Jun 1

Educators are well-acquainted with the difference between formative and summative assessments. Formative measures happen along the way and are designed to give us feedback on learning: where our strengths lie and which areas need attention before we get to the summative assessment. The summative assessment then measures whether the intervention has succeeded or failed or, if you prefer, whether learning has occurred.

Personally, I've done much more work in what I would call diagnostic assessment. It's certainly not summative, but it's also a different context than what most people think of as formative. With diagnostic assessment, there is the same low-stakes setting and interest in identifying strengths and opportunities for growth.

However, diagnostic assessment typically takes place before the intervention — not during. While formative assessment is generally administered in a context where the learner is already familiar with the learning process (e.g., I know I'm working on writing, math, etc.), diagnostic assessment also tends to raise awareness about the learning experience itself. For example, with our ISSAQ platform, our goal is to help college students better understand the behavioral, motivational, emotional, and social aspects of student success.

One of the things we often hear when we talk to students about the experience is something to the effect of: "I wasn't even thinking about these things."

Diagnostic assessment — when done well, at least — should also have a direct connection not just to feedback, but to resources. When many people think of formative assessment, they imagine a classroom setting where an educator is present to interpret findings and guide next steps. With diagnostic assessment, that scaffolding isn't assumed. The assessment itself needs to do more of that work, pointing learners toward specific supports and, ideally, building in some directive action planning.

So, in short: diagnostic assessment is designed to measure characteristics a priori, provide informative feedback, and connect learners with resources — with action planning built in where possible.

The Validity Question Gets Complicated

This creates some genuinely interesting questions when it comes to validity. There are many ways to address the question of whether the assessment is working:

Is it measuring what really matters? (Content validity)
Is the assessment content predictive of success? (Criterion-related validity)
Do learners meaningfully engage with and process the feedback they receive? (An open methodological question — I'm not sure we have a good validity framework for this yet.)
Do the resources and interventions provided after feedback actually improve success?

That last question is particularly tricky. In the ISSAQ context, we provide feedback to students, but also to advisors, instructors, and the institution as a whole — not only to allow the student to act, but to guide those working with the student to provide better support.

When Remarkable Data Isn't Enough

Last week, I visited one of our institutional partners that administers ISSAQ as part of its student success course. For the first time, we had data not just on first-term GPA or retention to the second year, but on outcomes five and six years after students had taken the survey. The results were remarkable: in a sample of approximately 500 first-year students, 42% of those who had entered with low Engagement had graduated or were still enrolled after five years. For students who entered with high Engagement, that rate was 71%.

A gap of nearly 30 percentage points, based on a simple 10-item survey administered in students' first semester.
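For anyone who wants to check the arithmetic, here is a minimal sketch that uses only the two rates reported above. The function and variable names are illustrative and are not part of ISSAQ.

```python
def percentage_point_gap(rate_low: float, rate_high: float) -> float:
    """Return the difference between two rates, in percentage points."""
    return round((rate_high - rate_low) * 100, 1)


def relative_rate(rate_low: float, rate_high: float) -> float:
    """Return how many times more likely the high group was to persist."""
    return round(rate_high / rate_low, 2)


# Five-year rates (graduated or still enrolled) as reported in the post.
low_engagement_rate = 0.42
high_engagement_rate = 0.71

gap_points = percentage_point_gap(low_engagement_rate, high_engagement_rate)
ratio = relative_rate(low_engagement_rate, high_engagement_rate)

print(f"Gap: {gap_points} percentage points")
print(f"High-Engagement students persisted at {ratio}x the low-Engagement rate")
```

The rounded rates give a 29-point gap. Note that this is a difference in percentage points, not a 29% relative difference: in relative terms, the high-Engagement group persisted at roughly 1.7 times the rate of the low-Engagement group.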

The psychometrician in me was excited. Look at the instrument we built — how well it predicts outcomes this far out. But standing in front of a room of about fifty educators, the facilitator in me had a somewhat different response: we hadn't done enough.

We had the data. We knew these students were struggling with Engagement. We even tried to support them. But whatever we did didn't do the trick. Scientifically, I should acknowledge that we don't know what the success rate would have been without any intervention — perhaps the gap would have been even larger. But I think it's fair to say we didn't do enough... or perhaps we didn't do the "right" things to help these students succeed.

The Goal Is Zero Predictive Validity

This leads to what I think is the most fascinating point of all when it comes to diagnostic assessment — especially from the perspective of someone who has spent his career building and promoting these tools.

Our ultimate goal is to completely wash out predictive validity.

Think about it: if you have any meaningful student success effort, your goal is to say that any student can succeed regardless of what they bring to your institution. Low Sense of Belonging? Doesn't matter. You have interventions that can help, either by providing social support or by helping students build the skills to create connection and feel like they belong. If you're doing that work well, then the students who arrive with low Sense of Belonging can experience success in the same way as those who already feel connected.

The goal is zero predictive validity. I can't say I've fully operationalized that as an evaluation criterion yet, but I can see a future where we do — where the absence of a predictive relationship between ISSAQ factors and student outcomes is itself the evidence that an institution's support systems are working.
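One way such a criterion might eventually be sketched is shown below. This is not an established ISSAQ method. It assumes the outcome is a simple yes/no (graduated or still enrolled), that students are split into a low and a high group on one factor, and that an institution has agreed on a tolerance (here a hypothetical 5 percentage points) inside which a gap counts as "washed out." All counts are made up for illustration, and the interval is a basic normal-approximation (Wald) interval.

```python
import math


def rate_gap_with_ci(success_low, n_low, success_high, n_high, z=1.96):
    """Return (gap, ci_lower, ci_upper) for high-group rate minus low-group rate.

    Uses a normal-approximation (Wald) interval; z=1.96 gives about 95% coverage.
    """
    p_low = success_low / n_low
    p_high = success_high / n_high
    gap = p_high - p_low
    se = math.sqrt(p_low * (1 - p_low) / n_low + p_high * (1 - p_high) / n_high)
    return gap, gap - z * se, gap + z * se


def gap_washed_out(ci_lower, ci_upper, margin=0.05):
    """True only if the whole interval sits inside +/- margin.

    A gap that is merely 'not significant' does not qualify: a small sample
    gives a wide interval, which is uncertainty, not evidence of equity.
    """
    return -margin < ci_lower and ci_upper < margin


# Hypothetical cohort A: counts chosen to match rates of 42% and 71%.
gap_a, lo_a, hi_a = rate_gap_with_ci(84, 200, 142, 200)
washed_out_a = gap_washed_out(lo_a, hi_a)

# Hypothetical cohort B: a large cohort with rates of 65% and 66%.
gap_b, lo_b, hi_b = rate_gap_with_ci(1300, 2000, 1320, 2000)
washed_out_b = gap_washed_out(lo_b, hi_b)

print(f"Cohort A gap: {gap_a:.2f} (95% CI {lo_a:.2f} to {hi_a:.2f}), washed out: {washed_out_a}")
print(f"Cohort B gap: {gap_b:.2f} (95% CI {lo_b:.2f} to {hi_b:.2f}), washed out: {washed_out_b}")
```

The design choice that matters is the margin. Failing to detect a gap is not the same as showing there isn't one, so the check asks whether the entire interval falls inside the agreed tolerance rather than whether it happens to include zero.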

What This Means for Practice

Validity — particularly when we're talking about impact — is genuinely complex. There are many ways to define and assess whether a diagnostic assessment is working well, and I'd imagine many of these questions apply to formative assessment as well.

For me, what matters most is being fluent with your data and working with your partners and constituents to make meaning locally. A gap of nearly 30 percentage points in five-year outcomes is both a measure of how much noncognitive factors matter and a challenge to do better: to build the kinds of interventions, support structures, and institutional cultures that eventually make that gap disappear.

That's the real measure of a good diagnostic assessment: not how well it predicts failure, but how effectively it helps institutions prevent it.
