Can we measure student learning outcomes qualitatively? I don't think so...

... but I could be wrong.

Typically I write these posts from a position of knowledge and expertise, seeking to espouse a viewpoint that I've firmly established in my own mind. This, admittedly, is not one of those times. In this post I will posit some hypotheses, but as I'm not an expert on qualitative methodology, I'm sincerely open to discussion on this topic. So, before those of you who are qualitatively inclined start to muster your ire, please hear me out.


Qualitative vs. Quantitative

No matter how familiar you are with the comparison and contrast of quantitative and qualitative methods, it's worth taking a minute to define some terms and assumptions, because as some (e.g., Aspers & Corte, 2019) point out, even among like-minded researchers, definitions of these terms can vary.


The distinction begins with some really heady stuff, such as the nature of reality itself (ontology), the ways in which we, as conscious beings, can even know that reality (epistemology), and how we can gather information to formulate knowledge (methodology; Guba & Lincoln, 1994; Shah & Corley, 2006). This can then lead to debates about functionalism vs. interpretivism, inductive vs. deductive reasoning, etc. (Yes, I'm just using a few big words to establish my bona fides.) But based on my read of the issue, there is one basic assumption that determines the appropriateness of quantitative versus qualitative methods:

  • Those who prefer quantitative methods assume that there are tangible elements of the human experience that are common, and can thus be assigned the same value (i.e., a number).

  • Those who prefer qualitative methods assume that the human experience is so rich, contextualized, and framed by our own perception that it cannot be compared across individuals, and thus cannot be quantified.

This is perhaps the central point, and if you disagree with my description, this is the best place for us to start the conversation. However, hopefully I've covered as many key points as I can in ~2000 characters so that we can move on.


It's worth noting - particularly if you're unfamiliar with what I might refer to as "pure" qualitative approaches - how these assumptions lead to differing methodologies. Many consumers of research are familiar with the ways in which quantitative researchers use surveys, rubrics, or other measures to gather data to measure, assess, compare, and otherwise understand phenomena.


Qualitative researchers, however, use tools such as case studies, ethnographies, or other methods that allow them to deeply explore, evaluate, and understand phenomena. Typically, a researcher will observe an individual or group and describe their experiences in relation to the topic of study, using observation and insight, rather than quantitative evidence, to develop understanding.


Do we need student learning outcomes?

While the field of student learning outcomes (SLOs) assessment has grown and changed significantly over the last several decades, there is still debate about the value or even the necessity of such efforts. Obviously, given that I've spent a notable chunk of my career in this area, I acknowledge the value of such exercises (when done well, of course).


But, in reference to the purpose of this post, before we even talk about assessment we should establish a second premise: that creating SLOs is a feasible exercise. Mind you, I'm not asking if we should endorse broad-based assessment systems, standardized accountability metrics, or even your institution's assessment-focused professional development day. All I am asking is this: "Do you think we can describe at least some aspects of student learning through the common language of outcomes?"


If you agree, you may find this to be an almost rhetorical question. Yet there are some (particularly those who write editorials for higher education publications) who believe SLOs are impractical; that the educational experience - particularly in higher education - is too expansive and individualized to capture in concise statements. You know... guys like the character in the meme below (expertly cast and portrayed in the HBO show, Silicon Valley):



If you feel this way, I'm happy to discuss that point, but it rests on a more fundamental assumption, akin to the comparison of qualitative and quantitative methods, and is another branch point away from this conversation.


If, however, you agree to some extent with my qualitative/quantitative distinction and that SLOs are feasible, then we can proceed.


The Major Hurdle

If we agree that student learning outcomes are reasonable, we have already stated that at least some element of learning is common. Certainly - and I think even the most ardent quantitative methodologist would agree - the paths to that outcome, the mechanisms by which students learn, and the advantages/disadvantages that served/challenged them along the way are not common, but we've at least created some common articulation of the student experience.


Does this not violate the fundamental assumption of qualitative research?


If we acknowledge that learning can be articulated in a common way, then we have synthesized, at least to some extent, the human experience. Even a dichotomous (0 or 1) distinction acknowledges commonality between the behaviors classified as 0's and those classified as 1's. Competency-based programs, which sometimes consider only dichotomous scoring, are one (perhaps the most "extreme") example of this. In these cases, students either achieved the competency or did not.
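The dichotomous case can be made concrete with a toy sketch (the student names, scores, and cut point below are all invented for illustration): even a pass/fail competency judgment maps every student onto the same two values, which is itself an act of quantification.

```python
# Hypothetical competency check; names, scores, and the cut point are
# invented purely to illustrate dichotomous (0/1) scoring.
def meets_competency(score, cut=70):
    """Map a performance onto a common 0/1 scale: achieved or not."""
    return 1 if score >= cut else 0

students = {"A": 85, "B": 62, "C": 70}
results = {name: meets_competency(s) for name, s in students.items()}

# Even this coarse coding asserts that the behaviors behind every "1"
# share something in common across students - a quantitative claim.
print(results)  # {'A': 1, 'B': 0, 'C': 1}
```

The point isn't the arithmetic; it's that assigning any shared value, even a binary one, presumes a commonality across individual experiences.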


Hopefully we're still together, or at least you can acknowledge my train of thought. That is, (a) if qualitative research is based on the presumption that the human experience is individually unique and thus cannot be quantified, and (b) we acknowledge that SLOs are reasonable to articulate the commonalities of learning, then (c) qualitative methods are not appropriate for measuring SLOs.


Yet, I still see many people drawn to more qualitatively oriented approaches. Simply put, many are uncomfortable with the structure, engagement, or perceived lack of authenticity in many forms of assessment, primarily selected-response ("multiple choice") measures. Thus, they apply rubrics to tasks like writing assignments, performance assessments, or similar tools in an effort to channel the individuality of qualitative methods.


Yet these are not qualitative approaches. Instead, they are different approaches to quantitative inquiry, (ironically) enacted out of a resistance to quantitative methods. Below, I'd like to discuss some examples of why people take these tactics, the common pitfalls, and what we can do to gather the best information possible.


Let's be clear: Rubrics are both quantitative and standardized

In recent years, the Association of American Colleges and Universities (AAC&U) has led a substantial effort around their VALUE (Valid Assessment of Learning in Undergraduate Education) initiative. Based on a set of well-established rubrics, rooted in an array of domains such as critical thinking, information literacy, and civic engagement, they've worked with colleges and universities across the country to promote what they tout as a more "authentic" means of assessment.


VALUE heavily promotes the use of class-embedded assignments, which are then evaluated using rubrics. As they stated in their 2017 report, On Solid Ground: "Rather than a standardized test divorced from the curriculum, VALUE draws evidence from the actual courses and teachers at an institution, assessing the learning artifacts (papers and assignments) produced by students to demonstrate their achievement of specific learning outcomes" (p. 3).


While much of AAC&U's work is impressive, commendable, and has advanced student learning in higher education, it's also often misunderstood and mismarketed. First, those intrigued because they feel the approach is more qualitative in nature fundamentally misunderstand the assumptions outlined above. The assessment is not the assignment or task, but the rubric. Once a rubric is applied in order to create a score for students, this method of assessment is quantitative, no matter how individualized, engaging, or "authentic" that task might be.


Second, because the same rubric is used across students, classes, programs, and even institutions, this is standardized. Personally, I find the quote above from the AAC&U's report infuriating, because it pits "authenticity" against "standardization." If the rubrics above are used in a common way across any unit - student, class, institution, etc. - then they are standardized. If the rubrics are adapted at any of those levels, then standardization is lost, as is the comparability of the scores that result from those rubrics.
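A minimal sketch of that trade-off (the rubric levels and point values below are invented, loosely echoing a four-level VALUE-style rubric): scores are comparable only while every unit applies the identical rubric; adapt the levels anywhere, and the resulting numbers no longer mean the same thing.

```python
# Hypothetical four-level rubric for a single dimension; labels and
# values are invented for illustration.
SHARED_RUBRIC = {"benchmark": 1, "milestone_low": 2, "milestone_high": 3, "capstone": 4}

def score(artifact_level, rubric=SHARED_RUBRIC):
    """Applying a rubric converts a judgment into a number - a quantitative act."""
    return rubric[artifact_level]

# Standardized: two programs apply the identical rubric, so their
# scores carry the same meaning and can be compared.
prog_a = score("milestone_high")  # 3
prog_b = score("milestone_high")  # 3, same meaning

# Adapted: a third program relabels and rescales its levels. Its "3"
# is the top of a three-point scale, not comparable to a "3" above.
ADAPTED_RUBRIC = {"developing": 1, "proficient": 2, "exemplary": 3}
prog_c = score("exemplary", ADAPTED_RUBRIC)  # 3, but on a different scale
```

The numbers themselves are trivial; the design point is that standardization is a property of the instrument being held constant, not of the task.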


I don't mean this to bash the VALUE rubrics or their use. To be clear, I certainly have my concerns and complaints about the ways in which they're used. All I wish to say is that (a) this is not qualitative inquiry, (b) if you value qualitative approaches, then you shouldn't be using rubrics at all, and (c) in an effort to avoid some of the perceived weaknesses of quantitative methods, you've actually lost the validity of either approach.


But it's this fear of "standardization" that is worth discussing. There are three general examples where I see people use what I'd call "qualitatively-rooted" methods of assessment based on this fear. I'd like to discuss these cases, where they make an error similar to some of the issues already discussed here, and how they might be informed by either qualitative or quantitative methodologies to gather better information about student learning.


Example 1: We have an ill-defined construct

In my days as a student-affairs assessment professional, and in several other projects on which I've worked, I've encountered several very broadly defined constructs - sociocultural competence, creativity, and civic engagement, to name a few. On several occasions, faculty and staff have even questioned the possibility of these constructs being assessed. They'll usually say something to the effect of: "Well that just can't be measured!"


Some institutions venture to measure such constructs without a thorough understanding or effort to create meaning across relevant constituencies. In other words, it's common for an institution to adopt "civic engagement" or "cultural diversity" without actually establishing a meaningful definition of those constructs. Thus, a faculty member in the biology department might address and define "civic engagement" in an entirely different way than someone working in student life. When it comes time to assess civic engagement, each of these practitioners - lacking a meaningful definition - views this as a qualitative phenomenon: it's individualized (either from the student or programmatic perspective) and thus can't be quantified.


In fact, there are many meaningful definitions of civic engagement (e.g., AAC&U; Torney-Purta et al., 2015). Where many practitioners fall short is by assuming that, because they don't have a meaningful definition (or lack the skills to establish one), it can't be quantified.


So, my first recommendation is this: just because you don't have a sufficient operational definition to guide quantification doesn't mean you should use a qualitative approach. A qualitative approach should only be used if the definition is truly unknown.


(Spoiler alert: my recommendations are generally going to point to the need to consult with an assessment expert, though the approaches they use are not beyond anyone in higher education.)


Rather than a nebulous, ill-defined assessment that allows anyone in any setting to connect to a similarly ill-defined construct, what is needed is a thorough review of the literature. Institutions should seek to either adopt an existing model (e.g., the VALUE rubrics) or - as I'd prefer - synthesize and adapt findings from the theory in order to create a meaningful local definition. Certainly, assessment practitioners are helpful in this regard, particularly when it comes to translating theory into tangible learning outcomes, but such an exercise is essentially a literature review, which many individuals in an institution are able to conduct.


Example 2: We don't have a readily available measure

Early in my career, I remember working with a program on one of the aforementioned, seemingly nebulous constructs. While the program had agreed that the construct itself was important, they were not sufficiently familiar with research in the area to truly understand the granularity and complexity of extant theory. When discussing measurement approaches, one of the staff said something to the effect of, "The only way I could think to assess this is by asking students if they learned it."


To be sure, this type of approach has dominated much "assessment" work in the history of higher education. After all, if we truly break down most course surveys, they are somewhere between a satisfaction survey of the instructor and a self-report of learning.


If creating a definition of a complex construct seems difficult, building a meaningful assessment of that construct must seem nigh impossible. But, simply because someone doesn't understand how to build one and one is not readily available, this does not make the construct ineffable.


Other than using a self-report of learning, another common tactic is to create a rubric and apply it to student assignments. Again, I often find this tactic used more out of a lack of understanding about how to meaningfully link construct definitions, tasks, and scoring. Thus, rubrics are applied because they define the end goal without restricting the inputs - channeling that desire to individualize the experience.


A rubric may be a completely appropriate means of assessment, or even the best means. Yet I more often observe that a rubric (and the underlying attempt to allow multiple methods of expression) is applied more out of a lack of understanding of assessment development than out of thoughtful design. Here again, consulting an assessment practitioner can help answer this question.


Example 3: We want to capture the student voice

Even when a thorough construct definition and well-developed assessment are available, people may employ performance assessments and rubrics in order to capture the "student voice." Presentations of results will often refer to multiple forms of evidence, pairing data tables with quotes or examples from individual students.


In this case... I've got no qualms. In fact, this is an area where rubric-based methods far exceed purely selected-response measures. As much as I love and am compelled by data, there are many who are swayed most by a powerful story. Good assessment practitioners pair the two as much as possible in order to cover all audiences.


...


Ultimately, there are a few points I hope you've gleaned from this. First, if we're going to employ qualitative methods, we should truly understand what they mean, when they're appropriate, and their limitations. This should be a decision about choosing the best method rather than simply avoiding those where we have discomfort or objections. Second, I wish more practitioners understood precisely what is meant by the term "standardized" and the distinction between task and assessment. No matter how "authentic" the task, the rubric is the assessment. Third, collaboration with proper assessment experts is key. They can help make meaning of these complex and nebulous constructs, identify the appropriate means of assessment, and - most importantly - promote the effective interpretation and use of the information you gather.


I want to conclude by saying again that I welcome input on the conversation. Assessment is hard! The more we can discuss different approaches and perspectives, the more we can learn from one another.


References

  • Aspers, P., & Corte, U. (2019). What is qualitative in qualitative research. Qualitative Sociology, 42(2), 139-160.

  • Guba, E. G., & Lincoln, Y. S. (1994). Competing paradigms in qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 105-117). Thousand Oaks, CA: Sage.

  • Shah, S. K., & Corley, K. G. (2006). Building better theory by bridging the quantitative-qualitative divide. Journal of Management Studies, 43(8), 1821-1835.

  • Torney-Purta, J., Cabrera, J. C., Roohr, K. C., Liu, O. L., & Rios, J. A. (2015). Assessing civic competency and engagement in higher education: Research background, frameworks, and directions for next-generation assessment. ETS Research Report Series, 2015(2), 1-48.


© 2019 by DIA Higher Education Collaborators.
