Me Vs. the Machine

I recently returned from a professional conference where I most definitely did not feel a lot of winning. While I’ve worked at my institution for almost two decades, I’m new to the field of Institutional Research.

On the one hand, back in the day, I did minor in social sciences, earned an A in statistics, and got a perfect score on the Analytical section of the GRE (the section that ended up being discontinued, so it very well could be that everyone who took it got a perfect score). But on the other hand, my undergraduate major was English, and my Master’s degree was in Liberal Arts. So I don’t exactly go around calling myself a “numbers person” like many of my IR colleagues do.

Sitting in sessions where terms like polynomial, coefficient, linear regression, and event history model* were bandied about recklessly was kicking my Imposter Syndrome into high gear and leaving me feeling like the stupidest person in the room.

I slumped in the back and tried to blend in and nod knowingly at the appropriate moments. I pictured being chased by a wild horde of pitchfork-wielding statisticians if they were to discover, to their horror, that an English major had sneaked into their midst.

My job in IR so far has been consumed, not with calculating standard deviations or remembering what the hell a polynomial is, but with getting accurate data to the state, the feds, the accrediting agencies, the surveys, the administration, the faculty, staff, students, press, and John Q. Public. That, in itself, is a more daunting task than you might think when every single seemingly simple term (e.g., “student”) can be defined in multiple ways (someone enrolled in a class, you say? Okay, but enrolled as of when? Do you include auditing students? What about those enrolled only in zero-credit classes? Students taking off-campus contract courses tuition-free?). The answers always depend on who is asking and why and when, and common definitions are few and far between.

So as I sat in sessions about actual, hard-core research and analysis, about 60% of my brain experienced feelings of inferiority, while the other 40% said, “The kind of rigorous, careful, scientific process required to establish true causation for things like why students drop out? Ain’t nobody got time fo dat.”

There was one session about qualitative analysis coding, and the very learned speaker mentioned linguistics. I perked up for a moment, but just as I was thinking I had found My People, he lost me down a deep rabbit hole discussing various complex statistical models that had nothing to do with language. The methodology he was describing for analyzing qualitative data was so far removed from anything I do at my job, or anything I even realized we should be doing, that I started to feel a little dizzy from the disconnect.

It was when the speaker showed us some examples of machine-categorized open-ended survey responses from his research that I began to cheer up and feel a win coming on. I started reading the sample responses from their telephone survey to non-returning students. Short, one-sentence responses were coded well by the software, but I noticed that the longer, more nuanced responses tended to be coded in over-simplified ways that didn’t truly capture the heart of the student’s problem.

For example, one student went on a diatribe about the difficulty of going to school while working but said she had to work because tuition is so high and the school hadn’t offered her an adequate level of funding and, at the very end, said her advisor did not help her resolve the situation. Since she used the word “advisor” but not the term “financial aid,” the software categorized her into the advising issues group, whereas it seemed to me the main reason she didn’t return to school was financial, so it would have been more accurate to count her in the financial aid issues group. The best advisor in the world probably couldn’t have gotten her to stay, but a bigger financial aid award might’ve done the trick.

As I read along, I noticed there were quite a few that I would’ve categorized differently than the software did. The presenter admitted to the limitations of the technology, particularly with responses over one sentence in length, and I began to feel vindicated.

Okay, you could call me the unreliable rater, but I’m convinced it’s the machine failing the reliability test here. I’m sure I’ve had more heart-to-hearts with teary-eyed freshmen and read more of the texts they produce over the decade that I taught the freshman seminar than whatever brilliant but cloistered dude wrote this software.

I could feel that sense of pride and win growing in my heart as I thought to myself: maybe English majors do have a place in IR after all. Maybe there are gray areas and narratives hidden inside those seemingly black and white numbers and open-ended responses that we can tease out with our literary analysis skills. Maybe we understand the subtleties of language and can pick out recurring themes and hidden meaning a little more adeptly than a string of code can.

I felt downright John Henry-esque in my victory over the machine.

I realize that I only have the machine beat for now, until the techie people get the kinks worked out of these Computer Assisted Qualitative Data Analysis software packages and get them to the point where they and I have an equal chance at understanding what the kids today are trying to tell us.

When that day comes and I am inevitably replaced by a machine, don’t weep for me. For I will pursue my dream of opening a puppy petting parlor/wine bar/collaging studio, and the world will be a much better place for us all.

*It turns out that “event history models,” disappointingly, have nothing to do with black holes or with smizing for the camera.
