A Tale of Two Interview Loops

It was mainly the worst of times, when a comedy of errors left me on a prolonged job search, alternating between the spring of hope and the winter of despair. But there is lemonade to be made from the almost three dozen lemons, er, interview loops I went through, and this article is that lemonade. Because, while astronomical amounts of money go into recruiting, and even though we live in the age of wisdom, most of the interviewing process is sorely broken at the majority of companies; and so we also live in the age of foolishness.

From my experience, most Big Tech companies more or less copy each other, and because they are giant ships that are impossible to turn quickly, change comes slowly. Many startups and smaller companies then look to the giants for inspiration, out of some combination of familiarity (because they used to work in Big Tech) or imitation (“we’re also Big Tech!”) or just assuming that anything Big Tech does must be good, because one hundred billion dollars can’t be wrong.

And so we end up with the typical tech interview loop:

  1. Sourcing: usually from a referral, but could be a recruiter reaching out after a LinkedIn search or, rarely, actually responding to an unsolicited application
  2. Recruiter screen: to do a sanity check and vet logistics (salary range and other expectations, like remote work)
  3. A general aptitude screen: coding or system design or leadership chat or some combination, depending on the role
  4. A panel, or “virtual on-site”: somewhere around 5 interviews scheduled all at once — but maybe not happening on the same day — with a mix of roles, from engineers to product managers to engineering managers to designers to QA
  5. If that went well, a final interview, with a senior leader, and then an offer

Some variant of this describes almost all of the ones I’ve come across, but: the devil is absolutely in the details, and those details are the difference between a great process and a terrible one. So let’s look at two hypothetical loops at those extremes.

The Bad Loop

If you’re hiring, and elements of the bad loop sound like your company, please think about improving them. If you’re looking, and you run into one a little too close to this: I’m sorry.

Sourcing

The bad interview loop usually starts off badly from the beginning. It’s obvious they don’t invest in candidate experience up front, and it rarely gets better further down the process.

There are a few ways to screw up sourcing, and the most common is just never getting back to candidates. No rejection on merit, no timeout rejection, no insulting GIF — just silence. This happens a lot.

Still better than nothing

Marginally better are companies that do reply, but after way too long. The worst of these for me was a startup that got back to my application — which was a referral — after 47 days. They set up a recruiter screen for the next day, which went well, and then an interview with the VPE for the following week. Unfortunately, that got rescheduled for the next week… and again, for the next week, and then again for the next week, and when it got to the 4th time, I decided to bow out of the process. After 84 days total.

The third common way to turn people away at this step is to send clearly automated emails or LinkedIn messages that are supposed to sound tailored to you. Like the recruiter is channeling a horoscope writer.

Hi Gabriel,

Your profile looks great. It looks like you are thriving at Hillrom, so I know it is a long shot that you’ll be interested, but hope springs eternal!

[Recruiting company] has been retained to recruit an Embedded Systems Manager at [hiring company]. This person will lead the embedded systems team comprised of electrical and software engineers while also serving as a software lead on connected care and digital health projects.

I’d welcome the opportunity to talk with you. Please let me know if you are interested to learn more. Thanks in advance and I look forward to your response.

An actual email I got. I’ve never worked with embedded systems.

In short, the sourcing is bad if they:

  1. Don’t value candidates, and
  2. Don’t value the company’s image

Recruiter Screen

This step is actually pretty hard to do badly, but it generally involves the recruiter being out of their depth and/or not caring about their craft. I’ve spoken to both (a) recruiters who had no idea there was a difference between Java and JavaScript, and (b) ones who did, but treated our conversation like the most annoying part of their day — like their Zoom background might as well have a giant “I’d rather be doing literally anything else” banner. And of course, the worst is (a) and (b) combined:

Recruiter: yeah hey, I’m in a rush, but this won’t take long

Me: okay…

Them: so I saw your resume, I don’t have it in front of me right now, but can you just answer a few questions?

Me: sure

Them: ok so this is a management role, you have 5+ years of experience managing, recruiting, and mentoring a team of diverse engineers?

Me, realizing they have, in fact, not seen my resume: yeah

Them: do you have 8 years of Java?

Me: uh… [trying hard to make it past the syntax errors] … yeah

Them: do you thrive in a fast-paced environment?

Me: sure

Them: can you speak good English?

Me: I think you said the quiet part out loud.

General Aptitude Screen

After the boilerplate kind of checks, comes the first true test of the candidate’s mettle: do they actually have a notion of how to do this job, or are they a complete fraud? For engineers, this is almost always a somewhat basic (sadly, often leetcode-y) technical interview. For managers, it’s one of three things:

  1. A purely technical interview, because the thinking is managers should also know how to leetcode
  2. A leadership interview
  3. A bit of both

The bad versions of this are the ones that don’t test aptitude required for the job. We’re hiring a staff engineer for their background in designing APIs and scalable architectures, but first: let’s make sure they can figure out the trick to finding the longest palindromic substring. Because that’s the thing they’ll constantly trip over in their day-to-day job: figuring out solutions to neat little puzzles that can be done in 20 minutes.

And that’s really the crux of the problem, because this kind of exercise filters out a few kinds of traits that, in my experience, a lot of good engineers have:

  1. Not performing well while being watched
  2. Not having done algorithmic puzzles in years
  3. Not particularly caring to spend dozens of hours preparing for interviews
  4. Not having slept well the night before the interview

The arguments for leetcode interviews don’t hold water either:

  1. It’s fair: everyone from the new bootcamp grad to the Sr. Fellow gets the same questions. Is that fair though? It’s fair for the new grad, because they have no work experience and it’s all they can be tested on. And early in their career, they should still remember this stuff. But it’s not fair to someone with a decade of experience who (correctly) hasn’t thought about sorting anything beyond typing my_list.sort() in almost exactly as long.
  2. It shows motivation: the thinking is that if you’re willing to put in the hours to dust off long-forgotten skills, that means you’re a go-getter. Either that, or you have a ton of free time. Or are willing to work hard for 3 months so you can rest and vest.
  3. It’s a great weeding-out tool: “here at Hooli, we get far too many yahoos applying and we need a simple and consistent way to halve the middle of the funnel.” This I can’t argue with. If you have too many applicants and just need an efficient way to politely send scores of them away, while making it look like meritocracy, then this loop design does in fact meet those requirements.

Panel

The next, and sometimes final step of the process tends to be an interview panel. It’s kind of like a death panel, except it’s very much real, and only decides your professional and financial future. It’s a holdover from the days of in-person interviews, which you can tell because it’s often called a “virtual on-site”.

Back in the before times, companies used to fly people over to their headquarters, even across country, for a gauntlet of interviews held over a full day. You’d talk to 3-7 people across ~4 interviews — because sometimes they’d double up, or you might even have several people in one interview. Like much in-office culture, most companies have lifted and shifted this process into the Internet, and now offer the same experience, but worse because it’s over Zoom. Also better though, because you don’t have to fly to every interview.

The idea of having several people interview a candidate is obviously fine. The problems here are more around how the interviews are structured, and how much freedom the candidate has to tailor this step so they can make a good impression. Because to a lot of people, if not most, this step is very stressful. Some would rather do it in one day and get it over with (even if they’d perform better otherwise) and some would rather split it into two or three days. Companies that don’t care also don’t care about candidates’ preferences here.

The more general problem though, that many companies are guilty of, is not adapting this step to the modern world. It’s not great for the company to sign up for more than ten people-hours of time (adding in the prep, write-up, and debrief time) just based on the one aptitude screen. Especially if it’s a low-signal one, like leetcode. Data is king here, but I would bet the panel pass rate at most quality companies is not that high, and so that adds up to a lot of waste all around: effort and stress for the candidate, plus time and energy for the interviewers.

The last thing to mention here is the companies not investing in quality tools, especially for the technical interviews: having the candidate write code in something like Google Docs instead of a purpose-built thing, like CoderPad or HackerRank. Same goes for whiteboarding, for like system design exercises. And using Microsoft Teams.

Final Interview

This step, mirroring the recruiter screen, tends to be more of a formality or sales tool. By now, the candidate has passed the panel gauntlet, and this final hurdle gives good candidates the opportunity to meet with a senior leader in the organization, while also giving that leader the opportunity to meet with promising hires before the offer goes out.

Where this is done poorly, it’s either not done at all, or it’s a very rigorous interview. Not doing one at all can be fine, but — especially in a more talent-driven job market — having this step can sway a candidate with multiple options. It’s also great, from an organizational health perspective, for the most senior leader to either be part of the interview loop, or schedule a 1:1 shortly after the candidate starts; doing both is even better.

This interview should be real, of course, with questions that give the interviewer insight into the candidate; but if it’s much more than a formality, it can put off the candidate (who has already been through the gauntlet), while also causing an internal problem if the candidate fails a hard interview after having already passed the panel. Can the more senior leader just override the panel? Because that doesn’t sound like a healthy org.

Surprise Step!

Sometimes, panels are split. Lots of soft thumbs up and down. And the job req has been open for months, so they don’t want to pass on a potentially good candidate. And so they decide to tack on another interview — or maybe repurpose the Final Interview above — in order to break the tie. This is a smell of a poor interview process. One that is not designed to produce strong yeses and nos. Or one that is, and has failed. Most well-run processes will explicitly forbid this, and for good reason: an 8th person won’t add that much value to the chorus.

The Good Loop

Now, for the fun one! Sorely few of the interview loops I’ve experienced have been good, but the ones that were provided so much signal that it would’ve been worth taking a lower offer. A lot of the anxiety of committing to a role comes from the company’s culture, and a great culture usually comes out through the interview loop.

Sourcing

Great sourcing is genuine. Also the things I said earlier about valuing the candidate and the company’s image. But great places to work do it with care. The messages are personal, not form- or AI-generated. They show that the sourcer has actually read and understood your resume, as well as the job description. And they make a good case for why they think you’d be a good fit and should choose to go through their particular interview gauntlet.

But good sourcing is hard. To do this well, it obviously takes a lot more time, and therefore people, and therefore money. So it’s not a bad signal if a company is just okay at this. BUT, if the sourcing really is great, that’s great signal: it shows the company has committed actual dollars to candidate experience, because they care about attracting the best employees into a great environment.

Recruiter Screen

A great recruiter screen gives the candidate a positive glimpse into what the rest of the process will be like. The recruiter is prepared, understands what the candidate’s resume or LinkedIn says, and has good follow-up questions. The chat is friendly but organized and the run-down is provided up-front: “first I’ll give you an overview of the company, talk a little more about the role and give you a chance to ask questions, and then talk about your experience and how it might be a good fit.”

It’s mostly pro forma, but value comes out of it: the recruiter gains more clarity than the resume provides, makes sure the logistics are lined up, and the candidate asks questions about the team/org, responsibilities, tech, etc. that aren’t in the job description. Even if it turns out that something doesn’t line up, like salary expectations, everyone leaves feeling like it was a productive call.

General Aptitude Screen

An effective aptitude screen tests just that: how well this person is suited for this role. This means that the defining characteristics of the role have been thought through and distilled into discoverable skills; and then questions have been created which are good at discovering those skills. But an important piece of the puzzle is where the confidence comes from, that those questions illuminate the correct things. In most places, it’s just someone’s gut feeling.

We make social networks here at Hooli, so we need to make sure people understand Dijkstra’s algorithm. We’ll ask a question where the obvious solution is the algorithm, and pass people that either use it in the solution; or they mention it, have a good reason for not using it, and still come up with a good solution.

Hypothetical, but I’m sure someone somewhere has written interview questions like this.

Some problems with the above:

  1. Do successful employees really need to understand Dijkstra? Is there a correlation between that and job performance? Is it important for this particular role and experience level?
  2. To most candidates that are familiar with Dijkstra, is it obvious that it’s the right solution here? Does the wording and the problem statement lead them there? Or is it just obvious to the people that came up with the question?
  3. Can the question actually be favorably solved in another way? Are we too narrowly targeting a particular way of thinking?

Good questions have been purposely designed all the way through. Their goals have been validated by data as good predictors of employee performance. The questions themselves have been vetted by trying them out on actual people, getting their feedback, and looking at false positives and false negatives. Ideally, those people aren’t just employees — though this, of course, is difficult.
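To make the false positive and false negative idea a bit more concrete, here is a minimal sketch of how a trial run of one question could be tallied. Everything in it is hypothetical: the `Trial` record, the `error_rates` helper, and the sample data are made up for illustration, and "strong performer" stands in for whatever ground truth a company actually trusts (say, later performance reviews).

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One person's trial run of a single interview question (hypothetical data)."""
    passed_question: bool   # did they pass the question?
    strong_performer: bool  # assumed ground truth, e.g. from later reviews


def error_rates(trials):
    """Return (false_positive_rate, false_negative_rate) for one question.

    False positive: a weak performer who passed the question.
    False negative: a strong performer who failed the question.
    """
    weak = [t for t in trials if not t.strong_performer]
    strong = [t for t in trials if t.strong_performer]
    false_positives = sum(t.passed_question for t in weak)
    false_negatives = sum(not t.passed_question for t in strong)
    fpr = false_positives / len(weak) if weak else 0.0
    fnr = false_negatives / len(strong) if strong else 0.0
    return fpr, fnr


# Made-up sample: four strong performers, four weak ones.
trials = [
    Trial(True, True), Trial(False, True), Trial(True, True), Trial(False, True),
    Trial(True, False), Trial(False, False), Trial(False, False), Trial(False, False),
]

fpr, fnr = error_rates(trials)
print(f"false positive rate: {fpr:.0%}, false negative rate: {fnr:.0%}")
# prints "false positive rate: 25%, false negative rate: 50%"
```

In this invented sample, the question turns away half of the people who would have done well, which is exactly the kind of thing a gut feeling won't tell you.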

The above applies to questions asked as part of the panel below. The essence here is that a great aptitude test is effective at predicting job performance.

Panel

I’m not sure what percentage of rejections in the interview process are false negatives — meaning a perfectly good employee has been turned away — but I would bet a small amount of money that it’s significant. As in, more than a quarter. And I’d also bet that a lot of that has to do with the candidate being nervous. And I’m not alone.

Chris Parnin of NCSU has a great paper in ACM on the matter, pointing out two things we all already know, intuitively:

  1. An estimated 40% of adults suffer from performance anxiety, and
  2. The typical software interview “has an uncanny resemblance to the Trier social stress test”

For those of you who haven’t clicked on that link, that stress test was created to induce change in physiologically measurable stress markers, like heart rate, cortisol, and other things. The test makes those markers increase, in most people, anywhere from 30-700%. And how does it do that? It’s a simple, three-part process:

  1. You have 5 minutes to prepare a 5 minute presentation for a job interview
  2. Ok, it’s time for the presentation, but: surprise! Your presentation materials have been taken away.
    • Also, the judges will be silent and maintain only neutral expressions the whole time
  3. Count down from 1022, by 13s. If you make a mistake, start again from 1022.

“Uncanny resemblance” indeed, right? It’s basically the same thing as a software interview, which gives the candidate a complex problem — much more so than counting down by 13s — except they don’t even have time to prepare! And the typical interviewer is cold and silent. (As an aside, please don’t be that interviewer. Read my Art of Interviewing, if you are.)

What’s more is that he says “scientific evidence finds that women experience disproportionately more negative effects from test and performance anxiety”, and while I’m not saying that this is the only reason tech might be so heavily male-dominated, I wouldn’t be at all surprised if it turns out to be a big factor.

So what’s a company to do? The above paper has one suggestion — and I think it’s a special case of a more general solution — which is the “private interview”. That is, rather than give the candidate a problem and watch them solve it, give them the problem and go away, letting them solve it in private. When they ran this experiment, the results were undeniable:

In the public setting 61.5% of the participants failed the task compared with 36.3% in the private setting. […]

Interestingly, a post-hoc analysis revealed that in the public setting, no women (n = 5) successfully solved their task; however, in the private setting, all women (n = 4) successfully solved their task—even providing the most optimal solution in two cases. This may suggest that asking an interview candidate to publicly solve a problem and think-aloud can degrade problem-solving ability.

from “Does Stress Impact Technical Interview Performance?”, in Association for Computing Machinery

The private interview is a great idea, and an improvement on the “take home” challenge, which suffers from the fact that some people will spend an hour on it, and others will spend ten, with two friends helping. But, there are problems with it. One that the paper mentions is that some people were frustrated by not having access to the proctor, to ask clarifying questions and be kept from drifting. So it’s not a one-size-fits-all solution.

But what’s the philosophy behind the private interview? That most people get too stressed out by a stress test, er, an interview, and doing it in private helps. Most people though: not all people. And that, I think, is the key insight: what the paper calls “provide accessible alternatives”.

Great companies will tailor their process to help the candidate succeed. They will allow the panel to be split up over as many days as there are interviews. They will allow the candidate to pick the programming language. They will have a “live” option and a “take home” option — I have yet to see the “private interview” option, though it’s by far my favorite. They will train their interviewers to be kind, to be expressive, and to focus on the person, not the question.

In short, as with all good science, they will make sure they have adjusted the process to eliminate as many confounding factors as they can, in order to minimize false negatives and false positives, and give everyone a fair chance.

Final Interview

There’s not much left to say about the final interview other than that a good leader will leverage it to motivate a promising candidate. So instead, I’ll summarize the two loops:

  • bad: uninterested people going through the motions of a process that will usually produce some result
  • good: thoughtful, kind, well-trained people assessing how well this candidate might fit into this role

The Art of Interviewing

The job interview process is a high-stakes dance that’s notoriously difficult and full of missteps — on both sides. (At least, in software engineering it is. If you care about another industry, <Jedi hand wave/> this is not the article you’re looking for.) Avoiding the missteps is an art, just like it is in dancing; we get better with practice, just like dancing; it’s a lot of fun when it goes well, just like dancing; and leading with dexterity is paramount. This article focuses on that last bit: not on what questions to ask or how to structure the time, but rather on artfully leading an interview. First, however — because it’s a crucial part of the interview context — we should start with how the job candidate views the process. Just like dancing.

Because while for Larry the Interviewer, it’s just a small fraction of his day — maybe an annoying one that pulls him from the dynamic programming problem he was definitely not working on — for Jane the candidate, it’s one of the most important events of her life. The median job tenure is about 4 years, so people will only have somewhere around 10 jobs their whole professional life. And each of those jobs is going to significantly affect Jane’s life both during, and after, it. The CV builds on the shoulders of the previous job, yes, but work is also where we spend the bulk of our weekday, make friends, and develop a large part of our identity.

So what does Jane go into this high-stakes dance equipped with? If the company is large, maybe she can get a sense of the general culture from Glassdoor, or news articles, or social media. Otherwise, maybe a sense of how it wants to be seen, from its marketing. But even then: what will her daily life be like there? Would she like her new boss? Her teammates? The bureaucracy? The tasks she’ll be working on over the next year? Four years? Unless she knows someone on the inside, these questions mostly won’t get answered until after the start date. Sure, she’ll get to peek in now and then through cracks in the process and through the five minutes of each interview in which she can ask stuff, but: largely unanswered. Which means it all adds up to a big gamble.

The gamble is better for more mercurial and adaptable personalities than those more averse to change, but your resume can only tolerate a couple of quick stints before it starts getting tossed aside. So it’s a big gamble for everyone. Or rather, I should say “for every job searcher”, because well… it’s a small gamble for Larry the Interviewer. Worst case for him? It isn’t even giving his “thumbs up” to a terrible employee — because someone truly terrible will get fired before too long. No, the worst case is that he gives his thumbs up to a really annoying Taylor that ends up on his team and really annoys him for the next 4 years. And also produces mediocre work that Larry then has to constantly deal with. But Taylor isn’t annoying enough and the work just isn’t bad enough to get fired or even be put on a PIP. Taylor kind of coasts just baaaarely on the good side of the policy. And in doing so, makes Larry’s work life feel like a chirping smoke alarm that he can never find. That is the worst case for Larry.

The best case? He ends up with a really awesome coworker. Which, depending on the existing coworkers, may or may not matter a lot. But in any case except for the narrow and unlikely one of Taylor, the stakes for Larry are much, much, much lower than for Jane. And yet, this is who decides Jane’s fate.

The irony is that all of us will — at different times — be the Jane, and most of us will also be the Larry at other times. And when we are the Larry, interviewing a job candidate, do we act as the interviewer we’d like to have interview us, when we’re searching for a job? When I grudgingly leave the house, I usually drive, and sometimes I walk, and sometimes I bike. And what I find fascinating is that when I’m a driver, I get mad at other drivers, when I’m a cyclist I get mad at drivers, and when I’m a pedestrian, I get mad at everyone. Even though I know exactly what challenges they’re all going through.

Through this lens, how should Larry — all of us, when we are the Larry — conduct his interviews? How can we be the Larry we want to see in the world?

Larry does have a concrete output from the interview; he needs to answer one simple question: is Jane likely to be successful in this role? “But wait,” you ask, “can Jane’s ideal Larry even answer that question? Can he both be empathetic and figure out if she should be hired?” To which I say: “not only can he, but that’s the best way to go about it!” Let me illustrate by looking at some questions Larry should not be trying to answer:

  1. Did Jane answer all my questions correctly?
  2. Was Jane quick on her feet and graceful under pressure?
  3. Did Jane pick up on my algorithmic hints?
  4. Was her solution complete?
  5. Do I like Jane, as a human being? Could we be friends?

Those are certainly signals one can get in an interview, but are they relevant to her success in the role? Does her answering all of Larry’s questions correctly mean there’s a good chance she’ll get a high performance rating next year? It depends on the questions, right? Well then, what about the inverse: does getting answers wrong correlate with bad performance reviews? Or does it correlate with nervousness? Or miscommunication? Or ignorance of a concept that Jane could learn in the first week on the job?

The hard truth is that while most interviews test something, that something is more likely to be “ability to pass our interview” than “ability to do the job well”. If the abstract interview is indeed a conscious choice (and more often it is not), then the argument for it goes like this:

We want people that will do what they need to in order to succeed; if they study hard for our arbitrary and irrelevant interview process and pass it, it means they (a) really want to work here, and (b) can do the same for a real-world project

Even for well-known companies that can attract enough people to run their gauntlet, this is of dubious value, because they’re filtering for a specific criterion: studiousness; and simultaneously filtering out many that are at odds with it, starting with people that don’t have the luxury of time to study for their interview.

Instead, I think interview questions should be relevant for the job and tailored to illuminate whether Jane will be successful in it. Does it matter if Jane came up with a coding solution in 10 minutes versus 20? How much time would she have in the course of her job for that problem? Does the job require her to be well-versed in algorithms? Even if Stack Overflow didn’t exist. Is it relevant that she didn’t finish the last part, even though she explained how it would work? Are the people we like more productive than the ones we don’t? Do these rhetorical questions make my point?

Which is that the interview is not a test and it’s certainly not an interrogation. It is above anything else a conversation, ideally between equals, in which both parties are trying to figure out if an employment arrangement would be a win/win scenario. And as the interviewer, by leading it with empathy, you accomplish two things:

  1. Have a much better chance at arriving at the truth
  2. Leave the candidate with a great impression

So how should Larry approach the interview? I like to think about role models, because we humans are great at mimicking, and in this case I think the right archetype is a podcaster. They have a guest on their show and they generally try to make the experience pleasant, to keep the content interesting, and to really get at what makes their guest tick in their particular way. If they have an actor on, they try to figure out what makes them a great actor; if it’s a business tycoon, what makes them great at business; if it’s a scientist, what makes them great at science. And similarly, Larry’s job is to figure out what makes his guest great at programming.

In order to successfully do that, the guest has to first and foremost be comfortable. People won’t let you in unless they’re comfortable. And if they won’t let you in, it is a giant barrier to understanding them. So Larry should spend some time in the beginning of the interview breaking that ice. He should make small talk — genuine small talk, not awkward conversation about the weather. Tell Jane a little about himself: what he does at the company, what he’s passionate about in his work, and what his background is. Things that will give Jane an understanding of him and help them find common ground. It’s not only worth the investment, but it’s what makes the rest of the interview worthwhile.

Once a good rapport has been established — or Larry’s given up on such happening — only then should he go into the topics he needs to cover. And he should always remember to treat Jane as he would a valued guest in his home. To be kind, to be forgiving, and to give her the benefit of the doubt. If she says something that sounds incorrect, he should make sure it’s not due to a simple mistake or misunderstanding. He should ask polite questions that help him understand how much Jane knows and understands about graph traversal or whatever — not merely that she does (or doesn’t) know that those words are the answer to his question. Because in the end, Larry’s job isn’t that of a proctor, to simply grade Jane on her performance as if this were an audition or a midterm exam. Larry’s job is actually much more difficult: it’s to create, in 45 to 60 minutes, a sorely incomplete mental model of Jane from a certain angle — be it programming ability or cultural fit or leadership style or what have you — and to then decide if that mental model is a good match to fill the open role.

Again: this is hard and it takes a lot of practice to do well, just like dancing. But taking shortcuts will just lead to lots of false positives or negatives. Tech companies love to industrialize the process and create complex questions with “objective” answers and rubrics that tally up those answers into a simple pass/fail exam and then also point to that process when talking about fairness. But in truth, there’s nothing fair about standardized tests — which is why higher education is finally moving away from the SATs.

They add a veneer of objectivity on top of an industrialized process that answers not so much what a person understands, but what they’ve been exposed to and what they can quickly recall in a stressful situation. It’s like being tested on “ticking time bomb defusal” for a job making watches. And the opportunities for subjectivity still abound: from how comfortable or nervous the candidate is, to how many hints they’re given, to how much sleep they’ve gotten the night before, to whether they’ve seen a similar question recently, and to the leniency of the proctor.

What sets Oxford and Cambridge apart from most other universities is that they use the Tutorial System, in which students learn the subject matter in whatever way makes sense, meet in very small groups with a tutor and have a discussion — in which it will become readily apparent how well they understand the subject.

It has been argued that the tutorial system has great value as a pedagogic model because it creates learning and assessment opportunities which are highly authentic and difficult to fake.

from Tutorial System at Wikipedia

Software interviews of a similar ilk have the same value.

But the one thing to remember is that regardless of the questions we ask when we’re being the Larry — no matter how good or fair or comprehensive they are — the questions are just a means to an end, and not the end. They are a conversation starter, and it’s up to us to guide that conversation in such a way that will allow us to understand our guest to such a degree that we can answer the only question that matters: “is this person likely to be successful in this role?”

The Best Testers Are Scientists

It doesn’t take long to appreciate a great software tester. And it doesn’t matter if she’s a manual tester or writes automated tests, because what really matters are the types of tests being run: curious tests. Tests that don’t just discover a bug and quickly document it away in a ticket, along with the state of the whole world at the time of discovery. But instead, tests that try to find the exact circumstances in which the bug occurs.

The more defined those circumstances, the more helpful the ticket is to the developer; ideally, it will take their mind right away to the exact function that is responsible for the bug. In those cases, you can almost see the light bulb go off:

Tester: I’ve only seen the bug on the audio configuration screen, and it usually crashes the app after single-clicking the “source” input, but I’ve seen it a couple of times from the “save” button too. And it seems to only happen after a fresh install on Android 10.

Dev: ohhhh! That’s because the way we handle configuration in Android 10 changed and the file the audio source is saved in doesn’t exist anymore!

This is exactly the kind of dev reaction you want to a bug report. It’s an immediate diagnosis of the problem, which was only made possible by a very well-researched and described bug. But notice how that description could, with changes only to the jargon, have been written by an entomologist:

Entomologist: I’ve only seen the bug on a tiny island off the coast of Madagascar, and it’s usually blue with green spots, but I’ve seen a couple of them with yellow spots too. And it seems to only come out right after sunset in the wet season.

Which is kind of obvious when you think about it, because what do scientists do? They test the software that is our reality. Galileo’s gravity experiment is one of the more famous in history (and likely didn’t happen), but what is it, in software terms? He wanted to know if the rules of our universe took weight into account when pulling things toward the Earth. A previous power user, Aristotle, figured that the heavier a thing was, the faster it would fall. But that user failed to actually do any testing. So thank God that talented testers, like John Philoponus and Simon Stevin, came along and figured out that things mostly fall at the same rate through air, and then bothered to update the documentation.

What Aristotle did was assume the software worked in a certain way. Granted that he didn’t have the requirements to reference, but he probably noticed that you have to kick a heavy ball harder to go the same distance as a lighter ball, and he figured that the Earth kicks all things equally hard. That’s the equivalent of our tester above seeing the “source” input work on Android 9 and not bothering to test it on 10. Or seeing that it worked on the video configuration screen and not bothering to test it on the audio one too.

And that’s okay, because Aristotle was not a tester. He was more like a fanboy blogger. But what testers should be is bona fide scientists, like Simon Stevin, who follow the scientific method:

  1. Ask a question
  2. Form a hypothesis
  3. Make a prediction, based on your hypothesis
  4. Run a test
  5. Analyze the results

In our example with the “source” input, after the tester saw it the first time, she probably did something like this:

  1. “why did it crash?”
  2. “maybe it was because I pressed the ‘source’ input”
  3. “if so, that’ll make it crash again”
  4. Relaunched the app, tried it, it crashed again.
  5. “okay, that was definitely the reason”

Aristotle might stop there and file the bug: “app crashes when using the ‘source’ input”. And the developer would try replicating it on their Android 9 phone and kick the ticket back with “couldn’t replicate”, and that whole cycle would be a waste of time. But our tester asked another question:

  1. “does it crash on this other phone?”
  2. “if it doesn’t, it’s a more nuanced bug”
  3. “I think it’ll crash though”
  4. Tried it on the other phone: it didn’t crash
  5. “what’s different about this phone?”

And she continued the scientific process like that, asking more and more pertinent questions, until the environment that our bug exists in was fully described. Which is exactly what you want in a bug report, because anything less will, in aggregate, be a productivity weevil, wasting both developer and tester time with double replication efforts and conjectures about the tester’s environment and back and forths. A clear, complete bug report does wonders for productivity.
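For the programmers in the audience, that loop of "change one thing, run it again" can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the factor names, the `crashes` stand-in, and the `isolate` helper are all hypothetical, invented to mirror the audio configuration example above, and a real tester would be running the actual app rather than a function.

```python
from itertools import product

# Hypothetical factors a tester might vary, taken from the example bug above.
FACTORS = {
    "android_version": [9, 10],
    "screen": ["audio", "video"],
    "install": ["fresh", "upgraded"],
}


def crashes(env):
    """Stand-in for running the app by hand; this rule is an assumption
    that mimics the example bug, not real app behavior."""
    return (
        env["android_version"] == 10
        and env["screen"] == "audio"
        and env["install"] == "fresh"
    )


def isolate(factors, run_test):
    """Try every combination of factors and report, for each factor,
    the values under which a crash was ever observed. Factors whose
    every value crashed are left out, since they don't narrow anything."""
    names = list(factors)
    crashing = []
    for combo in product(*(factors[name] for name in names)):
        env = dict(zip(names, combo))
        if run_test(env):
            crashing.append(env)

    conditions = {}
    for name in names:
        seen = {env[name] for env in crashing}
        if seen and len(seen) < len(factors[name]):
            conditions[name] = sorted(seen, key=str)
    return conditions


if __name__ == "__main__":
    # The resulting dictionary is essentially the body of a good bug report.
    print(isolate(FACTORS, crashes))
```

Trying every combination is the brute-force version, and it only stays practical with a handful of factors. A curious tester prunes that search by forming hypotheses about which factors matter, which is exactly the part that can't be automated away.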

So then, why not just teach all your testers the scientific method? Because it doesn’t work in the real world. We all learn the scientific method, but few of us become scientists. And I imagine that, just like in any profession, a not-insignificant number of scientists aren’t good scientists. Knowing things like the scientific method is necessary, but not sufficient to make a good scientist. You also need creativity, in order to ask the interesting questions, and more importantly, curiosity to keep the process going until its natural conclusion: to uncover the whole plot.

Tangentially, curiosity is a hugely important trait in great developers, too. But for testers, even more so.