Software engineers often complain about the endless technical tests they have to overcome in order to get hired.

For programmers who are relatively green, it’s perhaps understandable that they’d be tested heavily, but it’s somewhat galling if you have 10 years’ experience in a domain and are still asked to answer basic programming and algorithms problems. You don’t ask an accountant to solve simple arithmetic problems before you allow them to file your taxes.

Anyway, the reason for all the testing is simple. Most programmers are rubbish at programming, so you have no choice but to test them extensively if you don’t want to inadvertently hire a smooth-talking knucklehead.

Unfortunately, the same is true of data science. Without testing a prospective hire’s ability to do data science, you’ll have no way of knowing whether they are actually able to do the job.

Most likely you’ll optimise for the ability to talk about data science rather than actually do data science, which is fine if all you want is a great communicator, but not so good if you need results.

When I’m interviewing prospective hires, I’m painfully aware that I too can be fooled by someone who uses roughly the right terminology and seems to follow the right kinds of processes.

I always find it interesting to see how candidates respond as I move from generalist “talk me through how you solved that problem”-type questions to more specific technical questions. Often candidates will be able to improvise their way through the generalist questions but come unstuck as soon as they’re asked something very specific.

In general, people are easily fooled by confident, eloquent speakers. It’s easy for a charlatan to make it sound like they know what they’re talking about when they really don’t. On the flip side, a slightly diffident data scientist might turn out to be exactly the right person for the job.

Testing data science candidates

When devising a data science test, you can forget about using brainteasers, personality tests or clever questions the interviewer thought up in order to demonstrate their own intelligence. Doing well in such tests is a poor predictor of on-the-job performance.

The best way to test a prospective hire is to have them perform work that is as close as possible to the actual role. This is often called a work sample test.

The reason for its effectiveness should be obvious: what better way to evaluate a candidate’s suitability for a job than to have them do the job itself?

Coming up with a good work sample test is not easy, however. There are a few things you’ll need to take into account:

  • You’ll need to distill the main elements of the job into a test that is concise enough to be doable within a short time frame.

    We’d suggest focusing on testing things that can’t be learnt quickly on the job.

  • You should devise an objective scoring rubric to grade each candidate.

  • You should standardise your test to ensure every candidate is treated in the exact same way. Ideally you should calibrate your scoring using data scientists you’ve worked with before.

    As a bare minimum, at least one person should successfully complete the test before you give it to a candidate.

  • Good data science tests typically involve a data task that encompasses exploratory analysis, some cleaning/feature engineering and building a model.

    You can also incorporate a system design problem into your task (e.g. “how would you productionise this model to serve predictions to hundreds of users per second?”) if it’s relevant to the job.

  • Try to ensure your test is substantive enough to allow you to gauge the applicant’s code quality. Clean, well-structured code is something many data scientists struggle to achieve.
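In practice, an objective scoring rubric like the one suggested above can be as simple as a weighted checklist. Here’s a minimal Python sketch; the criteria and weights are hypothetical and should be adapted to your role:

```python
# Minimal sketch of an objective work-sample scoring rubric.
# Criteria and weights below are illustrative assumptions, not a standard.

RUBRIC = {
    "exploratory_analysis": 0.25,
    "data_cleaning":        0.25,
    "modelling":            0.30,
    "code_quality":         0.20,
}

def score_candidate(grades: dict[str, int]) -> float:
    """Combine per-criterion grades (0-5) into a weighted overall score."""
    assert set(grades) == set(RUBRIC), "grade every criterion"
    return sum(RUBRIC[c] * g for c, g in grades.items())

# Example: grade one submission on each criterion.
grades = {
    "exploratory_analysis": 4,
    "data_cleaning": 3,
    "modelling": 5,
    "code_quality": 2,
}
print(round(score_candidate(grades), 2))  # prints 3.65
```

Writing the rubric down before anyone sits the test also makes calibration easy: have a data scientist you trust complete the test, grade them with the same rubric, and use their score as your baseline.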

We’d suggest allowing candidates to do the test at home instead of insisting that it be conducted in person. This will save time and money for both parties.

If the candidate is successful at the work sample stage, you can invite them over to your office to present their results.

On the subject of time, try to ensure the test can be completed in a maximum of 6 hours by a good candidate. If possible your test should be a lot shorter than this; 2 to 4 hours would be ideal. Don’t forget that your candidate will most likely have other tests to do as well.

Senior candidates will also often have families, so they’ll be very constrained for time outside normal working hours. (I don’t have a family, but I also have very little free time. Those pastries don’t just eat themselves.)

Should you pay for the candidate to do the test? This is a tough one. It’s not the norm in industry for candidates to get paid to do tests, but I wouldn’t be surprised if this changes given current trends.

Morally I’d say you should pay for a candidate’s time, but I’ve never actually been paid to do a test before. For me, the benefit of securing a good job is worth the small investment of time. I also enjoy working on interesting, challenging tests, so do try to make sure your test isn’t as dull as dishwater.

Unless, of course, the job itself is boring, in which case it’s probably for the best if your candidate knows what to expect.

Encourage candidates to use the same resources that they’d use on the job and keep an open channel for any questions they might have. You wouldn’t refuse to answer questions in a real job situation, so it doesn’t make sense to create arbitrary restrictions during your testing phase.

Make sure each candidate knows how long it will take for their submission to be graded and stick to this timeline. Speaking of which…

Being nice

Try not to annoy prospective hires by demanding things that aren’t really necessary. From a human perspective, I shouldn’t need to explain further the importance of being nice.

But even if you’re a callous employer with a block of ice where your heart is meant to be, it’s still in your interest to be nice. Demand massively outstrips supply for good data scientists – within reason, we don’t have difficulty getting the kinds of packages we want.

Here are a few things that should be no-brainers, but very rarely are:

  • Right from the start, clearly communicate to the candidate your hiring process from CV review through to offer.

    All you need for this is a simple timeline indicating each stage of the process and who the candidate will be talking to. Ensure you stick to your timeline.

  • Realistically HR will get involved with your hiring process at some point. Again, set expectations for candidates so they won’t be surprised if HR throws up a few roadblocks (we’ve all been there!).

  • Don’t use a work sample test to get candidates to do real work for free. The data science community is fairly small and nobody will want to work with you again if you do this.

If your hiring process is very unpleasant, many good candidates will just walk away instead of wasting time and energy with your company. There are plenty of other data science jobs out there.

And if you’re happy with a candidate, make a serious offer as fast as you possibly can.

I can’t emphasise this enough. I don’t want to boast (he lied), but I’ve never been turned down for a job once I’ve spoken to a real person. I also don’t like to waste time, so I simply take the first good offer I receive.

Generally I do this out of a (possibly irrational) fear that I’ll lose the offer if I hang around for too long. I’m pretty sure a lot of data scientists are the same way, so you can save yourself a bunch of time and money just by making hiring decisions more quickly than your competitors.

Don’t copy Google

It’s quite common for companies to cite Google or some other hugely successful tech company as inspiration for their hiring practices. After all, Google is known for having outstanding employees, so surely it would make sense to learn from how they hire?

Unfortunately, what works for Google almost certainly won’t work for you. Because, well, you’re not Google.

All hiring processes are fundamentally a tradeoff between false positives and false negatives. A false positive is an accidental hire of a bad candidate. A false negative is a rejection of a candidate you should have hired.

A super-high interviewing bar means that you’ll have very few false positives but also a lot of false negatives. You’ll reject a lot of great candidates, but the few people you do hire will likely be excellent.

On the other hand, a low interviewing bar means that you’ll accept a lot of dodgy candidates… and then you’ll either have to put up with incompetent employees or fire people regularly. Also not good.

Google-style interviews are intended to minimise false positives: they really, really don’t want to inadvertently hire a dud candidate. For this reason, they have a very low acceptance rate and reject a lot of people they “should” have hired.

So why not do the same as Google? There are two reasons for this:

  • Everyone wants to work at Google. They have a huge pool of candidates to draw from so even if they reject 99% of them, they’ll still be left with a sizeable number of people.
  • Google pays for the privilege of being so selective. Do you pay as well as Google? Is your company as prestigious as Google? If not, perhaps you should reconsider being as selective as Google.

Of course, this doesn’t mean you should be OK with false positives and accept mediocre candidates. It just means you can’t afford to have an interview process where you reject huge numbers of qualified people. Your hiring process needs to minimise both false positives and false negatives.

Yes, that’s right. You need to be better at hiring than Google. It’s actually not that hard, because Google is optimising for something different from you.
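The pool-size point can be made concrete with some back-of-the-envelope arithmetic. All figures below are made up for illustration, not real hiring data:

```python
# Back-of-the-envelope sketch of the selectivity tradeoff.
# Pool sizes and acceptance rates are illustrative assumptions.

def expected_hires(pool_size: int, acceptance_rate: float) -> float:
    """Expected number of hires from a candidate pool at a given acceptance rate."""
    return pool_size * acceptance_rate

# A Google-scale pool can absorb a 99% rejection rate...
print(expected_hires(10_000, 0.01))  # ~100 hires

# ...but the same bar applied to a typical applicant pool leaves you empty-handed.
print(expected_hires(50, 0.01))      # ~0.5 hires: you'll probably hire no one
```

The same acceptance rate produces radically different outcomes depending on how many people apply, which is exactly why copying Google’s bar without Google’s applicant pool fails.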

Writing a job description

A good job description for a data science role is not much different from one for any other technical role. Things you’ll need to cover:

  • A description of your company and team/department
  • An overview of responsibilities for the role
  • Hard requirements for the applicant
  • Nice-to-haves
  • Salary, benefits and other perks

Employers have the unfortunate tendency to list far more skills than they need as “required” for data science roles.

Almost all the jobs I’ve worked in over the last three years specified deep learning as a hard requirement. For the most part, the subject never came up again during either the interviews or the jobs themselves. (Interestingly, few job descriptions ever mention “data munging skills” as a hard requirement…)

This is so common that I view it as a minor annoyance rather than a dealbreaker, but it doesn’t give a particularly good first impression of a company.

And not every applicant will realise that your “hard requirements” aren’t actually requirements. You’ll scare off both good and bad candidates by writing an inaccurate job description, so just honestly state what the job involves and don’t exaggerate.

If you really do think writing JavaScript might be part of a candidate’s responsibilities in the future, put this down as a nice-to-have and use it as a tiebreaker in an “all other things are equal” situation.

It’s really nice if you say what a “day in the life of a data scientist” would be like at your company. An experienced candidate should ask about this anyway, of course; there’s very little consistency between data science operations in different companies.

Without this knowledge, accepting a new role can feel a bit like walking across a motorway with your eyes closed in the hope of finding a pastry on the other side.


Oh, remember when we said we’re putting together a free work sample test and scoring rubric just for you?

We’ll also create a sample job description for you to use as a template. Aren’t we nice?

Subscribe below to our newsletter if you’d like us to send you these materials when they’re ready. We hate spam so we’ll only send stuff that we ourselves would be happy to read.

As always, you’re welcome to customise our materials as much as you like.

(In the second and final part of this series, I’ll say more about the specific skills you should test for when hiring a data scientist.

Want to discuss your own data problems? Book a free consultation with us – we’re here to help!)
