After all the hard work you put in to prepare yourself for a Data Science career, you are finally facing a Data Science job interview. Even if you are very confident in your skills, and maybe have a good reference, it’s still a stressful experience.
Data Science interview questions tend to be hands-on, even when they are posed at a high level, so give yourself plenty of time to practice rather than cramming material or memorizing a few questions floating around the internet. Of course, you also need to prepare for certain types of screening questions, such as:
- What is a random forest?
- How do you interpret a p-value?
- Where do you use the decision tree algorithm?
- Which language do you prefer, R or Python, and why?
You may also expect open-ended questions in your screening interview, such as:
- How does Data Science help business in general?
- Why did you choose to become a Data Scientist?
Though these questions are open-ended and could invite lengthy answers, I suggest keeping your response straight to the point, at around a minute long. To manage that, you need to prepare for this kind of question beforehand.
What are interviewers looking for?
From my experience advising big companies on Data Science strategy, I can tell you that many companies, even the big ones, have a limited idea of what a data scientist should know. Some want a Data Scientist with strong business and analysis skills who can drive Data Science projects with a team of machine learning experts, analysts, and developers. Others expect a Data Scientist to have strong knowledge of creating models and to build things on their own, especially when the Data Science team is just a few people.
What kind of questions to expect?
Well, this depends on the kind of job role you have applied for. That being said, the questions in Data Science interviews can be broadly categorized into 5 areas: 1. General 2. Statistics 3. Programming (including R, Python, etc.) 4. Modelling & Problem Solving 5. Behavioural and Cultural Fit
And of course, there are some terrible questions, which generally have no right answer, meant to test your temperament.
General:
These questions usually gauge your high-level knowledge of Data Science.
- How is Data Science different from traditional business intelligence?
- Do you think Data Science can be useful to our organization? How?
- What are the different types of analytics? (e.g., descriptive, predictive, discovery, and prescriptive)
- What timeline would you suggest for transforming our organization into a Data Science-enabled organization? (This question is typically asked of senior Data Scientists.)
Statistics:
These questions are intended to test your knowledge of applied statistics and math.
- What is regression, and where is it applied?
- Explain the binomial probability formula.
- What does the R-squared value mean?
- Explain the central limit theorem with an example.
- What is a Gaussian distribution? Where is it applied?
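When practicing the statistics questions above, it can help to turn the formulas into a few lines of code. The following is a minimal sketch using only the Python standard library; the function names (`binomial_pmf`, `r_squared`, `sample_means`) and the toy inputs are illustrative assumptions, not a prescribed interview answer.

```python
import math
import random
import statistics


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def r_squared(actual, predicted):
    """R-squared = 1 - SS_res / SS_tot: share of variance explained by the predictions."""
    mean_actual = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def sample_means(n_samples, sample_size, seed=0):
    """Means of repeated samples drawn from a uniform (non-normal) distribution on [0, 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    # Probability of exactly 3 heads in 5 fair coin flips.
    print(binomial_pmf(3, 5, 0.5))  # 0.3125

    # Predicting the mean for every point explains none of the variance.
    print(r_squared([1, 2, 3], [2, 2, 2]))  # 0.0

    # Central limit theorem: the sample means cluster around 0.5 with a
    # spread close to sqrt(1/12) / sqrt(sample_size), even though the
    # underlying draws are uniform rather than Gaussian.
    means = sample_means(2000, 50)
    print(statistics.fmean(means), statistics.stdev(means))
```

Being able to explain each line out loud (why the combination term appears, why the spread of the sample means shrinks with sample size) is usually more valuable in an interview than the code itself.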
Programming:
These questions focus on general programming skills, SQL and data retrieval skills, big data, popular applications for data analysis, and languages such as R and Python.
- If you were given a list of numbers, how do you sort them?
- What is sparsity? How do you deal with it?
- Model performance vs. accuracy: which is more important when designing a machine learning model?
- How would you represent 5-dimensional data effectively?
- What is the difference between an outer join, inner join, left/right join, and union?
- Which R packages are you most familiar with, and why?
- What sorting algorithms are available in R, and which one do you prefer?
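The join question in the list above is easier to answer with a concrete example in mind. Here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and their rows are made up for illustration. It shows only inner join, left join, and union, because right and full outer joins are not available in older SQLite versions.

```python
import sqlite3


def run_join_examples():
    """Build two tiny hypothetical tables and compare INNER JOIN, LEFT JOIN, and UNION."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER, name TEXT);
        CREATE TABLE orders (customer_id INTEGER, item TEXT);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (1, 'book'), (1, 'pen'), (4, 'lamp');
        """
    )

    # INNER JOIN keeps only rows that have a match in both tables.
    inner = conn.execute(
        "SELECT c.name, o.item FROM customers c "
        "INNER JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.name, o.item"
    ).fetchall()

    # LEFT JOIN keeps every customer; missing orders come back as NULL (None).
    left = conn.execute(
        "SELECT c.name, o.item FROM customers c "
        "LEFT JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.name, o.item"
    ).fetchall()

    # UNION stacks two result sets vertically and removes duplicates.
    union = conn.execute(
        "SELECT id FROM customers UNION SELECT customer_id FROM orders ORDER BY 1"
    ).fetchall()

    conn.close()
    return {"inner": inner, "left": left, "union": union}


if __name__ == "__main__":
    for name, rows in run_join_examples().items():
        print(name, rows)
```

A right join is the mirror image of the left join (every order is kept, including the one for the unknown customer 4), and a full outer join keeps unmatched rows from both sides.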
Modeling & Problem Solving:
- How do you tell spam emails from ham emails? Which algorithm would you suggest?
- Which would you prefer: 5 days of effort to develop a 90% accurate solution, or 10 days for 100% accuracy? Explain why.
- Where does a general linear model fail? Give a few examples.
- How would you validate a multiple regression model you built to predict a quantitative outcome variable?
- How would you detect bogus reviews or bogus Facebook accounts used for malicious purposes?
- Can you suggest a model to identify plagiarism?
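For the spam versus ham question, one commonly suggested starting point is a naive Bayes classifier over word counts. The sketch below is one possible answer, not the only one: it is a pure-Python multinomial naive Bayes with add-one (Laplace) smoothing, and the class name `NaiveBayes`, the whitespace tokenizer, and the four training messages are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior, then add the log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    # Tiny made-up training set, for illustration only.
    texts = [
        "win free money now",
        "free prize claim now",
        "meeting at noon tomorrow",
        "lunch with the team tomorrow",
    ]
    labels = ["spam", "spam", "ham", "ham"]
    model = NaiveBayes().fit(texts, labels)
    print(model.predict("claim your free money"))  # spam
    print(model.predict("team meeting tomorrow"))  # ham
```

In an interview, it is worth mentioning the trade-offs as well: naive Bayes is fast and easy to explain, but it assumes words are independent, so you would normally compare it against alternatives on a held-out validation set.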
Behavioural and Cultural Fit:
- What do you think makes a good data scientist?
- How did you become interested in data science?
- Tell me about a time you failed and what you learned from it.
- What unique skills do you think you’d bring to the team?
Finally, it all depends on how confident you are, and confidence doesn’t mean knowing the answer to every possible question. You don’t have to have mastered everything the role requires by the time of the interview; you will keep learning and improving your skills on the job. So trust yourself, and show that you have a good foundation and can learn quickly enough to contribute to the role you are being interviewed for. Remember that a failed interview is a great learning opportunity that helps you crack the next one. I wish you the very best.