Posted 20 hours ago

Machine Learning System Design Interview

£16.15 (was £32.30), Clearance
Shared by
ZTS2023

About this deal

Alexey: The typical components of a machine learning system – this is the first part of the question – are things like data pipelines, data preparation, things to calculate features? (49:35)

Full Book Name: System Design Interview – An insider’s guide Volume 1 And Volume 2 By Alex Xu (Set Of 2 Books)

Each ML use case in your organization has been deployed using its own workflow, and you want to lay down the foundation (e.g., model store, feature store, monitoring tools) that can be shared and reused across use cases.

Provide the most relevant and up-to-date ML techniques and domain knowledge for your targeted teams/companies.

Valerii: These interviews are, of course, behavioral, project impact (that makes sense, right?), and two very important things are the system design interview – which is how to design the system overall – and machine learning system design. These interviews are usually conducted for people starting from level five. Of course, at the very beginning nobody knows what level you are – it might be between four and five, so you might end up being level four, which is still common for this interview. (9:36)

The quality and quantity of training data are a big factor in determining how far you can go in your machine learning optimization task. Data collection techniques primarily involve user interactions, human labelers, or specialized labelers.

You also want to bring up technical scaling requirements (don’t make assumptions, it’s key to clarify this out loud).

The standard development cycle of machine learning includes data collection, problem formulation, model creation, implementation of models, and enhancement of models. It is in the company’s best interest throughout the interview to gather as much information as possible about the competence of applicants in these fields.

There are plenty of resources on how to train machine learning models and how to deploy models with different tools. However, there are no common guidelines for approaching machine learning system design from end to end. This was one major reason for designing this course.

These papers show there’s a wide range of ways to create state-of-the-art recommenders; it’s definitely not a cookie-cutter problem. Pay attention to the setups and to how they do offline and online evaluation.

Overview

Alexey: That's quite a lot of information. I was trying to process this. That's quite a lot of things. So this was an example of machine learning system design. The interview starts and then the person – the interviewer – asks you, “Let's design a system for detecting fraud.” And then you probably ask this person a few questions and then you do this information dump on that person, right? (20:33)

K-means. Try to implement K-means from scratch (sample code from flothesof.github.io). Bonus points for a vectorized version in numpy completed in 20 minutes. Follow up with the worst-case time complexity and an improvement for initialization.

I’m a SWE, ML with 10 years of experience. I had offers from Google, LinkedIn, Coupang, Snap and Stitch Fix.

This book is the result of the collective wisdom of many people who have sat on both sides of the table and who have spent a lot of time thinking about the hiring process. It was written with candidates in mind, but hiring managers who saw the early drafts told me that they found it helpful to learn how other companies are hiring, and to rethink their own process.
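For the K-means exercise mentioned above, a vectorized numpy version might look roughly like the sketch below. It is not the sample code from flothesof.github.io; the function name `kmeans` and the parameters `n_iters` and `seed` are assumptions made for illustration, and initialization is plain random sampling of data points.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm on X of shape (n_samples, n_features).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct rows of X.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Squared Euclidean distances via broadcasting, shape (n_samples, k).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans(X, k=2)` returns the centroids and one cluster label per row. Each iteration costs O(n k d) for n points in d dimensions, and the result depends on the starting centroids, which is why a smarter initialization such as k-means++ is a common follow-up.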

Gradient boosted trees: Better performance than logistic regression, can find non-linear interactions, and typically don’t require much tuning.

Obviously there are many more items here. Notice that the concepts are still vague, and would require clarification to actually use in a model. E.g., don’t just leave a feature as ‘history of items liked’; that’s not a numeric value you can train a model with.

Feature Representation

Alexey: [laughs] I might be wrong with using these words. I think the recruiter probably used different words. But the reason for me failing the process – the whole interview – was machine learning system design. Not the others. I was afraid about the others. But in the others, I did well, but I failed that one. And the reason there was because the interviewer expected me to talk about actual machine learning. Instead, we talked about metrics, heuristics, and then I didn't have enough time to actually cover machine learning. Yeah, so what do you think about this? Is this typical for the process? Is it expected? (28:28)

Be familiar with core ML concepts and infra listed above (hopefully you’ve already been preparing for this).

Valerii: Yeah. We run the A/B test there, and what is the metric of interest? Again – you see, this question pops up every time. “What is the metric of interest? What are we actually trying to achieve?” (58:07)

It comes with links to practical resources that explain each aspect in more detail. It also suggests case studies written by machine learning engineers at major tech companies who have deployed machine learning systems to solve real-world problems.

Alexey: Basically, when you interview they do this automatically and probably at this round, they use it to assess which level to put you. (55:30)

Logistic regression. Try to implement logistic regression from scratch (sample code from martinpella). Bonus points for a vectorized version in numpy completed in 20 minutes. Follow up with a MapReduce version.
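As a rough illustration of the logistic regression exercise, here is a minimal vectorized sketch trained with batch gradient descent. It is not the martinpella sample code; the names `fit_logistic_regression` and `predict_proba`, the learning rate, and the iteration count are all assumptions chosen for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    """Batch gradient descent on the mean log loss.

    X has shape (n_samples, n_features); y holds 0/1 labels.
    Hypothetical helper; hyperparameter defaults are assumptions.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)       # predicted P(y = 1) for every row
        error = p - y                # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive class."""
    return sigmoid(X @ w + b)
```

The gradient is a sum of per-example terms, so for the MapReduce follow-up one plausible approach is to have each mapper compute `X.T @ error` on its shard and let the reducer add the partial sums before the weight update.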

When you have nailed down all of your ML system’s requirements, you can proceed to building your model. This involves:

The slides, (very intensive) notes, assignments, and final project instructions will be made publicly

I’m the author of the ML interview GitHub repo and the ML system design course on educative.io and interviewquery.com.

Make sure you bring up how you would launch the system and actually evaluate whether it’s achieving its business objectives. This is almost always via A/B testing, which has lots of its own nuances. Talk about which metrics you’d measure and statistical tests you’d perform for an A/B test. You can go into some depth talking about ramping patterns and issues that arise with A/B testing.

Model Lifecycle Management

Deep Neural Networks for YouTube Recommendations (2016) - How YouTube uses embeddings for candidate generation

Valerii: Yeah, it's a good taxonomy. It's a good taxonomy. It's a good book. If it didn't reveal anything new to me, it doesn't mean it's a bad book. It just means that it's my problem. (53:24)

selection, training, scaling, how to continually monitor and deploy changes to ML systems, as well as

Valerii: Do we need to introduce some weights? Okay, good. What data will we use? Is it the amount of the transaction? Is it just the history of the user? How fast will we update them? Now let's say we have a model. How can we assume that model is better than the previous one? Of course, we have some offline metrics. We have an expected calibration error, weighted expected calibration error, precision – we don't have precision, forget about that. It's a bad metric because it's class-balance sensitive. We have specificity. We have recall. What now? (16:43)
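To make the offline metrics Valerii lists a little more concrete, the sketch below computes one common binary variant of expected calibration error (equal-width bins, comparing mean predicted probability with the observed positive rate), plus recall and specificity. The function names and the default of 10 bins are assumptions; the weighted variant he mentions is not shown.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width binned ECE for 0/1 labels and predicted P(y = 1).

    Illustrative sketch; the binning scheme is an assumption.
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((label, prob))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(label for label, _ in members) / len(members)
        predicted = sum(prob for _, prob in members) / len(members)
        # Weight each bin's calibration gap by its share of the samples.
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


def recall_and_specificity(y_true, y_pred):
    """Recall = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity
```

Recall and specificity are each computed within a single true class, which is why they do not shift with class balance the way precision does.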

Asda Great Deal

Free UK shipping. 15 day free returns.