- 1 How do you scale with ML?
- 2 How do you test your ML models for production scale?
- 3 How do you scale data?
- 4 What is machine learning at scale?
- 5 How do ML models train?
- 6 Why do we need to scale the data?
- 7 Which is the correct way to describe ML problems?
- 8 When to use ML to identify anomalies?
- 9 Can an ML model beat a heuristic?
- 10 Which is the hardest problem to solve with ML?
How do you scale with ML?
Feature scaling is a technique that rescales the independent features in the data to a fixed range. It is performed during data preprocessing to handle features whose magnitudes, values, or units vary widely.
How do you test your ML models for production scale?
Model testing in four steps:
- Local development. Model development often starts with a hypothesis.
- Testing in CI/CD. The second step I recommend is to run model tests as part of CI/CD.
- Stage testing / Shadow testing.
- A/B test.
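As an illustration of the CI/CD step above, here is a minimal sketch of an automated quality gate. Everything in it is an assumption made for the example: `predict` is a hypothetical stand-in for a trained model, `HOLDOUT` is a made-up labeled set, and the 0.8 threshold is arbitrary.

```python
# Hypothetical stand-in for a trained model: a fixed threshold classifier.
def predict(x):
    return 1 if x >= 0.5 else 0


# Made-up labeled holdout examples as (feature, expected_label) pairs.
HOLDOUT = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.45, 1)]


def accuracy(model, examples):
    """Return the fraction of examples the model labels correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


def test_model_meets_accuracy_threshold():
    # The build fails if the model falls below the agreed quality bar.
    assert accuracy(predict, HOLDOUT) >= 0.8
```

A test runner such as pytest would collect a function named like this automatically; the same check could also be called directly from a build script.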
How do you scale data?
Good practice with MinMaxScaler and other scaling techniques is as follows:
- Fit the scaler using available training data. For normalization, this means the training data will be used to estimate the minimum and maximum observable values.
- Apply the scaler to the training data.
- Apply the scaler to data going forward.
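The three steps above can be sketched in plain Python. `SimpleMinMaxScaler` is a hypothetical class written for this example, not a library API; scikit-learn's `MinMaxScaler` follows the same fit-then-transform pattern. The sample values are made up.

```python
class SimpleMinMaxScaler:
    """Hypothetical min-max scaler for illustration; not a library class."""

    def fit(self, rows):
        # Estimate per-feature minimum and maximum from the training data only.
        columns = list(zip(*rows))
        self.mins = [min(col) for col in columns]
        self.maxs = [max(col) for col in columns]
        return self

    def transform(self, rows):
        # Rescale each value with the stored training minimum and maximum.
        # A constant feature (max == min) is mapped to 0.0 to avoid dividing by zero.
        return [
            [
                (value - lo) / (hi - lo) if hi != lo else 0.0
                for value, lo, hi in zip(row, self.mins, self.maxs)
            ]
            for row in rows
        ]


train = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
new = [[4.0, 500.0]]

scaler = SimpleMinMaxScaler().fit(train)   # step 1: fit on training data
train_scaled = scaler.transform(train)     # step 2: apply to training data
new_scaled = scaler.transform(new)         # step 3: apply to data going forward
```

Because the minimum and maximum come from the training data alone, later values can land outside the 0 to 1 range, as the first feature of `new` does here.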
What is machine learning at scale?
This course goes beyond the collect-and-analyze phase of big data. It focuses on how machine learning algorithms can be rewritten and extended to work on petabytes of structured and unstructured data, producing sophisticated models used for real-time predictions.
How do ML models train?
How to develop a machine learning model from scratch:
- Define the problem adequately (objective, desired outputs…).
- Gather data.
- Choose a measure of success.
- Set an evaluation protocol, choosing among the different protocols available.
- Prepare the data (dealing with missing values, with categorical values…).
- Split the data correctly.
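The last step in the list can be sketched as a shuffled train/validation/test split. This is one possible approach, not the only correct one: the function name `split_dataset`, the 15% fractions, and the fixed seed are assumptions for illustration, and time-ordered or grouped data would call for a different strategy.

```python
import random


def split_dataset(examples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle a copy of the examples and cut it into train, validation, and test parts."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


examples = list(range(100))  # placeholder data
train, val, test = split_dataset(examples)
```

Each example ends up in exactly one of the three parts, so the test set stays untouched until the final evaluation.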
Why do we need to scale the data?
Feature scaling is essential for machine learning algorithms that calculate distances between data. Therefore, the range of all features should be normalized so that each feature contributes approximately proportionately to the final distance.
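A small sketch of the effect on Euclidean distance follows. The three points and the feature meanings (age in years, income in dollars) are made up for the example, and `min_max_scale` is a hypothetical helper.

```python
import math

# Made-up points as (age in years, income in dollars).
a = (25.0, 50000.0)
b = (60.0, 51000.0)   # very different age, similar income
c = (26.0, 58000.0)   # similar age, different income

# On raw values the income column dominates the distance.
raw_ab = math.dist(a, b)
raw_ac = math.dist(a, c)


def min_max_scale(points):
    """Rescale every feature to the 0-1 range using the min and max of these points."""
    columns = list(zip(*points))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(p, mins, maxs))
        for p in points
    ]


sa, sb, sc = min_max_scale([a, b, c])

# After scaling, both features contribute on a comparable footing.
scaled_ab = math.dist(sa, sb)
scaled_ac = math.dist(sa, sc)
```

On the raw values `b` looks much closer to `a` than `c` does, purely because income is measured in larger numbers; after scaling the two distances are nearly equal.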
Which is the correct way to describe ML problems?
It is more accurate to describe ML problems as falling along a spectrum of supervision between supervised and unsupervised learning. For simplicity, this course focuses on the two extremes of this spectrum.
When to use ML to identify anomalies?
One alternative approach is to label some items before you cluster, and then try to propagate those labels across the entire cluster. For instance, if all items with label X end up in one cluster, maybe you can spread label X to other examples. Sometimes, people want to use ML to identify anomalies.
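The label-propagation idea above can be sketched as a majority vote within each cluster. This is a minimal illustration under stated assumptions: `propagate_labels` is a hypothetical helper, cluster assignments are taken as already computed by some clustering step, and the sample data is made up.

```python
from collections import Counter


def propagate_labels(cluster_ids, known_labels):
    """Spread the majority known label of each cluster to its unlabeled items.

    cluster_ids: list giving the cluster of each item, by item index.
    known_labels: dict mapping item index to a label assigned before clustering.
    Items in clusters with no labeled member stay None.
    """
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(cluster_ids[index], Counter())[label] += 1

    majority = {
        cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()
    }

    # Keep labels that were given by hand; fill the rest from the cluster majority.
    return [
        known_labels.get(index, majority.get(cluster))
        for index, cluster in enumerate(cluster_ids)
    ]


cluster_ids = [0, 0, 0, 1, 1, 2]
known_labels = {0: "X", 1: "X", 3: "Y"}
labels = propagate_labels(cluster_ids, known_labels)
```

In practice it is worth checking how pure each cluster is before trusting the propagated labels, since a cluster with mixed hand labels is a weak basis for spreading any one of them.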
Can an ML model beat a heuristic?
One option is to define a heuristic and use it to label anomalies. However, once you’ve defined this heuristic, you might as well use the heuristic in your production system, since an ML model can’t beat the heuristic used to train it.
Which is the hardest problem to solve with ML?
ML can identify correlations—mutual relationships or connections between two or more things. Determining causation (one event or factor causing another) is much harder. In other words, it is easy to see that something happened, but much harder to understand why it happened.