10 Things You Must Know About Machine Learning

1. In “Machine Learning,” there is “learning,” meaning “learning from data.”
Artificial Intelligence (AI) is just a buzzword. Machine Learning benefits from this buzz: there are an incredible number of problems you can solve by providing the right training data to the right Machine Learning algorithm. Call it Artificial Intelligence if it helps you sell it, but remember that AI is a buzzword that can mean whatever people want it to mean.

2. Machine Learning is about data and algorithms, but mostly about data.
There is a lot of excitement about the latest advances in Machine Learning algorithms, especially Deep Learning. But data is the critical element that makes Machine Learning possible.

3. Unless you have a lot of data, stick to simple models.
Machine Learning trains a model from patterns in your data, exploring the space of possible models defined by the model's parameters. If that parameter space is too large relative to the amount of data you have, you risk overfitting your training data and producing a model that cannot generalize beyond it. A detailed explanation would require more mathematics, but as a general rule, you should keep the model as simple as possible.
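To make the overfitting risk concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a benchmark: the data is synthetic (a noisy straight line), and the "flexible" model is a hypothetical memorizer that copies the label of the nearest training point. All function names are made up for this example.

```python
import random


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor(train_xs, train_ys, x):
    """Flexible 'memorizer': copy the label of the closest training point."""
    i = min(range(len(train_xs)), key=lambda j: abs(train_xs[j] - x))
    return train_ys[i]


def mse(preds, ys):
    """Mean squared error between predictions and true values."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def sample(n, rng):
    """Synthetic data (assumption): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, 2) for x in xs]
    return xs, ys


def compare(seed=0, n_train=30, n_test=500):
    """Return train and test error for the simple and the flexible model."""
    rng = random.Random(seed)
    train_x, train_y = sample(n_train, rng)
    test_x, test_y = sample(n_test, rng)
    slope, intercept = fit_line(train_x, train_y)

    def line(x):
        return slope * x + intercept

    def memorizer(x):
        return nearest_neighbor(train_x, train_y, x)

    return {
        "line_train": mse([line(x) for x in train_x], train_y),
        "line_test": mse([line(x) for x in test_x], test_y),
        "memorizer_train": mse([memorizer(x) for x in train_x], train_y),
        "memorizer_test": mse([memorizer(x) for x in test_x], test_y),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

The memorizer scores a perfect zero error on the training set because it has simply stored the noise, yet it does worse than the plain line on unseen data. That gap between training and test error is the symptom to look for.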

4. Machine Learning will never be better than the data you use to train it.
The expression “Garbage in, garbage out” predates Machine Learning, but it aptly characterizes one of its critical limitations. For supervised Machine Learning, such as classification, you need a training data set that is correctly labeled and rich in features.

5. Machine Learning only works if your training data set is representative.
Just like an investment fund that states in its prospectus, “past performance is no guarantee of future performance,” Machine Learning should warn that it will only work correctly on data with the same kind of distribution as the data used to train it. So watch for any drift between your training data and your production data, and re-train your models often so they keep up with it.
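One simple way to watch for such deviations is to compare the distribution of a feature in training against the same feature in production. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical cumulative distributions) with the standard library only. The threshold of 0.1 and the function names are illustrative assumptions; a real system would tune the threshold per feature and sample size.

```python
import bisect
import random

# Illustrative cutoff (assumption): tune it for your own features and sample sizes.
DRIFT_THRESHOLD = 0.1


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b.

    0.0 means the samples have identical distributions,
    1.0 means they do not overlap at all.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The maximum gap is always reached at one of the observed values.
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def has_drifted(train_values, production_values, threshold=DRIFT_THRESHOLD):
    """Flag a feature whose production distribution moved away from training."""
    return ks_statistic(train_values, production_values) > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    train = [rng.gauss(50, 10) for _ in range(1000)]
    same = [rng.gauss(50, 10) for _ in range(1000)]      # same distribution
    shifted = [rng.gauss(60, 10) for _ in range(1000)]   # mean moved by one std dev
    print("same distribution drifted?", has_drifted(train, same))
    print("shifted distribution drifted?", has_drifted(train, shifted))
```

A check like this only says that the inputs have moved; deciding whether the move is large enough to justify re-training is still a judgment call.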

6. Most of the hard work in Machine Learning lies in data transformation.
Reading all the hype about new Machine Learning techniques, one may get the impression that the art lies in choosing and tuning the right algorithm. The reality is much more prosaic: most of your time and effort will be spent cleaning up the data and enriching it with features (feature engineering). In concrete terms, this means starting with the raw data and deriving features that bring out the meaningful signal in it.
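As a small illustration of what cleaning and feature engineering look like in practice, the sketch below takes raw records and turns them into model-ready features. The record layout (a `timestamp` string and an `amount`) and the chosen features are hypothetical, picked only to show the idea.

```python
import math
from datetime import datetime


def clean(records):
    """Drop records with a missing timestamp or a missing/negative amount."""
    return [
        r for r in records
        if r.get("timestamp") and r.get("amount") is not None and r["amount"] >= 0
    ]


def engineer_features(record):
    """Turn one raw record into features a model can use directly."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        "hour": ts.hour,                             # time-of-day pattern
        "day_of_week": ts.weekday(),                 # Monday = 0, Sunday = 6
        "is_weekend": ts.weekday() >= 5,             # Saturday or Sunday
        "log_amount": math.log1p(record["amount"]),  # compress large values
    }


if __name__ == "__main__":
    raw = [
        {"timestamp": "2024-03-09T23:15:00", "amount": 120.0},
        {"timestamp": "2024-03-11T08:05:00", "amount": None},   # dropped: no amount
        {"timestamp": "2024-03-12T14:30:00", "amount": 15.5},
    ]
    for row in clean(raw):
        print(engineer_features(row))
```

None of this involves a learning algorithm yet, which is the point: the raw timestamp string carries little usable signal until it is broken into parts such as hour and weekday.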

7. Deep Learning is a revolutionary advance, but it is not a miracle solution.
Deep Learning has earned its credentials by delivering advances in many areas of Machine Learning. Moreover, Deep Learning automates some of the work traditionally done by feature engineering (especially for processing images and videos). But Deep Learning is not a magic solution. It does not pull a rabbit out of a hat; you still need to invest significant effort in cleaning and transforming your data.