Machine Learning Fundamentals
A Few Definitions of Machine Learning
1. Machine learning is the science and art of programming computers so they can learn from data. In other words, it’s the science and art of getting computers to learn without being explicitly programmed.
The following definition is slightly more general:
2. Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959)
Your email spam filter is an example of a machine learning program that can learn to flag spam. The examples that the system uses to learn are called the training set. Each training example is called a training instance (or sample). The part of a machine learning system that learns and makes predictions is called a model.
Two good examples of models are neural networks and random forests. We can train machines to learn by themselves, and machine learning can help humans to learn.
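To make the terms training set, training instance, and model concrete, here is a minimal sketch of a spam filter in plain Python. It is an illustration only: the four messages are made up, and the word-counting approach (with the hypothetical helpers train_spam_model and predict) is a deliberately simple stand-in for the neural networks or random forests a real filter might use.

```python
from collections import Counter


def train_spam_model(training_set):
    """Learn from examples: count how often each word appears per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    for text, label in training_set:
        word_counts[label].update(text.lower().split())
    return word_counts


def predict(model, text):
    """Flag a message as spam if its words were seen more often in spam."""
    words = text.lower().split()
    spam_score = sum(model["spam"][w] for w in words)
    ham_score = sum(model["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"


# Hypothetical training set: each (text, label) pair is one training instance.
training_set = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train_spam_model(training_set)
print(predict(model, "claim your free prize"))
print(predict(model, "team meeting on monday"))
```

Nothing in the code lists spam words by hand. The model picks them up from the training set, so retraining on new examples is enough to update its behavior.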
Data Mining
Data mining is the process of digging into large amounts of data to discover hidden patterns. In other words, it is the process of extracting valuable insights, patterns, and knowledge from large sets of data. It involves using statistical and machine learning techniques to identify patterns, trends, and relationships within the data that are not immediately apparent.
The ultimate goal of data mining is to extract useful and actionable information from large and complex datasets, which can then be used to make better decisions, improve processes, and gain a competitive advantage. Data mining is used in a variety of fields, including business, healthcare, finance, and science, to name a few.
So, we can use data mining to transform raw data into actionable insights that can be used to make informed decisions. This is achieved by applying various data analysis techniques such as clustering, classification, regression analysis, association rule mining, and anomaly detection, among others.
Data mining can also be used to solve a wide range of problems in different fields such as finance, healthcare, marketing, and education. For example, it can be used to identify fraudulent financial transactions, predict customer churn, detect disease outbreaks, and improve student performance.
The data mining process typically involves the following steps:
- Data collection: This involves gathering data from various sources such as databases, websites, and social media platforms.
- Data preprocessing: This involves cleaning, transforming, and reducing the data to make it suitable for analysis.
- Data exploration: This involves analyzing the data to identify patterns and relationships.
- Model building: This involves developing models using statistical and machine learning techniques to predict outcomes or classify data.
- Model evaluation: This involves testing the model to ensure its accuracy and reliability.
- Deployment: This involves deploying the model to make predictions or classify new data.
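The steps above can be sketched end to end in a few lines of plain Python. This is only an illustration under stated assumptions: the transaction records are invented, and the "model" is a single threshold on the transaction amount, far simpler than anything used in practice.

```python
from statistics import mean

# 1. Data collection: hypothetical records of (transaction amount, is_fraud).
raw_records = [
    (12.0, False), (15.5, False), (None, False), (980.0, True),
    (22.0, False), (1150.0, True), (18.0, False), (870.0, True),
    (30.0, False), (1020.0, True),
]

# 2. Data preprocessing: drop records with a missing amount.
records = [(amount, label) for amount, label in raw_records if amount is not None]

# Hold out the last three records for evaluation; train on the rest.
train, test = records[:-3], records[-3:]

# 3. Data exploration: compare the average amount of each class.
fraud_mean = mean(a for a, is_fraud in train if is_fraud)
normal_mean = mean(a for a, is_fraud in train if not is_fraud)

# 4. Model building: a one-parameter model, a threshold halfway between the means.
threshold = (fraud_mean + normal_mean) / 2


def predict(amount):
    """Classify a transaction as fraudulent if its amount exceeds the threshold."""
    return amount > threshold


# 5. Model evaluation: fraction of held-out records classified correctly.
accuracy = sum(predict(a) == label for a, label in test) / len(test)

# 6. Deployment: apply the model to new, unseen data.
print(f"threshold={threshold:.2f}, accuracy={accuracy:.2f}")
print(predict(640.0))
```

Evaluating on records that were held out of training is the important design choice here: accuracy measured on the training data itself would say little about how the model behaves on new data.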
In summary, data mining is an important tool for extracting insights from large volumes of data, and it has a wide range of applications in various industries. It involves using statistical and machine learning techniques to analyze data and discover hidden patterns, trends, and relationships.
The following are the kinds of problems that machine learning is good at solving:
- Problems for which existing solutions require a lot of fine-tuning or long lists of rules (a machine learning model can often simplify code and perform better than the traditional approach).
- Complex problems for which using a traditional approach yields no good solution (the best machine learning techniques can perhaps find a solution).
- Fluctuating environments (a machine learning system can easily be retrained on new data, always keeping it up to date).
- Getting insights about complex problems and large amounts of data.
Some Applications of Machine Learning
Let’s look at some concrete examples of machine learning tasks, along with the techniques that can tackle them:
- Analyzing product images on a production line to automatically classify them
This is image classification, typically performed using convolutional neural networks (CNNs) or sometimes transformers.
- Brain tumor detection using scans
This is semantic image segmentation, where each pixel in the image is classified (as we want to determine the exact location and shape of tumors), typically using CNNs or transformers.
- Automatic classification of news articles
This is natural language processing (NLP), and more specifically text classification, which can be tackled using recurrent neural networks (RNNs) and CNNs, but transformers work even better.
- Automatic flagging of offensive comments on discussion forums
This is also text classification, using the same NLP tools.
- Automatically summarizing long documents
This is a branch of NLP called text summarization, again using the same tools.
- Creation of a personal assistant or chatbot
This involves many NLP components, including natural language understanding (NLU) and question-answering modules.
- Forecasting a company’s revenue next year, based on many performance metrics
This is a regression task (i.e., predicting values) that may be tackled using any regression model, such as a linear regression or polynomial regression model, a regression support vector machine, a regression random forest, or an artificial neural network. If you want to take into account sequences of past performance metrics, you may want to use RNNs, CNNs, or transformers.
- Making an app react to voice commands
This is speech recognition, which requires processing audio samples: since they are long and complex sequences, they are typically processed using RNNs, CNNs, or transformers.
- Detection of credit card fraud
This is anomaly detection, which can be tackled using isolation forests, Gaussian mixture models, or autoencoders.
- Segmentation of clients based on their purchases so that a different marketing strategy can be designed for each segment
This is called clustering, and it can be achieved using k-means, DBSCAN, and more.
- Representation of a complex, high-dimensional dataset in a clear and insightful diagram
This is called data visualization, which often involves dimensionality reduction techniques.
- Recommendation of a product that a client may be interested in, based on past purchases
This is a recommender system. One approach is to feed past purchases (and other information about the client) to an artificial neural network, and make it output the most likely next purchase. This neural net would typically be trained on past sequences of purchases across all clients.
- Building an intelligent bot for a game
Reinforcement learning (RL) is often used to tackle this. RL is a branch of machine learning that trains agents (such as bots) to pick the actions that will maximize their rewards over time. For example, a bot may get a reward every time the player loses some life points within a given environment (such as the game). The famous AlphaGo program that beat the world champion at the game of Go was built using RL.
This list is very long, but hopefully it gives you an idea of the incredible breadth and complexity of the tasks that machine learning can tackle, and the types of techniques that you would use for each task.
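As one concrete instance from the list above, the revenue forecasting task can be sketched with the simplest regression model, a straight line fitted by ordinary least squares. The revenue figures are hypothetical, a single input (the year index) stands in for the "many performance metrics", and fit_line is an illustrative helper rather than a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


years = [1, 2, 3, 4, 5]
revenue = [10.0, 12.1, 13.9, 16.2, 17.8]  # hypothetical values, in millions

slope, intercept = fit_line(years, revenue)
forecast = slope * 6 + intercept  # predicted revenue for year 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

The same fit-then-predict pattern carries over to the more powerful regression models named in the list. What changes is the family of functions being fitted.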
Types of Machine Learning Systems
- Supervised Learning: In this type of machine learning, the system is trained on labeled data: the training set you feed to the algorithm includes the desired solutions, called labels. The algorithm learns to identify patterns in the input data and to map those patterns to the correct output labels. Supervised learning is used in many applications, such as image classification, speech recognition, and sentiment analysis.
- Unsupervised Learning: In this type of machine learning, the system is trained on unlabeled data, which means that the data is not pre-categorized or classified. The system tries to learn without a teacher. The algorithm identifies patterns in the input data and groups similar data points together based on similarities in the features or characteristics of the data. Unsupervised learning is used in applications such as clustering, anomaly detection, and dimensionality reduction.
- Semi-supervised and Self-supervised Learning: These are related approaches for situations where labels are scarce. Semi-supervised learning trains on a mix of a few labeled and many unlabeled examples, while self-supervised learning generates its own labels from unlabeled data.
- Reinforcement Learning: In this type of machine learning, the system learns by interacting with its environment and receiving feedback in the form of rewards or penalties. The algorithm learns to take actions that maximize the rewards it receives over time. Reinforcement learning is used in applications such as game playing, robotics, and autonomous vehicles.
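The reward-driven loop of reinforcement learning can be illustrated with a minimal sketch of an epsilon-greedy agent. Everything here is an assumption made for the example: the environment is a toy one in which each of three actions always pays a fixed reward, and run_bandit, epsilon, and the reward values are hypothetical names and numbers.

```python
import random


def run_bandit(arm_rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    n_actions = len(arm_rewards)
    estimates = [0.0] * n_actions  # estimated reward of each action
    pulls = [0] * n_actions        # how often each action was tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = arm_rewards[action]  # feedback from the environment
        pulls[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / pulls[action]
        total_reward += reward
    return estimates, total_reward


estimates, total_reward = run_bandit([0.2, 0.5, 0.9])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(f"best action: {best_action}, total reward: {total_reward:.1f}")
```

The agent is never told which action is best. It discovers this from the rewards it receives, and the occasional random exploration is what keeps it from settling on the first action it happens to try.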
Differences Between Machine Learning, Deep Learning and Artificial Intelligence
Machine learning (ML), deep learning (DL), and artificial intelligence (AI) are three related but distinct fields of computer science. Here are the key differences between them:
- Definition: AI is a broad field that encompasses all aspects of creating intelligent machines that can perform tasks that typically require human intelligence, such as understanding natural language or recognizing objects in images. ML is a subset of AI that involves training models to make predictions or decisions based on data, without being explicitly programmed to do so. DL is a subset of ML that uses artificial neural networks to learn from data, enabling machines to perform tasks such as image and speech recognition.
- Scope: AI encompasses a broad range of technologies, including expert systems, robotics, natural language processing, and machine learning. ML focuses specifically on teaching machines to learn from data, while DL is a specific type of ML that uses deep neural networks to learn from large datasets.
- Data requirements: ML models need enough data to train effectively, and DL models typically need far more data than traditional ML models. Other AI systems may or may not require large amounts of data, depending on the application.
- Training methods: ML and DL differ mainly in how features are obtained. Traditional ML algorithms typically rely on human input to engineer the features and, in supervised settings, to label the data and specify the outcome that the model is trying to predict. DL algorithms, on the other hand, learn useful feature representations directly from raw data, and they can be trained with supervised, unsupervised, or reinforcement learning.
- Applications: AI has a wide range of applications in different industries, including healthcare, finance, and transportation. ML is used for applications such as predictive modeling, recommendation systems, and fraud detection. DL is used for tasks such as image and speech recognition, natural language processing, and autonomous vehicles.
- Complexity: DL algorithms are typically more complex than traditional ML algorithms and require more computing power to train. AI systems can be even more complex, requiring multiple technologies and algorithms to work together.
In summary, AI, ML, and DL are three related but distinct fields of computer science, with different goals, methods, and applications. While AI encompasses all aspects of creating intelligent machines, ML focuses specifically on teaching machines to learn from data, and DL is a specific type of ML that uses deep neural networks to learn from large datasets.
As we progress in this class, you’ll practice implementing machine learning algorithms yourself. You’ll learn about the most important machine learning algorithms, some of which are exactly what is being used in large AI or tech companies today, and you’ll get a sense of the state of the art in AI. Beyond learning the algorithms, though, you’ll also learn the important practical tips and tricks for making them perform well. You will get to implement them and see how they work for yourself. Why is machine learning so widely used today?
ML Has Grown up as a Sub-field of Artificial Intelligence
We wanted to build intelligent machines. It turns out that there are a few basic things that we could program a machine to do, such as how to find the shortest path from point a to point b, like in your GPS. But for the most part, we just did not know how to write an explicit program to do many of the more interesting things, such as perform web search, recognize human speech, diagnose diseases from X-rays or build a self-driving car.
The only way we knew how to do these things was to have a machine learn to do it by itself. I have worked on problems like speech recognition, computer vision for Google Maps Street View images, and advertising. While leading AI at Baidu, I worked on everything from AI for augmented reality to combating payment fraud to leading a self-driving car team.
Most recently, I’m beginning to work on AI applications in the factory, large-scale agriculture, health care, e-commerce, and other problems. Today, there are hundreds of thousands, perhaps millions of people working on machine learning applications who could tell you similar stories about their work with machine learning.
When you’ve learned these skills, I hope that you too will find it great fun to dabble in exciting different applications and maybe even different industries. In fact, I find it hard to think of any industry that machine learning is unlikely to touch in a significant way now or in the near future. Looking even further into the future, many people, including me, are excited about the AI dream of someday building machines as intelligent as you or me. This is sometimes called Artificial General Intelligence or AGI.
Artificial General Intelligence (AGI)
I think AGI has been overhyped and we’re still a long way away from that goal. I don’t know whether it will take 50 years or 500 years or longer to get there. But most AI researchers believe that the best way to get closer to that goal is by using learning algorithms, maybe ones that take some inspiration from how the human brain works. You’ll also hear a little more about this quest for AGI later in this course.
According to a study by McKinsey, AI and machine learning are estimated to create an additional 13 trillion US dollars of value annually by the year 2030. Even though machine learning is already creating tremendous amounts of value in the software industry, I think there could be even greater value that has yet to be created outside the software industry in sectors such as retail, travel, transportation, automotive, materials manufacturing, and so on.
Because of the massive untapped opportunities across so many different sectors, today there is a vast unfulfilled demand for this skill set. That’s why this is such a great time to be learning about machine learning. If you find machine learning applications exciting, I hope you stick with me through this class.
I can almost guarantee that you’ll find mastering these skills worthwhile. In the next lesson, we’ll begin to talk about the main types of machine learning problems and algorithms. You’ll pick up some of the main machine learning terminology and start to get a sense of what the different algorithms are and when each one might be appropriate.
Supervised versus Unsupervised Machine Learning
Supervised learning is the type of machine learning that is used most in real-world applications and that has seen the most rapid advancements and innovation. By far, the most used types of learning algorithms today are supervised learning, unsupervised learning, and recommender systems.
The other thing we’re going to spend a lot of time on in this specialization is practical advice for applying learning algorithms. This is something I feel pretty strongly about. Teaching about learning algorithms is like giving someone a set of tools, and equally important, or even more important than having great tools, is knowing how to apply them.
In machine learning, making sure you have the tools is really important and so is making sure that you know how to apply the tools of machine learning effectively. That’s what you get in this course, the tools as well as the skills to apply them effectively.
I regularly visit friends and teams in some of the top tech companies, and even today I see experienced machine learning teams apply machine learning algorithms to some problems, and sometimes they’ve been going at it for six months without much success. When I look at what they’re doing, I sometimes feel like I could have told them six months ago that the current approach won’t work and there’s a different way of using these tools that will give them a much better chance of success.
In this class, one of the relatively unique things you’ll learn is the best practices for how to actually develop a practical, valuable machine learning system. This way, you’re less likely to end up in one of those teams that lose six months going in the wrong direction.
In this class, you’ll gain a sense of how the most skilled machine learning engineers build systems. I hope you finish this class as one of those very rare people in today’s world who know how to design and build serious machine learning systems. That’s machine learning.
To sum up, supervised and unsupervised machine learning are two different approaches to machine learning that have distinct characteristics and use cases. Here are some of the key differences between them:
- Definition: Supervised learning involves training a model on a labeled dataset, where the input data is paired with known output values, whereas unsupervised learning involves training a model on an unlabeled dataset, where there are no predefined output values.
- Goal: The goal of supervised learning is to predict the correct output for new input data, whereas the goal of unsupervised learning is to identify patterns or relationships within the data without any predefined outputs.
- Input data: Supervised learning requires labeled input data, while unsupervised learning works with unlabeled data (if labels are present, they are simply not used during training).
- Output data: Supervised learning produces a set of predicted output values for each input data point, while unsupervised learning produces a set of patterns or clusters that describe the underlying structure of the data.
- Training process: Supervised learning involves a training process where the model is presented with labeled data and adjusts its parameters to minimize the difference between the predicted output and the actual output. Unsupervised learning involves a training process where the model discovers patterns or clusters within the data without any external feedback.
- Examples: Supervised learning is often used in applications such as image classification, speech recognition, and natural language processing, where there is a clear input-output relationship. Unsupervised learning is often used in applications such as anomaly detection, customer segmentation, and recommendation systems, where there is no predefined output.
- Evaluation: Supervised learning models can be evaluated based on their ability to accurately predict the output for new input data. Unsupervised learning models can be evaluated based on the quality and usefulness of the patterns or clusters they discover.
Overall, the main difference between supervised and unsupervised learning is that supervised learning requires input data paired with output labels, while unsupervised learning works with only input data and seeks to discover underlying patterns or relationships within the data.
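The contrast can be made concrete by running both approaches on the same data. The sketch below is an illustration under assumptions: the six one-dimensional points and their labels are made up, the supervised model is a simple nearest-centroid classifier, and the unsupervised model is k-means with two clusters and a deterministic min/max initialization chosen for the example.

```python
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]  # hypothetical 1-D feature values
labels = ["small", "small", "small", "large", "large", "large"]


# Supervised: the labels are used to compute one centroid per class.
def fit_supervised(xs, ys):
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Unsupervised: k-means with k=2 groups the same points without any labels.
def fit_kmeans_2(xs, iterations=10):
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


centroids = fit_supervised(points, labels)
print(predict(centroids, 2.5))  # a predicted label for a new input

centers, clusters = fit_kmeans_2(points)
print(centers, clusters)  # discovered groups, with no names attached
```

The supervised model returns a named label for a new input, while k-means only returns groups. Deciding what each group means is left to whoever inspects the clusters.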
Career Opportunities in Machine Learning and Artificial Intelligence
There are many career opportunities in machine learning and artificial intelligence (AI), given the growing demand for these technologies across various industries. Here are some of the most common job roles in this field:
- Machine learning engineer: Responsible for developing and implementing machine learning models that can be used in various applications.
- Data scientist: Collects, analyzes, and interprets large and complex data sets to identify trends and patterns using machine learning techniques.
- AI researcher: Conducts research to develop new algorithms and models to improve the accuracy and efficiency of AI systems.
- AI programmer: Develops software applications that utilize AI technologies to solve real-world problems.
- AI consultant: Provides advice and guidance to organizations on how to implement and utilize AI technologies to achieve their business goals.
- Robotics engineer: Designs and builds robots that can perform various tasks autonomously, using AI and machine learning techniques.
- Natural language processing (NLP) engineer: Develops systems that can analyze and understand human language, which is essential for building chatbots, voice assistants, and other language-based applications.
- Computer vision engineer: Develops systems that can interpret and understand visual data, which is critical for applications such as autonomous vehicles and facial recognition.
- AI ethics specialist: Responsible for ensuring that AI systems are developed and used ethically, taking into account their impact on society and individuals.
Overall, machine learning and AI offer a wide range of career opportunities for individuals with diverse skill sets, including programming, data analysis, mathematics, and ethics.
In the next lesson, we’ll look more deeply at what supervised learning is and also what unsupervised learning is. In addition, you’ll learn when you might want to use each of them. I’ll see you in the next lesson.

