Stanford CS229: Machine Learning
“Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs; VC theory; large margins); reinforcement learning and adaptive control.”
Did you understand that course description? No? Good. That means you’re in the right place. Machine learning (and new technology in general) is often described in highly technical, complicated language. Yet the fundamental ideas behind machine learning don’t have to be so unapproachable.
Machine learning (ML) is the science of getting computers to act without being explicitly programmed. This branch of computer science affects all of us, whether we work in a technical field or not. A basic knowledge of how ML functions is fundamental to understanding how today’s world works and where the 21st century is headed.
The most publicized use of ML today is in autonomous vehicles. The two most prominent examples are Tesla and Google’s Waymo. These systems use advanced machine learning algorithms to drive the cars. Basically, the vehicles take in visual cues much as our eyes do before making decisions on the road. Self-driving cars are on the cusp of being mainstream. For the next 5 years, they will be a novelty. In 10 years they will be the norm. Within 20 years we will no longer teach our children how to drive. That is the power of machine learning. These algorithms will completely change our transportation.
And that is only one example. ML drives facial recognition. This exists on our favorite social media apps as well as in Apple’s iPhones and iPads with Face ID. And it’s not limited to online. Amazon Go is a cashier-less grocery store concept Amazon has been rolling out for a few years now. The stores use cameras to track who is buying what. In the West, facial recognition is still fairly nascent, but it’s already well established in China, to a dystopian degree.
It’s in healthcare too. In some studies, machine learning has identified certain cancers more accurately than doctors.
And we are only beginning.
Machine learning is being ingrained in everything we do. And it is only going to play a larger role in our lives in the future. So if you want to understand the one thing that will shape your life more than anything else, you need to commit time to understanding machine learning.
photo by Vishal Maini
In the media we often hear about AI and machine learning in the same context. It’s easy to get the idea that they are the same thing. However, this is not true. ML is a subset of artificial intelligence. AI also draws on logic, philosophy, and natural language processing, among other fields. Machine learning is only one part, albeit a very important part.
And to be fair, this can be a bit tricky, because the definition of AI has changed in the past and remains a moving target. Moreover, there are three different levels of AI. It can be a lot to understand, so I won’t go into detail here. If you want to dive into it, this link goes very in-depth.
Machine learning algorithms mostly fall into two categories. There are others, but supervised and unsupervised learning offer a good starting point.
Typically there are two types of problems machine learning answers: problems where we know what the possible answers look like (supervised learning), and problems where we don’t (unsupervised learning).
First, consider supervised learning. Say we want to figure out what price to sell a house at. We don’t know the right price yet, but we do know the answer will be some dollar amount. ML can take into account the neighborhood, square footage, list price, sale price, etc. of thousands of other houses to learn how to price our new house. Those example houses are what is known as training data. Generally, the more training data (and the better it represents houses like ours), the better the algorithm.
Once trained, an algorithm can estimate the best price for our house by comparing its features to houses in the training set. Then it suggests the price. That is supervised learning.
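To make that concrete, here is a minimal sketch in Python of one simple supervised approach, often called nearest neighbors: find the training houses most similar to ours and average their sale prices. The houses, the prices, and the `estimate_price` function are all made up for illustration, and real pricing models use far more data and features.

```python
import math

# Hypothetical training data: (square_feet, bedrooms, sale_price_usd).
TRAINING_HOUSES = [
    (1000, 2, 200_000),
    (1200, 2, 230_000),
    (1500, 3, 300_000),
    (1800, 3, 350_000),
    (2400, 4, 480_000),
    (3000, 5, 600_000),
]


def estimate_price(square_feet, bedrooms, k=3):
    """Average the sale prices of the k most similar training houses."""

    def distance(house):
        # Divide square footage by 1000 (an arbitrary choice for this
        # sketch) so it does not drown out the bedroom count.
        return math.hypot((house[0] - square_feet) / 1000, house[1] - bedrooms)

    nearest = sorted(TRAINING_HOUSES, key=distance)[:k]
    return sum(price for _, _, price in nearest) / k


suggested = estimate_price(1500, 3)
print(f"Suggested price: ${suggested:,.0f}")
```

Because every training house comes with a known sale price, we can check how far off the suggestions are. That known answer is what makes this supervised.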
Unsupervised learning on the other hand is most commonly seen as a grouping problem (the technical term being clustering). For example, say we want to split 1,000 social media influencers up into groups in order to target ads online. We might group them by category, follower count, post engagement, or demographics. Yet, we don’t know ahead of time what the most efficient grouping is for our ad targeting. In other words, there isn’t a predefined answer to that problem.
ML algorithms look at all the facts we provide about these influencers to determine the best way to group them. Oftentimes the resulting groups are counterintuitive. The major concern here is that it can be difficult to know when the algorithm is wrong. It’s not as clear as with estimating the price of a house.
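The grouping idea can be sketched with k-means, one common clustering algorithm. This is a toy version in Python (3.8 or newer, for `math.dist`) under loud assumptions: the influencer numbers are invented, only two features are used, and the starting points are simply the first k influencers so the result is repeatable.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns one cluster index per point."""
    # Assumption for this sketch: start from the first k points so the
    # result is reproducible. Real implementations choose starts more carefully.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical influencers: (followers in thousands, engagement rate in %).
influencers = [(5, 9.0), (900, 1.2), (8, 8.5), (950, 1.0), (6, 9.5), (870, 1.5)]
labels = k_means(influencers, k=2)
print(labels)
```

Notice that we never told the algorithm what the groups should be. It only received the numbers and the count of groups we wanted. Here follower count dominates the distance because it is on a much larger scale than engagement rate, which is one reason results can surprise us and are hard to verify.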
If we use bad data, it doesn’t matter how great an algorithm is: we won’t get a good answer. Consider an ML algorithm trained to find great job applicants. It will discriminate against women if the data it is trained on under-represents women relative to the actual applicant pool, as Amazon found out in 2018. These algorithms aren’t bad; they just learned from bad data. Here is a great TED Talk and book explaining this danger more.
To wrap up, machine learning is becoming ever more ingrained in our lives. You don’t need to be a computer science major to understand the basic principles. And knowing the basics is increasingly important for being computer literate in the 21st century. For further learning, here are more introductory resources on ML.