Paper shows why you will struggle at Machine Learning

This is easily the most technically complex paper I’ve ever read.

Devansh
5 min read · Nov 1, 2021

Join 31K+ AI People keeping in touch with the most important ideas in Machine Learning through my free newsletter over here

I’ve been going through Multiplying Matrices Without Multiplying (link: https://arxiv.org/abs/2106.10860), and it’s a paper I have spent a lot of time on. How can I not? The abstract claims, “Experiments using hundreds of matrices from diverse domains show that it often runs 100× faster than exact matrix products and 10× faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” If you understand machine learning, you will see that this has huge implications for the learning process.

This sentiment is not uncommon. If you feel this way, you are not alone

At the same time, I came across the above tweet on my timeline, and I can definitely see where it is coming from. Meaningful ML is by its nature multi-disciplinary. While the code for an LSTM or a Random Forest stays the same, the context around the problem changes. Depending on what you’re working on, the way you get, prepare, clean, and evaluate your data changes. So you end up needing to become proficient at multiple things. This process involves a lot of Googling and can be very frustrating and disheartening.

The paper is a rather extreme example of that. I double major in Math and Computer Science, and I selected my courses to get good at coding, and at AI/ML in particular. So I’m well suited to understanding the details. But even after a month, a lot of this paper is still very challenging.

Me trying to understand the paper.

In this article, I will use the paper as an example of why good Machine Learning is difficult. I will explain why that’s a good thing for you, and what you can do to benefit from this. If nothing else, I hope that by the end of this article you understand what it takes to get to a high level at ML.

Understanding the Implications of this paper

A quick word on why this paper is great. In machine learning, data is represented as vectors, matrices, and higher-dimensional arrays. Multiplying matrices is at the heart of a lot of what models do. It is also notoriously expensive to compute at scale. For those interested, this article by Quanta is a pretty good explainer.

Don’t underestimate pre-processing.

This is where the paper gets insane. “In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” When might we see such cases? Imagine our model has already learned its weights and just needs to compute predictions from new inputs. The weights form a matrix we already know, and it gets multiplied with the input matrix. Given how often this happens, the savings really add up.

This is one example of a great application of matrix multiplication.
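To make the “one matrix is known ahead of time” case concrete, here is a minimal Python sketch of a product-quantization-style lookup. It is not the authors’ implementation: the function names (build_tables, encode, approx_matmul), the sizes, and the randomly generated prototypes and data are all assumptions for illustration. The encode step here also uses ordinary distance computations, where the paper uses a learned hash function instead. The part to notice is that once the tables are built from the known matrix B, the product is approximated with lookups and additions only.

```python
import numpy as np

def build_tables(prototypes, B):
    """Precompute prototype-times-B products once, offline.

    prototypes: (C, K, d_sub) array, K prototypes for each of C subspaces.
    B:          (C * d_sub, M) array, the matrix known ahead of time.
    Returns tables of shape (C, K, M).
    """
    C, K, d_sub = prototypes.shape
    B_sub = B.reshape(C, d_sub, B.shape[1])  # split the rows of B into C blocks
    return np.einsum("ckd,cdm->ckm", prototypes, B_sub)

def encode(A, prototypes):
    """Replace each sub-vector of each row of A by the index of its nearest prototype."""
    C, K, d_sub = prototypes.shape
    A_sub = A.reshape(A.shape[0], C, d_sub)
    # Squared distances, shape (N, C, K). This step still multiplies;
    # it is a simple stand-in for the paper's hashing.
    dists = ((A_sub[:, :, None, :] - prototypes[None]) ** 2).sum(axis=-1)
    return dists.argmin(axis=-1)  # (N, C) integer codes

def approx_matmul(codes, tables):
    """Approximate A @ B using only table lookups and additions."""
    C = tables.shape[0]
    # tables[c, codes[n, c]] gathered for every n and c, then summed over c
    return tables[np.arange(C), codes].sum(axis=1)

# Synthetic demo (all values made up): rows of A are noisy copies of prototypes.
rng = np.random.default_rng(0)
C, K, d_sub, M, N = 4, 16, 8, 5, 100
prototypes = rng.normal(size=(C, K, d_sub))
B = rng.normal(size=(C * d_sub, M))  # e.g. a model's fixed weights
tables = build_tables(prototypes, B)  # done once

true_codes = rng.integers(0, K, size=(N, C))
A = prototypes[np.arange(C), true_codes].reshape(N, C * d_sub)
A = A + 0.01 * rng.normal(size=A.shape)

codes = encode(A, prototypes)
approx = approx_matmul(codes, tables)
exact = A @ B
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print("relative error:", rel_err)
```

Everything that touches B happens once, inside build_tables. At prediction time each input row only needs C table lookups and C - 1 vector additions. How good the approximation is depends entirely on how well the prototypes match the real data, which is why the demo uses data that sits close to its prototypes.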

Why this paper is a nightmare to understand.

So now that we have some idea of why this concept is important, let’s talk about why this paper is challenging. Simply put, it spans a lot of technical fields. Here’s a depiction of the Product Quantization they use:

Not only does it use vectors, but it also relies on prototype learning, hashing, and aggregation. Following all of it requires very good coding and mathematical skills. Even their hashing is far from basic. The authors rely on hashing trees, which can be terrifying. Check out section 4.1 for more details. The complexity and wide-ranging nature of the paper was best articulated by the authors themselves: “our work draws on a number of different fields but does not fit cleanly into any of them”. Developing your understanding of the basics will help you at least understand the assumptions and experiment setups.
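Hashing trees sound scarier than they are, and the core idea fits in a few lines. The sketch below is a simplified illustration, assuming a balanced binary tree in which every level compares one feature against a per-bucket threshold, so a tree of depth 4 sorts each row into one of 16 buckets. The way the split feature and thresholds are picked here (largest within-bucket variance, threshold at the bucket median) is a stand-in heuristic and should not be read as the procedure from section 4.1. The names fit_hash_tree and hash_rows are hypothetical.

```python
import numpy as np

def fit_hash_tree(X, depth=4):
    """Learn one split feature per level and one threshold per bucket at that level."""
    buckets = np.zeros(len(X), dtype=int)  # every row starts in bucket 0
    split_dims, thresholds = [], []
    for level in range(depth):
        n_buckets = 2 ** level
        # Assumed heuristic: split on the feature with the most
        # within-bucket variance, summed over the current buckets.
        score = np.zeros(X.shape[1])
        for b in range(n_buckets):
            rows = X[buckets == b]
            if len(rows) > 0:
                score += rows.var(axis=0) * len(rows)
        dim = int(score.argmax())
        # One threshold per bucket: the median of that feature inside the bucket.
        thr = np.zeros(n_buckets)
        for b in range(n_buckets):
            vals = X[buckets == b, dim]
            if len(vals) > 0:
                thr[b] = np.median(vals)
        split_dims.append(dim)
        thresholds.append(thr)
        # Go to the left or right child: a doubling (bit shift) plus one comparison.
        buckets = buckets * 2 + (X[:, dim] > thr[buckets])
    return split_dims, thresholds

def hash_rows(X, split_dims, thresholds):
    """Assign each row a bucket index using one comparison per tree level."""
    buckets = np.zeros(len(X), dtype=int)
    for dim, thr in zip(split_dims, thresholds):
        buckets = buckets * 2 + (X[:, dim] > thr[buckets])
    return buckets

# Synthetic demo data (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
split_dims, thresholds = fit_hash_tree(X, depth=4)
codes = hash_rows(X, split_dims, thresholds)
print("rows per bucket:", np.bincount(codes, minlength=16))
```

The point of this design is that hashing a row costs only `depth` comparisons and never a dot product, which is what lets the encoding step avoid multiply-adds.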

For a detailed look at some of the assumptions in the paper, check out this video. I go over the assumptions and a concrete example of the matrix multiplication approximation. Make sure to pause the video and read the snippets I’ve taken from the paper. I found them particularly insightful.

Why this complexity is a Good Thing for you

Obviously not every Machine Learning/AI venture is as complex as this paper. However, real-life ML will be complex. Following is an exchange I had with someone who read and enjoyed my article, 5 Unsexy Truths About Working in Machine Learning.

The complexity of Machine Learning opens a lot of doors. It means that there are always new ways to try things, new knowledge to discover, and new protocols/ensembles to invent. It allows you to specialize in the fields you’re most interested in. If you’re willing to put in the work and struggle, you will soon be able to develop your own value-adds. And that’s when it gets fun. How to become a Machine Learning Expert is an article to help you speed up the process. As long as you’re willing to find areas you’re interested in and dive into them, you will be able to get great results in your Machine Learning journey.

If you liked this article, check out my other content. I post regularly on Medium, YouTube, Twitter, and Substack (all linked below). I focus on Artificial Intelligence, Machine Learning, Technology, and Software Development. If you’re preparing for coding interviews check out: Coding Interviews Made Simple.

For one-time support of my work, here are my Venmo and PayPal. Any amount is appreciated and helps a lot:

Venmo: https://account.venmo.com/u/FNU-Devansh

Paypal: paypal.me/ISeeThings

Reach out to me

If this article got you interested in reaching out to me, then this section is for you. You can reach out to me on any of the platforms, or check out any of my other content. If you’d like to discuss tutoring, text me on LinkedIn, IG, or Twitter. If you’d like to support my work, use my free Robinhood referral link. We both get a free stock, and there is no risk to you. So not using it is just losing free money.

Check out my other articles on Medium: https://rb.gy/zn1aiu

My YouTube: https://rb.gy/88iwdd

Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y

My Instagram: https://rb.gy/gmvuy9

My Twitter: https://twitter.com/Machine01776819

If you’re preparing for coding interviews: https://codinginterviewsmadesimple.substack.com/

Get a free stock on Robinhood: https://join.robinhood.com/fnud75

Written by Devansh

Writing about AI, Math, the Tech Industry and whatever else interests me. Join my cult to gain inner peace and to support my crippling chocolate milk addiction
