Interesting Content in AI, Software, Business, and Tech- 08/07/2024
Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more
A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI Papers/Publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and remembered) throughout the week. These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 08/07/2024. If you missed last week’s readings, you can find them here.
Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf. Lastly, if you’d like to get involved in our many fun discussions, you should join the Substack Group Chat Over here.
Community Spotlight: Emergent Garden
Emergent Garden puts out very interesting videos on Life simulations, neural networks, cellular automata, and other emergent programs. They’re more “interesting” and less “informational” than a lot of the other sources I share, but I often find myself wanting to learn much more about an idea after watching one of EG’s videos. It’s a pretty good way to engage with a lot of the more advanced ideas in AI without having to deal with the load of learning 50 new theorems and ideas. Also, EG covers Evolutionary Algos a lot more than other AI creators, so I feel a sense of kinship with him.
If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me directly. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.
Previews
Curious about what articles I’m working on? Here are the previews for the next planned articles-
The Economics of ESports
Here’s a teaser. See if you can guess the topic (FT is fine-tuning).
Highly Recommended
These are pieces that I feel are particularly well done. If you don’t have much time, make sure you at least catch these works.
LLM Paper Reading Notes — August 2024
Every month, Jean David Ruvini posts his notes on LLM/NLP-related papers, and every month I share his notes here. JD has been doing cutting-edge NLP at scale for a while, so his insights are very valuable. Given how many people want to stay cutting edge in NLP- and how hard it is to know what to focus on- domain-specific sources like JD’s are a godsend-
Sharing short notes (from myself and others) about LLM research papers I came across in July. These notes differ in their level of detail and precision. I hope they’re still useful in piquing your curiosity and helping you breathe under the waterfall. At the current pace of AI, it takes the power of all of us to keep up.
Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach
A lot of you have seen discussions of this paper, so this is your reminder to read it.
Retrieval Augmented Generation (RAG) has been a powerful tool for Large Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent LLMs like Gemini-1.5 and GPT-4 show exceptional capabilities to understand long contexts directly. We conduct a comprehensive comparison between RAG and long-context (LC) LLMs, aiming to leverage the strengths of both. We benchmark RAG and LC across various public datasets using three latest LLMs. Results reveal that when resourced sufficiently, LC consistently outperforms RAG in terms of average performance. However, RAG’s significantly lower cost remains a distinct advantage. Based on this observation, we propose Self-Route, a simple yet effective method that routes queries to RAG or LC based on model self-reflection. Self-Route significantly reduces the computation cost while maintaining a comparable performance to LC. Our findings provide a guideline for long-context applications of LLMs using RAG and LC.
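To make the routing idea concrete, here is a minimal sketch of what a Self-Route-style decision could look like at inference time. This is my own illustration rather than the paper’s code, and the ask_llm callable is an assumed stand-in for whatever LLM client you actually use:

```python
# Minimal sketch of the Self-Route idea: try the cheap RAG path first; if the model
# says the retrieved chunks aren't enough, fall back to the expensive long-context path.
# `ask_llm` is an assumed callable (prompt -> answer string), not part of the paper.

def self_route(question: str, retrieved_chunks: list[str], full_context: str, ask_llm) -> str:
    rag_prompt = (
        "Answer the question using only the provided chunks. "
        "If they are insufficient, reply exactly 'unanswerable'.\n\n"
        + "\n\n".join(retrieved_chunks)
        + f"\n\nQuestion: {question}"
    )
    rag_answer = ask_llm(rag_prompt)

    if "unanswerable" not in rag_answer.lower():
        return rag_answer  # cheap path: the retrieved chunks were enough

    # Expensive path: give the model the entire long context.
    return ask_llm(f"{full_context}\n\nQuestion: {question}")
```

The savings come from how often the cheap branch succeeds- per the abstract, this keeps performance comparable to pure long-context while significantly cutting compute.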
How to Use Benchmarks to Build Successful Machine Learning Systems
I’ve discussed Goodhart’s Law and how any sufficiently powerful system significantly changes its environment (and why this makes our way of training AI incompatible with AGI). Logan Thorneloe takes a slightly different approach, explaining how overfitting to benchmarks leads to system issues. Logan is at his best exploring the intersection of ML and Software Engineering, and this is one such case.
Tl;dr: Software engineers building applications using machine learning need to test models in real-world scenarios before choosing which model performs best. Benchmarks are good preliminary measures but don’t reflect the complexities of real-world scenarios.
How GitHub uses GitHub Actions and Actions larger runners to build and test GitHub.com [Technique Tuesdays]
A piece by yours truly for our sister publication, Tech Made Simple, on GitHub Actions and how they are used to speed up workflows. This is the kind of engineering that often gets overlooked in discussions of AI and its utility, which I think is a bummer b/c there’s so much really awesome shit happening all over. It would be nice if people could stop being so tribal about every development and just marvel at the cool things being built.
Recently, I came across a very interesting piece called “How GitHub uses GitHub Actions and Actions larger runners to build and test GitHub.com”, an overview of using GitHub Actions for CI/CD (learn more about what it is and how it enables smoothness and collaboration across large, diverse teams here). It’s always good to study different software engineering tools to see how we can improve our own work experiences-
we run 15,000 CI jobs within an hour across 150,000 cores of compute
This article will be my overview + analysis of that piece, looking at how GitHub achieves speed, efficiency, and reliability at massive scale. To follow along, it’s helpful to first understand GitHub Actions and Action Runners.
If you like this article, please consider becoming a premium subscriber to my newsletter AI Made Simple so I can spend more time researching and sharing information on truly important topics. We have a pay-what-you-can model, which lets you support my efforts to bring high-quality technical Education to everyone for less than the price of a cup of coffee.
I provide various consulting and advisory services. If you’d like to explore how we can work together, reach out to me through any of my socials over here or reply to this email.
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
As mentioned, I’m going to do a piece on RAG soon, so I’m doing a lot of research on the topic. This was an interesting find. I’ll have to dig into it more, but if the paper says what I think it’s saying- it will change the way I do RAG.
Large language models (LLMs) typically utilize the top-k contexts from a retriever in retrieval-augmented generation (RAG). In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. In particular, the instruction-tuned LLMs work surprisingly well by adding a small fraction of ranking data into the training blend, and outperform existing expert ranking models, including the same LLM exclusively fine-tuned on a large amount of ranking data. For generation, we compare our model with many strong baselines, including GPT-4-0613, GPT-4-turbo-2024-0409, and ChatQA-1.5, an open-sourced model with the state-of-the-art performance on RAG benchmarks. Specifically, our Llama3-RankRAG significantly outperforms Llama3-ChatQA-1.5 and GPT-4 models on nine knowledge-intensive benchmarks. In addition, it also performs comparably to GPT-4 on five RAG benchmarks in the biomedical domain without instruction fine-tuning on biomedical data, demonstrating its superb capability for generalization to new domains.
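For intuition on what unifying ranking and generation buys you, here is a rough sketch of a retrieve-rerank-generate loop where the same model scores contexts and then answers. It’s an illustration of the pattern, not the paper’s implementation (RankRAG’s contribution is instruction-tuning a single LLM so it does both jobs well); ask_llm and the scoring prompt are assumptions:

```python
# Rough sketch of a retrieve -> rerank -> generate loop where the *same* LLM scores
# context relevance and then answers- the pattern RankRAG instruction-tunes one model for.
# `ask_llm` is a placeholder callable (prompt -> string); the scoring prompt is illustrative.

def rank_then_generate(question: str, candidate_contexts: list[str], ask_llm, top_n: int = 5) -> str:
    def relevance(ctx: str) -> float:
        prompt = (
            f"Question: {question}\nContext: {ctx}\n"
            "Rate how relevant this context is to the question from 0 to 10. Reply with a number only."
        )
        reply = ask_llm(prompt).strip()
        try:
            return float(reply.split()[0])
        except (ValueError, IndexError):
            return 0.0  # treat unparseable replies as irrelevant

    # Keep only the contexts the model itself ranks highest.
    best = sorted(candidate_contexts, key=relevance, reverse=True)[:top_n]

    answer_prompt = "\n\n".join(best) + f"\n\nQuestion: {question}\nAnswer:"
    return ask_llm(answer_prompt)
```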
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Another interesting example of the benefits of good data decisions. The quantification of how far vocab sizes have lagged behind model sizes is pretty interesting, and I wonder what other performance gains we’re leaving on the table when we rush to scale model size.
Research on scaling large language models (LLMs) has primarily focused on model parameters and training data size, overlooking the role of vocabulary size. We investigate how vocabulary size impacts LLM scaling laws by training models ranging from 33M to 3B parameters on up to 500B characters with various vocabulary configurations. We propose three complementary approaches for predicting the compute-optimal vocabulary size: IsoFLOPs analysis, derivative estimation, and parametric fit of the loss function. Our approaches converge on the same result that the optimal vocabulary size depends on the available compute budget and that larger models deserve larger vocabularies. However, most LLMs use too small vocabulary sizes. For example, we predict that the optimal vocabulary size of Llama2-70B should have been at least 216K, 7 times larger than its vocabulary of 32K. We validate our predictions empirically by training models with 3B parameters across different FLOPs budgets. Adopting our predicted optimal vocabulary size consistently improves downstream performance over commonly used vocabulary sizes. By increasing the vocabulary size from the conventional 32K to 43K, we improve performance on ARC-Challenge from 29.1 to 32.0 with the same 2.3e21 FLOPs. Our work emphasizes the necessity of jointly considering model parameters and vocabulary size for efficient scaling.
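Since the headline example is Llama2-70B’s 32K vocabulary versus a predicted optimum of at least 216K, here is a quick back-of-the-envelope sketch (mine, not the paper’s method) of what those vocabulary sizes actually cost in parameters:

```python
# Back-of-the-envelope arithmetic (my illustration, not the paper's method): the input
# embedding and output unembedding matrices cost roughly 2 * vocab_size * d_model
# parameters (assuming no weight tying).

def vocab_param_share(vocab_size: int, d_model: int, total_params: float) -> float:
    return (2 * vocab_size * d_model) / total_params

# Llama2-70B-ish numbers: hidden size 8192, ~70B total parameters.
for vocab in (32_000, 216_000):
    print(f"vocab={vocab:>7,}: ~{vocab_param_share(vocab, 8192, 70e9):.2%} of parameters")

# Prints roughly 0.75% at 32K and ~5% at 216K- even the much larger vocabulary the
# authors recommend stays a modest slice of a 70B model's parameter budget.
```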
Confabulation: The Surprising Value of Large Language Model Hallucinations
I’ve been saying this for a while, but hallucinations are not the devil people paint them to be. Yes, they are a problem if you can’t plan for them, but what we call hallucinations are just a natural by-product of the way autoregressive models work. This paper presents an interesting inversion of the way we think about hallucinations (note this doesn’t mean that you don’t need to account for hallucinations in your product, just that they’re a risk of doing business and should be treated accordingly).
This paper presents a systematic defense of large language model (LLM) hallucinations or ‘confabulations’ as a potential resource instead of a categorically negative pitfall. The standard view is that confabulations are inherently problematic and AI research should eliminate this flaw. In this paper, we argue and empirically demonstrate that measurable semantic characteristics of LLM confabulations mirror a human propensity to utilize increased narrativity as a cognitive resource for sense-making and communication. In other words, it has potential value. Specifically, we analyze popular hallucination benchmarks and reveal that hallucinated outputs display increased levels of narrativity and semantic coherence relative to veridical outputs. This finding reveals a tension in our usually dismissive understandings of confabulation. It suggests, counter-intuitively, that the tendency for LLMs to confabulate may be intimately associated with a positive capacity for coherent narrative-text generation.
An Update on Cloud Markets and AI Value Creation
If it seems like I’ve been fan-boying Eric Flaningam recently, it’s because it’s true. Super glad I found his newsletter, b/c it’s my favorite for understanding the business/investor side of things. His articles + ModernMBA’s YouTube deep dives are a must for any technical person who wants to better understand the money side of the industry.
I like to provide a quarterly update on the hyperscalers as they give us the best gauge on technology markets as a whole. We get data across infra, cloud, and applications. For those interested in AI adoption, they also give us the best insight into AI adoption (Capex, AI cloud revenue, and AI app adoption).
For background info, I published a primer on the cloud here, providing a breakdown of its history, technology, and markets.
This article will be structured as follows:
- Background on the Hyperscalers’ AI Strategy
- The Capex Story
- Market Share Data
- Azure Quarterly Update
- AWS Quarterly Update
- GCP Quarterly Update
Turkey’s comeback, Russia’s overheating economy & more
Another economics-related resource I absolutely love is Joeri Schasfoort’s videos for Money and Macro. This video is a great way to stay in touch with important economic developments and debates happening around the world.
I haven’t read as much Alejandro Piad Morffis as I should have, and that’s totally on me. I found this article in my bookmarks, and it is a masterpiece. Do yourself a favor and subscribe to his newsletter because it will make your journey into Computer Science and AI so much easier. This piece is a great introduction to Graphs and some of the core algorithms that drive everything else.
The Paltry Economics of Esports
I’m not super into Gaming and Esports, so this video was an eye-opener for me on so many levels. Learning about business models is one of my favorite things to do, and it has done a lot of good for me.
Other Good Content
The Meme that gave me Imposter Syndrome
An interesting overview of type attributes in iOS development.
The best way to understand type attributes is by checking how they’re implemented in the Swift source code.
Ha, just kidding, I’ll explain them here.
Type attributes provide the compiler with additional information about a type. When used in a function, they enforce constraints on the type they are applied to — in many cases, this type is a closure: () -> Void.
3 Hours to 3 Minutes: How Mobile reCell Is Importing Customer Data 60x Faster
Mobile reCell is the pioneer of software-driven recovery for corporate-owned IT assets, such as laptops, tablets, and smartphones from employees. Using Mobile reCell’s platform, customers — like a leading US airline — can initiate workflows like replacing the iPads in every cockpit of their fleet.
Mobile reCell’s enterprise customers manage the hardware recovery of tens of thousands of devices. On a daily basis, each of their customers initiates hundreds of workflows to request employee device returns, send return kits or QR codes, and more through their platform.
Why Airbnb moved away from a monolithic architecture
In 2018, Airbnb began its migration to a service-oriented architecture, as the Ruby on Rails “monorail” started becoming hard to maintain and was a single point of failure.
The main difference between SOA and microservices has to do with the architecture scope.
While Airbnb didn’t necessarily need to move to a SOA, they chose to as it made sense for their organizational needs.
In a recent 2023 talk, they outlined four lessons:
- Invest in shared infrastructure early
- Simplify service dependencies
- Centralize data hydration (fetching and transformation)
- Separate UI logic from backend logic
In this article, I summarize the talk and make architecture comparisons with other large tech companies, like Meta, Google, and Uber.
If you liked this article and wish to share it, please refer to the following guidelines.
I put a lot of effort into creating work that is informative, useful, and independent from undue influence. If you’d like to support my writing, please consider becoming a paid subscriber to this newsletter. Doing so helps me put more effort into writing/research, reach more people, and supports my crippling chocolate milk addiction. Help me democratize the most important ideas in AI Research and Engineering to over 100K readers weekly.
PS- We follow a “pay what you can” model, which allows you to support within your means. Check out this post for more details and to find a plan that works for you.
I regularly share mini-updates on what I read on the Microblogging sites X(https://twitter.com/Machine01776819), Threads(https://www.threads.net/@iseethings404), and TikTok(https://www.tiktok.com/@devansh_ai_made_simple)- so follow me there if you’re interested in keeping up with my learnings.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819