Interesting Content in AI, Software, Business, and Tech- 10/18/2023

Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more

Devansh

A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI papers/publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and remembered throughout the week). These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 10/18/2023. If you missed last week’s readings, you can find them here.

Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/

Community Spotlight- Will Durant

Technically, Will is not a member of this community (he died long before the internet), nor does he have much to do with AI or tech. But his work is worthwhile regardless. Durant was a historian, philosopher, and writer who spent his life cataloging some of history’s greatest thinkers and civilizations. You can find his work in video format at the YouTube channel Durant and Friends here. Although his work and conclusions can often be very Eurocentric, the videos still contain a ton of interesting information. I love playing the videos in the background while I play Civ 5 or Age of Empires 2 (the greatest video game ever made, and I will die on that hill). Thanks to the above channel, I have picked up an interest in Babylon and Jean-Jacques Rousseau. If you’re looking to add to your reading list or explore new interests, I can’t recommend his work highly enough.

If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me directly. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re tackling, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.

Join 150K+ tech leaders and get insights on the most important ideas in AI straight to your inbox through my free newsletter- AI Made Simple

Highly Recommended

These are pieces that I feel are particularly well done. If you don’t have much time, make sure you at least catch these works.

What does the OpenLLM Leaderboard measure?

Super duper important. Must read.

We find that it is indeed hard to gauge the real-world usability of LLMs from the results of the leaderboard, as the tasks it includes are disconnected from how LLMs are used in practice. Furthermore, we find clear ways the leaderboard can be gamed, such as by exploiting the common structure of ground truth labels. In sum, we hope that this report demonstrates the importance of testing your model in a disaggregated way on data that is representative of the downstream use-cases you care about.

How Transparent Are Foundation Model Developers?

There is a right way and a wrong way to regulate AI. Creating benchmarks/checklists for quantifying measures like transparency is an important first step.

Today, we’re introducing the Foundation Model Transparency Index to aggregate transparency information from foundation model developers, identify areas for improvement, push for change, and track progress over time. This effort is a collaboration between researchers from Stanford, MIT, and Princeton.

The inaugural 2023 version of the index consists of 100 indicators that assess the transparency of the developers’ practices around developing and deploying foundation models. Foundation models impact societal outcomes at various levels, and we take a broad view of what constitutes transparency.

Read more

xVal: A Continuous Number Encoding for Large Language Models

Numbers have been a sticking point for LLMs. This approach seems promising, and if someone can crack number embedding with LLMs, they suddenly become wayyy more useful. PS: I found this paper in the excellent publication Davis Summarizes Papers. If you’re interested in keeping up with papers and AI research, I highly recommend subscribing to his newsletter. If your priority is to keep up with every update, then Davis is an even better resource than I am.

Large Language Models have not yet been broadly adapted for the analysis of scientific datasets due in part to the unique difficulties of tokenizing numbers. We propose xVal, a numerical encoding scheme that represents any real number using just a single token. xVal represents a given real number by scaling a dedicated embedding vector by the number value. Combined with a modified number-inference approach, this strategy renders the model end-to-end continuous when considered as a map from the numbers of the input string to those of the output string. This leads to an inductive bias that is generally more suitable for applications in scientific domains. We empirically evaluate our proposal on a number of synthetic and real-world datasets. Compared with existing number encoding schemes, we find that xVal is more token-efficient and demonstrates improved generalization.
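To make the encoding concrete, here is a rough sketch of how I read the core idea, in PyTorch. The class name, shapes, and value handling are my own assumptions rather than the paper’s code, and the paper also pairs this with a modified number-inference head that I omit.

```python
import torch
import torch.nn as nn

class XValStyleEmbedder(nn.Module):
    """Sketch of an xVal-style continuous number encoding.

    Every number in the input is replaced by a single [NUM] token whose
    shared, learned embedding is scaled by the number's (normalized) value.
    Names and shapes are illustrative assumptions, not the paper's code.
    """

    def __init__(self, vocab_size: int, d_model: int, num_token_id: int):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.num_token_id = num_token_id

    def forward(self, token_ids: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq) ints; values: (batch, seq) floats that hold
        # the numeric value at [NUM] positions and 1.0 everywhere else.
        emb = self.tok_emb(token_ids)          # (batch, seq, d_model)
        return emb * values.unsqueeze(-1)      # scale [NUM] embeddings by value


# Toy usage: the string "x = 3" becomes tokens [x, =, [NUM]] with values [1, 1, 3].
embedder = XValStyleEmbedder(vocab_size=100, d_model=16, num_token_id=7)
ids = torch.tensor([[5, 6, 7]])
vals = torch.tensor([[1.0, 1.0, 3.0]])
print(embedder(ids, vals).shape)  # torch.Size([1, 3, 16])
```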

Read more here

An AI Revolution Isn’t Coming — We’re Already In It

A great overview of important developments in AI that are already here.

Headlines focus on what these models mean for our future — but what about the present? It’s an injustice to consumers to state an AI revolution is on the horizon when it’s already begun. AI is currently being used to disrupt many industries in ways consumers aren’t even aware will drastically improve their lives. Below are a few companies and their products that have the potential to fundamentally change your life.

Read more

Explained Simply: How A.I. Defeated World Champions in the Game of Dota 2

I wish I had the patience to write this. Aman Y. Agarwal really decided to explain every little nuance related to OpenAI’s RL publication. I don’t think I could improve upon it, even if I tried.

In 2019, the world of esports changed forever. For the first time, a superhuman AI program learned to cooperate with copies of itself and defeated the reigning world champions in the #1 competitive video game on the planet, Dota 2.

In this essay, I will take that breakthrough research paper by OpenAI and explain it paragraph-by-paragraph, in simple English.

This will make it more approachable for those who don’t (yet) have a strong background in ML or neural networks, or who don’t use English as their first language.

Read more

What Data Science Forgot

A great writeup, echoing many of the sentiments I often like to express. Great engineering means nothing if you don’t solve the right problems. Data Science/AI/Software teams often get hot and bothered about disrupting areas they know nothing about. More often than not, it just results in a product that tries to work around some regulation, burns a ton of investor money, and never turns a profit. If you want to disrupt more than just your bank account, always understand your domain and the challenges you’re solving. Intimately.

The main thing Data Science forgot is that understanding the data and the goals for the data are essential. In doing so, they fundamentally forgot their roots when it comes to Operations Research and Statistics which have a fundamental rule each:

  • Operations Research — Answering the why before the what or how.
  • Statistics — Form a null hypothesis and then work to disprove it.

Forgetting to validate and tie to actual business objectives, and forgetting that you aren’t supposed to prove the hypothesis but disprove it, is exactly what the xkcd comic above demonstrates and something that has driven me nuts for years.
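To make the statistics point concrete, here is a tiny illustration (the tooling and numbers are mine, not the article’s): state a null hypothesis, then try to reject it with evidence, rather than setting out to confirm the result you want.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# H0 (null hypothesis): the new model's error rate is no different from the old one's.
# We do not set out to "prove" the new model is better; we try to reject H0.
old_errors = rng.normal(loc=0.30, scale=0.05, size=200)
new_errors = rng.normal(loc=0.28, scale=0.05, size=200)

t_stat, p_value = stats.ttest_ind(new_errors, old_errors)
if p_value < 0.05:
    print(f"Reject H0 (p = {p_value:.4f}): the difference is unlikely under the null.")
else:
    print(f"Fail to reject H0 (p = {p_value:.4f}): no evidence of a real difference.")
```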

Read more

Why Muay Thai and Kickboxing Fails in the West

Even if you’re not a combat sports fan like me, this video is a pretty interesting look into how the current social climate has moved to favor gimmicks and spectacle over skill. Fighting has a pretty different economic setup from other traditional sports, and it actually has some interesting parallels to social media and the current investing climate. And the last part about name bias (ethnic names getting fewer call-backs) is worth noting.

Watch it here

AI Content

Efficient Streaming Language Models with Attention Sinks (Paper Explained)

How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs keep the efficiency of windowed attention but avoid its drop in performance by using attention sinks — an interesting phenomenon where the token at position 0 acts as an absorber of “extra” attention.
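For intuition, here is a minimal sketch of the cache-eviction idea as I understand it (the token counts are arbitrary and this is not the authors’ implementation): keep the first few “sink” tokens plus a sliding window of the most recent ones.

```python
def evict_kv_cache(cache, num_sink_tokens=4, window_size=1020):
    """Keep the first few attention-sink tokens plus a sliding window of the
    most recent tokens, dropping everything in between. `cache` stands in for
    a list of per-token key/value entries; the counts are arbitrary choices
    for illustration, not the authors' settings."""
    if len(cache) <= num_sink_tokens + window_size:
        return cache
    return cache[:num_sink_tokens] + cache[-window_size:]


# Toy usage: integers stand in for per-token KV tensors.
cache = list(range(2000))
cache = evict_kv_cache(cache)
print(len(cache), cache[:6])  # 1024 [0, 1, 2, 3, 980, 981]
```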

Watch here

Training AI to Play Pokemon with Reinforcement Learning

I was going to do a piece on RL and how it might make a comeback in testing. Then I came across this video. It covers many of the points I wanted to make.

Watch here

LangChain is Garbage Software

This is an interesting hot-take. Came across this through Prithivi Da. Would love to know what y’all think.

Overall, LangChain seems to embody a complex and intricate approach, despite its relatively young age. Adapting it to meet specific requirements would result in substantial technical debt, which cannot be easily resolved as is often the case with AI startups relying on venture capital to manage such issues.

Ideally, API wrappers should simplify code complexity and cognitive burden when dealing with intricate ecosystems, given the mental efforts already required to work with AI. However, LangChain stands out as one of the few software pieces that introduce additional overhead in its prevalent use cases.

Read more

More-efficient approximate nearest-neighbor search

In a paper we presented at this year’s Web Conference, we describe a new technique that makes graph-based nearest-neighbor search much more efficient. The technique is based on the observation that, when calculating the distance between the query and points that are farther away than any of the candidates currently on the list, an approximate distance measure will usually suffice. Accordingly, we propose a method for computing approximate distance very efficiently and show that it reduces the time required to perform approximate nearest-neighbor search by 20% to 60%.
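As a hedged illustration of the general idea (not the paper’s actual algorithm), the sketch below prunes graph-search candidates with a cheap lower-bound distance over a few dimensions and only computes the exact distance when a point could still beat the current k-th best; every name and threshold here is my own assumption.

```python
import heapq
import numpy as np

def search_with_cheap_filter(query, vectors, neighbors, entry, k=10, probe_dims=16):
    """Greedy graph search where a cheap distance over the first `probe_dims`
    dimensions filters out neighbors that cannot beat the current k-th best
    candidate; only survivors pay for the exact distance. Since the partial
    L2 norm never exceeds the full norm, the filter never discards a point
    that would have made the top k."""
    def lower_bound(a, b):
        return np.linalg.norm(a[:probe_dims] - b[:probe_dims])

    def exact(a, b):
        return np.linalg.norm(a - b)

    visited = {entry}
    best = [(-exact(query, vectors[entry]), entry)]  # max-heap via negated distances
    frontier = [entry]
    while frontier:
        node = frontier.pop()
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            worst = -best[0][0] if len(best) >= k else float("inf")
            if lower_bound(query, vectors[nb]) > worst:
                continue  # cheap filter: this point cannot enter the top-k
            d = exact(query, vectors[nb])
            if d < worst:
                heapq.heappush(best, (-d, nb))
                if len(best) > k:
                    heapq.heappop(best)
                frontier.append(nb)
    return sorted((-neg_d, i) for neg_d, i in best)


# Toy usage on a random graph index.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(500, 64))
nbrs = {i: rng.choice(500, size=8, replace=False).tolist() for i in range(500)}
print(search_with_cheap_filter(rng.normal(size=64), vecs, nbrs, entry=0, k=5)[:2])
```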

Read more

What Every Developer Should Know About GPU Computing

Most programmers have an intimate understanding of CPUs and sequential programming because they grow up writing code for the CPU, but many are less familiar with the inner workings of GPUs and what makes them so special. Over the past decade, GPUs have become incredibly important because of their pervasive use in deep learning. Today, it is essential for every software engineer to possess a basic understanding of how they work. My goal with this article is to give you that background.

Read more

BitNet: Scaling 1-bit Transformers for Large Language Models

The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption. In this work, we introduce BitNet, a scalable and stable 1-bit Transformer architecture designed for large language models. Specifically, we introduce BitLinear as a drop-in replacement of the nn.Linear layer in order to train 1-bit weights from scratch. Experimental results on language modeling show that BitNet achieves competitive performance while substantially reducing memory footprint and energy consumption, compared to state-of-the-art 8-bit quantization methods and FP16 Transformer baselines. Furthermore, BitNet exhibits a scaling law akin to full-precision Transformers, suggesting its potential for effective scaling to even larger language models while maintaining efficiency and performance benefits.
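For a feel of the drop-in-replacement idea, here is a minimal sketch of training a linear layer with 1-bit weights via a straight-through estimator. This is not BitNet’s actual BitLinear (which also quantizes activations and adds normalization and scaling details); it only illustrates swapping nn.Linear for a binarized variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryLinearSketch(nn.Module):
    """Minimal sketch of a linear layer trained with 1-bit weights using a
    straight-through estimator. NOT BitNet's actual BitLinear; this only
    shows the drop-in-replacement idea described in the abstract."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        alpha = w.abs().mean()               # per-tensor scale factor
        w_bin = alpha * torch.sign(w)        # 1-bit weights in {-alpha, +alpha}
        # Straight-through estimator: forward with binarized weights, but let
        # gradients flow to the underlying full-precision weights.
        w_ste = w + (w_bin - w).detach()
        return F.linear(x, w_ste)


# Drop-in usage where you would otherwise use nn.Linear(64, 32):
layer = BinaryLinearSketch(64, 32)
out = layer(torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 32])
```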

Learn More

Other Content

What is algebraic geometry?

Algebraic geometry is often presented as the study of zeroes of polynomial equations. But it’s really about something much deeper: the duality between abstract algebra and geometry.

Watch it

China’s Crumbling Economic Story

Deng Xiaoping’s reforms in the 80s transformed Shenzhen, a small town near Hong Kong, into an economic powerhouse. China’s rapid growth lifted millions from poverty, but did it grow too quickly to be sustainable? Now China faces deflation, and experts are worried that this could spell the end of the economic miracle.

Watch it

Nature mimics: why bugs mate with beer bottles

Jewel beetles have been known for inspiring engineering innovations of such things as forest fire sensors, new materials, and even palace decor. But one Australian jewel beetle (Julodimorpha bakewelli) has achieved fame simply by choosing a rather unorthodox mate: a beer stubbie.

Read more

Fables and Folktales: The Boy Who Found Fear At Last

A fun little folk tale I came across.

If you liked this article and wish to share it, please refer to the following guidelines.

If you find AI Made Simple useful and would like to support my writing- please consider becoming a premium member of my cult by subscribing below. Subscribing gives you access to a lot more content and enables me to continue writing. This will cost you 400 INR (5 USD) monthly or 4000 INR (50 USD) per year and comes with a 60-day, complete refund policy. Understand the newest developments and develop your understanding of the most important ideas, all for the price of a cup of coffee.

Become a premium member

Reach out to me

Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.

Small Snippets about Tech, AI and Machine Learning over here

AI Newsletter- https://artificialintelligencemadesimple.substack.com/

My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/

Check out my other articles on Medium. : https://rb.gy/zn1aiu

My YouTube: https://rb.gy/88iwdd

Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y

My Instagram: https://rb.gy/gmvuy9

My Twitter: https://twitter.com/Machine01776819


Written by Devansh

Writing about AI, Math, the Tech Industry and whatever else interests me. Join my cult to gain inner peace and to support my crippling chocolate milk addiction
