Interesting Content in AI, Software, Business, and Tech- 04/18/2024

Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more

Devansh
12 min read · Apr 19, 2024

A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI Papers/Publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and that I remembered throughout the week). These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 04/18/2024. If you missed last week’s readings, you can find them here.

Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf.

Community Spotlight: Roger Lam

I’ve been following Roger Lam on LinkedIn for a bit, and I love what I’ve seen so far. He shares a lot of interesting content/updates (both his own and other people’s), which makes staying updated much easier. His posts are less dense than some of the experts we usually feature, but that’s not a strict negative since they’re much more digestible. If you’re looking for ML news in a more approachable format, I’d suggest giving Roger a follow.

If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me directly. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.

Previews

Curious about what articles I’m working on? Here are the previews for the next planned articles-

Tech Made Simple

UFC’s 300 Million Dollar Settlement and what it teaches us about the Game Theory of Wealth Inequality.

AI Made Simple

How to hack HuggingFace + how to implement better AI Security.

Before we get into the reading recs, I have an important update. I’m looking into AI for different chaotic systems. If you have insight into these systems, please shoot me a message:

Financial Modeling

Supply Chain Analysis

Health

Weather Forecasting

It would be cool if you had specifically modeled the above as chaotic systems, but even general insights into the fields would be very helpful. As always, if you know about something else that you’d like to share, you’re more than welcome to send me a message.

Join 150K+ tech leaders and get insights on the most important ideas in AI straight to your inbox through my free newsletter- AI Made Simple

Highly Recommended

These are pieces that I feel are particularly well done. If you don’t have much time, make sure you at least catch these works.

How to Write an Email (90s Tutorial)

A lot of people ask me for advice on how to write better. Improving your email writing is one of the best ways to improve your overall writing ability. I recently found this guide to writing emails, and I’ve been implementing a lot of its wisdom. I’m sure you’ll appreciate the results.

2024–4–14 arXiv roundup: backlog highlights part 2

Davis Blalock made his grand return to writing after DBRX’s successful release. He’s going to get back to publishing regularly, so if you’re interested in getting fairly in-depth summaries of ML Papers, then check him out. To give you a feel for his summaries, I’m attaching his summary of one of the papers below. Every article he publishes has multiple such summaries, so make sure you read the linked article to see them all.

GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

This is one of those rare papers that really changed my thinking about where we’re headed.

I’ve thought for a long time about how LLM-generated code would need to be sandboxed, and how you probably want to sanitize the inputs in various ways, and how you might want to postprocess or constrain the outputs to do something like guarantee that you’re getting valid JSON.

This paper points out that what we really want is not a bunch of separate pieces, but a cohesive LLM Runtime that enforces certain properties, even in the presence of an untrusted LLM agent. Some example properties include:

  1. Ensuring your LLM agent’s actions are “reversible,” meaning that you can always undo them if you don’t like them.
  2. If you can’t get full reversibility, enforcing “damage confinement,” so that the persistent side-effects are well-understood and the rest of the system can work around them. E.g., you can’t un-send an email, but you might have a whitelist of recipients and MIME types (a toy version of this check is sketched after this list).
  3. Ensuring that cryptographic secrets don’t get shared with third-party LLM APIs.
  4. Sandboxing code execution (think VMs, Docker, WASM, intercepting syscalls, limiting dependencies, etc).

To achieve these properties, they lean on a lot of ideas from classic data systems, like ACID transactions and commits.

While the paper reads like a position paper on LLM agents initially, you realize halfway through that this is hardcore systems research. Like, look at all the different runtime components they built just to safely hit a REST API.

Similarly, they end up with some interesting design tradeoffs around damage control/reversibility. The easy case is if you can lean on a database to just handle undo operations for you. The harder case is if the actions are too complex to cleanly undo, and you have to snapshot different versions of your system state and restore them.
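For intuition, here is a minimal sketch of the easy case- an undo log that records, for each action, how to reverse it, and rolls everything back in reverse order like an aborted transaction. The Action and Runtime names below are illustrative, not the paper’s design:

    # Minimal sketch of an undo log for reversible agent actions.
    # The Action/Runtime structure is illustrative, not GoEX's design.
    import os
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Action:
        execute: Callable[[], None]
        undo: Callable[[], None]

    class Runtime:
        def __init__(self):
            self._undo_log: list[Action] = []

        def run(self, action: Action) -> None:
            action.execute()
            self._undo_log.append(action)  # record how to reverse it

        def rollback(self) -> None:
            # Undo in reverse order, like aborting a transaction.
            while self._undo_log:
                self._undo_log.pop().undo()

    # Example: a "write file" action paired with its inverse.
    rt = Runtime()
    rt.run(Action(
        execute=lambda: open("/tmp/agent_note.txt", "w").write("draft"),
        undo=lambda: os.remove("/tmp/agent_note.txt"),
    ))
    rt.rollback()  # the file is removed; the action was reversed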

I’m excited about this for a few reasons.

First, it’s just much cleaner thinking about what we need in order to make agents safe to deploy. I’d previously only thought about a hodgepodge of hammers and nails, but formulating this as “We need to design an LLM runtime with a particular threat model that guarantees particular properties” is much clearer.

The second is that this clean formulation means the field will pick up on this idea and make real progress. We’ll see systems researchers for the next 5+ years gradually tightening the guarantees, reducing the overheads, expanding the functionality, etc. We’ll see open source LLM runtimes, with auditable security properties. We’ll see vendor solutions that integrate with containers and/or hypervisors. We’ll see public REST and GraphQL APIs start to add reversibility features to facilitate calls from agents. And more.

The last reason is that I see this as a promising attack on AI doom scenarios. For a long time, this literature has focused on extremely high-level failure modes (e.g., humans being disempowered as we grow dependent on AI) and properties of agents (e.g., being aligned with human values). I feel like this paper has “compiled” many abstract concerns down to the level of verifiable system properties — like, if we can “undo” an agent’s actions, it probably can’t kill us all.

On a more meta level, it also reinforces my conviction in the following maxim:

Never solve with machine learning that which can be solved with regular code.

For example, people spent decades trying to come up with some algorithmic alternative to backpropagation, in part to avoid the (biologically implausible) storage of activations and synchronous updates. But it turns out we can just devise systems improvements like sharded data parallelism, pipeline parallelism, etc., that make these downsides fine in practice. It’s much easier to make working algorithms fast than to find fast algorithms that work.

In the case of this paper and AI doom, I feel like we’ve spent decades thinking about high-level problems like specifying human values and “target loading” and detecting “deceptive alignment,” and what’s actually going to happen is we’re just going to sandbox the crap out of our agents, ensure their actions can be rolled back, and build good observability and monitoring tools. And we’ll train them on a few million samples of human preference data for good measure, although that’s as much a UX play as a defense-in-depth one. In short, whatever we haven’t solved with RL, philosophy, activation probing, etc., will get solved with strong systems work.

Stop “reinventing” everything to solve alignment

Loved Nathan Lambert’s most recent article exploring Social Choice Theory and Alignment. I’ve always maintained that it’s very important for tech people to study non-tech topics, b/c we build solutions for people. As Nate puts it succinctly, “Integrating some non-computing science into reinforcement learning from human feedback (RLHF) can give us the models we want.” Even in a purely pragmatic sense, there’s a lot of value to picking up different skills and multiple perspectives (one of the reasons why these updates cover multiple topics).

“As I discussed in my post where I described the different groups interested in the openness of LLMs, one of the core reasons for openness is so that scientists outside of AI, particularly outside of the biggest technology companies, can show that fields other than CS have contributions to make to the future of AI.

I so often get the question “What is left for academics to do in the space of LLMs?” This sort of thing is a perfect answer — study the emerging integration of two long-standing fields of science. Show that we can make the process for training our models more transparent with the goals, and then we can probably use it to actually make the models better”

China’s thirsty data centres, AI industry could use more water than South Korea’s population by 2030, report warns

As we covered in “The rise of AI as Magic”, there is a large disconnect between the people who use and benefit from AI, and the people who are hit by its negative externalities. Natural Resources are constrained, and we need to be judicious with how we use them for AI Projects.

China’s thirsty data centres and the rapid growth of artificial intelligence (AI) could dramatically increase demand on the country’s water resources, according to a new report by think tank China Water Risk.

The Hong Kong-based non-profit estimated the annual water consumption of data centres in China to be around 1.3 billion cubic metres (343 billion gallons) — enough for residential use for 26 million people. By 2030, the figure could reach over 3 billion cubic metres as more data facilities are expected to open, equivalent to the demand of a population greater than that of South Korea.
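The numbers are internally consistent. Here is a quick back-of-the-envelope check (the South Korea population figure below is my addition, roughly 51.7 million):

    # Back-of-the-envelope check of the report's figures.
    current_use_m3 = 1.3e9   # annual data-centre water use, cubic metres
    people_served = 26e6     # residents that much water could supply
    per_capita = current_use_m3 / people_served       # = 50 m3 per person per year

    projected_use_m3 = 3e9   # projected use by 2030
    implied_population = projected_use_m3 / per_capita   # = 60 million people
    south_korea_pop = 51.7e6  # approx. population of South Korea (my figure)
    print(implied_population > south_korea_pop)  # True- matches the report's claim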

What Do Developers Want From AI?

Google does very interesting research into developer productivity. Shoutout to Sarah D’Angelo for her work looking at how developers want to utilize AI-based tools.

The evolution of AI is a pivotal moment in history, but it’s not the first time we have experienced technological advances that have changed how humans work. By looking at the advances in automobiles, we are reminded of the importance of focusing on our developers’ needs and goals.

The recent advances in AI have resulted in the development of an increasing number of developer tools enhanced with AI (e.g., DuetAI, CoPilot, and ChatGPT for coding tasks). With this growth, there has been a lot of research on the impact of these enhancements from the perspective of developer productivity: Are AI enhancements increasing the speed at which developers write code? Do they improve the quality of the code written? Do they help developers find more creative solutions? However, there has been far less discussion of where and how developers want to interact with AI in their tools. If we do not address these questions, we risk focusing too much on the technology and its capabilities and not enough on identifying promising opportunities. As we have emphasized in this column before, our team takes a human-centered approach to understanding developer productivity, and accordingly, we began our explorations into this space from the developer’s perspective. Where do developers want AI in their workflows, and what do they anticipate its effects to be?

2024 generative AI predictions

Came across this great report by CB Insights on Gen AI. It misses the mark on a few things, but it is a very comprehensive overview of the market with a lot of good insights.

All Learning Algorithms Explained in 14 Minutes

If you want a beginner-friendly intro to the major families of algorithms used in AI, this is the video for you.

Talking existential risk into being: a Habermasian critical discourse perspective to AI hype

Remember how we talked about how a lot of Doomers/people pushing existential risk were doing it for money? Here’s a great study to back those claims up.

Recent developments in Artificial Intelligence (AI) have resulted in a hype around both opportunities and risks of these technologies. In this discussion, one argument in particular has gained increasing visibility and influence in various forums and positions of power, ranging from public to private sector organisations. It suggests that Artificial General Intelligence (AGI) that surpasses human intelligence is possible, if not inevitable, and which can — if not controlled — lead to human extinction (Existential Threat Argument, ETA). Using Jürgen Habermas’s theory of communicative action and the validity claims of truth, truthfulness and rightness therein, we inspect the validity of this argument and its following ethical and societal implications. Our analysis shows that the ETA is problematic in terms of scientific validity, truthfulness, as well as normative validity. This risks directing AI development towards a strategic game driven by economic interests of the few rather than ethical AI that is good for all.

A pragmatic introduction to Fractals

A week ago, we took a look into Fractals on my sister publication- Tech Made Simple. It was very well-received, and I’m sharing it here in case it’s interesting to any of you. I’m trying to get a community project going on GitHub where we explore the visualizations created by different Fractals, and would love to have you join over here.

Recently, we did an extensive deep-dive into Chaotic Systems and how we can use AI to model them better. As we pointed out there, Fractals were a recurring theme in many models for chaotic systems. So I figured it would make sense to do a deeper look into Fractals. As you will see in this article, Fractals have 3 properties that make them immensely appealing to any engineer and researcher:

  1. They show up in a lot of places: Fractals have a very interesting tendency to show up in places where you don’t expect them. We will cover some diverse use cases/fields where we can spot fractals.
  2. Their Math is Super Special: Fractals are created by dancing at the edge of order and chaos. And I’m not being poetic here (this is literally how we create fractals). That gives them some very interesting properties that we don’t see with a lot of other systems that we study.
  3. They are Beautiful: There is an aesthetic to Fractals that you would not find anywhere else. It can be a lot of fun playing around with different functions to create the coolest fractal. Too many people never engage deeply with higher-level math b/c schools are horrible at teaching the introductory stuff, but such hands-on experiences can be great for sparking the curiosity to look deeper.

Let’s look into these points in more detail.
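If you want to play at that edge of order and chaos yourself, here is a minimal escape-time rendering of the Mandelbrot set (one classic fractal). The resolution, iteration cap, and colormap are arbitrary choices; it assumes numpy and matplotlib are installed:

    # Escape-time rendering of the Mandelbrot set: iterate z -> z^2 + c
    # and record how quickly each grid point escapes.
    import numpy as np
    import matplotlib.pyplot as plt

    width, height, max_iter = 800, 600, 80
    re = np.linspace(-2.5, 1.0, width)
    im = np.linspace(-1.25, 1.25, height)
    c = re[np.newaxis, :] + 1j * im[:, np.newaxis]  # grid of candidate points

    z = np.zeros_like(c)
    escape = np.full(c.shape, max_iter)
    for i in range(max_iter):
        mask = np.abs(z) <= 2.0                     # points still bounded
        z[mask] = z[mask] ** 2 + c[mask]
        escape[(np.abs(z) > 2.0) & (escape == max_iter)] = i

    plt.imshow(escape, cmap="twilight", extent=(-2.5, 1.0, -1.25, 1.25))
    plt.title("Mandelbrot set (escape time)")
    plt.show()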

How Many Lemons to Melt the Eiffel Tower?

Answering one of humanity’s most important questions.

If you liked this article and wish to share it, please refer to the following guidelines.

I put a lot of effort into creating work that is informative, useful, and independent from undue influence. If you’d like to support my writing, please consider becoming a paid subscriber to this newsletter. Doing so helps me put more effort into writing/research, reach more people, and supports my crippling chocolate milk addiction. Help me democratize the most important ideas in AI Research and Engineering to over 100K readers weekly.

Help me buy chocolate milk

PS- We follow a “pay what you can” model, which allows you to support within your means. Check out this post for more details and to find a plan that works for you.

I regularly share mini-updates on what I read on the Microblogging sites X (https://twitter.com/Machine01776819), Threads (https://www.threads.net/@iseethings404), and TikTok (https://www.tiktok.com/@devansh_ai_made_simple)- so follow me there if you’re interested in keeping up with my learnings.

Reach out to me

Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.

Small Snippets about Tech, AI and Machine Learning over here

AI Newsletter- https://artificialintelligencemadesimple.substack.com/

My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/

Check out my other articles on Medium: https://rb.gy/zn1aiu

My YouTube: https://rb.gy/88iwdd

Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y

My Instagram: https://rb.gy/gmvuy9

My Twitter: https://twitter.com/Machine01776819
