What you should know in AI, Software, Business, and Tech- 3/4/2025

Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more

17 min read4 days ago

A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI Papers/Publications, interesting books, videos, etc I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and I remembered throughout the week). These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 3/4/2025. If you missed last time’s readings, you can find it here.

Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf. Lastly, if you’d like to get involved in our many fun discussions, you should join the Substack Group Chat Over here.

Community Spotlight: Jean David Ruvini

Jean David Ruvini is one of the most prominent experts in Natural Language Processing and LLMs. He’s been a senior AI Leader at both EBay and Meta, leading very high impact projects. JD regularly shares his insights/notes on LLM related research papers on LinkedIn, and they’re worth following for anyone technical (I’ve shared his work several times and I save every edition to study). If you’re interested in the eCommerce space, JD has also founded WiseCues, which helps Shopify store owners by helping your customers find the best products for them by blending search and chatbots into one platform. If you’re looking to invest or want or try the product out, send JD a message.

If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments/by reaching out to me. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.

Previews

Curious about what articles I’m working on? Here are the previews for the next planned articles-

Tech Made Simple

Google’s Employee Productivity

AI Made Simple-

How to Nurture AI Ecosystems for Startups (I need to work out the title).

I provide various consulting and advisory services. If you‘d like to explore how we can work together, reach out to me through any of my socials over here or reply to this email.

Highly Recommended

These are pieces that I feel are particularly well done or important. If you don’t have much time, make sure you at least catch these works.

How much progress have we made on climate change?

Some very positive Climate news here. That makes me very happy. However, this is a field with a lot of fraud and a lot of greenwashing, where organizations fake their positive climate impacts, so I’m I don’t know how real a lot of this is. That being said, I’ll choose to optimistic, and even if this stuff is partially true- I think it’s worth celebrating.

How AI is Optimizing Venture Capital Investments & Operations 🧠

I want to eventually get into Venture Capital (specifically preseed and seed AI enabled Deep Tech) so I’ve been studying the market a bit. Luis Llorens Gonzalez has been an incredibly helpful resource.

Summary

❇️ Current State of AI Adoption in VC Firms

❇️ Build vs Buy AI Tools

❇️ Human-AI Collaboration

❇️ The Future of AI in VC

❇️ Most Popular AI Tools in VC

❇️ Top 3 GPTs for VCs

This post delves into how VCs are leveraging AI across various aspects of our workflow, the challenges we face, and the next frontier of AI-driven investment strategies.

I recently had the privilege of joining a roundtable at the World AI Cannes Festival to discuss how AI is transforming VC Workflows.

Luis (me) — Plug and Play, Alexis — Elaia, Joy — Nexterra, Tey — Mckinsey & Company

As early-stage VC investors, we thrive on identifying the next big opportunity. However, the process is anything but simple — keeping up with rapidly evolving trends and spotting promising entrepreneurs before they even update their LinkedIn.

AI is now at the core of optimizing VC decision-making, from deal sourcing to portfolio management and internal productivity.

Learning Molecular Representation in a Cell

I have to study the math/Computer Science a bit more to see how deeply I want to be hyped about this, but this is very promising at first glance. S/o to Srijit Seal and the rest of the people that wrote this, and credit to Luke Yun for the find.

Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignment (InfoAlign) approach to learn molecular representations through the information bottleneck method in cells. We integrate molecules and cellular response data as nodes into a context graph, connecting them with weighted edges based on chemical, biological, and computational criteria. For each molecule in a training batch, InfoAlign optimizes the encoder’s latent representation with a minimality objective to discard redundant structural information. A sufficiency objective decodes the representation to align with different feature spaces from the molecule’s neighborhood in the context graph. We demonstrate that the proposed sufficiency objective for alignment is tighter than existing encoder-based contrastive methods. Empirically, we validate representations from InfoAlign in two downstream tasks: molecular property prediction against up to 19 baseline methods across four datasets, plus zero-shot molecule-morphology matching.

The Great AI Hallucination Lie: Why Legal Tech Keeps Failing You

A quick writeup we did (with very cool visuals) on what Legal AI is getting wrong about Hallucinations (grouping two very different kinds of errors into one umbrella + only addressing a narrow very set of problems to be able to market their 99% accuracy) .

By now, we’ve all seen it: every few months, a new legal tech tool bursts onto the scene claiming it’s “completely error-free” and has solved hallucinations once and for all — sometimes propped up by a sponsored study at a top institution.

Then you try it and find it’s just another overpriced GPT wrapper. Why does this keep happening, and what can we actually do about it?

Today, we’d like to explain how Iqidis views hallucinations and, more importantly, how we’re tackling them. While hallucinations are a major challenge for any legal AI tool, we believe our unique approach provides the strongest foundation for overcoming them.

The way we see it, hallucinations are just an AI specific term for errors. From a legal perspective this falls into two categories:

Fake citations and/or references, and
Bad analysis built on top of real citations.

Learning Pokemon with Reinforcement Learning

Personally I feel like Pokemon Games are pretty overrated (that’s also how I feel about the whole franchise), but there is a weirdly high level of top-level research that gets done through Pokemon games. The Beyblade community really needs to pick up their slack here and drop some bangers so I have an excuse to play Beyblades at work for research purposes (gun to your head- Pokemon or Beyblade?)

Hi! Since 2020, we’ve been developing a reinforcement learning (RL) agent to beat the 1996 game Pokémon Red. As of February 2025, we are able to beat Pokémon Red with Reinforcement Learning using a <10 million parameter policy (60500x smaller than DeepSeekV3) and with minimal simplifications. The output is not a policy capable of beating Pokémon, but a technique for producing solutions to Pokémon. This website describes the system’s current state. All code is open sourced and available for you, the reader, to try.

A Hybrid Commercial with Real and AI generated video

This is a very high-quality commercial shot by Rufus Blackwell. It is technically impressive, but looking at it made me think about the future impact of AI on the Labor Markets. I interact with AI in three major capacities- writing, AI Research, and, Programming. All 3 fields use AI aggressively and based on my observations here is what I’m noticing- AI seems to be increasing work not reducing it. Since more things become possible with AI, the quality expected also rises. For example, increased velocity through code assistant leads to more devs and managers thinking they can do more within the same budget (this is sometimes true and other times overconfidence which creates problems). So the difficulty and scope of tickets increases.

Over the long term, this might mean you need fewer people doing things, but the people left might end up working the same or even more hours. Essentially, work gets concentrated into fewer and fewer people and other people do something else. Not too different from how technology increases division of labor I guess. But guess the utopia of AI working while I do nothing is still far away #sedLife.

Also, I found this one on Threads (over here). The research scene there is still not as big as Twitter, but a lot of interesting discussions are starting to happen on it, especially in the application side. So I would definitely sign up and monitor this as a place of market research. Come say hi if you join.

FNet: Mixing Tokens with Fourier Transforms

My goat Manny Ko shared this gem. This can be game-changing for Encoder Architectures, especially at edge and in long context retrieval. It’s also a Google publication, which is noteworthy (usually this work is pushed by smaller, less established labs). I’m guessing Google might be looking to speed up their search right now.

We show that Transformer encoder architectures can be sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that “mix” input tokens. These linear mixers, along with standard nonlinearities in feed-forward layers, prove competent at modeling semantic relationships in several text classification tasks. Most surprisingly, we find that replacing the self-attention sublayer in a Transformer encoder with a standard, unparameterized Fourier Transform achieves 92–97% of the accuracy of BERT counterparts on the GLUE benchmark, but trains 80% faster on GPUs and 70% faster on TPUs at standard 512 input lengths. At longer input lengths, our FNet model is significantly faster: when compared to the “efficient” Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finally, FNet has a light memory footprint and is particularly efficient at smaller model sizes; for a fixed speed and accuracy budget, small FNet models outperform Transformer counterparts.

TechBio and Microbiome (I)

I don’t understand how Marina T Alamanou, PhD constantly has time to monitor every little thing in the TechBio space. Seriously, she does a great job informing the reader on what’s happening and why it matters, so if you have an interest in this space, then she’s the ultimate one stop shop for information.

How To Linearize The Attention Mechanism!

Damien Benveniste, PhD is another phenomenal contributor to the sci-comms space in cutting edge research. Highly recommend.

Today, we talk about how to engineer attention mechanisms in O(n) complexity instead of O(n2). This newsletter tends to be a bit more math-flavored than my usual content, but it is liberating to be able to use math for the greater good!

Low-Rank Projection of Attention Matrices: Linformer

Recurrent Attention Equivalence: The Linear Transformer

Kernel Approximation: Performer

Self-attention’s quadratic complexity in sequence length has long been a central bottleneck for large-scale Transformer models. Handling tens of thousands of tokens becomes computationally prohibitive and can quickly exhaust available memory. Linear attention mechanisms represent a paradigm shift in transformer architecture by mathematically re-engineering the attention operation to achieve O(n) complexity while maintaining global context awareness. Unlike sparse attention’s pattern restrictions, which preserve quadratic complexity but limit interactions to predefined token subsets, linear attention fundamentally redefines how all tokens interact by reformulating the attention matrix computation rather than pruning token interactions. Where sparse attention sacrifices theoretical completeness for practical speed, linear attention preserves global relationships at the cost of approximating pairwise token influences. This enables native handling of extreme sequence lengths (1M+ tokens) while avoiding sparse attention’s blind spots.

How to Get Small Business Ideas and Evaluate Them: Part 2

I expect Rubén Domínguez Ibar and Chris Tottman to spit game, but this is honestly one of the biggest value bombs on entrepreneuships on the internet. So much good stuff here. Just look at the table of contents. I’ve almost never been jealous of another writer’s work, but this is one of the few times I’ve wished I had the ability to write something like this.

In the previous post, we talked about how to generate small business ideas, from tapping into your passions and skills to brainstorming with frameworks like mind mapping and the SCAMPER technique. But coming up with an idea is only half the battle. The real challenge lies in testing whether that idea is viable.

That’s where this article comes in. This is Part 2 of our journey where we dive deeper into the crucial process of validating your business idea. Whether your concept was born out of frustration, creativity, or careful research, it’s time to put it to the test. Validation ensures your idea isn’t just exciting in theory but practical in execution. It’s how you’ll gather insights, refine your concept, and make informed decisions before investing time and resources.

Let’s pick up where we left off and turn that promising idea into a solid foundation for your business.

Table of Contents

1. Market Research: Understanding Demand and Competition

Identifying Your Target Market

Analyzing Industry Trends and Growth Potential

Studying Competitor Strengths and Weaknesses

Tools for Conducting Effective Market Research

2. Customer Discovery: Talking to Real People

Why Customer Feedback is Essential

How to Conduct Informational Interviews

Crafting the Right Questions to Uncover Insights

Where to Find Potential Customers for Testing

3. Building a Minimum Viable Product (MVP)

What an MVP Is (and Isn’t)

Different Types of MVPs: Landing Pages, Prototypes, Service Pilots

How to Build an MVP with Minimal Investment

Validating Demand Before Scaling Up

4. Pricing and Profitability Testing

How to Test If People Will Actually Pay

Setting Your Initial Price: Strategies and Psychology

Running Pre-Sales, Waitlists, and Beta Tests

Understanding Profit Margins and Costs from Day One

5. Using AI and Online Tools for Validation

How AI Can Help Analyze Market Demand

Using AI for Customer Surveys and Data Analysis

Online Tools for Rapid Prototyping and Feedback Collection

6. Testing Marketing Channels and Sales Strategies

Creating a Simple Landing Page to Gauge Interest

Running Small Paid Ads to Measure Click-Through Rates

Leveraging Social Media and Communities for Organic Feedback

Pre-Orders and Crowdfunding as Validation Tactics

7. Pivot or Proceed: Making Data-Driven Decisions

How to Analyze Your Validation Results

Signs Your Idea Needs Tweaking (or Scrapping)

When to Move Forward and Start Scaling

Next Steps: Preparing to Launch Your Business

Hypocritical AI: The Fast Track to Becoming a Made Man in the Healthcare Mafia [Part 1 of 2]

Top tier work by Sergei Polevikov . He really shines when going for blood and exposing all the hyprocisy and profiteering in healthcare. The following is how the article starts, and it gets more and more peak from there. Another rare time I was jealous of a writer and their ability with the pen and the strength of their thought.

If you’re at the VIVE conference in Nashville on Tuesday, February 18, don’t miss the 3 p.m. panel with Daryl Tol of General Catalyst and Munjal Shah of Hippocratic AI. Expect plenty of spin about how Hippocratic AI will somehow “grow” into its 205.0 revenue multiple and how it’s the “first ever” AI healthcare agent.

If you’re in the audience, here are two questions worth asking Munjal (I wish I was there):

1️⃣ If the Hippocratic AI co-founders and executive team truly believe in the company’s mission, why have they offloaded millions of dollars’ worth of shares in the secondary market?

In one example, during the Series A funding round on March 18, 2024 — less than a year after launch — the founders cashed out in the secondary accompanying the round. Have you ever heard of anything like that? Imagine how demoralizing that must have been for the employees. Not even 12 months in, and the founders were already sending the signal: “We have no clue how this is going to play out — but hey, money is money.” Blows my mind. 🤯

2️⃣ If your “AI agents” are so capable, then why does it still take an army of thousands — yes, thousands — of outsourced human nurses (from a company called OpenLoop) to monitor every single AI interaction?

My readers know I’ve spent the past two years investigating digital health companies — not just to learn from their mistakes, but to expose the perpetrators. The list is long: Babylon, VillageMD, Walgreens, Oak Street (this one started as a LinkedIn post but set off a firestorm in the comments), Olive AI, Teladoc, Livongo, Amwell, Summa Health, Cigna, Oscar Health, Clover Health, Optum’s Telehealth, Optum Ventures, Walmart Health, Cue Health, Truepill, Pieces Technologies, IBM Watson Health, Epic, Suki, Infermedica, Sniffle, Isabel, MayaMD, and Klick Labs.

But nothing prepared me for what I uncovered about Hippocratic AI.

Other Content

Controlled automatic task-specific synthetic data generation for hallucination detection

We present a novel approach to automatically generate task-specific synthetic datasets for hallucination detection. Our approach features a two-step generation-selection pipeline, where the generation step integrates a hallucination pattern guidance module and a language style alignment module. Hallucination pattern guidance makes it possible to curate synthetic datasets covering the most important hallucination patterns specific to target applications. Language style alignment improves the dataset quality by aligning the style of the synthetic dataset with benchmark text. To obtain robust supervised detectors from synthetic datasets, we also propose a data mixture strategy to improve performance robustness and model generalization. Our supervised hallucination detectors trained on synthetic datasets outperform in-context-learning (ICL)-based detectors by a large margin. Our extensive experiments confirm the benefits of our two-staged generation pipeline with cross-task and cross-hallucination pattern generalization. Our data-mixture-based training further improves generalization and the robustness of hallucination detection.

How Sevilla FC is discovering future soccer stars with Llama

Cool stuff. Not complex or vow-y, but cool to see.

Sevilla Fútbol Club (Sevilla FC), the seven-time Europa League champions, has long been a vanguard of innovation in professional sports. Whether it’s match analysis, player performance, or fan marketing, the club’s data department has pioneered machine learning and AI-powered tools to enhance performance both on and off the field.

Despite these advancements, one challenge remained — the team needed a way to efficiently analyze and leverage unstructured data from the more than 300,000 scouting reports in its database. To solve this, Sevilla FC’s data department partnered with IBM to create Scout Advisor — a generative AI-driven scouting tool designed and built on watsonx, with Llama 3.1 70B Instruct. IBM’s watsonx and Llama enables Sevilla FC to bridge the gap between traditional human-centric and data-driven scouting in the identification and characterization of potential recruits.

“Our in-house tools excelled at identifying and characterizing players based on structured numerical and categorical data, but they fell short with unstructured data — an invaluable scouting resource that encapsulates the human expert opinions that are crucial for comprehensive player evaluations,” says Elías Zamora, Chief Data Officer at Sevilla FC.

Google’s principles for measuring developer productivity

Abi Noda does fantastic looks into developer productivity. Top 5 must eads for Engineering Leaders.

This week I read Measuring Productivity: All Models Are Wrong, But Some Are Useful by Google researchers Ciera Jaspan and Collin Green. This paper gives an inside look at how Google approaches developer productivity measurement in a way that’s useful and not problematic. The lessons they share may serve as a guide for leaders to evaluate the metrics and frameworks their teams are using.

My summary of the paper

Models are used to explain, describe, or predict the world, and are ideally made as simple as possible. (Simpler models are easier to understand and explain.) However, this simplification comes at a cost: we have to decide what to include and what to leave out in any model, and bad models will omit important details that undermine their utility.

“When you construct a model you leave out all the details which you, with all the knowledge at your disposal, consider inessential… Models should not be true, but it is important that they are applicable.” — George Box, British statistician, 1976

Measuring engineering productivity is fundamentally an exercise in model building. It requires selecting, mapping, and validating relationships between inputs and outputs. It also requires other careful considerations in order for the model to be both useful and not problematic. Here, the authors share the principles they’ve developed over time that shape how they measure productivity today.

How to Hack AI Agents and Applications

I was expecting a bit more b/c this was shared by someone I consider smart, but this is a fantastic intro-ish guide to the common kinds of attacks for AI Security testing. Good if you want to get into the field, not super useful if you know the space better. For example, I’m probably around a high level beginner (Early-Mid BJJ Blue Belt) in this niche of AI and nothing here came as a complete surprise to me/was amongst things that I hadn’t come across before. This guide will be great for white belts, okay for blue belts, but stops being useful as you rank up (although I might also not have the requisite knowledge to appreciate the full genius of this post)

I often get asked how to hack AI applications. There hadn’t been a single full guide that I could reference until now.

The following is my attempt to make the best and most comprehensive guide to hacking AI applications. It’s quite large, but if you take the time to go through it all, you will be extremely well prepared.

And AI is a loaded term these days. It’s can mean many things. For this guide, I’m talking about applications that use Language Models as a feature.

Overview

This is the path to become an AI hacker:

Understand current AI models

Get comfortable using and steering them

Study the different AI attack scenarios

Below is the table of contents for the guide. Jump to the section that’s most relevant to you. If you’ve used AI for a few months, but haven’t messed with jailbreaks, go to section two. If you’ve already messed with jailbreaks a lot, jump to step three. The first two sections are mostly guidance and links, where as the third section is the bulk of the novel content because it’s the content that didn’t really exist yet.

Table of Contents

Overview

Table of Contents

1. Understand Current AI Models

2. Get Comfortable Using LLMs

System Prompts

Retrieval-Augmented Generation (RAG)

Jailbreaking

3. AI Attack Scenarios

Understanding Prompt Injection

AI App Responsibility Model

Attack Scenarios

Traditional Vulnerabilities Triggered by Prompt Injection

Prompt Injection Vulnerability Examples

Other AI Security Vulnerabilities

AI Trust and Safety Flaws

Multimodal Prompt Injection Examples

Invisible Prompt Injection Examples

Mitigations For Prompt Injection

AI Hacking Methodology Overview

1. Identify Data Sources

2. Find Sinks (Data Exfiltration Paths)

3. Exploit Traditional Web Vulnerabilities

4. Exploit AI Security and Multi-modal Vulnerabilities

Bug Bounty Tips for AI-Related Vulnerabilities

Exploring Markdown-to-HTML Conversion Vulnerabilities

Appreciation and Staying Informed

If you liked this article and wish to share it, please refer to the following guidelines.

I put a lot of effort into creating work that is informative, useful, and independent from undue influence. If you’d like to support my writing, please consider becoming a paid subscriber to this newsletter. Doing so helps me put more effort into writing/research, reach more people, and supports my crippling chocolate milk addiction. Help me democratize the most important ideas in AI Research and Engineering to over 100K readers weekly. You can use the following for an email template.

Help me buy chocolate milk

PS- We follow a “pay what you can” model, which allows you to support within your means, and support my mission of providing high-quality technical education to everyone for less than the price of a cup of coffee. Check out this post for more details and to find a plan that works for you.