Interesting Content in AI, Software, Business, and Tech- 09/04/2024 [Updates]
Content to help you keep up with Machine Learning, Deep Learning, Data Science, Software Engineering, Finance, Business, and more
A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI papers/publications, interesting books, videos, etc. I come across each week. Some will be technical, others not really. I will add whatever content I found really informative (and remembered) throughout the week. These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 09/04/2024. If you missed last week’s readings, you can find them here.
Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf. Lastly, if you’d like to get involved in our many fun discussions, you should join the Substack Group Chat Over here.
Community Spotlight: Artem Kirsanov
Artem Kirsanov produces high-quality videos on computational neuroscience and AI (YT channel here). He doesn’t cover the usual topics you’d expect from an AI channel, which is a good thing: his channel offers a lot of very new ideas/perspectives for more traditional Machine Learning people like us. If you’re looking for exposure to ideas that extend beyond the usual ML fare, his channel will be a good source of inspiration.
If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me directly. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.
Previews
Curious about what articles I’m working on? Here are the previews for the next planned articles-
The Economics of ESports
The Economics of Open Source
Highly Recommended
These are pieces that I feel are particularly well done. If you don’t have much time, make sure you at least catch these works.
I need to read this paper more closely to fully analyze it, but the premise makes sense, and it tackles a very important challenge with Cosine Similarity. Definitely going to be keeping my eye on this, and I would ask you guys to chip in with insights/experiments on it.
The advancement in computational power and hardware efficiency enabled the tackling of increasingly complex and high-dimensional problems. While artificial intelligence (AI) achieved remarkable results, the interpretability of high-dimensional solutions remains challenging. A critical issue is the comparison of multidimensional quantities, which is essential in techniques like Principal Component Analysis (PCA), or k-means clustering. Common metrics such as cosine similarity, Euclidean distance, and Manhattan distance are often used for such comparisons — for example in muscular synergies of the human motor control system. However, their applicability and interpretability diminish as dimensionality increases. This paper provides a comprehensive analysis of the effects of dimensionality on these metrics. Our results reveal significant limitations of cosine similarity, particularly its dependency on the dimensionality of the vectors, leading to biased and less interpretable outcomes. To address this, we introduce the Dimension Insensitive Euclidean Metric (DIEM) which demonstrates superior robustness and generalizability across dimensions. DIEM maintains consistent variability and eliminates the biases observed in traditional metrics, making it a reliable tool for high-dimensional comparisons. This novel metric has the potential to replace cosine similarity, providing a more accurate and insightful method to analyze multidimensional data in fields ranging from neuromotor control to machine and deep learning.
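To make the dimensionality dependence concrete, here is a minimal sketch (not the paper’s DIEM metric, whose formula the abstract doesn’t spell out) showing how cosine similarity between random vectors concentrates around zero as dimension grows, which is the kind of bias the authors point at:

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_stats(dim: int, n_pairs: int = 2000) -> tuple[float, float]:
    """Mean and std of cosine similarity between pairs of random vectors."""
    a = rng.uniform(-1, 1, size=(n_pairs, dim))
    b = rng.uniform(-1, 1, size=(n_pairs, dim))
    cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return cos.mean(), cos.std()

for dim in (2, 10, 100, 1000, 10000):
    mean, std = cosine_stats(dim)
    print(f"dim={dim:>6}  mean={mean:+.4f}  std={std:.4f}")
```

The spread of the similarity values collapses roughly like 1/sqrt(d), so in very high dimensions small differences in cosine similarity stop being meaningful.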
Does Writing with Language Models Reduce Content Diversity?
You’re about to notice a theme with some of the papers selected. That’s because I’m researching something in particular.
Large language models (LLMs) have led to a surge in collaborative writing with model assistance. As different users incorporate suggestions from the same model, there is a risk of decreased diversity in the produced content, potentially limiting diverse perspectives in public discourse. In this work, we measure the impact of co-writing on diversity via a controlled experiment, where users write argumentative essays in three setups — using a base LLM (GPT3), a feedback-tuned LLM (InstructGPT), and writing without model help. We develop a set of diversity metrics and find that writing with InstructGPT (but not the GPT3) results in a statistically significant reduction in diversity. Specifically, it increases the similarity between the writings of different authors and reduces the overall lexical and content diversity. We additionally find that this effect is mainly attributable to InstructGPT contributing less diverse text to co-written essays. In contrast, the user-contributed text remains unaffected by model collaboration. This suggests that the recent improvement in generation quality from adapting models to human feedback might come at the cost of more homogeneous and less diverse content.
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores
The diversity across outputs generated by large language models shapes the perception of their quality and utility. Prompt leaks, templated answer structure, and canned responses across different interactions are readily noticed by people, but there is no standard score to measure this aspect of model behavior. In this work we empirically investigate diversity scores on English texts. We find that computationally efficient compression algorithms capture information similar to what is measured by slow to compute n-gram overlap homogeneity scores. Further, a combination of measures — compression ratios, self-repetition of long n-grams and Self-BLEU and BERTScore — are sufficient to report, as they have low mutual correlation with each other. The applicability of scores extends beyond analysis of generative models; for example, we highlight applications on instruction-tuning datasets and human-produced texts. We release a diversity score package to facilitate research and invite consistency across reports.
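As a quick illustration of the compression-ratio idea from the abstract, here is a minimal sketch (not the authors’ released package) that uses zlib to score a batch of outputs; more redundant/templated text compresses better and therefore scores lower:

```python
import zlib

def compression_ratio(texts: list[str]) -> float:
    """Compressed size / original size for the concatenated outputs.
    Lower values mean more redundancy, i.e. less diverse text."""
    blob = "\n".join(texts).encode("utf-8")
    return len(zlib.compress(blob, level=9)) / len(blob)

repetitive = ["The answer to your question is as follows."] * 50
varied = [f"Response {i}: a distinct thought about topic {i % 7}." for i in range(50)]

print(f"repetitive outputs: {compression_ratio(repetitive):.3f}")
print(f"varied outputs:     {compression_ratio(varied):.3f}")
```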
The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text
This study investigates the consequences of training language models on synthetic data generated by their predecessors, an increasingly prevalent practice given the prominence of powerful generative models. Diverging from the usual emphasis on performance metrics, we focus on the impact of this training methodology on linguistic diversity, especially when conducted recursively over time. To assess this, we adapt and develop a set of novel metrics targeting lexical, syntactic, and semantic diversity, applying them in recursive finetuning experiments across various natural language generation tasks in English. Our findings reveal a consistent decrease in the diversity of the model outputs through successive iterations, especially remarkable for tasks demanding high levels of creativity. This trend underscores the potential risks of training language models on synthetic text, particularly concerning the preservation of linguistic richness. Our study highlights the need for careful consideration of the long-term effects of such training approaches on the linguistic capabilities of language models.
Aurora: A Foundation Model of the Atmosphere
Ohh look, amazing non-LLM work done by Big Tech. It’s sad that there’s no effort put into promoting this development, even though it could be life-saving.
Deep learning foundation models are revolutionizing many facets of science by leveraging vast amounts of data to learn general-purpose representations that can be adapted to tackle diverse downstream tasks. Foundation models hold the promise to also transform our ability to model our planet and its subsystems by exploiting the vast expanse of Earth system data. Here we introduce Aurora, a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data. Aurora leverages the strengths of the foundation modelling approach to produce operational forecasts for a wide variety of atmospheric prediction problems, including those with limited training data, heterogeneous variables, and extreme events. In under a minute, Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts that outperform state-of-the-art classical simulation tools and the best specialized deep learning models. Taken together, these results indicate that foundation models can transform environmental forecasting.
The New Math of How Large-Scale Order Emerges
I forget who shared this with me (please take credit in the comments), but this is very cool. The article is an introduction to the “Software in the natural world: A computational approach to hierarchical emergence” paper. The abstract is copied below-
Understanding the functional architecture of complex systems is crucial to illuminate their inner workings and enable effective methods for their prediction and control. Recent advances have introduced tools to characterise emergent macroscopic levels; however, while these approaches are successful in identifying when emergence takes place, they are limited in the extent they can determine how it does. Here we address this limitation by developing a computational approach to emergence, which characterises macroscopic processes in terms of their computational capabilities. Concretely, we articulate a view on emergence based on how software works, which is rooted on a mathematical formalism that articulates how macroscopic processes can express self-contained informational, interventional, and computational properties. This framework establishes a hierarchy of nested self-contained processes that determines what computations take place at what level, which in turn delineates the functional architecture of a complex system. This approach is illustrated on paradigmatic models from the statistical physics and computational neuroscience literature, which are shown to exhibit macroscopic processes that are akin to software in human-engineered systems. Overall, this framework enables a deeper understanding of the multi-level structure of complex systems, revealing specific ways in which they can be efficiently simulated, predicted, and controlled.
The authors (and many people) expressed surprise that imposing formatting restrictions on LLMs leads to performance degradation. Personally, I would think this was obvious, given that an LLM only has a finite budget of resources it can invest into its outputs; quality and constraints are opposing forces competing for that budget. This is why it’s best to split quality and formatting into different modules: have the generator produce free-form output, then a simple tuned rewriter that converts it into a specific format if structure really matters (a minimal sketch of this split follows the abstract below). In principle, this is no different from separation of concerns, which is a well-understood software development principle. More interesting are this paper’s implications for AI Safety- which we will discuss when I feel like writing on it.
Structured generation, the process of producing content in standardized formats like JSON and XML, is widely utilized in real-world applications to extract key output information from large language models (LLMs). This study investigates whether such constraints on generation space impact LLMs’ abilities, including reasoning and domain knowledge comprehension. Specifically, we evaluate LLMs’ performance when restricted to adhere to structured formats versus generating free-form responses across various common tasks. Surprisingly, we observe a significant decline in LLMs’ reasoning abilities under format restrictions. Furthermore, we find that stricter format constraints generally lead to greater performance degradation in reasoning tasks.
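Here is the two-module split mentioned above as a minimal sketch, assuming a hypothetical call_llm wrapper around whatever model API you use: the first call reasons in free form with no format constraints, and a second call only reshapes the finished answer into JSON.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever LLM API you use."""
    raise NotImplementedError

def answer_then_format(question: str) -> dict:
    # Step 1: let the model reason in free form, with no format constraints.
    reasoning = call_llm(f"Think step by step and answer:\n{question}")
    # Step 2: a separate call only reshapes the finished answer into JSON.
    formatted = call_llm(
        "Convert the following answer into JSON with keys "
        f'"answer" and "justification". Output JSON only.\n\n{reasoning}'
    )
    return json.loads(formatted)
```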
New LLM Pre-training and Post-training Paradigms
Another masterpiece by Sebastian Raschka, PhD. One of the best resources for cutting-edge NLP out there-
There are hundreds of LLM papers each month proposing new techniques and approaches. However, one of the best ways to see what actually works well in practice is to look at the pre-training and post-training pipelines of the most recent state-of-the-art models. Luckily, four major new LLMs have been released in the last months, accompanied by relatively detailed technical reports.
In this article, I focus on the pre-training and post-training pipelines of the following models:
- Alibaba’s Qwen 2
- Apple Intelligence Foundation Language Models
- Google’s Gemma 2
- Meta AI’s Llama 3.1
These models are presented in order based on the publication dates of their respective technical papers on arXiv.org, which also happens to align with their alphabetical order.
What P vs NP is actually about
Some very high-level computer science insights here. I learned so much from this video.
What if we could run algorithms backwards? We discuss how we could do this by turning algorithms into circuits and encoding those into satisfiability problems. We then explain how it all connects to P vs NP.
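As a toy illustration of the premise (not the video’s actual construction), “running an algorithm backwards” amounts to asking which inputs produce a given output. For a small boolean circuit you can enumerate them directly; a SAT solver answers the same question without brute force:

```python
from itertools import product

def circuit(x1: bool, x2: bool, x3: bool) -> bool:
    """A toy boolean circuit: (x1 AND x2) OR (NOT x3)."""
    return (x1 and x2) or (not x3)

def run_backwards(target: bool) -> list[tuple[bool, bool, bool]]:
    """Find every input that makes the circuit output `target`."""
    return [bits for bits in product([False, True], repeat=3) if circuit(*bits) == target]

print(run_backwards(True))
```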
Meta-Learning through Hebbian Plasticity in Random Networks (Paper Explained)
I see the word Hebbian a lot, so I’ve decided to learn more about it. I saw this video by Yannic Kilcher, which serves as a great first introduction.
Reinforcement Learning is a powerful tool, but it lacks biological plausibility because it learns a fixed policy network. Animals use neuroplasticity to reconfigure their policies on the fly and quickly adapt to new situations. This paper uses Hebbian Learning, a biologically inspired technique, to have agents adapt random networks to high-performing solutions as an episode is progressing, leading to agents that can reconfigure themselves in response to new observations.
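For anyone else new to the term, here is a minimal sketch of the classic Hebbian update (“neurons that fire together wire together”): weights are adjusted during the episode in proportion to the product of pre- and post-synaptic activity, rather than being frozen after training. The paper meta-learns a richer, per-connection parameterization of this rule, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random, untrained weights: the "policy" starts out meaningless.
w = rng.normal(scale=0.1, size=(4, 3))   # 4 inputs -> 3 outputs
eta = 0.01                                # plasticity rate

def step(observation: np.ndarray) -> np.ndarray:
    """One forward pass followed by a Hebbian weight update."""
    global w
    pre = observation
    post = np.tanh(pre @ w)
    # Hebb's rule: connections between co-active units are strengthened.
    w += eta * np.outer(pre, post)
    return post

for _ in range(5):
    obs = rng.normal(size=4)
    action = step(obs)
```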
I put a lot of work into writing this newsletter. To do so, I rely on you for support. If a few more people choose to become paid subscribers, the Chocolate Milk Cult can continue to provide high-quality and accessible education and opportunities to anyone who needs it. If you think this mission is worth contributing to, please consider a premium subscription. You can do so for less than the cost of a Netflix Subscription (pay what you want here).
Many companies have a learning budget, and you can expense your subscription through that budget. You can use the following for an email template.
I provide various consulting and advisory services. If you‘d like to explore how we can work together, reach out to me through any of my socials over here or reply to this email.
Other Good Content
Stealing Part of a Production LLM | API protects LLMs no more
How is it possible to steal part of LLMs protected behind an API? 🥷 We explain both papers that made a breakthrough on this, one from Carlini et al. (Google), and the other from Finlayson et al. (USC); see references below.
EP125: How does Garbage Collection work?
An interesting collection of System Design tidbits by the legend, Alex Xu
This week’s system design refresher:
- Linux Performance Tools! (Youtube video)
- How does Garbage Collection work?
- A Cheat Sheet for Designing Fault-Tolerant Systems
- 10 System Design Tradeoffs You Cannot Ignore
3 steps to align AI with the ancient philosophy of human flourishing | Brendan McCord
The content of the video is aggressively mid-tier (and that’s if I’m being nice), but the premise and setup are interesting to think about. Also, +1 for the call-out of both doomers and hype-bros.
GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications
This is a paper I’ve shared before, which is why it’s not in the highly-recommended section. I’ve been studying agents, and this is a great case study on them.
Large Language Models (LLMs) are evolving beyond their classical role of providing information within dialogue systems to actively engaging with tools and performing actions on real-world applications and services. Today, humans verify the correctness and appropriateness of the LLM-generated outputs (e.g., code, functions, or actions) before putting them into real-world execution. This poses significant challenges as code comprehension is well known to be notoriously difficult. In this paper, we study how humans can efficiently collaborate with, delegate to, and supervise autonomous LLMs in the future. We argue that in many cases, “post-facto validation” — verifying the correctness of a proposed action after seeing the output — is much easier than the aforementioned “pre-facto validation” setting. The core concept behind enabling a post-facto validation system is the integration of an intuitive undo feature, and establishing a damage confinement for the LLM-generated actions as effective strategies to mitigate the associated risks. Using this, a human can now either revert the effect of an LLM-generated output or be confident that the potential risk is bounded. We believe this is critical to unlock the potential for LLM agents to interact with applications and services with limited (post-facto) human involvement. We describe the design and implementation of our open-source runtime for executing LLM actions, Gorilla Execution Engine (GoEX), and present open research questions towards realizing the goal of LLMs and applications interacting with each other with minimal human supervision. We release GoEX at this https URL.
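Here is a minimal sketch of the post-facto validation idea (my own illustration, not GoEX’s actual API): every LLM-proposed action is executed alongside an undo callback, so a human can inspect the result afterwards and revert it if needed.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class PostFactoExecutor:
    """Runs LLM-proposed actions, keeping an undo stack so a human
    can revert anything after inspecting the result."""
    undo_stack: list[Callable[[], None]] = field(default_factory=list)

    def execute(self, action: Callable[[], object], undo: Callable[[], None]) -> object:
        result = action()
        self.undo_stack.append(undo)   # recorded only if the action succeeded
        return result

    def revert_last(self) -> None:
        if self.undo_stack:
            self.undo_stack.pop()()

# Toy usage: "send" a draft, then revert it after review.
drafts = []
ex = PostFactoExecutor()
ex.execute(lambda: drafts.append("auto-generated reply"), undo=lambda: drafts.pop())
print(drafts)      # ['auto-generated reply']
ex.revert_last()
print(drafts)      # []
```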
If you liked this article and wish to share it, please refer to the following guidelines.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819