A very quick introduction to the Reversal Curse haunting ChatGPT and Llama

Here is what 2 simple prompts about Tom Cruise and his mother can teach you about a curse haunting LLMs and AI.

4 min read · Sep 26, 2023

A good way to test a language model’s language understanding and its ability to generalize is to reverse the order of a statement. By common sense, we know that if “A is B” then “B is A”. Checking whether a model catches this is an important part of evaluating its natural language understanding (NLU) capabilities. However, what is obvious to us is a huge problem for many LLMs. Even when an LLM is trained directly on information in the form “A is B”, its performance on “B is A” does not improve.

Ask ChatGPT “Who is Tom Cruise’s mother?” and it will answer. However, flip the question and ask “Who is Mary Lee Pfeiffer’s son?” and it will not be able to answer. Even though the two questions rely on the same underlying fact, ChatGPT is unable to answer the second one.

To test generalization, the researchers behind the Reversal Curse paper fine-tuned GPT-3 and LLaMA on made-up facts in one direction (“A is B”) and then tested them on the reverse (“B is A”). They found that the models get ~0% accuracy! This is the Reversal Curse.
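The shape of that experiment can be sketched in a few lines of Python. This is only a toy illustration of the evaluation harness, not the researchers' code: the names and descriptions in `FACTS` are invented here, and `make_memorizing_model` is a hypothetical stand-in for a fine-tuned LLM that can only continue sentences it has seen verbatim. It fails on the reverse direction by construction, so it shows how the test is set up rather than proving anything about real models.

```python
from typing import Callable, List, Tuple

# Invented (name, description) facts, used only for illustration.
FACTS: List[Tuple[str, str]] = [
    ("Mira Kestrel", "the inventor of the glass violin"),
    ("Tobin Ashgrove", "the author of the novel Salt and Lanterns"),
]


def forward_statement(name: str, description: str) -> str:
    """Training sentence in the 'A is B' direction."""
    return f"{name} is {description}."


def forward_prompt(name: str) -> str:
    """Prompt in the same direction as training: 'A is ...'."""
    return f"{name} is"


def reverse_prompt(description: str) -> str:
    """Prompt in the reversed direction: 'B is ...'."""
    return f"{description[0].upper()}{description[1:]} is"


def make_memorizing_model(training_text: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a fine-tuned LLM: it completes a prompt
    only when some training sentence starts with that exact prompt."""
    def model(prompt: str) -> str:
        for sentence in training_text:
            if sentence.startswith(prompt):
                return sentence[len(prompt):].strip()
        return ""
    return model


def accuracy(model: Callable[[str], str],
             examples: List[Tuple[str, str]]) -> float:
    """Fraction of (prompt, expected) pairs whose expected answer
    appears in the model's completion (case-insensitive)."""
    hits = sum(expected.lower() in model(prompt).lower()
               for prompt, expected in examples)
    return hits / len(examples)


# "Train" on the forward direction only.
model = make_memorizing_model([forward_statement(n, d) for n, d in FACTS])

forward_examples = [(forward_prompt(n), d) for n, d in FACTS]
reverse_examples = [(reverse_prompt(d), n) for n, d in FACTS]

forward_acc = accuracy(model, forward_examples)
reverse_acc = accuracy(model, reverse_examples)
print(f"forward accuracy: {forward_acc:.0%}, reverse accuracy: {reverse_acc:.0%}")
```

To try this against a real model, you would swap `model` for a function that sends the prompt to that model and returns its completion; the rest of the harness stays the same.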

To quote the researchers for a more formal definition of the Reversal Curse: “If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse. For instance, if a model is trained on “Olaf Scholz was the ninth Chancellor of Germany”, it will not automatically be able to answer the question, “Who was the ninth Chancellor of Germany?”. Moreover, the likelihood of the correct answer (“Olaf Scholz”) will not be higher than for a random name. Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e. if “A is B” occurs, “B is A” is more likely to occur).”
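The “not higher than for a random name” claim suggests a simple check: compare the log-probability a model assigns to the correct name against the average it assigns to random names. Below is a minimal sketch of that comparison. The `log_prob` argument is an assumed interface (any function that returns the model's log-probability of a completion given a prompt), and `uniform_scorer` and the distractor names are hypothetical placeholders, not a real model or real data.

```python
from statistics import mean
from typing import Callable, Sequence


def likelihood_gap(log_prob: Callable[[str, str], float],
                   prompt: str,
                   correct: str,
                   distractors: Sequence[str]) -> float:
    """Log-probability of the correct answer minus the mean
    log-probability of the distractor answers.

    A gap near zero means the model treats the correct name
    no differently from a random one."""
    distractor_scores = [log_prob(prompt, name) for name in distractors]
    return log_prob(prompt, correct) - mean(distractor_scores)


def uniform_scorer(prompt: str, completion: str) -> float:
    """Hypothetical scorer that assigns every name the same
    log-probability, mimicking a model with no preference."""
    return -5.0


gap = likelihood_gap(
    uniform_scorer,
    prompt="Who was the ninth Chancellor of Germany?",
    correct="Olaf Scholz",
    distractors=["Random Name A", "Random Name B"],  # placeholder names
)
print(f"likelihood gap: {gap}")
```

With a real model, `log_prob` would sum the token log-probabilities of the completion given the prompt. A gap that stays close to zero across many facts is what the quoted claim would predict.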

Moreover, this curse doesn’t go away as we scale up. The co-occurrence of “A is B” and “B is A” is a systematic pattern in pretraining sets. Auto-regressive LLMs completely fail to meta-learn this pattern, with no change in their log-probabilities and no improvement when scaling from 350M to 175B parameters.

To learn more about the Reversal Curse impacting LLMs, I would suggest reading the paper “The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A””. Paper link: https://owainevans.github.io/reversal_curse.pdf

PS: It looks like Bard handles the Reversal Curse better than GPT. I ran a basic experiment here: https://twitter.com/Machine01776819/status/1706447329061118410

For more details, sign up for my free AI newsletter, AI Made Simple: https://artificialintelligencemadesimple.substack.com/


