Paper of the Month (July 2023) - Making AI Smarter with Plan-and-Solve Prompting

A new way to boost the reasoning abilities of AI models.

Jul 31, 2023

Paper: Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models


The Big Idea

This paper introduces a prompt engineering method called Plan-and-Solve (PS) Prompting, which builds on the well-known zero-shot Chain-of-Thought (CoT) prompting to improve the accuracy of large language models (LLMs) on reasoning tasks. PS Prompting asks the model to first devise a plan that divides the task into smaller subtasks, and then to carry out those subtasks according to the plan.

In Layman’s Terms — Imagine you're trying to solve a complex problem. Instead of tackling it all at once, you break it down into smaller, manageable tasks and solve them one by one. That's essentially what PS Prompting does for AI models: it asks them to break the problem into smaller tasks first and then solve those tasks, which helps them handle complex reasoning more accurately.

Why You Should Care

How It Impacts Us — Currently, Large Language Models (LLMs) perform well when given specific instructions for tightly scoped tasks. However, their performance tends to falter when faced with multi-step tasks of a larger scope. The method proposed in this paper can enhance the accuracy and efficiency of AI models in solving these more open-ended and complex tasks, with minimal human intervention or prompting.

Future Potential — We are still discovering the best ways to prompt LLMs, and new findings appear every few weeks. Strategies like this one let prompt engineers hand over larger tasks with fewer specific instructions and examples, giving the LLM more autonomy to plan and execute the task.

Facts in 5 Seconds

Fact #1 — In the paper's experiments, Plan-and-Solve Prompting consistently outperforms Zero-shot-CoT across all datasets by a large margin.

Fact #2 — Despite its simplicity, the PS strategy greatly improves the quality of the generated reasoning process.

Fact #3 — Even though PS+ prompting (PS extended with more detailed instructions) does not require manual demonstration examples, it performs comparably to 8-shot CoT prompting in arithmetic reasoning.

How To Apply the Learning

For Tech Folks — The specific PS prompt used in the paper is reproduced below. It guides the LLM to break the task down into smaller subtasks and then solve them according to the plan.

Let’s first understand the problem and devise a plan to solve the problem. Then, let’s carry out the plan and solve the problem step by step.
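To make this concrete, here is a minimal Python sketch of how the trigger sentence above might be wired into a zero-shot pipeline. Several things here are assumptions for illustration rather than the paper's code: the "Q: ... A: ..." template, the answer-extraction phrase, the function names, and the `complete` callable, which is a hypothetical stand-in for whatever LLM client you use.

```python
from typing import Callable

# Trigger sentence quoted from the article above (with straight apostrophes).
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)

# Assumed answer-extraction phrase for numeric tasks; adjust it to your task.
ANSWER_TRIGGER = "Therefore, the answer (arabic numerals) is"


def build_ps_prompt(question: str) -> str:
    """Wrap a question in a zero-shot Plan-and-Solve prompt (assumed Q/A template)."""
    return f"Q: {question.strip()}\nA: {PS_TRIGGER}"


def plan_and_solve(question: str, complete: Callable[[str], str]) -> str:
    """Make two calls: one for the plan and reasoning, one to extract the answer.

    `complete` is a hypothetical function that sends a prompt to an LLM
    and returns the generated text.
    """
    reasoning_prompt = build_ps_prompt(question)
    reasoning = complete(reasoning_prompt)
    # Feed the model's own reasoning back in and ask for the final answer only.
    extraction_prompt = f"{reasoning_prompt}\n{reasoning}\n{ANSWER_TRIGGER}"
    return complete(extraction_prompt).strip()
```

To try it, pass any function that maps a prompt string to a completion string as `complete`, for example a thin wrapper around your provider's API. The second call is optional: if you only want to read the model's plan and reasoning, `build_ps_prompt` alone is enough.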

For Non-Tech Folks — For business people or the general public, understanding the concept of PS Prompting can help you appreciate the complexity and potential of AI technology. It's a reminder that even AI models can benefit from breaking down complex tasks into smaller, manageable parts - just like we do.

Jacky’s Take

This paper offers groundbreaking insights, as it demonstrates continued progress in prompt engineering methods - an area of intense interest and debate. Some argue prompt engineering will become obsolete as large language models grow more capable. That may be true in some respects, but prompt design remains critical with today's LLMs. In my experience, advanced models like GPT-4 and Claude still require meticulous prompt crafting for complex reasoning, even with general triggers like "step-by-step." When tasks are tightly scoped with clear, precise instructions, GPT-4 performs remarkably well. But on open-ended tasks with minimal guidance, it quickly falters.

In essence, this work highlights how aptly-designed prompts can better elicit reasoning from today's LLMs. The results underscore prompt engineering's ongoing importance for unlocking latent skills, despite rapid advances in model architecture. With refined prompting strategies like those proposed here, we can continue accessing impressive reasoning abilities, while models themselves evolve toward less constraint. This study offers both engineering breakthroughs and philosophical insights into the interplay between prompts and innate model strengths.

Quick Links

Read the Full Piece — Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models (2023)
Additional Links #1 — Plan-and-Solve Prompting Github Code
Additional Links #2 — Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)

Emoji Summary — 🤖🧠📝📈

Thank you for reading Topography of Applied AI. Learned something? Feel free to share it with your friends and community.