
🤖 AI Snack 🍿 : Iterative Refinement with Self-Feedback

Improving LLM results with simple, clever prompt engineering.

Image created with Bing Image Creator

Continuous refinement is integral to human progress: when we write an article draft, we re-read it and rewrite parts to make it better. What if AI assistants could do the same and refine their own work? A technique called Self-Refine explores exactly this idea.

Let's walk through how it works with a real example. Say we ask ChatGPT to write a tweet summarizing an article. Using the Self-Refine technique, it would:

  1. Generate an initial tweet about the article.
  2. Critique its own tweet and give feedback on how to improve it.
  3. Use that feedback to write a better second draft of the tweet.
  4. Repeat steps 2-3, iteratively refining the tweet.
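The loop above can be sketched in a few lines of Python. Note that `call_llm` is a hypothetical stand-in for a real LLM API call; here it returns canned replies so the sketch runs offline.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call (e.g. a chat endpoint).
    # It returns canned replies so this sketch runs without network access.
    if "Critique" in prompt:
        return "Make it shorter and end with a call to action."
    return "Self-Refine lets LLMs improve their own answers. Read the article!"

def self_refine(task: str, iterations: int = 2) -> str:
    """Generate an answer, then alternate feedback and refinement."""
    answer = call_llm(task)                  # 1. initial draft
    for _ in range(iterations):              # 4. repeat steps 2-3
        feedback = call_llm(                 # 2. critique the current draft
            f"Critique this answer to the task '{task}' "
            f"and suggest concrete improvements:\n{answer}"
        )
        answer = call_llm(                   # 3. rewrite using the feedback
            f"Task: {task}\nPrevious answer: {answer}\n"
            f"Feedback: {feedback}\nWrite an improved answer."
        )
    return answer

print(self_refine("Write a tweet summarizing the article."))
```

In practice you would swap `call_llm` for a real model call and add a stopping condition, e.g. stop when the feedback says the answer needs no further changes.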

This technique hinges on two pivotal stages:

  • Feedback: Here, the initial task and response are evaluated, generating insights on potential improvements. It's beneficial to outline specific attributes you'd like the revised answer to exhibit.
  • Refine: Armed with the initial task, response, and feedback, the LLM enhances its original answer.
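Outlining specific attributes in the feedback prompt might look like the sketch below. The attribute list is illustrative, not taken from the paper.

```python
def feedback_prompt(task: str, answer: str, criteria: list[str]) -> str:
    """Build a critique prompt that names the attributes the revision
    should exhibit (illustrative criteria, not from the paper)."""
    bullets = "\n".join(f"- {c}" for c in criteria)
    return (
        f"Task: {task}\n"
        f"Answer: {answer}\n"
        f"Give concrete feedback on how to improve the answer "
        f"with respect to these attributes:\n{bullets}"
    )

print(feedback_prompt(
    "Summarize the article in a tweet.",
    "LLMs can refine their own output.",
    ["under 280 characters", "engaging hook", "plain language"],
))
```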

Researchers have assessed this method across seven diverse tasks. Both GPT-3.5 and GPT-4 displayed performance boosts ranging from 5% to 40%. Our personal experience mirrors these findings: with the aid of this strategy, we've achieved tasks using GPT-4 that previously seemed out of reach. Interestingly, most of the improvement materializes during the first iteration, which is sufficient for a majority of tasks. The mechanism bears a striking resemblance to Reinforcement Learning from AI Feedback (RLAIF) and its 'constitution'. Such developments hint at a near future where models might autonomously refine their own capabilities.

In ChatFAQ, this technique plays a pivotal role in enhancing the way we process and understand text. By leveraging this innovative method, we're able to dissect large volumes of text into semantically coherent and meaningful segments, often referred to as 'chunks'. As a result, our system can generate responses that are not only more accurate but also of higher quality. The ability to identify and separate texts into these chunks ensures that the information presented is relevant, concise, and in line with the user's intent.
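ChatFAQ's actual segmentation pipeline isn't detailed here, but a minimal chunking sketch, assuming a simple paragraph-based strategy with a word budget, could look like this. Real semantic chunking would additionally compare embeddings between neighboring segments.

```python
def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_words words.

    A generic illustration of text segmentation, not ChatFAQ's method.
    """
    chunks, current, count = [], [], 0
    for para in filter(None, (p.strip() for p in text.split("\n\n"))):
        words = len(para.split())
        if current and count + words > max_words:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```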

Link to article