Model Wars | January 27, 2022 | via OpenAI Blog

Aligning language models to follow instructions

Why it matters

OpenAI deployed instruction-tuned models trained with reinforcement learning from human feedback (RLHF), setting a new baseline for following user intent and for safety. Making InstructGPT, rather than GPT-3, the API default points to a broader move toward alignment-first model development.

Key signals

InstructGPT trained with humans in the loop (RLHF)
Outperforms GPT-3 on instruction-following
Improved truthfulness and reduced toxicity vs. GPT-3
Deployed as default language model on OpenAI API
Published Jan 27, 2022

The hook

OpenAI just swapped out GPT-3 as its API default. InstructGPT now takes that spot: better at following instructions, more truthful, less toxic.

We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which are trained with humans in the loop, are now deployed as the default language models on our API.

Relevance score: 78/100
