OpenAI develops AI model to critique its AI models
Briefly

OpenAI uses CriticGPT, an AI model, to help human trainers identify errors in ChatGPT's code output; trainers assisted by CriticGPT outperform unassisted trainers 60% of the time.
Reinforcement Learning from Human Feedback (RLHF) relies on human workers who interact with models and annotate their responses. As models improve, however, spotting flawed answers becomes increasingly difficult for human trainers.
CriticGPT is designed to supplement human feedback, improving how generative AI models like ChatGPT produce programming code.
By augmenting the expertise of the trainers who administer reinforcement learning, the model yields better outcomes than relying on crowdsourced workers alone.
Read at The Register