Improbable AI Lab and MIT researchers used machine learning to train a red-team model that autonomously generates prompts to elicit a wider range of toxic responses.