Improbable AI Lab and MIT researchers used machine learning to train a red-team model that autonomously generates prompts to elicit a wider range of toxic responses.