Through extensive testing and collaboration, the CDAO's pilot identified more than 800 potential vulnerabilities and biases in the use of large language models for military medical services.
The exercise will produce benchmark datasets for evaluating future AI tools and will inform the DOD's policies on the responsible use of generative AI.