
"The idea is to find the world's foremost experts, have them architect benchmarks, then train AI judges to evaluate models at scale. For Forum AI's geopolitics work, Brown has recruited Niall Ferguson, Fareed Zakaria, former Secretary of State Tony Blinken, former House Speaker Kevin McCarthy, and Anne Neuberger, who led cybersecurity in the Obama administration. The goal is to get AI judges to roughly 90% consensus with those human experts, a threshold she says Forum AI has been able to reach."
"Forum AI, which she discussed recently with TechCrunch's Tim Fernholz at a StrictlyVC evening in San Francisco, evaluates how foundation models perform on what she calls "high-stakes topics" (geopolitics, mental health, finance, hiring), subjects where "there are no clear yes-or-no answers, where it's murky and nuanced and complex.""
""I was at Meta when ChatGPT was first released publicly," she recalled, "and I remember really shortly after realizing this is going to be the funnel through which all information flows. And it's not very good." The implications for her own children made the moment feel almost existential. "My kids are going to be really dumb if we don't figure out how to fix this," she recalled thinking."
"What frustrated her most was that accuracy didn't seem to be anyone's priority. Foundation model companies, she said, are "extremely focused on coding and math," whereas news and information are harder. But harder, she argued, doesn't mean optional. Indeed, when Forum AI began evaluat"
Forum AI evaluates foundation models on high-stakes topics including geopolitics, mental health, finance, and hiring. The approach targets areas where answers are murky, nuanced, and complex, and where accuracy is difficult but essential. The company recruits leading experts to architect benchmarks and then trains AI judges to evaluate model outputs at scale. For geopolitics, experts include Niall Ferguson, Fareed Zakaria, Tony Blinken, Kevin McCarthy, and Anne Neuberger. The goal is for AI judges to reach about 90% consensus with human experts. The company grew out of concerns that AI would become the funnel through which information flows and that model developers were not prioritizing accuracy.
Read at TechCrunch