"The San Francisco-based startup lets customers test their AI agents against vetted metrics and deploy them faster. Evaluations involve assessing an AI system's performance, efficiency, and safety next to industry benchmarks. The product was created when founder Darius Emrani was building a healthcare app that could transcribe patient notes and let doctors have better patient interactions. "I really quickly realized that if it messed up, then it could hurt or kill people," Emrani said."
"referring to his time at Google-owned Waymo, where he oversaw testing and simulations for self-driving cars for four years before leaving in 2022. Emrani said that with so many companies launching AI applications, health, finance, and legal tech startups are under pressure to ship quickly. "They launch and then there's an 'Oh crap' moment where they're like, it doesn't work," he said. "This is happening everywhere in legal. Really prominent companies are literally having failures and lawyers are getting disbarred.""
Scorecard raised $3.75 million to build an AI evaluation platform. The San Francisco-based company lets customers test AI agents against vetted metrics so they can deploy them faster and more safely. Evaluations assess an AI system's performance, efficiency, and safety relative to industry benchmarks. The platform grew out of founder Darius Emrani's work on a healthcare app that transcribed patient notes, where he realized that model errors could harm patients. The service has run millions of tests, and enterprise clients including Thomson Reuters use it to test and deploy legal AI agents such as CoCounsel. Pricing includes a free tier and seat-based rates.
Read at Business Insider