
"This year, the latest versions of frontier models have been more powerful but slower than previous versions. For us, accuracy is the most important thing, and consequently GOV.UK Chat responses are slower than we'd ideally like."
"GDS reckons the chatbot, which only uses material from GOV.UK and includes links to source material, now scores more highly than mass-market AI assistants when answering government-related questions. Recent research by the Open Data Institute tested 11 LLMs with questions on GOV.UK material and found they often waffled, went beyond official information, or made mistakes."
"The public pilots included 508 attempts to fool the service into providing an inappropriate or harmful response, all of which failed, and the system, which uses Amazon's Bedrock platform and Anthropic's Claude models, coped well with demand."
The UK Government Digital Service (GDS) developed GOV.UK Chat, a government-specific chatbot that draws only on official GOV.UK material and links to its sources. Over the course of public pilots, its accuracy improved from 76% to 90%, and GDS says it now outperforms mass-market AI assistants on government questions. However, the more powerful LLMs that improve accuracy also slow responses: response times have risen to 10.7 seconds, longer than users prefer. The service resisted all 508 adversarial attempts to elicit harmful responses. GDS prioritizes accuracy over speed but is exploring options such as streaming partial answers while the complete response is still being processed, though this would require substantial work on safety guardrails.
Read at The Register