
"The late English writer Douglas Adams is best known as the author of the 1979 book The Hitchhiker's Guide to the Galaxy. But there is much more to Adams than what is written in his Wikipedia entry. Whether or not you need to know that his birth sign is Pisces or that libraries worldwide store his books under the same string of numbers - 13230702 - you can if you head to an overlooked corner of the Wikimedia Foundation called Wikidata."
"There, images, text, keywords, and other information related to Adams are stored both in a webpage and, for the robots among us, in formats designed for machines like JSON. Now, Wikidata is getting a new AI-friendly database that makes it easier for large language models to ingest the information. The database comes from the Wikipedia Embedding Project out of the German chapter of the Wikimedia Foundation, Wikimedia Deutschland, which oversees Wikidata."
Wikidata stores images, text, keywords, and structured metadata both as human-readable pages and in machine-readable formats such as JSON. Wikimedia Deutschland’s Wikipedia Embedding Project built an AI-friendly embedding database to make that data easier for large language models to consume. A Berlin-based team spent about a year using a large language model to convert roughly 19 million Wikidata entries from awkwardly structured records into vector embeddings. Those vectors capture the context and meaning of each entry, enabling more efficient semantic search, retrieval, and AI-driven use of Wikidata’s knowledge graph.
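To make the idea of semantic search over vector embeddings concrete, here is a minimal sketch in Python. It is not the project's implementation: the three-dimensional vectors, the `EMBEDDINGS` table, and the `semantic_search` helper are all invented for illustration, and real embeddings have hundreds of dimensions and are produced by a model rather than written by hand.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, made up for this example only.
EMBEDDINGS = {
    "Douglas Adams": [0.9, 0.1, 0.2],
    "The Hitchhiker's Guide to the Galaxy": [0.8, 0.3, 0.1],
    "Berlin": [0.1, 0.9, 0.3],
}


def semantic_search(query_vector, embeddings, top_k=2):
    """Rank entries by cosine similarity to the query, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), label)
        for label, vector in embeddings.items()
    ]
    scored.sort(reverse=True)
    return [label for _, label in scored[:top_k]]


# A made-up query vector that points in roughly the same direction
# as the two Adams-related entries.
print(semantic_search([1.0, 0.0, 0.2], EMBEDDINGS))
```

Ranking by direction rather than by exact keyword match is what lets a query retrieve related entries even when they share no words with it.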
Read at The Verge