In the previous lesson, you learned how to turn text into embeddings: compact, high-dimensional vectors that capture semantic meaning. By computing cosine similarity between these vectors, you could find which sentences or paragraphs were most alike. That worked beautifully for a small handcrafted corpus of 30-40 paragraphs. But what if your dataset grows to millions of documents or billions of image embeddings? Brute-force search compares the query against every stored vector, so its cost grows linearly with the size of the dataset and quickly becomes impractical. That's where Approximate Nearest Neighbor (ANN) methods come to the rescue.
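To make the scaling problem concrete, here is a minimal sketch of brute-force cosine-similarity search in plain Python. It is an illustration, not the code from the previous lesson: the names `cosine_similarity` and `brute_force_search` and the toy 3-dimensional vectors are assumptions made up for this example, and real embeddings would have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)

def brute_force_search(query, corpus, k=3):
    """Score the query against every vector; return the top-k (index, score) pairs."""
    # One similarity computation per stored vector: O(N * d) work per query.
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy "embeddings", invented for illustration only.
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

top = brute_force_search(query, corpus, k=2)
for index, score in top:
    print(index, round(score, 4))
```

Every query touches every vector in `corpus`, which is fine for a few dozen paragraphs but means a billion-vector collection needs a billion similarity computations per lookup. ANN indexes trade a small amount of accuracy to avoid that full scan.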
A new open-source project, VillageSQL, has been introduced as a tracking fork of MySQL aimed at expanding extensibility and addressing feature gaps increasingly relevant to AI and agent-based workloads. Announced by founder Dominic Preuss, VillageSQL Server for MySQL is positioned as a drop-in replacement that maintains compatibility with upstream MySQL while adding a structured extension framework. The alpha release is now available for experimentation.
The first step in Uber's adoption of OpenSearch was to evaluate it against their existing Lucene-based setup using the HNSW (Hierarchical Navigable Small World) algorithm: "We found ourselves limited by the lack of algorithm options, which hindered our ability to fine-tune trade-offs for different scenarios."
The AI Stage at TechCrunch Disrupt 2025, happening October 27-29 in San Francisco, is officially locked and loaded, featuring the powerhouses shaping the future of artificial intelligence. Join the leaders from Character.AI, Hugging Face, Mercor, Runway, Wayve, and many more top tech voices, as they tackle everything from generative AI and developer tools to autonomous vehicles, creative machines, and national security.