Next-word pretraining creates statistical pressure toward hallucination, even with idealized, error-free training data. Facts lacking repeated support in the training corpus yield unavoidable errors, while recurring regularities do not.
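To make that claim concrete, here is a toy simulation, not drawn from any paper's code, in the spirit of Good-Turing estimation. The person-to-birthday "facts," the Zipf-style popularity weights, and the memorize-or-guess "model" are all illustrative assumptions; the point is that the fraction of training facts seen exactly once (the singleton rate) roughly predicts how often the model is forced to guess, and therefore its error rate, even though the training data contain no errors at all.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PEOPLE = 50_000   # universe of arbitrary "facts": person -> birthday
N_TRAIN = 100_000   # error-free training examples
N_TEST = 100_000

# Ground truth with zero noise: every person has one fixed birthday.
birthdays = rng.integers(0, 365, size=N_PEOPLE)

# Long-tailed popularity: a few people are mentioned often, most rarely.
weights = 1.0 / np.arange(1, N_PEOPLE + 1)
weights /= weights.sum()

train_people = rng.choice(N_PEOPLE, size=N_TRAIN, p=weights)
test_people = rng.choice(N_PEOPLE, size=N_TEST, p=weights)

counts = np.bincount(train_people, minlength=N_PEOPLE)

# "Model": memorize every pair seen in training; for anyone never seen,
# it has no statistical signal and must guess a plausible-looking answer.
memorized = counts[test_people] > 0
guesses = rng.integers(0, 365, size=N_TEST)
answers = np.where(memorized, birthdays[test_people], guesses)
error_rate = np.mean(answers != birthdays[test_people])

# Good-Turing: the share of training facts seen exactly once estimates
# the probability that a test query hits a fact the model never saw.
singleton_rate = np.sum(counts == 1) / N_TRAIN

print(f"test error rate: {error_rate:.4f}")
print(f"singleton rate:  {singleton_rate:.4f}")
```

Running this, the two printed numbers roughly agree: errors concentrate exactly where facts lacked repeated support, while the frequently repeated regularities are answered perfectly.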
Meta is working on two proprietary frontier models: Avocado, a large language model, and Mango, a multimedia-generation model. Open-source variants are expected to follow at a later date.
A major difference between LLMs and LTMs is the type of data they are able to synthesize and use. LLMs work with unstructured data: think text, social media posts, and emails. LTMs, by contrast, can extract information and insights from structured data, such as the contents of tables. Since many enterprises rely on structured data, often stored in spreadsheets, to run their operations, LTMs could have an immediate use case in many organizations.
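A minimal, illustrative sketch of the contrast (the revenue figures and column names below are invented): the same information looks very different as free text than as a typed table, and flattening the table into text discards schema that an LTM-style model could exploit.

```python
import pandas as pd

# Unstructured input, the LLM's home turf: free-form text.
email = "Hi team, Q3 revenue came in at $1.2M, up 8% over Q2. Details Friday."

# Structured input, the LTM's home turf: typed rows and columns.
revenue = pd.DataFrame(
    {
        "quarter": ["Q1", "Q2", "Q3"],
        "revenue_usd": [950_000, 1_111_000, 1_200_000],
        "growth_pct": [None, 16.9, 8.0],
    }
)

# An LLM typically sees a table only after it is flattened into text...
flattened = revenue.to_csv(index=False)

# ...whereas an LTM-style model consumes the typed columns directly,
# preserving the schema (column names, dtypes) that flattening throws away.
print(revenue.dtypes)
```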
But tiny, 30-person startup Arcee AI disagrees. The company just released a truly and permanently open (Apache-licensed), general-purpose foundation model called Trinity, and Arcee claims that at 400B parameters it is among the largest open-source foundation models ever trained and released by a U.S. company. Arcee says Trinity is competitive with Meta's Llama 4 Maverick (400B) and Z.ai's GLM-4.5, a high-performing open-source model from China's Tsinghua University, according to benchmark tests conducted on the base models (with very little post-training).
Qwen3.5 is available via Hugging Face and is released under an open-source license; with it, Alibaba is explicitly targeting developers and research institutions that want to work with the model themselves. The system can process very long prompts, up to 260,000 tokens, and the context window can be extended further with additional optimizations. This makes it suitable for complex applications such as extensive document analysis and code generation.
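As a minimal sketch of what that developer-facing release enables, the following assumes the standard Hugging Face transformers API; the repository id and the prompt are placeholders, so check Hugging Face for the model's actual name before running it.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; look up the exact name on Hugging Face.
MODEL_ID = "Qwen/Qwen3.5"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Long-context use case from the article: feed a large document set
# (up to ~260k tokens, per Alibaba) and ask for an analysis.
prompt = "Summarize the key obligations in the contracts below:\n" + "..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
reply = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(reply, skip_special_tokens=True))
```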