The Hybrid Autoregressive Transformer (HART), developed by MIT, Nvidia, and Tsinghua University, is a text-to-image model that generates an image in 1.8 seconds. Unlike traditional diffusion models, HART uses an autoregressive approach: it builds the image step by step, which gives improved control over the output. This yields latency 3.1 to 5.9 times lower than that of current models. HART also introduces a hybrid tokenizer that processes images more efficiently and addresses resolution quality issues, setting a new benchmark in speed and throughput for AI-generated visuals.
The Hybrid Autoregressive Transformer (HART), from MIT, Nvidia, and Tsinghua University, sets a new standard for speed and efficiency in AI image generation.
HART's autoregressive model generates images step by step, allowing greater control and generation times of just 1.8 seconds.