DALL-E 2 achieves the highest human-rated alignment score among 26 text-to-image models evaluated, indicating its superior performance in matching textual descriptions with images.
Models fine-tuned on high-quality images, such as Dreamlike Photoreal 2.0, perform close behind DALL-E 2 in text-image alignment, underscoring the importance of training on realistic visuals.
Although most models rarely generate inappropriate images, some produce them more often, especially in the I2P scenario.
OpenJourney, one of the weaker variants, generates inappropriate images more frequently, raising concerns about the effectiveness of safety measures in some text-to-image models.