DeepSeek has introduced Janus Pro 1B and 7B, multimodal LLMs for image generation and vision processing that the company says are competitive with OpenAI's DALL-E 3. Building on the earlier Janus model, Janus Pro reportedly uses an updated architecture that separates visual encoding while keeping a unified transformer structure. Early performance evaluations suggest Janus Pro 7B marginally exceeds both Stable Diffusion 3 Medium and DALL-E 3 on selected benchmarks, though image analysis is limited to inputs of 384x384 pixels. The models were trained with efficient GPU methods, building on earlier models to reduce training time.
DeepSeek's new Janus Pro models challenge DALL-E 3 in image generation, using a new architecture to address limitations of the previous release.
Janus Pro is designed for vision processing tasks as well as image generation, illustrating the versatility of multimodal LLMs.