Image Captioning, Transformer Mode On

The CPTR image captioning model enhances the encoder-decoder architecture by building it entirely from Transformers, pairing a Vision Transformer encoder with a Transformer decoder.