Imagen 3 vs. DALL-E 3: 11 Prompts to Test Each AI Artist

A few months ago, Google unveiled Imagen 3, its next-generation text-to-image generator, through a beta phase in the ImageFX platform. Now, it’s available to everyone as part of Google Gemini. Google claims the new model can create highly detailed and lifelike images and follow prompts more accurately. So, we tested Imagen 3, comparing it with OpenAI’s DALL-E 3, the image-generating AI on ChatGPT.

We gave the same prompts to Imagen 3 and DALL-E 3 to test them on different metrics, including their text rendering capabilities, animation styles, camera angles, and even their ability to follow prompts. Here are our comparison results, highlighting which AI model performed better overall.

Note: In all the examples below, Imagen 3 is on the left and DALL-E 3 is on the right.

1. Realistic City Street Test

We started by generating a realistic city street scene to evaluate the models’ handling of lighting and reflections. Here’s the prompt we provided to both models:

And here are the results.

Right off the bat, you can see that ChatGPT’s DALL-E 3 struggles to create realistic-looking images. While it manages to generate reflections, the image still feels animated. This holds for all the subsequent prompts as well. DALL-E 3 tends to produce images that appear more animated than those from Imagen 3 or Midjourney.

2. Camera Angle and Shot Composition Test

Next, we wanted to evaluate how well each AI could follow camera angles and shot suggestions. We provided the following prompt to both models:

While I like the quality of the Gemini result, ChatGPT’s DALL-E 3 followed the suggestions more accurately, capturing the low-angle camera perspective and ultra-wide shot. Gemini also followed the camera angle suggestion, but overall, ChatGPT performed better at maintaining the specified angles and shot compositions.

3. Human Skin Tone Test

Getting human skin tones right is challenging, even for Midjourney, which is known for generating realistic images of people but often struggles with close-up shots. To test the capabilities of Imagen 3 and DALL-E 3, we provided this prompt:

As expected, ChatGPT’s DALL-E 3 produced an image that looks animated. While Gemini’s result was comparatively better, it was still easy to figure out that the image was AI-generated.

4. Painting Style Test

All three previous examples focused on generating realistic images, which didn’t play to DALL-E 3’s strengths. To assess how well both AI image generators can create images in a painting style, we provided this prompt:

Both models performed well with this prompt. ChatGPT’s DALL-E 3 created an image with more intricate details and a vibrant shine, whereas Gemini produced a softer result with a more cohesive artistic style. While both had their strengths, the choice between them may come down to a preference for either detailed, sharp imagery (DALL-E 3) or a more blended, dreamlike aesthetic (Gemini).

But Gemini actually followed the prompt better, producing an image that looked more like a painting and successfully depicted waterfalls cascading into the clouds. ChatGPT, by contrast, seems to have a style of its own that it likes to stick to.

5. Understanding Abstract Concepts

Next, we tested how well the models could interpret abstract concepts. Here’s one example prompt we provided:

It’s very hard to declare a winner in this category, but I personally prefer the result from ChatGPT’s DALL-E 3. Most of the time, the result from Gemini’s Imagen 3 actually feels opposite to the prompt I provided, but you may have a different opinion.

6. 2D Animation Style and Cartoon Image Generation

We also tested the models’ ability to create images in a 2D animation style and cartoon-like appearance. Here’s an example prompt from our tests:

While I expected ChatGPT to excel in this area, I had trouble getting it to generate 2D images right away. Initially, it produced 3D animation-style images, and only after re-prompting did it generate 2D images. This issue occurred multiple times with different examples, so for this comparison we used the 2D animation image it eventually generated after several prompts.

Gemini often generates 2D images with more detail, while ChatGPT tends to transform 2D images into more cartoon-like representations. In the end, the choice between the two depends on your personal preference and the style you’re looking for. We prefer ChatGPT’s result because it looks 2D, which is what we asked for.

7. Generating Real-World People

We also tested whether Imagen 3 and DALL-E 3 could generate images featuring real-world people like Elon Musk or Donald Trump. However, neither model can generate images of real people. While Gemini immediately states that it cannot create images with real people, ChatGPT initially attempts to generate images in different settings before eventually declaring that it cannot produce images of real individuals.

8. Historical Figures Test

Previously, Gemini’s image generator faced controversy for not generating images of white people. It was generating images of people of color even when given prompts like “Founding Fathers of America.” To see how the new model performs, we used the same prompt:

It appears that this issue has been resolved, as both models produced images that were accurate and true to historical depictions during our tests.

9. Text Rendering Test

We then tested the text rendering capabilities, as many models produce text that is hard to read or nonsensical. Both Google and OpenAI claim that their models have improved in this area, so we used the following prompt:

In this example, both models rendered the text correctly. However, if the prompt doesn’t specify the exact text, both models still struggle. For instance, with this prompt:

ChatGPT’s DALL-E 3 failed to render the text accurately, producing illegible words, while Gemini deviated from the prompt by making the text on the pages less visible, often obscuring or blurring it.

10. Detailed Prompt Test

Finally, we tested how well both AI image generators follow prompts that include a lot of specific details. Here’s an example of a detailed prompt we used:

Both models did a good job with this complex prompt, but there were notable differences in how they handled the details. ChatGPT’s DALL-E 3 missed a few elements, such as the scar on the left cheek and the red accents on the armor. Additionally, the character wasn’t depicted as holding the sword as specified.

Gemini captured every detail, including the scar, the red accents, and the precise purple-to-orange gradient of the twilight sky, resulting in a more accurate interpretation of the prompt.

11. In-Paint Editing

ChatGPT can not only generate images but also edit them. To edit an image, select the generated image, click on the paint option, and select the part you want to change. Then you can provide a prompt, and the changes will appear only in that specific part. For example, here’s the skyline image I generated with ChatGPT.

If I’d prefer a vibrant orange sky, I can select the sky and provide a prompt asking for that change. Here’s the edited image.

Editing images like this is not possible on Google Gemini yet. Also, Imagen 3 is much slower at generating images than DALL-E 3.

Imagen 3 Outperforms DALL-E 3

Imagen 3 excels at generating more realistic-looking images and can adjust the animation style according to the prompt. In contrast, ChatGPT’s DALL-E 3 tends to adhere to its own style, even when different styles are requested. However, ChatGPT has its advantages—it is better at following camera angles and perspectives and can also edit generated images.

Both AI tools can generate images even in the free version, though with some limitations.

Gone are the days when AI-generated images had glaring issues like characters with 10 fingers on one hand. Most images produced by these models are now accurate, making them valuable tools for content creators.

Ravi Teja KNTS

Tech writer with over 4 years of experience at TechWiser, where he has authored more than 700 articles on AI, Google apps, Chrome OS, Discord, and Android. His journey started with a passion for discussing technology and helping others in online forums, which naturally grew into a career in tech journalism. Ravi’s writing focuses on simplifying technology, making it accessible and jargon-free for readers. When he’s not breaking down the latest tech, he’s often immersed in a classic film – a true cinephile at heart.
