Real or AI? Take the 10-Photo Blind Test and See If You Can Spot the Difference

Key Takeaways

10 photo pairs. In each pair, one photo is real (sourced from SampleShots’s public library) and one was generated by OpenAI’s gpt-image-2 from a short scene description. Your job: pick the AI in each pair.
The AI versions were prompted with ONLY the scene description — no camera body, no EXIF, no photographer reference. We’re testing whether the model produces images that read as plausibly photographic, not whether it imitates a specific camera signature.
The slug names and image filenames give nothing away — q01_a.jpg vs q01_b.jpg per question, with the real/AI assignment randomized per question so pattern-spotters can’t just pick the same letter every time.
Most readers score 7-8/10. The pairs where AI fools people: clean architectural scenes, simple subject + background compositions, abstract macro. The pairs where AI gives itself away: hand details, complex repetitive patterns, hair edges, animal eyes.
This is a snapshot of gpt-image-2 in May 2026. Run the same test in six months and the AI will likely fool more readers; the gap narrows with each release.

Here is a 10-question blind test. Each question shows two photographs side by side. One was captured by a real working photographer; the other was generated by OpenAI’s gpt-image-2 from a single sentence describing the scene. No camera was named in the AI prompt. No EXIF data, no photographer reference. Just the scene description — and the model’s best guess at what a “photograph of that scene” should look like.

Click the photo you think is AI for each pair. You will see your score at the end with a breakdown by question.

The PhotoWorkout editorial team ran the test internally before publishing — average score across five staff was 7.4/10. The pairs that fool everyone tend to be clean compositions: architectural lines, simple subject + plain background, abstract macro. The pairs where AI gives itself away tend to involve fine human-anatomy detail (hands, hair edges), complex repeating patterns, or animal eyes at close range.

Real vs AI: Can You Tell Which Photos Are Generated?

10-photo blind challenge. Each photo has a partner generated by gpt-image-2 from a scene description. Pick the AI in each pair.

Photos 1 to 10 of 10: each question shows Image A and Image B. Look closely at both images. Which one was generated by AI?

How did you score? Here is what gpt-image-2 does well — and where it slips

The score most readers land on is 7 or 8 out of 10 — meaning gpt-image-2’s text-only outputs already cross the threshold of “plausibly photographic” for the majority of scene types. That is a substantial change from where AI image generation sat 18 months ago, when most outputs were obvious even to casual viewers.
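To put a 7 or 8 in context, it helps to know how often a reader who cannot tell the images apart would reach that score by luck. The sketch below assumes each of the 10 pairs is an independent 50/50 guess; that assumption, and the helper name prob_at_least, are ours for illustration and not part of the quiz itself.

```python
from math import comb


def prob_at_least(k: int, n: int = 10, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct picks."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: a pure guesser is right on each pair with probability 0.5.
for k in (7, 8):
    print(f"P(score >= {k} by guessing) = {prob_at_least(k):.3f}")
# P(score >= 7 by guessing) = 0.172
# P(score >= 8 by guessing) = 0.055
```

Under that assumption a pure guesser reaches 7 or more roughly one time in six, so a single 7/10 is weak evidence of skill. An average of 7 to 8 across many readers, however, sits well above the chance mean of 5.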

What the model handles well

Architectural scenes — straight lines, predictable geometry, plausible material textures. Most readers cannot reliably separate AI from real on buildings, vineyards, urban skylines.
Single subject on plain background — flowers, food shots, donuts on simple plates. The model has plenty of training data for this composition pattern and renders confidently.
Abstract macro and color studies — color pattern shots and geometric abstractions land in AI’s strong zone. The original photograph and the AI version often look equally “designed.”
Wide landscapes with atmospheric haze — sunsets, distant mountains, coastal horizons. The model defaults to “stock photo aesthetic” which happens to be hard to distinguish from a real photographer’s calibrated edit.

Where the model still gives itself away

Human hands and fingers — finger counts, joint geometry, nail edges. Famous AI tell, still present in May 2026 outputs even at gpt-image-2’s quality.
Eye geometry on animals up close — pupils, reflections, iris patterns. Particularly visible on bird-portrait pairs in this quiz.
Repeating fine patterns — fabric weaves, feather barbules, lattice structures. AI tends to produce almost-but-not-quite regular patterns where small inconsistencies betray the generator.
Text inside the image — license plates, signage, product labels. Even in 2026, AI text rendering remains the most reliable single tell when text is part of the scene.
Specular highlights on metallic surfaces — the way light bounces off chrome, glass, polished bodywork. The model approximates the look but often gets the highlight shape or position subtly wrong.

Why this matters for photographers

Two practical implications worth thinking about, regardless of how you scored.

First — stock photography is in the disruption zone. If you cannot reliably tell a gpt-image-2 generation from a real photo when both are downsized to web display sizes, neither can the average art director scrolling Getty or Shutterstock. The categories AI handles well (architectural, abstract, single-subject-on-plain-background) overlap heavily with the stock-photo demand curve. Working photographers selling into the lower tiers of the stock market are already feeling this.

Second — assignment and editorial work still wins. The categories AI gets wrong — close-up animal portraits, fine human-anatomy detail, photojournalism with messy specific context — are exactly the categories that working photographers get paid for. A real shoot at a specific location with named subjects produces something gpt-image-2 fundamentally cannot replicate from text. The work that comes from being there retains its value precisely because it’s evidence of having been there.

The honest framing for 2026: gpt-image-2 is genuinely good at generic-photographic-looking images. It is genuinely bad at specific-evidentiary-looking images. The photographers who lean into the latter — assignment work, photojournalism, specific-product-in-specific-context shots — are reasonably insulated for now.

How we built the test (methodology)

For each question we pulled a real photograph from SampleShots’s public photo library, a curated collection of camera-sample images with full EXIF and photographer attribution. We picked 10 photos spanning diverse subjects (architecture, animal portraits, food, landscapes, urban scenes, macro details) and diverse modern camera bodies (Canon R5, Nikon Z8, Sony A7R V, Fuji X-T5, Leica Q3, Hasselblad X2D 100C, Panasonic S5 II, and more).

For each photo, we fed gpt-image-2 only a one-sentence scene description: never the camera model, never the photographer’s name, never the EXIF settings. The model was instructed to render the scene as a plausible photograph from a working photographer’s hand, without any AI-aesthetic markers. The output was downsized to match the real photo’s display size, then paired with it. The A/B position of the real and AI images was randomized per question so that readers cannot score by always picking the same letter.
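The per-question randomization and neutral filenames can be sketched as follows. This is a minimal illustration, not the code used to build the quiz: the function names assign_pairs and score, the seed value, and the answer-key layout are all hypothetical. Only the q01_a.jpg / q01_b.jpg naming pattern comes from the article.

```python
import random
from typing import Dict, List, Optional, Tuple


def assign_pairs(num_questions: int, seed: Optional[int] = None) -> List[Dict[str, str]]:
    """Return one {filename: source} mapping per question.

    The "ai" image lands in the a or b slot at random, independently per question.
    """
    rng = random.Random(seed)
    key = []
    for q in range(1, num_questions + 1):
        sources = ["real", "ai"]
        rng.shuffle(sources)  # independent coin flip for this question
        key.append({
            f"q{q:02d}_a.jpg": sources[0],
            f"q{q:02d}_b.jpg": sources[1],
        })
    return key


def score(picks: List[str], key: List[Dict[str, str]]) -> Tuple[int, List[bool]]:
    """picks[i] is the filename the reader flagged as AI for question i + 1."""
    breakdown = [key[i].get(pick) == "ai" for i, pick in enumerate(picks)]
    return sum(breakdown), breakdown


# Hypothetical run: a reader who always picks image A.
answer_key = assign_pairs(10, seed=2026)
always_a = [f"q{q:02d}_a.jpg" for q in range(1, 11)]
total, per_question = score(always_a, answer_key)
print(f"Always picking A scores {total}/10")
```

Keeping the answer key separate from the filenames means nothing in the image names reveals which file is generated, and the per-question breakdown falls out of the same lookup.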

The full reveal — which cameras were used, who shot each real photograph, what scene prompt fed gpt-image-2 — is in our companion long-form analysis: We Asked GPT-Image-2 to Imitate 10 Famous Cameras — Here’s How Close It Got. That post uses different camera bodies than this quiz and runs a tougher test (camera + EXIF + scene description), with the side-by-side comparisons clearly labeled.

If you want to run the same kind of blind test on a different photo set, the recipe is straightforward: get a real photograph, write a one-sentence description of the scene, hand the description to gpt-image-2 with instructions to render “as if photographed by a working photographer,” and compare. The current state of the art produces results that genuinely cross into “could be either” territory for most subject types. That single observation is what makes this an interesting moment in photography.
