Head to head: Bagel vs Rundiffusion Photo Flux
One model flashes sharper tactile instincts on a microtexture-heavy prompt, but the other is far more dependable when the brief gets strict about layout, counting, and compliance. This matchup turns on whether you value one standout image or consistent prompt execution across the board.
By RuntimeWire · Published

Bagel earns one real win here, and it’s the most aesthetically convincing image in the set. In Microtexture Lab Bench, Bagel’s image delivers the prompt’s weird, specific material stack: indigo denim, oxidized copper mesh with green crystal growth, a petri dish, fur under a glass-like cover, and a brass gear. Just as important, it sells the cold macro-lab mood with stronger surface detail and more convincing microtexture than Rundiffusion Photo Flux, whose version looks softer and drops key elements like the copper mesh patina and fox fur.
But that’s where Bagel’s case mostly ends. In Exact Apparatus Layout, Bagel collapses outright: its image is essentially blank, which is an automatic loss in a task built around object placement and technical obedience. Rundiffusion Photo Flux isn’t perfect here: it misplaces the notebook, gets the cactus relationship wrong, fudges the beaker shape, and doesn’t fully nail the forceps placement or the label text. Still, it produces a recognizable answer in the requested watercolor technical style. In a head-to-head, showing up matters.
The clearest separation comes in Counted Cryobiology Vials. Rundiffusion Photo Flux gives the right kind of image—an actual overhead documentary-style shot—with exactly 11 fully visible vials and the correct cap-color distribution. Bagel misses the assignment on fundamentals: wrong perspective, extra ice clutter, shaky count/color compliance, and weaker label rendering. This is the kind of benchmark where prompt discipline is non-negotiable, and Rundiffusion Photo Flux is simply in another tier.
That split explains the aggregate score: 11.8 for Bagel, 20.4 for Rundiffusion Photo Flux. Bagel can produce a more tactile, more memorable single image when the task rewards texture and atmosphere. But across this matchup, Rundiffusion Photo Flux is the model you trust to follow instructions, preserve scene logic, and get countable details right.
Final call: Rundiffusion Photo Flux wins comfortably. Bagel has the better microtexture eye; Rundiffusion Photo Flux is the better image model.
How they were tested
We ran 3 fresh image tasks, generated on the fly for this matchup so neither model could prepare in advance, and had gpt-5.4 score each one. Bagel scored 11.8 to Rundiffusion Photo Flux's 20.4.
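As a quick cross-check, the three per-task winners reported in this piece can be tallied in a few lines. This is only a sketch: the dictionary is transcribed from the article's verdicts, the variable names are illustrative, and it says nothing about how the 11.8 and 20.4 aggregate scores were computed, since the scoring method is not described here.

```python
from collections import Counter

# Per-task winners as reported in this article (variable names are illustrative).
task_winners = {
    "Microtexture Lab Bench": "Bagel",
    "Exact Apparatus Layout": "Rundiffusion Photo Flux",
    "Counted Cryobiology Vials": "Rundiffusion Photo Flux",
}

# Count how many tasks each model won, then pick the model with the most wins.
win_counts = Counter(task_winners.values())
overall_winner, overall_wins = win_counts.most_common(1)[0]

print(dict(win_counts))
print(f"{overall_winner} takes {overall_wins} of {len(task_winners)} tasks")
```

A win tally is a coarser signal than the aggregate score, but here both point the same way.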
1. Microtexture Lab Bench
A hyper-detailed macro photograph of an eccentric materials-science workbench in a polar research lab, 16:9: a torn square of indigo selvedge denim showing individual warp and weft threads beside a strip of oxidized copper mesh with green patina crystals, a frost-rimmed petri dish containing feathery silver dendrites, a brush of arctic fox fur trapped under a glass slide, and a compact brass gear assembly dusted with graphite; every surface must show crisp microtexture with no smudging—fabric weave, fur strands, crystalline growth, etched metal, tiny machining marks—lit by cold morning window light from the left plus a narrow task lamp creating precise specular highlights and deep focus across the frame.


Winner: Bagel. Image A (Bagel) matches more of the specified objects and the cold macro-lab feel: indigo denim, oxidized copper mesh with green crystals, a petri dish, fur under a glass-like cover, and a brass gear are all present, with stronger microtexture. Image B (Rundiffusion Photo Flux) has a nice bench composition but misses key prompt elements like the copper mesh patina and fox fur, and its detail is softer and less consistently sharp across the frame.
2. Exact Apparatus Layout
A clean watercolor technical illustration of a quirky biochemistry lab setup on a white stone counter, 16:9, viewed straight-on at eye level: a tall amber reagent bottle labeled "Heliozyme 7" stands at the far left; immediately to its right is a brass microscope; a small hexagonal cactus in a blue pot sits directly in front of the microscope; a glass beaker filled with violet liquid is exactly between the microscope and a cream-colored centrifuge on the right; behind the beaker, but in front of the back wall, hangs a circular copper timer; a folded mint-green lab notebook lies on top of the centrifuge; under the beaker is a single red coaster; and a pair of silver forceps rests to the right of the beaker but left of the centrifuge, with all spatial relationships depicted unambiguously under soft skylight.


Winner: Rundiffusion Photo Flux. Image A (Bagel) is essentially blank and fails the prompt entirely. Image B (Rundiffusion Photo Flux) captures most required objects and the overall watercolor technical style, but several spatial relationships are wrong: the notebook is on the counter instead of only on the centrifuge, the cactus is not directly in front of the microscope, the beaker shape is off, and the forceps placement and label text are imperfect.
3. Counted Cryobiology Vials
A sharply lit documentary-style overhead photograph of a cryobiology sorting tray in a laboratory freezer room, 16:9: exactly 11 distinct frosted sample vials with colored caps arranged on a matte black grid tray, all fully visible and individually countable, with no extras anywhere in the scene; among them are 3 teal-capped vials, 2 saffron-capped vials, 4 white-capped vials, and 2 plum-capped vials, each vial separated by small gaps; beside the tray sits a grease pencil, a curled paper label reading "Lot Nereid-43", and a thin sheen of ice crystals on the metal surface, illuminated by stark overhead fluorescent light that preserves clear edges and avoids occlusion.


Winner: Rundiffusion Photo Flux. Image B (Rundiffusion Photo Flux) adheres much better to the prompt: it is a true overhead documentary-style shot with exactly 11 fully visible vials and the correct cap color counts. Image A (Bagel) has the wrong perspective, includes extra ice clutter, appears to have incorrect vial counts or cap colors, and renders the label text less accurately.
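The counting requirement in this task is simple arithmetic, and a minimal sketch shows what compliance means. The required counts come straight from the prompt; the function name and the "observed" example are hypothetical illustrations, not measurements taken from either model's image.

```python
# Cap-color counts and total requested by the prompt.
required_caps = {"teal": 3, "saffron": 2, "white": 4, "plum": 2}
required_total = 11

def meets_count_spec(observed_caps):
    """Return True only if every cap-color count matches and the total is exactly 11."""
    return (
        observed_caps == required_caps
        and sum(observed_caps.values()) == required_total
    )

# The prompt is internally consistent: 3 + 2 + 4 + 2 = 11.
assert sum(required_caps.values()) == required_total

# Hypothetical observation with one extra white-capped vial (12 vials in total).
hypothetical_extra = {"teal": 3, "saffron": 2, "white": 5, "plum": 2}

print(meets_count_spec(required_caps))
print(meets_count_spec(hypothetical_extra))
```

The check is strict on purpose: a single extra or miscolored vial fails the spec, which mirrors how the prompt rules out extras anywhere in the scene.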
See every prompt and the full side-by-side outputs in the interactive Head-to-Head.