Head to head: Bernini-R Edit Image vs Happy Horse 1.1 Image to Video

This matchup turns on execution, not vibes. Bernini-R Edit Image flashes style, but Happy Horse 1.1 Image to Video is the model that more consistently obeys the brief, preserves scene logic, and delivers cleaner motion storytelling across both tests.

Bernini-R Edit Image has taste. In both prompts, it finds a cinematic angle and a usable mood. But this wasn’t a contest about isolated pretty frames; it was about whether the model could carry specific visual instructions through motion without dropping key details. On that standard, Happy Horse 1.1 Image to Video wins cleanly, and the aggregate score reflects it: **17.1 to 14.1**.

The clearest separation came in **Neon Ribbon Dropkick**. Happy Horse better honored the warehouse setup: the lighting reads correctly, the amber lamp is actually present, the silver sneakers land, the low riser interaction makes spatial sense, and the magenta ribbon tracks more coherently through the action. Most importantly, the ball behavior is more faithful to the prompt. Bernini-R’s version has a nice low-angle attitude, but the ball skews too small and too yellow, and the kick path is less legible, exactly the kind of miss that makes a supposedly precise action beat feel improvised.

In **Velvet Fan Occlusion**, the difference is subtler but still decisive. Happy Horse better captures the moonlit lavender rooftop atmosphere and stages the movement as a continuous dance phrase rather than a sequence of loosely connected poses. The smoked-glass occlusion works the way it should: the performer disappears and re-emerges smoothly, with costume and fan continuity intact. Bernini-R’s take is perfectly watchable, but the lighting drifts from the brief and the occlusion and re-emergence read more like a constructed trick than a fluid camera-space event.

What Bernini-R keeps proving is that it can sell a shot. What Happy Horse proves here is that it can sell the shot *and* the assignment. That distinction matters. In side-by-side evaluation, Happy Horse is simply better at maintaining prompt fidelity, spatial continuity, and readable action under stylistic pressure.

**Final call: Happy Horse 1.1 Image to Video is the stronger video model in this head-to-head, and it wins without needing an asterisk.**

Neon Ribbon Dropkick

A one-continuous-shot 16:9 clip in a dim rehearsal warehouse lit by cyan tube lights and a single amber work lamp: a waacking dancer in silver sneakers sprints two steps and dropkicks a grapefruit-sized translucent rubber ball across the dusty floor, the ball skidding, bouncing three uneven times with visibly decreasing height, clipping a low metal riser, then ricocheting toward camera while a long magenta ribbon tied to the dancer’s wrist whips and settles naturally from the momentum; the camera makes a fluid gliding gimbal move, starting low at ankle height and arcing sideways to follow the ball’s path, with a tense, electric mood.

Bernini-R Edit Image:

Happy Horse 1.1 Image to Video:

Happy Horse 1.1 Image to Video (Model B) better matches the warehouse lighting, visible amber lamp, silver sneakers, low riser interaction, and magenta ribbon, with clearer spatial continuity and cleaner cinematography. Bernini-R Edit Image (Model A) has a strong low-angle mood, but the ball appears too small and too yellow, and the action path is less legible and less faithful to the specified behavior of a translucent, grapefruit-sized rubber ball.

Velvet Fan Occlusion

A one-continuous-shot 16:9 clip on a moonlit rooftop stage washed in cool lavender light with warm window glow from distant apartments: a flamenco performer in a bottle-green velvet suit and cream hat spins a small brass fan in the right hand, then strides diagonally as the camera performs a slow forward gimbal drift; the performer passes fully behind a tall rolling panel of smoked glass for a full second and emerges on the other side without changing pose, costume details, fan, or stride rhythm, continuing the same dance phrase seamlessly, with a mysterious, poised mood.

Bernini-R Edit Image:

Happy Horse 1.1 Image to Video:

Happy Horse 1.1 Image to Video (Model B) better matches the moonlit lavender rooftop mood and shows a clearer continuous dance phrase, with the smoked-glass occlusion and seamless re-emergence preserving costume and fan continuity. Bernini-R Edit Image (Model A) is solid and readable, but the lighting feels less aligned to the prompt, and the occlusion and re-emergence appear a bit more staged and less fluid.