Vivago AI released HiDream-I1 just last week, and it’s already sitting comfortably among the top five image generators, outperforming established models like Flux, AuraFlow, and Stable Diffusion 3.5, and even some of the best closed-source models like Midjourney v7, Ideogram v3, and Reve.
Vivago is an AI-powered creative platform developed by Sparking Innovations Limited, a company based in Hong Kong that offers a suite of tools for generating and editing visual content.
HiDream comes in three versions: “Full” provides the highest-quality outputs and requires 50 steps to render a good image; “Dev” does its job in around 30 steps, while “Fast” takes around 16 steps to produce good results.
Of course, the more steps the model takes, the more detailed the image will be, and the more resources it will require.
But what makes these models different?
For starters, their size. HiDream packs a hefty 17 billion parameters that enable it to generate high-quality images across multiple styles in seconds. Just for reference, Stable Diffusion 3.5 is almost half the size.
HiDream-I1 is uncensored and commercially friendly. Released under the MIT license, it permits “unrestricted usage for both personal and commercial projects.”
However, Vivago noted that it filtered its training data to remove “problematic content” but doesn’t restrict outputs, giving users “full creative freedom.” In practice, the filtered training data means you’ll need a fine-tuned version if you want to generate NSFW imagery.
(Problematic content doesn’t include generating blasphemous images of China’s President Xi Jinping, despite Vivago being a Hong Kong-based company.)
Users must also have some serious hardware to run it locally.
The full models require 27GB of VRAM to run, which can only be provided by behemoth GPUs that start at around $2,500.
However, within days of the image generator’s release, developers began creating quantized versions to run on more “modest” setups, requiring as little as 16GB of VRAM.
For users without high-end hardware, Vivago offers online access through its platform, and there’s a Hugging Face Space demo. Fal AI also supports HiDream at low prices:
- HiDream Full: For $1, you can run this model roughly 20 times.
- HiDream Dev: For $1, you can run this model roughly 33 times.
- HiDream Fast: For $1, you can run this model roughly 100 times.
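To make the pricing above easier to compare, here is a minimal Python sketch that converts the quoted runs-per-dollar figures into an approximate cost per image and per batch. The figures are the rough ones listed above; the function names and the 1,000-image batch size are illustrative assumptions, not anything published by Fal AI.

```python
# Approximate runs per US dollar, taken from the list above.
# These are rough figures, not an official price sheet.
RUNS_PER_DOLLAR = {"Full": 20, "Dev": 33, "Fast": 100}


def cost_per_image(runs_per_dollar: int) -> float:
    """Return the approximate cost in dollars of one generation."""
    return 1.0 / runs_per_dollar


def cost_for_batch(variant: str, images: int) -> float:
    """Return the approximate cost in dollars of `images` generations."""
    return images * cost_per_image(RUNS_PER_DOLLAR[variant])


if __name__ == "__main__":
    batch = 1000  # hypothetical batch size, chosen only for illustration
    for variant, runs in RUNS_PER_DOLLAR.items():
        print(
            f"{variant}: ~${cost_per_image(runs):.3f} per image, "
            f"~${cost_for_batch(variant, batch):.2f} per {batch} images"
        )
```

By this estimate, Full costs about five times as much per image as Fast, which is worth weighing against the quality differences in the tests below.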
Testing the models
Here’s what we found when we put HiDream through its paces.
Anatomy understanding
Prompt: A Hawaiian kid staring at the camera while doing gestures with his hands
HiDream-I1 Full
HiDream-I1 Full nails human anatomy with professional-grade accuracy.
Hands look natural with correct finger counts and proportions. Facial features align properly with realistic eye spacing and ear positioning. Body structure maintains accurate proportions throughout, with good definition.
Its only weak spots show up in extremely complex poses where joints occasionally look stiff.
Rating: 9/10
HiDream-I1 Dev
HiDream-I1 Dev renders solid anatomy while trading some detail for speed. Fingers stay intact but sometimes show minor distortions in length or thickness, which, in our generation, the model hides with blur.
Faces remain proportional, but with less-defined features. It handles standard poses well, but sometimes struggles with overlapping limbs.
Oddly, it sometimes beats the Full model on natural textures, with fewer weird artifacts in hair and more imperfections that make the skin look more natural.
Rating: 8.5/10
HiDream-I1 Fast
HiDream-I1 Fast makes obvious anatomy compromises for speed. The classic AI hand problems show up: fused fingers, wrong counts, odd bending. Faces keep their basic structure, but features can drift slightly.
Bodies in some specific generations look warped, with strange shoulder widths or arm lengths. Surprisingly, it sometimes renders clothing folds and hair flow more naturally than higher-tier versions.
One good thing we found: When generating hands with extra fingers, it’s easy to tell which finger is the “wrong” one, meaning it’s easy to delete that artifact, and the hand will remain natural.
If it weren’t for that extra finger, we could argue that it generated a better hand than Dev, with more defined fingers and joints.
Rating: 7/10
Winner: HiDream-I1 Full
Creative understanding
Prompt: A man and a woman having dinner in a futuristic restaurant, illustration in the style of Vincent Van Gogh. The restaurant has a sign saying “Welcome to Emerge, by Decrypt,” impasto, oil on canvas
HiDream-I1 Full
The Full model achieves the most refined balance between Van Gogh’s style and the heavy impasto technique.
Brushwork mimics oil-on-canvas technique with visible paint thickness and directional strokes. The composition is dynamic, and the lighting is very present, almost too bright.
The woman’s body has extra legs, a detail that requires careful observation to notice. The restaurant setting features rich details that enhance rather than distract from the central figures. The “Welcome to Emerge by Decrypt” sign appears as a natural element within the painterly world, but it was generated with a typo. The Full model’s only weakness is occasional perfectionism that sometimes looks too digital in smaller details.
Rating: 9/10
HiDream-I1 Dev
Dev maintains fidelity to Van Gogh’s style with more extreme impasto texture and directional brushwork. The couple’s interaction feels more natural, and the overall atmosphere is darker than the one generated by HiDream Full.
Like Full and Fast, this model didn’t really generate any futuristic elements in the composition. Colors are vibrant but controlled, with the characteristic saturated blues and warm yellows.
The signage integrated into the composition is poorly written, but the illustration seems to blend more with the restaurant’s aesthetic than the way it was portrayed in Full’s generation. Dev particularly shines in capturing Van Gogh’s emotional intensity through lighting effects and contrast.
Rating: 8.5/10
HiDream-I1 Fast
The Fast version captures Van Gogh’s energetic brushwork surprisingly well despite its speed optimizations.
The scene composition is dynamic with strong foreground-background separation. Characters have expressive poses that communicate relationship and emotion, but are the least accurate in terms of style. The scene is set more indoors, which means it focuses more on the subjects than on other accessory elements.
The “Emerge by Decrypt” sign appears as a natural part of the environment rather than an afterthought, but it is still not perfectly generated.
Fast surprisingly outperforms the higher versions in background details, with more varied and smoother-looking brushstrokes in the sky.
Rating: 8.5/10
Winner: HiDream-I1 Full
Prompt Adherence and Spatial Awareness Test
Prompt: A dog with a pink hat standing on top of a TV displaying the phrase ‘Decrypt is the best Crypto+AI media site in the world’ on the screen. On the left there is a blonde woman in a business suit holding a coin, on the right there is a robot standing on top of a first aid box, a green pyramid stands behind the box,. The overall environment is surreal. A cat is standing upside down on top of a white soccer ball, next to the dog. An Astronaut from NASA holds a sign that reads “Emerge” and is placed next to the robot
HiDream-I1 Full
HiDream-I1 Full struggled with this prompt. The dog sits centered on the TV with the message clearly legible and a pink hat accurately represented.
The businesswoman appears correctly on the left with a realistic coin gesture. The astronaut and first aid box are positioned correctly to the dog’s right, and the pyramid is behind these elements.
But that’s where the accuracy ends. The other elements, if represented, appear in random locations.
Rating: 8/10
HiDream-I1 Dev
Dev creates a more atmospheric scene that balances prompt adherence with artistic interpretation, trying to convey a more surreal illustration.
The dog wears the correct pink hat and sits atop the television; the TV text suffers from perspective distortion but is accurate.
The astronaut stands correctly positioned next to the green pyramid, even though the person holding the Emerge sign is a second astronaut sitting on top of the pyramid.
The desert setting enhances the surreal quality effectively. The cat fails to balance on the soccer ball as requested, but is better represented than in Full’s generation.
Rating: 8.5/10
HiDream-I1 Fast
Fast includes most requested elements, but with limited spatial accuracy.
The dog wears the pink hat and is on top of the TV. The screen displays text, but it’s illegible after the word “media.”
The woman appears in the correct spot and, curiously enough, it was the only model that generated the astronaut standing on top of the first aid box, in front of a pyramid and holding the sign that reads “Emerge.”
The cat lies near the soccer ball, and some extra coins and another astronaut were generated. The broader composition is there (showing the elements), but it fails at details (like the text) due to the obvious compromise in rendering steps.
Rating: 8.2/10
Winner: HiDream-I1 Dev
Realism
Prompt: Donald Trump eating a hamburger in a busy sweatshop. On the wall there is a sign that reads “Make Tariffs Great Again”
HiDream-I1 Full
Full achieves the strongest sense of visual narrative and realism. Trump is centered under a well-rendered sign (though with a typo). The lighting is consistent, and the environment closely resembles a sweatshop, with workers wearing uniforms and hairnets.
The setting has industrial cues (raw cinder block walls, cluttered workstations) that effectively support the prompt.
Trump’s suit is believable and textured, and he interacts naturally with the hamburger. There’s a second burger on the plate, which, like the misspelling of tariffs, is an error that happens in all generations.
Technical flaws include minor distortions in Trump’s hand anatomy. Still, it’s the only version to properly depict all elements with strong coherence.
Rating: 9/10
HiDream-I1 Dev
HiDream-I1 Dev offers a well-lit and sharply focused image with a professional photographic tone, though it strays slightly from the prompt’s grit.
The setting resembles a clean commercial kitchen more than a sweatshop. Workers in the background are casually dressed, and the environment lacks visual signs of labor intensity.
The sign is accurately placed and says “MAKE TARRIFFS GREAT AGAIN” (same misspelling), but its print quality is flatter. Trump’s pose is clear and symmetrical, and the burger looks realistic.
Also, Dev’s render of Trump’s face looks more realistic than the generation using HiDream Full, as this one is not overly sharpened.
Rating: 8.5/10
HiDream-I1 Fast
HiDream-I1 Fast achieves a moderate level of realism, with accurate lighting, depth, and a believable Trump figure, but it misses key emotional and contextual cues.
The environment looks more like a sweatshop than in Dev’s generation. The sign reads “MAKE TARRIFS GREAT AGAIN” (misspelling again), and although it’s legible, it’s the only one that appears in one single color, which could pass as more realistic.
The burger is adequately rendered, and Trump’s posture is natural, but his suit jacket is replaced by a vest (the prompt doesn’t specify anything about clothing). The lighting doesn’t correspond to the scene, but it’s overall a very accurate representation.
Rating: 8/10
Winner: HiDream-I1 Full
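As a rough cross-check of the four tests above, the per-test ratings can be averaged for each version. The sketch below is only an illustration: the unweighted mean and the short test labels are our own assumptions, not a scoring method used in the review itself.

```python
from statistics import mean

# Ratings out of 10, copied from the four tests above.
# The dictionary keys are shorthand labels chosen for this sketch.
SCORES = {
    "Full": {"anatomy": 9.0, "style": 9.0, "prompt": 8.0, "realism": 9.0},
    "Dev": {"anatomy": 8.5, "style": 8.5, "prompt": 8.5, "realism": 8.5},
    "Fast": {"anatomy": 7.0, "style": 8.5, "prompt": 8.2, "realism": 8.0},
}


def average_scores(scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """Return the unweighted mean rating for each model version."""
    return {version: mean(tests.values()) for version, tests in scores.items()}


def ranking(scores: dict[str, dict[str, float]]) -> list[str]:
    """Return version names sorted from highest to lowest mean rating."""
    averages = average_scores(scores)
    return sorted(averages, key=averages.get, reverse=True)


if __name__ == "__main__":
    for version, avg in average_scores(SCORES).items():
        print(f"{version}: {avg:.2f}/10")
    print("Ranking:", " > ".join(ranking(SCORES)))
```

An unweighted mean treats every test as equally important; a reader who cares mostly about prompt adherence, where Dev came out ahead, might weight the tests differently.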
The Verdict
The HiDream-I1 models are a welcome shift in the AI image generation landscape and put open-source models once again in the public eye after the release of a new generation of powerful closed-source AI image generators.
The Full version consistently outperformed the other models across nearly all tests, though the Dev model offers an impressive compromise between speed and quality.
Even the Fast version, while clearly sacrificing detail and accuracy, still produces results that would have been considered cutting-edge just months ago.
Unlike closed-source competitors, HiDream’s MIT license and open-source nature mean artists, developers, and businesses can adapt and build upon it freely.
The high hardware requirements present an important barrier, but if history repeats itself, the community will continue optimizing the model for broader accessibility.
For creators who’ve been limited by the censorship of commercial models or frustrated by licensing restrictions, HiDream offers a compelling alternative.
The models are quite cheap to run on cloud servers on a pay-per-use basis, which makes them a strong alternative to closed-source models that charge for monthly or yearly subscriptions.
As quantized versions (smaller models) improve and more fine-tunes (customized models) emerge, HiDream’s impact on the generative AI landscape will likely grow even further.
Just wait a few weeks and check Hugging Face and Civitai for updates. In the meantime, Flux fine-tunes are still very powerful and efficient.
Edited by Sebastian Sinclair