While OpenAI keeps teasing Sora after months of delays, Tencent has quietly dropped a model that is already showing results comparable to current top-tier video generators.
Tencent has unveiled Hunyuan Video, a free and open-source AI video generator, strategically timed during OpenAI's 12-day announcement campaign, which is widely expected to include the debut of Sora, its highly anticipated video tool.
“We present HunyuanVideo, a novel open-source video foundation model that exhibits performance in video generation that is comparable to, if not superior to, leading closed-source models,” Tencent said in its official announcement.
The Shenzhen, China-based tech giant claims its model “outperforms” those of Runway Gen-3, Luma 1.6, and “three top-performing Chinese video generative models,” based on professional human evaluation results.
The timing could not be more apt.
Before its video generator, Tencent launched an image generator with a similar name, somewhere between the SDXL and Flux eras of open-source image generators.
HunyuanDiT delivered excellent results and improved bilingual text understanding, but it was not widely adopted. The family was rounded out with a group of large language models.
Hunyuan Video uses a decoder-only Multimodal Large Language Model as its text encoder, instead of the usual CLIP and T5-XXL combo found in other AI video tools and image generators.
Tencent says this helps the model follow instructions better, grasp image details more precisely, and learn new tasks on the fly without additional training. On top of that, its causal attention setup gets a boost from a special token refiner that helps it understand prompts more thoroughly than traditional encoders do.
It also rewrites prompts to make them richer and improve the quality of its generations. For example, a prompt that simply says “A man walking his dog” can be enhanced with details, scene setup, lighting conditions, quality artifacts, and race, among other elements.
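A minimal sketch of the idea behind this kind of prompt enrichment. This is purely illustrative and not Tencent's actual rewriter, which is a model, not a template; the specific detail strings appended here are assumptions chosen for the example.

```python
# Illustrative only: mimics the *effect* of a prompt rewriter by appending
# scene, lighting, and quality descriptors to a terse user prompt.
# These details are made up for the example, not Hunyuan's real output.

def enrich_prompt(prompt: str) -> str:
    details = [
        "suburban street at golden hour",   # scene setup (assumed)
        "soft warm lighting",               # lighting conditions (assumed)
        "photorealistic, high detail",      # quality descriptors (assumed)
    ]
    return ", ".join([prompt] + details)

print(enrich_prompt("A man walking his dog"))
# A man walking his dog, suburban street at golden hour, soft warm lighting, photorealistic, high detail
```

In the real system this expansion is done by the language model itself, so the added details vary with the subject of the prompt rather than coming from a fixed list.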
Free for the masses
Like Meta’s LLaMA 3, Hunyuan is free to use and monetize until you hit 100 million users, a threshold most developers won’t need to worry about anytime soon.
The catch? You’ll need a beefy computer with at least 60GB of GPU memory to run its 13 billion parameter model locally, on the order of Nvidia’s H800 or H20 cards. That’s more VRAM than most gaming PCs have in total.
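Some back-of-envelope arithmetic shows why 13 billion parameters translates into that requirement. The 60GB figure is Tencent's guidance; the overhead breakdown sketched in the comments is an assumption, not an official spec.

```python
# Rough VRAM estimate for a 13B-parameter model in half precision.
PARAMS = 13e9      # 13 billion parameters
BYTES_FP16 = 2     # bytes per parameter at fp16

weights_gb = PARAMS * BYTES_FP16 / 1e9
print(f"fp16 weights alone: ~{weights_gb:.0f} GB")

# The weights are only part of the footprint: activations for long video
# latents, the text encoder, and the VAE add substantial overhead on top,
# which is why the recommended minimum is 60 GB rather than ~26 GB.
```

By comparison, high-end consumer cards top out at 24GB, which is why data-center GPUs like the H800 come up.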
For those without a supercomputer lying around, cloud services are already jumping on board.
FAL.ai, a generative media platform tailored for developers, has integrated Hunyuan, charging $0.50 per video. Other cloud providers, including Replicate and GoEnhance, have also started offering access to the model. The official Hunyuan Video server offers 150 credits for $10, with each video generation costing a minimum of 15 credits.
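Working through the credit math shows how the official server's pricing compares to FAL.ai's flat rate, using the figures reported above.

```python
# Cost comparison between the hosted options mentioned above,
# using prices as reported at the time of writing.
credits_per_pack = 150
pack_price_usd = 10.0
min_credits_per_video = 15

# Price per credit times the minimum credits a video costs.
official_cost = pack_price_usd / credits_per_pack * min_credits_per_video
fal_cost = 0.50  # FAL.ai's flat per-video price

print(f"Official server: ${official_cost:.2f} minimum per video")  # $1.00
print(f"FAL.ai:          ${fal_cost:.2f} per video")
```

In other words, the official server works out to at least $1.00 per video, twice FAL.ai's rate, though actual per-video credit costs may vary with generation settings.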
And, of course, users can run the model on a rented GPU using services like RunPod or Vast.ai.
Early tests show Hunyuan matching the quality of commercial heavyweights like Luma Labs’ Dream Machine or Kling AI. Videos take about 15 minutes to generate, producing photorealistic sequences with natural-looking human and animal motion.
Testing reveals one current weakness: the model’s grasp of English prompts could be sharper than its competitors’. Still, being open source means developers can now tinker with and improve the model.
Tencent says its text encoder achieves up to 68.5% alignment rates, a measure of how closely the output matches what users ask for, while maintaining 96.4% visual quality scores, based on its internal testing.
The complete source code and pre-trained weights are available for download on GitHub and Hugging Face.
Edited by Sebastian Sinclair