Recently, [Beijing Innovation Center of Humanoid Robotics (X-Humanoid)](https://www.x-humanoid.com/) achieved back-to-back victories in the globally authoritative WorldArena evaluation benchmarks. Following the top ranking of its WoW embodied world model in the data engine track, its first “Embodied Unified” model, Pelican-Unify 1.0, also took first place in the comprehensive WorldArena evaluation. X-Humanoid is now the only company in the world to hold first place in both core tracks at once, earning the industry’s first “double crown” in embodied intelligence and securing a position among the world’s top tier in embodied brain capabilities.
As a core component of the general embodied intelligence platform “Wise Kaiwu,” Pelican-Unify 1.0 has achieved world-leading performance across understanding, reasoning, imagination, and action. In particular, it demonstrates exceptional capabilities in world modeling. This milestone marks embodied intelligence entering a new phase of “co-evolution,” moving beyond fragmented functional integration and laying a solid technological foundation and development pathway toward Artificial General Embodied Intelligence (AGEI).
Standing Out in the Most Demanding Global Benchmark
WorldArena was jointly launched by eight leading institutions, including Tsinghua University, Princeton University, National University of Singapore, Peking University, The University of Hong Kong, Chinese Academy of Sciences, Shanghai Jiao Tong University, and University of Science and Technology of China. The benchmark spans six major evaluation dimensions, 16 detailed indicators, and three real-world application tasks, attracting participation from nearly all leading global world model teams.
Pelican-Unify 1.0 emerged as the top performer, ranking first in overall EWM Score while maintaining balanced excellence in visual quality, motion quality, and physical compliance. Its near-perfect 3D Accuracy score further validated its precise understanding of spatial geometry and scene relationships.

Authoritative Third-Party Validation Signals a Breakthrough for Unified Embodied Models
Previously, embodied intelligence approaches such as VLMs, VLAs, and world models were often optimized independently, resulting in disconnects between perception, reasoning, and action. Pelican-Unify 1.0 introduces a fundamentally different approach: understanding, reasoning, imagination, and action are not isolated modules, but different manifestations of a single physical intelligence loop.
The model achieves three forms of unification:
* Unified Understanding: Mapping scenes, instructions, visual context, and action history into a shared semantic space;
* Unified Reasoning: Transforming task intentions, action selection, and future consequences into a supervised language-based reasoning process;
* Unified Generation: Jointly generating future video frames and low-level action sequences within the same diffusion decoding process, enabling actions to be shaped by imagined outcomes while imagination is constrained by task reasoning.
Architecturally, Pelican-Unify 1.0 consists of two tightly coupled components. The upper layer functions as a unified VLM-based understanding and reasoning engine, while the lower layer, the Unify Future Generator, jointly generates future video and action chunks conditioned on the same latent variable *z* within a unified diffusion process.
As a result, Pelican-Unify 1.0 does not simply connect VLMs, world models, and action policies in sequence. Instead, it enables them to evolve together under a unified training objective, simultaneously learning how to understand tasks, predict future outcomes, and determine optimal actions.
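To make the idea of joint generation concrete, the following is a minimal toy sketch, not the Pelican-Unify 1.0 implementation. It assumes a hypothetical `toy_denoiser` and a deliberately simplified deterministic update rule in place of a real diffusion sampler; the names `JointSample`, `joint_generate`, and all dimensions are illustrative assumptions. The only point it shows is that the video and action entries live in one vector and are refined in the same denoising trajectory, conditioned on the same latent `z`.

```python
import random
from dataclasses import dataclass


@dataclass
class JointSample:
    video: list[float]   # stand-in for future video latents
    action: list[float]  # stand-in for a low-level action chunk


def toy_denoiser(x: list[float], z: list[float], t: float) -> list[float]:
    """Hypothetical denoiser: predicts the clean joint vector given z.

    A real model would be a learned network that uses x and t; this toy
    ignores both and simply tiles z across every position.
    """
    return [z[i % len(z)] for i in range(len(x))]


def joint_generate(z, video_dim=4, action_dim=2, steps=10, seed=0) -> JointSample:
    """Denoise the video and action entries together in one trajectory."""
    rng = random.Random(seed)
    # Start from Gaussian noise over the concatenated [video | action] vector.
    x = [rng.gauss(0.0, 1.0) for _ in range(video_dim + action_dim)]
    for step in range(steps, 0, -1):
        t = step / steps
        x0_hat = toy_denoiser(x, z, t)
        # Simplified deterministic update (an assumption, not a real sampler):
        # move 1/step of the way toward the prediction, so the final step
        # (step == 1) lands on it.
        alpha = 1.0 / step
        x = [xi + alpha * (x0 - xi) for xi, x0 in zip(x, x0_hat)]
    # Split the shared vector back into its two modalities.
    return JointSample(video=x[:video_dim], action=x[video_dim:])


sample = joint_generate(z=[0.5, -1.0])
print(sample)
```

Because both slices come out of the same loop and the same `z`, neither modality is generated first and handed to the other, which is the contrast with a sequential VLM, world model, and policy pipeline drawn above.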
Robots That “Imagine Before Acting” Demonstrate Closed-Loop Intelligence
A key capability of Pelican-Unify 1.0 is generating future visual states before executing actions and aligning action prediction with that imagined future, which keeps motion commands and generated visual frames consistent at a fine-grained level.
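One way to picture consistency between motion commands and imagined frames is to compare the states an action chunk would produce with the states the model imagined. The sketch below is a toy illustration under strong assumptions: a 1-D world where each action is a displacement, imagined "frames" reduced to scalar positions, and hypothetical helper names `rollout` and `consistency_error`. It is not how Pelican-Unify 1.0 measures or enforces alignment.

```python
def rollout(start: float, actions: list[float]) -> list[float]:
    """Integrate 1-D displacement commands into the states they would produce."""
    states, pos = [], start
    for a in actions:
        pos += a
        states.append(pos)
    return states


def consistency_error(start: float, actions: list[float],
                      imagined_states: list[float]) -> float:
    """Mean absolute gap between imagined states and states implied by actions."""
    implied = rollout(start, actions)
    gaps = [abs(i - m) for i, m in zip(implied, imagined_states)]
    return sum(gaps) / len(actions)


actions = [1.0, 1.0, -0.5]
aligned = consistency_error(0.0, actions, [1.0, 2.0, 1.5])     # imagination matches actions
misaligned = consistency_error(0.0, actions, [1.0, 2.5, 1.5])  # one imagined state drifts
print(aligned, misaligned)
```

A zero error means the imagined trajectory is exactly what the commands would produce; a nonzero error flags a gap between what the system imagines and what it is about to do.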
In real-world robot validation experiments, Pelican-Unify 1.0 was deployed on the Tien Kung humanoid robot and the UR5e robotic arm, successfully completing previously unseen long-horizon compositional tasks, such as inserting an RJ45 connector followed by waterproof sealing. These demonstrations showcased strong compositional generalization and zero-shot transfer capabilities.
Real robots represent the ultimate proving ground for the “reasoning–imagination–action” closed loop. Through deployment on Embodied Tien Kung and UR5e systems, X-Humanoid demonstrated the practical value of integrating cognitive intelligence with physical execution in real production scenarios.
Unified Does Not Mean Compromise
One of the most common concerns surrounding unified models is whether integrating multiple capabilities into a single framework weakens each individual capability.
Benchmark results for Pelican-Unify 1.0 show otherwise. In VLM evaluations, the model achieved an average score of 64.7 across eight general and embodied intelligence benchmarks, reaching state-of-the-art (SOTA) performance. The results indicate that unified training not only preserves multimodal capabilities but also strengthens spatial understanding, physical reasoning, and action-related semantics.
In action generation tasks, Pelican-Unify 1.0 achieved an average success rate of 93.5% on the RoboTwin dual-arm benchmark, matching current SOTA systems. The results confirm that unified training enhances rather than compromises execution, spatial cognition, and physical interaction capabilities.
A New Paradigm Toward General Embodied Intelligence
The significance of Pelican-Unify 1.0’s world-leading ranking extends far beyond benchmark performance. More importantly, it proposes a modeling pathway that moves closer to true general embodied intelligence: enabling understanding, reasoning, imagination, and action to share representations, co-train, and mutually shape one another within a complete closed-loop intelligence system.
With the dual foundations of the “Embodied Tien Kung” general robotics platform and the “Wise Kaiwu” embodied intelligence platform, X-Humanoid has established a full-stack ecosystem spanning “body–brain–cerebellum–platform–ecosystem.” Leveraging the technological foundation behind its “double championship” achievements, the company aims to bring top-tier embodied intelligence models into real-world industrial and service scenarios, lowering the barriers to embodied AI adoption and accelerating the evolution of humanoid robots from specialized devices into general-purpose productivity tools.
Paper Link:
[Pelican-Unify 1.0 Research Paper](https://arxiv.org/pdf/2605.15153)
Contact Info:
Name: Clara Liu
Organization: X-Humanoid
Website: https://www.x-humanoid.com/
Release ID: 89192100
