UMO
Scaling Multi-Identity Consistency for
Image Customization via Matching Reward


Yufeng Cheng  Wenxu Wu  Shaojin Wu  Mengqi Huang  Fei Ding  Qian He
UXO Team
Intelligent Creation Lab, ByteDance


We announce UMO, a unified multi-identity optimization framework and the latest addition to the UXO family (UMO, USO, and UNO). UMO can freely combine one or more identities with any subjects in any scenario, delivering outputs with high subject/identity consistency. In line with our past practice, we will open-source the full project, including inference scripts, model weights, and training code, to advance research and empower the open-source community.

Abstract


Recent advancements in image customization exhibit a wide range of application prospects due to stronger customization capabilities. However, since humans are highly sensitive to faces, a significant challenge remains in preserving consistent identity while avoiding identity confusion with multi-reference images, limiting the identity scalability of customization models. To address this, we present UMO, a Unified Multi-identity Optimization framework, designed to maintain high-fidelity identity preservation and alleviate identity confusion with scalability. With a "multi-to-multi matching" paradigm, UMO reformulates multi-identity generation as a global assignment optimization problem and generally unleashes multi-identity consistency for existing image customization methods through reinforcement learning on diffusion models. To facilitate the training of UMO, we develop a scalable customization dataset with multi-reference images, consisting of both synthesized and real parts. Additionally, we propose a new metric to measure identity confusion. Extensive experiments demonstrate that UMO not only improves identity consistency significantly, but also reduces identity confusion across several image customization methods, setting a new state-of-the-art among open-source methods along the dimension of identity preservation.
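To make the "multi-to-multi matching" idea concrete, below is a minimal sketch of a matching reward computed as a global assignment between reference identities and faces detected in a generated image. It assumes face embeddings have already been extracted by an off-the-shelf face recognizer; the exact reward formulation used in the paper may differ.

import numpy as np
from scipy.optimize import linear_sum_assignment


def matching_reward(ref_embeds: np.ndarray, gen_embeds: np.ndarray) -> float:
    """Mean cosine similarity under the optimal one-to-one assignment between
    reference identities and faces detected in the generated image."""
    # Normalize so dot products equal cosine similarities.
    ref = ref_embeds / np.linalg.norm(ref_embeds, axis=1, keepdims=True)
    gen = gen_embeds / np.linalg.norm(gen_embeds, axis=1, keepdims=True)
    sim = ref @ gen.T                    # shape: (num_refs, num_generated_faces)
    # Hungarian algorithm: maximize total similarity by minimizing its negation.
    rows, cols = linear_sum_assignment(-sim)
    return float(sim[rows, cols].mean())


# Hypothetical usage: two reference identities, two faces found in the output.
refs = np.random.randn(2, 512)
gens = np.random.randn(2, 512)
print(matching_reward(refs, gens))

Because the assignment is solved globally rather than greedily per face, swapping two identities in the output lowers the achievable total similarity, which is what penalizes identity confusion.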


How does it work?



Our UMO unleashes multi-identity consistency and alleviates identity confusion. Its training process follows ReReFL (Algorithm 1 in our paper) with the Multi-Identity Matching Reward.
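Below is a hedged sketch of how such a matching reward could drive a reward-feedback update on the model. The diffusion sampler and face recognizer are replaced by dummy stand-ins (FakePipeline, fake_face_embed), so this only illustrates the gradient path from reward to model weights under a generic ReFL-style objective; it is not the actual ReReFL algorithm from the paper.

import torch
from scipy.optimize import linear_sum_assignment


def matching_reward_torch(ref: torch.Tensor, gen: torch.Tensor) -> torch.Tensor:
    """Differentiable variant: the assignment is solved on detached similarities,
    while gradients flow through the selected similarity values."""
    ref = torch.nn.functional.normalize(ref, dim=1)
    gen = torch.nn.functional.normalize(gen, dim=1)
    sim = ref @ gen.T
    rows, cols = linear_sum_assignment(-sim.detach().cpu().numpy())
    return sim[torch.as_tensor(rows), torch.as_tensor(cols)].mean()


class FakePipeline(torch.nn.Module):
    """Dummy stand-in for a customization model: reference embeddings -> image."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.proj = torch.nn.Linear(dim, 3 * 64 * 64)

    def forward(self, ref_embeds: torch.Tensor) -> torch.Tensor:
        return self.proj(ref_embeds).view(-1, 3, 64, 64)


def fake_face_embed(images: torch.Tensor) -> torch.Tensor:
    """Dummy stand-in for a face recognizer: one embedding per generated face."""
    return images.flatten(1)[:, :512]


model = FakePipeline()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

ref_embeds = torch.randn(2, 512)          # two reference identities
images = model(ref_embeds)                # "generated" image per identity
reward = matching_reward_torch(ref_embeds, fake_face_embed(images))

loss = -reward                            # gradient ascent on the reward
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"matching reward: {reward.item():.4f}")

Note the design choice in matching_reward_torch: the Hungarian solve itself is not differentiable, so the assignment indices are computed on detached similarities and gradients flow only through the selected similarity entries.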



Comparison with State-of-the-Art Methods


Qualitative comparison with different methods.



More Results









Disclaimer


We open-source this project for academic research. The vast majority of images used in this project are either generated or from open-source datasets. If you have any concerns, please contact us, and we will promptly remove any inappropriate content. Our project is released under the Apache 2.0 License. If you apply it to other base models, please ensure that you comply with their original licensing terms.

This research aims to advance the field of generative AI. Users are free to create images using this tool, provided they comply with local laws and exercise responsible usage. The developers are not liable for any misuse of the tool by users.


✨ If you find this project useful for your research, please consider citing our paper:
@misc{cheng2025umoscalingmultiidentityconsistency,
      title={UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward}, 
      author={Yufeng Cheng and Wenxu Wu and Shaojin Wu and Mengqi Huang and Fei Ding and Qian He},
      year={2025},
      eprint={2509.06818},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.06818}, 
}


The source code of this project page can be found in the UMO GitHub Pages repository.