MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
Abstract
MegaStyle presents a scalable data curation pipeline for creating high-quality, style-consistent datasets using large generative models and proposes style-supervised contrastive learning for effective style representation extraction.
In this paper, we introduce MegaStyle, a novel and scalable data curation pipeline that constructs an intra-style consistent, inter-style diverse, and high-quality style dataset. We achieve this by leveraging the consistent text-to-image style mapping capability of current large generative models, which can generate images in the same style from a given style description. Building on this foundation, we curate a diverse and balanced prompt gallery with 170K style prompts and 400K content prompts, and generate a large-scale style dataset, MegaStyle-1.4M, via content-style prompt combinations. With MegaStyle-1.4M, we propose style-supervised contrastive learning to fine-tune a style encoder, MegaStyle-Encoder, that extracts expressive, style-specific representations, and we also train a FLUX-based style transfer model, MegaStyle-FLUX. Extensive experiments demonstrate the importance of intra-style consistency, inter-style diversity, and high quality in style datasets, as well as the effectiveness of the proposed MegaStyle-1.4M. Moreover, when trained on MegaStyle-1.4M, MegaStyle-Encoder and MegaStyle-FLUX provide reliable style similarity measurement and generalizable style transfer, making a significant contribution to the style transfer community. More results are available at our project website https://jeoyal.github.io/MegaStyle/.
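The style-supervised contrastive objective mentioned above can be illustrated with a minimal NumPy sketch of a supervised contrastive loss driven by style labels: embeddings of images sharing a style label are pulled together, while embeddings from different styles are pushed apart. The function name, temperature value, and implementation details below are illustrative assumptions, not the authors' actual code.

```python
import numpy as np

def style_supcon_loss(embeddings, style_labels, temperature=0.07):
    """Supervised contrastive loss with style labels as supervision.

    For each anchor, positives are all other images with the same style
    label; the loss is the mean negative log-probability of the positives
    under a softmax over all other images.
    """
    # L2-normalize embeddings so the dot product is cosine similarity
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    n = len(style_labels)
    logits = z @ z.T / temperature
    # exclude self-similarity from the softmax denominator
    logits = logits - 1e9 * np.eye(n)
    # numerically stable log-softmax over each anchor's row
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    labels = np.asarray(style_labels)
    pos_mask = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
    # average log-probability over each anchor's positives;
    # anchors with no positive in the batch are skipped
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0
    per_anchor = -(log_prob * pos_mask).sum(axis=1)[valid] / pos_counts[valid]
    return per_anchor.mean()
```

Under this objective, a batch whose same-style embeddings are already clustered yields a lower loss than one where styles are mixed, which is the property the style encoder is trained to satisfy.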
Community

Visualizations of our style dataset (a) MegaStyle-1.4M and the stylized results produced by our style transfer model (b) MegaStyle-FLUX. MegaStyle-1.4M contains style pairs that share the style but have different content (intra-style consistency), as well as a large number of diverse styles (inter-style diversity). Trained on MegaStyle-1.4M, MegaStyle-FLUX effectively captures nuances, such as color, light, texture, and brushwork, across various styles.
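The content-style prompt combination behind such intra-style pairs can be sketched as below. The helper name, prompt template, and sampling scheme are hypothetical; the paper does not specify these details.

```python
import random

def build_prompt_pairs(style_prompts, content_prompts,
                       contents_per_style=8, seed=0):
    """Pair each style prompt with a sample of content prompts.

    Images generated from the same style prompt share a style
    (intra-style consistency), while styles differ across prompts
    (inter-style diversity).
    """
    rng = random.Random(seed)
    pairs = []
    for style_id, style in enumerate(style_prompts):
        k = min(contents_per_style, len(content_prompts))
        for content in rng.sample(content_prompts, k=k):
            pairs.append({
                "style_id": style_id,  # groups same-style images for training
                "prompt": f"{content}, in the style of {style}",
            })
    return pairs
```

Each generated image inherits the `style_id` of its style prompt, giving the style-supervised labels needed for contrastive training.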
The dataset, model, and code are expected to be publicly released on April 20.
An interesting breakdown of this paper on arXivLens, covering the executive summary, detailed methodology, and practical applications: https://arxivlens.com/PaperView/Details/megastyle-constructing-diverse-and-scalable-style-dataset-via-consistent-text-to-image-style-mapping-4089-3b4d86dc