
{"id":11582,"date":"2025-10-06T10:03:48","date_gmt":"2025-10-06T10:03:48","guid":{"rendered":"https:\/\/novelis.io\/?post_type=research-lab&#038;p=11582"},"modified":"2025-10-06T10:09:36","modified_gmt":"2025-10-06T10:09:36","slug":"ovis-u1-comprehension-generation-et-edition-multimodales-unifiees","status":"publish","type":"research-lab","link":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/","title":{"rendered":"Ovis-U1: Unified Multimodal Understanding, Generation, and Editing"},"content":{"rendered":"\n<p>In the fast-moving field of artificial intelligence, the search for models that can seamlessly understand, generate, and manipulate visual and textual information has reached an important milestone with <strong>Ovis-U1<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"945\" height=\"208\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2.png\" alt=\"\" class=\"wp-image-11534\" style=\"width:595px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2-600x132.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2-250x55.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2-768x169.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-2-30x7.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<p>This model marks a real leap forward in unified multimodal AI, balancing compactness and performance. 
With only <strong>3.6 billion parameters<\/strong> (2.4 billion dedicated to understanding and 1.2 billion to generation), Ovis-U1 matches or exceeds the capabilities of much larger, specialized models.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"945\" height=\"989\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1.png\" alt=\"\" class=\"wp-image-11531\" style=\"width:610px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1-573x600.png 573w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1-239x250.png 239w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1-768x804.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-1-30x30.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<p>Its design shows what carefully planned unified training can achieve, closing the gap between efficiency and high performance in a way that makes earlier modular approaches look dated.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">A truly unified approach<\/h4>\n\n\n\n<p>At the heart of Ovis-U1 is a roughly 3-billion-parameter engine that integrates three core capabilities: <strong>multimodal understanding<\/strong>, <strong>text-to-image generation<\/strong>, and <strong>image editing<\/strong>. Unlike models designed for a single task, Ovis-U1 takes full advantage of the interaction between these functions. 
Its understanding of visual content informs image generation, while the generative process in turn sharpens its grasp of complex text-image relationships. Inspired by GPT-4o\u2019s ambition of unified intelligence, Ovis-U1 reaches comparable generation quality while keeping a more compact, resource-efficient architecture. This synergy between tasks makes Ovis-U1 much more than a collection of specialized tools: it is a <strong>coherent multimodal problem solver<\/strong>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Pushing the limits of multimodal understanding<\/h4>\n\n\n\n<p>On the <strong>OpenCompass<\/strong> multimodal academic benchmark, <strong>Ovis-U1<\/strong> scored <strong>69.6<\/strong>, placing it at the top of the models in the 3-billion-parameter range.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"945\" height=\"662\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3.png\" alt=\"\" class=\"wp-image-11537\" style=\"width:648px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3-600x420.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3-250x175.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3-768x538.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-3-30x21.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<p>Even against larger models such as <strong>GPT-4o<\/strong> (75.4), Ovis-U1 shows strong understanding for its size. 
Ablation studies reveal a key point: including generation and editing tasks in training improves understanding performance by <strong>1.14 points<\/strong>, which underlines the strong synergy of multi-task learning. In practice, the model does not just \u201csee\u201d or describe an image: it interprets it, puts it in context, and reasons about it with an agility once reserved for much larger networks.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"435\" height=\"682\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-4.png\" alt=\"\" class=\"wp-image-11540\" style=\"width:286px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-4.png 435w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-4-383x600.png 383w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-4-159x250.png 159w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-4-19x30.png 19w\" sizes=\"(max-width: 435px) 100vw, 435px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Capturing the imagination: text-to-image generation<\/h4>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"945\" height=\"783\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5.png\" alt=\"\" class=\"wp-image-11543\" style=\"width:682px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5-600x497.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5-250x207.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5-768x636.png 768w, 
https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-5-30x25.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<p>Ovis-U1\u2019s capabilities extend to the creative domain of text-to-image generation, with scores of <strong>83.72 on DPG-Bench<\/strong> and <strong>0.89 on GenEval<\/strong>, even surpassing GPT-4o on some complex tasks. Whether generating multi-object scenes, counting elements, or placing objects precisely, Ovis-U1 shows fine control of spatial and contextual relationships.<br>This fidelity comes from its <strong>1-billion-parameter visual decoder<\/strong>, based on the <strong>Multimodal Diffusion Transformer (MMDiT)<\/strong> and <strong>Rotary Positional Embedding (RoPE)<\/strong>. A progressive training strategy further refines how text is integrated, so that each generated image faithfully reflects the intent of the prompt.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1084\" height=\"569\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6.png\" alt=\"\" class=\"wp-image-11546\" style=\"width:471px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6.png 1084w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6-600x315.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6-250x131.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6-768x403.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-6-30x16.png 30w\" sizes=\"(max-width: 1084px) 100vw, 1084px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Redefining image editing<\/h4>
\n\n\n\n<p>Ovis-U1 performs just as well in image editing, with solid results on <strong>ImgEdit-Bench (4.00)<\/strong> and <strong>GEdit-Bench-EN (6.42)<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"945\" height=\"832\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7.png\" alt=\"\" class=\"wp-image-11549\" style=\"width:668px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7-600x528.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7-250x220.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7-768x676.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-7-30x26.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<p>Its <strong>bidirectional token refiner<\/strong> plays a central role here, strengthening the interaction between textual and visual representations. Replacing objects, removing elements, or fine-tuning a style is done precisely, where several specialized systems were previously needed. 
In practice, Ovis-U1 does more than generate: it works from the user\u2019s intent to turn natural-language descriptions into concrete visual changes.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1254\" height=\"623\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8.png\" alt=\"\" class=\"wp-image-11552\" style=\"width:709px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8.png 1254w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8-600x298.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8-250x124.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8-768x382.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-8-30x15.png 30w\" sizes=\"(max-width: 1254px) 100vw, 1254px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">The strength of diversity in training<\/h4>\n\n\n\n<p>Ovis-U1\u2019s performance rests on the variety and richness of its training data. Public datasets such as <strong>COYO, Wukong, Laion, and ShareGPT4V<\/strong> are preprocessed for greater clarity, while large text-image collections (<strong>Laion5B, JourneyDB<\/strong>) provide high-quality aesthetic guidance. 
The model also draws on specialized datasets for editing, reference-guided generation, and pixel-level control, along with in-house data for style transfer, content removal, and image enhancement.<br>Across <strong>six stages of unified training<\/strong>, these sources come together, teaching the model to understand, generate, and edit coherently.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1050\" height=\"693\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10.png\" alt=\"\" class=\"wp-image-11558\" style=\"width:685px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10.png 1050w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10-600x396.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10-250x165.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10-768x507.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-10-30x20.png 30w\" sizes=\"(max-width: 1050px) 100vw, 1050px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Architectural innovations<\/h4>\n\n\n\n<p>Ovis-U1\u2019s architecture strikes a careful balance between compactness and capability. Its <strong>visual decoder<\/strong>, <strong>bidirectional refiner<\/strong>, and <strong>Qwen3-1.7B LLM<\/strong> work together with a visual encoder that handles arbitrary resolutions. This design supports efficient integration of multimodal knowledge, so that visual understanding and generation reinforce each other. 
By treating text and image as linked rather than separate processes, Ovis-U1 achieves a cohesion rarely seen in models of this size.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"945\" height=\"440\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11.png\" alt=\"\" class=\"wp-image-11561\" style=\"width:689px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11-600x279.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11-250x116.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11-768x358.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-11-30x14.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">A six-stage training symphony<\/h4>\n\n\n\n<p>The model\u2019s performance owes nothing to chance.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"945\" height=\"698\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12.png\" alt=\"\" class=\"wp-image-11564\" style=\"width:701px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12.png 945w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12-600x443.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12-250x185.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12-768x567.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-12-30x22.png 30w\" sizes=\"(max-width: 945px) 100vw, 945px\" \/><\/figure>
\n\n\n\n<p>Its six-stage training pipeline methodically aligns understanding, generation, and editing: from the initial pretraining of the visual decoder to adapter alignment, by way of multimodal embedding refinement and progressive learning in both understanding and generation. The final stage, fine-tuning on image editing data, strengthens both generation and editing. This structured progression is key to <strong>smooth performance across tasks<\/strong>, letting Ovis-U1 operate as a unified brain rather than an assembly of specialized systems.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1328\" height=\"675\" src=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13.png\" alt=\"\" class=\"wp-image-11567\" style=\"width:718px;height:auto\" srcset=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13.png 1328w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13-600x305.png 600w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13-250x127.png 250w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13-768x390.png 768w, https:\/\/novelis.io\/wp-content\/uploads\/2025\/09\/image-13-30x15.png 30w\" sizes=\"(max-width: 1328px) 100vw, 1328px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Concrete industrial impact<\/h4>\n\n\n\n<p>In practice, Ovis-U1\u2019s unified capabilities translate into tangible benefits. 
By consolidating several specialized models into one, it simplifies workflows such as design iteration, multi-view synthesis, style transfer, and image editing, all driven by simple natural-language instructions.<br>Its versatility also covers object detection, segmentation, depth estimation, and more, making it useful for manufacturing, design, marketing, robotics, and spatial computing. Its compact size also provides <strong>computational efficiency<\/strong> that eases deployment without sacrificing performance.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Outlook<\/h4>\n\n\n\n<p>The future of Ovis-U1 looks promising. Scaling up the model should yield even higher-fidelity outputs, while curating richer image-text datasets will strengthen its ability to generalize. Architectural optimizations and reinforcement learning could improve its understanding of human intent, producing results that are more accurate, safer, and better aligned with user expectations.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Conclusion<\/h4>\n\n\n\n<p>Ovis-U1 shows that <strong>compact, unified multimodal models can rival<\/strong>, and even surpass, larger specialized systems. Through an innovative architecture, bidirectional refinement, varied data, and methodical training, it reaches state-of-the-art performance in understanding, generation, and editing. 
More than a technical feat, Ovis-U1 is a <strong>practical, scalable, and versatile solution<\/strong>, ready to transform industrial workflows and creative uses alike, and a new step in the evolution of AI.<\/p>\n","protected":false},"featured_media":11864,"template":"","categories":[510],"custom_tag":[535,536],"class_list":["post-11582","research-lab","type-research-lab","status-publish","has-post-thumbnail","hentry","category-lab-news-2","custom_tag-multimodale","custom_tag-ovis-u1"],"acf":{"externel_link":"","summary":"","filter_opacity":"70","subtitle":"","reading_time":"","authors":"","document_to_download":{"upload_a_file":false,"download_without_form":false,"file":false,"url":""},"show_recent_block_on_the_bottom_of_the_page":false},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es<\/title>\n<meta name=\"description\" content=\"Dans le monde en constante \u00e9volution de l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec Ovis-U1.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/\" \/>\n<meta property=\"og:locale\" content=\"fr_FR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es\" \/>\n<meta property=\"og:description\" content=\"Dans le monde en constante \u00e9volution de 
l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec Ovis-U1.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/\" \/>\n<meta property=\"og:site_name\" content=\"Novelis innovation\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/novelis.io\" \/>\n<meta property=\"article:modified_time\" content=\"2025-10-06T10:09:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1440\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@novelis_io\" \/>\n<meta name=\"twitter:label1\" content=\"Dur\u00e9e de lecture estim\u00e9e\" \/>\n\t<meta name=\"twitter:data1\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/\",\"url\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/\",\"name\":\"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales 
unifi\u00e9es\",\"isPartOf\":{\"@id\":\"https:\/\/novelis.io\/fr\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg\",\"datePublished\":\"2025-10-06T10:03:48+00:00\",\"dateModified\":\"2025-10-06T10:09:36+00:00\",\"description\":\"Dans le monde en constante \u00e9volution de l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec Ovis-U1.\",\"breadcrumb\":{\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#breadcrumb\"},\"inLanguage\":\"fr-FR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage\",\"url\":\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg\",\"contentUrl\":\"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg\",\"width\":2560,\"height\":1440},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\/\/novelis.io\/fr\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration 
et \u00e9dition multimodales unifi\u00e9es\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/novelis.io\/fr\/#website\",\"url\":\"https:\/\/novelis.io\/fr\/\",\"name\":\"Novelis innovation\",\"description\":\"Novelis innovation\",\"publisher\":{\"@id\":\"https:\/\/novelis.io\/fr\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/novelis.io\/fr\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"fr-FR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/novelis.io\/fr\/#organization\",\"name\":\"Novelis innovation\",\"url\":\"https:\/\/novelis.io\/fr\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"fr-FR\",\"@id\":\"https:\/\/novelis.io\/fr\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/novelis.io\/wp-content\/uploads\/2021\/12\/logo-1.png\",\"contentUrl\":\"https:\/\/novelis.io\/wp-content\/uploads\/2021\/12\/logo-1.png\",\"width\":479,\"height\":98,\"caption\":\"Novelis innovation\"},\"image\":{\"@id\":\"https:\/\/novelis.io\/fr\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/novelis.io\",\"https:\/\/x.com\/novelis_io\",\"https:\/\/www.linkedin.com\/company\/novelis-consulting\/\",\"https:\/\/www.youtube.com\/channel\/UCJ5eJR22n2GtfKaTWueWRPQ\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es","description":"Dans le monde en constante \u00e9volution de l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec Ovis-U1.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/","og_locale":"fr_FR","og_type":"article","og_title":"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es","og_description":"Dans le monde en constante \u00e9volution de l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec Ovis-U1.","og_url":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/","og_site_name":"Novelis innovation","article_publisher":"https:\/\/www.facebook.com\/novelis.io","article_modified_time":"2025-10-06T10:09:36+00:00","og_image":[{"width":2560,"height":1440,"url":"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_site":"@novelis_io","twitter_misc":{"Dur\u00e9e de lecture estim\u00e9e":"7 
minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/","url":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/","name":"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es","isPartOf":{"@id":"https:\/\/novelis.io\/fr\/#website"},"primaryImageOfPage":{"@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage"},"image":{"@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage"},"thumbnailUrl":"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg","datePublished":"2025-10-06T10:03:48+00:00","dateModified":"2025-10-06T10:09:36+00:00","description":"Dans le monde en constante \u00e9volution de l\u2019intelligence artificielle, la recherche de mod\u00e8les capables de comprendre, g\u00e9n\u00e9rer et manipuler de mani\u00e8re fluide des informations visuelles et textuelles a franchi une \u00e9tape importante avec 
Ovis-U1.","breadcrumb":{"@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#breadcrumb"},"inLanguage":"fr-FR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/"]}]},{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#primaryimage","url":"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg","contentUrl":"https:\/\/novelis.io\/wp-content\/uploads\/2025\/10\/image-Site-scaled.jpg","width":2560,"height":1440},{"@type":"BreadcrumbList","@id":"https:\/\/novelis.io\/fr\/research-lab\/ovis-u1-comprehension-generation-et-edition-multimodales-unifiees\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/novelis.io\/fr\/"},{"@type":"ListItem","position":2,"name":"Ovis-U1 : Compr\u00e9hension, g\u00e9n\u00e9ration et \u00e9dition multimodales unifi\u00e9es"}]},{"@type":"WebSite","@id":"https:\/\/novelis.io\/fr\/#website","url":"https:\/\/novelis.io\/fr\/","name":"Novelis innovation","description":"Novelis innovation","publisher":{"@id":"https:\/\/novelis.io\/fr\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/novelis.io\/fr\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"fr-FR"},{"@type":"Organization","@id":"https:\/\/novelis.io\/fr\/#organization","name":"Novelis 
innovation","url":"https:\/\/novelis.io\/fr\/","logo":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/novelis.io\/fr\/#\/schema\/logo\/image\/","url":"https:\/\/novelis.io\/wp-content\/uploads\/2021\/12\/logo-1.png","contentUrl":"https:\/\/novelis.io\/wp-content\/uploads\/2021\/12\/logo-1.png","width":479,"height":98,"caption":"Novelis innovation"},"image":{"@id":"https:\/\/novelis.io\/fr\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/novelis.io","https:\/\/x.com\/novelis_io","https:\/\/www.linkedin.com\/company\/novelis-consulting\/","https:\/\/www.youtube.com\/channel\/UCJ5eJR22n2GtfKaTWueWRPQ"]}]}},"_links":{"self":[{"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/research-lab\/11582"}],"collection":[{"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/research-lab"}],"about":[{"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/types\/research-lab"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/media\/11864"}],"wp:attachment":[{"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/media?parent=11582"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/categories?post=11582"},{"taxonomy":"custom_tag","embeddable":true,"href":"https:\/\/novelis.io\/fr\/wp-json\/wp\/v2\/custom_tag?post=11582"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}