EM-T2I: ensemble AI model for text-to-image synthesis using Meta AI and Microsoft Copilot

Shagufta Faryad, Ijaz Ali Shoukat, Ubaid Ullah, Javed Ali Khan, Ayesha Faryad, Muhammad Arslan Rauf

Research output: Contribution to journal › Article › peer-review

Abstract

In the realm of AI-powered image creation, single models often struggle to balance fast generation with accurate output, excelling in one aspect but falling short in overall effectiveness. So far, no ensemble AI model has pooled the strengths of multiple systems to enhance performance, which underscores the need for an ensemble that overcomes individual limitations and improves overall efficiency. We therefore propose EM-T2I, an ensemble AI model that integrates Meta AI for fast text interpretation and initial image creation with Microsoft Copilot for refining the visuals to increase clarity and fidelity. The proposed model is evaluated on several accuracy metrics: Object Detection Accuracy, Object Attribute Accuracy, Scene Understanding Accuracy, Image Quality Score, and Overall Precision Score (OPS). Using diverse text prompts, we conducted a comprehensive assessment of the accuracy and overall quality of the image outputs. Microsoft Copilot achieves an OPS of 73.5 and is proficient in object detection, attribute identification, scene understanding, and overall image quality. Meta AI demonstrates fast processing with mostly accurate results, albeit with intermittent clarity issues. The ensemble EM-T2I model achieves the highest OPS, 75.58, indicating its efficacy in improving image quality and semantic alignment across diverse text prompts, and represents a promising direction for advancing AI-powered image creation. Future work will further refine these prototypes to meet growing demands in image synthesis and interpretation.

Original language: English
Article number: 569
Journal: Signal, Image and Video Processing
Volume: 19
Early online date: 12 May 2025
Publication status: E-pub ahead of print - 12 May 2025

Keywords

  • Deep dream generator
  • Ensemble model
  • Meta AI Llama 3
  • MS Copilot
  • Text2Image
