TY - GEN
T1 - Enhancing Arabic Content Generation with Prompt Augmentation Using Integrated GPT and Text-to-Image Models
AU - Elsharif, Wala
AU - She, James
AU - Nakov, Preslav
AU - Wong, Simon
N1 - Publisher Copyright: © 2023 Owner/Author.
PY - 2023/6/12
Y1 - 2023/6/12
N2 - With the continuing advances in text-to-image modeling, it has become critical to design prompts that make the best of these models' capabilities and guide them to generate the most desirable images; thus the field of prompt engineering has emerged. Here, we study a method that uses prompt engineering to enhance text-to-image models' representation of Arabic culture. This work proposes a simple, novel approach to prompt engineering that uses the domain knowledge of a state-of-the-art language model, GPT, to perform prompt augmentation: a simple initial prompt is used to generate multiple, more detailed prompts related to Arabic culture, drawn from multiple categories, by a GPT model through a process known as in-context learning. The augmented prompts are then used to generate images enhanced for Arabic culture. We perform multiple experiments with a number of participants to evaluate the proposed method, which shows promising results, especially for generating prompts that are more inclusive of the different Arabic countries and that offer a wider variety of image subjects: compared to the direct approach, our proposed method generates images with more variety 85% of the time and images that are more inclusive of the Arabic countries more than 72.66% of the time.
KW - Arabic culture
KW - GPT
KW - Integrated systems
KW - Prompt engineering
UR - http://www.scopus.com/inward/record.url?scp=85173220944&partnerID=8YFLogxK
U2 - 10.1145/3573381.3596466
DO - 10.1145/3573381.3596466
M3 - Conference contribution
AN - SCOPUS:85173220944
T3 - IMX 2023 - Proceedings of the 2023 ACM International Conference on Interactive Media Experiences
SP - 276
EP - 288
BT - Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, IMX 2023
PB - Association for Computing Machinery, Inc
T2 - 2023 ACM International Conference on Interactive Media Experiences, IMX 2023
Y2 - 12 June 2023 through 15 June 2023
ER -