Rolling the Dice on Diversity: Crafting Ethnically Diverse D&D Characters with Midjourney Prompts
If you’re new to Midjourney
In technical jargon, Midjourney is a generative ‘artificial intelligence’ program that creates images from natural language descriptions, known as “prompts”, most likely using a diffusion model (Midjourney has not published the details of its architecture). Essentially, it is a specialized computer program that interprets the words and phrases provided by the user to craft intricate images. Midjourney has been trained on millions, if not billions, of existing images, and it draws on the patterns it learned from them to create what you request. The secret lies in mastering how to write prompts so that Midjourney can transfer what’s in your imagination onto a visual medium, a skill that has spawned a new career in prompt writing.
Midjourney has its Preferences
If you’ve ever used Midjourney, you’ll have quickly discovered that it has a default preference in its photorealistic representation of humans: thin, young, Caucasian women. I conducted a small experiment, asking Midjourney to generate five different images of humans engaged in activities that were as gender-neutral as possible (since every part of the description can skew its output), and observed an 85% bias towards female representation in this small sample.
Prompts: a person smiling into the camera / a person laughing with friends / a person eating
While no one truly comprehends the full mechanics under the hood of a generative AI program due to its complexity, several theories might explain the bias towards Caucasian women:
- Internet Demographics: Given that a substantial portion of training data originates from the internet, and internet users have historically skewed towards certain demographics, this bias can be inadvertently absorbed by the model.
- Content Creation: If content creators (photographers, writers, etc.) predominantly belong to a particular demographic, their work might mirror that, contributing to data bias.
- Subject Focus: Beautiful women are often the focus of photography and art, which might be a source of this bias.
- Objective Function: If the model’s objective function (what it is trying to optimize) does not explicitly account for or penalize bias, it might not learn to generate diverse outputs.
- Feature Importance: The model might learn that certain features (like skin color) are more “important” or “informative” based on the biased training data, and thus prioritize them in generation.
- User Bias: If the model is fine-tuned or adjusted based on user interactions and most users are of a particular demographic or exhibit their own biases in interaction, the model might adapt to favor these biases.
- Feedback Loops: If initial outputs that are biased receive more interactions, clicks, or other forms of engagement, the model might interpret this as positive reinforcement for such outputs.
- Data Representation: If the training data contains a disproportionate number of images or data points representing Caucasian individuals, the model will learn to prioritize or default to these characteristics.
Addressing bias in such a colossal program once it is operational can be challenging. It requires continuous minor corrections, which the programmers of Midjourney are likely working on.
DnD Character Building
As a newcomer to DnD, I found myself swept away, purchasing miniature paint sets and utilizing Midjourney to bring our dungeon master’s world to life. This is when I quickly discerned the bias associated with photorealistic images. Upon researching at the time, the succinct answer I found was that if I wanted to generate a wood elf, a program like Midjourney likely drew images from “Lord of the Rings” (nearly all-Caucasian cast) and other similar films to generate a comparable but new image, using these previous elf images as a blueprint.
Prompt: a photorealistic image of an elf standing majestically in its woodland realm
However, elves in DnD can have copper and red skin, and of course, other races in DnD are not restricted to the Caucasian persuasion. So, I decided to have some fun with Midjourney and create a prompt guide to craft a wide variety of DnD ethnicities.
Limitations:
Questionable “Artificial Intelligence”
The reason I struggle with ‘artificial intelligence’ being associated with Midjourney is that, when creating photorealistic images, it can’t generate anything that is truly novel relative to its vast blueprint. For example, if I wanted a photorealistic image of a human with pointy ears (for an elf/half-elf), Midjourney wouldn’t generate it. I tried EVERY combination I could, read forums, and went down rabbit holes. But Midjourney has seen so many images of humans with normal ears that even if you explicitly ask for pointy ears, it generates a human with human ears.
So, apologies for the photorealistic ‘Elf’ section of my Midjourney prompt guide since these elves have human ears.
Explicitly Defining Parameters
Bulldozing through the delicate terrain of Western social propriety by describing exactly what I wanted generated sometimes made me feel a bit uncomfortable. But if I didn’t explicitly define weight, ethnicity, gender, etc., then we would only get our default sexy white princesses. I tried different combinations of phrases to try to get the best results I could.
Creating My Diverse DnD Midjourney Prompt Guide
I’ve created this DnD diversity guide for anyone who is eager to craft their characters and wants a quick copy, augment, and paste method to achieve their desired results. I hope that others can benefit from the hours of time and research I’ve invested and have some fun with it!
Content Breakdown:
I didn’t want this PDF to be too large, so I chose four common DnD races (Elves, Half-Orcs, Halflings, Humans) and for each race I generated ten photorealistic, ten semi-realistic, and ten artsy creative images for a total of 120 images. Please note that some images will display class and some are more neutral.
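For anyone who wants to double-check the arithmetic or plan a similar guide, here is a minimal sketch that enumerates every image slot. The names `RACES`, `STYLES`, and `IMAGES_PER_SECTION` are my own labels for illustration, not something taken from Midjourney or from the guide itself.

```python
from itertools import product

# The four DnD races and three art styles covered in the guide.
RACES = ["Elves", "Half-Orcs", "Halflings", "Humans"]
STYLES = ["Photorealistic", "Semi-realistic", "Creative"]
IMAGES_PER_SECTION = 10  # ten images per race/style pairing

# One (race, style, image number) tuple per image in the guide.
slots = [
    (race, style, number)
    for race, style in product(RACES, STYLES)
    for number in range(1, IMAGES_PER_SECTION + 1)
]

print(len(slots))  # 4 races x 3 styles x 10 images = 120
```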
The PDF table of contents breakdown:
- Elves
  - Photorealistic
  - Semi-realistic
  - Creative
- Half-Orcs
  - Photorealistic
  - Semi-realistic
  - Creative
- Halflings
  - Photorealistic
  - Semi-realistic
  - Creative
- Humans
  - Photorealistic
  - Semi-realistic
  - Creative
As mentioned in the limitations, the Elves and Humans look the same in the photorealistic portion. You can always use Photoshop to add in the pointy ears if you’d like.
Representative of the World Population:
I asked ChatGPT to give me a breakdown of the world population:
Me: “if I had 30 people that represented the racial makeup of the global population, what proportion of people would be what ethnicity”
ChatGPT: Hello again! If you were to break down a group of 30 people to represent the racial and ethnic makeup of the global population, you'd likely consider the most populous ethnic groups and their relative sizes. As of my last update in 2021, here's a rough breakdown:
- East Asian (predominantly Han Chinese): China, with its majority Han population, has over a billion people. Considering the global population is around 7.8 billion, Han Chinese make up roughly 19% of the global total. Represented in the group: 5-6 people
- South Asian (includes various ethnicities like Indo-Aryan, Dravidian): India, Pakistan, Bangladesh, and other South Asian countries account for a significant portion of the global population, roughly 25%. Represented in the group: 7-8 people
- Black/African: Africa's population is diverse with numerous ethnic groups. As a continent, Africa has over a billion people, which is roughly 16% of the global population. Represented in the group: 4-5 people
- Caucasian/White: Populations from Europe, North America, Australia, and parts of West Asia fall under this category. They make up roughly 15% of the global population. Represented in the group: 4-5 people
- Southeast Asian: Countries like Indonesia, the Philippines, Vietnam, Thailand, and Myanmar. They contribute roughly 8% of the global population. Represented in the group: 2-3 people
- Arab/West Asian: Includes countries from the Middle East and parts of North Africa. They make up roughly 5% of the global population. Represented in the group: 1-2 people
- Indigenous and other groups: This would include native populations from regions like the Americas, Oceania, and other smaller ethnic groups globally. They collectively make up a smaller portion of the global population. Represented in the group: 1 person
- Latino/Hispanic: While Latino/Hispanic is an ethno-linguistic category rather than a racial one, they make up a notable portion of the global population, especially from countries in Central and South America. Represented in the group: 2-3 people
Please note that these are broad estimations, and the actual distribution can vary depending on more specific demographic data. The concept of "race" and "ethnicity" is complex and can be defined differently across cultures and regions. This breakdown aims to give a general idea rather than an exact representation.
I also asked ChatGPT to provide me with resources to back up these statements which I’ve included at the end of this article.
I created a table to determine how many of each ethnicity to include in each DnD race, totaling thirty images:
Ethnicity | Number of images
East Asian (predominantly Han) | 6
South Asian | 7
Black/African | 5
Caucasian | 4
Southeast Asian | 2
Arab/West Asian | 2
Indigenous | 2
Latino | 2
Total | 30
I utilized a program to randomly generate the order of this table, which I then incorporated into the guide. For each subgroup of ten within each DnD race, I added an extra attribute of diversity, such as trans, wheelchair user, albino, etc. The true representation of diversity is endless, so I aimed to be as varied as possible within the limitations. However, if anyone has a specific request, I am also happy to accommodate it in another prompt guide!
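The randomization step can be sketched roughly as follows. This is not the program I actually used, just a minimal illustration of the idea: expand the counts from the table into a pool of thirty entries, shuffle it, and split it into the three groups of ten. The function name `build_assignment` and the `seed` argument are hypothetical choices for this example.

```python
import random

# Counts per group, copied from the table above (they sum to 30).
ETHNICITY_COUNTS = {
    "East Asian": 6,
    "South Asian": 7,
    "Black/African": 5,
    "Caucasian": 4,
    "Southeast Asian": 2,
    "Arab/West Asian": 2,
    "Indigenous": 2,
    "Latino": 2,
}

STYLES = ["Photorealistic", "Semi-realistic", "Creative"]


def build_assignment(counts, styles, per_style=10, seed=None):
    """Shuffle the pool of ethnicities and split it into one group per style."""
    # Expand {"East Asian": 6, ...} into a flat list with 6 "East Asian" entries, etc.
    pool = [name for name, count in counts.items() for _ in range(count)]
    if len(pool) != per_style * len(styles):
        raise ValueError("counts must sum to per_style * number of styles")

    # A seeded generator makes the shuffle repeatable when a seed is given.
    rng = random.Random(seed)
    rng.shuffle(pool)

    # Slice the shuffled pool into consecutive chunks of per_style entries.
    return {
        style: pool[i * per_style:(i + 1) * per_style]
        for i, style in enumerate(styles)
    }


assignment = build_assignment(ETHNICITY_COUNTS, STYLES, seed=2023)
for style, group in assignment.items():
    print(style, group)
```

Running this once per DnD race (with a different seed each time) would give each race its own random ordering while keeping the overall counts fixed.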
The Art and Challenge of Creating Prompts
After that extensive preamble, this is where things get interesting! I installed the ChatGPT plugin Photorealistic, which is designed to generate photorealistic prompts for the Midjourney image creation tool (https://gptstore.ai/plugins/midjourney-ruddy-vercel-app). To be honest, though, the plugin was a bit glitchy, repetitive, and lacking in imagination. Once I explained to ChatGPT what Midjourney was (similar to DALL-E, which it is familiar with), ChatGPT on its own was able to generate more varied results.
I thought this would be the easiest part. After all, I just needed to tell ChatGPT to generate ten ethnically diverse elf prompts of different classes, augment them if needed, generate them in Midjourney, and then record the results. This was not the case.
ChatGPT Prompt Practice and Experimentation
These are just some excerpts from different variations I tried when I first started out. I asked ChatGPT to generate ten prompts at a time, and the results made me stop and pause. So I asked for specific descriptions with different phrasings and got similar results:
Me: can you give me a photorealistic midjourney prompt for an ethnically south east asian dnd elf
ChatGPT: In a mystical Southeast Asian-inspired landscape, an elf, with features reflecting the rich ethnic diversity of Southeast Asia, stands ready for adventure, surrounded by a lush, fantastical environment and mythical beings. The elf is adorned in attire that is a fusion of traditional Southeast Asian and elven aesthetics, wielding a magical staff that glows with an ancient power. Medium: Photography. Style: Photo hyper-realism, capturing the essence of adventure and the rich cultural aesthetics of Southeast Asia, ensuring a high-resolution 16k visual. Lighting: The golden hour glow illuminates the scene, casting a warm light on the subject and the landscape. Colors: A harmonious blend of the vibrant colors typical in Southeast Asian attire and the lush greens and earthy browns of the landscape. Composition: Nikon D850 DSLR camera, AF-S NIKKOR 85mm f/1.4G lens, Resolution 45.7 megapixels, ISO sensitivity: 200, Shutter speed 1/160 second. --ar 16:9 --v 5.1 --style raw --q 2 --s 750
Midjourney Result:
Me: how would you describe someone from Sub Saharan Africa for a DnD Midjourney prompt?
ChatGPT: When describing someone from Sub-Saharan Africa in a DnD Midjourney prompt, it's crucial to approach the topic with respect and avoid perpetuating stereotypes. Here's a suggestion: 📷 Prompt: A DnD character inspired by the diverse cultures of Sub-Saharan Africa, standing poised in a fantastical savannah under a vibrant sunset. The character, with rich, dark skin and textured hair, is adorned in a majestic attire that draws inspiration from various traditional garments and accessories found across Sub-Saharan Africa, such as beadwork, headdresses, and patterned fabrics. They wield a magical staff, carved from a mythical tree, and their eyes gleam with wisdom and mystical power. Medium: Digital Art. Style: Semi-realistic, blending the rich cultural aesthetics of Sub-Saharan Africa with fantastical DnD elements. Lighting: The setting sun casts a warm, golden light across the scene, highlighting the character. Colors: A palette of rich earthy tones, vibrant reds, and oranges, and deep greens. --ar 16:9 --v 5.1 --style raw --q 2 --s 750 This prompt focuses on creating a character that respectfully integrates the rich and diverse cultural aesthetics of Sub-Saharan Africa with the enchanting elements of a DnD world.
Midjourney Result:
Me: how would you describe someone who looks like they’re from India in Midjourney?
ChatGPT: When describing someone in a Midjourney prompt, it's essential to approach the topic with sensitivity and respect, avoiding stereotypes or generalizations. Here's how you might describe a character that looks like they're from India in a Midjourney prompt: 📷 Prompt: Portrait of an individual with features characteristic of Indian heritage, adorned in traditional attire, capturing the depth of their gaze and the rich history of their culture. Medium: Photography. Style: Realistic, emphasizing the intricate details of the attire and the warmth of the individual's expression. Lighting: Soft, natural daylight, casting a gentle glow on the face. Colors: Rich saffron, deep maroon, and vibrant golds, complemented by natural skin tones. Composition: Canon EOS 5D Mark IV DSLR camera, EF 50mm f/1.8 STM lens, Resolution 30.4 megapixels, ISO sensitivity: 100, Shutter speed 1/125 second. --ar 16:9 --v 5.1 --style raw --q 2 --s 750 This prompt focuses on capturing the essence of the individual while acknowledging their Indian heritage in a respectful manner.
Midjourney Result:
Striking a Balance: Crafting Ethnically Diverse D&D Characters Without Perpetuating Stereotypes
Easy peasy, right? Not quite. Each ethnically diverse prompt felt Orientalized and seemed to reduce cultures to very specific stereotypes. Intriguingly, this only occurred for non-Caucasian prompts, with a majority of the emphasis placed on cultural heritage rather than the individual when it came to diversity. Every prompt included a small paragraph about respect, and while most of these prompts did yield some visually appealing images, they weren’t what I was aiming for.
Me: how would you describe a white person in midjourney?
ChatGPT: When describing a white person in a Midjourney prompt, it's essential to be straightforward and avoid any stereotypes. Here's a suggestion: 📷 Prompt: Portrait of a Caucasian individual, capturing the depth of their gaze and the nuances of their expression. Medium: Photography. Style: Realistic, emphasizing the intricate details of facial features and the individual's demeanor. Lighting: Soft, natural daylight, casting a gentle glow on the face. Colors: Natural skin tones, complemented by a neutral background. Composition: Canon EOS 5D Mark IV DSLR camera, EF 50mm f/1.8 STM lens, Resolution 30.4 megapixels, ISO sensitivity: 100, Shutter speed 1/125 second. --ar 16:9 --v 5.1 --style raw --q 2 --s 750 This prompt focuses on capturing the essence of the individual while acknowledging their Caucasian ethnicity in a straightforward manner.
Midjourney Result:
A crucial point to emphasize is that this cultural imbuing was NOT the case when I asked ChatGPT to describe a white person (as seen in the prompt above); no cultural heritage was woven into the description, just a neutral person. THIS was what I was looking for in describing someone of different ethnicities – a way to capture the appearance in its neutrality and then add that to the DnD universe. Thus, my perspective leaned towards the necessity of achieving a balance between maintaining the neutrality of different ethnicities without resorting to cultural reduction, stereotyping, and exoticization. In my six years of living in Hong Kong, I’ve never seen anyone strolling down the street in historic Qing dynasty garb, nor has my cherished ex-colleague from Gurugram paraded around in traditional Northern Indian attire—just as I don’t wander around in a leprechaun outfit to represent my Irish ancestry. Beyond that, these generalizations feel vacuous and unrepresentative of the people we encounter daily, as they overgeneralize rich cultures and position the Caucasian as the default, relegating all else as ‘other’.
As previously mentioned, this stems from implicit bias, the data with which Midjourney has been trained, and its interactions with users. However, that doesn’t mean a viable solution is out of reach!
I dedicated some time to asking ChatGPT to generate DnD prompts without any focus on ethnicity. Subsequently, I asked it to describe individuals of different ethnic origins, at varying levels of specificity, to discern the best approach to achieve the desired results. I then took the neutral DnD race prompts and augmented them with more specific descriptions of ethnicity without making that the focal point, ensuring we’d have a splendid wood elf wandering through the woods who also happened to have dark skin.
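The "copy, augment, and paste" idea can be sketched as a small helper. This is only an illustration under my own assumptions: the function `compose_prompt` and the example appearance phrase are hypothetical, and the trailing parameters are a subset of the ones ChatGPT appended in the excerpts above. The point is that the appearance is a short clause attached to the subject, while the scene stays the neutral DnD description.

```python
def compose_prompt(subject, scene, appearance="", params=""):
    """Build a Midjourney prompt: subject, optional appearance clause, scene, parameters."""
    parts = [subject]
    if appearance:
        # The appearance is a brief physical description, not a cultural theme.
        parts.append(f"with {appearance}")
    parts.append(scene)
    prompt = " ".join(parts)
    # Parameters such as --ar go at the very end of a Midjourney prompt.
    return f"{prompt} {params}".strip()


neutral = compose_prompt(
    "a photorealistic image of a wood elf",
    "standing majestically in its woodland realm",
)
augmented = compose_prompt(
    "a photorealistic image of a wood elf",
    "standing majestically in its woodland realm",
    appearance="dark brown skin and short coiled hair",  # hypothetical example phrase
    params="--ar 16:9 --v 5.1 --style raw",
)
print(neutral)
print(augmented)
```

Because the scene and attire never change between the neutral and augmented versions, the character stays a DnD elf first and foremost.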
In Conclusion
Embarking on a seemingly straightforward task, I found that the journey to the solution was not only interesting but also provided a unique insight into the workings of Midjourney. This project prompted me to think over several days about why certain prompts felt misaligned with my intentions and why, at moments, I hesitated to proceed. While the prompts, deeply rooted in cultural heritage, occasionally produced cool results, they did not align with my initial aspiration: to create DnD characters that mirror the myriad of humans inhabiting our diverse world.
If ChatGPT had its way, it might represent Floridians (my home state) in a particular manner, approaching the topic with sensitivity and respect, sidestepping stereotypes and generalizations, capturing the essence of Florida while acknowledging its rich cultural history:
I think it’s safe to say that we’re still a ways away from the AI singularity and a world dominated by artificial intelligence.
Thanks for investing your time in navigating through this exploration and ramble of mine! Should you like more work, experimentation, exploration, or ramblings, please don’t hesitate to reach out. Also, if anyone is curious to see the somewhat stereotypical results from ChatGPT and Midjourney, just let me know – I did encounter some amusing results when inquiring about specific groups of Americans.
Resources (given to me by ChatGPT):
- East Asian (predominantly Han Chinese)
  - World Bank Population Data for China
  - Ethnologue’s data on Chinese language speakers (since language can often approximate ethnicity)
- South Asian
  - World Bank Population Data (for countries like India, Pakistan, Bangladesh)
- Black/African
- Caucasian/White
- Southeast Asian
- Arab/West Asian
- Latino/Hispanic