3rd DATA-FM Workshop @ ICLR 2026 · Brazil
Unmasking LAION-5B: Age, Gender, Race,
and Emotion Biases in Large-Scale Image Datasets
Iris Dominguez-Catena  ·  Daniel Paternain  ·  Mikel Galar
Institute of Smart Cities (ISC), Dept. of Statistics, Computer Science and Mathematics · Public University of Navarre, Pamplona, Spain
✉ {iris.dominguez, daniel.paternain, mikel.galar}@unavarra.es
1 Motivation

Large-scale image-text datasets like LAION-5B [1] are foundational for generative AI (Stable Diffusion, DALL-E, FLUX). Yet their uncurated, web-scraped nature raises critical concerns about embedded biases.

Instead of focusing on harmful content or outputs of trained models, we analyze the underlying demographic composition of LAION-5B for bias [5,6]:

Representational bias: unbalanced demographic group prevalence

Stereotypical bias: unwarranted associations between demographics and other attributes

Intersectional bias: compounding effects at the intersection of multiple demographics

2 Methodology

Analysis Pipeline

Sample ~1M URLs
Download & Verify
Face Detection
Demographic Inference
Bias Analysis

Models Used

RetinaFace: Face detection (≥48×48 px filter)
FairFace [2]: Age, gender, race classification
DeepFace [3]: Age, gender, race (cross-validation)
Emo-AffectNet [4]: 7-class facial expression recognition
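
The ≥48×48 px face filter can be illustrated with a short sketch. It assumes detections arrive as (x1, y1, x2, y2) pixel boxes; that layout and the function name are illustrative assumptions, not the actual output format of the detector used here.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels. This is a hypothetical
# format for illustration, not a documented detector output.
Box = Tuple[int, int, int, int]

MIN_FACE_SIDE = 48  # minimum width and height, from the >=48x48 px filter


def filter_small_faces(boxes: List[Box], min_side: int = MIN_FACE_SIDE) -> List[Box]:
    """Keep only boxes whose width and height are both at least min_side."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        width, height = x2 - x1, y2 - y1
        if width >= min_side and height >= min_side:
            kept.append((x1, y1, x2, y2))
    return kept
```

Requiring both sides to meet the threshold discards narrow or flat detections that would pass an area-only check.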

Bias Metric: Ducher's Z

Compares observed co-occurrence of group g and class y to expected co-occurrence if independent. Ranges from −1 (max. underrepresentation) to +1 (max. overrepresentation), with 0 indicating no association. Used for both intersectional and stereotypical bias analysis [5].
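
A minimal sketch of the metric, assuming the piecewise normalization of Ducher's Z described in [5]: the gap between observed and expected co-occurrence is divided by the largest gap achievable in that direction given the marginals. The function names and the example counts are hypothetical.

```python
from typing import List


def ducher_z(p_gy: float, p_g: float, p_y: float) -> float:
    """Ducher's Z for joint probability p_gy and marginals p_g, p_y.

    Assumed form: (observed - expected) divided by the largest possible
    deviation in the same direction. Returns 0.0 for degenerate marginals.
    """
    expected = p_g * p_y
    diff = p_gy - expected
    if diff > 0:
        # Overrepresentation: the joint cannot exceed the smaller marginal.
        denom = min(p_g, p_y) - expected
    else:
        # Underrepresentation: the joint cannot fall below max(0, p_g + p_y - 1).
        denom = expected - max(0.0, p_g + p_y - 1.0)
    if denom <= 0:
        return 0.0
    return diff / denom


def ducher_z_table(counts: List[List[int]]) -> List[List[float]]:
    """Apply ducher_z to every cell of a group-by-class count table."""
    total = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(n_cols)]
    return [
        [ducher_z(counts[i][j] / total, row_p[i], col_p[j]) for j in range(n_cols)]
        for i in range(len(counts))
    ]


# Hypothetical 2x2 table (rows: groups, columns: classes), not data from this study.
example = ducher_z_table([[30, 20], [20, 30]])
```

With these made-up counts the diagonal cells come out at about +0.2 and the off-diagonal cells at about −0.2, which is a mild association on the −1 to +1 scale.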

3 Dataset

We analyze the 2024 re-release of LAION-5B, studying its two main components, English (LAION-2B-en) and multilingual (LAION-2B-multi), separately.

~1M URLs sampled · ~460K images downloaded · 79.9K faces found

Although it covers only ~0.02% of the full dataset, our sample provides a worst-case margin of error of ±0.51% at 95% confidence for the reported proportions.
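
The relation between sample size and a worst-case margin of error can be sketched as follows. This assumes simple random sampling and the normal approximation for a proportion, with the worst case at p = 0.5. The sample size behind the ±0.51% figure is not stated here, so the helpers only show the relationship; their names are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def worst_case_margin(n: int, z: float = Z_95) -> float:
    """Margin of error for a proportion at p = 0.5, where the variance peaks."""
    return z * math.sqrt(0.25 / n)


def sample_size_for_margin(margin: float, z: float = Z_95) -> int:
    """Smallest n whose worst-case margin does not exceed the given margin."""
    return math.ceil((z * 0.5 / margin) ** 2)
```

Under these assumptions, a ±0.51% margin corresponds to roughly 37,000 observations, which suggests the figure refers to a per-component subset rather than to the full set of sampled URLs.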

Image content was hash-verified against LAION-5B metadata to ensure data integrity.
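
A minimal sketch of such a content check, assuming the metadata stores a hexadecimal digest of the image bytes. The hash algorithm is not specified here; MD5 is used purely as an assumption, and the function name is hypothetical.

```python
import hashlib


def content_matches(image_bytes: bytes, expected_hex: str, algorithm: str = "md5") -> bool:
    """Return True if the digest of the downloaded bytes equals the expected one.

    The default algorithm is an assumption for illustration, not a statement
    about which hash the dataset metadata actually uses.
    """
    digest = hashlib.new(algorithm, image_bytes).hexdigest()
    return digest == expected_hex.strip().lower()
```

Images that fail this check have changed or been replaced at their URL since the dataset was built, so excluding them keeps the sample aligned with the original content.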

4 Representational Bias
Figure: Distribution of demographic attributes (age, gender, race) in LAION-5B, according to FairFace and DeepFace.

Age: Both models show strong overrepresentation of 20–39 year-olds. LAION-2B-en skews younger than LAION-2B-multi.

Gender: Consistent male predominance (57–61% male according to FairFace), with a larger imbalance in LAION-2B-multi.

Race: White is the largest group at 50–60%. The rest of the groups are consistently underrepresented across both models and both dataset components.

Emotions: Facial expression distribution is dominated by "Happiness" (33–36%) and "Neutrality" (25–26%).

5 Intersectional Bias
Figure: Gender–Age intersectional bias (Ducher's Z, FairFace), shown as a butterfly chart. Bars show over/underrepresentation.

Gender–Age shows the strongest consistent bias. Females are overrepresented below age 30 (peak: Z = 0.35 at 20–29), while males dominate above 30, reaching Z = 0.61 at 60–69.

Age–Race: The oldest groups (60+) are underrepresented across most races. White infants are overrepresented, while infants of all other race groups are underrepresented.

Gender–Race: Weaker and less consistent between models.

6 Emotion–Demographic Bias
Figure: Emotion–Gender stereotypical bias (Ducher's Z, FairFace), shown as a butterfly chart.

Emotion–Gender: strongest stereotypical bias. Males overrepresented in "Anger" (Z = 0.42), females in "Happiness" (Z = 0.19). This echoes the "angry-man-happy-woman" stereotype from psychology [7].

Emotion–Age: Under-30s are underrepresented in "Anger" and "Disgust", and older groups (60+) in "Fear", "Sadness", and "Surprise".

Emotion–Race: Subtle and model-dependent.

7 Key Findings
1

Massive demographic imbalances: Young adults (20–39), White individuals (50–60%), and males (57–70%) are heavily overrepresented. Minority racial groups and older women are consistently underrepresented.

2

"Angry-man-happy-woman": Strong stereotypical biases link "Anger" and "Disgust" disproportionately to males, and "Happiness" to females.

3

Multilingual ≠ more diverse: LAION-2B-en and LAION-2B-multi exhibit remarkably similar bias profiles. The multilingual component shows only slightly greater racial/age diversity, at the cost of increased gender disparity.

4

Systemic and deeply embedded: These biases are shared with most web-scraped datasets, and potentially affect most generative AI models.

8 Conclusions & Future Work

LAION-5B exhibits deeply embedded demographic imbalances that are consistent across dataset components and demographic models.

Future work should trace how these biases propagate through specific generative pipelines, validate findings with human annotations, and extend this audit to other large-scale datasets such as COYO-700M and DataComp.

Selected References

[1] Schuhmann et al. (2022). LAION-5B: An open large-scale dataset for training next generation image-text models.

[2] Karkkainen & Joo (2021). FairFace: Face attribute dataset for balanced race, gender, and age for bias measurement and mitigation. WACV.

[3] Serengil & Ozpinar (2024). A benchmark of facial recognition pipelines and co-usability performances of modules. J. Inf. Technol.

[4] Ryumina et al. (2022). In search of a robust facial expressions recognition model: A large-scale visual cross-corpus study. Neurocomputing.

[5] Dominguez-Catena et al. (2024). Metrics for dataset demographic bias: a case study on Facial Expression Recognition. IEEE TPAMI.

[6] Buolamwini & Gebru (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. FAccT.

[7] Becker et al. (2007). The confounded nature of angry men and happy women. JPSP.

Birhane et al. (2023). On hate scaling laws for data-swamps. arXiv.

Luccioni et al. (2023). Stable bias: Evaluating societal representations in diffusion models. NeurIPS.

Nicoletti & Bass (2023). Humans are biased. Generative AI is even worse. Bloomberg.