Ayan Majumdar
Max Planck Institute for Software Systems
Building E1 5, Room 538
Email: ayanm@mpi-sws.org
About me
Hello there! My name is Ayan. I am a Ph.D. student in Computer Science at the Max Planck Institute for Software Systems and Saarland University, Germany, jointly advised by Prof. Krishna Gummadi and Prof. Isabel Valera.
Prior to joining as a Ph.D. student, I completed my M.Sc. in Computer Science at Saarland University. Before that, I completed my Bachelor's in Electronics and Communication Engineering at Heritage Institute of Technology, India, and worked for two years as a Systems Engineer at Infosys, India.
I am broadly interested in the fairness, robustness, and understanding of machine learning systems used in consequential, high-stakes decision-making. In particular, I explore the application of deep generative models in these contexts. I am also enthusiastic about understanding and explaining deep learning models and their perceptions (from both human-centric and machine-centric views).
You can find more information on my personal page here.
Publications
2025
Majumdar, Ayan; Kanubala, Deborah Dormah; Gupta, Kavya; Valera, Isabel
A Causal Framework to Measure and Mitigate Non-binary Treatment Discrimination Journal Article
In: CoRR, vol. abs/2503.22454, 2025.
@article{DBLP:journals/corr/abs-2503-22454,
title = {A Causal Framework to Measure and Mitigate Non-binary Treatment Discrimination},
author = {Ayan Majumdar and Deborah Dormah Kanubala and Kavya Gupta and Isabel Valera},
url = {https://doi.org/10.48550/arXiv.2503.22454},
doi = {10.48550/ARXIV.2503.22454},
year = {2025},
date = {2025-03-19},
urldate = {2025-03-19},
journal = {CoRR},
volume = {abs/2503.22454},
abstract = {Fairness studies of algorithmic decision-making systems often simplify complex decision processes, such as bail or lending decisions, into binary classification tasks (e.g., approve or not approve). However, these approaches overlook that such decisions are not inherently binary; they also involve non-binary treatment decisions (e.g., loan or bail terms) that can influence the downstream outcomes (e.g., loan repayment or reoffending). We argue that treatment decisions are integral to the decision-making process and, therefore, should be central to fairness analyses. Consequently, we propose a causal framework that extends and complements existing fairness notions by explicitly distinguishing between decision-subjects’ covariates and the treatment decisions. Our framework leverages path-specific counterfactual reasoning to: (i) measure treatment disparity and its downstream effects in historical data; and (ii) mitigate the impact of past unfair treatment decisions when automating decision-making. We use our framework to empirically analyze four widely used loan approval datasets to reveal potential disparity in non-binary treatment decisions and their discriminatory impact on outcomes, highlighting the need to incorporate treatment decisions in fairness assessments. Finally, by intervening in treatment decisions, we show that our framework effectively mitigates treatment discrimination from historical loan approval data to ensure fair risk score estimation and (non-binary) decision-making processes that benefit all stakeholders.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
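The path-specific counterfactual reasoning described in the abstract can be illustrated on a toy linear structural causal model (the variables, coefficients, and noise scales below are illustrative assumptions, not the paper's model): intervening on the sensitive attribute only along the attribute-to-treatment edge, while holding covariates and noise fixed, isolates the treatment disparity and its downstream effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Toy linear SCM (all coefficients illustrative):
#   A: sensitive attribute, X: covariate,
#   T: non-binary treatment (e.g., loan terms),
#   Y: downstream outcome (e.g., repayment).
A = rng.integers(0, 2, n).astype(float)
X = rng.normal(size=n)
U_T = rng.normal(scale=0.1, size=n)
U_Y = rng.normal(scale=0.1, size=n)
T = 1.0 * X - 0.5 * A + U_T
Y = 0.8 * T + 0.3 * X + U_Y

# Path-specific counterfactual: set A to the baseline 0 only along the
# A -> T edge, keeping X and the exogenous noise terms fixed.
T_cf = 1.0 * X - 0.5 * 0.0 + U_T
Y_cf = 0.8 * T_cf + 0.3 * X + U_Y

# Treatment disparity and its downstream effect for the group A == 1:
# in this toy SCM they recover the structural coefficients exactly.
treatment_disparity = (T_cf - T)[A == 1].mean()
downstream_effect = (Y_cf - Y)[A == 1].mean()
```

Because the noise terms cancel in the counterfactual contrast, the measured disparity (0.5) and its downstream effect (0.8 × 0.5 = 0.4) match the toy model's coefficients; on real data the paper estimates these quantities from learned causal models instead.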
2025
Majumdar, Ayan; Chen, Feihao; Li, Jinghui; Wang, Xiaozhen
Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study Journal Article
In: CoRR, vol. abs/2510.04641, 2025.
@article{DBLP:journals/corr/abs-2510-04641,
title = {Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study},
author = {Ayan Majumdar and Feihao Chen and Jinghui Li and Xiaozhen Wang},
url = {https://doi.org/10.48550/arXiv.2510.04641},
doi = {10.48550/ARXIV.2510.04641},
year = {2025},
date = {2025-10-06},
urldate = {2025-10-06},
journal = {CoRR},
volume = {abs/2510.04641},
abstract = {Large-scale web-scraped text corpora used to train general-purpose AI models often contain harmful demographic-targeted social biases, creating a regulatory need for data auditing and developing scalable bias-detection methods. Although prior work has investigated biases in text datasets and related detection methods, these studies remain narrow in scope. They typically focus on a single content type (e.g., hate speech), cover limited demographic axes, overlook biases affecting multiple demographics simultaneously, and analyze limited techniques. Consequently, practitioners lack a holistic understanding of the strengths and limitations of recent large language models (LLMs) for automated bias detection. In this study, we conduct a comprehensive benchmark study on English texts to assess the ability of LLMs in detecting demographic-targeted social biases. To align with regulatory requirements, we frame bias detection as a multi-label task of detecting targeted identities using a demographic-focused taxonomy. We then systematically evaluate models across scales and techniques, including prompting, in-context learning, and fine-tuning. Using twelve datasets spanning diverse content types and demographics, our study demonstrates the promise of fine-tuned smaller models for scalable detection. However, our analyses also expose persistent gaps across demographic axes and multi-demographic targeted biases, underscoring the need for more effective and scalable detection frameworks.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
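The multi-label framing described in the abstract, detecting which demographic identities a text targets, can be sketched as follows. The taxonomy, examples, and model predictions are all hypothetical; the per-axis F1 scores show how gaps across demographic axes become visible under this evaluation.

```python
# Hypothetical demographic-axis taxonomy; the paper uses a richer,
# regulation-aligned taxonomy over twelve datasets.
AXES = ["gender", "race", "religion", "age"]

# Each example is labeled with the set of targeted demographic axes
# (multi-label: a text can target several demographics at once).
gold = [{"gender"}, {"race", "religion"}, set(), {"age"}, {"gender", "race"}]
pred = [{"gender"}, {"race"}, set(), set(), {"gender", "race"}]

def f1_per_axis(gold, pred):
    """Per-axis F1 for the multi-label 'which identities are targeted' task."""
    scores = {}
    for axis in AXES:
        tp = sum(axis in g and axis in p for g, p in zip(gold, pred))
        fp = sum(axis not in g and axis in p for g, p in zip(gold, pred))
        fn = sum(axis in g and axis not in p for g, p in zip(gold, pred))
        # Convention: an axis never present in gold or pred scores 1.0.
        scores[axis] = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0
    return scores

scores = f1_per_axis(gold, pred)
```

In this toy run the model scores perfectly on "gender" and "race" but misses every "religion" and "age" target, the kind of per-axis gap the benchmark study surfaces.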
2024
Majumdar, Ayan; Valera, Isabel
CARMA: A practical framework to generate recommendations for causal algorithmic recourse at scale Proceedings Article
In: The 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024, Rio de Janeiro, Brazil, June 3-6, 2024, pp. 1745–1762, ACM, 2024.
@inproceedings{DBLP:conf/fat/MajumdarV24,
title = {CARMA: A practical framework to generate recommendations for causal algorithmic recourse at scale},
author = {Ayan Majumdar and Isabel Valera},
url = {https://doi.org/10.1145/3630106.3659003},
doi = {10.1145/3630106.3659003},
year = {2024},
date = {2024-01-01},
urldate = {2024-01-01},
booktitle = {The 2024 ACM Conference on Fairness, Accountability, and Transparency,
FAccT 2024, Rio de Janeiro, Brazil, June 3-6, 2024},
pages = {1745–1762},
publisher = {ACM},
abstract = {Algorithms are increasingly used to automate large-scale decision-making processes, e.g., online platforms that make instant decisions in lending, hiring, and education. When such automated systems yield unfavorable decisions, it is imperative to allow for recourse by accompanying the instantaneous negative decisions with recommendations that can help affected individuals to overturn them. However, the practical challenges of providing algorithmic recourse in large-scale settings are not negligible: giving recourse recommendations that are actionable requires not only causal knowledge of the relationships between applicant features but also solving a complex combinatorial optimization problem for each rejected applicant. In this work, we introduce CARMA, a novel framework to generate causal recourse recommendations at scale. For practical settings with limited causal information, CARMA leverages pre-trained state-of-the-art causal generative models to find recourse recommendations. More importantly, CARMA addresses the scalability of finding these recommendations by casting the complex recourse optimization problem as a prediction task. By training a novel neural-network-based framework, CARMA efficiently solves the prediction task without requiring supervision for optimal recourse actions. Our extensive evaluations show that post-training, running inference on CARMA reliably amortizes causal recourse, generating optimal and instantaneous recommendations. CARMA exhibits flexibility, as its optimization is versatile with respect to the algorithmic decision-making and pre-trained causal generative models, provided their differentiability is ensured. Furthermore, we showcase CARMA in a case study, illustrating its ability to tailor causal recourse recommendations by readily incorporating population-level feature preferences based on factors such as difficulty or time needed.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
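For intuition about what amortization buys, here is a minimal sketch of the per-instance recourse step for a toy linear decision rule (the rule, weights, and minimum-L2 objective are illustrative assumptions, not CARMA's causal optimization): each rejected applicant requires solving an optimization problem, which CARMA replaces with a single forward pass of a trained network.

```python
import numpy as np

# Toy decision rule (illustrative): approve if w @ x >= b.
w = np.array([2.0, 1.0])
b = 1.0

def minimal_recourse(x, w, b):
    """Minimum-L2 change moving x onto the approval boundary w @ x = b.

    This is the per-instance optimization step; CARMA's contribution is
    to amortize it, training a network so that recourse for each rejected
    applicant needs only inference rather than a fresh optimization.
    """
    gap = b - w @ x
    if gap <= 0:
        return np.zeros_like(x)      # already approved: no action needed
    return gap / (w @ w) * w         # closed-form projection onto boundary

x = np.array([0.0, 0.0])             # a rejected applicant
delta = minimal_recourse(x, w, b)    # recommended feature change
```

For linear rules this projection has a closed form; for the causal, combinatorial setting the paper targets there is no such shortcut, which is what makes the learned amortization valuable.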
2023
Nanda, Vedant; Majumdar, Ayan; Kolling, Camila; Dickerson, John P.; Gummadi, Krishna P.; Love, Bradley C.; Weller, Adrian
Do Invariances in Deep Neural Networks Align with Human Perception? Proceedings Article
In: Williams, Brian; Chen, Yiling; Neville, Jennifer (Ed.): Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February 7-14, 2023, pp. 9277–9285, AAAI Press, 2023.
@inproceedings{DBLP:conf/aaai/NandaMKDGLW23,
title = {Do Invariances in Deep Neural Networks Align with Human Perception?},
author = {Vedant Nanda and Ayan Majumdar and Camila Kolling and John P. Dickerson and Krishna P. Gummadi and Bradley C. Love and Adrian Weller},
editor = {Brian Williams and Yiling Chen and Jennifer Neville},
url = {https://doi.org/10.1609/aaai.v37i8.26112},
doi = {10.1609/AAAI.V37I8.26112},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI
2023, Thirty-Fifth Conference on Innovative Applications of Artificial
Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances
in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February
7-14, 2023},
pages = {9277–9285},
publisher = {AAAI Press},
abstract = {An evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs which have identical representations (on a given layer) of a neural network, and thus capture invariances of a given network. One necessary criterion for a network's invariances to align with human perception is for its IRIs to look 'similar' to humans. Prior works, however, have mixed takeaways; some argue that later layers of DNNs do not learn human-like invariances, yet others seem to indicate otherwise. We argue that the loss function used to generate IRIs can heavily affect takeaways about invariances of the network and is the primary reason for these conflicting findings. We propose an adversarial regularizer on the IRI generation loss that finds IRIs that make any model appear to have very little shared invariance with humans. Based on this evidence, we argue that there is scope for improving models to have human-like invariances, and further, that for meaningful comparisons between models one should use IRIs generated using the regularizer-free loss. We then conduct an in-depth investigation of how different components (e.g., architectures, training losses, data augmentations) of the deep learning pipeline contribute to learning models that have good alignment with humans. We find that architectures with residual connections trained using a (self-supervised) contrastive loss with ℓ_p ball adversarial data augmentation tend to learn invariances that are most aligned with humans. Code: github.com/nvedant07/Human-NN-Alignment},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
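The notion of identically represented inputs (IRIs) is easiest to see in a linear toy case, sketched below: any input shifted along a null-space direction of the layer's weight matrix has an identical representation, so the layer is invariant to that direction. (Real IRIs are generated by gradient-based optimization on deep networks; this illustrative example does not attempt that.)

```python
import numpy as np

# Toy one-layer 'network': representation f(x) = W @ x, an illustrative
# stand-in for a DNN layer's representation map.
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

x = np.array([1.0, 2.0, 3.0])

# A direction the representation is invariant to: W @ v = 0.
v = np.array([1.0, -1.0, 1.0])
assert np.allclose(W @ v, 0.0)

# Shifting x along v gives a very different input with an identical
# representation, i.e., an IRI. How 'similar' such inputs look to humans
# is the alignment criterion the paper studies.
x_iri = x + 5.0 * v
```

In deep nonlinear networks the set of IRIs has no such closed form, which is why the choice of generation loss (and the proposed adversarial regularizer on it) matters so much for the conclusions one draws.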
2022
Rateike, Miriam; Majumdar, Ayan; Mineeva, Olga; Gummadi, Krishna P.; Valera, Isabel
Don't Throw it Away! The Utility of Unlabeled Data in Fair Decision Making Proceedings Article
In: FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21 - 24, 2022, pp. 1421–1433, ACM, 2022.
@inproceedings{DBLP:conf/fat/RateikeMMGV22,
title = {Don't Throw it Away! The Utility of Unlabeled Data in Fair Decision Making},
author = {Miriam Rateike and Ayan Majumdar and Olga Mineeva and Krishna P. Gummadi and Isabel Valera},
url = {https://doi.org/10.1145/3531146.3533199},
doi = {10.1145/3531146.3533199},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {FAccT '22: 2022 ACM Conference on Fairness, Accountability, and
Transparency, Seoul, Republic of Korea, June 21 - 24, 2022},
pages = {1421–1433},
publisher = {ACM},
abstract = {Fair decision-making typically assumes access to a ground-truth target that is unbiased, i.e., equally distributed across socially salient groups. In many practical settings, the ground-truth cannot be directly observed, and instead, we have to rely on a biased proxy measure of the ground-truth, i.e., biased labels, in the data. In addition, data is often selectively labeled, i.e., even the biased labels are only observed for a small fraction of the data that received a positive decision. To overcome label and selection biases, recent work proposes to learn stochastic, exploring decision policies via i) online training of new policies at each time-step and ii) enforcing fairness as a constraint on performance. However, the existing approach uses only labeled data, disregarding a large amount of unlabeled data, and thereby suffers from high instability and variance in the learned decision policies at different times. In this paper, we propose a novel method based on a variational autoencoder for practical fair decision-making. Our method learns an unbiased data representation leveraging both labeled and unlabeled data and uses the representations to learn a policy in an online process. Using synthetic data, we empirically validate that our method converges to the optimal (fair) policy according to the ground-truth with low variance. In real-world experiments, we further show that our training approach not only offers a more stable learning process but also yields policies with higher fairness as well as utility than previous approaches.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
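The selective-labels problem the paper addresses can be sketched on synthetic data (the population model and the historical policy below are illustrative assumptions): labels observed under a biased approval policy are unrepresentative of the population, which is one reason label-only training is unstable and why the paper's method also exploits the unlabeled covariates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy population (numbers illustrative): true label y depends on covariate x.
x = rng.normal(size=n)
y = (x + rng.normal(scale=0.5, size=n) > 0).astype(float)

# Selective labeling: a biased historical policy approves only high-x
# applicants, so labels are observed only for that slice of the data.
approved = x > 0.5
labeled_mean = y[approved].mean()   # what label-only methods see
true_mean = y.mean()                # what a deployed policy actually faces
```

In this toy population roughly half of all labels are positive, but the labeled (approved) slice looks almost uniformly positive, so a policy trained only on labeled data sees a badly skewed picture of the world.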
2021
Nanda, Vedant; Majumdar, Ayan; Kolling, Camila; Dickerson, John P.; Gummadi, Krishna P.; Love, Bradley C.; Weller, Adrian
Exploring Alignment of Representations with Human Perception Journal Article
In: CoRR, vol. abs/2111.14726, 2021.
@article{DBLP:journals/corr/abs-2111-14726,
title = {Exploring Alignment of Representations with Human Perception},
author = {Vedant Nanda and Ayan Majumdar and Camila Kolling and John P. Dickerson and Krishna P. Gummadi and Bradley C. Love and Adrian Weller},
url = {https://arxiv.org/abs/2111.14726},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {CoRR},
volume = {abs/2111.14726},
abstract = {An evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs which have identical representations (on a given layer) of a neural network, and thus capture invariances of a given network. One necessary criterion for a network's invariances to align with human perception is for its IRIs to look 'similar' to humans. Prior works, however, have mixed takeaways; some argue that later layers of DNNs do not learn human-like invariances, yet others seem to indicate otherwise. We argue that the loss function used to generate IRIs can heavily affect takeaways about invariances of the network and is the primary reason for these conflicting findings. We propose an adversarial regularizer on the IRI generation loss that finds IRIs that make any model appear to have very little shared invariance with humans. Based on this evidence, we argue that there is scope for improving models to have human-like invariances, and further, that for meaningful comparisons between models one should use IRIs generated using the regularizer-free loss. We then conduct an in-depth investigation of how different components (e.g., architectures, training losses, data augmentations) of the deep learning pipeline contribute to learning models that have good alignment with humans. We find that architectures with residual connections trained using a (self-supervised) contrastive loss with ℓ_p ball adversarial data augmentation tend to learn invariances that are most aligned with humans. Code: https://github.com/nvedant07/Human-NN-Alignment},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
