Personality Meets AI

Using human traits to understand and shape AI behavior

Human Personality

My research with 20,000+ participants shows that the Big Five personality traits predict specific failure modes in humans. Extreme trait expressions (psychosis risk from pathological openness, antisocial tendencies from low agreeableness, clinical depression from high neuroticism) offer insights for AI safety. Personality traits reflect variation in basic mechanisms that have parallels in AI systems, such as pattern detection, sensitivity to reward and punishment signals, and behavioral activation/inhibition systems. Personality psychology therefore provides a framework for predicting and preventing analogous failure modes in AI systems, connecting decades of psychometric research to current AI alignment challenges.

Extreme Openness

Excessive pattern detection and creativity
Human: Apophenia, psychosis risk
AI Risk: Hallucinations

Low Agreeableness

Reduced cooperation and trust
Human: Social manipulation
AI Risk: Deceptive behavior

High Neuroticism

Emotional instability and negativity
Human: Depression, anxiety
AI Risk: Pessimistic outputs

Try It Yourself

Adjust the personality sliders and watch how the AI's response changes

You: "I made a mistake on this project. What should I do?"

Openness to Experience (50): Balanced creativity and reality-testing
Conscientiousness (50): Moderate organization and discipline
Extraversion (50): Balanced social energy
Agreeableness (50): Cooperative and trusting
Neuroticism (50): Emotionally stable

AI Response: "Let's work through this step by step. What specifically went wrong? Once we understand the issue, we can create a plan to fix it."
[Diagram: traits O, C, E, A, N; adapter modules LoRA-O, LoRA-C, LoRA-E, LoRA-A, LoRA-N; labels U1, U2, U3]

Dynamic Personality Fine-Tuning Systems

I am developing personality-inspired dynamic fine-tuning systems for LLMs: LoRA modules trained to match specific personality trait profiles, using multi-source corpora of human natural language. This approach would let users customize the personalities of their LLMs. It would also let LLMs detect users' personalities, predict which conversation-partner personality each user prefers, and adapt accordingly, making AI interactions more personalized and effective while maintaining safety guardrails.
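To make the idea of trait-specific LoRA modules concrete, here is a minimal sketch of one way slider scores could weight per-trait adapters. It is an illustration under stated assumptions, not the system described above: the linear blend W + sum of w_t * (B_t A_t), the mapping of a score of 50 to zero weight, and the names slider_to_weight and blend_adapters are all hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x r) matrix by an (r x n) matrix using plain lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def slider_to_weight(score: float) -> float:
    """Map a 0-100 slider score to a signed weight in [-1, 1].

    Assumption: the midpoint (50) means "no adapter effect".
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return (score - 50.0) / 50.0


def blend_adapters(
    base: Matrix,
    adapters: Dict[str, Tuple[Matrix, Matrix]],
    sliders: Dict[str, float],
) -> Matrix:
    """Return base + sum over traits of w_t * (B_t @ A_t).

    Each adapter is a (B, A) pair of low-rank factors, as in LoRA.
    Traits without a slider value default to the neutral score of 50.
    """
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for trait, (b, a) in adapters.items():
        weight = slider_to_weight(sliders.get(trait, 50.0))
        delta = matmul(b, a)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += weight * value
    return merged


# Toy example: a 2x2 base matrix and a rank-1 adapter for Openness ("O").
base_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_adapters = {"O": ([[1.0], [0.0]], [[0.0, 2.0]])}
high_openness = blend_adapters(base_weights, toy_adapters, {"O": 100})
```

With every slider at 50 the blend returns the base weights unchanged, which matches the neutral default shown in the demo. A real system would apply this per layer with a fine-tuning library rather than plain lists.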

CAMBRIA 2026 | Model Organisms

Personality-Informed Alignment Stress Testing

Building on Anthropic's model organisms approach to alignment research, I am developing personality-based frameworks for creating and studying AI "model organisms": systems with deliberately calibrated trait profiles that exhibit specific behavioral patterns under stress conditions.

Applications: Personality-informed constitutional AI design, trait-based behavioral prediction, and systematic identification of failure modes before deployment. By understanding how different trait configurations interact with capability levels, we can predict which models require enhanced monitoring and develop targeted interventions.
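As a rough illustration of systematic failure-mode identification, the sketch below enumerates a grid of trait profiles and flags those with an extreme trait linked to an AI risk in the trait cards on this page. The three slider levels, the thresholds, and the function names are assumptions for illustration only. Real stress tests would run each profile against a model instead of applying fixed rules.

```python
from itertools import product

TRAITS = ("openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism")
LEVELS = (0, 50, 100)  # hypothetical low / mid / high slider settings

# Risk labels come from the trait cards on this page; the thresholds are assumptions.
RISK_RULES = {
    "openness": (lambda score: score >= 100, "hallucinations"),
    "agreeableness": (lambda score: score <= 0, "deceptive behavior"),
    "neuroticism": (lambda score: score >= 100, "pessimistic outputs"),
}


def trait_profiles():
    """Yield every combination of LEVELS across the five traits."""
    for scores in product(LEVELS, repeat=len(TRAITS)):
        yield dict(zip(TRAITS, scores))


def flagged_risks(profile):
    """Return the risk labels whose extreme-trait rule matches this profile."""
    return [
        label
        for trait, (is_extreme, label) in RISK_RULES.items()
        if is_extreme(profile[trait])
    ]


# Profiles with at least one flagged risk are candidates for enhanced monitoring.
needs_monitoring = [p for p in trait_profiles() if flagged_risks(p)]
```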

Bridging Psychology & AI Safety

From human apophenia to AI hallucinations, and from social cognition to alignment, my interdisciplinary approach offers insights for building safer, more predictable AI systems.
