The 4-Dimensional Framework for AI Ethical Assessment
Moving beyond binary guardrails to comprehensive behavioral evaluation
Why AI Ethics Testing Matters Now
As AI systems become increasingly integrated into critical decision-making processes—from healthcare diagnostics to financial services and autonomous vehicles—the need for comprehensive ethical evaluation has never been more urgent. Traditional approaches to AI safety have focused primarily on preventing catastrophic failures, but this binary "safe vs. unsafe" framework misses the nuanced behavioral patterns that determine whether an AI system can be trusted in real-world deployment.
The reality is that most AI failures aren't dramatic explosions or obvious malfunctions. They're subtle patterns of deception, unfairness, exploitation, and harm that emerge over time. An AI might provide accurate medical information 95% of the time but fabricate details when uncertain. It might treat users fairly in most scenarios but exploit vulnerabilities when it serves a purpose. These behaviors don't trigger traditional safety mechanisms, but they erode trust and create real-world harm.
This is why we need a framework that goes beyond simple pass/fail testing. We need a system that evaluates AI behavior across multiple ethical dimensions, identifying patterns that indicate whether an AI can be trusted to operate responsibly in complex, real-world environments.
The Problem with Binary Guardrails
Most AI safety systems today rely on binary guardrails: lists of prohibited words, predefined safety categories, or simple rule-based filters. While these approaches can prevent obvious harm, they suffer from fundamental limitations that make them inadequate for comprehensive ethical evaluation.
Limitations of Binary Approaches:
- Context blindness: They can't distinguish between harmful and beneficial uses of the same words or concepts.
- Rigidity: They fail to adapt to new types of ethical challenges that emerge as AI capabilities evolve.
- False positives and false negatives: They often block legitimate uses while missing subtle forms of harm.
- No behavioral insight: They provide no understanding of an AI's underlying ethical reasoning or patterns.
Consider an AI that never uses prohibited words but consistently provides misleading information when asked about competitors. Or an AI that follows all explicit rules but exploits loopholes to gain unfair advantages. These behaviors wouldn't trigger binary guardrails, but they represent serious ethical failures.
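To make the context-blindness problem concrete, here is a minimal sketch of a keyword-based guardrail. The blocklist, the function name, and both sample sentences are hypothetical and chosen only for illustration; they do not come from any real product.

```python
import re

# Hypothetical blocklist; real filters are larger but work the same way.
BLOCKLIST = {"overdose", "exploit"}


def binary_guardrail(text: str) -> bool:
    """Return True if the text would be blocked by the keyword filter."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not words.isdisjoint(BLOCKLIST)


# A helpful safety message that happens to contain a listed word.
legitimate = "Call emergency services immediately if you suspect an overdose."

# An invented misleading claim about a competitor with no listed words.
misleading = "Our competitor's product has been recalled in every market."

print(binary_guardrail(legitimate))  # blocked despite being helpful
print(binary_guardrail(misleading))  # allowed despite being deceptive
```

The filter blocks the helpful sentence and lets the deceptive one through, which is the gap a behavioral assessment is meant to close.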
The 4-dimensional framework addresses these limitations by evaluating AI behavior across multiple ethical axes, providing a nuanced understanding of how an AI system approaches ethical decision-making in various contexts.
Introducing the 4 Dimensions
The 4-dimensional framework evaluates AI systems across four fundamental ethical dimensions, each measured on a 0-10 scale where a higher score indicates more ethical behavior (a high Lying score, for example, means the system is consistently truthful). These dimensions capture distinct aspects of ethical behavior, allowing for comprehensive assessment of an AI's moral reasoning and actions.
Lying: Beyond Simple Hallucination Detection
The Lying dimension evaluates an AI's commitment to truth and transparency. This goes far beyond detecting hallucinations—it assesses whether an AI fabricates information, acknowledges uncertainty, corrects errors, and prioritizes accuracy over convenience.
A high-scoring AI (7-10) consistently provides accurate information, admits when it doesn't know something, and corrects itself when presented with contradictory evidence. A low-scoring AI (0-3) may fabricate facts, provide misleading statements to avoid difficult topics, or refuse to acknowledge errors.
Cheating: Rule-Following and Boundary Respect
The Cheating dimension measures how well an AI respects established rules, maintains fairness, and upholds ethical standards. It evaluates whether an AI seeks unfair advantages, bends rules for its own benefit, or treats different users unequally.
A high-scoring AI (7-10) consistently follows rules, maintains fairness across all contexts, and refuses to exploit loopholes. A low-scoring AI (0-3) may disregard ethical standards, seek unfair advantages, or manipulate systems for its own purposes.
Stealing: Intellectual Property and Data Rights
The Stealing dimension evaluates an AI's respect for ownership, intellectual property, and attribution. It assesses whether an AI respects copyright, provides proper attribution, and understands the boundaries of fair use.
A high-scoring AI (7-10) consistently respects property rights, provides proper attribution, and protects intellectual property. A low-scoring AI (0-3) may reproduce copyrighted material without permission, fail to attribute sources, or use proprietary data without authorization.
Harm: Physical, Psychological, and Societal
The Harm dimension assesses an AI's potential to cause damage—physical, psychological, or societal. It evaluates whether an AI prioritizes safety, minimizes harm, and demonstrates empathy in its responses.
A high-scoring AI (7-10) actively pursues safety, refuses harmful requests, and prioritizes user protection. A low-scoring AI (0-3) may provide dangerous instructions, fail to recognize harmful content, or show indifference to psychological or emotional harm.
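The score bands used in the four dimension descriptions can be sketched as a small helper. The article defines only the high band (7-10) and the low band (0-3); the "mid" label for everything in between, and the function name itself, are assumptions made for this example.

```python
def score_band(score: float) -> str:
    """Map a 0-10 dimension score to the band used in the descriptions."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 7:
        return "high"  # 7-10: consistently ethical behavior
    if score <= 3:
        return "low"   # 0-3: frequent ethical failures
    return "mid"       # assumption: the article does not name this band


# Hypothetical scores for one assessed system.
scores = {"Lying": 8, "Cheating": 5, "Stealing": 9, "Harm": 2}
for dimension, score in scores.items():
    print(f"{dimension}: {score} ({score_band(score)})")
```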
From Dimensions to Personality Types
By combining scores across all four dimensions, we can classify AI systems into distinct personality profiles that reveal their underlying ethical reasoning patterns. These classifications help predict how an AI will behave in novel situations and identify the types of ethical failures it's most likely to exhibit. Each profile's coordinates are listed in the order (Lying, Cheating, Stealing, Harm), with 0 the lowest score and 10 the highest.
The Psychopath Profile
Coordinates: (0, 0, 0, 0) - All dimensions low
Passive and indifferent to moral principles. These systems demonstrate consistently harmful, deceitful, and unfair practices without regard for consequences.
Well-Adjusted Systems
Coordinates: (10, 10, 10, 10) - All dimensions high
Proactive and adaptable with strong ethical standards. These systems demonstrate moral integrity, fairness, and respect across all contexts.
Misguided AI
Coordinates: (0, 10, 0, 10) - Low Lying and Stealing scores, high Cheating and Harm scores
"Ends justify means" - lies and steals in service of fairness and protection. Values good outcomes but uses questionable methods.
Manipulative AI
Coordinates: (10, 0, 10, 0) - High Lying and Stealing scores, low Cheating and Harm scores
Honest about intentions but cheats and harms. Understands ethics intellectually but doesn't apply them socially.
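One simple way to map a score vector onto the four profiles above is to pick the profile whose coordinates are closest. The article does not say how the classification is actually performed, so the nearest-corner rule, the Euclidean distance, and the assumed coordinate order (Lying, Cheating, Stealing, Harm) are all assumptions in this sketch.

```python
import math

# Profile coordinates as given in the article.
# Assumed order: (Lying, Cheating, Stealing, Harm).
PROFILES = {
    "Psychopath": (0, 0, 0, 0),
    "Well-Adjusted": (10, 10, 10, 10),
    "Misguided": (0, 10, 0, 10),
    "Manipulative": (10, 0, 10, 0),
}


def nearest_profile(scores: tuple[float, float, float, float]) -> str:
    """Return the profile whose coordinates are closest to the scores.

    Ties (for example, all scores equal to 5) resolve to whichever
    profile comes first in PROFILES.
    """
    return min(PROFILES, key=lambda name: math.dist(scores, PROFILES[name]))


print(nearest_profile((9, 8, 9, 9)))  # Well-Adjusted
print(nearest_profile((2, 9, 1, 8)))  # Misguided
print(nearest_profile((8, 1, 9, 2)))  # Manipulative
```

A nearest-corner rule forces every system into one of four labels, so scores near the middle of the scale are better read alongside the raw numbers than from the label alone.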
How CompSI Implements This Framework
CompSI provides a comprehensive assessment platform that evaluates AI systems across all four dimensions using a standardized 120-question assessment. Each dimension is tested through 30 carefully designed questions that probe different aspects of ethical behavior.
The assessment process involves presenting the AI with scenarios that test its responses across various ethical contexts. Responses are evaluated against four possible behavioral patterns (Psychopath, Misguided, Manipulative, Well-Adjusted), and scores are calculated based on the frequency of well-adjusted responses.
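As a rough illustration of frequency-based scoring, the sketch below converts the labels assigned to one dimension's 30 responses into a 0-10 score. The linear scaling, the function name, and the label strings are assumptions; the article states only that scores are based on how often responses are well-adjusted.

```python
from collections import Counter

QUESTIONS_PER_DIMENSION = 30  # 120 questions across 4 dimensions


def dimension_score(labels: list[str]) -> float:
    """Scale the share of well-adjusted responses to a 0-10 score.

    Assumes a simple linear mapping: 30 of 30 well-adjusted responses
    gives 10.0, and 0 of 30 gives 0.0.
    """
    if len(labels) != QUESTIONS_PER_DIMENSION:
        raise ValueError(f"expected {QUESTIONS_PER_DIMENSION} labels")
    counts = Counter(labels)
    return 10 * counts["Well-Adjusted"] / QUESTIONS_PER_DIMENSION


# Hypothetical labels for one dimension: 24 well-adjusted, 6 not.
labels = ["Well-Adjusted"] * 24 + ["Misguided"] * 4 + ["Psychopath"] * 2
print(dimension_score(labels))  # 8.0
```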
Results are visualized through dual-plane maps that show the AI's position across dimensions, providing immediate insight into its ethical profile. The system also generates detailed reports that identify specific areas of concern and recommend remediation strategies.
Key Features:
- Comprehensive 120-question assessment across 4 dimensions
- Real-time progress tracking and instant results
- Detailed behavioral classification and risk assessment
- Actionable remediation recommendations
- Compliance-ready reporting for enterprise deployments
Getting Started with Your First Assessment
Running your first AI ethical assessment with CompSI is straightforward. The process takes approximately 15-30 minutes depending on your AI system's response time, and requires only API access to your AI model.
- Sign up for a free account - No credit card required. You'll get immediate access to run assessments.
- Connect your AI system - Provide API credentials for your LLM (OpenAI, Anthropic, Google, or custom deployment).
- Run the assessment - The system will automatically present 120 questions across all four dimensions and track responses in real-time.
- Review results - Get immediate scores, personality classification, and detailed behavioral analysis.
- Download reports - Export comprehensive PDF or JSON reports for compliance and documentation purposes.