Architecture

The system follows a layered architecture in which each component has one clear responsibility. LLM agents produce evidence and claims; the final decision is computed deterministically, with no hidden reasoning.

High-Level Design

┌─────────────────────────────────────────────────────────────┐
│                    Event Orchestrator                       │
│                                                             │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐         │
│   │  Benign     │  │  Malicious  │  │   Skeptic   │         │
│   │  Analyst    │  │  Analyst    │  │   Analyst   │         │
│   │             │  │             │  │             │         │
│   │  • Evidence │  │  • Evidence │  │  • Evidence │         │
│   │    Extract  │  │    Extract  │  │    Extract  │         │
│   │  • Claims   │  │  • Claims   │  │  • Claims   │         │
│   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘         │
│          │                │                │                │
│          └────────────────┼────────────────┘                │
│                           │                                 │
│                  ┌────────▼────────┐                        │
│                  │                 │                        │
│                  │  Convergence    │                        │
│                  │    Engine       │                        │
│                  │                 │                        │
│                  │  • Deterministic│                        │
│                  │  • Metrics-based│                        │
│                  │  • No LLM calls │                        │
│                  └────────┬────────┘                        │
│                           │                                 │
│                  ┌────────▼────────┐                        │
│                  │  FINAL OUTPUT   │                        │
│                  │                 │                        │
│                  │  • BENIGN       │                        │
│                  │  • MALICIOUS    │                        │
│                  │  • UNCERTAIN    │ ◄─ when thresholds     │
│                  └─────────────────┘    are not met         │
└─────────────────────────────────────────────────────────────┘

Core Components

1. Event Orchestrator

Responsibility: Coordinate parallel agent execution while enforcing isolation

Accepts incident text as input
Dispatches to agents in parallel
Enforces reasoning isolation (no shared context)
Aggregates and persists artifacts
No LLM calls - pure coordination logic
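
The coordination described above can be sketched with a thread pool. This is a minimal illustration, not the project's orchestrator: run_agent is a hypothetical stand-in for an LLM-backed agent call, and the returned field names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

ROLES = ("benign", "malicious", "skeptic")


def run_agent(role: str, incident_text: str) -> dict:
    """Hypothetical stand-in for an LLM-backed agent call."""
    return {"role": role, "evidence": [], "claims": [], "chars_seen": len(incident_text)}


def dispatch(incident_text: str) -> dict:
    """Run every agent in parallel and gather the results by role."""
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        # Each call receives only (role, incident_text); outputs are gathered
        # here and never passed to another agent, which keeps reasoning isolated.
        futures = {role: pool.submit(run_agent, role, incident_text) for role in ROLES}
        return {role: future.result() for role, future in futures.items()}


results = dispatch("User admin logged in from a new IP at 03:12.")
print(sorted(results))
```

The orchestrator itself stays free of LLM logic: it only fans out the same input and collects whatever each agent returns.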

2. Agent System (Three Independent Perspectives)

Each agent:

Extracts evidence with exact source spans
Generates claims citing only its own evidence
May contradict its default stance if the evidence warrants it
Outputs structured JSON with confidence scores

Agents:

Benign Analyst: Looks for non-malicious explanations
Malicious Analyst: Searches for indicators of compromise
Skeptic Analyst: Challenges assumptions, emphasizes evidence gaps
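
The rule that claims cite only the agent's own evidence can be checked mechanically. The sketch below assumes a hypothetical output shape (id, evidence_ids, and confidence are illustrative field names, not the project's schema).

```python
def uncited_evidence(agent_output: dict) -> list[str]:
    """Return evidence IDs that claims cite but the agent did not extract itself."""
    own_ids = {item["id"] for item in agent_output["evidence"]}
    cited = {eid for claim in agent_output["claims"] for eid in claim["evidence_ids"]}
    return sorted(cited - own_ids)


# Hypothetical agent output; the second claim cites evidence from another agent.
output = {
    "role": "malicious",
    "evidence": [{"id": "m1", "quote": "new IP"}],
    "claims": [
        {"text": "Login origin is unusual.", "evidence_ids": ["m1"], "confidence": 0.7},
        {"text": "Timing is suspicious.", "evidence_ids": ["b2"], "confidence": 0.4},
    ],
}
print(uncited_evidence(output))  # ['b2']
```

An empty result means the agent stayed within its own evidence; anything else can be rejected before convergence runs.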

3. Evidence Extraction

Structured extraction with source character spans
Normalized evidence items for overlap computation
Atomic evidence (no combined summaries)
Each item must have an exact quote from the text
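
A span-checked evidence item might look like the following. It uses a standard-library dataclass instead of Pydantic so the sketch runs without dependencies; the field names and the normalization rule (lowercase, collapsed whitespace) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceItem:
    start: int  # inclusive character offset into the incident text
    end: int    # exclusive character offset
    quote: str  # must equal incident_text[start:end]


def validate(item: EvidenceItem, incident_text: str) -> bool:
    """True when the span is in bounds and the quote matches it exactly."""
    in_bounds = 0 <= item.start < item.end <= len(incident_text)
    return in_bounds and incident_text[item.start:item.end] == item.quote


def normalize(quote: str) -> str:
    """Lowercase and collapse whitespace so equivalent quotes compare equal."""
    return " ".join(quote.lower().split())


text = "User admin logged in from a new IP at 03:12."
start = text.index("new IP")
item = EvidenceItem(start, start + len("new IP"), "new IP")
print(validate(item, text), normalize(item.quote))
```

Validation runs against the raw quote, while only the normalized form is used when comparing evidence across agents.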

4. Convergence Engine (Deterministic)

Key principle: No LLM calls for consensus decisions

Computes evidence overlap (Jaccard similarity)
Calculates disagreement entropy
Applies threshold logic (configurable)
Produces final label with confidence
Fully reproducible from stored artifacts
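
The two metrics named above have standard definitions, sketched here. Scaling the entropy by the log of the number of possible labels (so the result lies in 0-1) is an assumption, as is the num_classes parameter; the engine may normalize differently.

```python
import math
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 1.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def disagreement_entropy(labels: list, num_classes: int = 3) -> float:
    """Shannon entropy of the agents' labels, divided by log2(num_classes)."""
    counts = Counter(labels)
    if len(counts) < 2:
        return 0.0  # unanimous (or empty) input carries no disagreement
    total = len(labels)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(num_classes)


benign_evidence = {"new ip", "admin"}
malicious_evidence = {"new ip", "03:12"}
print(jaccard(benign_evidence, malicious_evidence))
print(disagreement_entropy(["BENIGN", "MALICIOUS", "MALICIOUS"]))
```

Both functions are pure, so the same stored evidence and labels always yield the same metrics.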

5. Artifact System

All intermediate results stored as JSONL
Schema-enforced structure (Pydantic)
Replay mode recomputes convergence without LLMs
Hash-verified exports for reproducibility
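
One way to get hash-verified JSONL is to serialize with sorted keys and compare a digest on load. SHA-256 and the helper names below are assumptions; the source does not say which hash or serialization settings the project uses.

```python
import hashlib
import json


def to_jsonl(records: list) -> str:
    """Serialize one record per line; sorted keys keep the bytes stable."""
    return "".join(json.dumps(record, sort_keys=True) + "\n" for record in records)


def sha256_hex(payload: str) -> str:
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_verified(payload: str, expected_hash: str) -> list:
    """Parse JSONL only if its digest matches the recorded one."""
    if sha256_hex(payload) != expected_hash:
        raise ValueError("artifact hash mismatch")
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


records = [
    {"agent": "benign", "label": "BENIGN", "confidence": 0.6},
    {"agent": "malicious", "label": "MALICIOUS", "confidence": 0.7},
]
payload = to_jsonl(records)
checksum = sha256_hex(payload)
print(load_verified(payload, checksum) == records)  # True
```

Replay can then feed the verified records straight into the convergence step without calling any LLM.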

Data Flow

# Pseudo-code of the deterministic flow. extract_evidence, generate_claims,
# compute_convergence_metrics and meets_consensus_threshold are defined elsewhere.
ROLES = ("benign", "malicious", "skeptic")

def analyze_incident(incident_text):
    # Step 1: Evidence extraction, one isolated call per agent
    # (written sequentially here; the orchestrator runs these in parallel)
    evidence = {role: extract_evidence(incident_text, role) for role in ROLES}

    # Step 2: Claims generation, each agent citing only its own evidence
    claims = {role: generate_claims(incident_text, evidence[role]) for role in ROLES}

    # Step 3: Deterministic convergence (no LLM calls)
    metrics = compute_convergence_metrics(evidence, claims)

    # Step 4: Threshold-based decision
    if meets_consensus_threshold(metrics):
        # consensus_label is assumed to be exposed by the metrics object
        return {"label": metrics.consensus_label, "confidence": metrics.confidence}
    return {"label": "UNCERTAIN", "confidence": 1 - metrics.residual_disagreement}
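
Step 4 can be made concrete with a small runnable sketch. The Metrics fields, the threshold names, and their default values are all hypothetical; only the shape of the decision (consensus label if thresholds are met, otherwise UNCERTAIN) comes from the flow above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    consensus_label: str          # majority label across agents (assumed field)
    evidence_overlap: float       # 0-1, e.g. mean pairwise Jaccard
    residual_disagreement: float  # 0-1
    confidence: float             # 0-1


def decide(metrics: Metrics, min_overlap: float = 0.5, max_disagreement: float = 0.3) -> dict:
    """Return the consensus label only when both thresholds are satisfied."""
    agreed = (
        metrics.evidence_overlap >= min_overlap
        and metrics.residual_disagreement <= max_disagreement
    )
    if agreed:
        return {"label": metrics.consensus_label, "confidence": metrics.confidence}
    return {"label": "UNCERTAIN", "confidence": 1 - metrics.residual_disagreement}


print(decide(Metrics("MALICIOUS", 0.8, 0.1, 0.9)))
print(decide(Metrics("MALICIOUS", 0.2, 0.6, 0.5)))
```

Because the thresholds are plain parameters, changing them and replaying stored metrics shows exactly how a verdict would shift.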

Key Architectural Decisions

1. Deterministic Convergence

Choice: No LLM in decision loop
Reason: Ensures reproducibility and avoids hidden reasoning
Implementation: Pure Python with configurable thresholds

2. Evidence Isolation

Choice: Agents cannot see each other’s reasoning initially
Reason: Preserves epistemic independence
Implementation: Parallel execution with no shared context

3. Structured Artifacts

Choice: JSONL storage with strict schemas
Reason: Enables replay and audit trails
Implementation: Pydantic models with validation

4. Measurable Disagreement

Choice: Quantify rather than eliminate disagreement
Reason: Turns uncertainty into signal
Implementation: Residual disagreement metric (0-1)

File Structure

self-arguing-analyst/
├── src/
│   ├── agents/                # LLM agents with structured outputs
│   ├── schemas/               # Pydantic models (strict validation)
│   ├── orchestrator.py        # Event coordination
│   ├── convergence_engine.py  # Deterministic logic
│   └── replay/                # Artifact replay system
├── api/
│   └── main.py                # REST API (FastAPI)
├── tests/                     # Comprehensive test suite
├── docs/                      # This documentation site
└── k8s/                       # Production deployment

Deployment Architecture

The system is designed for multiple deployment scenarios:

1. Local Development

python main.py analyze --file incident.txt

2. Docker Compose (Full Stack)

docker-compose up -d  # Includes API, DB, monitoring

3. Kubernetes (Production)

kubectl apply -f k8s/  # Auto-scaling, monitoring, backups

Why This Architecture Works

Testable: Each component has a single responsibility
Reproducible: Deterministic convergence, artifact replay
Scalable: Stateless agents, parallel execution
Auditable: Complete artifact trail, no hidden reasoning
Maintainable: Clear boundaries, minimal dependencies

Next Evolution

The architecture supports:

Adding new agent roles (pluggable system)
Custom convergence logic (subclass engine)
External evidence enrichment (MITRE ATT&CK)
Different LLM providers (abstract interface)
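
Two of these extension points, pluggable roles and an abstract provider interface, could take roughly the following shape. Every name here (LLMProvider, EchoProvider, register_role, run_role) is hypothetical and not taken from the codebase.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Anything with a complete() method can serve as a provider."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Offline stand-in that returns a canned JSON string instead of calling an LLM."""

    def complete(self, prompt: str) -> str:
        return '{"evidence": [], "claims": []}'


AGENT_PROMPTS: dict[str, str] = {}


def register_role(name: str, system_prompt: str) -> None:
    """Add a new agent role; duplicate names are rejected."""
    if name in AGENT_PROMPTS:
        raise ValueError(f"role already registered: {name}")
    AGENT_PROMPTS[name] = system_prompt


def run_role(name: str, incident_text: str, provider: LLMProvider) -> str:
    """Build the role's prompt and delegate to whichever provider was injected."""
    prompt = f"{AGENT_PROMPTS[name]}\n\n{incident_text}"
    return provider.complete(prompt)


register_role("insider_threat", "Look for signs of misuse by a legitimate user.")
print(run_role("insider_threat", "User admin logged in from a new IP.", EchoProvider()))
```

Because the provider is passed in rather than imported, tests can swap in an offline stand-in like EchoProvider without touching agent code.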

This page documents the system rather than marketing it. The value is in the engineering choices, not the presentation.