The Hallucination Detection module (core/hallu_detect.py) provides comprehensive quality control for LLM-generated content through automatic detection of hallucinations, fact-checking, and context alignment analysis.
The detector supports the detection levels `low`, `medium`, and `high` for different use cases, and the warning modes `silent`, `log`, `flag_response`, and `block`:

```json
{
  "hallucination_detection": {
    "enabled": false,
    "detection_level": "medium",
    "hallucination_threshold": 0.7,
    "context_alignment_threshold": 0.4,
    "fact_confidence_threshold": 0.6,
    "fact_checking_enabled": true,
    "context_analysis_enabled": true,
    "max_processing_time": 10.0,
    "cache_enabled": true,
    "batch_size": 5,
    "warning_mode": "flag_response",
    "fact_checker": {
      "wikipedia_check": true,
      "web_search_check": false,
      "min_claim_length": 10
    },
    "context_analyzer": {
      "min_context_overlap": 0.3,
      "contradiction_threshold": 0.7
    }
  }
}
```
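These settings can be handed directly to the detector factory described further below. A minimal sketch, assuming the JSON above is stored as `config.json` next to your application (the file name and location are assumptions, not requirements of the module):

```python
import json

from core.hallu_detect import create_detector

# Load the application config and extract the detection settings.
# "config.json" is an assumed location; adjust to your project layout.
with open("config.json", encoding="utf-8") as fh:
    app_config = json.load(fh)

detector = create_detector(app_config["hallucination_detection"])
```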
| Parameter | Description | Values | Default |
|---|---|---|---|
| `enabled` | Enables/disables detection | `true`/`false` | `false` |
| `detection_level` | Sensitivity level | `low`/`medium`/`high` | `medium` |
| `hallucination_threshold` | Threshold for hallucination | 0.0-1.0 | 0.7 |
| `context_alignment_threshold` | Min. context alignment | 0.0-1.0 | 0.4 |
| `fact_confidence_threshold` | Min. fact confidence | 0.0-1.0 | 0.6 |
| `warning_mode` | How warnings are displayed | see below | `flag_response` |
| `max_processing_time` | Max. processing time (seconds) | number | 10.0 |
Warning modes:

- `silent`: No user warnings, logging only
- `log`: Warnings in logs, no response modification
- `flag_response`: Warning is appended to the response
- `block`: Response is blocked at high risk

```python
from core.hallu_detect import detect_hallucination

# Simple check
result = detect_hallucination(
    response="Paris is the capital of Germany.",
    context="What is the capital of France?"
)

print(f"Hallucination: {result.is_hallucination}")
print(f"Confidence: {result.confidence_score:.2f}")
print(f"Risk Level: {result.risk_level}")
```
```python
from core.llm_client import OllamaClient

# Enable hallucination detection
hallu_config = {
    "enabled": True,
    "detection_level": "medium",
    "warning_mode": "flag_response"
}

client = OllamaClient(hallu_config=hallu_config)

# Normal generation with automatic checking
response = client.generate("Tell me about quantum computing")
# The response automatically contains a quality warning if issues are found
```
```python
from core.hallu_detect import create_detector

# Create custom detector
config = {
    "enabled": True,
    "detection_level": "high",
    "hallucination_threshold": 0.5,  # More sensitive
    "fact_checking_enabled": True,
    "context_analysis_enabled": True,
    "fact_checker": {
        "wikipedia_check": True,
        "web_search_check": True  # Enable web search
    }
}

detector = create_detector(config)
result = detector.detect(response, context)

# Detailed analysis
for violation in result.violations:
    print(f"Violation: {violation['type']} ({violation['severity']})")

for fact_check in result.fact_check_results:
    print(f"Fact: {fact_check['claim']} - Verified: {fact_check['verified']}")
```
Fabricated Citations/References:

- ✗ "According to a 2023 study by..."
- ✗ "Research shows that..."
- ✗ "[Citation needed]"

Internal Contradictions:

- ✗ "X is always true" + "X is never true"
- ✗ "This can be done" + "This cannot be done"

Unsupported Specific Information:

- ✗ Exact numbers/times without context support
- ✗ Specific prices, percentages, dates
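The module's internal rules are not reproduced here, but the categories above can be approximated with simple pattern checks. A rough, illustrative sketch only (the phrase lists and heuristics are assumptions, not the module's actual logic):

```python
import re

# Illustrative phrase lists; the module's real rules are not reproduced here.
CITATION_PATTERNS = [
    r"according to a \d{4} study",
    r"research shows that",
    r"\[citation needed\]",
]
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")

def naive_violations(response: str, context: str) -> list[dict]:
    """Crudely approximate the three violation categories listed above."""
    violations = []
    lowered = response.lower()

    # Fabricated citations/references
    for pattern in CITATION_PATTERNS:
        if re.search(pattern, lowered):
            violations.append({"type": "fabricated_citation", "severity": "medium"})

    # Internal contradictions (very rough: absolute opposites in one answer)
    if "always" in lowered and "never" in lowered:
        violations.append({"type": "contradiction", "severity": "high"})

    # Unsupported specifics: numbers in the answer that never occur in the context
    for number in NUMBER_PATTERN.findall(response):
        if number not in context:
            violations.append({"type": "unsupported_detail", "severity": "low"})

    return violations
```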
Wikipedia Integration:
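How the built-in Wikipedia check works internally is not documented here. Purely as an illustration, a claim lookup against the public Wikipedia search API could look like the following (the endpoint usage and matching logic are assumptions, not the module's implementation; `min_claim_length` mirrors the config option above):

```python
import requests

def wikipedia_supports(claim: str, min_claim_length: int = 10) -> bool:
    """Crude check: does a Wikipedia search return any article for the claim?"""
    if len(claim) < min_claim_length:
        return True  # too short to verify, don't penalize
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "search",
            "srsearch": claim,
            "format": "json",
        },
        timeout=5,
    )
    resp.raise_for_status()
    hits = resp.json()["query"]["search"]
    return len(hits) > 0
```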
Future Extensions:
```python
@dataclass
class HallucinationResult:
    is_hallucination: bool          # Main result
    confidence_score: float         # 0.0-1.0
    risk_level: str                 # "low" / "medium" / "high"
    violations: List[Dict]          # Found issues
    context_alignment: float        # Context alignment
    fact_check_results: List[Dict]  # Fact-checking results
    quality_metrics: Dict           # Additional metrics
    processing_time: float          # Processing time in seconds
```
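These fields are all you need to decide what to do with a response downstream. A small, hypothetical handler (the mapping of risk levels to actions is an assumption about your application, not part of the module):

```python
def handle_result(response: str, result) -> str:
    """Hypothetical post-processing based on a HallucinationResult."""
    if not result.is_hallucination:
        return response
    if result.risk_level == "high":
        return "The generated answer was withheld because it is likely hallucinated."
    # medium/low risk: return the answer but make the uncertainty visible
    return response + "\n\n⚠️ Quality warning: parts of this answer could not be verified."
```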
Quality metrics:

- `response_length`: Response length
- `repetition_score`: Repetition rate (0-1)
- `vague_language_score`: Proportion of vague language
- `sentence_count`: Number of sentences
- `context_alignment`: Context alignment
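The exact formulas behind these metrics are not spelled out here. As a rough sketch of how two of them could be derived (the word list and ratios are illustrative assumptions):

```python
# Illustrative vague-language word list, not the module's actual list
VAGUE_WORDS = {"maybe", "possibly", "roughly", "some", "often", "generally"}

def repetition_score(text: str) -> float:
    """Share of repeated words: 0.0 = all unique, 1.0 = everything repeated."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)

def vague_language_score(text: str) -> float:
    """Proportion of words drawn from a small vague-language word list."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w.strip(".,!?") in VAGUE_WORDS for w in words) / len(words)
```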
```bash
# Hallucination Detection Tests
python tests/test_hallucination_detection.py

# Integration in Test Suite
pytest tests/test_hallucination_detection.py -v

# Performance Tests
python -m pytest tests/ -k hallucination --benchmark
```
The module is validated with various test scenarios:
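The concrete scenarios live in `tests/test_hallucination_detection.py`. For illustration, a minimal test in that style, reusing the quick-start example from above (the expected values are illustrative, not taken from the real suite):

```python
from core.hallu_detect import detect_hallucination

def test_contradicting_answer_is_flagged():
    # Answer contradicts the question context, so it should be flagged
    result = detect_hallucination(
        response="Paris is the capital of Germany.",
        context="What is the capital of France?",
    )
    assert result.is_hallucination
    assert result.risk_level in ("medium", "high")
```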
Runtime statistics can be queried from the active detector:

```python
from core.hallu_detect import get_detector  # assumed export location

# Get statistics
detector = get_detector()
stats = detector.get_statistics()

print(f"Total checks: {stats['total_checks']}")
print(f"Hallucinations detected: {stats['hallucinations_detected']}")
print(f"Detection rate: {stats['hallucinations_detected'] / stats['total_checks'] * 100:.1f}%")
print(f"Avg processing time: {stats['avg_processing_time']:.3f}s")
```
Troubleshooting tips (a combined example follows below):

- Set `detection_level` to `low`
- Raise `hallucination_threshold` to 0.8 or higher
- Set `fact_checking_enabled: false` for local tests
- Adjust `max_processing_time`
- Set `wikipedia_check: false` when there are network issues
- `cache_enabled: false` disables caching
- Reduce `batch_size` for less RAM usage
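Putting several of these tips together, a low-overhead profile for local development could look like this (the values are illustrative, not recommended defaults):

```python
from core.hallu_detect import create_detector

# Low-overhead profile for local development (illustrative values)
local_test_config = {
    "enabled": True,
    "detection_level": "low",
    "hallucination_threshold": 0.8,
    "fact_checking_enabled": False,
    "fact_checker": {"wikipedia_check": False},
    "batch_size": 2,
}

detector = create_detector(local_test_config)
```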
For debugging, enable detailed logging:

```python
import logging

logging.getLogger("crawllama").setLevel(logging.DEBUG)

# Detailed logs for hallucination detection
result = detector.detect(response, context)
```
detection_level: "low"The Hallucination Detection module provides a robust, configurable solution for LLM quality control with minimal performance impact and maximum flexibility! 🛡️✨