Version: 1.2.0 | Status: Production Ready | Last Updated: 2025-01-24
The OSINT (Open Source Intelligence) module provides advanced search capabilities, email/phone intelligence, and AI-powered query enhancement for investigative research.
IMPORTANT: OSINT features are provided exclusively for legitimate purposes:
✅ Permitted Use:
❌ Prohibited Use:
All OSINT queries are logged for compliance and audit purposes.
Parse and execute advanced search queries:
from core.osint import OSINTQueryParser
parser = OSINTQueryParser()
query = parser.parse('site:github.com inurl:python filetype:md')
print(query.site) # 'github.com'
print(query.inurl) # 'python'
print(query.filetype) # 'md'
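To illustrate how an operator query like the one above could be split into fields, here is a minimal standalone sketch. It is not the actual `OSINTQueryParser` implementation: the `parse_query` function, the `ParsedQuery` class and the regular expression are assumptions made for illustration, and only the operator names come from this README.

```python
import re
from dataclasses import dataclass, field

# Operator names come from the README; everything else here is hypothetical.
OPERATORS = ("site", "inurl", "intext", "intitle", "filetype", "email", "phone")

# operator:"quoted value" | operator:value | "quoted phrase" | -excluded | bare term
TOKEN_RE = re.compile(
    r'(?P<op>[a-z]+):(?:"(?P<qval>[^"]*)"|(?P<val>\S+))'
    r'|"(?P<phrase>[^"]*)"'
    r'|-(?P<exclude>\S+)'
    r'|(?P<term>\S+)'
)


@dataclass
class ParsedQuery:
    operators: dict = field(default_factory=dict)
    terms: list = field(default_factory=list)
    excluded: list = field(default_factory=list)


def parse_query(raw: str) -> ParsedQuery:
    """Split a raw query into operator values, free terms and excluded terms."""
    parsed = ParsedQuery()
    for match in TOKEN_RE.finditer(raw):
        op = match.group("op")
        if op is not None:
            quoted = match.group("qval")
            value = quoted if quoted is not None else match.group("val")
            if op in OPERATORS:
                parsed.operators[op] = value
            else:
                # Unknown prefix (for example a URL scheme): keep as a plain term.
                parsed.terms.append(match.group(0))
        elif match.group("phrase") is not None:
            parsed.terms.append(match.group("phrase"))
        elif match.group("exclude") is not None:
            parsed.excluded.append(match.group("exclude"))
        else:
            parsed.terms.append(match.group("term"))
    return parsed


example = parse_query('site:github.com inurl:python "machine learning" -tutorial filetype:md')
print(example.operators, example.terms, example.excluded)
```

Quoted values are handled before bare values so that an input such as `phone:"+49 151 12345678"` stays in one piece.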
Supported Operators:
site: - Search specific domain
inurl: - Text in URL
intext: - Text in page content
intitle: - Text in page title
filetype: - File type (pdf, doc, etc.)
email: - Search for email address
phone: - Search for phone number
- (prefix) - Exclude term
Comprehensive email analysis:
from core.osint import EmailIntelligence
email_intel = EmailIntelligence()
result = email_intel.analyze_email('test@example.com')
print(result['valid']) # True/False
print(result['domain']) # 'example.com'
print(result['mx_records']) # List of MX records
print(result['disposable']) # True if disposable email
print(result['variations']) # Email variations
print(result['confidence']) # Confidence score (0.0-1.0)
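The result fields above can be approximated offline, which may help when reasoning about what each one means. The sketch below is an assumption-laden illustration, not the `EmailIntelligence` implementation: `analyze_email_offline`, the simplified regular expression, the two-entry disposable list and the variation rules are all hypothetical, and the MX lookup is left out because it needs DNS access.

```python
import re

# Deliberately simple pattern: enough for a sketch, not a full RFC 5322 check.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")

# Tiny illustrative sample; a real deployment would load a maintained list.
DISPOSABLE_DOMAINS = {"mailinator.com", "10minutemail.com"}


def analyze_email_offline(address: str) -> dict:
    """Return syntax validity, domain, disposable flag and address variations."""
    address = address.strip()
    if EMAIL_RE.match(address) is None:
        return {"valid": False, "domain": None, "disposable": False, "variations": []}

    local, domain = address.rsplit("@", 1)
    domain = domain.lower()

    # Variations: common rewrites of a two-part local name such as "first.last".
    variations = []
    parts = [p for p in re.split(r"[._-]", local.lower()) if p]
    if len(parts) == 2:
        first, last = parts
        candidates = [
            f"{first}.{last}", f"{first}{last}", f"{first}_{last}",
            f"{first[0]}{last}", f"{first[0]}.{last}", f"{last}.{first}",
        ]
        for candidate in candidates:
            variant = f"{candidate}@{domain}"
            if variant != address.lower() and variant not in variations:
                variations.append(variant)

    return {
        "valid": True,
        "domain": domain,
        "disposable": domain in DISPOSABLE_DOMAINS,
        "variations": variations,
    }


print(analyze_email_offline("john.doe@company.com"))
```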
Capabilities:
Phone number analysis and validation:
from core.osint import PhoneIntelligence
phone_intel = PhoneIntelligence()
result = phone_intel.analyze_phone('+49 151 12345678', region='DE')
print(result['valid']) # True/False
print(result['formatted']) # '+49 151 12345678'
print(result['country']) # 'Germany'
print(result['carrier']) # Carrier name (if available)
print(result['type']) # 'mobile', 'fixed_line', etc.
print(result['variations']) # Format variations
Capabilities:
Note: Full phone intelligence requires the phonenumbers library:
pip install phonenumbers
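Because the library is optional, code that depends on it usually needs a fallback path. The following is a rough sketch of that pattern, assuming a hypothetical `normalize_phone` helper; how `PhoneIntelligence` actually degrades without the library is not described here. The `phonenumbers.parse`, `phonenumbers.format_number` and `PhoneNumberFormat.E164` calls are part of the phonenumbers package, while the fallback rule (accept only numbers written with a leading `+` and 8 to 15 digits) is an assumption.

```python
import re

try:
    import phonenumbers  # optional dependency
except ImportError:
    phonenumbers = None


def normalize_phone(raw: str, region: str = "DE") -> dict:
    """Return the number in E.164 form, using phonenumbers when it is installed."""
    if phonenumbers is not None:
        try:
            number = phonenumbers.parse(raw, region)
        except phonenumbers.NumberParseException:
            return {"e164": None, "library": True}
        e164 = phonenumbers.format_number(number, phonenumbers.PhoneNumberFormat.E164)
        return {"e164": e164, "library": True}

    # Fallback (assumption): only handle numbers already in international form.
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+") and 8 <= len(digits) <= 15:
        return {"e164": f"+{digits}", "library": False}
    return {"e164": None, "library": False}


print(normalize_phone("+49 151 12345678"))
```

Carrier and line-type lookups are only available on the library path, which is why the fallback returns nothing beyond the normalized number.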
LLM-powered query optimization:
from core.osint import QueryEnhancer
from core.llm_client import OllamaClient
llm = OllamaClient()
enhancer = QueryEnhancer(llm)
# Generate query variations
variations = enhancer.generate_variations("John Doe security researcher")
# Output: ["John Doe cybersecurity", "John Doe infosec", ...]
# Suggest operators
operators = enhancer.suggest_operators("find John Doe LinkedIn")
# Output: {'site': 'linkedin.com', 'inurl': 'profile'}
# Identify entity type
entity_type = enhancer.identify_entity_type("test@example.com")
# Output: 'email'
# Suggest sources
sources = enhancer.suggest_sources("Max Mustermann developer", "person")
# Output: ['linkedin.com', 'github.com', 'xing.de', ...]
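Entity-type detection does not always need an LLM call; a cheap rule-based pass can run first and leave only ambiguous inputs to the model. The sketch below shows one way this might look. It is purely illustrative: `identify_entity_type_heuristic`, its regular expressions and the `domain`, `username` and `unknown` labels are assumptions (only `email` and `person` appear in the examples above).

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s()./-]{7,}$")
DOMAIN_RE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)
USERNAME_RE = re.compile(r"^@?[A-Za-z0-9_.-]{3,30}$")


def identify_entity_type_heuristic(text: str) -> str:
    """Classify a query string with simple rules, most specific pattern first."""
    text = text.strip()
    if EMAIL_RE.match(text):
        return "email"
    if PHONE_RE.match(text) and sum(ch.isdigit() for ch in text) >= 7:
        return "phone"
    if DOMAIN_RE.match(text):
        return "domain"
    if " " not in text and USERNAME_RE.match(text):
        return "username"
    words = text.split()
    # Two to four capitalized words are treated as a personal name.
    if 2 <= len(words) <= 4 and all(word[0].isupper() for word in words):
        return "person"
    return "unknown"


for sample in ("test@example.com", "+49 151 12345678", "john_doe", "Max Mustermann"):
    print(sample, "->", identify_entity_type_heuristic(sample))
```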
Comprehensive social media profile analysis and discovery:
from core.osint import SocialIntelligence
social = SocialIntelligence()
# Analyze username across platforms
# (the await calls below must run inside an async function, e.g. via asyncio.run)
result = await social.analyze_username("john_doe")
print(f"Found on {result['summary']['platforms_with_presence']} platforms")
print(f"Confidence: {result['summary']['confidence_score']:.1f}%")
# Generate detailed report
report = social.generate_social_report(result)
print(report)
# Discover profiles by email
email_result = await social.discover_profiles_by_email("john@example.com")
print(f"Email-based matches: {len(email_result['username_matches'])}")
Supported Platforms:
Features:
Built-in compliance checks and rate limiting:
from core.osint import OSINTCompliance
compliance = OSINTCompliance()
# Check if user accepted terms
if not compliance.check_terms_accepted("user123"):
    print(compliance.display_terms())
    # Accept terms
    compliance.accept_terms("user123")
# Check query compliance
allowed, reason = compliance.check_query(
    query="email:test@example.com",
    user_id="user123",
    query_type="email_search"
)
if not allowed:
    print(f"Query blocked: {reason}")
# Get usage stats
stats = compliance.get_usage_stats("user123")
print(f"Requests this hour: {stats['total_requests_last_hour']}")
print(f"Remaining limits: {stats['remaining_limits']}")
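A per-hour limit like the one reported by `get_usage_stats` is commonly implemented as a sliding window over request timestamps. The sketch below shows that technique under stated assumptions: `HourlyRateLimiter`, its method names and the default limit are hypothetical and are not taken from `OSINTCompliance`.

```python
import time
from collections import defaultdict, deque


class HourlyRateLimiter:
    """Sliding-window limiter keyed by (user_id, query_type)."""

    def __init__(self, limits, default_limit=100, window_seconds=3600,
                 clock=time.monotonic):
        self.limits = limits
        self.default_limit = default_limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without waiting
        self._events = defaultdict(deque)

    def _prune(self, key):
        # Drop timestamps that have left the window; return current time and queue.
        now = self.clock()
        events = self._events[key]
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        return now, events

    def allow(self, user_id, query_type):
        limit = self.limits.get(query_type, self.default_limit)
        now, events = self._prune((user_id, query_type))
        if len(events) >= limit:
            return False, f"rate limit reached: {limit} {query_type} requests per hour"
        events.append(now)
        return True, "ok"

    def remaining(self, user_id, query_type):
        limit = self.limits.get(query_type, self.default_limit)
        _, events = self._prune((user_id, query_type))
        return max(0, limit - len(events))


limiter = HourlyRateLimiter({"email_search": 50})
print(limiter.allow("user123", "email_search"), limiter.remaining("user123", "email_search"))
```

Unlike a fixed counter that resets on the hour, the window slides, so a burst at the end of one hour does not permit a second burst immediately afterwards.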
Rate Limits (per hour):
from tools.osint_tool import OSINTTool
from core.llm_client import OllamaClient
# Initialize (config is assumed to be the application's loaded configuration)
llm = OllamaClient()
osint = OSINTTool(llm, config)
# Accept terms (first time)
if not osint.check_terms():
    osint.accept_terms()
# Process OSINT query
result = osint.process_query("email:test@example.com site:linkedin.com")
print(result['query_type']) # 'email_intelligence'
print(result['intelligence']) # Email analysis results
print(result['suggestions']) # AI suggestions
# In main.py or interactive mode
query = "email:max.mustermann@example.com"
# The agent will automatically detect OSINT operators
response = agent.query(query)
Example Queries:
# Email intelligence
email:test@example.com
# Phone intelligence
phone:"+49 151 12345678"
# Social media username search
social:john_doe
# Advanced search
site:github.com inurl:python "machine learning"
# Combined searches
email:john@example.com site:linkedin.com inurl:profile
social:john_doe platforms:twitter,github,instagram
core/osint/
├── __init__.py           # Module exports
├── query_parser.py       # Advanced operator parsing
├── email_intel.py        # Email intelligence
├── phone_intel.py        # Phone intelligence
├── social_intel.py       # Social media intelligence
├── query_enhancer.py     # AI query enhancement
├── compliance.py         # Compliance & rate limiting
└── README.md             # This file
tools/
└── osint_tool.py         # Unified OSINT tool for agent
data/osint_logs/          # Audit logs (auto-created)
├── osint_queries_YYYY-MM.jsonl
├── violations.jsonl
└── terms_accepted.json
Add to config.json:
{
  "osint": {
    "enabled": true,
    "log_queries": true,
    "rate_limits": {
      "email_search": 50,
      "phone_search": 50,
      "general_osint": 100
    }
  }
}
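A partial `osint` section is easier to work with if missing keys fall back to defaults. The sketch below shows one way to merge such a section, assuming the values in the example above are the defaults (the README does not state this) and using a hypothetical `load_osint_config` helper.

```python
import json

# Assumption: the example values from config.json double as defaults.
DEFAULT_OSINT_CONFIG = {
    "enabled": True,
    "log_queries": True,
    "rate_limits": {"email_search": 50, "phone_search": 50, "general_osint": 100},
}


def load_osint_config(raw_json: str) -> dict:
    """Merge the "osint" section of a config document over the defaults."""
    user = json.loads(raw_json).get("osint", {})
    merged = {**DEFAULT_OSINT_CONFIG, **user}
    # Merge rate limits key by key so a partial override keeps the other limits.
    merged["rate_limits"] = {
        **DEFAULT_OSINT_CONFIG["rate_limits"],
        **user.get("rate_limits", {}),
    }
    for name, value in merged["rate_limits"].items():
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"rate limit {name!r} must be a non-negative integer")
    return merged


config = load_osint_config('{"osint": {"rate_limits": {"email_search": 10}}}')
print(config["rate_limits"])
```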
from core.osint import EmailIntelligence
intel = EmailIntelligence()
# Analyze email
result = intel.analyze_email("john.doe@company.com")
if result['valid']:
    print(f"Domain: {result['domain']}")
    print(f"Disposable: {result['disposable']}")
    print(f"MX Records: {result['mx_records']}")
    # Generate variations
    print("Possible variations:")
    for var in result['variations']:
        print(f"  • {var}")
from core.osint import PhoneIntelligence
intel = PhoneIntelligence()
# Analyze German phone number
result = intel.analyze_phone("+49 151 12345678", region="DE")
if result['valid']:
    print(f"Formatted: {result['formatted']}")
    print(f"Country: {result['country']}")
    print(f"Type: {result['type']}")
    print(f"Carrier: {result['carrier']}")
import asyncio
from core.osint import SocialIntelligence
async def social_analysis_example():
    social = SocialIntelligence()
    # Username analysis across platforms
    result = await social.analyze_username("john_doe",
                                           platforms=["twitter", "github", "instagram"])
    print("Analysis Results:")
    print(f"├─ Platforms found: {result['summary']['platforms_with_presence']}")
    print(f"├─ Confidence: {result['summary']['confidence_score']:.1f}%")
    print(f"└─ Risk level: {'HIGH' if len(result['summary']['risk_indicators']) > 2 else 'LOW'}")
    # Show found profiles
    for profile in result['platforms_found']:
        verified = "✓" if profile['profile_data'].get('verified') else ""
        print(f"  {profile['platform']}: {profile['url']} {verified}")
# Run the analysis
asyncio.run(social_analysis_example())
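Checking one username on several platforms is a natural fit for concurrent requests. The sketch below illustrates the fan-out with `asyncio.gather`; it does not reflect how `SocialIntelligence` works internally. The function names and result layout are assumptions, and `fake_exists` stands in for a real HTTP check so the example runs offline.

```python
import asyncio

# Public profile URL shapes; the names and result layout below are hypothetical.
PROFILE_URLS = {
    "twitter": "https://twitter.com/{username}",
    "github": "https://github.com/{username}",
    "instagram": "https://www.instagram.com/{username}/",
}


async def check_platform(platform, username, exists):
    """Build the profile URL and ask the injected checker whether it exists."""
    url = PROFILE_URLS[platform].format(username=username)
    found = await exists(url)
    return {"platform": platform, "url": url, "found": found}


async def check_username(username, platforms, exists):
    """Run all platform checks concurrently and keep only the hits."""
    tasks = [
        check_platform(platform, username, exists)
        for platform in platforms
        if platform in PROFILE_URLS
    ]
    results = await asyncio.gather(*tasks)
    return [result for result in results if result["found"]]


async def fake_exists(url):
    # Stand-in for an HTTP request: pretends only GitHub profiles exist.
    await asyncio.sleep(0)
    return "github.com" in url


found = asyncio.run(check_username("john_doe", ["twitter", "github"], fake_exists))
print(found)
```

Passing the existence check in as a parameter keeps network access out of the core logic, which makes the fan-out easy to test.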
from core.osint import QueryEnhancer, OSINTQueryParser
from core.llm_client import OllamaClient
llm = OllamaClient()
enhancer = QueryEnhancer(llm)
parser = OSINTQueryParser()
# Original query
query = "Max Mustermann security"
# Get AI suggestions
variations = enhancer.generate_variations(query)
operators = enhancer.suggest_operators(query)
# Build enhanced query
operator_terms = ' '.join(f'{op}:{val}' for op, val in operators.items())
enhanced = f"{query} {operator_terms}"
print(f"Enhanced: {enhanced}")
# Parse and execute
parsed = parser.parse(enhanced)
The OSINT module is designed with privacy laws in mind:
Queries containing these terms are automatically blocked:
password, hack, crack, exploit
stalk, spy, surveillance
All OSINT operations are logged:
{
  "timestamp": "2025-01-24T10:30:00",
  "user_id": "user123",
  "query": "email:test@example.com",
  "query_type": "email_search",
  "status": "approved"
}
Logs are stored in: data/osint_logs/osint_queries_YYYY-MM.jsonl
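The blocked-term check and the monthly JSONL log described above can be sketched together as follows. This is an illustration only: `audit_query`, the prefix-matching rule (so `hack` also matches `hacking`) and the `"blocked"` status value are assumptions, and the real module may instead record rejected queries in `violations.jsonl`.

```python
import json
import re
import tempfile
from datetime import datetime
from pathlib import Path

# Terms taken from the README's blocked list.
BLOCKED_TERMS = ("password", "hack", "crack", "exploit", "stalk", "spy", "surveillance")
# Assumption: match at the start of a word, case-insensitively.
BLOCKED_RE = re.compile(r"\b(" + "|".join(BLOCKED_TERMS) + r")", re.IGNORECASE)


def audit_query(log_dir, user_id, query, query_type, now=None):
    """Append one audit entry to the monthly JSONL file and return it."""
    now = now or datetime.now()
    entry = {
        "timestamp": now.isoformat(timespec="seconds"),
        "user_id": user_id,
        "query": query,
        "query_type": query_type,
        # "blocked" is a hypothetical status; the README only shows "approved".
        "status": "blocked" if BLOCKED_RE.search(query) else "approved",
    }
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"osint_queries_{now:%Y-%m}.jsonl"
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


# Demo: write two entries into a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    stamp = datetime(2025, 1, 24, 10, 30, 0)
    approved = audit_query(tmp, "user123", "email:test@example.com", "email_search", now=stamp)
    blocked = audit_query(tmp, "user123", "crack password for bob", "general_osint", now=stamp)
    logged = (Path(tmp) / "osint_queries_2025-01.jsonl").read_text(encoding="utf-8").splitlines()

print(approved["status"], blocked["status"], len(logged))
```

One JSON object per line keeps the log append-only and lets each month's file be processed line by line.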
# Run OSINT tests
pytest tests/test_osint.py -v
# Test specific module
pytest tests/test_email_intel.py -v
# Core (required)
pip install requests beautifulsoup4
# Phone intelligence (optional but recommended)
pip install phonenumbers
# Full installation
pip install -r requirements.txt
Q: Do I need API keys?
A: No API keys are required for basic features. Optional integrations (HaveIBeenPwned, etc.) require keys.
Q: Is the phonenumbers library required?
A: No, but the phonenumbers library provides advanced features (carrier and type detection).
Q: Are queries stored permanently?
A: Only audit logs are stored (timestamp, user_id, query, query type, status). No other data is persisted.
Q: What if I exceed rate limits?
A: Wait 1 hour or increase limits in config.json (use responsibly).
Q: Can I use this for commercial purposes?
A: Yes, but ensure compliance with local laws and the terms of service of the searched platforms.
Remember: With great power comes great responsibility. Use OSINT ethically! 🛡️