πŸš€ -> Project on GitHub <-

Phone Intelligence System - Developer Documentation

Version: 1.4.5 Module: core/osint/phone_intel.py Last Updated: 2025-10-30

Table of Contents

  1. Overview
  2. Architecture
  3. Phone Number Parsing
  4. Auto-Region Detection
  5. Normalization & Storage
  6. Query Processing
  7. AI-Powered Suggestions
  8. Format Variations
  9. API Reference
  10. Testing

Overview

The Phone Intelligence system provides comprehensive phone number analysis with support for 11 countries and automatic region detection. It handles various input formats and normalizes numbers for consistent storage and duplicate prevention.

Key Features

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    User Query                           β”‚
β”‚             phone:04167/21 60 111                       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Query Parser (query_parser.py)             β”‚
β”‚  β€’ Extracts phone from query using flexible regex       β”‚
β”‚  β€’ Supports: phone:xxx, phone:"xxx", phonenumber:xxx    β”‚
β”‚  β€’ Handles spaces, slashes, dashes in number            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚          Phone Intelligence (phone_intel.py)            β”‚
β”‚  1. Normalize input (remove non-digits except +)        β”‚
β”‚  2. Auto-detect region if not provided                  β”‚
β”‚  3. Parse with phonenumbers library                     β”‚
β”‚  4. Validate & extract metadata                         β”‚
β”‚  5. Generate format variations                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Agent (agent.py)                           β”‚
β”‚  β€’ Format results for display                           β”‚
β”‚  β€’ Generate AI-powered alternative queries              β”‚
β”‚  β€’ Execute web searches with variations                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚          Memory Store (memory_store.py)                 β”‚
β”‚  β€’ Normalize to E.164 format (+4941672160111)           β”‚
β”‚  β€’ Check for duplicates using normalized value          β”‚
β”‚  β€’ Store with original format preserved                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Phone Number Parsing

Query Parser Pattern

File: core/osint/query_parser.py:86

'phone': r'(?:phone|phonenumber):(?:"([^"]+)"|([^\s]+(?:[\s/\-][^\s]+)*))'

Supports:

Extraction Logic:

# Extract phone operator
phone_match = re.search(self.OPERATORS['phone'], remaining)
if phone_match:
    # Handle both quoted and unquoted phone
    phone_string = phone_match.group(1) if phone_match.group(1) else phone_match.group(2)

    # Split by common separators for multiple phones
    phones = [p.strip() for p in re.split(r'[,;]', phone_string) if p.strip()]

    if phones:
        parsed.phone = phones[0]  # Primary phone
        parsed.phones = phones     # All phones

Auto-Region Detection

File: core/osint/phone_intel.py:220-271

Algorithm

When no region is provided, the system attempts to detect it automatically:

def _detect_region(self, normalized: str) -> Optional[str]:
    """Auto-detect country from national format number."""

    # Numbers starting with 0 (European national format)
    if normalized.startswith('0') and not normalized.startswith('00'):
        regions_to_try = [
            'DE',  # Germany
            'GB',  # UK
            'PL',  # Poland
            'FR',  # France
            'IT',  # Italy
            'ES',  # Spain
            'AT',  # Austria
            'CH',  # Switzerland
            'NL',  # Netherlands
            'BE',  # Belgium
        ]

        for region in regions_to_try:
            try:
                parsed = phonenumbers.parse(normalized, region)
                if phonenumbers.is_valid_number(parsed):
                    return region
            except:
                continue

    # Numbers without leading 0 (US/CA 10-digit format)
    elif normalized.isdigit() and len(normalized) == 10:
        try:
            parsed = phonenumbers.parse(normalized, 'US')
            if phonenumbers.is_valid_number(parsed):
                return 'US'
        except:
            pass

    return None

Priority Order

The system tries regions in priority order based on usage patterns:

  1. Germany (DE) - Most common in European queries
  2. UK (GB) - Second most common European
  3. Poland (PL) - Growing usage
  4. France, Italy, Spain, Austria, Switzerland, Netherlands, Belgium

Examples

Input Auto-Detected Region Reasoning
030 12345678 DE German area code (Berlin)
020 7946 0958 GB UK area code (London)
5551234567 US 10-digit without leading 0
022 123 4567 PL Polish area code (Warsaw)
04167/21 60 111 DE German area code (4167)

Normalization & Storage

File: core/memory_store.py:269-350

E.164 Format

All phone numbers are normalized to E.164 format for storage:

Normalization Process

def _normalize_phone(self, phone: str) -> str:
    """Normalize phone to E.164 format."""
    try:
        import phonenumbers

        # Remove all non-digit characters except +
        cleaned = re.sub(r'[^\d+]', '', phone)

        # Try to parse with auto-detection
        try:
            parsed = phonenumbers.parse(cleaned, None)
        except:
            # Try common regions
            for region in ['DE', 'US', 'GB']:
                try:
                    parsed = phonenumbers.parse(cleaned, region)
                    if phonenumbers.is_valid_number(parsed):
                        break
                except:
                    continue
            else:
                # Fallback: just digits
                return re.sub(r'\D', '', phone)

        # Format to E164
        return phonenumbers.format_number(
            parsed,
            phonenumbers.PhoneNumberFormat.E164
        )
    except ImportError:
        # Fallback: remove all non-digits
        return re.sub(r'\D', '', phone)

Duplicate Detection

def remember_phone(self, phone: str, metadata: Optional[Dict] = None) -> bool:
    # Normalize phone
    normalized_phone = self._normalize_phone(phone)

    entry = {
        'value': normalized_phone,      # E.164 format for comparison
        'original': phone.strip(),      # Original format preserved
        'added_at': datetime.now().isoformat(),
        'metadata': metadata or {}
    }

    # Check if already exists (compare normalized values)
    if any(self._normalize_phone(p['value']) == normalized_phone
           for p in self.data['phones']):
        logger.info(f"Phone already in memory")
        return False  # Duplicate detected

    # Store new phone
    self.data['phones'].append(entry)
    return True

Example

# All these variations are recognized as the SAME number:
remember_phone("04167/21 60 111")    # β†’ +4941672160111
remember_phone("041672160111")       # β†’ +4941672160111 (duplicate!)
remember_phone("+4941672160111")     # β†’ +4941672160111 (duplicate!)
remember_phone("+49 4167 2160111")   # β†’ +4941672160111 (duplicate!)

# Result: Only 1 phone stored

Query Processing

File: core/agent.py:1592-1594, 1901-1925

Processing Flow

# 1. Parse query
parsed = parser.parse(query)  # phone:04167/21 60 111

# 2. Check if phone intelligence enabled
if parsed.phone and phone_intel:
    phone_parts = self._process_phone_intelligence(
        parsed.phone,
        phone_intel
    )

# 3. Analyze phone
def _process_phone_intelligence(self, phone: str, phone_intel) -> list:
    # Analyze with PhoneIntelligence
    phone_result = phone_intel.analyze_phone(phone)

    # Format results
    response_parts = ["\n═══ Phone Intelligence ═══\n"]
    response_parts.append(f"**Phone:** {phone_result['input']}")
    response_parts.append(f"**Valid:** {'βœ“' if phone_result['valid'] else 'βœ—'}")

    if phone_result['valid']:
        # Add analysis results
        response_parts.extend(self._format_phone_results(phone_result))

        # Generate AI suggestions
        ai_queries = self._generate_phone_ai_suggestions(phone_result)
        if ai_queries:
            response_parts.append("\n═══ AI Analysis ═══\n")
            response_parts.append("**Alternative Queries:**")
            for query in ai_queries:
                response_parts.append(f"  β€’ {query}")

        # Search online
        online_parts = self._search_phone_online(phone_result)
        response_parts.extend(online_parts)

    return response_parts

AI-Powered Suggestions

File: core/agent.py:1927-1965

Generation Logic

AI suggestions are generated based on phone analysis metadata:

def _generate_phone_ai_suggestions(self, phone_result: dict) -> list:
    """Generate context-aware alternative queries."""
    queries = []

    country = phone_result.get('country', '')
    phone_type = phone_result.get('type', 'unknown')
    carrier = phone_result.get('carrier', '')
    formatted = phone_result.get('formatted', '')

    # Map phone types to readable text
    type_map = {
        'fixed_line': 'landline',
        'mobile': 'mobile',
        'fixed_line_or_mobile': 'phone',
        'toll_free': 'toll-free number',
        'voip': 'VoIP number'
    }
    type_text = type_map.get(phone_type, 'phone number')

    # Generate queries
    if country and formatted:
        # Query 1: Country + Type + Number
        queries.append(f"{country} {type_text} {formatted}")

    if carrier and formatted:
        # Query 2: Carrier + Number
        queries.append(f"{carrier} {formatted}")

    # Query 3: Alternative format
    if phone_result.get('variations'):
        for var in phone_result['variations'][:2]:
            if var != phone_result['input'] and var != formatted:
                queries.append(f'"{var}" contact')
                break

    # Deduplicate and limit to 3
    queries = list(dict.fromkeys(queries))[:3]
    return queries

Example Output

Input: phone:04167/21 60 111

Analysis:

Generated Queries:

  1. Apensen landline +49 4167 2160111
  2. "041672160111" contact

Format Variations

File: core/osint/phone_intel.py:318-411

Generation Algorithm

def generate_variations(self, phone: str) -> List[str]:
    """Generate format variations for search."""
    normalized = self._normalize_phone(phone)
    variations = [phone, normalized]

    # Country-specific formats
    if normalized.startswith('+49'):
        # German formats
        national = '0' + normalized[3:]
        variations.append(national)
        variations.append(f"+49 {normalized[3:6]} {normalized[6:]}")
        variations.append(f"0{normalized[3:6]} {normalized[6:]}")

    elif normalized.startswith('+44'):
        # UK formats
        national = '0' + normalized[3:]
        variations.append(national)
        variations.append(f"+44 {normalized[3:5]} {normalized[5:9]} {normalized[9:]}")

    elif normalized.startswith('+1'):
        # US/Canada formats
        area = normalized[2:5]
        exchange = normalized[5:8]
        number = normalized[8:]
        variations.append(f"({area}) {exchange}-{number}")
        variations.append(f"{area}-{exchange}-{number}")
        variations.append(f"{area}.{exchange}.{number}")

    # ... (more countries)

    # Remove duplicates
    variations = list(set(variations))
    return variations

Supported Formats by Country

Country Formats Generated
Germany (DE) +49 151 12345678, 0151 12345678, 015112345678
UK (GB) +44 20 7946 0958, 020 7946 0958
USA/CA +1 555 123 4567, (555) 123-4567, 555-123-4567, 555.123.4567
Poland (PL) +48 22 123 4567, 22 123 4567
France (FR) +33 1 42 86 82 00, 01 42 86 82 00

API Reference

PhoneIntelligence Class

class PhoneIntelligence:
    """Phone number OSINT capabilities."""

    def __init__(self):
        """Initialize with phonenumbers library if available."""

    def analyze_phone(self, phone: str, region: str = None) -> Dict:
        """
        Comprehensive phone number analysis.

        Args:
            phone: Phone number in any format
            region: Optional country code (e.g., 'DE', 'US')

        Returns:
            {
                'input': str,           # Original input
                'valid': bool,          # Is valid phone number
                'formatted': str,       # International format
                'country': str,         # Country/location name
                'region': str,          # Region code (e.g., 'DE')
                'carrier': str,         # Carrier name (if available)
                'type': str,           # mobile/fixed_line/voip/etc.
                'variations': List[str], # Format variations
                'confidence': float     # 0.0 - 1.0
            }
        """

Memory Store Methods

class MemoryStore:

    def _normalize_phone(self, phone: str) -> str:
        """Normalize phone to E.164 format."""

    def remember_phone(self, phone: str,
                      metadata: Optional[Dict] = None,
                      user_id: str = "anonymous") -> bool:
        """
        Store phone number (with duplicate detection).

        Args:
            phone: Phone number in any format
            metadata: Optional metadata (country, type, etc.)
            user_id: User identifier

        Returns:
            True if added, False if duplicate
        """

Testing

Unit Tests

File: tests/unit/test_cloud_llm_client.py

OpenAI tests are automatically skipped if the package is not installed:

try:
    import openai
    HAS_OPENAI = True
except ImportError:
    HAS_OPENAI = False

@pytest.mark.skipif(not HAS_OPENAI, reason="openai package not installed")
class TestOpenAIClient:
    # Tests only run if openai is installed
    pass

Manual Testing

# Test various formats
phone:04167/21 60 111          # German with slash
phone:"555 123 4567"           # US with quotes
phone:+44 20 7946 0958         # UK international
phonenumber:022 123 4567       # Poland alternative keyword

# Test memory storage
phone:04167/21 60 111
<merke dir die phone number
status                         # Should show Phones: 1

# Test duplicate detection
phone:041672160111
<merke phone
status                         # Should still show Phones: 1 (duplicate!)

Expected Behavior

  1. Parsing: All formats should be recognized
  2. Auto-Detection: Country should be auto-detected for national formats
  3. Normalization: Same number in different formats β†’ 1 stored entry
  4. AI Suggestions: Context-aware queries should be generated
  5. Variations: Multiple search formats should be created

Error Handling

Invalid Phone Numbers

# Input: phone:123
# Output: Valid: βœ— False
# Reason: Too short, doesn't match any pattern

Region Detection Failure

# Input: phone:99999999
# Output: Uses basic validation (7-15 digits)
# Type: unknown
# Country: None

phonenumbers Library Not Available

# Falls back to basic validation
# Pattern: ^\+?\d{7,15}$
# No carrier/type detection
# Simple country detection based on prefix

Performance Considerations

Caching

Phone analysis results are cached in the session:

Rate Limiting

Web searches for phone variations respect rate limits:

Memory Usage

Normalization prevents memory bloat:

Future Enhancements

References

Last Updated: 2025-10-30 Maintainer: Development Team Version: 1.4.5