
Social Media Intelligence (OSINT)


Overview

The Social Intelligence module extends CrawlLama’s OSINT capabilities with comprehensive social media analysis and monitoring.

Features

1. Username Analysis

2. Email-based Profile Discovery

3. Activity Monitoring

4. Risk Assessment

Supported Platforms

| Platform  | Status | API Integration          | Username Pattern                 |
|-----------|--------|--------------------------|----------------------------------|
| Twitter   |        | Optional                 | 1-15 characters, A-Z, 0-9, _     |
| Instagram |        | Optional                 | 1-30 characters, A-Z, 0-9, _, .  |
| LinkedIn  |        | Optional (API available) | 3-100 characters, A-Z, 0-9, -    |
| Facebook  |        | Optional                 | 5-50 characters, A-Z, 0-9, .     |
| GitHub    |        |                          | 1-39 characters, A-Z, 0-9, -     |
| Reddit    |        |                          | 3-20 characters, A-Z, 0-9, _, -  |
| YouTube   |        | Optional                 | 1-100 characters, A-Z, 0-9, _, - |
| TikTok    |        | Optional                 | 1-24 characters, A-Z, 0-9, _, .  |
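The username patterns above can be expressed as regular expressions for quick local validation before any network request is made. A minimal sketch, with patterns derived from the table; each platform's real rules may be stricter:

```python
import re

# Regex patterns derived from the table above; treat them as
# illustrative, not as the platforms' authoritative rules.
USERNAME_PATTERNS = {
    "twitter":   re.compile(r"^[A-Za-z0-9_]{1,15}$"),
    "instagram": re.compile(r"^[A-Za-z0-9_.]{1,30}$"),
    "linkedin":  re.compile(r"^[A-Za-z0-9-]{3,100}$"),
    "github":    re.compile(r"^[A-Za-z0-9-]{1,39}$"),
    "reddit":    re.compile(r"^[A-Za-z0-9_-]{3,20}$"),
}

def valid_on(username: str, platform: str) -> bool:
    """Return True if the username matches the platform's pattern."""
    pattern = USERNAME_PATTERNS.get(platform)
    return bool(pattern and pattern.match(username))
```

Checking candidates locally first avoids wasting requests on usernames a platform could never accept.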

Usage

Basic Username Analysis

from core.osint.social_intel import SocialIntelligence

async def analyze_user():
    social = SocialIntelligence()

    # Analyze a username
    results = await social.analyze_username(
        username="john_doe",
        platforms=["twitter", "instagram", "github"]
    )

    print(f"Found on {results['summary']['platforms_with_presence']} platforms")

    # Generate report
    report = social.generate_social_report(results)
    print(report)

Email-based Profile Discovery

async def search_by_email():
    social = SocialIntelligence()

    # Discover profiles based on email
    results = await social.discover_profiles_by_email("john.doe@company.com")

    print(f"Username matches: {len(results['username_matches'])}")
    for match in results['username_matches']:
        print(f" - {match['platform']}: {match['url']}")

Activity Monitoring

async def monitor_activity():
    social = SocialIntelligence()

    # Monitor social media activity
    activity = await social.monitor_social_activity(
        username="target_user",
        platforms=["twitter", "instagram"]
    )

    print(f"Activity level: {activity['activity_level']}")
    print(f"Sentiment: {activity['overall_sentiment']}")

CLI Integration

The Social Intelligence module is integrated into the CrawlLama CLI:

# Analyze username
python main.py --osint --social-username "john_doe"

# Email-based search
python main.py --osint --social-email "john@example.com"

# Activity monitoring
python main.py --osint --social-monitor "target_user" --platforms twitter,instagram

API Configuration (Optional)

For advanced features, API keys can be configured:

{
    "social_apis": {
        "twitter": {
            "api_key": "your_twitter_api_key",
            "api_secret": "your_twitter_api_secret",
            "access_token": "your_access_token",
            "access_secret": "your_access_secret"
        },
        "instagram": {
            "access_token": "your_instagram_token"
        }
    }
}
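One way to consume this file is a small loader that tolerates a missing config. A hedged sketch; the filename and location are assumptions, not a documented CrawlLama convention:

```python
import json
from pathlib import Path

def load_social_apis(path: str = "config.json") -> dict:
    """Read the optional social_apis block; return {} if the file is absent."""
    config_file = Path(path)
    if not config_file.exists():
        return {}  # API keys are optional, so a missing config is fine
    config = json.loads(config_file.read_text())
    return config.get("social_apis", {})
```

Returning an empty dict keeps API-key handling optional: callers can check for a platform's key and silently fall back to scraping when it is absent.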

LinkedIn API (Optional)

By default, LinkedIn profile detection uses web scraping (no extra dependencies, no credentials required). For enhanced LinkedIn intelligence, you can optionally install the linkedin-api package.

Default: Web Scraping

Optional: LinkedIn API

Installation:

pip install linkedin-api==2.3.1 lxml==5.3.0

Configuration (environment variables in .env):

LINKEDIN_EMAIL=your_linkedin_email@example.com
LINKEDIN_PASSWORD=your_linkedin_password

Security Warning:

Terms of Service Notice: Using the linkedin-api library involves unofficial access to LinkedIn data. This may violate LinkedIn’s Terms of Service. This feature is provided for authorized security research, threat intelligence, and compliance/due diligence purposes only. Users are responsible for ensuring their use complies with applicable laws and LinkedIn’s ToS.

How It Works

When linkedin-api is installed and credentials are configured:

  1. LinkedIn lookups first try the API for richer data
  2. If the API call fails, web scraping is used as fallback
  3. All other platforms continue using web scraping as before

When linkedin-api is not installed:

  1. LinkedIn lookups use web scraping (same as all other platforms)
  2. No errors or warnings; the system falls back gracefully
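The fallback order above can be sketched as follows; `lookup_linkedin`, `api_client`, and `scrape` are hypothetical names for illustration, not the module's actual internals:

```python
def lookup_linkedin(username, api_client=None, scrape=None):
    """Try the LinkedIn API first when available, fall back to scraping.

    api_client: optional API wrapper (present only when linkedin-api is
    installed and credentials are configured); scrape: the same web-scraping
    path every other platform uses.
    """
    if api_client is not None:
        try:
            # Richer profile data when the API is available
            return api_client.get_profile(username)
        except Exception:
            pass  # API failed; fall through to scraping silently
    # Default path: web scraping, same as all other platforms
    return scrape(username)
```

Because the API branch is guarded and swallows failures, the absence of `linkedin-api` never surfaces as an error, matching the behavior described above.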

Privacy & Compliance

Important Notes on Legal Use:

Output Formats

JSON Structure

{
    "username": "john_doe",
    "platforms_found": [
        {
            "platform": "github",
            "username": "john_doe",
            "url": "https://github.com/john_doe",
            "exists": true,
            "profile_data": {
                "display_name": "John Doe",
                "verified": false,
                "follower_count": 150
            },
            "last_checked": 1698765432.0
        }
    ],
    "summary": {
        "total_platforms_checked": 8,
        "platforms_with_presence": 3,
        "confidence_score": 0.375,
        "risk_indicators": ["Multiple username variations found"]
    }
}
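The summary fields follow directly from the per-platform results: `confidence_score` is the fraction of checked platforms with a presence (3/8 = 0.375 in the example above). A sketch of that derivation, where `build_summary` is an illustrative helper, not part of the module's API:

```python
def build_summary(platforms_found, total_checked):
    """Derive the summary block from per-platform lookup results."""
    present = [p for p in platforms_found if p.get("exists")]
    score = round(len(present) / total_checked, 3) if total_checked else 0.0
    return {
        "total_platforms_checked": total_checked,
        "platforms_with_presence": len(present),
        # Fraction of checked platforms where the username exists
        "confidence_score": score,
    }
```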

Report Format

╔══════════════════════════════════════════════════════════════╗
β•‘               SOCIAL MEDIA INTELLIGENCE REPORT               β•‘
β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•

Target Username: john_doe
Analysis Date: 2025-10-24 15:30:45

SUMMARY:
β”œβ”€ Platforms Found: 3/8
β”œβ”€ Confidence Score: 0.375 (37.5%)
└─ Risk Level: LOW

PLATFORMS WITH PRESENCE:
β”œβ”€ GITHUB: https://github.com/john_doe
β”œβ”€ TWITTER: https://twitter.com/john_doe
└─ LINKEDIN: https://linkedin.com/in/john_doe

USERNAME VARIATIONS FOUND:
β”œβ”€ john_doe_2024: 2 platform(s)
└─ john_doe_official: 1 platform(s)

Performance & Limits

Testing

# Run social intelligence tests
python tests/test_social_intel.py

# Unit tests
pytest tests/test_social_intel.py -v

# Coverage report
pytest tests/test_social_intel.py --cov=core.osint.social_intel

Troubleshooting

Common Issues

  1. Timeout Errors:
    • Solution: Increase session_timeout in configuration
    • Default: 10 seconds
  2. Rate Limiting:
    • Solution: Implement longer pauses between requests
    • Use API keys for higher limits
  3. False Positives:
    • Solution: Use stricter validation patterns
    • Cross-reference with multiple indicators

Debug Mode

import logging
logging.getLogger("crawllama").setLevel(logging.DEBUG)

Roadmap

Planned Features (v1.5+)

API Extensions