This is the first chapter in the OSINT section of SystemLog. It presents my approach to building a private, secure and fully self-hosted OSINT toolkit, embracing structured methodology, automation and local AI — without sending sensitive queries or data to external cloud providers.
The goal is not to replicate law-enforcement tools or closed platforms. The goal is to build a practical, field-ready OSINT workflow that relies exclusively on:
- open-source intelligence,
- passive techniques,
- automated processing via n8n,
- and a private AI running locally on the HOME server.
This creates a controlled, ethical and non-leaking analysis environment.
1. What OSINT means in practice
Open-Source Intelligence is not “hacking” and it is not privileged access.
OSINT is:
- collecting publicly available data
- analysing signals, metadata and patterns
- correlating information from open sources
- understanding digital footprints
- building context
- drawing conclusions from evidence, not assumptions
Most actionable intelligence comes not from secret access but from the ability to notice what others overlook.
2. Ethical boundaries & principles
Before any tool or technique, OSINT must follow strict rules:
- Only publicly accessible information
- No interaction with targets
- No exploitation, intrusion or bypassing protection
- Preserve privacy where possible
- Avoid unnecessary collection
- Document sources transparently
These principles define the difference between:
- OSINT (legal, passive, public)
- Intrusion / exploitation (illegal, active)
The SystemLog OSINT toolkit is built entirely on the legal, passive side.
3. Core OSINT methods I rely on
My workflow is based on the following passive intelligence categories:
1. Web & content intelligence
- historical snapshots
- redirects
- server metadata
- robots.txt, sitemaps
- fingerprinting technologies
2. Domain & network intelligence
- WHOIS lookups
- DNS records
- name server chains
- certificate transparency logs
- subdomain enumeration
- passive scanning databases
3. Metadata & file analysis
- EXIF data
- document metadata
- archive structure
- hashing & comparison
4. Infrastructure signals
- headers
- TLS fingerprints
- routing changes
- hosting provider footprints
5. Social & contextual signals
(no direct user data, only open profiles)
- post timing
- network patterns
- organisation structure
- linked resources
These represent the “raw materials” that the toolkit processes.
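As one illustration of the "archive structure" item in category 3, here is a minimal sketch that lists what a ZIP archive contains without extracting anything. It uses only Python's standard `zipfile` module; the function name `archive_structure` and the choice of fields are my own assumptions for this example, not a fixed part of the toolkit.

```python
import zipfile


def archive_structure(source):
    """List the entries of a ZIP archive without extracting any file.

    `source` may be a path or a binary file-like object.
    The returned field names are illustrative, not a standard format.
    """
    with zipfile.ZipFile(source) as archive:
        return [
            {
                "name": info.filename,        # path stored inside the archive
                "size": info.file_size,       # uncompressed size in bytes
                "modified": info.date_time,   # (year, month, day, hour, min, sec)
                "crc32": f"{info.CRC:08x}",   # checksum recorded in the archive
            }
            for info in archive.infolist()
        ]
```

Stored timestamps and internal paths are often more revealing than the file contents, which is why the sketch keeps them and never unpacks the files.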
4. Tool stack – private, modular, extensible
The OSINT lab is composed of open-source tools running entirely in my own infrastructure.
Local tools
- dnsx, httpx, subfinder
- whois, dig, curl
- hashing & metadata utilities
- local storage for evidence
- Python scripts & custom parsers
n8n automation workflows
Used for:
- periodic scans
- snapshotting URLs
- collecting passive fingerprints
- exporting evidence into files
- parsing large datasets
- sending alerts
- correlation tasks
Local AI (Sim AI + Ollama)
Used for:
- text classification
- summarisation
- recognising patterns
- comparing changes over time
- grouping related data
- writing human-readable reports
No cloud LLM is used — this keeps all input private and controlled.
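To make the local AI step more concrete, here is a hedged sketch of how structured findings could be handed to Ollama over its local HTTP API. It assumes Ollama listens on its default port 11434 and that a model is already pulled; the model name `llama3`, the prompt wording and the function names are placeholders I chose for illustration.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(findings: dict, model: str = "llama3") -> dict:
    """Build the JSON payload for a non-streaming generate call."""
    prompt = (
        "Summarise the following OSINT findings. "
        "Only restate what is in the data; do not add facts.\n\n"
        + json.dumps(findings, indent=2, sort_keys=True)
    )
    return {"model": model, "prompt": prompt, "stream": False}


def summarise(findings: dict, model: str = "llama3", url: str = OLLAMA_URL) -> str:
    """Send the findings to the local model and return its text answer."""
    body = json.dumps(build_summary_request(findings, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

Because the request only ever goes to `localhost`, the findings never leave the machine that runs the model.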
PRIVATE DNS + Pi-hole
For:
- resolving OSINT targets cleanly
- logging DNS behaviour
- keeping DNS queries away from third-party resolvers
- blocking telemetry
Secure backbone (WireGuard)
Ensures:
- remote OSINT work is routed through the HOME stack
- no leaks
- no cloud exposure
The entire OSINT workflow is isolated within the private infrastructure.
5. How a private AI transforms the OSINT workflow
Using an offline LLM instead of cloud AI services provides several benefits:
- No data leaves my network
- No third-party logging
- No rate limits
- Unlimited usage
- Custom prompts and templates
- Ability to process sensitive scenarios safely
The AI is not “making up intelligence” — its job is to organise findings, detect patterns and create structured reports.
Anonymised examples:
- turning raw DNS data into a concise summary
- grouping discovered subdomains by similarity
- comparing site changes between two snapshots
- detecting technology stacks from headers
- generating reports for SystemLog automatically
This creates a human-AI hybrid workflow: the model sorts and summarises, and interpretation stays with the analyst. It covers far more material than manual OSINT alone.
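Not every item in the list above needs a language model. Detecting technology stacks from headers, for instance, can start as plain string matching before anything is passed to the AI. The sketch below is a deliberately small version of that idea; the keyword lists and the function name `detect_stack` are illustrative assumptions, and real fingerprinting needs far more rules.

```python
def detect_stack(headers: dict) -> list:
    """Return technology hints found in common HTTP response headers.

    The keyword tuples are a tiny illustrative sample, not a full ruleset.
    """
    # Header names are case-insensitive, so normalise before matching.
    lowered = {key.lower(): str(value).lower() for key, value in headers.items()}
    hints = []

    server = lowered.get("server", "")
    for name in ("nginx", "apache", "cloudflare", "caddy"):
        if name in server:
            hints.append(name)

    powered_by = lowered.get("x-powered-by", "")
    for name in ("php", "express", "asp.net"):
        if name in powered_by:
            hints.append(name)

    return hints


print(detect_stack({"Server": "nginx/1.24.0", "X-Powered-By": "PHP/8.2.1"}))
# prints ['nginx', 'php']
```

Headers can be removed or faked, so the output is a hint to verify rather than a confirmed finding.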
6. Automation: OSINT through n8n
n8n acts as the engine behind the OSINT toolkit.
Examples of automated tasks:
Scheduled metadata collection
Take snapshots of:
- headers
- status codes
- redirects
- certificates
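Once snapshots like these are stored, the interesting part is what changed between two runs. The following sketch compares two snapshot records field by field. It assumes each snapshot is a flat dictionary; the field names in the usage example are made up for illustration.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Return only the fields whose values differ between two snapshots."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = {"before": before, "after": after}
    return changes


# Hypothetical snapshots of the same URL taken on two different days.
yesterday = {"status": 200, "server": "nginx", "location": None}
today = {"status": 301, "server": "nginx", "location": "https://target.example/"}
print(diff_snapshots(yesterday, today))
```

An empty result means nothing changed, which makes it a convenient condition for deciding whether an alert is worth sending.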
Domain intelligence
- fetch WHOIS
- check NS changes
- passively enumerate subdomains
- analyse certificate logs
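For the certificate log step, a short sketch of how subdomains can be pulled out of certificate transparency records. It assumes the records were already fetched as JSON in the shape that crt.sh returns, where each entry has a `name_value` field with one or more newline-separated names; if another CT source is used, the field name will differ. The sample records are invented.

```python
def subdomains_from_ct(records: list, domain: str) -> list:
    """Collect unique host names under `domain` from CT log records.

    Assumes each record is a dict with a newline-separated "name_value"
    field (the shape of crt.sh JSON output).
    """
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard entry as its parent name
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Invented sample records for illustration.
sample = [
    {"name_value": "www.target.example\n*.target.example"},
    {"name_value": "mail.target.example"},
    {"name_value": "unrelated.example"},
]
print(subdomains_from_ct(sample, "target.example"))
# prints ['mail.target.example', 'target.example', 'www.target.example']
```

Nothing here touches the target itself: the names come from public logs, which keeps this step on the passive side.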
Evidence bucket
- save every scan to timestamped folders
- hash results for integrity
- correlate changes over time
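The evidence bucket can be sketched in a few lines of Python. The example writes each scan result into a timestamped folder and stores a SHA-256 digest next to it so later tampering or corruption can be detected. The folder layout, file names and function names are assumptions for this sketch, not the exact structure used in the lab.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_scan(base_dir: Path, target: str, result: dict) -> Path:
    """Write a scan result to <base_dir>/<target>/<UTC timestamp>/ with a digest."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    folder = base_dir / target / stamp
    folder.mkdir(parents=True, exist_ok=True)

    payload = json.dumps(result, indent=2, sort_keys=True).encode("utf-8")
    (folder / "result.json").write_bytes(payload)

    # Same "<digest>  <filename>" layout that the sha256sum tool prints.
    digest = hashlib.sha256(payload).hexdigest()
    (folder / "result.json.sha256").write_text(
        f"{digest}  result.json\n", encoding="utf-8"
    )
    return folder


def verify_scan(folder: Path) -> bool:
    """Recompute the digest of result.json and compare it with the stored one."""
    recorded = (folder / "result.json.sha256").read_text(encoding="utf-8").split()[0]
    actual = hashlib.sha256((folder / "result.json").read_bytes()).hexdigest()
    return recorded == actual
```

A digest stored beside the file only catches accidental changes; for stronger guarantees the digests would also need to be kept somewhere the evidence folder cannot overwrite.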
AI report generation
Sim AI takes structured data and writes:
- daily summaries
- full OSINT reports
- change analysis
- human-readable explanations
This automation frees time for interpretation, not typing.
7. Anonymised workflow example
Here is an anonymised example of an OSINT flow:
- Input: target.example
- n8n fetches DNS, WHOIS, subdomains
- Tools fingerprint technologies and server signatures
- Certificates checked in CT logs
- HTML + headers archived
- Metadata extracted
- All results sent to the local AI
- AI summarises findings into a SystemLog-ready report
No public cloud is involved. No personal or sensitive information is processed.
Everything stays within the HOME → WireGuard → Edge private loop.
Conclusion
This OSINT toolkit is not built to impress with exotic exploits or intrusive techniques. It is built to be:
- ethical
- passive
- private
- automated
- AI-enhanced
- resilient
- and fully self-hosted
It allows me to analyse digital signals, detect patterns, and document findings — without relying on external providers or leaking data.
This marks the beginning of the SystemLog OSINT series.