About MapTheNet

Understanding the structure of the internet through open research.

The Project

MapTheNet.org is a non-profit research initiative that systematically crawls the public web to record domain-level link relationships. By analyzing which domains link to which others, we construct a graph that represents the connective tissue of the internet.

The resulting dataset is released openly so that researchers, journalists, educators, and policymakers can study how the web is organized, identify structural vulnerabilities, and track changes over time.

Why It Matters

The internet is often described as a network, yet relatively few people have seen its actual structure. Commercial search engines index pages for relevance; we index domains for structure. Our data answers questions like:

  • Which domains serve as critical hubs or bridges?
  • How do national webs differ in density and connectivity?
  • How has the web's structure changed year over year?
  • Where are single points of failure in online infrastructure?
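As a toy illustration of the first question (not the project's actual analysis code), hub domains can be roughly ranked by how many edges touch them in the domain-level link graph. The domain names below are made up for the example:

```python
from collections import Counter

def top_hubs(edges, k=3):
    """Rank domains by total degree (in-links plus out-links).

    `edges` is an iterable of (source_domain, target_domain) pairs.
    Degree is only a rough proxy for "hub"; real analyses would use
    measures such as betweenness centrality to find bridges.
    """
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return [domain for domain, _ in degree.most_common(k)]

edges = [
    ("news.example", "hub.example"),
    ("blog.example", "hub.example"),
    ("hub.example", "shop.example"),
    ("blog.example", "shop.example"),
]
print(top_hubs(edges, k=1))  # ['hub.example'] — the most-connected domain
```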

Methodology

How we build the map, from crawl to visualization.

1. Crawling

Our distributed crawler visits publicly accessible domains, following outbound links while respecting robots.txt and enforcing per-domain rate limits. Only the source and target domain of each link are recorded.
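The politeness rules above can be sketched in a few lines. This is an illustrative single-process version, not the distributed crawler itself; the class name, user agent, and delay value are assumptions for the example:

```python
import time
import urllib.robotparser
from urllib.parse import urlparse

class PoliteFetcher:
    """Sketch of per-domain politeness: robots.txt checks plus a
    minimum delay between requests to the same domain."""

    def __init__(self, min_delay=1.0):
        self.min_delay = min_delay
        self.last_hit = {}   # domain -> time of the most recent request
        self.robots = {}     # domain -> cached RobotFileParser

    def allowed(self, url, user_agent="MapTheNetBot"):
        """Check robots.txt before fetching (performs a network request)."""
        domain = urlparse(url).netloc
        rp = self.robots.get(domain)
        if rp is None:
            rp = urllib.robotparser.RobotFileParser()
            rp.set_url(f"https://{domain}/robots.txt")
            rp.read()  # fetch and parse the domain's robots.txt once
            self.robots[domain] = rp
        return rp.can_fetch(user_agent, url)

    def seconds_until_allowed(self, domain, now=None):
        """Return how long to sleep before hitting `domain` again,
        and record this call as the next request to that domain."""
        now = time.monotonic() if now is None else now
        last = self.last_hit.get(domain)
        wait = 0.0 if last is None else max(0.0, self.min_delay - (now - last))
        self.last_hit[domain] = now + wait  # the request happens after waiting
        return wait

limiter = PoliteFetcher(min_delay=1.0)
print(limiter.seconds_until_allowed("news.example", now=0.0))  # 0.0 on first contact
```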

2. Aggregation

Raw link observations are deduplicated and aggregated into a weighted directed graph. Each edge carries a weight representing the number of distinct pages on the source domain that link to the target domain.
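The aggregation step can be sketched as follows. Assuming each raw observation is a (source page URL, target domain) pair, counting *distinct* source pages per edge both deduplicates repeated observations and produces the weight described above:

```python
from collections import defaultdict
from urllib.parse import urlparse

def aggregate(observations):
    """Collapse raw link observations into weighted domain-level edges.

    The weight of edge (src_domain, dst_domain) is the number of
    distinct pages on src_domain observed linking to dst_domain.
    """
    pages = defaultdict(set)  # (src_domain, dst_domain) -> set of source pages
    for page_url, dst in observations:
        src = urlparse(page_url).netloc
        pages[(src, dst)].add(page_url)
    return {edge: len(page_set) for edge, page_set in pages.items()}

obs = [
    ("https://blog.example/a", "hub.example"),
    ("https://blog.example/a", "hub.example"),  # duplicate observation, ignored
    ("https://blog.example/b", "hub.example"),
]
print(aggregate(obs))  # {('blog.example', 'hub.example'): 2}
```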

3. Enrichment

Domains are enriched with metadata: TLD, country of registration (via public WHOIS), category (via heuristic classification), and basic HTTP response headers such as server software and status codes.
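An enriched entry might look like the record below. The field names are an illustrative schema, not the project's actual export format, and the WHOIS and HTTP values are passed in rather than fetched:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DomainRecord:
    """Illustrative shape of one enriched domain entry."""
    domain: str
    tld: str
    country: Optional[str]   # from public WHOIS, when available
    category: Optional[str]  # heuristic classification
    server: Optional[str]    # HTTP Server response header
    status: Optional[int]    # HTTP status code

def enrich(domain, whois_country=None, category=None, headers=None, status=None):
    """Combine a domain name with externally obtained metadata."""
    headers = headers or {}
    return DomainRecord(
        domain=domain,
        tld=domain.rsplit(".", 1)[-1],  # naive TLD split, for illustration
        country=whois_country,
        category=category,
        server=headers.get("Server"),
        status=status,
    )

rec = enrich("archive.example", whois_country="NL",
             headers={"Server": "nginx"}, status=200)
print(rec.tld, rec.server)  # example nginx
```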

4. Visualization

The graph is rendered using a force-directed layout algorithm. Nodes are positioned so that heavily connected domains cluster together, revealing the natural communities and hierarchies of the web.
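The core idea of a force-directed layout is simple: linked nodes attract, all nodes repel, and the simulation settles into clusters. Below is a minimal Fruchterman-Reingold-style sketch of that idea; it is a toy version, not the renderer the map uses:

```python
import math
import random

def force_layout(nodes, edges, iterations=200, k=0.5, seed=42):
    """Toy force-directed layout: edges attract, all pairs repel."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # repulsion between every pair keeps the layout spread out
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[a][0] += dx / d * f; disp[a][1] += dy / d * f
                disp[b][0] -= dx / d * f; disp[b][1] -= dy / d * f
        # attraction along edges pulls linked domains together
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[a][0] -= dx / d * f; disp[a][1] -= dy / d * f
            disp[b][0] += dx / d * f; disp[b][1] += dy / d * f
        # move each node a small, capped step toward its net force
        for n in nodes:
            dx, dy = disp[n]
            d = math.hypot(dx, dy) or 1e-9
            step = min(d, 0.05)
            pos[n][0] += dx / d * step
            pos[n][1] += dy / d * step
    return pos

pos = force_layout(["a", "b", "c"], [("a", "b")])
# the linked pair "a"-"b" settles close together; "c" drifts away
```

Production-scale maps typically rely on tuned variants of this scheme (ForceAtlas2 is a common choice), since the all-pairs repulsion above is quadratic in the number of nodes.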

Team

MapTheNet is maintained by a small group of volunteers.

Project Lead

Responsible for overall direction, architecture, and data governance.

Crawl Engineering

Builds and operates the distributed crawling infrastructure.

Data & Visualization

Processes crawl data, generates exports, and maintains the map UI.

Want to contribute?

We welcome contributors of all skill levels. Use the contact form to reach the maintainers and we will reply as soon as possible.

Sponsors & Partners

We are grateful for the support of organizations that share our mission.

Sponsorship slots are available. If your organization supports open data, internet research, or digital rights, please reach out via our contact form.