Friday, June 20, 2025

The Digital Press

All the Bits Fit to Print

New System Detects Dead Websites and Domain Ownership Changes

System for detecting dead websites and ownership changes improves crawler efficiency and data quality

From Hacker News

Marginalia Search has implemented a new system to detect when websites are offline or have undergone significant changes, such as ownership transfers or domain parking, using minimal server requests to avoid burdening web servers. This system relies mainly on HTTP HEAD requests and DNS queries to gather data on site availability and changes, improving the quality of search results and crawler efficiency.
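To make the idea concrete, here is a minimal sketch of low-cost probing with one DNS lookup and one HTTP HEAD request per domain. It is not Marginalia's actual implementation: the `Snapshot` record, the `probe` and `classify` helpers, the `example-probe/0.1` user agent, and the rule that a changed IP set or redirect target counts as a "changed" signal are all assumptions made for illustration.

```python
import http.client
import socket
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """What one cheap probe learned about a domain (hypothetical record)."""
    addresses: Tuple[str, ...]  # sorted IPs from DNS; empty if the lookup failed
    status: Optional[int]       # HTTP status of the HEAD request; None if no response
    location: Optional[str]     # Location header, if the server sent one


def resolve(host: str) -> Tuple[str, ...]:
    """Return the sorted set of IP addresses for host, or () on DNS failure."""
    try:
        infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return ()
    return tuple(sorted({info[4][0] for info in infos}))


def probe(host: str, timeout: float = 10.0) -> Snapshot:
    """Do one DNS lookup and one HEAD request; never download a response body."""
    addresses = resolve(host)
    if not addresses:
        return Snapshot((), None, None)
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", "/", headers={"User-Agent": "example-probe/0.1"})
        resp = conn.getresponse()
        return Snapshot(addresses, resp.status, resp.getheader("Location"))
    except (OSError, http.client.HTTPException):
        # Covers timeouts, refused connections, TLS failures, malformed replies.
        return Snapshot(addresses, None, None)
    finally:
        conn.close()


def classify(previous: Optional[Snapshot], current: Snapshot) -> str:
    """Compare two probes of the same domain (assumed, simplified rules)."""
    if not current.addresses:
        return "dns_failure"
    if current.status is None:
        return "unreachable"
    if previous is not None and (
        previous.addresses != current.addresses
        or previous.location != current.location
    ):
        # Only a hint that the site may have moved, been parked, or changed hands.
        return "changed"
    return "ok"
```

Calling `classify(stored_snapshot, probe("example.com"))` on a schedule would give a crawler a cheap signal for when a domain deserves a closer look. A real system would need more evidence than this before concluding that ownership has changed, since IP addresses also change for ordinary reasons such as a hosting move.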

Why it matters: Detecting dead or changed websites prevents serving broken links and helps decide when to recrawl or archive domains.

The big picture: The web’s complexity and inconsistent standards make reliable uptime and change detection difficult, yet it remains crucial for search engines and crawlers.

Stunning stat: A check of more than 1 million domains over 8 hours returned 777,062 successes and tens of thousands of connection errors of various kinds.

Commenters say: Users appreciate the nuanced approach and acknowledge the challenges involved, but point to edge cases such as domain re-registration and geopolitical effects, and note the usefulness of integrating archived web content such as the Wayback Machine.