Semantic Sitemap

Semantic Web Crawling: A Sitemap Extension extends the Sitemap protocol so that RDF data can be discovered and used efficiently.

XML Sitemaps facilitate the effective crawling of GoodRelations data on shop sites.

When publishing a sitemap, indicate the actual date and time of the last modification for each resource (using the "lastmod" element in the sitemap). Some simple sitemap tools stamp every entry with the date on which the sitemap was generated instead of taking each resource's own modification date from its metadata. A crawler then cannot tell which resources have changed, so the whole site has to be crawled again each time you generate a new sitemap. This is a minor problem for a small or medium-size site (fewer than 5,000 pages / articles), but becomes a major issue for sites with tens or hundreds of thousands of pages.
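As a rough illustration of per-resource modification dates, the following Python sketch builds a sitemap in which each entry carries its own "lastmod" value. The helper names build_sitemap and file_mtime, and the example.org URLs in the usage note, are assumptions made for this example and are not taken from any particular sitemap tool.

```python
import os
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

# Namespace of the standard Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def file_mtime(path):
    """Return a file's modification time as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified) pairs.

    Each last_modified value should be a timezone-aware datetime that
    belongs to the resource itself, not the time the sitemap was generated.
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for url, modified in entries:
        node = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(node, "{%s}loc" % SITEMAP_NS).text = url
        # W3C datetime format, normalised to UTC.
        stamp = modified.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
        ET.SubElement(node, "{%s}lastmod" % SITEMAP_NS).text = stamp
    return ET.tostring(urlset, encoding="unicode")
```

A site that stores pages as files could call build_sitemap with pairs such as ("http://example.org/offers/1", file_mtime("offers/1.html")); a database-backed shop would more likely read the timestamp from its own records.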

Example Sitemaps

Sitemap Generation Tools

PingTheSemanticWeb for Semantic Sitemaps provides a Python script that reads semantic sitemaps and pings PingTheSemanticWeb with the available RDF data.

Web crawlers should respect the Robots exclusion standard (robots.txt).
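One way a crawler might check robots.txt before fetching a resource is sketched below, using the urllib.robotparser module from the Python standard library. The make_checker helper, the "ExampleRDFCrawler" user agent, and the sample rules are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser


def make_checker(robots_txt, user_agent):
    """Return a function that reports whether user_agent may fetch a URL.

    robots_txt is the text of a robots.txt file that has already been
    downloaded; a real crawler would fetch it from the site root first.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed


# Hypothetical rules: everything is crawlable except /private/.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# The user agent name is a placeholder for this example.
allowed = make_checker(EXAMPLE_ROBOTS, "ExampleRDFCrawler")
```

With these sample rules, allowed("http://example.org/data/products.rdf") is true, while a URL under http://example.org/private/ is rejected.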