Drupal 6 XML Sitemap for Nodes

After upgrading to Drupal 6 I opted for a quick-and-dirty XML sitemap approach. Previously I used the XML Sitemap module, which is currently available for Drupal 6 only as a development snapshot or directly from CVS. The module offers settings for priority and change frequency, and it can also add taxonomy term and user URLs to the sitemap.

I only wanted nodes and the front page to appear in the sitemap's XML output, without priority or change frequency information for the nodes. With the path and pathauto modules enabled, every node gets a meaningful, search-engine-friendly URL alias, so a simple database query joining two tables is enough to get the necessary data for all published nodes.

Code Snippets

To make the sitemap reachable via a URL, a menu item of the type MENU_CALLBACK goes into the menu hook of a module named custom. The menu hook changed in Drupal 6, as did the whole menu system, which was completely rewritten by chx. To learn more about it, read the menu module documentation.

<?php
/**
 * Implementation of hook_menu().
 */
function custom_menu() {
  $items = array();
  // Register the path "sitemap" as a callback without a visible menu link.
  $items['sitemap'] = array(
    'title' => 'XML Sitemap',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
    'page callback' => 'custom_sitemap',
  );
  return $items;
}
?>

When the URL /sitemap is requested, the function custom_sitemap() is called. The SQL query joins the node and url_alias tables to retrieve the last-modified dates and URL aliases of all published nodes, which are stored in an associative array called $urls. The URL aliases 403 and 404, which are used for custom error pages, are omitted from the array.

The XML output string is then assembled from a heredoc and a foreach loop and printed. The front page is listed first and is the only entry that carries a change frequency.

<?php
function custom_sitemap() {
  // Build the absolute base URL. HTTPS may be unset, or "off" on some servers.
  $https = !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off';
  $base = ($https ? 'https://' : 'http://') . $_SERVER['SERVER_NAME'] . base_path();

  // Map each absolute node URL to its last-modified timestamp.
  $urls = array();
  $result = db_query("SELECT ua.dst, n.changed FROM {node} n INNER JOIN {url_alias} ua ON ua.src = CONCAT('node/', n.nid) WHERE n.status = 1 ORDER BY n.changed DESC");
  while ($r = db_fetch_object($result)) {
    // Skip the aliases used for the custom error pages.
    if ($r->dst !== '404' && $r->dst !== '403') {
      $urls[$base . $r->dst] = $r->changed;
    }
  }

  header('Content-Type: text/xml');

  $xml = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>$base</loc><changefreq>daily</changefreq></url>
EOF;
  foreach ($urls as $url => $t) {
    $xml .= '<url>';
    // check_plain() escapes characters such as & that are special in XML.
    $xml .= '<loc>' . check_plain($url) . '</loc>';
    $xml .= '<lastmod>' . date('Y-m-d', $t) . '</lastmod>';
    $xml .= '</url>';
  }
  $xml .= '</urlset>';
  print $xml;
  exit();
}
?>
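The string-building part of the callback can also be tried outside of Drupal. The following is only a sketch under a few assumptions: the helper names custom_sitemap_scheme() and custom_sitemap_build_xml() are made up for illustration and are not part of any module, the rows are plain arrays instead of database result objects, htmlspecialchars() stands in for Drupal's escaping, and gmdate() is used so that the dates do not depend on the server's time zone.

```php
<?php
// Hypothetical helper (not part of the original module): picks the URL
// scheme from a $_SERVER-style array without raising notices.
function custom_sitemap_scheme(array $server) {
  // HTTPS can be missing, empty, or "off" for plain HTTP requests.
  $https = isset($server['HTTPS']) ? strtolower($server['HTTPS']) : '';
  return ($https !== '' && $https !== 'off') ? 'https://' : 'http://';
}

// Hypothetical helper: builds the sitemap XML from rows shaped like
// array('dst' => URL alias, 'changed' => Unix timestamp).
function custom_sitemap_build_xml($base, array $rows) {
  $lines = array();
  $lines[] = '<?xml version="1.0" encoding="UTF-8"?>';
  $lines[] = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
  // The front page comes first and is the only entry with a changefreq.
  $lines[] = '<url><loc>' . htmlspecialchars($base, ENT_QUOTES, 'UTF-8')
    . '</loc><changefreq>daily</changefreq></url>';
  foreach ($rows as $row) {
    // Skip the aliases reserved for the custom error pages.
    if ($row['dst'] === '403' || $row['dst'] === '404') {
      continue;
    }
    // Escape &, <, > and quotes so the URL is valid inside an XML element.
    $loc = htmlspecialchars($base . $row['dst'], ENT_QUOTES, 'UTF-8');
    $lastmod = gmdate('Y-m-d', $row['changed']);
    $lines[] = '<url><loc>' . $loc . '</loc><lastmod>' . $lastmod . '</lastmod></url>';
  }
  $lines[] = '</urlset>';
  return implode("\n", $lines) . "\n";
}

// Usage with made-up example data.
$rows = array(
  array('dst' => 'about-us', 'changed' => 1209600000),
  array('dst' => '404', 'changed' => 1209600000),
);
print custom_sitemap_build_xml(custom_sitemap_scheme($_SERVER) . 'example.com/', $rows);
```

Keeping the XML assembly in a function that only takes a base URL and an array makes it easy to check the escaping and the 403/404 filter without a database or a running Drupal site.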
Exactly what I was looking for. And then I see that it's you. Nice. Was it also you who was mentioned in the sprydev podcast? I'm looking forward to the next DUB meeting! I'll see you there. Bob

When the next DUG meeting takes place I am not in Berlin. See you at Linuxtag. Greets Ramiro

Thank you very much for that information! I am planning to update my Drupal 5.x portals to 6.2 in the next few days and was looking for such a sitemap solution.
