Drupal 6 XML Sitemap for Nodes
Posted: Fri 05/02/2008 by ramiro
After upgrading to Drupal 6 I opted for a quick and dirty XML sitemap approach. Before, I was using the XML Sitemap module, which is currently available for Drupal 6 as a development snapshot or directly from CVS. The module offers settings for priority and change frequency. It also allows adding taxonomy term and user URLs to the sitemap.
I only wanted nodes and the front page to appear in the sitemap's XML output, without priority or change frequency information. With the path and pathauto modules enabled, which ensure that every node gets a meaningful and search engine friendly URL, a simple database query joining two tables is enough to get the necessary data for all published nodes.
Code Snippets
To make the sitemap reachable via a URL, a menu item of the type MENU_CALLBACK goes into the menu hook of a module named custom. The menu hook changed in Drupal 6, as did the whole menu system, which was completely rewritten by chx. To learn more about it, read the menu module documentation.
<?php
function custom_menu() {
  $items = array();
  $items['sitemap'] = array(
    'title' => 'XML Sitemap',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
    'page callback' => 'custom_sitemap'
  );
  return $items;
}
?>
When the URL /sitemap is requested, the function custom_sitemap() is called. The SQL query joins the node and url_alias tables to retrieve the modified dates and URL aliases of all published nodes, which are stored in an associative array called $urls. The URL aliases 403 and 404, which are used for custom error pages, are omitted from the array.
The XML output string is then put together from a heredoc and a foreach loop and printed out.
<?php
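function custom_sitemap() {
  // $_SERVER['HTTPS'] can be unset, or 'off' on some servers, for plain HTTP requests.
  $https = !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off';
  $base = ($https ? 'https://' : 'http://') . $_SERVER['SERVER_NAME'] . base_path();
  $urls = array();
  // Fetch the alias and last-modified time of every published node.
  $result = db_query("SELECT ua.dst, n.changed FROM {node} n INNER JOIN {url_alias} ua ON ua.src = CONCAT('node/', n.nid) WHERE n.status = 1 ORDER BY n.changed DESC");
  while ($r = db_fetch_object($result)) {
    // Skip the aliases of the custom error pages; compare as strings.
    if ($r->dst !== '404' && $r->dst !== '403') {
      $urls[$base . $r->dst] = $r->changed;
    }
  }
  header('Content-Type: text/xml');
  // The heredoc body and its closing EOF; must start at the beginning of the line.
  $xml = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>$base</loc><changefreq>daily</changefreq></url>
EOF;
  foreach ($urls as $url => $t) {
    $xml .= '<url>';
    // check_plain() escapes characters such as & that would break the XML.
    $xml .= '<loc>' . check_plain($url) . '</loc>';
    $xml .= '<lastmod>' . date("Y-m-d", $t) . '</lastmod>';
    $xml .= '</url>';
  }
  $xml .= '</urlset>';
  print $xml;
  exit();
}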
?>

I will not be in Berlin when the next DUG meeting takes place. See you at Linuxtag. Greetings, Ramiro
That's fine with me, Dinaiz. The reason I did not put this code into a module of its own is that there is an XML sitemap module for Drupal that gives you more control than my code snippet.
Anyone, including myself, who is comfortable with such a simple approach can use this code, or now your module, which is good.
My URLs look like this:
http://www.mydomain.ch/en/products
So I had to add the language path prefix to the sitemap's URLs.
I applied the following changes:
replaced:
$result = db_query("SELECT ua.dst, n.changed FROM {node} n INNER JOIN {url_alias} ua ON ua.src = CONCAT( 'node/', n.nid ) WHERE n.status =1 ORDER BY n.changed DESC");
$urls[$base . $r->dst] = $r->changed;
with the following:
$result = db_query("SELECT ua.dst, n.changed, n.language FROM {node} n INNER JOIN {url_alias} ua ON ua.src = CONCAT( 'node/', n.nid ) WHERE n.status =1 ORDER BY n.changed DESC");
$urls[$base . $r->language . '/' . $r->dst] = $r->changed;
In the example this code goes into a module called custom.module, which should be placed in the directory where you install contributed modules, e.g. /sites/all/modules. You also need an .info file, which would be called custom.info. See the documentation on .info files.
When do you get that message?
Do you see the XML output? Which browser are you using?
As an aside, this code could be optimized by putting the while loop in place of the foreach statement. As it is written, you are going through the returned result set twice: once to create an array, and then again through that same array to generate the XML. On large sites, that could be costly.
Good point, the additional foreach loop should be avoided.
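For anyone who wants to try the single-pass idea, here is a minimal sketch of how it might look. It is not the code running on this site. The function name custom_sitemap_xml() and the $fetch_row callback are made up for illustration; the callback stands in for db_fetch_object($result) so that the snippet runs outside of Drupal. It also uses gmdate() instead of date() so that the output does not depend on the server time zone.

```php
<?php
/**
 * Builds the sitemap XML in one pass over the result set.
 *
 * $fetch_row is a hypothetical stand-in for Drupal 6's
 * db_fetch_object($result): each call returns the next row object,
 * or a false value when no rows are left.
 */
function custom_sitemap_xml($base, $fetch_row) {
  $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
  $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
  $xml .= '<url><loc>' . $base . '</loc><changefreq>daily</changefreq></url>' . "\n";
  // Append each <url> element as soon as its row is fetched,
  // so no intermediate $urls array is needed.
  while ($r = call_user_func($fetch_row)) {
    // Skip the aliases used for the custom error pages.
    if (in_array($r->dst, array('403', '404'), TRUE)) {
      continue;
    }
    $xml .= '<url>';
    $xml .= '<loc>' . htmlspecialchars($base . $r->dst) . '</loc>';
    $xml .= '<lastmod>' . gmdate('Y-m-d', $r->changed) . '</lastmod>';
    $xml .= '</url>' . "\n";
  }
  return $xml . '</urlset>';
}
```

Inside custom_sitemap() the while loop would presumably call db_fetch_object($result) directly instead of going through a callback, and the returned string would be printed after the Content-Type header is sent.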
I use it on this site where it contains entries for all nodes with url aliases. The number is significantly higher than 22 or 40.
Are you sure you understand what this code does?
what's code like: #mydiv { position:absolute; top: 50%; left: 50%; width:30em; height:18em; margin-top: -9em; /*set to a negative number 1/2 of your height*/ margin-left: -15em; /*set to a negative number 1/2 of your width*/ border: 1px solid #ccc; background-color: #f3f3f3; }
Check out this post http://dinaiz-two-dot-zero.blogspot.com/2008/07/easily-add-sitemap-to-dr... on Dinaiz' blog. He put this code into a module. I have not tried it, but I guess it is what you are looking for.
Hi Jonny, check out the XML Sitemap module for Drupal, http://drupal.org/project/xmlsitemap, which offers configuration options for priority and content types.