Arne Brachhold

Google Sitemaps FAQ (Sitemap Issues And Errors)

Filed under: SEO,Sitemaps — arne on April 7, 2006

In June 2005, Google announced a new service called Google Sitemaps. This program allows webmasters to submit an index of the URLs they want included in Google’s web search. It’s free to use and helps Google get a more complete overview of your pages.

This FAQ answers the questions I am asked most often by mail. Feel free to contact me if you have suggestions or corrections for this page!

Overview

General Google Sitemap questions

Technical questions

Help with error messages

Statistics and verification

Sitemap Generator for WordPress Plugin FAQ

About this FAQ

General Questions

What is or are Google Sitemaps?

Basically, a Google Sitemap is a file which contains URLs and some additional information for all public pages or documents of your website. Google can read this file and add the defined pages to their index. The Google Sitemaps program is part of the "Google webmaster tools".

And what’s "Google Sidemaps"?

It’s just a typing error. The correct name is "Google Sitemaps".

How can I create a Google Sitemap for my website?

If you are using the blogging software WordPress, you can use the Google Sitemap Generator for WordPress plugin. Otherwise, you can check the list of third party programs at code.google.com.

Do I need a Google Account to use Google Sitemaps?

There are two ways to notify Google about your sitemap.

  • You can register for the Google Sitemaps program and submit your sitemap. This also lets you see some interesting statistics about your site, such as the most used keywords and spidering problems.
  • If you don’t have a Google account or don’t want to create one, you can notify Google about your sitemap by "pinging" their sitemap server. All you need to do is point your browser to http://www.google.com/webmasters/sitemaps/ping?sitemap=http://www.name.com/sitemap.xml
    Google will check your sitemap for updates regularly, so you don’t need to do this more than once.
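
As a rough sketch, the ping URL above could be assembled in Python like this. The function name `build_ping_url` is only my illustration, and percent-encoding the sitemap address is an assumption on my part (it is the usual practice for a URL passed as a query parameter); the endpoint is the one quoted above.

```python
from urllib.parse import quote

# Ping endpoint as quoted in this FAQ.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url: str) -> str:
    """Return the ping URL for the given sitemap address.

    The sitemap URL is percent-encoded (an assumption, see above) so that
    characters such as ':' and '/' cannot be confused with the ping URL itself.
    """
    return PING_ENDPOINT + "?sitemap=" + quote(sitemap_url, safe="")


print(build_ping_url("http://www.name.com/sitemap.xml"))
```

Opening the printed address in a browser has the same effect as typing the ping URL by hand.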

Will Google Sitemaps affect my ranking?

There is no evidence that a Google Sitemap will directly affect your ranking. However, it can help Google crawl and index your site better, which may result in more of your pages appearing in the index.

Where can I submit my German (or international) sitemap?

There are no separate sitemap programs for individual languages or countries. Just register for the international Google Sitemaps program, which is localized in many languages.

Technical questions

What is the maximum size of a Google Sitemap?

According to the Google Sitemap FAQ, your sitemap can contain up to 50,000 URLs and be up to 10MB in size (uncompressed!). However, I recommend splitting such large sitemaps into several smaller ones, which allows Google to regularly retrieve only the ones that changed recently. This will save you a lot of traffic.
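
A minimal sketch of such a split in Python, assuming you already have all your URLs in a list. The helper name `split_urls` is hypothetical, and the sketch only looks at the URL count, not at the 10MB size limit.

```python
from typing import Iterator, List

# URL limit per sitemap file, as stated above.
MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls: List[str], chunk_size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each with at most `chunk_size` entries."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(urls), chunk_size):
        yield urls[start:start + chunk_size]


# Example: 120,000 URLs end up in three sitemap files.
all_urls = [f"http://www.example.com/page-{i}.html" for i in range(120_000)]
print([len(chunk) for chunk in split_urls(all_urls)])  # [50000, 50000, 20000]
```

Each chunk would then be written to its own sitemap file.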

Can I use RSS as my Google Sitemap?

Yes, Google Sitemaps supports RSS 2.0 and Atom 0.3 feeds. However, a Google Sitemap should contain as many pages of your site as possible, whereas an RSS feed normally contains only the latest ones.

Do I have to resubmit my sitemap every time I change it?

Google will check your sitemap for updates regularly, so you don’t need to inform them. However, you can resubmit your sitemap by clicking the "Resubmit" button on the Google Sitemaps Site Overview (Google Account required) or by pointing your browser to http://www.google.com/webmasters/sitemaps/ping?sitemap=http://www.name.com/sitemap.xml
If you are using a program to generate your sitemap, it’s likely that it has an option to notify Google about changes automatically.

Can I hide my sitemap from other people?

I don’t know of any reason why you should hide your sitemap from other people. They can find most of your URLs anyway by searching with the "site:www.example.com" operator in Google. Of course, you don’t have to call it sitemap.xml; you could name it my-completely-hidden-sitemap.xml, for example.

Does my Google Sitemap have to end with .xml?

No, you can name it whatever you like; just make sure you are sending the correct MIME type (text/xml for XML data). You can configure your Apache server to send "text/xml" for your favourite extension by adding "AddType text/xml .yourext" to your .htaccess file or httpd.conf.

Can you give me an example of a Google Sitemap?

Basically, a Google Sitemap looks like the following sample. A "real" sitemap can be found here. (Note that the linked example is styled via XSLT to make it more readable. Use the "Show Source" function of your browser to see the actual XML code.) All fields except "loc" are optional.

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset>
      <url>
          <loc>http://www.arnebrachhold.de/</loc>
          <lastmod>2006-05-22T12:31:11+00:00</lastmod>
          <changefreq>weekly</changefreq>
          <priority>1</priority>
      </url>
      <url>
          <loc>http://www.arnebrachhold.de/imprint/</loc>
          <lastmod>2006-05-22T13:31:11+00:00</lastmod>
      </url>
      <url>
          <loc>http://www.arnebrachhold.de/foo/</loc>
      </url>
      <!-- This is a comment -->
  </urlset>
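
If you would rather generate such a file with a script, a minimal Python sketch could look like this. The function name `build_sitemap` and the dictionary layout are only illustrative assumptions. The sketch mirrors the simplified sample above and, like it, leaves out the namespace declaration which the official schema expects on "urlset".

```python
import xml.etree.ElementTree as ET


def build_sitemap(entries):
    """Build a sitemap document from a list of dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional, matching the sample above.
    """
    urlset = ET.Element("urlset")
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for field in ("lastmod", "changefreq", "priority"):
            if field in entry:
                ET.SubElement(url, field).text = str(entry[field])
    # ElementTree escapes special characters such as "&" in the URLs for us.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    {"loc": "http://www.arnebrachhold.de/", "changefreq": "weekly", "priority": 1},
    {"loc": "http://www.arnebrachhold.de/imprint/"},
]))
```

Building the document with an XML library instead of string concatenation avoids broken files when a URL contains characters like "&".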

How do I know if my sitemap.xml is correct?

You can just submit your sitemap to the Sitemaps program and wait until it has been downloaded for the first time. Google will tell you if there are any errors. If you want to check your sitemap before submitting it, you can use an XML validator to validate the structure of your sitemap file.
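
For a quick local check before submitting, something like the following Python sketch may help. It is not a full schema validation: it only tests that the file is well-formed XML, that the root element is "urlset" and that every entry has a non-empty "loc". The function name `find_sitemap_problems` is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_sitemap_problems(xml_text):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    problems = []
    # Strip a namespace prefix such as "{http://...}urlset" before comparing.
    if root.tag.rsplit("}", 1)[-1] != "urlset":
        problems.append("root element is not <urlset>")
    for position, url in enumerate(root, start=1):
        has_loc = any(
            child.tag.rsplit("}", 1)[-1] == "loc" and (child.text or "").strip()
            for child in url
        )
        if not has_loc:
            problems.append(f"entry {position} has no <loc>")
    return problems


print(find_sitemap_problems("<urlset><url><lastmod>2006-05-22</lastmod></url></urlset>"))
```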

Help with error messages

What does the error message "This url is not allowed for a sitemap at this location" mean?

This error means that your sitemap contains URLs which are not allowed at the sitemap’s location. Your sitemap can only include URLs which point to files on the same domain and in the same or a deeper directory. Let’s take a small example to illustrate it:
Your sitemap file is saved at http://www.example.com/herbert/sitemap.xml
Your sitemap is allowed to contain URLs like

  • http://www.example.com/herbert/
  • http://www.example.com/herbert/home.html
  • http://www.example.com/herbert/test/index.html
  • http://www.example.com/herbert/info/about/guestbook

but you can’t include URLs like

  • http://www.example.com/ (Higher directory level than your sitemap)
  • http://www.example.com/herbert (Higher directory level; without the trailing slash, "herbert" is treated like a file)
  • http://www.example.com/lusie/ (Different directory, outside the one containing your sitemap)
  • http://www.herbert.com (Not the same domain)

You can include the first three denied URLs by moving your sitemap one directory higher to http://www.example.com/sitemap.xml
Now you can include all URLs below http://www.example.com/, such as

  • http://www.example.com/herbert
  • http://www.example.com/
  • http://www.example.com/lusie/
  • http://www.example.com/lusie/all-about-google.html
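
The rule can be sketched as a small Python check. This is only my reading of the rule as described above, not Google’s actual implementation; the function name `is_allowed` is hypothetical, and comparing the scheme (http vs. https) as part of "same domain" is an assumption.

```python
from urllib.parse import urlsplit


def is_allowed(sitemap_url: str, page_url: str) -> bool:
    """Check whether page_url may be listed in the sitemap at sitemap_url."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)

    # Must be on the same domain (the scheme is compared too, as an assumption).
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False

    # Directory of the sitemap: everything up to and including the last "/".
    sitemap_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    return (page.path or "/").startswith(sitemap_dir)


sitemap = "http://www.example.com/herbert/sitemap.xml"
print(is_allowed(sitemap, "http://www.example.com/herbert/home.html"))  # True
print(is_allowed(sitemap, "http://www.example.com/herbert"))            # False
print(is_allowed(sitemap, "http://www.example.com/lusie/"))             # False
```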

What does the error message "Invalid date" mean?

This error means that your sitemap contains an entry with an invalid last modified date. Google Sitemaps requires the ISO-8601 format, which has two variations:
- 2005-02-21
This one just contains the year (4 digits), the month (2 digits) and the day (2 digits), separated by hyphens.
- 2005-02-21T18:00:15+00:00
This one is more complex. It contains the year (4 digits), the month (2 digits) and the day (2 digits), followed by the character "T", the hour (2 digits), a colon, the minute (2 digits), a colon, the second (2 digits) and finally the time zone: the character "+" or "-", then the offset in hours (2 digits), a colon, and minutes (2 digits).
It’s important to include ALL parts of the chosen date format and to keep them in the correct order. See the ISO-8601 specification for more examples.
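
A small Python sketch for producing and checking such dates. The function names are hypothetical, and the pattern only checks the shape of the two variations described above; it would not notice an impossible date such as month 13.

```python
import re
from datetime import datetime, timezone

# The two variations described above: date only, or date, time and zone offset.
LASTMOD_PATTERN = re.compile(
    r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})?"
)


def is_valid_lastmod(value: str) -> bool:
    """Return True if value has the shape of one of the two variations."""
    return LASTMOD_PATTERN.fullmatch(value) is not None


def format_lastmod(moment: datetime) -> str:
    """Format a timezone-aware datetime in the long variation."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Drop microseconds so the result contains whole seconds only.
    return moment.replace(microsecond=0).isoformat()


print(format_lastmod(datetime(2005, 2, 21, 18, 0, 15, tzinfo=timezone.utc)))
# 2005-02-21T18:00:15+00:00
print(is_valid_lastmod("2005-02-21T18:00"))  # False, seconds and time zone are missing
```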

What does the error message "We couldn’t find your verification file." mean?

Double check that you named your verification file exactly as Google told you and that there are no spaces before or after the file name. If it still doesn’t work, make sure your server returns the correct status codes for existing documents (200) and nonexistent documents (404). You can check this with a sniffer or an HTTP request tool.
Enter the URL of your verification file and check that your server returns "200 OK" as the status code. Then try a page that does not exist and verify that your server returns "404 Not found". If you get "200 OK" again, review your server’s configuration for error documents.

Statistics and verification

May I delete my verification file after I verified my site?

You can delete the verification file, but Google checks for the file regularly, so you would soon have to create it again.

Why does Google check the existence of my verification file regularly or why should I not delete it?

This allows Google to ensure that you are still the current owner of the domain or have permission to use it. If you buy a domain from another person, that person loses access to the domain’s Google Sitemaps statistics once you delete their verification file.

Can other people view my statistics?

Only people who have write access to your webserver, via FTP for example, can view your statistics, because they have to verify their Google account by placing a file with a specific name on your web space. The file name is given by Google and depends on the Google account name. As long as your webserver is secure, nobody besides you can view your Google Sitemaps statistics.

Do I need a Google Sitemap to view statistics about my website?

No, just sign up for Google Webmaster Tools and add your website. As soon as you have verified your site, you can view statistics like crawling errors, top keywords and so on. You don’t need to submit a sitemap for that.

Sitemap Generator for WordPress Plugin FAQ

WordPress takes too long to create my sitemap or I get a timeout error / blank page

Try increasing the memory and time limits, which are located under Advanced options on the sitemap configuration page. See also this question.

I have no comments (or disabled them) and all my postings have a priority of zero!

Disable automatic priority calculation and define a static priority for posts! There is also an option to define a minimum post priority.

Do I need a Google Account for this plugin?

Maybe. If the “Auto-Ping Google Sitemaps” feature works, you don’t need a Google Account. If your host has disabled the PHP functions this feature requires, you need a Google Account and have to submit your sitemap once. Google will then check the sitemap file periodically for changes.

Do I always have to click on "Rebuild Sitemap" if I modified a post?

No need to do that. If you edit/publish/delete a post, your sitemap gets automatically regenerated!

So many configuration options… Do I need to change them?

No, only if you want. Default values should be ok!

Does it work with all WordPress versions?

This plugin works with WordPress 1.5.1.1 or higher only. Please upgrade if you are using an older version.

I get an fopen error and / or permission denied

If you get permission errors, make sure that the script has write permission in your blog directory. Try creating the sitemap.xml and sitemap.xml.gz files manually, upload them with an FTP program and set their permissions to 777 with CHMOD. Then restart the sitemap generation on the administration page. A good tutorial for changing file permissions can be found on the WordPress Codex.

Which MySQL versions are supported?

This plugin works with MySQL 4 and 5 as well as with newer MySQL 3 builds.

Do I need cronjobs to run this plugin?

No, you don’t need any cronjobs. The sitemap gets rebuilt if you edit a post.

The sitemap files could not be written!

Make sure that the files "sitemap.xml" and / or "sitemap.xml.gz" are writable. You have two options to ensure this:

  • Make your blog root writable
    You can make your whole blog root folder writable and the plugin will create the files for you. To do this, use an FTP program to set the permissions of your web-root folder to 755 or 777 via CHMOD. This folder is often named "htdocs", "html", "public" or "httpdocs".
  • Create the sitemap files and make them writable
    If you can’t make your blog root folder writable or don’t want to do so, you can create two new files, name them "sitemap.xml" and "sitemap.xml.gz", upload them to your blog root and use an FTP program to apply CHMOD 755 or 777 to them. To create these files, simply open Notepad, click on "File > Save As", choose "Filetype: All Files" and enter "sitemap.xml" as the name. Repeat this step for "sitemap.xml.gz".
  • More information about changing file permissions
    Please look at the WordPress Codex or this tutorial for step-by-step advice for your FTP program, or check its manual.

The last run didn’t finish or I just get a white screen

It could be that your server is not configured to run memory-heavy scripts like a sitemap generator.

  • Try increasing the memory limit on the sitemap options page (start with "4M" and raise the value if it doesn’t help).
  • Try increasing the time limit on the sitemap options page (start with "20" and raise the value if it doesn’t help).
  • If it’s still not working, you may not have permission to change these settings, so you need to ask your hosting provider to raise the limits.

What’s the difference between the "sitemap.xml" and "sitemap.xml.gz" files?

The "sitemap.xml.gz" file is a compressed version of the "sitemap.xml" file. It has the same content but is significantly smaller. This helps you and the search engines save a lot of traffic. Since all search engines support compressed sitemaps, you don’t actually need the "sitemap.xml" file, but you or your visitors may want to view it from time to time, so keeping it doesn’t hurt.
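
To get a feeling for the size difference, here is a short Python sketch using the standard gzip module. It is only an illustration and not how the plugin itself (which is written in PHP) creates the file; the function name `compress_sitemap` is hypothetical.

```python
import gzip


def compress_sitemap(xml_text: str) -> bytes:
    """Return the gzip-compressed bytes of a sitemap given as text."""
    return gzip.compress(xml_text.encode("utf-8"))


# A sitemap with many similar entries compresses very well.
sample = "<urlset>" + "<url><loc>http://www.example.com/</loc></url>" * 1000 + "</urlset>"
compressed = compress_sitemap(sample)
print(len(sample.encode("utf-8")), "bytes uncompressed,", len(compressed), "bytes compressed")
```

Writing the returned bytes to a file named "sitemap.xml.gz" gives a compressed file with the same content as the original.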

What are the different building modes?

You can choose when your sitemap gets regenerated:

  • Rebuild sitemap if you change the content of your blog
    Your sitemap gets refreshed automatically when you publish or delete a post. If you have a very large number of posts, the process may take some time and you have to wait on the posting screen until it’s finished.
  • Enable manual sitemap building via GET Request
    This option allows you to refresh your sitemap using a special URL which is displayed when you click on the "[?]" sign. This URL can be called from a cron job, for example, which refreshes the sitemap every day or every hour. This mode is preferred if you have thousands of posts and the automatic building takes too long.

What is the update notification?

This plugin can automatically notify Google and YAHOO when the content of your blog changes. This service is free to use; YAHOO just requires an API key, which can be obtained here free of charge. After the search engines have received the "ping", they may come and crawl your site again. Since the sitemap files contain the last change date of every post or page, the spiders should retrieve only the changed ones, which saves you traffic. The plugin measures how long the notification of each search engine takes and may recommend disabling this service if it slows down the building process significantly.

About the advanced options

  • Limit the number of posts
    If you have problems with the maximum execution time or memory limit you can limit the number of posts which will be included in the sitemap. Newer posts are included first so your sitemap will stay up-to-date.
  • Increase the memory limit
    Building the sitemap needs a lot of memory. If the memory size is limited via configuration and the script can’t finish the sitemap, you can try to increase this limit by entering a higher value. The values are in megabytes, so you can start with "2M" for smaller sites and raise the number until it works. However, you may not have permission to change this value. If it still doesn’t work after you have tried a very high value like 16M, you will need to contact your web host and ask them to raise it for you.
  • Increase the maximum execution time
    Like the memory, the maximum execution time can also be limited. If the script doesn’t finish, try to set the time limit to "0" which means unlimited or a high value like "30" seconds.
  • Include an XSLT stylesheet
    Since version 3.05b, the plugin ships a default XSLT stylesheet which makes your XML sitemap human readable. You can specify your own by entering a full or relative URL. Please note that the XSLT stylesheet must be on the same server for security reasons.
  • Enable MySQL standard mode
    By default, the plugin uses a separate MySQL connection to query the post data in a very efficient, memory-saving way. If this doesn’t work with your hosting configuration, you can enable the MySQL standard mode, which uses much more memory but should always work.
  • Build the sitemap in a background process
    If your blog contains a large number of posts, you may experience a delay after editing or saving a post or page, since generating the sitemap takes some time. If you activate this option, your sitemap will be built in the background using wp-cron, which avoids the delay. Your sitemap will be generated a few seconds after you’ve hit the save button, so the sitemap status in the administration panel won’t show the changes immediately.
  • Exclude posts or pages
    Here you can enter the IDs of posts or pages which should not be included in your sitemap. You can see the IDs of the posts or pages on the corresponding management pages. Separate multiple IDs with commas.
  • Allow anonymous statistics
    This will send some anonymous statistics to the author of the plugin. It will send the following data: plugin version, WordPress version, PHP version, language and a unique string to avoid duplicates. Why is this useful? I can optimize the plugin for the most used WordPress / PHP versions and improve the translations for the most common languages. The plugin will NEVER send anything personal, for example your blog URL, title, name or email address. There is no way to find out who is using the plugin for what.

Google Sitemaps and robots.txt

You can use the robots.txt file to inform search engines about your sitemap. If you activate this option in the administration panel, the plugin will try to create the file in your blog root. The "File permissions" status below the checkbox will give you a hint whether this is possible or not. If the robots.txt file cannot be generated due to insufficient file permissions, please create the robots.txt file yourself and make it writable via CHMOD. A good tutorial for changing file permissions can be found on the WordPress Codex. The plugin will NOT delete your existing robots.txt file but will append the new values at the end.
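
To illustrate the "append, don’t delete" behaviour, here is a rough Python sketch. It assumes the announcement is a single "Sitemap: <URL>" line, which is the form defined by the sitemaps protocol; what exactly the plugin writes is not spelled out in this FAQ, and the function name `add_sitemap_line` is hypothetical.

```python
def add_sitemap_line(robots_txt: str, sitemap_url: str) -> str:
    """Return robots_txt with a Sitemap line appended, keeping existing rules."""
    line = f"Sitemap: {sitemap_url}"
    existing = [entry.strip() for entry in robots_txt.splitlines()]
    if line in existing:
        return robots_txt  # already announced, nothing to do
    # Make sure the new line starts on a line of its own.
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"
    return robots_txt + line + "\n"


print(add_sitemap_line("User-agent: *\nDisallow: /wp-admin/\n",
                       "http://www.example.com/sitemap.xml"))
```

Existing rules stay untouched, and calling the function a second time with the same URL does not add a duplicate line.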

About this FAQ

I didn’t find an answer to my problem, are there any additional resources?

You can look at the Official Google Sitemaps FAQ, join the Google Sitemaps Group or leave a comment on this post.

What can I do if I have suggestions or corrections for this FAQ?

Suggestions and corrections are always welcome. Please write me a mail if you have any and I will update this page.

15 Comment(s)

Comment by Greg

Posted on April 16, 2006

I am getting this WordPress database error… when using your great Google sitemaps plugin.

"Got a packet bigger than ‘max_allowed_packet’." What would be the cause of this error. Any help would be appreciated

Comment by Gerry

Posted on April 27, 2006

The urls generated in my google site map are missing the www and google is complaining.

When the url for a post is generated it shows as http://name.com/name instead of http://www.name.com/name

Comment by arne

Posted on April 28, 2006

@Greg: I don’t know this MySQL error. Looks like a configuration issue. You could ask your webhoster to increase the max_allowed_packet value.

@Gerry: Could you provide me an URL to your sitemap? The plugin uses the URL defined in WordPress administration page as the blog URL. Is that one correct?

Best regards,

Arne

Comment by James Wolf

Posted on May 8, 2006

I am using 2.7.1, is there a way to have it not show protected pages? It will be pretty hard for Google to index something if it doesn’t know the password. :P

Comment by dan

Posted on June 1, 2006

I’ve tried both 2.7 and 3.0beta but both result in the problem you already know about with the blank page. I made the change you suggested in the FAQ but it hasn’t fixed it. Has there been any more progress on that?

Moderated second comment: Never mind. I read some more and adjusted the memory limit in php.ini and it’s working now.

Comment by Luke

Posted on June 9, 2006

I can’t seem to get Google to accept the sitemap without errors. I have no problem generating the sitemap, but I keep getting the “This url is not allowed for a Sitemap at this location” error. I first tried the sitemap with WordPress installed in the root of the domain and it wouldn’t accept it. Then I installed WordPress in a separate directory, /blog, and got the same errors. Then I put the sitemap in the root of the site while WordPress was in /blog, and it still won’t accept any of the URLs. I have tried all of the options I can think of. Is there some step I am missing? Do I need to have a certain permalink structure?

Comment by Matthias

Posted on June 10, 2006

Fatal error: Call to undefined function: get_option() in /kunden/104592_96120/projekt-shop/sitemap/sitemap.php on line 345

is the error, i have 15 Sitegenerators downloaded, but nothing run.
Please help me.
Thanks

Comment by arne

Posted on June 11, 2006

Hi,

@James Wolf: The current beta supports this option.

@dan: Did you already upgrade to WP 2.0.3?

@Luke: Did you submit the correct domain? If your sitemap contains links with www, your sitemap must be submitted with the www. If this is correct, please post your website URL.

@Matthias: This plugin is for WordPress only. Check this list for other ones.

Best regards,

Arne

Comment by Phil Scoville

Posted on June 14, 2006

I am having trouble with the “We couldn’t find the Sitemap at the location you provided.” Can you please help me? I tried to use the info in your Sitemaps FAQ but I am still running into errors.

Thanks.

Comment by Tom

Posted on June 23, 2006

Arne,

I have tried a couple times to reach you somehow…I have 3.01b running on WP 1.5…..It is creating a sitemap and properly pinging Google, but I can’t change the priorities from 0…they won’t save no matter what I do…Are there any suggestions that you could give me?

I would really appreciate it!

Tom

Comment by Scott

Posted on June 27, 2006

I am receiving the following error every time I update a post or rebuild my sitemap.

Warning: mktime() expects parameter 1 to be long, string given in /home/public_html/wp-content/plugins/sitemap.php on line 1885

I feel like it was working for a while, but then stopped. I have tried disabling all of my plugins except this one, but that didn’t work either. I am using version 3.0b1 against WordPress 2.0.3

Comment by Jonathan

Posted on July 13, 2006

Arne, I love the plugin. It works beautifully. According to this answer we are limited to the size of our sitemaps. What changes do I need to make to the plugin to make it automatically generate a new sitemap at that point and also a sitemap index pointing to the various sitemaps? I haven’t the need for this functionality yet, so it’s not terribly important. But it would be nice to automate it once I get there.

Comment by Henk Koning

Posted on July 21, 2006

Hi,

I got a blank page after I hit rebuild for the first time. I even tried to upload an empty sitemap.xml and chmod it to 666. Nothing works. Maybe the timeout is too short? I probably have 1000+ posts. I use wp 2.03

any ideas?

Thx!

Comment by Reuben

Posted on August 3, 2006

Hello,

Since I installed sitemap, google’s bot has stopped indexing my site properly — it just hits wp-comments.php and then leaves again. Can you take a look at my site (and sitemap) and see if you see any misconfiguration?

Thanks
Reuben

Comment by Roy

Posted on August 4, 2006

I am having the same problem as someone above. I submitted my site to Google with http://www.sitename.com . My front page begins with www, my permalinks begin with www, but the site map is showing http://sitename.com with no www.

Google is showing errors because of the discrepancy.

I should say that the front page is showing up as http://www.sitename.com but the posts are not showing up with www.

Thanks
