XML Sitemap: What It Is & How to Generate One

Rachel Handley

Mar 21, 2024 | 5 min read
Contributors: Zach Paruch and Christine Skopec

What Is an XML Sitemap?

An XML sitemap is a file that tells search engines like Google which URLs on your website should be indexed (added to its database of possible search results).

It may also provide additional information about each URL, including:

  • When the page was last modified
  • How often the page is updated
  • The relative importance of the page

This information can help search engines crawl (explore) your site more effectively and efficiently. And better match your pages with relevant search queries.

That’s why XML sitemaps are important in search engine optimization (SEO).

What Does an XML Sitemap Look Like?

An XML sitemap (or sitemap.xml file) looks something like this:

[Image: Nested XML tags that include data about various URLs on a coffee website]

It’s called an XML sitemap because it’s written in Extensible Markup Language (XML). 

If you’re interested in the details, the main tags used are:

  • <urlset>: Encloses all the tags in the sitemap
    • <url>: Encloses all the tags for each URL
      • <loc>: Specifies the page’s complete URL
      • <lastmod>: Specifies when the page was last updated (optional) 
      • <changefreq>: Specifies how frequently the page is likely to change (optional) 
      • <priority>: Specifies the relative importance of the page from 0.0 to 1.0 (optional) 
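
To make the tag structure concrete, here is a minimal Python sketch that builds a small sitemap with the standard library. The example.com URLs, the page data, and the build_sitemap helper are made up for illustration. Treat it as one possible approach, not the way any particular platform generates its file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dicts."""
    # <urlset> is the root element and carries the sitemap namespace.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # The remaining tags are optional, so add them only when provided.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
        ET.indent(urlset)
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages for an imaginary coffee website.
pages = [
    {"loc": "https://www.example.com/", "lastmod": "2024-03-21",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://www.example.com/coffee-beans/", "lastmod": "2024-02-10"},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The second page skips changefreq and priority, which is fine because only loc is required.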

Webmasters can also create dedicated image, video, and news sitemaps to help search engines understand these specific types of content.

If you need to create more than one sitemap, you need a sitemap index, which essentially acts as a sitemap for your sitemaps.
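
A sitemap index uses the same idea with different tags: a <sitemapindex> root that holds one <sitemap> entry (with a <loc>) per sitemap file. Here is a rough Python sketch. The file names and the build_sitemap_index helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points to each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")

# Hypothetical sitemap files on an example domain.
index_xml = build_sitemap_index([
    "https://www.example.com/sitemap-pages.xml",
    "https://www.example.com/sitemap-images.xml",
])
print(index_xml)
```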

Do You Need an XML Sitemap?

An XML sitemap is highly recommended if you want your pages to show in search engine results.

If you don’t provide an XML sitemap, search engines have to rely on hyperlinks (on your own site or elsewhere) to discover pages on your site. This is inefficient and it can lead to pages being missed.

Now, let’s learn how to create an XML sitemap.

How to Generate an XML Sitemap

It’s likely that the platform you use to manage your website’s content automatically generates and updates your XML sitemap. 

You may be able to find yours by going to yourdomain.com/sitemap.xml in your browser.

Like this:

[Image: The webpage for semrush.com/sitemap.xml showing an XML sitemap]

(This should work if you’re using WordPress, Wix, Squarespace, or Shopify.)
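
If you’d rather inspect the file with a script than in your browser, a short Python sketch like this can list the URLs it contains. The sample XML and the list_sitemap_urls helper are invented for illustration. It assumes a regular sitemap, not a sitemap index.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when searching for sitemap tags.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def list_sitemap_urls(sitemap_xml):
    """Return the <loc> value of every <url> entry in a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", NS)
            if loc.text]

# Hypothetical sitemap content. For a live file, you could download the bytes
# first, e.g. urllib.request.urlopen("https://yourdomain.com/sitemap.xml").read()
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/</loc></url>
</urlset>"""

urls = list_sitemap_urls(SAMPLE)
print(urls)
```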

Otherwise, refer to the help center for your website builder or content management system (CMS). Or contact your platform’s support team.

If your platform doesn’t provide an XML sitemap, you can use a sitemap generator tool.

These tools can also prove helpful if you want more control over your sitemap. For example, you can customize your WordPress sitemap with the Yoast SEO plugin.

If you use a tool outside of your platform to create a sitemap, make sure to publish it to your site to make it live.

How to Submit Your XML Sitemap to Google

It’s best practice to submit your sitemap to Google. (Rather than waiting for Google’s website crawlers to discover the file on their own.)

But first, make sure there are no issues with your XML sitemap.

With Semrush’s Site Audit tool, you can check whether your sitemap.xml file:

  • Can’t be found
  • Has formatting errors
  • Contains non-canonical or non-200 URLs
  • Isn’t specified in robots.txt
  • Is too large
  • Contains HTTP rather than HTTPS URLs

The tool also checks whether your SEO sitemap contains orphaned pages—URLs that aren’t linked to from anywhere on your site. (It’s best practice to add internal links to pages that should be indexed.)
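
To get a feel for what a few of these checks involve, here is a small Python sketch that looks for formatting errors, oversized files, and HTTP URLs. It is not how Site Audit works internally. The basic_sitemap_issues helper and the sample data are assumptions, and checks that need live requests (such as status codes) are left out.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed

def basic_sitemap_issues(sitemap_bytes):
    """Return a list of human-readable issues found in a sitemap file."""
    issues = []
    if len(sitemap_bytes) > MAX_BYTES:
        issues.append("file is larger than 50MB")
    try:
        root = ET.fromstring(sitemap_bytes)
    except ET.ParseError as exc:
        # Malformed XML: nothing else can be checked reliably.
        return issues + [f"formatting error: {exc}"]
    locs = [(loc.text or "").strip()
            for loc in root.findall("sm:url/sm:loc", NS)]
    if len(locs) > MAX_URLS:
        issues.append("more than 50,000 URLs")
    http_urls = [u for u in locs if u.startswith("http://")]
    if http_urls:
        issues.append(f"{len(http_urls)} URL(s) use HTTP rather than HTTPS")
    return issues

# Hypothetical sitemap with one insecure URL.
sample = b"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page/</loc></url>
</urlset>"""
print(basic_sitemap_issues(sample))
```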

Simply go to the “Issues” report after setting up your audit. And enter “sitemap” into the search bar.

[Image: Results for “sitemap” in the Site Audit “Issues” report]

If the tool flags an error, warning, or notice, you can read “Why and how to fix it.”

Rerun the audit after implementing any fixes so you can check they’re working correctly.

[Image: “Rerun campaign” button in Site Audit]

When you’re happy with your sitemap audit, sign in to Google Search Console.

And go to “Indexing” > “Sitemaps.”

[Image: “Indexing” > “Sitemaps” buttons in Google Search Console]

Enter your sitemap’s URL into the “Add a new sitemap” section.

And click “Submit” when you’re done.

[Image: URL entered into the “Add a new sitemap” section]

You should then see your file in the “Submitted sitemaps” section.

When Google has crawled your sitemap, you’ll see a “Success” notice in the “Status” column.

[Image: “Success” notice in the “Status” column under the “Submitted sitemaps” section]

Google will periodically re-crawl your SEO sitemap to check for any changes.

But if you make major changes that you want to be discovered quickly, you can re-submit your sitemap with a new request.

XML Sitemap Best Practices

If you’re using a sitemap.xml file generated by your website platform or a specialized tool, it’ll probably meet XML sitemap best practices.

But if you want to make sure, read and understand these guidelines.

First, your sitemap should only reference URLs that:

  • You want to be indexed. For example, you shouldn’t include pages from your staging environment. Or the URL for an order confirmation page.
  • Return a 200 status code. You shouldn’t list pages that return other HTTP status codes, such as 301 (which indicates a permanent redirect) or 404 (which indicates the page can’t be found).
  • Are fully qualified and absolute. In other words, make sure to specify the entire URL with the scheme, authority, and path (e.g., “https://sem.vipacademictools.com/blog/”).
  • Are canonicals. Canonical URLs represent the sole version of a page or the primary version of a duplicated page. 
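
The “fully qualified and absolute” rule is easy to test with a script. Here is a minimal Python sketch. The is_fully_qualified helper is a made-up name, and it only looks at the shape of the URL. It doesn’t check status codes or canonical tags, which would need live requests.

```python
from urllib.parse import urlsplit

def is_fully_qualified(url):
    """Return True if the URL has an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# Hypothetical candidates for a sitemap.
candidates = [
    "https://www.example.com/blog/",  # scheme, host, and path
    "/blog/",                         # relative path only
    "www.example.com/blog/",          # missing scheme
]
for url in candidates:
    print(url, is_fully_qualified(url))
```

Only the first URL passes, so it’s the only one that belongs in a sitemap as written.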

And your sitemap file should:

  • Be UTF-8 encoded. This is a character encoding that ensures search engines can read all the characters you’re using. Special characters in URLs must also be entity-escaped. For example, use &amp; in place of a "&" symbol.
  • Be no larger than 50MB (uncompressed) and contain no more than 50,000 URLs. If necessary, you can create multiple sitemaps and a sitemap index file.
  • Specify the correct namespace. A namespace is like a label that tells the search engine what kinds of rules the sitemap follows. Most sitemaps use the “http://www.sitemaps.org/schemas/sitemap/0.9” namespace to show that the file conforms to standards set by sitemaps.org.
  • Include language and region variants for each URL (where applicable). You can learn more in this resource from Google.
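
Two of these rules lend themselves to a quick Python sketch: escaping special characters and splitting a long URL list into sitemap-sized groups. The escape_loc and chunk_urls helpers are hypothetical names. Note that XML libraries such as ElementTree escape text for you when they serialize, so manual escaping mainly matters if you assemble the XML from plain strings.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000

def escape_loc(url):
    """Escape the characters &, <, and > so the URL is safe inside <loc>."""
    return escape(url)

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into groups that each fit in one sitemap."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

# Hypothetical URL with a query string that contains an ampersand.
print(escape_loc("https://www.example.com/search?q=coffee&page=2"))

# 120,000 made-up URLs need three sitemap files (plus a sitemap index).
all_urls = [f"https://www.example.com/products/{i}/" for i in range(120_000)]
groups = chunk_urls(all_urls)
print(len(groups))
```

When you save each file, open it with UTF-8 encoding (for example, open(path, "w", encoding="utf-8")) so the bytes on disk match the encoding you declare.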

Lastly, make sure to link to your sitemap from your robots.txt file. This is a website file that tells search engines which pages they should and shouldn’t crawl.
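
In robots.txt, the link is a single line that starts with “Sitemap:” followed by the sitemap’s full URL. As a rough sketch, you can confirm the line is readable with Python’s built-in robots.txt parser. The robots.txt content below is invented for illustration, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example domain.
robots_txt = """\
User-agent: *
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```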

Ensure Your SEO Sitemap Is Up to Code

With Semrush’s Site Audit, you can easily check for issues related to your XML sitemap.

The tool also checks for dozens of other issues that can harm your SEO results.

Create your free Semrush account and see what needs fixing on your site.
