How to Create and Submit Sitemaps: A Detailed Guide to the Sitemap XML File and Its Available Directives

This article explains how to create a Sitemap and give Google access to it.

Creating and Submitting Sitemaps

Sitemap file formats

Google supports several Sitemap file formats, described below. All formats must use the standard Sitemap protocol. Google does not currently support the <priority> attribute in Sitemap files.

All formats are subject to the following restrictions: a Sitemap can contain a maximum of 50,000 URLs and its uncompressed size must not exceed 50 MB. If the size of the file or the number of addresses listed in it exceeds these limits, split it into several parts. You can create a Sitemap index file listing all your Sitemaps and submit them to Google all at once.
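The splitting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example.com addresses and the sitemapN.xml file names are hypothetical, and only the 50,000-URL limit is handled (the 50 MB size limit would need a separate check).

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit quoted in the text above


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit into one Sitemap."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_sitemap_index(sitemap_urls):
    """Return the XML text of a Sitemap index listing the given Sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical site with 120,000 pages: three Sitemap files are needed.
    urls = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
    chunks = split_urls(urls)
    names = [f"http://www.example.com/sitemap{n}.xml"
             for n in range(1, len(chunks) + 1)]
    print(len(chunks), "sitemap files")
    print(build_sitemap_index(names))
```

Each chunk would then be written to its own Sitemap file, and only the index file needs to be submitted.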

Text file

If your sitemap only has page addresses, you can send Google a plain text file with those URLs (one on each line). Example:

http://www.example.com/file1.html
http://www.example.com/file2.html

  • You must use UTF-8 encoding.
  • The file should not contain anything other than a list of URLs.
  • This text file can be given any name, but must use the .txt extension (for example, sitemap.txt).
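A text Sitemap that follows the three rules above can be produced with a short script. The sketch below is a minimal illustration; the function name write_text_sitemap and the default file name are assumptions, not part of any Google tool.

```python
from pathlib import Path


def write_text_sitemap(urls, path="sitemap.txt"):
    """Write one URL per line, UTF-8 encoded, and nothing else."""
    # Drop empty entries and stray whitespace so the file holds only URLs.
    lines = [url.strip() for url in urls if url.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    count = write_text_sitemap([
        "http://www.example.com/file1.html",
        "http://www.example.com/file2.html",
    ])
    print(count, "URLs written")
```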

Google Sites

If your site was created and verified using Google Sites, a sitemap is created automatically. You can't change it, but you can send it to Google to get reporting information. Please note that if there are more than 1000 pages in a single subdirectory, the Sitemap may not display correctly.

  • If your pages are hosted on Google Sites, your sitemap should be located at http://sites.google.com/site/YourSite/system/feeds/sitemap.
  • If the site was created using Google Apps, the Sitemap URL should be http://sites.google.com/YourDomain/YourSite/system/feeds/sitemap.

Sitemap file extensions

Google supports extended Sitemap syntax for additional types of content. With it, you can add descriptions of videos, images, and other content to improve indexing.

An XML Sitemap is a list of a website's URLs in XML format. The Sitemap file is designed to inform search engines (such as Google, Bing, Yahoo, Yandex, MSN, and others) about the pages on the website that should be indexed. A Sitemap significantly speeds up site scans. In addition, it lets you pass on information about all pages of your site, including those that search engines cannot reach with a normal crawl.

Creating a Sitemap is especially useful when:

  • New pages are generated on your site automatically and frequently.
  • Your site is new and only a few links point to it.
  • Your site has a large archive of content pages that are poorly linked to each other or not linked at all.

XML Sitemaps Protocol: What does a Sitemap contain?

In accordance with the XML Sitemaps Protocol and the requirements of search engines, a Sitemap file should contain no more than 50,000 pages and must not exceed 50 MB uncompressed. This means that if your site contains more than 50,000 pages or the Sitemap file is larger than 50 MB, you must create multiple Sitemap files.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://whatsappss.ru/en/URL</loc>
    ...
  </url>
  ...
</urlset>

In addition to the required URL parameter, the XML Sitemaps protocol provides additional tags for each page:

lastmod – the date the page was last modified.

Syntax: <lastmod>date in ISO 8601 format</lastmod>

changefreq – how often the page content is likely to change. Valid values are:

  • always – every time the page loads
  • hourly – every hour
  • daily – every day
  • weekly – once a week
  • monthly – once a month
  • yearly – once a year
  • never – means that the page content remains unchanged.

Syntax: <changefreq>valid value</changefreq>

priority – the priority of the page relative to other pages on your site. The valid range is from 0.0 to 1.0. This tag lets search engines know which pages you consider most valuable.

Syntax: <priority>decimal from 0.0 to 1.0</priority>
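The tags described above can be assembled into a single page entry as follows. This is a sketch using Python's standard xml.etree.ElementTree module; the helper name build_url_entry and the example address are assumptions made for illustration, while the list of changefreq values and the priority range are taken from the text.

```python
import xml.etree.ElementTree as ET

# Values listed for changefreq in the text above.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; only <loc> is required."""
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"invalid changefreq: {changefreq!r}")
    if priority is not None and not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")

    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod  # ISO 8601 date string
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


if __name__ == "__main__":
    entry = build_url_entry("http://www.example.com/", "2014-03-07", "daily", 1.0)
    print(ET.tostring(entry, encoding="unicode"))
```

Rejecting invalid values early keeps malformed entries out of the file instead of leaving the search engine to ignore them silently.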

This page contains almost all the information you need to know about Sitemap.

A Sitemap is a site map designed to help search engine robots index a site. The name Sitemap is standard, that is, accepted by default.

The sitemap is usually stored on the hosting server in the site's public_html directory. The Sitemap location is usually written in the last two lines of the robots.txt file. There you can set other names for the two sitemap files, as well as a different location for them, in an attempt to hide these files from malicious programs and people. The most popular search engines are then told the names and locations of the files individually, while all other search engines are left to find nothing.

I consider these to be cheap tricks, because the robots.txt file itself must be in the site's public_html directory. Even if it is removed, since it is not strictly necessary, an attacker who can get into this directory can also replace three files in order to redirect site visitors to any other site and its pages. I think that this is how some sites are sometimes attacked.
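Assuming the file in question is robots.txt, the Sitemap lines in it can be read with Python's standard urllib.robotparser module (the site_maps() method requires Python 3.8 or later). The robots.txt content and the helper name sitemaps_from_robots below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the Sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Hypothetical robots.txt whose last two lines name the sitemap files.
    example = (
        "User-agent: *\n"
        "Disallow: /wp-admin/\n"
        "Sitemap: http://www.example.com/Sitemap.xml\n"
        "Sitemap: http://www.example.com/Sitemap.xml.gz\n"
    )
    for url in sitemaps_from_robots(example):
        print(url)
```

Since anyone can read robots.txt this way, renaming or relocating the sitemap files hides nothing from a visitor who looks there.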

The Sitemap of my site is created by the Google XML Sitemaps plugin, version 3.4. On the hosting server, two files are stored in the site's public_html directory: Sitemap.xml and Sitemap.xml.gz. Both files are created almost simultaneously.

The Sitemap.xml file, which is currently 103 KB in size, is regenerated by the above-mentioned plugin whenever any page of the site is changed.

The Sitemap.xml.gz file, which is 10 KB in size, is auxiliary: it is a gzip-compressed copy of the same site map, which a robot can download faster than the uncompressed file.
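A minimal sketch, assuming Sitemap.xml.gz is an ordinary gzip archive (which the .gz extension suggests): Python's standard gzip module can then turn it back into readable XML. The function name read_gzipped_sitemap is hypothetical.

```python
import gzip


def read_gzipped_sitemap(path):
    """Return the text of a gzip-compressed Sitemap file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    # Assumes a Sitemap.xml.gz file downloaded from the hosting server
    # is present in the current directory.
    text = read_gzipped_sitemap("Sitemap.xml.gz")
    print(text[:200])
```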

On 03/07/14, after searching for advice on the Internet, I managed to pull a decompressed map of my site from the Internet. Today I was unable to repeat this, and yesterday I didn't think of writing down the steps I used to extract the map. However, now it doesn't matter; a little later you'll understand why.

Here is the beginning and end of the file I converted yesterday:
http://site/ 2014-03-07T19:23:22+00:00 daily 1.0
http://site/stroitelstvo/sayt/cms-wordpress 2014-03-07T19:23:22+00:00 daily 0.6
http://site/posadki/ogorod/pomidoryi 2014-03-07T18:06:27+00:00 daily 0.6
……
http://site/voprosyi/otvet-15 2013-03-19T13:25:35+00:00 daily 0.6
http://site/sample-page/roshhi/hvoynyie/listvennitsa 2013-03-05T13:01:35+00:00 daily 0.6
http://site/sample-page/roshhi/listvennyie/lipyi 2013-03-05T12:30:19+00:00 daily 0.6

In the resulting file, the entries for individual pages ran together and were separated only by two spaces. I took the trouble to split the file into lines in Notepad and saved it in TXT format. Then I copied the contents of the file into two columns of a blank Excel sheet. I found that 591 records had been created with the addresses of the site pages. In the second column I sorted the entries alphabetically.

Since Notepad has only a primitive Replace command, I copied the entire contents of the converted file into Word. Then, using the Replace command (Ctrl+H), I replaced http with htp, and then back again. In both cases, 591 replacements were made.

So there was no need to spend time splitting the file into lines: I could have copied it into Word right away and run the replacement to find out the number of site pages included in the Sitemap.
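The same count can be obtained without Word or Excel by parsing the file. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the file follows the standard Sitemap namespace; the function name count_sitemap_urls is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by files that follow the Sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def count_sitemap_urls(xml_data):
    """Count the <loc> entries of all <url> records in a Sitemap."""
    root = ET.fromstring(xml_data)
    return len(root.findall("sm:url/sm:loc", NS))


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.example.com/</loc></url>"
        "<url><loc>http://www.example.com/about</loc></url>"
        "</urlset>"
    )
    print(count_sitemap_urls(sample))
```

For a real file, the bytes read from Sitemap.xml can be passed in directly, for example count_sitemap_urls(open("Sitemap.xml", "rb").read()).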

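Here are the first two and the last two entries of the Sitemap.xml file, copied from the hosting server:

<url>
<loc>http://site/</loc>
<lastmod>2014-03-08T18:55:00+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://site/stroitelstvo/sayt/sitemap</loc>
<lastmod>2014-03-08T18:55:00+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.6</priority>
</url>
……
<url>
<loc>http://site/voprosyi/otvet-15</loc>
<lastmod>2013-03-19T13:25:35+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.6</priority>
</url>
<url>
<loc>http://site/sample-page/roshhi/hvoynyie/listvennitsa</loc>
<lastmod>2013-03-05T13:01:35+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.6</priority>
</url>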

The last entry refers to a page created more than a year ago!

The header of the file records how the file was created: the WordPress version, the name of the plugin and, judging by the header entry, the address of an external site in Germany that hosts the standard program used to create the page records, as well as the date and time the site map was created. It also states which standards the site map conforms to.

The records themselves are created by the plugin, which runs on the hosting server together with WordPress whenever changes are made to the site pages.

These entries are located between the <urlset> and </urlset> tags.

Each entry consists of 4 lines: the address of the site page, the time of the last modification, the recommended frequency with which the robot should revisit the page, and the recommended priority. The 6 characters “+00:00” at the end of the second line of every entry are the offset of the timestamp from UTC, which is part of the ISO 8601 date format used by the protocol.
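The “+00:00” suffix can be read as a UTC offset in the ISO 8601 format mentioned earlier for lastmod. A short sketch with Python's standard datetime module (Python 3.7 or later) shows how such a timestamp is interpreted; the function name parse_lastmod is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_lastmod(value):
    """Parse a lastmod value such as 2014-03-07T19:23:22+00:00."""
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    stamp = parse_lastmod("2014-03-07T19:23:22+00:00")
    # "+00:00" means the time is given with a zero offset from UTC.
    print(stamp.utcoffset() == timedelta(0))

    # The same instant written with a +03:00 offset.
    local = parse_lastmod("2014-03-07T22:23:22+03:00")
    print(local.astimezone(timezone.utc) == stamp)
```

Without the offset, a robot could not tell in which time zone the modification time was recorded.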

It is significant, firstly, that the records are sorted by the time the pages were last modified, newest first. This helps robots save time by skipping pages that have not changed since the last indexing.

Secondly, you can always copy the entire contents of the long Sitemap file from the hosting server into Word and quickly find out the number of pages included in the site map. I did this in about one minute: there are now 593 pages in the site map, since I added two pages today.

Thirdly, it is quite obvious that the Sitemap.xml file is excessively long and that the creators of the standard for some reason chose a form that people can read. Why? If you are interested, look for information on the Internet or ask luminaries, gurus and other experts.

Fourthly, search robots probably do not read the entire file, but only up to the entry for the first page that has not changed since the robot’s last visit to the site map. The redundancy of a long sitemap file therefore hardly matters to them, given current processors.

Fifthly, the above-mentioned plugin does an excellent job of creating a sitemap: it can and should be used with confidence.

Sixthly, you can always look up the addresses of old pages in the site map and copy them into the browser address bar to open those pages for editing. I need to do this with many pages in order to rid them of an excessive number of saved revisions, and at the same time double-check them.

Seventhly, I’m sure that many more useful ways of using a human-readable site map can be found. I'll let you know as I come up with ideas.


A site map is a page that search robots need. Some will say that it is unnecessary, because all sections are already displayed in the menu. However, such a page is needed if the site contains fifty pages or more. For search engines and users, it serves as a guide that helps them understand where a given piece of information is located.

XML and HTML files

Since a site map is used not only by search robots but also by users visiting the site, two maps are usually compiled: one in XML format and one in HTML format.

To create a Sitemap for search robots, use an XML file. Thanks to it, robots add new pages to their search database. Without a map, a large number of pages on a multi-page site may remain unindexed, sometimes for a very long time.

An HTML file is used to create a site map for users. How convenient this map is directly determines whether users find the information they are interested in. Such a map is therefore created for those Internet projects whose sections and subsections do not all fit in the main menu.

How to create a Sitemap XML

There are three ways to solve this problem:

    Buy a sitemap generator.

    Create a Sitemap using online services.

    Write the file manually.

Purchasing a generator saves a significant amount of time. If twenty to thirty dollars for a license is a minor expense for the webmaster, buying one is worthwhile, especially for a large Internet resource, since the sitemap will then never have to be created manually.

For a site containing several hundred pages, online services are recommended, where in order to create a Sitemap, you only need to indicate the address of the Internet resource and download the result.

The best option is to create the map manually. To do this, you need to know the tags url, urlset, loc, lastmod, changefreq and priority. The first three tags are mandatory, while the last three can be omitted.

Creating a Sitemap in Joomla

Joomla and WordPress, like most well-known content management systems, have special add-ons that create a site map manually or automatically. For large Internet projects whose materials are constantly updated, such an add-on is very convenient.

In Joomla it is called Xmap; in WordPress it is called Google XML Sitemaps.

Automatic sitemap creation

Free online services help you create a Sitemap automatically if your site has no more than five hundred pages. Generating a sitemap this way is easy:

    Visit one of these Internet resources, find the “Generate Sitemap” item and click the “Create” button to create a Sitemap file automatically.

    Find the “Site URL” field and enter the address of the site for which the map is being created.

    The system may ask for a verification code. Enter it and click “Start”.

    Upload the finished map to the website.

Manual way to create a map

On the one hand, this method is the most difficult and takes up precious time; on the other hand, it is the most reliable, and it is used when the other options are not suitable. For example, if many pages that do not need to be in the site map end up there automatically, the manual method keeps the map from being overloaded with such pages. Another reason for choosing this method is poor site navigation.

To create the map manually:

    Collect the pages to include in the map.

    In an Excel file, insert all the addresses in the third column.

    Insert the opening <url> and <loc> tags in the 1st and 2nd columns.

    Insert the closing </loc> and </url> tags in the 4th and 5th columns.

    Use the CONCATENATE function to join the five columns.

    Create a sitemap.xml file.

    Add the <urlset> and </urlset> tags to this file.

    Insert the joined column between them.
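The spreadsheet steps above amount to joining fixed tags around each address. The sketch below does the same joining in Python rather than Excel; it is an illustration only, with a hypothetical function name build_sitemap and example.com addresses. Unlike plain concatenation in a spreadsheet, it also escapes characters such as “&”, which are not allowed unescaped in XML.

```python
from xml.sax.saxutils import escape


def build_sitemap(urls):
    """Join the opening tags, the address and the closing tags for each page."""
    rows = ["<url><loc>" + escape(url) + "</loc></url>" for url in urls]
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        *rows,
        "</urlset>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    xml_text = build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/catalog?item=1&color=red",
    ])
    # Save as UTF-8, the encoding the protocol requires.
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(xml_text)
```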

The resulting file must be checked. This can be done, for example, in the Yandex Webmaster panel.

How to create a Sitemap for Yandex and Google

After the sitemap is created, it is added to the site. The file with the site map should be called Sitemap.xml and placed in the root directory. So that search engines find it quickly, Google and Yandex provide special tools: “Webmaster Tools” (Google) and “Yandex Webmaster” (Yandex).

Adding a Sitemap to Google

Adding a Sitemap to Yandex

Likewise, you must first log in to Yandex Webmaster. Then go to Indexing/Sitemap files, specify the file path there and click the “Add” button.

    Search robots currently accept only files that contain no more than fifty thousand URLs.

    If the map exceeds ten megabytes, it is better to split it into several files so that the server is not overloaded.

    If there are several Sitemap files, they must all be listed in an index file, using the sitemapindex, sitemap, loc and lastmod tags.

    All page addresses must be written consistently, either all with or all without the “www” prefix.

    The required file encoding is UTF-8.

    The file must also declare the XML namespace of the Sitemap protocol.
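Some of the rules above can be checked mechanically before the file is submitted. The sketch below is a rough illustration using Python's standard library, not the validator used by Google or Yandex; the function name check_sitemap and the wording of the messages are assumptions. It checks the namespace, the fifty-thousand-URL limit and the consistency of host names (which also catches a mix of addresses with and without “www”).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def check_sitemap(xml_bytes):
    """Return a list of human-readable problems found in a Sitemap."""
    problems = []
    root = ET.fromstring(xml_bytes)

    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element is not <urlset> in the Sitemap namespace")

    locs = [el.text.strip() for el in root.iter(f"{{{SITEMAP_NS}}}loc") if el.text]
    if len(locs) > MAX_URLS:
        problems.append(f"{len(locs)} URLs exceed the limit of {MAX_URLS}")

    # Mixed host names reveal URLs written both with and without "www".
    hosts = {urlsplit(loc).netloc.lower() for loc in locs}
    if len(hosts) > 1:
        problems.append("URLs mix several host names: " + ", ".join(sorted(hosts)))

    return problems


if __name__ == "__main__":
    sample = (
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>http://www.example.com/</loc></url>"
        b"<url><loc>http://example.com/about</loc></url>"
        b"</urlset>"
    )
    for problem in check_sitemap(sample):
        print(problem)
```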

How to create a sitemap for users

Since such a map is created for users, it should be as simple and clear as possible. Despite this, it is necessary to accurately convey all the information about the structure of the site being used.

HTML maps generally have a structure familiar to users: sections and subsections that are highlighted in some way, for example with CSS styles and graphic elements.
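The section-and-subsection structure described above maps naturally onto nested HTML lists. The sketch below generates such a list in Python; the function name render_html_sitemap, the "sitemap" CSS class and the sample sections are all hypothetical.

```python
from html import escape


def render_html_sitemap(sections):
    """Render a dict {section title: [(page title, url), ...]} as nested lists."""
    parts = ['<ul class="sitemap">']
    for section, pages in sections.items():
        parts.append(f"  <li>{escape(section)}")
        parts.append("    <ul>")
        for title, url in pages:
            link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
            parts.append(f"      <li>{link}</li>")
        parts.append("    </ul>")
        parts.append("  </li>")
    parts.append("</ul>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_html_sitemap({
        "Construction": [("CMS WordPress", "/stroitelstvo/sayt/cms-wordpress")],
        "Planting": [("Tomatoes", "/posadki/ogorod/pomidoryi")],
    }))
```

The class attribute gives a stylesheet something to attach to, so sections can be highlighted without changing the markup.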

To create a Sitemap for a large Internet project, as in the case of an XML map, splitting is also recommended here. In this case, it is carried out in the form of separate tabs, eliminating the bulkiness of the map.

JavaScript can improve the functionality of the page. It may be used in this map because the map is created for users, not for search engine robots.

Keeping the sitemap file in order

It is advisable that the created file containing the Sitemap always be clean and tidy, especially if the site has a large number of pages. Since search engine robots scan sitemaps very quickly, there may simply not be enough time to view the entire file of a large Internet resource.

Therefore, if you get used to adding pages to the site map not at the bottom, but at the top, then, on the one hand, there is no doubt that the search robot will have time to view the addresses of new pages, and on the other hand, in this way it will be much easier to control all pages.

Using our sitemap generator, create XML files that can be submitted to Google, Yandex, Bing, Yahoo and other search engines to help them index your site.

Do it in three simple steps:

  • Enter the full website URL into the form.
  • Click the "Start" button and wait until the site is fully crawled. At the same time, you will see the full number of working and broken links.
  • By clicking the "Sitemap.xml" button, save the file in a convenient location.

A sitemap is a site map in XML format, which the Google search engine began using in 2005 to index website pages. A sitemap file is a way to organize a website by identifying the address and data for each section. Previously, sitemaps were aimed primarily at site users. The XML format was developed for search engines, allowing them to find data faster and more efficiently.

The new Sitemap protocol was developed in response to the increasing size and complexity of websites. Business websites often contain thousands of products in their catalogs, and the popularity of blogs, forums, and message boards forces webmasters to update their materials at least once a day. It is becoming increasingly difficult for search engines to track all this material. With the XML protocol, search engines can track addresses more efficiently, because all the information is placed on one page. XML also shows how often a particular website is updated and records the latest changes. XML maps are not a search engine optimization tool. They do not affect rankings directly, but they let search engines crawl and index the site more accurately, because the data is provided in a form that is easy for search engines to read.

The general acceptance of the XML protocol means that website developers no longer need to create different types of site maps for different search engines. They can create one file and then update it when they make changes to the site. This simplifies the whole process of fine-tuning and extending a website. Webmasters themselves have come to see the benefits of this format. Search engines rank pages according to the relevance of the content to specific keywords, but before the XML format the contents of pages were often not represented correctly. This was frustrating for webmasters who realized that their efforts to create a website had gone unnoticed. Writing blog posts, creating additional pages and adding multimedia files takes several hours. With an XML file these hours are not wasted: the results will be seen by all the well-known search engines.
