{"id":27066,"date":"2024-03-14T16:58:52","date_gmt":"2024-03-14T11:28:52","guid":{"rendered":"https:\/\/www.7boats.com\/academy\/?p=27066"},"modified":"2024-03-14T17:27:31","modified_gmt":"2024-03-14T11:57:31","slug":"how-to-write-robots-txt-manually","status":"publish","type":"post","link":"https:\/\/www.7boats.com\/academy\/how-to-write-robots-txt-manually\/","title":{"rendered":"How to Write Robots.txt Manually"},"content":{"rendered":"\n<p>If you have a website, you&#8217;ve probably heard of the robots.txt file before. But what exactly is it and why is it important? In this guide, we&#8217;ll cover everything you need to know about the robots.txt file, including what it does, why you need one, and how to create one manually for your site.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_is_Robotstxt\"><\/span>What is Robots.txt?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>The robots.txt file is a text file that goes in the root directory of your website. Its purpose is to give instructions to web robots or crawlers (like Googlebot) about which areas of your website they should crawl and index, and which areas they should avoid.<\/p>\n\n\n\n<p>Crawlers are automated programs that browse the web to create listings of webpage content for search engines. When a crawler arrives at your site, the first thing it does is check for this robots.txt file to see if there are any rules about what parts of the site it can access.<\/p>\n\n\n\n<p>The robots.txt file allows you to stop web crawlers from accessing certain pages or folders. This can be useful for a few key reasons:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>It prevents crawlers from accessing and indexing certain pages that you don&#8217;t want showing up in search results, like personal content or private files.<\/li>\n\n\n\n<li>It stops crawlers from wasting resources crawling pages that don&#8217;t need to be indexed, like admin pages or duplicate content.<\/li>\n\n\n\n<li>It provides a way to keep sites from being overwhelmed by too many crawlers if needed.<\/li>\n<\/ol>\n\n\n\n<p>While blocking crawlers keeps content out of search results, it doesn&#8217;t actually prevent people from accessing those pages if they have the direct URL. 
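<\/p>\n\n\n\n<p>To make the &#8220;crawlers check robots.txt first&#8221; idea concrete, here is a minimal sketch of how a well-behaved crawler consults those rules, using Python&#8217;s standard urllib.robotparser module (the site and paths are hypothetical, purely for illustration):<\/p>

```python
from urllib.robotparser import RobotFileParser

# Rules as a crawler would receive them from a site's /robots.txt
# (hypothetical site and paths, for illustration only)
rules = [
    "User-agent: *",
    "Disallow: /admin/",
]

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved crawler asks before fetching each URL
print(parser.can_fetch("Googlebot", "https://example.com/admin/login"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))    # True
```

<p>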
The robots.txt just instructs well-behaved crawlers to avoid those areas.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_You_Need_a_Robotstxt_File\"><\/span>Why You Need a Robots.txt File<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Having a robots.txt file is important for almost any website. If you never create one, crawlers that find no robots.txt file simply assume they may crawl all areas (and some platforms, such as WordPress, serve a default virtual file that allows the same).<\/p>\n\n\n\n<p>However, you&#8217;ll likely want to create a custom robots.txt file for a few key reasons:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Avoid_Indexing_Sensitive_Content\"><\/span>Avoid Indexing Sensitive Content<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>There are certain types of pages and files on your site that you&#8217;ll want to prevent crawlers from indexing, like admin login areas, personal user accounts, checkout pages with customer info, and more. The robots.txt file allows you to specify rules to block those private sections.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Save_Bandwidth_and_Resources\"><\/span>Save Bandwidth and Resources<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Another benefit is that you can use your robots.txt to save bandwidth and server resources by preventing crawlers from unnecessarily crawling areas like image directories, PDF files, or archives that don&#8217;t provide much SEO value when indexed. 
This keeps your site running efficiently.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Manage_Your_Crawl_Budget\"><\/span>Manage Your Crawl Budget<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Most major search engines have &#8220;crawl budgets&#8221; that limit the maximum number of URLs from a site that their crawlers will visit over a given time period. By using your robots.txt to point crawlers only to the most important areas you want indexed, you can better manage which pages get crawled within that budget.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_to_Create_a_Robotstxt_File_Manually\"><\/span>How to Create a Robots.txt File Manually<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Now that you understand the importance of having a customized robots.txt file, let&#8217;s go over how to actually create one manually. While many platforms and CMSs can generate this file for you, knowing how to write it by hand is useful.<\/p>\n\n\n\n<p>The first step is to create a new plain text file and name it exactly &#8220;robots.txt&#8221; (without the quotes). Make sure your editor doesn&#8217;t silently add a second extension, leaving you with something like robots.txt.txt.<\/p>\n\n\n\n<p>Next, open the file in a basic text editor. 
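<\/p>\n\n\n\n<p>A handy aside before writing any rules: you can sanity-check a draft robots.txt locally before it ever touches your server. A minimal sketch using Python&#8217;s standard urllib.robotparser module (the rule groups and paths here are hypothetical, for illustration only):<\/p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules for a robots.txt file you are composing
draft_rules = [
    "User-agent: Googlebot",
    "Disallow: /example-directory/secret/",
    "Allow: /example-directory/public/",
    "",
    "User-agent: *",
    "Disallow: /example-directory/",
]

parser = RobotFileParser()
parser.parse(draft_rules)  # parse the draft without uploading anything

# Googlebot matches its own rule group; other crawlers fall back to *
print(parser.can_fetch("Googlebot", "/example-directory/public/page.html"))   # True
print(parser.can_fetch("Googlebot", "/example-directory/secret/page.html"))   # False
print(parser.can_fetch("SomeOtherBot", "/example-directory/page.html"))       # False
```

<p>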
You&#8217;ll want to add your rules inside this text file following the proper robots.txt syntax and formatting.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Syntax\"><\/span>The Syntax<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>The syntax of the robots.txt file consists of one or more &#8220;User-agent&#8221; lines specifying which crawler the rules apply to, each followed by one or more &#8220;Disallow&#8221; or &#8220;Allow&#8221; lines stating which directories cannot and can be crawled.<\/p>\n\n\n\n<p>Each rule set for a user agent crawler follows this format:<\/p>\n\n\n\n<p>User-agent: [crawler name]<br>Disallow: [directory path]<br>Allow: [directory path]<\/p>\n\n\n\n<p>You can have multiple user agent lines for different crawlers with different allow\/disallow rules for each one. There&#8217;s also a &#8220;catch-all&#8221; user agent value of * that applies rules to all crawlers.<\/p>\n\n\n\n<p>Some examples:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/example-directory\/<\/p>\n\n\n\n<p>This tells all web crawlers not to crawl \/example-directory\/ or any files\/pages within it.<\/p>\n\n\n\n<p>User-agent: Googlebot<br>Disallow: \/example-directory\/secret\/<br>Allow: \/example-directory\/public\/<\/p>\n\n\n\n<p>This tells just the Googlebot crawler that it cannot crawl anything within the \/example-directory\/secret\/ folder, but it is allowed to crawl anything within the \/example-directory\/public\/ folder.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Creating_Your_Rules\"><\/span>Creating Your Rules<span 
class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>So how do you decide what rules to include in your robots.txt file? Here are some common use cases for disallow rules:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_Private_Areas\"><\/span>Block Private Areas<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>You&#8217;ll want to block access to any private or sensitive areas of your site that contain things like user accounts, payment processing, admin areas, etc. For example:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/user-accounts\/<br>Disallow: \/admin\/<br>Disallow: \/checkout\/<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_Files_You_Dont_Want_Indexed\"><\/span>Block Files You Don&#8217;t Want Indexed<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>Similarly, there may be certain file types or directories that don&#8217;t provide value when indexed, like image folders, PDF files, zipped archives, and more. For example:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/images\/<br>Disallow: \/*.pdf$<\/p>\n\n\n\n<p>(Note that the * and $ wildcards are not part of the original robots.txt standard, but major crawlers such as Googlebot and Bingbot support them.)<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_Duplicate_Content\"><\/span>Block Duplicate Content<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>If your site generates duplicate content like filtered archives, print-friendly pages, etc., 
you can disallow those from being indexed:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/print\/<br>Disallow: \/*?<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_Low_Value_Pages\"><\/span>Block Low Value Pages<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>There may be certain low-quality or low-value pages on your site that you don&#8217;t want crawlers wasting resources on, like crawler traps, dummy pages, test areas, etc. For example:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/test\/<br>Disallow: \/tr4p\/<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Block_Crawlers_on_Purpose\"><\/span>Block Crawlers on Purpose<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>In some cases, you may want to temporarily block all or certain crawlers from your site, like when relaunching, doing major site updates, or experiencing excessive load. For example:<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/<\/p>\n\n\n\n<p>This tells all crawlers not to crawl any part of the site at all.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Other_Useful_Rules\"><\/span>Other Useful Rules<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>In addition to disallow rules, there are some other handy rules you can include:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Sitemaps\"><\/span>Sitemaps<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>You can include lines to tell crawlers where your XML sitemaps are located:<\/p>\n\n\n\n<p>Sitemap: https:\/\/example.com\/sitemap.xml<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Crawl_Delay\"><\/span>Crawl Delay<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>You can add a &#8220;Crawl-delay:&#8221; line to specify how long (in seconds) crawlers should wait between requests to avoid overtaxing your server. Some crawlers (such as Bingbot) honor this directive, but Googlebot ignores it:<\/p>\n\n\n\n<p>Crawl-delay: 
10<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Specifying Allowed Crawlers<\/h4>\n\n\n\n<p>You can allow only specific listed crawlers and disallow all others by specifying:<\/p>\n\n\n\n<p>User-agent: Googlebot<br>Allow: \/<\/p>\n\n\n\n<p>User-agent: *<br>Disallow: \/<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Testing and Submission<\/h4>\n\n\n\n<p>Once you&#8217;ve created your robots.txt file with your desired rules, be sure to upload it to the root directory of your website (e.g. example.com\/robots.txt).<\/p>\n\n\n\n<p>You can then use the robots.txt report in Google Search Console to check it for syntax issues. Most search engines also let you request a recrawl of your robots.txt when you make major changes.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"In_Conclusion\"><\/span>In Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h4>\n\n\n\n<p>The robots.txt file is a small but powerful part of any website. By creating a customized one with the right rules, you can better control what content gets indexed in search engines, block access to private areas, save server resources, and manage your crawl budget.<\/p>\n\n\n\n<p>While it&#8217;s not a complex file, the robots.txt syntax takes some practice to fully understand. But once you have it set up properly, you&#8217;ll be ensuring that both search engine crawlers and your human visitors have the best possible experience on your site.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have a website, you&#8217;ve probably heard of the robots.txt file before. 
But what exactly is it and why &hellip;<\/p>\n","protected":false},"author":5513,"featured_media":27067,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_vibebp_attr":"","_vibebp_dimensions":"","_vibebp_responsive_height":"","_vibebp_accordion_ie_support":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-27066","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/posts\/27066","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/users\/5513"}],"replies":[{"embeddable":true,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/comments?post=27066"}],"version-history":[{"count":3,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/posts\/27066\/revisions"}],"predecessor-version":[{"id":27071,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/posts\/27066\/revisions\/27071"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/media\/27067"}],"wp:attachment":[{"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/media?parent=27066"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/categories?post=27066"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.7boats.com\/academy\/wp-json\/wp\/v2\/tags?post=27066"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}