Solving technical SEO problems is often frustrating, especially when you have to fix the same issue on the same site over and over again. And I suspect I'm not alone; plenty of websites suffer from the problems below. In this post I have outlined some common issues I have come across, along with viable solutions.
Issue # 1: Uppercase URL vs. Lowercase URL
.NET websites are the most common victims of this problem. The server is often configured to respond to uppercase URLs without redirecting or rewriting them to the lowercase versions. Search engines have become better at choosing a canonical version and ignoring the duplicates, which solves the problem to an extent, but there are still plenty of examples where they fail to do this properly. It is therefore still essential to make the canonical version explicit rather than relying on the search engines to figure it out themselves.
Solution: Use the URL Rewrite module to solve this issue on IIS 7 servers. The module's interface includes an option to enforce lowercase URLs; once you select it, the module adds a rule to the web.config file, solving the problem.
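For reference, the rule the module generates looks roughly like this (the exact rule name varies; this is the standard "enforce lowercase" pattern):

```xml
<!-- In web.config, inside <system.webServer>: redirect any URL
     containing an uppercase letter to its lowercase equivalent. -->
<rewrite>
  <rules>
    <rule name="Convert to lower case" stopProcessing="true">
      <match url=".*[A-Z].*" ignoreCase="false" />
      <action type="Redirect" url="{ToLower:{R:0}}" redirectType="Permanent" />
    </rule>
  </rules>
</rewrite>
```

The Permanent redirect type makes the server issue a 301, so link equity is consolidated on the lowercase version.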
Issue # 2: Homepage’s Multiple Versions
This too is more common with .NET websites, but it also happens on other platforms. While auditing your site, check whether there are duplicate versions of your homepage. To check, simply type www.yourwebsite.com/default.aspx OR www.yourwebsite.com/index.html OR www.yourwebsite.com/home
Search engines can find these duplicates of your homepage via XML sitemaps or your navigation. You can solve this problem without getting into much detail about how these pages are generated.
Solution: Fixing this often involves some guesswork, as different platforms generate different URL structures, which makes these pages hard to find. The best approach is to crawl your website and export the crawl to a CSV. Now filter by the META title column and search for your homepage's title; this makes it easy to find the duplicate versions of your homepage. You can then add a 301 redirect from each duplicate page to the correct page. I have also seen many people solve this problem with the rel=canonical tag.
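The filtering step can also be scripted. Here is a minimal sketch that assumes a Screaming Frog-style export with "Address" and "Title 1" columns; adjust the column names for whatever crawler you use:

```python
import csv


def find_duplicate_homepages(crawl_csv: str, homepage_title: str) -> list[str]:
    """Return URLs from a crawl export whose meta title matches the homepage's.

    Assumes the CSV has "Address" and "Title 1" columns, as in a
    Screaming Frog export; other crawlers name these columns differently.
    """
    matches = []
    with open(crawl_csv, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            if row.get("Title 1", "").strip() == homepage_title:
                matches.append(row["Address"])
    return matches
```

Every URL this returns other than your real homepage is a candidate for a 301 redirect or a rel=canonical tag.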
Many people use tools like Screaming Frog to crawl the site and find internal links to these duplicate pages. It is a good idea to go in and edit those internal links directly so that they point to the correct URL. This helps you avoid losing link equity, a common issue when internal links pass through a 301.
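If you do take the 301 route on IIS, the redirect can be expressed as another URL Rewrite rule. This is a sketch assuming the duplicate page is /default.aspx; the rule name and pattern are placeholders to adapt to your own duplicates:

```xml
<!-- Redirect the duplicate homepage /default.aspx to the root with a 301 -->
<rule name="Homepage duplicate redirect" stopProcessing="true">
  <match url="^default\.aspx$" />
  <action type="Redirect" url="/" redirectType="Permanent" />
</rule>
```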
Issue # 3: Soft 404 errors
A very common problem with websites, though users will hardly notice the difference; unfortunately, search engine crawlers will. A soft 404 page looks like a normal 404 error page but returns the HTTP status code 200. The user may see a "Page not found" message, but behind the scenes the 200 code is telling search engines that the page is working correctly. This disconnect can cause pages to be crawled and indexed when you do not want them to be.
A soft 404 also means you cannot spot real broken pages or identify areas of your website where users are getting a bad experience. From a link building perspective (I had to mention it somewhere!), a soft 404 is bad news too: you may have incoming links pointing at broken URLs, but those links will be hard to track down and redirect to the correct page.
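The distinction can be sketched in a few lines of code: a page is a soft 404 when the body reads like an error page but the status code claims success. The "page not found" phrase below is an assumed heuristic; the exact wording varies from site to site:

```python
def classify_page(status: int, body: str) -> str:
    """Classify a crawled page by its HTTP status code and body text.

    Heuristic: a "soft 404" returns 200 while the body reads like an
    error page. Real sites need site-specific error phrases here.
    """
    looks_like_error = "page not found" in body.lower()
    if status == 404:
        return "hard 404"   # correct: crawlers will drop the page
    if status == 200 and looks_like_error:
        return "soft 404"   # wrong: crawlers may index the error page
    return "ok"
```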
How to solve:
Fortunately, this is a relatively simple fix for a developer, who can set the page to return a 404 status code instead of a 200. Whilst you're there, you can have some fun and make a cool 404 page for your users' enjoyment. There are some examples of awesome 404 pages out there, and I have to point to Distilled's own page here :)
To find soft 404s, you can use the feature in Google Webmaster Tools that reports the ones Google has detected.
You can also perform a manual check by going to a broken URL on your site (such as www.example.com/5435fdfdfd) and seeing what status code you get. A tool I really like for checking the status code is Web Sniffer, or you can use the Ayima tool if you use Google Chrome.
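The same manual check can be scripted if you want to test many URLs at once. This is a small sketch using Python's standard library; a healthy missing page should come back as 404, while a soft 404 comes back as 200:

```python
import urllib.error
import urllib.request


def status_code(url: str) -> int:
    """Return the HTTP status code a server sends for a URL.

    A correctly configured missing page returns 404; a soft 404
    returns 200 even though the page does not exist.
    """
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return err.code
```

Run it against a deliberately broken URL such as www.example.com/5435fdfdfd and check that you get a 404 rather than a 200.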
Issue # 4: 302 redirects instead of 301 redirects
Again, this is an easy one for developers to get wrong, because from a user's perspective the two are indistinguishable. However, search engines treat these redirects very differently. Just to recap: a 301 redirect is permanent, and the search engines will treat it as such; they'll pass link equity across to the new page. A 302 redirect is temporary, and the search engines will not pass link equity because they expect the original page to come back at some point.
How to solve:
To find 302 redirected URLs, I recommend using a deep crawler such as Screaming Frog or the IIS SEO Toolkit. You can then filter by 302s and check to see if they should really be 302s, or if they should be 301s instead.
To fix the problem, you will need to ask your developers to change the rule so that a 301 redirect is used rather than a 302 redirect.
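On IIS with the URL Rewrite module, for example, the fix often comes down to a single attribute on the rule (the rule name and URLs here are placeholders):

```xml
<!-- redirectType="Temporary" makes this rule issue a 302 -->
<rule name="OldPageRedirect" stopProcessing="true">
  <match url="^old-page$" />
  <action type="Redirect" url="/new-page" redirectType="Temporary" />
</rule>
<!-- Changing redirectType to "Permanent" makes the same rule issue a 301 -->
```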