Website
October 10, 2011 Crawling & Indexing
Canonical issues are self-created duplicate content that websites and webmasters create inadvertently. To address them, the search engines introduced the Canonical Tag, and Google discusses how to use it in Specify Your Canonical.
Website home pages. The most common canonical issue is a website unintentionally serving duplicate versions of its home page when only one version, preferably http://www.mydomain.com/, should be served:
- http://www.mydomain.com
- http://mydomain.com
- http://www.mydomain.com/index.html
- http://mydomain.com/index.html
Sub-domains. Another common canonical issue is multiple sub-domains serving the same content. Every page on the website should honor either www. or no www., but not both versions of the page. Some websites even honor wildcard sub-domains:
- http://mydomain.com/website-page.html
- http://www.mydomain.com/website-page.html
- http://wildcard.mydomain.com/website-page.html
Upper-Case vs Lower-Case. Some websites, specifically those run on Microsoft IIS, honor both upper-case and lower-case URLs. We recommend using lower case for your URLs and dashes to separate words if you want prettier URLs. Examples of this include:
- http://www.mydomain.com/WebsitePage.html
- http://www.mydomain.com/websitepage.html
For any of these issues, the basic fixes involve:
- Specify the Canonical Tag
- Set up 301 Redirects from old URLs to the Specified Canonical
- Change all website navigation and controlled inbound links to Specified Canonical
A couple of points on the Canonical Tag:
- The Canonical Tag is honored by the search engines, but remember it is only a hint that they use in conjunction with other signals. Address your canonical issues directly instead of relying on hints.
- Canonical Tags can point to pages on the same domain. Google has also supported Canonical Tags that point across domains since December 2009, but support may vary among other search engines.
- The search engines will look to make sure the content is very similar. Slight differences are accepted, but the tag will not work for unique or distinct content.
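The duplicate variants listed above can be mapped onto a single URL in code. The following is a minimal sketch, assuming the preferred host is www.mydomain.com and that the server treats upper-case and lower-case paths as the same page; the constants and the function name are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical settings for illustration; adjust to your own site.
CANONICAL_HOST = "www.mydomain.com"
BARE_HOST = "mydomain.com"
DEFAULT_DOCUMENTS = ("index.html", "index.htm")


def canonical_url(url: str) -> str:
    """Map a duplicate URL variant to one canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain and the www host as the same site.
    if host == BARE_HOST:
        host = CANONICAL_HOST
    # Lower-casing the path assumes the server is case-insensitive.
    path = parts.path.lower() or "/"
    # Collapse /index.html (and similar) onto the directory URL.
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, parts.fragment))
```

A request whose URL differs from canonical_url(url) is a candidate for a 301 redirect to the canonical form. Wildcard sub-domains are left unchanged by this sketch and would need their own rule.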
October 10, 2011 Crawling & Indexing
What is an XML Sitemap?
XML Sitemaps have replaced the older method of “submitting to search engines” by filling out a form on the search engine’s submission page. Now web developers submit a Sitemap directly, or wait for search engines to find it. Sitemaps do not guarantee all links will be crawled, and being crawled does not guarantee indexing. However, a Sitemap is still the best insurance for getting a search engine to learn about your entire site. (Source: Wikipedia)
We recommend creating a Sitemap based on the Sitemap protocol because the same file can be submitted to Google, Bing, Yahoo!, and the other search engines that are members of sitemaps.org. The search engines also offer support, and we recommend Google Sitemap Support as a reference.
Do I Need an XML Sitemap?
Google says that Sitemaps are particularly helpful if:
- Your site has dynamic content.
- Your site has pages that aren’t easily discovered by Googlebot during the crawl process, for example, pages featuring rich AJAX or images.
- Your site is new and has few links to it. (Googlebot crawls the web by following links from one page to another, so if your site isn’t well linked, it may be hard for us to discover it.)
- Your site has a large archive of content pages that are not well linked to each other, or are not linked at all.
Google doesn’t guarantee that we’ll crawl or index all of your URLs. However, we use the data in your Sitemap to learn about your site’s structure, which will allow us to improve our crawler schedule and do a better job crawling your site in the future. In most cases, webmasters will benefit from Sitemap submission, and in no case will you be penalized for it.
Create an XML Sitemap
You can create an XML Sitemap from a list of your website URLs, either manually or automatically via website code. There are also a number of tools that will spider your website and create an XML sitemap:
The Online XML Site Map Generator is limited to 500 URLs and will need to be manually updated. If you have any type of manual process, you need to set up reminders to rebuild your sitemap at a regular interval (minimum monthly and recommended weekly) as well as when you add new pages to your website.
Tag Definition & Management
While only the <urlset>, <url>, and <loc> tags are required, we recommend providing more information about your website in the <url></url> tag:
- Modified Date (<lastmod>): This should be in the W3C Datetime format, such as YYYY-MM-DD. Example: <lastmod>2007-01-08</lastmod>
- Modification Frequency (<changefreq>): This field is a suggestion rather than a command to search engines. They may crawl the pages more or less frequently than you indicate. Valid values are always, hourly, daily, weekly, monthly, yearly, and never. Our recommendation is daily for the home page, weekly for important website pages that change sometimes, and monthly for content that rarely changes.
- Page Priority (<priority>): This is the priority of the page relative to other pages on the same website. The assigned value can range from 0.0 to 1.0 and the default value is 0.5. Assigning a high priority to all your pages is unlikely to help, as the priority only refers to the same site, and if all pages are marked with the same priority, they will all be treated equally. We recommend that you set only the home page to 1.0 and then scale down to 0.5, saving 0.1 for pages that are not ready for full promotion.
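To make the tags concrete, here is a minimal sketch that builds a Sitemap from a list of pages using Python's standard library. The function name and the shape of the page dictionaries are assumptions made for illustration; only the tag names and the namespace come from the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return Sitemap XML for a list of dicts with 'loc' and optional tags."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = page["loc"]
        # The optional tags are written only when a value was supplied.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


example_pages = [
    {"loc": "http://www.mydomain.com/", "lastmod": "2007-01-08",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "http://www.mydomain.com/website-page.html",
     "changefreq": "weekly", "priority": 0.5},
]
sitemap_xml = build_sitemap(example_pages)
```

Writing sitemap_xml to a file named sitemap.xml gives a file that follows the structure described above.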
For Large Websites
For larger websites, it is recommended that you obtain, or build, an automated script usually with database access to product, article, or post information. Most CMS and Ecommerce platforms have plug-ins, or modules, for XML Sitemaps. Here are a few more popular tools:
A Sitemap file can contain no more than 50,000 URLs and must be no larger than 50MB when uncompressed. If your Sitemap is larger than this, break it into several smaller Sitemaps. These limits help ensure that your web server is not overloaded by serving large files to Google. If you have more than one Sitemap, you can list them in a Sitemap index file and then submit the Sitemap index file to Google. You don’t need to submit each Sitemap file individually.
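Splitting a large URL list and listing the resulting files in a Sitemap index could look roughly like this. It is a sketch only: the sitemap-N.xml file names are hypothetical, and the code enforces the 50,000 URL limit but does not check the 50MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Break a list of URLs into chunks that each fit in one Sitemap."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return Sitemap index XML listing the given Sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,000 pages.
all_urls = [f"http://www.mydomain.com/page-{n}.html" for n in range(120_000)]
chunks = split_urls(all_urls)
index_xml = build_sitemap_index(
    [f"http://www.mydomain.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
```

Each chunk would be written to its own Sitemap file, and only the index file needs to be submitted.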
How do I Submit an XML Sitemap?
Once you have created an XML sitemap, you should upload it to the root of your domain so that it is located at http://www.mydomain.com/sitemap.xml. Once uploaded, you can submit your sitemap URL to Google Webmaster Tools and Bing Webmaster Tools. We also recommend adding the location of your XML sitemap to robots.txt for auto discovery.
Sitemap: http://www.mydomain.com/sitemap.xml
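One way to confirm that the Sitemap line in robots.txt is readable is to parse the file with Python's standard robots.txt parser. This is a small sketch, assuming Python 3.8 or later (where site_maps() is available); the sample robots.txt content and the helper name are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.mydomain.com/sitemap.xml
"""


def sitemaps_in_robots(robots_txt: str):
    """Return the Sitemap URLs declared in a robots.txt document."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


declared = sitemaps_in_robots(ROBOTS_TXT)
```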
October 10, 2011 Site Performance
- How Load Time Affects Your Bottom Line: http://blog.kissmetrics.com/loading-time/?wide=1
- Matt Cutts on Site Speed: http://www.mattcutts.com/blog/site-speed/
- Image Optimization: This is the most common recommendation for increasing site speed and it’s our first one as well. Most image editing programs have an option called “Save for Web”. Don’t forget to specify full dimensions. Here is a link to an online tool if you cannot edit the original image: Image Optimizer
- Minimize HTML Code: Many website design programs add a lot of extra code and white space to HTML that needs to be removed. Another way to optimize your HTML is through the use of GZip HTML Compression. Here is a link to the main GZip tool site for downloading purposes.
- Improve CSS & Javascript: There are a few steps involved with CSS and Javascript: consolidate the code, move it to external files, and minify it. Consolidation means fewer HTTP requests are made by the browser, while using external files generally produces faster pages in the real world because the JavaScript and CSS files are cached by the browser. Minification is the practice of removing unnecessary characters from code to reduce its size, thereby improving load times.
- Browser Caching: Browsers (and proxies) use a cache to reduce the number and size of HTTP requests, making web pages load faster. Setting an expiry date or a maximum age in the HTTP headers for static resources instructs the browser to load previously downloaded resources from local disk rather than over the network.
- GZIP Compression: The time it takes to transfer an HTTP request and response across the network can be significantly reduced by decisions made by front-end engineers. Compression reduces response times by reducing the size of the HTTP response. Many web servers can compress files in gzip format before sending them for download. Note that compression is only beneficial for larger resources. Due to the overhead and latency of compression and decompression, you should only compress files above a certain size threshold (e.g. over 200 bytes).
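The caching and compression recommendations above can be sketched as a small helper that decides which response headers to send. This is an illustration, not a server configuration: the function name, the one-week cache lifetime, and the way the flags are passed in are all assumptions, and the 200 byte threshold is simply the example figure from the text.

```python
import gzip

MIN_COMPRESS_BYTES = 200          # example threshold from the text; tune for your server
STATIC_MAX_AGE = 7 * 24 * 3600    # hypothetical one-week cache lifetime, in seconds


def prepare_response(body: bytes, is_static: bool, accepts_gzip: bool):
    """Return (body, headers) with caching and gzip applied where useful."""
    headers = {}
    if is_static:
        # Lets browsers and proxies reuse the resource without a new download.
        headers["Cache-Control"] = f"public, max-age={STATIC_MAX_AGE}"
    if accepts_gzip and len(body) > MIN_COMPRESS_BYTES:
        compressed = gzip.compress(body)
        # Keep the compressed form only when it is actually smaller.
        if len(compressed) < len(body):
            body = compressed
            headers["Content-Encoding"] = "gzip"
            headers["Vary"] = "Accept-Encoding"
    headers["Content-Length"] = str(len(body))
    return body, headers
```

In practice most web servers offer built-in settings for both behaviors, so a helper like this is mainly useful for understanding what those settings do.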
Site Speed Resources:
November 5, 2007 Crawling & Indexing
Google has a rule regarding use of the rel=”nofollow” attribute in links, which has generated some controversy in recent months. The rule is intended to prevent websites from gaining better Google search result rankings by participating in link exchanges or sales. It has become more relevant in recent weeks, as Google performed an internet-wide update of PageRank (PR), and many leading directories and sites that participate in paid link advertising seem to have been singled out.
In general, Google and other search engines allow websites with more links to them from other sites to appear higher in search results, considering each link as a “recommendation” for the site it links to. The “nofollow” tag, when added to a link, prevents Google from counting the link as such a “recommendation”, thus not providing any search ranking benefits to the site being linked to.
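To see which links on a page carry the attribute, the page's HTML can be scanned for anchor tags. The sketch below uses Python's standard HTML parser; the class name and the sample markup are made up for illustration, and it only reports what is in the markup, not how any search engine treats it.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect link targets, split by whether rel contains nofollow."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


sample_html = (
    '<a href="http://example.com/editorial">Editorial link</a>'
    '<a rel="nofollow" href="http://example.com/paid">Paid link</a>'
    '<a rel="external NoFollow" href="http://example.com/exchange">Exchange</a>'
)
audit = LinkAudit()
audit.feed(sample_html)
```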
Google believes that a link which has been purchased or exchanged in a reciprocal manner should not benefit websites’ ranking in search results. Thus its “nofollow” rule calls on all website owners to use the “nofollow” tag when creating such links. According to mattcutts.com (a blog created by the head of the Google “webspam team”), any website which doesn’t use the tag for a link of this type may be penalized. Some have speculated that the recent reductions in PageRank values for many websites were caused by violations of this rule.
This has created controversy and complicated the process of buying or exchanging a link. Some argue that a reciprocal link exchange does indicate a recommendation for each other’s websites by the respective sites’ administrators, and should be counted positively. Website owners who gain a substantial part of their revenue from selling links, or had already exchanged many reciprocal links without the “nofollow” tag before Google instated the rule, are understandably displeased with it.
If the Google “nofollow” rule is to be followed, website owners now have to agree whether or not to use it when exchanging or selling link(s), which may complicate the matter if one of the two owners involved does not believe in following the rule, doesn’t know about it, or lacks understanding of the “nofollow” tag. Some claim that Google’s method for detecting bought or reciprocal links is not effective enough to make following the rule necessary. Website owners who do not believe they will be able to gain any other types of inbound links are especially likely to violate it. In addition to their automated system, Google has a page on their website where anyone can “report” the sale of links to them.
It remains to be seen how significant an effect Google’s “nofollow” rule will have on reciprocal linking and link sales, with the level of understanding and acceptance of the rule among website owners playing a significant role in the outcome. One thought to keep in mind is that the use of “nofollow” might actually leave an SEO footprint for Google to detect that a site has been “SEO’d”.
October 10, 2011 Crawling & Indexing
Here are the basic elements to consider when determining the best way to manage your website URLs:
- Page Names: Developing search-friendly URLs starts with naming the page in a friendly, descriptive way. We recommend that you use descriptive keywords instead of product SKUs or page numbers, and (as Matt Cutts also recommends) dashes over underscores. A good example is http://www.example.com/bike-acc-14.html and a better example is http://www.example.com/bicycle-wicker-baskets.html. The latter is descriptive both for search engines and when shared by visitors.
- Static versus Dynamic URLs: A static URL is one that does not change, so it typically does not contain any URL parameters. Example: http://www.mydomain.com/january.htm If the content of a site is stored in a database and pulled for display on pages on demand, dynamic URLs may be used. In that case the site serves basically as a template for the content. Example: http://www.mydomain.com/events.php?month=january Dynamic URLs have the disadvantage that different URLs can have the same content, so different users might link to URLs with different parameters which have the same content. That’s one reason why webmasters sometimes want to rewrite their URLs to static ones.
- Query String Parameters: Matt Cutts offers a detailed explanation of the components of a URL at http://www.mattcutts.com/blog/seo-glossary-url-definitions/ It is generally accepted that query string parameters are everything after a ? and before a #, separated by an &. Query string parameters fall into 2 groups as they relate to SEO: those that have an effect on content and those that don’t. Examples of query strings that don’t change content are Session IDs, Tracking IDs, and other parameters that lead to duplicate content.
- Parameter Handling: When crawlers find the same content through varied URLs, there may be several negative effects, including a dilution of link popularity, unfriendly URLs showing in search results, and website hierarchy confusion. Google has recently launched a tool to help deal with this problem. Parameter Handling is available in Google Webmaster Tools and allows you to view which parameters Google believes should be ignored or not ignored at crawl time, and gives you the ability to override the suggestions if necessary.
- Should I Switch to Static URLs?: If you are currently using Dynamic URLs, Google states that the benefit of rewriting your Dynamic URLs may be minimal.
- Keep Only Essential Parameters: Google’s recommendation is that you should not convert dynamic URLs to static URLs unless your rewrites are limited to removing unnecessary parameters. If you transform your dynamic URL to make it look static, every URL should have a 301 redirect in place to make sure Google is able to interpret the information correctly in all cases. The real issue is not whether static or dynamic URLs are better for SEO; it is whether query string parameters are needed and produce unique content. Non-essential query parameters need to be managed or removed regardless of whether the URL is static or dynamic.
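Removing non-essential parameters can be sketched with Python's URL utilities. The parameter names in NON_ESSENTIAL_PARAMS are assumptions chosen for illustration (a session ID and common tracking parameters); the right list depends on which parameters actually change the content on your site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change page content.
NON_ESSENTIAL_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def strip_non_essential_params(url: str) -> str:
    """Drop parameters that lead to duplicate content; keep the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_ESSENTIAL_PARAMS
    ]
    # Sorting gives one stable ordering for the same set of parameters.
    kept.sort()
    # The fragment (after #) is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Two URLs that differ only in a session or tracking parameter then reduce to the same address, which is the one to use in links and as the redirect target.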
October 10, 2011 Crawling & Indexing
Websites are perceived as both a single property and a collection of individual website pages. Search engines crawl websites and must determine the inner page hierarchy along with which sections and pages of your website are most important. A solid plan for your site’s structure can substantially improve your search engine results. Here are the most important elements that need to be planned when developing your website hierarchy:
Site Architecture
The first consideration is how many clicks it takes the user to get from your home page to the farthest point of your website. The fewer clicks in the click path, the higher the priority search engines assign to the page. However, a completely flat website with every link only 1 or 2 clicks away can confuse search engines as well, because a flat hierarchy makes it hard for the crawlers to figure out which pages are the most important.
For a website page, the level could also be assessed based on the physical structure of the website. Website pages in the root folder could be viewed as more important than those in sub directories. It is recommended to have only one level of sub folders and absolutely no more than 2.
Developing a proper information architecture shows search engines what you consider the most important categories and website pages on your site, as well as the association of each category’s pages with their parent category theme.
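Click depth can be measured with a breadth-first walk over the site's internal links. The sketch below assumes you already have a mapping from each page to the pages it links to (for example, from a crawl); the function name and the sample paths are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The first visit is the shortest click path in a breadth-first walk.
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
site_links = {
    "/": ["/lighting/", "/about.html"],
    "/lighting/": ["/lighting/table-lamps.html", "/"],
    "/lighting/table-lamps.html": [],
    "/orphan.html": [],
}
depths = click_depths(site_links, "/")
```

Pages that never appear in the result, such as /orphan.html here, cannot be reached by clicking from the home page at all.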
Navigation
One of the primary tools for establishing website structure is the template navigation that usually persists throughout a website.
Only the most important URLs should be placed into the top and side navigation and the total number should be limited.
Links in the header and left navigation are usually viewed as the most important template-based links, with right navigation only slightly behind left. Footer navigational links are not as powerful in establishing website hierarchy and they may even be ignored as a signal by some crawlers.
Breadcrumbs
Breadcrumb navigation shows where the currently viewed web page is located in the site hierarchy, while providing shortcuts to instantly jump higher up the hierarchy. For example, a product page for a table lamp may have the breadcrumb navigation of “Home > Home Furnishings > Lighting > Table Lamps.”
Breadcrumbs are very effective at giving search engines clues to your website hierarchy. Breadcrumbs are usually seen at the top of the page just under the main navigation. Breadcrumbs are absolutely used by Google to establish hierarchy and this discovered hierarchy is even displayed in Google search results.
Two common questions that we receive are whether the first item (home) and the last item (the page itself) in the breadcrumb should be linked. In researching this, we have found that the majority of websites link the first item (home) to the home page and do not link the last item to itself, as that link provides absolutely no value to the user.
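The linking convention described above (every level linked except the current page) can be sketched as a small rendering helper. The function name and the example URLs are assumptions for illustration only.

```python
from html import escape


def render_breadcrumb(trail):
    """Render (label, url) pairs, home first, linking all but the last item."""
    parts = []
    for position, (label, url) in enumerate(trail):
        is_last = position == len(trail) - 1
        if is_last:
            # The current page is shown as plain text, not as a link to itself.
            parts.append(escape(label))
        else:
            parts.append(f'<a href="{escape(url)}">{escape(label)}</a>')
    return " &gt; ".join(parts)


# Hypothetical URLs for the table lamp example.
trail = [
    ("Home", "/"),
    ("Home Furnishings", "/home-furnishings/"),
    ("Lighting", "/home-furnishings/lighting/"),
    ("Table Lamps", "/home-furnishings/lighting/table-lamps/"),
]
breadcrumb_html = render_breadcrumb(trail)
```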
Other Resources
Here are a few more resources for Website Hierarchy:
