<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Be a Normalizer &#8211; a C14N Exterminator</title>
	<atom:link href="http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/</link>
	<description>Never Mess With a Woman Who Can Pull Rank</description>
	<lastBuildDate>Tue, 12 Jan 2010 20:10:47 -0600</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: httpwebwitch</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-79531</link>
		<dc:creator>httpwebwitch</dc:creator>
		<pubDate>Mon, 19 Jan 2009 19:27:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-79531</guid>
		<description>This post has been nominated for a SEMMY
http://www.semmys.org/2009/search-tech-all-2009-nominees/</description>
		<content:encoded><![CDATA[<p>This post has been nominated for a SEMMY<br />
<a href="http://www.semmys.org/2009/search-tech-all-2009-nominees/" rel="nofollow">http://www.semmys.org/2009/search-tech-all-2009-nominees/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Riley</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-78672</link>
		<dc:creator>Mike Riley</dc:creator>
		<pubDate>Sat, 04 Oct 2008 07:46:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-78672</guid>
		<description>I usually find that the best way to handle canonicalization, as well as a slew of other problems, is to use a framework that rewrites all URLs to a single file and then routes them, à la ExpressionEngine or CodeIgniter (or Ruby!).  The real beauty of doing this is that you don&#039;t really need to screw around with regular expressions, and you definitely don&#039;t need to touch .htaccess or Apache directives to reconfigure your URL rules: everything is defined in the same language you&#039;re using to generate the pages.  It makes for a much smoother way of handling this.</description>
		<content:encoded><![CDATA[<p>I usually find that the best way to handle canonicalization, as well as a slew of other problems, is to use a framework that rewrites all URLs to a single file and then routes them, à la ExpressionEngine or CodeIgniter (or Ruby!).  The real beauty of doing this is that you don&#8217;t really need to screw around with regular expressions, and you definitely don&#8217;t need to touch .htaccess or Apache directives to reconfigure your URL rules: everything is defined in the same language you&#8217;re using to generate the pages.  It makes for a much smoother way of handling this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: httpwebwitch</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60623</link>
		<dc:creator>httpwebwitch</dc:creator>
		<pubDate>Tue, 26 Feb 2008 23:44:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60623</guid>
		<description>@dan: the rewriterule for my example is something more like

RewriteRule /restaurants/([0-9]+)/.* /r.php?id=$1

Validating your fluff is usually not handled with rewrite rules; it&#039;s done in the code, since you need to compare the &quot;fluff&quot; with some actual data associated with a RowID.

BTW, in WordPress they call that a &quot;post slug&quot;, and it&#039;s used to create &quot;pretty permalinks&quot;.

I presume your fluff will be some modified and hyphenated version of a title or name in the data. 
Once you&#039;ve figured out which restaurant is #46335, parse the URL and see if your fluff is an exact match for the string you expect to be associated with that data row.

done properly, your rewriterule becomes:

RewriteRule /restaurants/([0-9]+)/(.*) /r.php?id=$1&amp;fluff=$2

create a slug and compare it to your fluff. If they&#039;re not identical, you&#039;ve already got your slug so redirect to a new URL using that.
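
For what it's worth, here is a minimal sketch of that check. It is written in Python rather than the PHP that r.php implies, and the names slugify, canonical_path and the in-memory RESTAURANTS table are all hypothetical stand-ins for your own data lookup. Treat it as one way to do it, not the only one.

```python
import re

# Hypothetical stand-in for the database: RowID -> restaurant name.
RESTAURANTS = {46335: "Jimmy's Lunch, Hamilton, Ontario"}


def slugify(name):
    """Lowercase, drop apostrophes, and hyphenate runs of other characters."""
    cleaned = name.lower().replace("'", "")
    return re.sub(r"[^a-z0-9]+", "-", cleaned).strip("-")


def canonical_path(row_id, fluff):
    """Return None if the fluff is already canonical, else the path to 301 to."""
    name = RESTAURANTS.get(row_id)
    if name is None:
        raise KeyError(row_id)  # unknown id: the caller should answer 404
    expected = slugify(name) + ".php"
    if fluff == expected:
        return None
    return "/restaurants/{}/{}".format(row_id, expected)
```

With the two-capture rule above, the script would receive the id and the fluff; any result other than None is the Location to send with a 301.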

the rewriterules giveth, and the code taketh away</description>
		<content:encoded><![CDATA[<p>@dan: the rewriterule for my example is something more like</p>
<p>RewriteRule /restaurants/([0-9]+)/.* /r.php?id=$1</p>
<p>Validating your fluff is usually not handled with rewrite rules; it&#8217;s done in the code, since you need to compare the &#8220;fluff&#8221; with some actual data associated with a RowID.</p>
<p>BTW, in WordPress they call that a &#8220;post slug&#8221;, and it&#8217;s used to create &#8220;pretty permalinks&#8221;.</p>
<p>I presume your fluff will be some modified and hyphenated version of a title or name in the data.<br />
Once you&#8217;ve figured out which restaurant is #46335, parse the URL and see if your fluff is an exact match for the string you expect to be associated with that data row.</p>
<p>done properly, your rewriterule becomes:</p>
<p>RewriteRule /restaurants/([0-9]+)/(.*) /r.php?id=$1&amp;fluff=$2</p>
<p>create a slug and compare it to your fluff. If they&#8217;re not identical, you&#8217;ve already got your slug so redirect to a new URL using that.</p>
<p>the rewriterules giveth, and the code taketh away</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chad Ledford</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60567</link>
		<dc:creator>Chad Ledford</dc:creator>
		<pubDate>Tue, 26 Feb 2008 16:14:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60567</guid>
		<description>Great post on canonicalization.  Digg has actually just fixed one of their issues (http://www.3tailer.com/sundry/digg-implements-the-1000000-idea-non-www-301-redirect)</description>
		<content:encoded><![CDATA[<p>Great post on canonicalization.  Digg has actually just fixed one of their issues (<a href="http://www.3tailer.com/sundry/digg-implements-the-1000000-idea-non-www-301-redirect" rel="nofollow">http://www.3tailer.com/sundry/digg-implements-the-1000000-idea-non-www-301-redirect</a>)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dan</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60429</link>
		<dc:creator>dan</dc:creator>
		<pubDate>Mon, 25 Feb 2008 22:51:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60429</guid>
		<description>I&#039;ve recently been working on a site that uses a lot of &quot;URL Fluffing&quot; (great term) as in Tip #3. 

Fixing this I guess would just be a simple rewrite rule. Unfortunately rewrite rules are never simple for me.

Using the given example C14N URL of
http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php

would the rewrite rule just be
RewriteRule  http://www.example.com/restaurants/(.*)/.* http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php [R=301,L]

I believe such an approach would require a separate rule for each URL, but I guess you have to maintain your fluff somewhere.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve recently been working on a site that uses a lot of &#8220;URL Fluffing&#8221; (great term) as in Tip #3. </p>
<p>Fixing this I guess would just be a simple rewrite rule. Unfortunately rewrite rules are never simple for me.</p>
<p>Using the given example C14N URL of<br />
<a href="http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php" rel="nofollow">http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php</a></p>
<p>would the rewrite rule just be<br />
RewriteRule  http://www.example.com/restaurants/(.*)/.* <a href="http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php" rel="nofollow">http://www.example.com/restaurants/46335/jimmys-lunch-hamilton-ontario.php</a> [R=301,L]</p>
<p>I believe such an approach would require a separate rule for each URL, but I guess you have to maintain your fluff somewhere.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: httpwebwitch</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60427</link>
		<dc:creator>httpwebwitch</dc:creator>
		<pubDate>Mon, 25 Feb 2008 22:29:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60427</guid>
		<description>Yes Soeren. I didn&#039;t delve into the &quot;why&quot; of C14N in this post, since that info is widely available elsewhere, but that&#039;s the crux: C14N prevents double-indexing and supplemental problems in the SERPs. Taken further, the same techniques you use for normalizing can catch and redirect IBLs that include common typos, odd punctuation or other abnormalities.

I also neglected the &quot;how&quot;, as in &quot;how to fix it&quot;. I only covered the &quot;what&quot; here, which, I admit, only helps you identify the problems; it doesn&#039;t offer solutions. Nonetheless I think this post is a helpful guide, if only for the first half of the journey.

Cheers, hww</description>
		<content:encoded><![CDATA[<p>Yes Soeren. I didn&#8217;t delve into the &#8220;why&#8221; of C14N in this post, since that info is widely available elsewhere, but that&#8217;s the crux: C14N prevents double-indexing and supplemental problems in the SERPs. Taken further, the same techniques you use for normalizing can catch and redirect IBLs that include common typos, odd punctuation or other abnormalities.</p>
<p>I also neglected the &#8220;how&#8221;, as in &#8220;how to fix it&#8221;. I only covered the &#8220;what&#8221; here, which, I admit, only helps you identify the problems; it doesn&#8217;t offer solutions. Nonetheless I think this post is a helpful guide, if only for the first half of the journey.</p>
<p>Cheers, hww</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Soeren Sprogoe</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60412</link>
		<dc:creator>Soeren Sprogoe</dc:creator>
		<pubDate>Mon, 25 Feb 2008 20:46:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60412</guid>
		<description>One thing that surprised me, when I originally learned about c14n, was that Google actually treats yourdomain.com and yourdomain.com/default.aspx as two completely different pages!

So page value will be split between the two, but one of them will (most likely) be marked as dupe content and be removed from the index!

So I can definitely confirm that you need to be consistent when doing (internal) linking:
- Always use lower case.
- Always use www. (or don&#039;t).
- Always link to your frontpage with either / or default.aspx (or index.php, depending on your choice of technology).</description>
		<content:encoded><![CDATA[<p>One thing that surprised me, when I originally learned about c14n, was that Google actually treats yourdomain.com and yourdomain.com/default.aspx as two completely different pages!</p>
<p>So page value will be split between the two, but one of them will (most likely) be marked as dupe content and be removed from the index!</p>
<p>So I can definitely confirm that you need to be consistent when doing (internal) linking:<br />
- Always use lower case.<br />
- Always use www. (or don&#8217;t).<br />
- Always link to your frontpage with either / or default.aspx (or index.php, depending on your choice of technology).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Ring</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60319</link>
		<dc:creator>Ian Ring</dc:creator>
		<pubDate>Mon, 25 Feb 2008 06:51:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60319</guid>
		<description>thanks Rae
a lot of material was omitted from this post... watch for more installments like this one on ianring.com and at webmasterworld in the coming year. I consider it a compliment to be the only guestwhore who got dugg. :)</description>
		<content:encoded><![CDATA[<p>thanks Rae<br />
a lot of material was omitted from this post&#8230; watch for more installments like this one on ianring.com and at webmasterworld in the coming year. I consider it a compliment to be the only guestwhore who got dugg. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rae Hoffman</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60247</link>
		<dc:creator>Rae Hoffman</dc:creator>
		<pubDate>Sun, 24 Feb 2008 19:57:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60247</guid>
		<description>Ian... wow... that was a lot of info...

1. Never post in dual categories on the blog :)
2. You weren&#039;t supposed to TEACH anyone anything. 

Thanks and awesome job dude. ;)</description>
		<content:encoded><![CDATA[<p>Ian&#8230; wow&#8230; that was a lot of info&#8230;</p>
<p>1. Never post in dual categories on the blog :)<br />
2. You weren&#8217;t supposed to TEACH anyone anything. </p>
<p>Thanks and awesome job dude. ;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: rdrysdale</title>
		<link>http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60108</link>
		<dc:creator>rdrysdale</dc:creator>
		<pubDate>Sat, 23 Feb 2008 19:20:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/#comment-60108</guid>
		<description>Alright, I finally had the time to absorb this. Nice auditing process. It takes a little while, but clearly doesn&#039;t need to be done on a weekly basis, so it&#039;s well worth the setup. My problem has always been understanding what&#039;s wrong and not having the technical know-how to fix it. Slowly getting better on that front. Thanks for the referrals. We&#039;re working with ISAPI rewrite now to counter some issues.</description>
		<content:encoded><![CDATA[<p>Alright, I finally had the time to absorb this. Nice auditing process. It takes a little while, but clearly doesn&#8217;t need to be done on a weekly basis, so it&#8217;s well worth the setup. My problem has always been understanding what&#8217;s wrong and not having the technical know-how to fix it. Slowly getting better on that front. Thanks for the referrals. We&#8217;re working with ISAPI rewrite now to counter some issues.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
