Wednesday, December 14, 2016

Implementing 3000+ Redirects in Sitecore

When standing up a new site, redirects always seem to be an afterthought - one of those items on the list that you talk about in the early phases, and then again when you are ready to tackle them in the last few weeks when the launch is right around the corner.

As a Sitecore developer, most of the time it's up to you to set up the module of choice, and then simply train your content authors on how to use it to load the redirects.

However, when dealing with a large corporate site, and in my case one where we combined a couple of sites into one, you have to find a relatively quick way to get thousands of redirects handled by your shiny new Sitecore site.

In this post, I will describe the strategy that I used to import and implement a massive number of redirects successfully within Sitecore.

You can go ahead and grab all the Url Rewrite module code changes that I mentioned in this post via my fork on GitHub: https://github.com/martinrayenglish/UrlRewrite

You can review the code changes here: https://github.com/martinrayenglish/UrlRewrite/commit/d9b649d129b6b49ee7cf3f6beae3a8229750a152

You can grab the PowerShell script here.

Url Rewrite Module

There are plenty of Sitecore redirect modules out there, but Andy Cohen's Url Rewrite module is my favorite because of its rich feature set, great architecture and the fact that its source code is available when you need to make customizations: https://marketplace.sitecore.net/Modules/Url_Rewrite.aspx

As shown above, it is available on the Sitecore Marketplace. I would recommend grabbing the branch / tag that is specific to your version of Sitecore by navigating over to the GitHub repository: https://github.com/iamandycohen/UrlRewrite.

If you view the changelog, you will be able to find out what version supports your instance.

That is what I did in my case - I worked with Version 1.8.1.3 when I had to make the customizations mentioned below for my 8.1 U2 implementation.

Handling Bucket Items

As we know, item buckets let you manage large numbers of items in the content tree, and this was a natural fit for the massive number of redirect items that I intended to load into Sitecore.

Turning to the module's code, there is a recursive method called "AssembleRulesRecursive" within the RulesEngine.cs file that is responsible for aggregating all the redirect items and rules. I ended up having to update this area of the module to check within both bucket and node items for redirect items and rules.

You can see this in my change on line 91 of RulesEngine.cs: https://github.com/martinrayenglish/UrlRewrite/commit/d9b649d129b6b49ee7cf3f6beae3a8229750a152#diff-b5f5d381da80e314aac4e60905fb7ea7
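To illustrate the idea only (this is not the module's C# code), here is a minimal Python sketch of a recursive walk that collects redirect items from ordinary folders and from buckets alike. The Item class, the template names and the assemble_rules_recursive function are hypothetical stand-ins for Sitecore items and the module's method.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical template names, standing in for Sitecore template checks.
REDIRECT = "redirect"
FOLDER = "folder"
BUCKET = "bucket"


@dataclass
class Item:
    """Hypothetical stand-in for a Sitecore content item."""
    name: str
    template: str
    children: List["Item"] = field(default_factory=list)


def assemble_rules_recursive(item: Item) -> List[str]:
    """Collect the names of all redirect items underneath `item`.

    Containers are descended into whether they are ordinary folders
    or buckets (including the folders a bucket creates internally).
    """
    rules: List[str] = []
    for child in item.children:
        if child.template == REDIRECT:
            rules.append(child.name)
        elif child.template in (FOLDER, BUCKET):
            rules.extend(assemble_rules_recursive(child))
    return rules


tree = Item("Redirects", FOLDER, [
    Item("home-redirect", REDIRECT),
    Item("Imported", BUCKET, [
        Item("2016", FOLDER, [Item("old-page-1", REDIRECT)]),
        Item("old-page-2", REDIRECT),
    ]),
])

print(assemble_rules_recursive(tree))
```

The point of the sketch is the second branch: without the bucket check, everything under "Imported" would be skipped.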

Next, I needed to set the standard values of the module's Simple Redirect template to be Bucketable.


After this, I went ahead and added a new bucket content item at my "global" location in my content tree that would hold the redirect items that I intended to import into Sitecore.

PowerShell Import

The next step in this operation was to get the actual redirect items loaded into Sitecore. I created a PowerShell script that would target a CSV file loaded into the media library and create an item for each data record.

I have used several variations of Pavel Veller's script for handling imports in the past. If you are new to Sitecore PowerShell, I recommend taking a look at his post: http://jockstothecore.com/content-import-with-powershell-treasure-hunt/.

My final script simply required my CSV file to contain "name", "old" and "new" columns that I would use to create the redirect items within my bucket. The value in the "name" column would be used for the redirect item name, "old" would hold the old url and "new" would hold the new / target url. Here is a screenshot of a sample from my CSV file:


With everything in place, I uploaded my CSV file containing my redirects into the media library, ran my script, and my many, many redirect items started to appear in my bucket.
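Before uploading a file this large, it can help to sanity-check it. The following is a minimal Python sketch, separate from the PowerShell import itself, that verifies the "name", "old" and "new" columns are present and that no row has an empty value. The function name and the sample rows are hypothetical and only there for illustration.

```python
import csv
import io
from typing import Dict, List

REQUIRED_COLUMNS = ("name", "old", "new")


def load_redirect_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into dicts keyed by name/old/new.

    Raises ValueError if a required column is missing or a row
    contains an empty value for one of the required columns.
    """
    reader = csv.reader(io.StringIO(csv_text))
    # Normalise header case so "Old" and "old" are treated alike.
    header = [h.strip().lower() for h in next(reader, [])]
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows: List[Dict[str, str]] = []
    for line_no, values in enumerate(reader, start=2):
        if not values:  # skip completely blank lines
            continue
        record = dict(zip(header, (v.strip() for v in values)))
        clean = {c: record.get(c, "") for c in REQUIRED_COLUMNS}
        empty = [c for c, v in clean.items() if not v]
        if empty:
            raise ValueError(f"line {line_no}: empty values for {empty}")
        rows.append(clean)
    return rows


# Made-up sample data in the expected layout.
sample = """name,old,new
about-us,/about.html,/company/about
pdf-guide,/docs/guide.pdf,/resources/guide
"""

rows = load_redirect_rows(sample)
print(len(rows), rows[0]["old"], "->", rows[0]["new"])
```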


Handling Redirects with Static File Extensions 

The module has a built-in handler for static file extensions, which is described in Brent Scav's post: https://blog.horizontalintegration.com/2014/12/19/sitecore-url-rewrite-module-for-static-file-extensions/.

You can simply add handler entries to your web.config to allow it to handle whatever static extensions you need to redirect from in your instance.

Unfortunately, this didn’t work for me in the latest version, as it kept throwing a Tracker.Current "null" error when trying to start the Analytics tracker within the RegisterEventOnRedirect method in Tracking.cs, line 30: https://github.com/martinrayenglish/UrlRewrite/blob/master/Hi.UrlRewrite/Analytics/Tracking.cs

I believe that this was because the handler was hit before Sitecore's InitializeTracker pipeline had been run.

I went ahead and added a way for the handler to tell the InboundRewriter not to try to start the Analytics tracker when it is handling a static extension redirect. This was done by adding an entry to the HttpRequestArgs custom data's SafeDictionary within the handler, on line 28 of UrlRewriteHandler.cs:

https://github.com/martinrayenglish/UrlRewrite/commit/d9b649d129b6b49ee7cf3f6beae3a8229750a152#diff-1fca180afe168b7567be7ea87006de50

and looking for it on line 54 of InboundRewriteProcessor.cs:

https://github.com/martinrayenglish/UrlRewrite/commit/d9b649d129b6b49ee7cf3f6beae3a8229750a152#diff-f325026a733120e6591270e76c2d8347

After that, the handlers worked like a champ.
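The pattern behind that fix can be sketched in a few lines of Python. This is an illustration under assumptions, not the module's C# code: the RequestArgs class, the key name and both functions are hypothetical stand-ins for the pipeline args, the SafeDictionary entry, the handler and the inbound rewriter.

```python
from typing import Dict, List

# Hypothetical key name; the key used in the module may differ.
SKIP_TRACKING_KEY = "skip_analytics_tracking"


class RequestArgs:
    """Hypothetical stand-in for pipeline args carrying custom data."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.custom_data: Dict[str, bool] = {}


def inbound_rewrite(args: RequestArgs) -> List[str]:
    """Perform the redirect; only touch analytics when it is safe to."""
    actions = ["redirect"]
    if not args.custom_data.get(SKIP_TRACKING_KEY, False):
        actions.append("register_analytics_event")
    return actions


def static_file_handler(args: RequestArgs) -> List[str]:
    """Entry point for static extensions.

    It runs before the tracker has been initialised, so it flags the
    request and lets the rewriter skip the analytics step.
    """
    args.custom_data[SKIP_TRACKING_KEY] = True
    return inbound_rewrite(args)


print(static_file_handler(RequestArgs("/docs/guide.pdf")))
print(inbound_rewrite(RequestArgs("/about")))
```

Passing the flag through the request's own data keeps the rewriter unaware of who called it; it only needs to know whether tracking should be skipped.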

Here is an example of a handler for PDF files from my web.config:


Bonus - Handling Subdomain redirects

I needed a way to handle non-Sitecore site subdomain redirects within my solution.

To explain what I was doing here: 

We had merged a separate site with a different subdomain into our new site, and wanted to be able to create redirects for urls from the old site that pointed to the new urls.

Example:

http://old.mysite.com/folder/some-nice-url (old non-Sitecore site) → https://www.mysite.com/newfolder/some-new-nice-url (new Sitecore site)

Once again, I dug into InboundRewriter.cs and updated the TestRuleMatches method so that it could match on host name as well. After this, I added a new TestAllRuleMatches method that is called instead: it first checks using the "old way" of matching based on path, and if it doesn’t find a match, it checks for a match using the full url with the host name included.

You can see these changes here: https://github.com/martinrayenglish/UrlRewrite/commit/d9b649d129b6b49ee7cf3f6beae3a8229750a152#diff-4580f06f0095411a68df2fa0d1e890dd
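As a rough illustration of that two-pass lookup, here is a minimal Python sketch. It is a simplification under assumptions: rules are an exact-match dictionary rather than the module's rule objects, and find_redirect is a hypothetical name, not a method from the module.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def find_redirect(url: str, rules: Dict[str, str]) -> Optional[str]:
    """Return the redirect target for `url`, or None if nothing matches."""
    parts = urlsplit(url)

    # Pass 1: the "old way", matching on the path only.
    target = rules.get(parts.path)
    if target is not None:
        return target

    # Pass 2: match on host name plus path, so that requests for a
    # different (sub)domain can be redirected too.
    host = parts.hostname or ""
    return rules.get(host + parts.path)


rules = {
    "/about.html": "/company/about",
    "old.mysite.com/folder/some-nice-url":
        "https://www.mysite.com/newfolder/some-new-nice-url",
}

print(find_redirect("http://old.mysite.com/folder/some-nice-url", rules))
print(find_redirect("https://www.mysite.com/about.html", rules))
```

Trying the path-only match first means existing rules keep behaving exactly as before, and the host-aware match only comes into play for requests that would otherwise fall through.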

With this in place, all I had to do was add the new "old site" binding in IIS to my Sitecore site and voila, the module handled requests for the old subdomain.

Problem Solved

With my items loaded into Sitecore, the ability to handle static file extensions and non-Sitecore site subdomains, I had reached my final destination on my redirect mission!

Q&A

Good question asked by Kamruz Jaman: Did you consider generating redirect rules for IIS Rewrite module directly?

The IIS Rewrite module was used for forcing SSL behind our AWS Elastic Load Balancer (see this post: http://stackoverflow.com/questions/19791820/redirect-to-https-through-url-rewrite-in-iis-within-elastic-beanstalks-load-bal) and for preventing font leeching. Our client had us work with a third party that delivered a redirect map in Excel format, about 6k entries, three weeks prior to launch. The old and new urls were vastly different, so mapping them would have required some very complex rewrite rules, and we would have ended up with a web.config 10 miles long. Also, tweaking things after launch (we still are) would be painful, because updating the rules through the IIS module updates the web.config which, as you know, causes a recycle.

This approach was the best solution for our situation.