Monday, January 21, 2019

Improving the Sitecore Broken Links Removal Tool



While working through an upgrade to Sitecore 9.1, I ran into broken link issues that couldn't be resolved using Sitecore's standard Broken Links Removal tool.

While searching the internet, I found that I wasn't the only one who had faced these types of issues.

In this post, I intend to walk you through the link problems that I ran into, and why I decided to create an updated Broken Links Removal tool to overcome the issues that the standard links removal tool wasn't able to resolve.

NOTE: The issues that I present in this post are not specific to version 9.1.  They exist in Sitecore versions going back to 8.x.

Exceptions after Upgrade Package Installation

After installing the 9.1 upgrade package and completing the post-installation steps of rebuilding the links database and publishing, I discovered that many of my site's pages had started throwing the following exception:

The model item passed into the dictionary is of type 'Castle.Proxies.IGlassBaseProxy', but this dictionary requires a model item of type 'My Custom Model'.

During the solution upgrade, I had upgraded Glass Mapper to version 5, so I thought that the issue could be related to this.  After digging in, I noticed that the items / pages that were throwing exceptions had broken links.  I determined this by turning on the Broken Links gutter in the Content Editor.

Next, I attempted to run the Broken Links Removal tool located at http://{your-sitecore-url}/sitecore/admin/RemoveBrokenLinks.aspx.

After it had run for several minutes, it threw the following exception:

ERROR Error looking up template field. Field id: {00000000-0000-0000-0000-000000000000}. Template id: {128ADD89-E6BC-4C54-82B4-A0915A56B0BD}
Exception: System.ArgumentException
Message: Null ids are not allowed.
Parameter name: fieldID
Source: Sitecore.Kernel
   at Sitecore.Diagnostics.Assert.ArgumentNotNullOrEmpty(ID argument, String argumentName)
   at Sitecore.Data.Templates.Template.DoGetField(ID fieldID, String fieldName, Stack`1 stack)
   at Sitecore.Data.Templates.Template.GetField(ID fieldID)

Digging In

I needed to understand why this exception was being thrown, so I started down the path of decompiling Sitecore's assemblies.  My starting point for reviewing the code was Sitecore.sitecore.admin.RemoveBrokenLinks.cs, which is the code-behind for the Broken Links Removal page.

I took all the code and pasted it into my own ASPX page so that I could set a breakpoint and debug what was going on.  After a lot of trial and error and a ton of logging, I discovered that the code that was throwing the error was in the FixBrokenLinksInDatabase method, on line 11 shown below:

If the source field ID (itemLink.SourceFieldID) on line 11 is null, the exception noted above will be thrown.  This ID identifies the field in which the tool has determined that there is a broken link.

The Cause of the Null Source Field

During my investigation, I determined that this field was null because the item had been created from a branch template that no longer existed.

To put this another way, the item represented as sourceItem in the code above (line 8) had a reference to a branch template that no longer existed, and the lookup for that item was returning a null source field.

Through my code logging and Content Editor validation, I found that we had a massive number of broken links caused by a developer deleting several EXM branch templates:

Searching Stack Exchange and the Sitecore Community turned up some useful information about this type of issue, and how to solve it manually by running a SQL query:

Now, to fix this problem automatically using the tool, I just needed to add a null check in the code, and also create a way to clean up the references to the invalid branch templates.
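To make those two changes concrete, here is a minimal, language-neutral sketch in Python. It is not the decompiled Sitecore code and not the code of the finished tool. The `BrokenLink` class, the function names, and the idea that each item record carries a single branch ID are all assumptions made for illustration, as is the choice to "clean up" a reference by resetting it to the null ID.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# The all-zero GUID that appears in the exception message above.
NULL_ID = "{00000000-0000-0000-0000-000000000000}"


def is_null_id(value: Optional[str]) -> bool:
    """Treat both a missing value and the all-zero GUID as 'null'."""
    return value is None or value == NULL_ID


@dataclass
class BrokenLink:
    """Hypothetical stand-in for a broken-link record (not a Sitecore type)."""
    source_item_id: str
    source_field_id: Optional[str]


def split_by_source_field(
    links: List[BrokenLink],
) -> Tuple[List[BrokenLink], List[BrokenLink]]:
    """The null check: only links with a real source field ID are safe to
    pass to a field lookup; the rest must be handled separately."""
    fixable = [l for l in links if not is_null_id(l.source_field_id)]
    fieldless = [l for l in links if is_null_id(l.source_field_id)]
    return fixable, fieldless


def clear_missing_branches(
    branch_by_item: Dict[str, Optional[str]], existing_ids: Set[str]
) -> List[str]:
    """The cleanup: reset any branch reference that points at an ID which no
    longer exists, and return the IDs of the items that were changed."""
    cleaned = []
    for item_id, branch_id in branch_by_item.items():
        if not is_null_id(branch_id) and branch_id not in existing_ids:
            branch_by_item[item_id] = NULL_ID  # assumed meaning of "clean up"
            cleaned.append(item_id)
    return cleaned
```

The point of splitting the links first is that the field lookup is never called with a null ID, so the run cannot abort partway through; the links without a source field are then the candidates for the branch reference cleanup.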

Improved Broken Links Tool

The outcome of my work was an improved Broken Links Removal tool that I call the "Broken Links Eraser".

The tool does everything that the Sitecore Broken Links Removal tool does, with the following improvements:

  • Detects and removes item references to branch templates that no longer exist.
  • Removes all invalid item field references to other items (inspects all fields that contain an id).
  • Allows you to target broken links under a specific path, so you don't have to run through every item in the target database. This is useful when working with large sets of content.
  • Provides detailed logging while it is running and feedback after it has completed.
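
As a rough illustration of the path targeting idea in the list above, the sketch below (Python, purely illustrative) keeps only the items at or beneath a chosen root path. The function names and the case-insensitive comparison are assumptions for the example, not a description of how the tool itself is implemented.

```python
from typing import Iterable, List


def is_under(path: str, root: str) -> bool:
    """True if `path` is the root itself or a descendant of it.

    Comparing against root + "/" avoids matching siblings that merely
    share a prefix, e.g. ".../Home" versus ".../HomePage".
    """
    p = path.rstrip("/").lower()
    r = root.rstrip("/").lower()
    return p == r or p.startswith(r + "/")


def scope_to_path(paths: Iterable[str], root: str) -> List[str]:
    """Keep only the item paths that fall inside the target subtree."""
    return [p for p in paths if is_under(p, root)]
```

For example, scoping to /sitecore/content/Home would keep /sitecore/content/Home/About but skip /sitecore/content/HomePage.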

The tool is built as a standalone ASPX page, so you can simply drop the file in your {webroot}/sitecore/admin folder to use it. No need to deploy assemblies and recycle app pools etc.

All updates were made using Sitecore's SqlDataApi, so the code is consistent with Sitecore's standards. The code is available on GitHub for you to download and modify as needed:

Final Thoughts

I hope that you find this tool useful in solving your broken link issues. Please feel free to add comments or contact me with any questions on either Sitecore Slack or Twitter.