Over the course of the last month, we ran into data inconsistencies between what was in our content databases compared to our Solr indexes.
We have content authors from around the globe and content creation happens around the clock by authors via the Experience Editor and imports via external sources.
Illegal Characters Causing Index Issues
As mentioned by this KB article https://kb.sitecore.net/articles/592127, index documents are submitted Solr in XML format, and if your content contains and “illegal” characters that cannot be converted to XML, all documents in the batch submission will fail.
When you perform an index rebuild or re-index a portion of your tree, Sitecore will submit 100 documents in a batch to Solr. How is the related to the character issue? If you perform an index rebuild and have a single bad character in one of your items in the batch, none of the 100 docs in that batch will make it into your Solr index.
What makes this especially difficult to troubleshoot is that item batches contain different items every time. So, what could be missing from your index during one rebuild, could be different during the next rebuild.
There is a good Stack Exchange article that explains all of this, and kudos to Anders Laub who provides a pretty decent fix for this issue: https://sitecore.stackexchange.com/questions/18832/wildly-inconsistent-index-data-after-rebuilds
PowerShell Index Item Check Script
There are several other reasons why content could be missing from your Sitecore Indexes, and so I needed to come up with a way to identify would could be missing.
PowerShell Extensions for the win!
I decided to create a PowerShell script to do just that - check for items in a selected target database that are missing in selected index, and produce a downloadable report.
What’s nice is that I strapped on an interactive dialog making it friendly for Authors or DevOps to make their comparison selections.
If you are newish to PowerShell Extensions, this could also help you understand how powerful it truly is, and serve as a guide to build your own scripts that you can use daily!
0 comments:
Post a Comment