Monday, August 31, 2015

Setting up Sitecore's Geolocation Lookup Services in a Production Environment

Background

We have been working with Sitecore’s Business Optimization Services (SBOS) team for quite some time, helping one of our clients stretch the legs of the Experience Platform.

One of the tasks on the list included setting up Sitecore's Geolocation Service so that we could personalize based on the visitor's location. The SBOS team had some pretty slick rules set up on the Home Page of a site, where they switched out carousel slides and marketing content spots based on the visitor's location.


Sitecore Geolocation Lookup Service and MaxMind

There have been some changes with regard to the Geolocation / MaxMind setup because Sitecore launched IP Geolocation directly to its customers via the Sitecore App Center, instead of having them go through MaxMind to use the service. Here is a link to Sitecore's documentation site, which contains setup instructions and explains how to migrate from MaxMind if you purchased the service from them and have been working with them directly in the past:
https://doc.sitecore.net/Sitecore%20Experience%20Platform/Analytics/Setting%20up%20Sitecore%20IP%20Geolocation

Setup

After sifting through the documentation, we highlighted the following steps that needed to be completed in order to get up and running, and validate that things were indeed working:

  1. Download and install the Geolocation client package from https://dev.sitecore.net/Downloads.aspx
  2. Enable ALL configuration files in the CES folder
  3. Whitelist geoIp-ces.cloud.sitecore.net and discovery-ces.cloud.sitecore.net
  4. Test personalization
NOTE: Depending on your firewall, you may only have the option to whitelist by IP address. If this applies to you, you will need to obtain the list of Azure IP addresses from the following link: Azure Datacenter IP Address Ranges. This is what happened to us, and I can tell you that the list is looooooooooooooong!!! Your network guy or gal will hate you!
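
If you do end up whitelisting by IP address, it may help to merge adjacent and overlapping ranges before handing the list to your network team. Here is a minimal sketch in Python using the standard `ipaddress` module. The function name `collapse_ranges` is hypothetical, and the sample ranges are placeholder documentation addresses (RFC 5737), NOT real Azure ranges:

```python
import ipaddress


def collapse_ranges(cidrs):
    """Merge adjacent or overlapping CIDR blocks into the fewest equivalent blocks."""
    networks = [ipaddress.ip_network(c.strip()) for c in cidrs if c.strip()]
    # collapse_addresses() needs a single IP version per call, so split first.
    v4 = [n for n in networks if n.version == 4]
    v6 = [n for n in networks if n.version == 6]
    collapsed = list(ipaddress.collapse_addresses(v4))
    collapsed += list(ipaddress.collapse_addresses(v6))
    return [str(n) for n in collapsed]


# Placeholder ranges for illustration only (assumption, not Azure data).
sample = [
    "192.0.2.0/25",
    "192.0.2.128/25",     # adjacent to the block above
    "198.51.100.0/24",
    "198.51.100.64/26",   # already covered by the /24 above
]
print(collapse_ranges(sample))
```

How much this shortens the real list depends entirely on how contiguous the published ranges are, so treat it as a convenience rather than a guarantee.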

Unfortunately for us, one piece of configuration wasn't documented, and it happened to be one of the most important pieces.

You will know what I am referring to as you read further along.

Testing

With all the pieces in place, we started testing our personalization.

Things worked beautifully on our Staging Server, but Production was a non-starter! So, as good detectives, we started our troubleshooting by looking at the differences between Staging and Production.

Load Balancer / Firewall / CDN woes

Our client uses Incapsula to protect their production websites. It does a great job protecting and caching their various sites to ensure optimal performance. It has, however, given us some grey hairs in the past when dealing with Sitecore's Federated Experience Manager. But that's a story for another time.

The Incapsula CDN was the main difference between Staging and Production.

After running several tests with Fiddler and capturing packets using Wireshark, we were able to determine that the Geolocation service was not obtaining the visitor's actual IP address. Instead, it was passing along Incapsula's IP address.

The reason for this was identified in the CreateVisitProcessor within the CreateVisits analytics pipeline: it was passing along the Request.UserHostAddress value.


This doesn't work when you are behind a load balancer or proxy, as described by this article: http://stackoverflow.com/questions/200527/userhostaddress-gives-wrong-ips

Digging further, we discovered another interesting processor in the CreateVisits pipeline called XForwardedFor. Aha! As we know, the X-Forwarded-For "...header field is a de facto standard for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or load balancer."

Looking at the decompiled code, you will notice that it pulls in a setting and, if the setting is not empty, uses it as the key to obtain the value from the request's header NameValueCollection.
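
To make that logic concrete, here is a minimal, language-neutral sketch in Python. It is NOT Sitecore's actual C# code. The function name `resolve_client_ip` and its parameters are hypothetical, and taking the left-most entry of a comma-separated header value is an assumption based on the usual X-Forwarded-For convention, not something confirmed from the decompiled processor:

```python
def resolve_client_ip(remote_addr, headers, forwarded_header_name=""):
    """Pick the visitor IP: the forwarded header if configured, else the peer address.

    remote_addr           -- the address the web server sees (the proxy, when behind one)
    headers               -- request headers as a plain dict (exact-key lookup)
    forwarded_header_name -- the configured header name; "" means disabled
    """
    # Setting is empty: behave as if no proxy is involved.
    if not forwarded_header_name:
        return remote_addr

    raw = headers.get(forwarded_header_name, "")
    if not raw:
        return remote_addr

    # The header may hold "client, proxy1, proxy2"; by convention the
    # originating client comes first (assumption, see lead-in).
    first = raw.split(",")[0].strip()
    return first or remote_addr


# Placeholder documentation addresses (RFC 5737), for illustration only.
proxy_ip = "198.51.100.10"
request_headers = {"X-Forwarded-For": "203.0.113.7, 198.51.100.10"}

print(resolve_client_ip(proxy_ip, request_headers))                     # setting disabled
print(resolve_client_ip(proxy_ip, request_headers, "X-Forwarded-For"))  # setting enabled
```

With the setting disabled, the sketch returns the proxy's address, which mirrors what we saw in Production: every visitor appeared to come from Incapsula.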



After digging around and talking to support, we discovered the config file named Sitecore.Analytics.Tracking.config and the setting below:

 <!-- ANALYTICS FORWARDED REQUEST HTTP HEADER  
    Specifies the name of an HTTP header variable containing the IP address of the webclient.  
    Only for use behind load-balancers that mask web client IP addresses from webservers.  
    IMPORTANT: If this setting is used incorrectly, it allows IP address spoofing.  
    Typical values are "X-Forwarded-For" and "X-Real-IP".  
    Default value: "" (disabled)  
 -->  
 <setting name="Analytics.ForwardedRequestHttpHeader" value="" />  
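
The spoofing warning in that comment is worth taking seriously: anyone can send an X-Forwarded-For header of their own. One common mitigation, sketched below in Python, is to honor the header only when the request arrives directly from a known proxy, and then to walk the list from right to left until the first address that is not one of your proxies. This is a general technique, not a description of what Sitecore does. The function names and the trusted range are hypothetical placeholders:

```python
import ipaddress


def _is_trusted(ip_text, trusted_networks):
    """Return True if ip_text parses as an IP inside one of the trusted networks."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    return any(ip in net for net in trusted_networks)


def client_ip_behind_trusted_proxy(remote_addr, forwarded_value, trusted_cidrs):
    """Use the forwarded header only when the direct peer is a known proxy."""
    trusted = [ipaddress.ip_network(c) for c in trusted_cidrs]
    if not forwarded_value or not _is_trusted(remote_addr, trusted):
        return remote_addr  # header absent, or sent by an unknown peer: ignore it

    # Walk right to left: entries appended by our own proxies come last.
    for entry in reversed([p.strip() for p in forwarded_value.split(",")]):
        try:
            ipaddress.ip_address(entry)
        except ValueError:
            return remote_addr  # malformed entry: fall back to the peer address
        if not _is_trusted(entry, trusted):
            return entry
    return remote_addr


# Placeholder proxy range (assumption); substitute your CDN's published ranges.
proxies = ["198.51.100.0/24"]
print(client_ip_behind_trusted_proxy("198.51.100.10", "203.0.113.7", proxies))
print(client_ip_behind_trusted_proxy("203.0.113.99", "10.1.1.1", proxies))
```

In practice this kind of check often lives at the firewall or CDN layer instead, for example by only accepting traffic from the CDN's addresses in the first place.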

Light at the end of the tunnel

After setting the value to "X-Forwarded-For" as shown below, the magical Geolocation-based personalization started working like a champ!

<setting name="Analytics.ForwardedRequestHttpHeader" value="X-Forwarded-For" />

NOTE: In our testing, casing mattered when setting the value. "X-FORWARDED-FOR" did NOT work for us. It needs to be set exactly as I have it above. For more information on this, you can read this Stack Overflow article:
http://stackoverflow.com/questions/11616964/is-request-headersheader-name-in-asp-net-case-sensitive


I hope that this information helps make your Sitecore IP Geolocation configuration go smoothly for your environment!

A special thanks to Kyle Heon from Sitecore for his support through this process.

2 comments:

Mike LeV said...

I think the "X-Forwarded-For" config setting goes at least back to the Sitecore 7 DMS days. Probably OMS too.

Martin English said...

Mike, thanks for the note. That may be the case. Unfortunately, we were not made aware of that config setting early on when working with the Sitecore team. As a result, I decided to get my hands dirty and decompile code to understand what could be going wrong. At the very least, I learnt a lot from this experience and wanted to share my story with others who might run into the same configuration scenario.
Cheers!
