Thursday, August 16, 2018

Sitecore GeoIP - What Is Happening Under The Hood In 8.x?

Standard

Background

Most posts explain how Sitecore's GeoIP service works from a high-level point of view.

In this post, I intend to take the explanation a few steps deeper, so that developers can understand all the pieces that make this process work. The goal is to arm developers with the necessary details to successfully troubleshoot a problem if one arises.


Visitor Interaction - Start of visitor's session

  • Visitor visits Sitecore website, and this is regarded as a new interaction. Sitecore's definition of an interaction is ".. any point at which a contact interfaces with a brand, either online or offline". In our case, this is a new visitor session on the website.

  • Sitecore runs the CreateVisits pipeline. Within this pipeline, there is a processor called UpdateGeoIpData that fires a method called GeoIpManager.GetGeoIpData within Sitecore.Analytics.Tracking.CurrentVisitContext that initiates the GeoIP lookup for the visitor's interaction.

  • Within the GeoIP lookup logic, Sitecore will use the visitor's IP address to generate a unique identifier (GUID) based on the visitor's IP address. Eg. 192.168.1.100 => fd747022-dd48-b1ca-1312-eb4ba55030b2. 

NOTE: Sitecore performs all GeoIP lookups using this unique identifier. You can see this id by looking inside your MongoDB's GeoIPs collection. The field is named _id and this is the unique naming convention that MongoDB uses across all of its content. See my previous post for a snapshot.

  • Sitecore performs a GeoIP data lookup in memory cache.

  • If the GeoIP data IS in memory cache, then it will attach it to the visitor's interaction.

  • If the GeoIP data IS NOT in memory cache, it performs a GeoIP lookup in the MongoDB Analytics database's GeoIps collection.

  • If the GeoIP data IS in the MongoDB Analytics database's GeoIps collection, it attaches it to the visitor's interaction and stores the result in memory cache.

  • If the GeoIP data IS NOT in the GeoIps collection, it performs a lookup using the Sitecore Geolocation service and stores the result in memory cache and attaches it to the visitor's interaction.

NOTE: After a successful lookup, the GeoIP data is stored in the Tracker.Current.Interaction.GeoData (ContactLocation class)

GeoIP Data Cache

  • When the GeoIP data is obtained, it is added to a dictionary object that is part of the Sitecore Tracker so that it can be referenced via the Tracker.Current.Interaction.GeoData (shown above).

  • The odd thing that I noticed was that the cache expiration was set to 10 seconds (by default)
          Code reference:
          Sitecore.Analytics.Data.Dictionaries.TrackingDictionary
          private readonly TimeSpan defaultCacheExpirationTimeout = TimeSpan.FromSeconds(10.0);

GeoIP Data - End of visitor's session

  • At the end of the visitor's interaction / session, Sitecore will run the CommitSession pipeline.

  • Like the CreateVisits pipeline, there is a processor called UpdateGeoIpData that fires a method called GeoIpManager.GetGeoIpData (with the exact same code as in the CreateVisits pipeline). This initiates the GeoIP lookup flow once again (Cache / MongoDB / GeoIP Service).

  • Seems like the intention here is to confirm the visitor's GeoData before storing the data in MongoDB that will ultimately make it's way to the reporting database.

More To Come

Next, I intend to dig into Sitecore's GeoIP code for the 9.x series, and talk about the differences identified in that implementation.