Thursday, July 23, 2015

FXM Experience Editor - Cleaning Up Problematic External Content


Background

I have been using Federated Experience Manager (FXM) a lot, and it has worked extremely well in almost all of our client deployments.




I recently ran into an issue on one of our clients' production instances, where an external site would not load in the FXM Experience Editor. We could see the page start loading, and then after a few seconds it would stop, leaving a blank page with a little bit of script. This left us scratching our heads for quite some time.

Using Fiddler's AutoResponder feature, we eventually determined that if we disabled an Adobe Dynamic Tag Management script library, the external site loaded successfully in the Experience Editor.

Our problematic script:






Our client made it very clear that removing the script from the external site was not an option. So, we needed to find a way to remove the script from the external site when loading it in the Experience Editor only.

Finding the Sweet Spot

Almost certain that there had to be a pipeline that I could hook into, I armed myself with the instance's "showconfig" output and my favorite decompiler (JetBrains dotPeek), and started digging around until I discovered the following pipeline:

<content.experienceeditor>
 <processor type="Sitecore.FXM.Client.Pipelines.ExperienceEditor.ExternalPage.GetExternalPageContentProcessor, Sitecore.FXM.Client"/>
 <processor type="Sitecore.FXM.Client.Pipelines.ExperienceEditor.ExternalPage.UpdateBeaconScriptPathProcessor, Sitecore.FXM.Client"/>
 <processor type="Sitecore.FXM.Client.Pipelines.ExperienceEditor.ExternalPage.InjectControlsProcessor, Sitecore.FXM.Client"/>
 <processor type="Sitecore.FXM.Client.Pipelines.ExperienceEditor.ExternalPage.AddPlaceholderData, Sitecore.FXM.Client"/>
</content.experienceeditor>

This looked very promising indeed! The next step was to crack open the GetExternalPageContentProcessor.

Voila, I found exactly what I was looking for: the point at which FXM grabs the content from the external site and sticks it into an argument that the rest of the processors can access (lines 19 & 20):

1:  public void Process(ExternalPageExperienceEditorArgs args)  
2:    {  
3:     Assert.ArgumentNotNull((object) args, "args");  
4:     Assert.ArgumentNotNull((object) args.MatcherContextItem, "MatcherContextItem");  
5:     Assert.ArgumentNotNull((object) args.ExperienceEditorUrl, "ExperienceEditorUrl");  
6:     string externalPageUrl = this.GetExternalPageUrl(args);  
7:     if (string.IsNullOrEmpty(externalPageUrl))  
8:     {  
9:      args.AbortPipeline();  
10:     }  
11:     else  
12:     {  
13:      string experienceEditorUrl = this.GetBaseExperienceEditorUrl(args);  
14:      HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Get, externalPageUrl);  
15:      request.Headers.Add("FxmReferrer", (IEnumerable<string>) new string[1]  
16:      {  
17:       experienceEditorUrl  
18:      });  
19:      HttpResponseMessage httpResponseMessage = this.externalSiteWebProxy.MakeRequest(string.Format("{0}&url={{0}}", (object) experienceEditorUrl), request);  
20:      args.ExternalPageContent = httpResponseMessage.Content.ReadAsStringAsync().Result;  
21:     }  
22:    }  

FXM External Content Sanitizer Processor

Yes, the name is a mouthful!

Knowing that this type of problem was bound to pop up again, I decided to write a custom processor that accepts a series of regular expressions from configuration and uses them to strip out any problematic content that prevents the external site from loading in the FXM Experience Editor.

Processor

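using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;
using Sitecore.Configuration;
using Sitecore.Diagnostics;
using Sitecore.Xml;
// Also import the Sitecore.FXM namespace that declares
// IExternalPageExperienceEditorProcessor and ExternalPageExperienceEditorArgs.

public class SanitizeContent : IExternalPageExperienceEditorProcessor
{
    // Config node that holds the <sanitizeRegex value="..."/> entries
    private const string SanitizeNode = "/sitecore/fxmSanitizeExternalContent";

    public void Process(ExternalPageExperienceEditorArgs args)
    {
        Assert.ArgumentNotNull(args, "args");
        Assert.ArgumentNotNull(args.ExternalPageContent, "ExternalPageContent");

        // Apply each configured expression in turn, removing whatever it matches
        foreach (var regex in GetRegexList())
        {
            args.ExternalPageContent = Regex.Replace(args.ExternalPageContent, regex, string.Empty);
        }
    }

    /// <summary>
    /// Returns list of strings containing regular expressions that have been set in configuration
    /// </summary>
    private static IEnumerable<string> GetRegexList()
    {
        var regexList = new List<string>();
        var configNode = Factory.GetConfigNode(SanitizeNode);

        // Nothing to do if the config node has not been patched in
        if (configNode == null)
        {
            return regexList;
        }

        foreach (XmlNode childNode in configNode.ChildNodes)
        {
            // Skip comments and whitespace between the entries
            if (childNode.NodeType != XmlNodeType.Element)
            {
                continue;
            }

            var value = XmlUtil.GetAttribute("value", childNode);
            if (!string.IsNullOrEmpty(value))
            {
                regexList.Add(value);
            }
        }
        return regexList;
    }
}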

Configuration

You can duplicate line 4 and add as many regular expressions as you need. In my configuration, I added a regular expression (thanks to Andy Uzick for the help) to strip out any script tag that contains the string "adobedtm".

1:  <configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">  
2:   <sitecore>  
3:    <fxmSanitizeExternalContent>  
4:     <sanitizeRegex value="&lt;script[^&lt;]*(adobedtm)[\s\S]*?&lt;/script&gt;"/>  
5:    </fxmSanitizeExternalContent>  
6:    <pipelines>  
7:     <group groupName="FXM" name="FXM">  
8:      <pipelines>  
9:       <content.experienceeditor>  
10:        <processor type="MyProject.Domain.Pipelines.ExperienceEditor.ExternalPage.SanitizeContent, MyProject.Domain" patch:after="processor[@type='Sitecore.FXM.Client.Pipelines.ExperienceEditor.ExternalPage.GetExternalPageContentProcessor, Sitecore.FXM.Client']" />  
11:       </content.experienceeditor>  
12:     </pipelines>  
13:    </group>  
14:   </pipelines>  
15:  </sitecore>  
16: </configuration>  
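
To see what that expression actually removes, here is a minimal standalone sketch that runs the same pattern (with the XML entities decoded) against a small piece of markup. The SanitizeRegexDemo class and the script URLs in the usage note are made up for illustration; they are not part of the original solution.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, only for trying the pattern outside of Sitecore.
public static class SanitizeRegexDemo
{
    // Same pattern as the sanitizeRegex value in the config, with &lt; and &gt; decoded.
    public const string Pattern = @"<script[^<]*(adobedtm)[\s\S]*?</script>";

    // Removes every script element whose opening tag mentions "adobedtm".
    public static string Strip(string html)
    {
        return Regex.Replace(html, Pattern, string.Empty);
    }
}
```

Calling SanitizeRegexDemo.Strip on markup that contains one script tag pointing at an "adobedtm" URL and one pointing at a local file removes only the first. Because `[^<]*` cannot cross into the next tag, the match has to find "adobedtm" inside the opening script tag itself. The pattern is case sensitive, so an upper-case SCRIPT tag would need RegexOptions.IgnoreCase.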

Problem solved!

With the script removed, the external site loaded up in the FXM Experience Editor, and we were able to complete the tasks that we had originally set out to do.

I hope that this helps others who run into the same issue.




Sunday, July 19, 2015

Managing Countries and Regions within the same site (Part 2) : The Region Resolver

With the content in place, the next step was to be able to tie the specific site instances to their respective countries or regions. We wanted to keep things nice and clean, and decided to add 2 additional attributes to the site definitions within our config file:

  1. siteRegion - The item ID of the region content item. These items were located in the global / shared location within our site. See part 1 for more information.

  2. pageNotFound - The path within our site of a "page not found" page, to which we redirect users who try to access a page that doesn't match their country or region.

So, our site definition looked something like this:

 <sites>  
    <site name="MySiteUSA" patch:before="site[@name='website']"  
       virtualFolder="/"  
       physicalFolder="/"  
       rootPath="/sitecore/content"  
       startItem="/MySite"  
       database="master"  
       domain="extranet"  
       allowDebug="true"  
       cacheHtml="true"  
       htmlCacheSize="10MB"  
       enablePreview="true"  
       enableWebEdit="true"  
       enableDebugger="true"  
       disableClientData="false"  
       hostName="*MySiteusa.com"  
       siteRegion="{B4F05B9C-1A40-4B95-AC03-C2115E7CA448}"  
       pageNotFound="/Page-Not-Found"  
       />  
    <site name="MySiteCA" patch:before="site[@name='website']"  
       virtualFolder="/"  
       physicalFolder="/"  
       rootPath="/sitecore/content"  
       startItem="/MySite"  
       database="master"  
       domain="extranet"  
       allowDebug="true"  
       cacheHtml="true"  
       htmlCacheSize="10MB"  
       enablePreview="true"  
       enableWebEdit="true"  
       enableDebugger="true"  
       disableClientData="false"  
       hostName="*MySitecda.com"  
       siteRegion="{A71B1C90-107B-4740-9913-8E683B019BF7}"  
       pageNotFound="/Page-Not-Found"  
       />  
   </sites>  

Quick note:
This site definition is not meant for a production instance. You obviously want to increase the cache size and turn off things like debugging in production.

Next, we added a simple class that exposes our site's context, as well as the above-mentioned attributes from the configuration XML, through its properties.

When looking at the code below (starting at line 44), you will notice that we are attempting to pull the region object out of cache if it exists. We added a caching layer, because we wanted to ensure that this processor would be as fast as possible. You will see this being used a lot in our processor code that follows.

1:  public class SiteRegion  
2:    {  
3:      private const string SiteNode = "/sitecore/sites";  
4:      private XmlNode CurrentSiteNode  
5:      {  
6:        get  
7:        {  
8:          XmlNode targetParamsNode = Factory.GetConfigNode(SiteNode);  
9:          var currentSiteContext = Context.Site;  
10:          foreach (XmlNode childNode in targetParamsNode.ChildNodes)  
11:          {  
12:            if (XmlUtil.GetAttribute("name", childNode)  
13:              .Equals(currentSiteContext.Name, StringComparison.InvariantCultureIgnoreCase))  
14:            {  
15:              return childNode;  
16:            }  
17:          }  
18:          return null;  
19:        }  
20:      }  
21:      public string CacheKey  
22:      {  
23:        get  
24:        {  
25:          return Context.Site.Name;  
26:        }  
27:      }  
28:      public SiteContext Site  
29:      {  
30:        get { return Context.Site; }  
31:      }  
32:      public string Region  
33:      {  
34:        get { return XmlUtil.GetAttribute("siteRegion", CurrentSiteNode); }  
35:      }  
36:      public string PageNotFoundUrl  
37:      {  
38:        get { return XmlUtil.GetAttribute("pageNotFound", CurrentSiteNode); }  
39:      }  
40:      public static string CurrentRegion  
41:      {  
42:        get  
43:        {  
44:          var siteRegion = CacheHelper.RegionCache.GetObject(Context.Site.Name) as SiteRegion;  
45:          if (siteRegion == null)  
46:          {  
47:            siteRegion = new SiteRegion();  
48:            CacheHelper.RegionCache.SetObject(Context.Site.Name, siteRegion);  
49:          }  
50:          return siteRegion.Region;  
51:        }  
52:      }  
53:    }  
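
The CacheHelper.RegionCache used above isn't shown in this post. As a rough idea of the shape it needs, here is a minimal sketch that assumes a plain in-memory dictionary. The SimpleObjectCache class is hypothetical, and the real helper may well wrap one of Sitecore's own caches instead.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the cache type behind CacheHelper.RegionCache.
public sealed class SimpleObjectCache
{
    // Site names are compared case-insensitively, so keys are too.
    private readonly ConcurrentDictionary<string, object> _entries =
        new ConcurrentDictionary<string, object>(StringComparer.OrdinalIgnoreCase);

    // Returns the cached object, or null when the key has not been set.
    public object GetObject(string key)
    {
        object value;
        return _entries.TryGetValue(key, out value) ? value : null;
    }

    // Adds the object, replacing any existing entry for the same key.
    public void SetObject(string key, object value)
    {
        _entries[key] = value;
    }
}

public static class CacheHelper
{
    // One shared cache instance, keyed by site name.
    public static readonly SimpleObjectCache RegionCache = new SimpleObjectCache();
}
```

A dictionary like this never expires or clears its entries, which is fine for values that come from configuration (a config change restarts the application anyway), but it would not be a good fit for caching content items.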

Region Resolver Processor

Next, we built out the region resolver pipeline processor to check if an item was meant for a specific country / region.

Sitecore's pipeline args expose a handy SafeDictionary through the CustomData property, which is useful for passing custom data down the pipeline. This was ideal for passing a message along to the Region Page Not Found processor (next up), telling it whether the "page not found" page should be displayed or not (line 26).

Lines 15 and 16 check whether the context item has a Region field and, if it does, whether that field contains the site's region.

1:  public class RegionResolver : HttpRequestProcessor  
2:    {  
3:      public override void Process(HttpRequestArgs args)  
4:      {  
5:        Assert.ArgumentNotNull(args, "args");  
6:        var showRegionPage = true;  
7:        var siteRegion = CacheHelper.RegionCache.GetObject(Context.Site.Name) as SiteRegion;  
8:        if (siteRegion == null)  
9:        {  
10:          siteRegion = new SiteRegion();  
11:          CacheHelper.RegionCache.SetObject(Context.Site.Name, siteRegion);  
12:        }  
13:        if (Context.Item != null && siteRegion.Site.HostName.IsNotEmpty())  
14:        {  
15:          if (Context.Item.Fields["Region"] != null &&  
16:            !Context.Item.Fields["Region"].Value.Contains(siteRegion.Region))  
17:          {  
18:            showRegionPage = false;  
19:          }  
20:        }  
21:        if (showRegionPage)  
22:        {  
23:          return;  
24:        }  
25:        //Add entry to safe dictionary to tell region page not found processor to redirect to page not found  
26:        args.CustomData.Add("hideRegionPage", true);  
27:        var notFoundProcessor = new RegionPageNotFound();  
28:        notFoundProcessor.Process(args);  
29:      }  
30:    }  
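
The check on lines 15 and 16 uses a plain substring match. As a minimal sketch of a slightly stricter variant, assuming the Region field stores pipe-delimited item IDs the way Sitecore's multi-value fields do, the decision could be pulled into a small helper that is easy to test on its own. The RegionMatcher class is hypothetical and not part of the original code.

```csharp
using System;
using System.Linq;

// Hypothetical helper that isolates the "is this page meant for this region?" decision.
public static class RegionMatcher
{
    public static bool IsVisibleInRegion(string regionFieldValue, string siteRegionId)
    {
        // The item has no Region field, or the site has no siteRegion attribute:
        // show the page, as the processor above does in both cases.
        if (regionFieldValue == null || string.IsNullOrEmpty(siteRegionId))
        {
            return true;
        }

        // Assumption: the field holds IDs separated by '|'.
        var ids = regionFieldValue.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);

        // Compare whole IDs, ignoring case, instead of matching substrings.
        return ids.Any(id => string.Equals(id.Trim(), siteRegionId, StringComparison.OrdinalIgnoreCase));
    }
}
```

With this in place, the condition in the processor would become a call along the lines of RegionMatcher.IsVisibleInRegion(field value, siteRegion.Region). As in the original, an item whose Region field exists but is empty is treated as not belonging to any region and is hidden.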

Region Page Not Found Processor

One of the things that we wanted was a single bit of "page not found" logic that could handle both the normal page not found / 404s for our sites and the pages that were not supposed to be displayed for a given country / region.

So, we wrote a processor that would be able to catch both.

Before looking at the code, there are a few things to note:
  1. On line 7, we are checking to see if our Region Resolver has told us to hide the context item via the "message" in the safe dictionary object.
  2. If you have defined custom MVC routes, you would need to check for those in the pipeline and allow them to be processed (line 31).
  3. This processor would do the job of redirecting visitors to our page not found path, set in our site definition that was mentioned above (line 47).

1:  public class RegionPageNotFound : HttpRequestProcessor  
2:    {  
3:      public override void Process(HttpRequestArgs args)  
4:      {  
5:        Assert.ArgumentNotNull(args, "args");  
6:        //Check for safe dictionary object indicating that page needs to be "hidden"  
7:        var hideRegionPage = args.CustomData.ContainsKey("hideRegionPage");  
8:        if (!hideRegionPage &&  
9:          (Context.Item != null  
10:          || Context.Site == null  
11:          || Context.Site.Name.Equals("shell", StringComparison.CurrentCultureIgnoreCase)  
12:          || Context.Site.Name.Equals("website", StringComparison.CurrentCultureIgnoreCase)  
13:          || Context.Database == null  
14:          || Context.Database.Name.Equals("core", StringComparison.CurrentCultureIgnoreCase)  
15:          || string.IsNullOrEmpty(Context.Site.VirtualFolder)  
16:          ))  
17:        {  
18:          return;  
19:        }  
20:        // The path in the requested URL.  
21:        var filePath = Context.Request.FilePath.ToLower();  
22:        if (string.IsNullOrEmpty(filePath)  
23:          || WebUtil.IsExternalUrl(filePath)  
24:          || System.IO.File.Exists(HttpContext.Current.Server.MapPath(filePath)))  
25:        {  
26:          return;  
27:        }  
28:        //Api path checks  
29:        var uri = HttpContext.Current.Request.Url.AbsoluteUri;  
30:        if (uri.Contains("sitecore/api")  
31:          || uri.Contains("api/mycustomroute"))  
32:        {  
33:          return;  
34:        }  
35:        var siteRegion = CacheHelper.RegionCache.GetObject(Context.Site.Name) as SiteRegion;  
36:        if (siteRegion == null)  
37:        {  
38:          siteRegion = new SiteRegion();  
39:          CacheHelper.RegionCache.SetObject(Context.Site.Name, siteRegion);  
40:        }  
41:        // Redirect to the "page not found" path. Note that Response.Redirect issues a 302, which replaces the 404 status code set on line 46  
42:        if (!string.IsNullOrEmpty(siteRegion.Region) && !string.IsNullOrEmpty(siteRegion.PageNotFoundUrl))  
43:        {  
44:          var ctx = HttpContext.Current;  
45:          ctx.Response.TrySkipIisCustomErrors = true;  
46:          ctx.Response.StatusCode = 404;  
47:          ctx.Response.Redirect(siteRegion.PageNotFoundUrl);  
48:          ctx.ApplicationInstance.CompleteRequest();  
49:        }  
50:      }  
51:    }  
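
The API path check on lines 30 and 31 is a case-sensitive substring match against the whole URL. If more custom routes get added over time, one option is to keep the prefixes in a single list and compare against the URL's path only. This is a minimal sketch of that idea, not the original implementation; the RouteBypass class is hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper that decides whether a request should skip the "page not found" logic.
public static class RouteBypass
{
    // Path prefixes that must be left alone; the second one mirrors the custom route above.
    private static readonly string[] BypassedPrefixes =
    {
        "/sitecore/api",
        "/api/mycustomroute"
    };

    public static bool ShouldBypass(Uri requestUrl)
    {
        // Only look at the path, so host names and query strings cannot cause a false match.
        var path = requestUrl.AbsolutePath;
        return BypassedPrefixes.Any(prefix => path.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));
    }
}
```

In the processor, this would be called with HttpContext.Current.Request.Url. Unlike the original Contains check, it matches only when the path starts with one of the prefixes, so a URL such as /products/api/mycustomroute would still go through the "page not found" logic.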

Hooking into the HttpBeginRequest Pipeline

The final piece of this puzzle was to add our new processors to the httpRequestBegin pipeline, after the ItemResolver processor.

Here is what the config file looked like that would make the magic happen:

 <pipelines>  
  <httpRequestBegin>  
   <processor  
        type="MyProject.Library.Pipelines.HttpRequestBegin.RegionResolver, MyProject.Library"  
        patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']"/>  
   <processor  
        type="MyProject.Library.Pipelines.HttpRequestBegin.RegionPageNotFound, MyProject.Library"  
        patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']"/>  
  </httpRequestBegin>  
 </pipelines>  

Order is important here because, as I mentioned, we want the Region Resolver processor to be able to tell the Region Page Not Found processor whether or not the page should be displayed for the context site.

Next Up

In Part 3, I am going to demonstrate how we were able to "regionize" renderings on the sites by building a custom condition for the Sitecore Rules Engine.