Saturday, July 19, 2014
After working on many multisite Sitecore configurations, it has always surprised me that Sitecore doesn't provide a way to specify a different 404 page, 500 page and robots.txt for each site. In the past I've implemented these pieces of functionality separately. I've now combined the three into a single module that handles all three cases. It has been released on GitHub and NuGet and is available for use now.
This module is hosted on NuGet and can be installed in Visual Studio from the Package Manager Console using the following command:
PM> Install-Package Sitecore.MultisiteHttpModule
All you need to do then is manually edit your Global.asax.cs to include the call to the error handling code:
public void Application_Error(object sender, EventArgs args)
{
    Sitecore.MultisiteHttpModule.Errors.ErrorHandler.HandleError(Server.GetLastError());
}
<httpRequestBegin>
  <processor type="Sitecore.MultisiteHttpModule.NotFound.NotFoundHandler, Sitecore.MultisiteHttpModule" patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']" />
</httpRequestBegin>
<configSections>
  <section name="multisiteHttpModule" type="Sitecore.MultisiteHttpModule.Configuration.MultisiteHttpModuleSettings, Sitecore.MultisiteHttpModule" />
</configSections>
<multisiteHttpModule defaultErrorPage="/error.html" errorsEnabled="True" notFoundEnabled="True">
  <exclude404Rules>
    <add type="Contains" match="~/media/" />
    <add type="Contains" match="~/icon/" />
    <add type="Contains" match="~/link.aspx" />
    <add type="StartsWith" match="/sitecore" />
    <add type="Contains" match=".asmx" />
    <add type="Contains" match=".ashx" />
  </exclude404Rules>
</multisiteHttpModule>
<system.webServer>
  <handlers>
    <add name="RobotsHandler" path="/robots.txt" verb="GET" type="Sitecore.MultisiteHttpModule.Robots.RobotsHandler, Sitecore.MultisiteHttpModule" />
  </handlers>
</system.webServer>
After installing the module, you need to add the new attributes to your existing Sitecore site nodes. If you don't want one of the functions for a site, leave that attribute out and the code will fall back to the default Sitecore functionality.
<site name="MySite"
...
notFoundPageId="{BFA65433-4552-4507-9375-C96137287640}"
errorPagePath="/errors/MySite.html"
robotsTxtLocation="/robots/MySite.robots.txt" />
The 404 page functionality is configured by populating the notFoundPageId attribute on the site node with the ID of the item you wish to use as the 404 page. If the item isn't found, the module falls back to the default Sitecore 404 functionality.
If you want to exclude certain URLs from being processed by the 404 handler, use the exclude404Rules configuration section in the web.config. Each rule checks the URL for a string value, and you can test whether the URL Contains, StartsWith or EndsWith that value. The default rules are as follows:
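As a rough illustration of that fallback, the sketch below shows one way a notFoundPageId value could be validated before it is used. NotFoundSettings and TryGetNotFoundPageId are hypothetical names, and the dictionary simply stands in for the attributes on a site node. This is not the module's actual code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Sitecore.MultisiteHttpModule.
public static class NotFoundSettings
{
    // Returns true only when the site defines a well-formed, non-empty notFoundPageId.
    // A false result means the caller should leave Sitecore's default 404 handling alone.
    public static bool TryGetNotFoundPageId(IDictionary<string, string> siteAttributes, out Guid pageId)
    {
        pageId = Guid.Empty;

        string raw;
        if (siteAttributes == null || !siteAttributes.TryGetValue("notFoundPageId", out raw))
        {
            return false; // attribute omitted for this site
        }

        // Guid.TryParse accepts the braced form used in Sitecore configuration.
        return Guid.TryParse(raw, out pageId) && pageId != Guid.Empty;
    }
}
```

A caller would only try to load the 404 item when this returns true, and would still need to handle the case where no item with that ID exists.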
<add type="Contains" match="~/media/" />
<add type="Contains" match="~/icon/" />
<add type="Contains" match="~/link.aspx" />
<add type="StartsWith" match="/sitecore" />
<add type="Contains" match=".asmx" />
<add type="Contains" match=".ashx" />
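To make the three rule types concrete, here is a minimal sketch of how such rules could be evaluated against a request URL. MatchType, ExcludeRule and ExcludeRules are hypothetical names, and the case-insensitive comparison is an assumption. The module's real classes and matching behaviour may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; the module's real classes may differ.
public enum MatchType { Contains, StartsWith, EndsWith }

public sealed class ExcludeRule
{
    public ExcludeRule(MatchType type, string match)
    {
        Type = type;
        Match = match;
    }

    public MatchType Type { get; private set; }
    public string Match { get; private set; }

    public bool IsMatch(string url)
    {
        // Case-insensitive comparison is an assumption, not confirmed behaviour.
        const StringComparison cmp = StringComparison.OrdinalIgnoreCase;
        switch (Type)
        {
            case MatchType.StartsWith: return url.StartsWith(Match, cmp);
            case MatchType.EndsWith: return url.EndsWith(Match, cmp);
            default: return url.IndexOf(Match, cmp) >= 0;
        }
    }
}

public static class ExcludeRules
{
    // True when any rule matches, meaning the 404 handler should skip this URL.
    public static bool IsExcluded(string url, IEnumerable<ExcludeRule> rules)
    {
        return rules.Any(r => r.IsMatch(url));
    }
}
```

With a StartsWith rule for "/sitecore", a request for "/sitecore/login" would be skipped by the 404 handler, while "/products/missing" would still be processed.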
The 404 handler does not redirect to your 404 page, because a redirect returns a 302 response to the browser (a known issue with Sitecore), which is incorrect. Instead it simply renders the 404 content on the requested URL. You then need to set the response code to 404 on that layout using the following code:
HttpContext.Current.Response.StatusCode = 404;
The 500 page functionality is configured by populating the errorPagePath attribute on the site node with the URL to send the user to when an unhandled error is encountered. It is recommended to use a static page as the target rather than a Sitecore URL, as a static page will still work if major issues (e.g. loss of database connectivity) occur.
The robots.txt functionality is configured by populating the robotsTxtLocation attribute on the site node with the location of a text document containing the contents you wish to return when robots.txt is requested for that site. If this is not specified then the standard robots.txt will be returned, if it exists.
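The lookup order described above could be sketched as follows. RobotsResolver is a hypothetical helper, not the module's RobotsHandler, and falling back to the standard file when the configured file is missing is an assumption on my part. The file existence check is passed in so the logic can be exercised without a web server.

```csharp
using System;

// Hypothetical helper; names and signature are assumptions for illustration.
public static class RobotsResolver
{
    // siteRobotsPath: the robotsTxtLocation value for the current site (may be null or empty).
    // fileExists: existence check for a virtual path, injected by the caller.
    // Returns the path to serve, or null when there is nothing to serve.
    public static string Resolve(string siteRobotsPath, Func<string, bool> fileExists)
    {
        if (!string.IsNullOrWhiteSpace(siteRobotsPath) && fileExists(siteRobotsPath))
        {
            return siteRobotsPath; // site-specific file wins
        }

        // Assumed fallback: the standard robots.txt, if it exists.
        const string standard = "/robots.txt";
        return fileExists(standard) ? standard : null;
    }
}
```

A handler built on this would write the contents of the resolved file to the response as text/plain, and return a 404 when the result is null.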
This module was built upon the excellent work in two blog posts by Anders Laub and Brian Pederson.