dispatcher.any Part1

Visit the link to get an overall idea of the dispatcher, its usage, and its installation for AEM on-premise.

As part of this blog, we will try to understand the overall code flow file by file. Dispatcher execution starts from the httpd.conf file present inside the conf folder, as shown below.

There are two important things about dispatcher in terms of writing code and execution flow.

The Dispatcher Code Flow topic below will help us understand the overall flow of the dispatcher code:

Dispatcher Code Flow

The httpd.conf file is responsible for loading the dispatcher module and, through the DispatcherConfig property, the dispatcher.any file. Below is the overall dispatcher code flow:

Execution starts from the conf/httpd.conf file, which loads conf.d/dispatcher_vhost.conf as the subsequent file.

The dispatcher_vhost.conf file loads the dispatcher.any file and the enabled_vhost.conf file.

The dispatcher.any file defines all the farms, and each farm in turn configures clientheaders, virtualhosts, renders, filter, vanity_urls, propagateSyndPost, and cache.

The enabled_vhost.conf file declares the port, hostname, log file location, whitelist rules, and rewrite rules.

Below screenshot covers important modules and code snippets:

Overall Dispatcher Code flow

Below is the overall code flow within the files.

Below is the high-level content and hierarchy of the dispatcher.any file, with a description of each part. In most cases, developers define the modules in the sequence below:

Overall AEM code structure mapping with the dispatcher.any file:

The dispatcher_vhost.conf file is loaded first, and it loads the dispatcher.any file from the conf.dispatcher.d folder.

dispatcher.any file will load all enabled farms which will have all sub sections defined such as clientheaders, virtualhosts, renders, vanity_urls, filter, cache, statistics.
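
The hierarchy described above can be sketched roughly as follows. This is only an outline, not a complete configuration: the include path "enabled_farms/*_farm.any" and the farm name /publishfarm are assumptions chosen for illustration, and each section body is left as a placeholder comment.

```
# dispatcher.any (sketch): load every enabled farm file
/farms {
    # Assumed folder and naming pattern; adjust to your own layout
    $include "enabled_farms/*_farm.any"
}

# One farm file (sketch): /publishfarm is a hypothetical farm name
/publishfarm {
    /clientheaders { # request headers passed on to AEM
    }
    /virtualhosts  { # hostnames this farm responds to
    }
    /renders       { # AEM publish instances
    }
    /filter        { # allow and deny rules for requests
    }
    /vanity_urls   { # vanity URL lookup settings
    }
    /cache         { # caching and invalidation rules
    }
    /statistics    { # load-balancing categories
    }
}
```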

Let’s discuss every section in detail:

/clientheaders

The clientheaders section lists the HTTP request headers that the dispatcher passes from the client request to the AEM instance. A default set of headers is added when the project is created.

This section lets us add or remove custom headers that should be forwarded with the request.

The dispatcher setup provides a default set of headers. If you override that default list, any extra headers you need must be added explicitly.
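
As a rough illustration, a trimmed-down clientheaders section could look like the sketch below. It is not the full default list, and "X-My-Custom-Header" is a hypothetical header added only to show where a custom entry would go.

```
/clientheaders {
    # A few commonly forwarded request headers (subset only)
    "referer"
    "user-agent"
    "authorization"
    "content-type"
    "content-length"
    "accept-language"
    "cookie"
    "host"
    # Hypothetical custom header that the application expects
    "X-My-Custom-Header"
}
```

A header that is not listed here would not be passed on to the AEM instance, which is why custom headers have to be added explicitly.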

/virtualhosts

The virtualhosts section lists the hostnames that the farm should honor for publish; glob matching works.

Virtualhosts is an important section that comes into the picture as soon as the server gets a request. It maps the request to a farm file, which then loads the related sections such as filter, cache, renders, clientheaders, etc.
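
A minimal sketch of such a section, using the practice.abc hostname that appears later in this post; the www and wildcard entries are assumptions added to show glob matching.

```
/virtualhosts {
    # Exact hostname
    "practice.abc"
    # Assumed additional entries to illustrate glob matching
    "www.practice.abc"
    "*.practice.abc"
}
```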

/renders

The renders section balances load among multiple AEM publish instances.

We can declare all our AEM publish instances, with their IP and port, within the renders section.

This way the dispatcher is able to identify the defined AEM instances and divide the load between them.
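
A small sketch of a renders section with two publish instances. The render names, IP addresses, and port are placeholders, not values from a real environment.

```
/renders {
    # Hypothetical publish instance 1
    /rend01 {
        /hostname "10.0.0.11"
        /port "4503"
    }
    # Hypothetical publish instance 2
    /rend02 {
        /hostname "10.0.0.12"
        /port "4503"
    }
}
```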

/cache

This section is responsible for caching resources according to the rules defined under the rules section.

Below is the sample cache section with all sub-sections:

/cache {
    /docroot "${PUBLISH_DOCROOT}"
    /statfileslevel "2"
    /allowAuthorized "0"
    /serveStaleOnError "1"
    /rules {
        $include "../cache/ams_publish_cache.any"
    }
    /invalidate {
        /0000 {
            /glob "*"
            /type "deny"
        }
        /0001 {
            /glob "*.html"
            /type "allow"
        }
    }
    /allowedClients {
        /0000 {
            /glob "*.*.*.*"
            /type "deny"
        }
        # Allow certain IP's like publishers to invalidate cache
        $include "../cache/ams_publish_invalidate_allowed.any"
    }
    /ignoreUrlParams {
        /0001 { /glob "*" /type "deny" }
        /0002 { /glob "q" /type "allow" }
    }
    /headers {
        "Cache-Control"
        "Content-Disposition"
        "Content-Type"
        "Expires"
        "Last-Modified"
        "X-Content-Type-Options"
    }
    # /gracePeriod "2"
    # /enableTTL "1"
}

/docroot

docroot declares the root folder path where the data gets cached.

/statfile

statfile declares the path of the stat file, whose modification time registers the most recent content update.

By default, the .stat file is stored in the docroot path.
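
As a sketch, the two properties could be set together as shown below. Both paths are assumptions chosen for illustration.

```
/cache {
    # Assumed web server document root used as the cache folder
    /docroot "/var/www/html"
    # Assumed custom location for the stat file;
    # leave this out to keep the default .stat file in the docroot
    /statfile "/tmp/dispatcher-website.stat"
}
```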

/allowAuthorized

When /allowAuthorized is set to "0" in the /cache section, requests that include authentication information are not cached.

The /allowAuthorized property controls whether requests that contain any of the following authentication information are cached:

  • The authorization header
  • A cookie named authorization
  • A cookie named login-token

By default, requests that include this authentication information are not cached because authentication is not performed when a cached document is returned to the client. This configuration prevents Dispatcher from serving cached documents to users who do not have the necessary rights.

However, if your requirements permit the caching of authenticated documents, set /allowAuthorized to one:

/allowAuthorized "1"

Note: To enable session management (using the /sessionmanagement property), the /allowAuthorized property must be set to "0".

/serveStaleOnError

When /serveStaleOnError is set to "1", Dispatcher does not delete invalidated content from the cache unless the render server returns a successful response.

By default, when a statfile is touched and invalidates cached content, Dispatcher deletes the cached content the next time it is requested.

/statfileslevel

statfileslevel scopes cache invalidation to the path of the invalidated file.

  • Dispatcher creates .stat files in each folder from the docroot folder down to the level that you specify. The docroot folder is level 0.
  • When a file located at a certain level is invalidated, all .stat files from the docroot to the level of the invalidated file or the configured statfileslevel (whichever is smaller) are touched.

For example: if you set the statfileslevel property to 6 and a file is invalidated at level 5, then every .stat file from docroot to level 5 is touched. Continuing with this example, if a file is invalidated at level 7, then every .stat file from docroot to level 6 is touched (since /statfileslevel = "6").

Only resources along the path to the invalidated file are affected. Consider the following example: a website uses the structure /content/myWebsite/xx/. If you set statfileslevel to 3, a .stat file is created in each of the following folders:

  • docroot
  • /content
  • /content/myWebsite
  • /content/myWebsite/xx

When a file in /content/myWebsite/xx is invalidated, every .stat file from docroot down to /content/myWebsite/xx is touched. This is the case only for /content/myWebsite/xx and not, for example, for /content/myWebsite/yy or /content/anotherWebSite.

Note: If you specify a value for the /statfileslevel property, the /statfile property is ignored.
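
The /content/myWebsite/xx example can be written out as the sketch below. The docroot path is an assumption; the comments list where the .stat files would sit for level 3.

```
/cache {
    # Assumed docroot; the folder layout below is relative to it
    /docroot "/var/www/html"
    /statfileslevel "3"
    # .stat files touched when a file under /content/myWebsite/xx is invalidated:
    #   /var/www/html/.stat                        (level 0, docroot)
    #   /var/www/html/content/.stat                (level 1)
    #   /var/www/html/content/myWebsite/.stat      (level 2)
    #   /var/www/html/content/myWebsite/xx/.stat   (level 3)
}
```

Cached files under /content/myWebsite/yy keep their own .stat file at level 3 untouched, so they would stay valid after this invalidation.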

/ignoreUrlParams

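This section declares which URL parameters the dispatcher ignores when deciding whether a request can be served from the cache. A parameter matched by an allow rule is ignored, so the page is still cached; a parameter matched by a deny rule is not ignored, so the request bypasses the cache.

As a best practice, first allow every parameter to be ignored and then deny the selected parameters that must bypass the cache. The last matching rule wins, so the broad rule comes first.

/ignoreUrlParams
{
    /0001 { /glob "*" /type "allow" }
    /0002 { /glob "nocache" /type "deny" }
}

According to the above rule, the URL below, which has nocache as a parameter, always fetches the latest content from the AEM instance and bypasses the dispatcher cache.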

http://localhost/content/we-retail/us/en/men.html?nocache=true
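
The reverse pattern is also possible. As a sketch, assuming only tracking parameters should be ignored (utm_source and utm_campaign are illustrative names, not part of the setup above), every parameter is denied first and the known ones are allowed afterwards:

```
/ignoreUrlParams
{
    # No parameter is ignored by default
    /0001 { /glob "*" /type "deny" }
    # Illustrative tracking parameters that should not prevent caching
    /0002 { /glob "utm_source" /type "allow" }
    /0003 { /glob "utm_campaign" /type "allow" }
}
```

With these rules, a request such as men.html?utm_source=mail could be served from the cache, while men.html?page=2 would go to the AEM instance.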

/headers (response headers)

Declare HTTP header types to be cached by the Dispatcher. On the first request to an un-cached resource, all headers matching one of the configured values (see the configuration sample below) are stored in a separate file, next to the cache file. On subsequent requests to the cached resource, the stored headers are added to the response.

/cache {
    ...
    /headers {
        "Cache-Control"
        "Content-Disposition"
        "Content-Type"
        "Expires"
        "Last-Modified"
        "X-Content-Type-Options"
    }
}

/enableTTL

The enableTTL property lets cached files expire based on time.

If set to 1 (/enableTTL "1"), the /enableTTL property will evaluate the response headers from the backend, and if they contain a Cache-Control max-age or Expires date, an auxiliary, empty file next to the cache file is created, with modification time equal to the expiry date. When the cached file is requested past the modification time it is automatically re-requested from the backend.

NOTE: Keep in mind that TTL-based caching is a superset of header caching and as such the /headers property should also be properly configured.
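
A minimal sketch of a TTL-enabled cache section, assuming the AEM instance sends Cache-Control or Expires headers. The header list here is an assumption and should match what your backend actually returns.

```
/cache {
    ...
    # Expire cached files based on Cache-Control max-age or Expires
    /enableTTL "1"
    # Store the caching headers so they are replayed on cached responses
    /headers {
        "Cache-Control"
        "Expires"
        "Last-Modified"
        "Content-Type"
    }
}
```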

/allowedClients

The allowedClients section by default denies all IPs from flushing the cache. We can then explicitly allow the required IP addresses.

/allowedClients {
    /0000 {
        /glob "*.*.*.*"
        /type "deny"
    }
    /0001 {
        /glob "${AUTHOR_IP}"
        /type "allow"
    }
    /0002 {
        /glob "${PUBLISH_IP}"
        /type "allow"
    }
    /0003 {
        /glob "${JENKINS_IP}"
        /type "allow"
    }
}

Note: It is recommended to define the /allowedClients.

If this is not done, any client can issue a call to clear the cache; if this is done repeatedly it can severely impact the site performance.

/rules

The rules section controls which content is cached, based on the given conditions, as shown below. The default is to cache everything except the CSRF token.

/cache {
    ...
    /rules {
        # Default allow all items to cache
        /0000 {
            /glob "*"
            /type "allow"
        }
        # Don't cache csrf login tokens
        /0001 {
            /glob "/libs/granite/csrf/token.json"
            /type "deny"
        }
    }
}

The /rules property controls which documents are cached according to the document path. Regardless of the /rules property, Dispatcher never caches a document in the following circumstances:

  • If the request URI contains a question mark (?). This usually indicates a dynamic page, such as a search result that does not need to be cached.
  • The file extension is missing. The web server needs the extension to determine the document type (the MIME-type).
  • The authentication header is set (this can be configured).
  • If the AEM instance responds with no-cache, no-store or must-revalidate headers.
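
As an illustration of adding a project-specific rule, the sketch below keeps the two default rules and denies one more path. The search page path is hypothetical and stands in for any page that should always be rendered by AEM.

```
/cache {
    ...
    /rules {
        /0000 { /glob "*" /type "allow" }
        /0001 { /glob "/libs/granite/csrf/token.json" /type "deny" }
        # Hypothetical page that should never be served from the cache
        /0002 { /glob "/content/myWebsite/*/search.html" /type "deny" }
    }
}
```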

/gracePeriod

/gracePeriod defines the number of seconds a stale, auto-invalidated resource may still be served from the cache after the last activation.

The property can be used in a setup where a batch of activations would otherwise repeatedly invalidate the entire cache. The recommended value is 2 seconds.
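
A short sketch of the property in place, using the recommended value; the statfileslevel value is simply carried over from the sample cache section.

```
/cache {
    ...
    /statfileslevel "2"
    # Keep serving auto-invalidated files for up to 2 seconds after an activation
    /gracePeriod "2"
}
```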

/vanity_urls

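Once the vanity URL package is installed on the publishers, the dispatcher automatically allows requests for vanity URLs, even when they would not pass the normal filter rules. The dispatcher fetches the list of vanity URLs from /url, stores it in /file, and refreshes it every /delay seconds.

/vanity_urls {
    /url "/libs/granite/dispatcher/content/vanityUrls.html"
    /file "/tmp/vanity_urls"
    /delay 300
}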

Click Here to read more about vanity URL section.

Dispatcher Functional Flow

The end user requests practice.abc in the browser, and the request reaches the dispatcher. The dispatcher checks all virtualhost entries made under the vhost file. As soon as a virtualhost mapping is found, it looks through the farm files for a virtualhost entry named practice.abc, the same host that was mapped in the vhost file.

The matching virtualhost loads the sections of the respective farm file: clientheaders, virtualhosts, renders, vanity_urls, filter, cache, statistics.

Note: In the next tutorial, part 2, we will take a deep dive into the filter module.

Imran Khan
