AEM dispatcher.any File
For an overall idea of Dispatcher, its uses, and its installation for AEM on-prem, visit the linked introduction.
In this blog, we will walk through the overall code flow file by file. Dispatcher execution starts from the httpd.conf file inside the conf folder.
There are two important aspects of Dispatcher to keep in mind: how the configuration is written and how it is executed.
The Dispatcher Code Flow topic below explains the overall flow of the Dispatcher configuration.
Dispatcher Code Flow
The httpd.conf file is responsible for loading the Dispatcher module and the dispatcher.any file through the DispatcherConfig property. The overall Dispatcher code flow is:
Execution starts from the conf/httpd.conf file, which loads conf.d/dispatcher_vhost.conf as the subsequent file.
The dispatcher_vhost.conf file loads the dispatcher.any file and the enabled_vhost.conf file.
The dispatcher.any file defines all the farms, and each farm in turn configures clientheaders, virtualhosts, renders, filter, vanity_urls, propagateSyndPost, and cache.
The enabled_vhost.conf file declares the port, hostname, log file location, whitelist rules, and rewrite rules.
The screenshot below covers the important modules and code snippets.
Overall Dispatcher Code flow
Below is the overall code flow within the files.
Below is the high-level content and hierarchy of the dispatcher.any file, with a description of each part. In most cases, developers define the sections in the following sequence:
Overall AEM code structure mapping with the dispatcher.any file:
The dispatcher_vhost.conf file is loaded first, and it loads the dispatcher.any file from the conf.dispatcher.d folder.
The dispatcher.any file loads all enabled farms, each of which defines sub-sections such as clientheaders, virtualhosts, renders, vanity_urls, filter, cache, and statistics.
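As a rough sketch of this hierarchy, assuming an AMS-style folder layout: the enabled_farms folder, the farm file name, and the farm name /publishfarm are illustrative assumptions rather than values taken from this post, and each section body is reduced to a comment.

```
# conf.dispatcher.d/dispatcher.any (sketch; the include path is an assumption)
/farms {
  # Pull in every enabled farm file
  $include "enabled_farms/*_farm.any"
}

# conf.dispatcher.d/enabled_farms/999_publish_farm.any (hypothetical file name)
/publishfarm {
  /clientheaders {
    # request headers passed through to the AEM instance
  }
  /virtualhosts {
    # hostnames served by this farm
  }
  /renders {
    # AEM publish instances
  }
  /filter {
    # request allow/deny rules
  }
  /vanity_urls {
    # vanity URL lookup settings
  }
  /cache {
    # docroot, caching rules, invalidation
  }
  /statistics {
    # categories used for load balancing
  }
}
```

This is only a skeleton to show where each section sits; a working farm needs real content in every section.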
Let’s discuss every section in detail:
/clientheaders
The clientheaders section lists the HTTP request headers that Dispatcher passes from the client request through to the AEM instance. A default set of headers is added when the project is created.
This section lets us add or remove custom headers for the request.
Dispatcher setup provides this default set of headers. If we override it, any extra headers the application needs must be added explicitly.
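A minimal sketch of a clientheaders section, assuming we pass through a handful of standard request headers plus one custom header. The header name X-My-Custom-Header is hypothetical, and the default list in a real project is longer.

```
/clientheaders {
  # Standard request headers forwarded to the AEM instance
  "referer"
  "user-agent"
  "authorization"
  "content-type"
  "content-length"
  "accept"
  "accept-language"
  "host"
  "cookie"
  "CSRF-Token"
  # Hypothetical custom header added for this project
  "X-My-Custom-Header"
}
```

Any header that is not listed here is not forwarded, so a custom header the application relies on has to appear in this list.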
/virtualhosts
The virtualhosts section lists the hostnames that this farm honors for publish; entries are matched as globs.
virtualhosts is an important section: it comes into the picture as soon as the server receives a request, and it maps the request to a farm file so that the related sections, such as filter, cache, renders, and clientheaders, are loaded.
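For illustration, a virtualhosts section might look like the sketch below. The hostnames are placeholders, and the asterisk is a glob wildcard.

```
/virtualhosts {
  # Exact hostname handled by this farm (placeholder)
  "www.example.com"
  # Glob: the same site under any top-level domain (placeholder)
  "www.example.*"
}
```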
/renders
The renders section balances load among multiple AEM publish instances.
We can declare all our AEM publish instances, with IP and port, within the renders section.
This way Dispatcher can identify the defined AEM instances and divide the load between them.
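A sketch of a renders section with two publish instances. The render names, IP addresses, and ports are placeholders chosen for illustration.

```
/renders {
  # First AEM publish instance (placeholder address)
  /publish1 {
    /hostname "10.0.0.11"
    /port "4503"
  }
  # Second AEM publish instance (placeholder address)
  /publish2 {
    /hostname "10.0.0.12"
    /port "4503"
  }
}
```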
/cache
This section is responsible for caching resources according to the rules defined under its rules sub-section.
Below is a sample cache section with all its sub-sections:
/cache {
  /docroot "${PUBLISH_DOCROOT}"
  /statfileslevel "2"
  /allowAuthorized "0"
  /serveStaleOnError "1"
  /rules {
    $include "../cache/ams_publish_cache.any"
  }
  /invalidate {
    /0000 {
      /glob "*"
      /type "deny"
    }
    /0001 {
      /glob "*.html"
      /type "allow"
    }
  }
  /allowedClients {
    /0000 {
      /glob "*.*.*.*"
      /type "deny"
    }
    # Allow certain IP's like publishers to invalidate cache
    $include "../cache/ams_publish_invalidate_allowed.any"
  }
  /ignoreUrlParams {
    /0001 { /glob "*" /type "deny" }
    /0002 { /glob "q" /type "allow" }
  }
  /headers {
    "Cache-Control"
    "Content-Disposition"
    "Content-Type"
    "Expires"
    "Last-Modified"
    "X-Content-Type-Options"
  }
  # /gracePeriod "2"
  # /enableTTL "1"
}
/docroot
docroot declares the root folder path where cached files are stored.
/statfile
statfile declares the path of the statfile, whose modification time registers the most recent content update.
By default, the .stat file is stored in the docroot path.
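For example, to keep the statfile outside the docroot, the property can point at an explicit path. The path below is a placeholder.

```
/cache {
  ...
  # Placeholder path; without this property the .stat file lives in the docroot
  /statfile "/tmp/dispatcher-website.stat"
}
```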
/allowAuthorized
The /allowAuthorized property controls whether requests that contain any of the following authentication information are cached:
- The authorization header
- A cookie named authorization
- A cookie named login-token
By default, and when /allowAuthorized is set to "0", requests that include this authentication information are not cached, because authentication is not performed when a cached document is returned to the client. This configuration prevents Dispatcher from serving cached documents to users who do not have the necessary rights.
However, if your requirements permit the caching of authenticated documents, set /allowAuthorized to one:
/allowAuthorized "1"
Note: To enable session management (using the /sessionmanagement property), the /allowAuthorized property must be set to "0".
/serveStaleOnError
By default, when a statfile is touched and invalidates cached content, Dispatcher deletes the cached content the next time it is requested.
With serveStaleOnError set to "1", Dispatcher does not delete invalidated content from the cache unless the render server returns a successful response, so stale content can still be served when the render returns an error.
/statfileslevel
statfileslevel invalidates the cache according to path.
- Dispatcher creates .stat files in each folder from the docroot folder to the level that you specify. The docroot folder is level 0.
- When a file located at a certain level is invalidated, all .stat files from the docroot to the level of the invalidated file or the configured statfileslevel (whichever is smaller) are touched.
For example: if you set the statfileslevel property to 6 and a file is invalidated at level 5, then every .stat file from docroot to level 5 is touched. Continuing with this example, if a file is invalidated at level 7, then every .stat file from docroot to level 6 is touched (since /statfileslevel = "6").
Only resources along the path to the invalidated file are affected. Consider the following example: a website uses the structure /content/myWebsite/xx/.
If you set statfileslevel to 3, .stat files are created in the following folders:
docroot
/content
/content/myWebsite
/content/myWebsite/xx
When a file in /content/myWebsite/xx is invalidated, every .stat file from docroot down to /content/myWebsite/xx is touched. This is the case only for /content/myWebsite/xx, and not, for example, for /content/myWebsite/yy or /content/anotherWebSite.
Note: If you specify a value for the /statfileslevel property, the /statfile property is ignored.
/ignoreUrlParams
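This section declares which URL parameters Dispatcher ignores when deciding whether a page can be served from the cache. A parameter of type "allow" is ignored, so the page is still cached; a parameter of type "deny" is not ignored, so requests carrying it bypass the cache.
As a best practice, first allow every parameter to be ignored, then deny the selected parameters that must bypass the cache. The last matching rule applies.
/ignoreUrlParams
{
  /0001 { /glob "*" /type "allow" }
  /0002 { /glob "nocache" /type "deny" }
}
According to the above rules, the URL below, which has nocache as a parameter, always fetches the latest content from the AEM instance and bypasses the Dispatcher cache.
http://localhost/content/we-retail/us/en/men.html?nocache=true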
/headers (response headers)
Declare HTTP header types to be cached by the Dispatcher. On the first request to an un-cached resource, all headers matching one of the configured values (see the configuration sample below) are stored in a separate file, next to the cache file. On subsequent requests to the cached resource, the stored headers are added to the response.
/cache {
  ...
  /headers {
    "Cache-Control"
    "Content-Disposition"
    "Content-Type"
    "Expires"
    "Last-Modified"
    "X-Content-Type-Options"
  }
}
/enableTTL
The enableTTL property lets cached content expire based on time.
If set to 1 (/enableTTL "1"), the /enableTTL property evaluates the response headers from the backend, and if they contain a Cache-Control max-age or Expires date, an auxiliary, empty file is created next to the cache file, with a modification time equal to the expiry date. When the cached file is requested past the modification time, it is automatically re-requested from the backend.
NOTE: Keep in mind that TTL-based caching is a superset of header caching, and as such the /headers property should also be properly configured.
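A minimal sketch of TTL-based caching, assuming the backend sends Cache-Control or Expires headers. Both header names are listed under /headers so that they are stored alongside the cached file.

```
/cache {
  ...
  # Expire cached files according to the backend's Cache-Control max-age or Expires date
  /enableTTL "1"
  # Store the headers that carry the expiry information
  /headers {
    "Cache-Control"
    "Expires"
  }
}
```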
/allowedClients
By default, the allowedClients section denies all IPs from flushing the cache. We can then explicitly allow the required IP addresses.
/allowedClients {
  /0000 {
    /glob "*.*.*.*"
    /type "deny"
  }
  /0 {
    /glob "${AUTHOR_IP}"
    /type "allow"
  }
  /01 {
    /glob "${PUBLISH_IP}"
    /type "allow"
  }
  /02 {
    /glob "${JENKINS_IP}"
    /type "allow"
  }
}
Note: It is recommended to define /allowedClients. If this is not done, any client can issue a call to clear the cache; if this is done repeatedly, it can severely impact site performance.
/rules
The rules section caches content depending on the given conditions, as shown below. By default, everything is cached except the CSRF token.
/cache {
  ...
  /rules {
    # Default allow all items to cache
    /0000 {
      /glob "*"
      /type "allow"
    }
    # Don't cache csrf login tokens
    /0001 {
      /glob "/libs/granite/csrf/token.json"
      /type "deny"
    }
  }
}
The /rules property controls which documents are cached according to the document path. Regardless of the /rules property, Dispatcher never caches a document in the following circumstances:
- If the request URI contains a question mark (?). This usually indicates a dynamic page, such as a search result that does not need to be cached.
- The file extension is missing. The web server needs the extension to determine the document type (the MIME-type).
- The authentication header is set (this can be configured).
- If the AEM instance responds with no-cache, no-store or must-revalidate headers.
/gracePeriod
The /gracePeriod property defines the number of seconds a stale, auto-invalidated resource may still be served from the cache after the last content activation.
The property can be used in a setup where a batch of activations would otherwise repeatedly invalidate the entire cache. The recommended value is 2 seconds.
/vanity_urls
Once the vanity URL package is installed on the publishers, Dispatcher automatically allows requests for vanity URLs, even if the normal Dispatcher filter would otherwise reject them.
/vanity_urls {
  /url "/libs/granite/dispatcher/content/vanityUrls.html"
  /file "/tmp/vanity_urls"
  /delay 300
}
Click here to read more about vanity URL section.
Dispatcher Functional Flow
The end user requests practice.abc in the browser, and the request reaches the Dispatcher. Dispatcher first checks all the virtual host entries defined in the vhost file. As soon as a virtual host mapping is found, it looks through all the farm files. Within each farm file, it looks for a virtualhosts entry for practice.abc, matching the one mapped in the vhost file.
The matching virtualhosts entry loads the respective farm file's clientheaders, virtualhosts, renders, vanity_urls, filter, cache, and statistics sections.
Note: In the next tutorial, part 2, we will deep dive into the filter module.
Imran Khan, Adobe Community Advisor, AEM certified developer and Java Geek, is an experienced AEM developer with over 11 years of expertise in designing and implementing robust web applications. He leverages Adobe Experience Manager, Analytics, and Target to create dynamic digital experiences. Imran possesses extensive expertise in J2EE, Sightly, Struts 2.0, Spring, Hibernate, JPA, React, HTML, jQuery, and JavaScript.