AEM Link Checker and Transformer
The link checker validates all internal and external links authored on content pages. The AEM link checker is event-based and is triggered whenever content is updated.
It is not advisable to use the link checker for large repositories whose links change or get updated frequently.
Because it is event-based, creating or modifying a node under the /content folder structure creates a corresponding entry under /var/linkchecker.
Link Checker is responsible for:
- Validating both external and internal links authored on the pages.
- Showing a list of all external links authored on the pages.
- Performing link rewriting / transformation.
Internal Links
Internal links point to AEM content pages on the same instance and start with /content, e.g. /content/<projects>/us/en/home.html
Internal links are validated as soon as they are added or updated on a page.
External Links
External links point outside the AEM instance, to a different domain, e.g. https://www.google.com
External links are validated based on their syntax and by checking their availability.
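The distinction between internal and external links can be sketched in plain Java. This is only an illustration of the rules described above, not AEM's actual implementation: the class name LinkClassifier and its Kind enum are hypothetical, and the sketch covers the syntax check only, since an availability check would need a real HTTP request.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not part of the AEM API) that classifies an authored link. */
final class LinkClassifier {

    enum Kind { INTERNAL, EXTERNAL, OTHER }

    static Kind classify(String link) {
        if (link == null || link.isBlank()) {
            return Kind.OTHER;
        }
        // Internal links are repository paths that start with /content.
        if (link.startsWith("/content/")) {
            return Kind.INTERNAL;
        }
        try {
            // new URI(...) throws if the link is syntactically invalid.
            URI uri = new URI(link);
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return (http && uri.getHost() != null) ? Kind.EXTERNAL : Kind.OTHER;
        } catch (URISyntaxException e) {
            return Kind.OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("/content/practice/us/en/home.html"));
        System.out.println(classify("https://www.google.com"));
        // Syntactically valid, so it is still EXTERNAL; only an availability check would flag it.
        System.out.println(classify("https://ww.test.com"));
        System.out.println(classify("http://bad host"));
    }
}
```

Note that a mistyped host such as https://ww.test.com passes the syntax check, which is why the availability check matters for external links.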
Broken links on author page
- On author, broken internal and external links are both marked as broken and highlighted in red.
Broken links on publish page
- On publish, broken internal and external links both appear as plain text; the link itself is removed from the markup.
External Link Checker User Interface
The URL below provides complete information about all internal and external links authored on AEM pages.
http://localhost:4502/etc/linkchecker.html
For this example, we dragged and dropped two teaser components onto a page, authoring an internal link on one and an external link on the other.
The link checker interface shows an entry, highlighted in blue, for the teaser component with the incorrect external URL https://ww.test.com.
It also shows an entry, highlighted in red, for the teaser component on which we authored the internal URL /content/test/practice, which doesn't exist. For an internal link, the interface shows the path of the component in which the link was authored rather than the authored link itself.
The URL below returns a list of all links validated by the link checker:
http://localhost:4502/var/linkchecker.list.json?_dc=1667974306388
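The same listing can be requested from code. The following is a minimal sketch using the JDK's java.net.http API; the class name LinkCheckerListRequest and the admin/admin credentials are assumptions for a local author instance, and the _dc parameter is treated here as a cache-busting timestamp, which is what the value in the URL above appears to be.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper that builds (but does not send) the listing request. */
final class LinkCheckerListRequest {

    static HttpRequest build(String baseUrl, String user, String password, long cacheBuster) {
        // Basic authentication header; the credentials are an assumption for local development.
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        URI uri = URI.create(baseUrl + "/var/linkchecker.list.json?_dc=" + cacheBuster);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:4502", "admin", "admin", System.currentTimeMillis());
        System.out.println(request.method() + " " + request.uri());
        // To actually send it against a running instance:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```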
Link checker and transformer service configurations
Below are the system console (OSGi) configurations for the link checker and transformer:
Day CQ Link Checker Service
This service validates the syntax of external links. The link checker is enabled by default on all instances. Override the Day CQ Link Checker Service configuration for any customization or project-specific changes.
Scheduler Period: The interval, in seconds, at which this service is called repeatedly.
Link Check Override Patterns: Patterns for links that the link checker should not validate. For example, an entry of ^http://www.google.com makes the link checker skip validation of http://www.google.com.
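Since override patterns are regular expressions, it helps to see how such a pattern behaves on sample links. The sketch below is an assumption about the matching semantics (a prefix-style find() against the link); the class OverridePatternDemo is hypothetical and does not call any AEM API.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical demo of how an override pattern could be matched against links. */
final class OverridePatternDemo {

    /** Returns true if any pattern is found in the link, i.e. the link would be skipped. */
    static boolean isOverridden(String link, List<Pattern> patterns) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(link).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Escaping the dots makes the pattern match literal dots only.
        List<Pattern> strict = List.of(Pattern.compile("^http://www\\.google\\.com"));
        // The unescaped form from the example above: "." matches any character.
        List<Pattern> loose = List.of(Pattern.compile("^http://www.google.com"));

        System.out.println(isOverridden("http://www.google.com/search", strict));
        System.out.println(isOverridden("https://www.google.com", strict));
        System.out.println(isOverridden("http://wwwXgoogle.com", loose));
        System.out.println(isOverridden("http://wwwXgoogle.com", strict));
    }
}
```

The ^ anchor ties the pattern to the start of the link, so a link that merely contains http://www.google.com as a query parameter is still checked. Escaping the dots avoids accidentally overriding look-alike hosts.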
Day CQ Link Checker Task
This task updates the status of links from pending to valid or invalid. Once the task has run, the changes appear at the URL below.
http://localhost:4502/etc/linkchecker.html
With the Scheduler Period configured for this example, the task runs every 60 minutes.
Day CQ Link Checker Transformer
This service transforms URLs with the help of the link checker.
For example, the link transformer rewrites the URL /content/practice/us/en/test.html to /test.html
The Disable Checking property disables link checking completely for a particular AEM instance.
The Disable Rewriting property stops URL rewriting / transformation.
linkcheckertransformer.rewriteElements lets us add entries in tag:attribute form; the value of each listed attribute on the given tag is then transformed.
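To make the tag:attribute format concrete, here is a small sketch that parses such entries into a lookup table. It only illustrates the format; the class RewriteElements is hypothetical, the sample entries are example values, and the sketch says nothing about how AEM parses the property internally.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical parser for rewriteElements-style entries such as "a:href". */
final class RewriteElements {

    /** Maps each lower-cased tag name to the set of attributes to rewrite on it. */
    static Map<String, Set<String>> parse(List<String> entries) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String entry : entries) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                continue; // not in tag:attribute form
            }
            String tag = entry.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String attribute = entry.substring(colon + 1).trim().toLowerCase(Locale.ROOT);
            if (tag.isEmpty() || attribute.isEmpty()) {
                continue; // skip malformed entries such as ":href" or "a:"
            }
            result.computeIfAbsent(tag, key -> new LinkedHashSet<>()).add(attribute);
        }
        return result;
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(parse(List.of("a:href", "img:src", "form:action", "broken")));
    }
}
```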
Disable Link Checker using code
There are two ways to disable the link checker for a particular tag using code:
1. Add the x-cq-linkchecker="valid" attribute to the anchor <a> or another tag. The link checker will mark the link as valid by default.
<a href="https://ww.test.com" x-cq-linkchecker="valid"></a>
2. Add the x-cq-linkchecker="skip" attribute to the anchor <a> or another tag. The link checker will not validate this link.
<a href="https://ww.test.com" x-cq-linkchecker="skip"></a>
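When such markup is generated from Java code rather than written by hand in a template, a small helper keeps the attribute value and quoting consistent. This is a sketch only; the class AnchorMarkup and its methods are hypothetical names, and the escaping covers just the characters relevant for this example.

```java
/** Hypothetical helper that renders an anchor carrying the x-cq-linkchecker attribute. */
final class AnchorMarkup {

    /** Escapes the characters that would break attribute values or text nodes. */
    static String escape(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** hint must be "valid" or "skip", the two values described above. */
    static String anchor(String href, String text, String hint) {
        if (!"valid".equals(hint) && !"skip".equals(hint)) {
            throw new IllegalArgumentException("Unsupported hint: " + hint);
        }
        return "<a href=\"" + escape(href) + "\" x-cq-linkchecker=\"" + hint + "\">"
                + escape(text) + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(anchor("https://ww.test.com", "Test", "skip"));
    }
}
```

The attribute value must use straight double quotes; typographic quotes pasted from a document would not be parsed as attribute delimiters.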
Link Transformer / rewriter Implementation
As the name suggests, the link transformer lets us transform links on page load. We can achieve this with a rewriter configuration and a custom transformer class.
e.g. If an image has the src URL /content/dam/practice/us/en/book.png, a custom transformer can rewrite it to http://cdn/assets/practice/us/en/book.png, to http://localhost:4502/content/dam/practice/us/en/book.png, or to any other mapped URL.
Similarly, a transformer can shorten a content URL from /content/practice/us/en/home.html to /us/en/home.html.
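Before wiring anything into Sling, the two mappings described above can be sketched as pure functions. The class UrlMappingSketch and its method names are hypothetical; http://cdn/assets is the placeholder CDN base from the example above, not a real host.

```java
/** Hypothetical, framework-free sketch of the URL mappings described above. */
final class UrlMappingSketch {

    static final String CONTENT_ROOT = "/content/practice";
    static final String DAM_ROOT = "/content/dam";

    /** /content/practice/us/en/home.html -> /us/en/home.html; other paths are returned unchanged. */
    static String shortenPagePath(String path) {
        if (path != null && path.startsWith(CONTENT_ROOT + "/")) {
            return path.substring(CONTENT_ROOT.length());
        }
        return path;
    }

    /** /content/dam/practice/... -> <cdnBase>/practice/...; other paths are returned unchanged. */
    static String toCdnUrl(String path, String cdnBase) {
        if (path != null && path.startsWith(DAM_ROOT + "/")) {
            return cdnBase + path.substring(DAM_ROOT.length());
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(shortenPagePath("/content/practice/us/en/home.html"));
        System.out.println(toCdnUrl("/content/dam/practice/us/en/book.png", "http://cdn/assets"));
    }
}
```

Keeping the mapping logic in plain static methods like these makes it easy to unit test independently of the rewriter pipeline.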
Follow the steps below to create a custom link transformer:
1. Create a rewriter folder beneath the config folder so that the rewriter works on both author and publish.
The following configuration is required for the link transformer to work:
- Set the enabled property to true.
- Provide project-specific paths.
- Add practice-linkrewriter to transformerTypes.
Important Note: The transformerTypes value practice-linkrewriter must be unique across the AEM instance. The same name is used as the pipeline.type property in MyRewriterTransformer.java to map the class to this configuration.
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
jcr:primaryType="nt:unstructured"
contentTypes="[text/html]"
enabled="{Boolean}true"
generatorType="htmlparser"
order="1"
paths="[/content/practice]"
serializerType="htmlwriter"
transformerTypes="[linkchecker,versioned-clientlibs,practice-linkrewriter]">
<generator-htmlparser
jcr:primaryType="nt:unstructured"
includeTags="[A,/A,LINK,IMG]"/>
</jcr:root>
Note: We can also copy configurations from /libs/cq/config/rewriter/default node.
Intellij View
crx/de view
- Download config package from Link
2. Create the custom transformer class MyRewriterTransformer, implementing Transformer and TransformerFactory, as shown below:
package com.javadoubts.core.transformer;

import org.apache.commons.lang3.StringUtils;
import org.apache.sling.rewriter.ProcessingComponentConfiguration;
import org.apache.sling.rewriter.ProcessingContext;
import org.apache.sling.rewriter.Transformer;
import org.apache.sling.rewriter.TransformerFactory;
import org.osgi.service.component.annotations.Component;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import java.io.IOException;

@Component(
        immediate = true,
        service = TransformerFactory.class,
        property = {
                // Must match the name listed in transformerTypes of the rewriter config.
                "pipeline.type=practice-linkrewriter"
        }
)
public class MyRewriterTransformer implements Transformer, TransformerFactory {

    private static final String CONTENT_ROOT = "/content/practice";

    // Next handler in the rewriter pipeline; every SAX event must be forwarded to it.
    private ContentHandler contentHandler;

    // Returns a fresh transformer instance for each pipeline run.
    @Override
    public Transformer createTransformer() {
        return new MyRewriterTransformer();
    }

    @Override
    public void init(ProcessingContext processingContext, ProcessingComponentConfiguration processingComponentConfiguration) throws IOException {
    }

    @Override
    public void setContentHandler(ContentHandler handler) {
        this.contentHandler = handler;
    }

    @Override
    public void dispose() {
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        contentHandler.setDocumentLocator(locator);
    }

    @Override
    public void startDocument() throws SAXException {
        contentHandler.startDocument();
    }

    @Override
    public void endDocument() throws SAXException {
        contentHandler.endDocument();
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        contentHandler.startPrefixMapping(prefix, uri);
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        contentHandler.endPrefixMapping(prefix);
    }

    /*
    This is the main method; it is responsible for rewriting the URLs.
    */
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        if (atts.getIndex("href") > -1 && qName.equalsIgnoreCase("a")) {
            // This will trim /content/practice/us/en/home to /us/en/home on page load
            AttributesImpl modifiedAttributes = new AttributesImpl(atts);
            String shortHref = modifiedUrl(atts.getValue("href"));
            modifiedAttributes.setValue(atts.getIndex("href"), shortHref);
            contentHandler.startElement(uri, localName, qName, modifiedAttributes);
        } else if (atts.getIndex("src") > -1 && qName.equalsIgnoreCase("img")) {
            // This will prepend http://localhost:4502 to the asset URL
            AttributesImpl modifiedAttributes = new AttributesImpl(atts);
            String absoluteSrc = "http://localhost:4502" + modifiedUrl(atts.getValue("src"));
            modifiedAttributes.setValue(atts.getIndex("src"), absoluteSrc);
            contentHandler.startElement(uri, localName, qName, modifiedAttributes);
        } else {
            // Forward every other element unchanged; otherwise its start tag would be lost.
            contentHandler.startElement(uri, localName, qName, atts);
        }
    }

    public static String modifiedUrl(String path) {
        if (StringUtils.isBlank(path)) {
            return path; // blank, return it as is.
        }
        if (path.startsWith(CONTENT_ROOT + "/")) {
            // Strip only the leading prefix, not later occurrences inside the path.
            return path.substring(CONTENT_ROOT.length());
        }
        return path;
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        contentHandler.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        contentHandler.characters(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        contentHandler.ignorableWhitespace(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        contentHandler.processingInstruction(target, data);
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        contentHandler.skippedEntity(name);
    }
}
3. Author an anchor link and a text component on the page.
On page load, MyRewriterTransformer.java gets called for every element on the page, including each image src and anchor link. Its startElement() method is invoked each time and updates the URLs according to our needs.
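The event flow can be tried outside AEM with the SAX classes that ship with the JDK. The sketch below is an approximation, not the Sling pipeline itself: XMLFilterImpl stands in for the Sling Transformer, the JDK XML parser stands in for the htmlparser generator, and the names HrefRewriteDemo, HrefFilter and Recorder are hypothetical. It shows why every startElement event has to be forwarded, whether or not it was modified.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class HrefRewriteDemo {

    static final String CONTENT_ROOT = "/content/practice";

    static String shorten(String path) {
        return path != null && path.startsWith(CONTENT_ROOT + "/")
                ? path.substring(CONTENT_ROOT.length())
                : path;
    }

    /** Rewrites href on anchors and forwards all other elements unchanged. */
    static final class HrefFilter extends XMLFilterImpl {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            int index = atts.getIndex("href");
            if ("a".equalsIgnoreCase(qName) && index > -1) {
                AttributesImpl copy = new AttributesImpl(atts);
                copy.setValue(index, shorten(atts.getValue(index)));
                super.startElement(uri, localName, qName, copy);
            } else {
                super.startElement(uri, localName, qName, atts);
            }
        }
    }

    /** Downstream handler that records the start tags it receives. */
    static final class Recorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                out.append(' ').append(atts.getQName(i))
                        .append("=\"").append(atts.getValue(i)).append('"');
            }
            out.append('>');
        }
    }

    /** Parses well-formed markup through the filter and returns the recorded start tags. */
    static String run(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            HrefFilter filter = new HrefFilter();
            filter.setParent(reader);
            Recorder recorder = new Recorder();
            filter.setContentHandler(recorder);
            filter.parse(new InputSource(new StringReader(xml)));
            return recorder.out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(
                "<p><a href=\"/content/practice/us/en/home.html\">Home</a><img src=\"/x.png\"/></p>"));
    }
}
```

In the recorded output the anchor's href is shortened, while the p and img start tags arrive untouched because the filter forwards them in its else branch.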
Sling Rewriter Configurations
Open the URL below to see all Sling rewriter configurations deployed on the current AEM instance.
Imran Khan, Adobe Community Advisor, AEM certified developer and Java Geek, is an experienced AEM developer with over 11 years of expertise in designing and implementing robust web applications. He leverages Adobe Experience Manager, Analytics, and Target to create dynamic digital experiences. Imran possesses extensive expertise in J2EE, Sightly, Struts 2.0, Spring, Hibernate, JPA, React, HTML, jQuery, and JavaScript.