Enhancing the security of your DTM implementation

We are all aware of the importance of creating secure products. In a previous post, I explained how to set up a workflow for a DTM implementation. One of the consequences of using this workflow is that only a small number of users can cause damage to the website via DTM. This is also good from a security perspective, as it reduces the risk of a successful attack. This is probably enough for most companies.

Financial institutions and DTM

Having said that, we must also realise that it is virtually impossible to create a completely invulnerable software product, and DTM is not an exception. Banks and similar companies take security to the next level. For them, the minimum security level should be “paranoid”. For obvious reasons, this paranoid level of security also applies to DTM. If an attacker gained access to DTM, he could very easily inject malicious JavaScript code into the whole website, just by adding a page load rule.

There are many places in DTM where an attack could take place; this is what is called the attack surface:

  • Use brute force to obtain the password of one of the publishers or administrators.
  • Find a vulnerability in the DTM UI and gain control of an account.
  • Find a vulnerability in Akamai (current provider for DTM) and gain control of the servers.

I am not implying that DTM is insecure; I am 100% sure that the developers take security very seriously, but we must never forget that there can always be a hidden security hole.

Self-host DTM library

The immediate solution to reduce the attack surface is to self-host the DTM library. You would just use the DTM UI to create the rules and, once the development is finished, download the DTM JavaScript library. You can then run an audit of the files and host them on your own servers, which you have already hardened. With this approach, all three security concerns raised above are gone. Even if an attacker gained access to DTM, he could not modify the DTM library on your production servers.

In order to do this, you first need to enable the “Library download” option, under the “Embed” tab.

[Screenshot: DTM library download enabled]

After configuring the options, you get a URL to download both the staging and production libraries:

[Screenshot: library download URLs]

Remember to update the links to the library in the HTML, as specified in the “Header code” section:

[Screenshot: the “Header code” section]

DTM library workflow

The previous solution works, but it adds a level of complexity for the DTM users. When debugging unpublished rules, the DTM Switch is completely useless, as it only works if the library is hosted in Akamai. You need to use tools like Charles, which make the whole process more complicated: you might need a license, you will need to download the library with every rule change, you need to learn how to manually replace the live DTM files with your local version… One problem has been solved, but a new one has been created. Non-technical people, in particular, will find this solution quite challenging.

My proposal in this case is to use a hybrid solution:

  • Use Akamai for the development environment
  • Use self-hosting for the test and production environments

and to create a new workflow:

  1. The report suites in DTM should always point to development and/or UAT values.
  2. Your friendly marketer should have access to a development server with a working version of the website, where DTM is loaded from Akamai URLs, as with the typical DTM deployment. In this environment, she can easily create and test new rules using the DTM Switch. She should only have user permissions in DTM. Any successful attack on DTM will be confined to the development environment.
  3. Once the rules have been finished, a tester should verify their correctness. If everything is correct, the tester should approve and publish the rules. This is different from the workflow I suggested in a previous post: now, the tester has both approve and publish privileges.
  4. An automated script should regularly download the production DTM library and deploy the JavaScript files to the SIT/UAT/staging environments.
  5. Security audits should be run regularly on these JavaScript files.
  6. When pushing the whole development environment to SIT/UAT/staging, another script or process should automatically modify the DTM links in the HTML to point to the correct server (not Akamai). These links are those shown in the “Library Download” tab.
  7. Website testers should only see the approved and published rules and report any errors.
  8. In pre-production, the report suite ID of the DTM library should automatically be replaced with the production RSID.
  9. The production environment should be exactly the same as pre-production.

It is a complicated solution, but it combines the ease of the DTM Switch for development with the security of self-hosting.

One final note. You will have noticed that the SIT/UAT/staging environment will have a slightly different version of the DTM library than pre-production and production, as the RSID will be different. I would also expect the server names to be different. In this case, one solution is to replace the URLs in DTM with placeholders:

[Screenshot: placeholder values in the DTM library settings]

The scripts I mentioned should also do a find-and-replace in the JavaScript files, looking for these placeholders and replacing them with the correct URLs. As a rough sketch (assuming Node.js; the library URL and the placeholder names below are hypothetical):
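    var https = require('https');
    var fs = require('fs');

    // Hypothetical production library URL from the "Library Download" tab.
    var LIBRARY_URL = 'https://www.example.com/dtm/satelliteLib-XXXX.js';
    // Hypothetical placeholders configured in DTM and their real values.
    var REPLACEMENTS = {
      '%%TRACKING_SERVER%%': 'stats.example.com',
      '%%RSID%%': 'examplersid-uat'
    };

    https.get(LIBRARY_URL, function (res) {
      var body = '';
      res.on('data', function (chunk) { body += chunk; });
      res.on('end', function () {
        // Find and replace every placeholder with this environment's value.
        Object.keys(REPLACEMENTS).forEach(function (key) {
          body = body.split(key).join(REPLACEMENTS[key]);
        });
        fs.writeFileSync('satelliteLib.js', body);
      });
    });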

DTM, products and W3C data layer

Before getting into the details of the post… Happy New Year to all of you! I hope that 2016 is full of DMPs, DTMs and Analytics 🙂

Now, going back to today’s topic, I want to talk about how to create the products string in DTM using the W3C data layer. One of the reasons why we prefer a tag management solution (TMS) over hard-coded snippets is to write less code. All modern TMSs include features to set analytics variables using a point-and-click interface, usually through a web UI. In the case of DTM, you can create a data element that reads a data layer variable; you can then assign it to an eVar or a prop without writing a single line of code.

However, when it comes to the products string, things are not that easy. There is no simple way of creating a one-size-fits-all solution for this variable. Let’s have a quick reminder of this variable’s structure:
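    Category;Product;Quantity;Price;Product-specific events;Merchandising eVars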

which can be repeated as many times as needed, once for each product, using the comma as separator. Each element in this structure has its own rules:

  • Category. It is rarely used, as there was a limitation in SiteCatalyst v14 that only allowed one category per product. This limitation was lifted with v15, but very few implementations use it anyway.
  • Product. This is the only mandatory element.
  • Quantity. Only on the order confirmation page.
  • Price. Total price of all units combined for that product; only on the order confirmation page.
  • Product-specific events. Optional.
  • Merchandising eVars. Optional and, usually, only on product view or add to basket.

Data Elements for products

As can be seen, there are many potential combinations of these elements. As a consequence, my recommendation is to create one data element (custom script) in DTM for each of these cases. For example:

  • Product listing page (PLP)
  • Product description page (PDP)
  • Cart page
  • Add to basket
  • Remove from basket
  • Order confirmation page

[Screenshot: data elements for the products string in DTM]

As an example, on the order confirmation page, you could use code similar to the following (a sketch assuming the W3C digitalData.transaction object and a hypothetical gift attribute on each item):
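    // Data element (custom script) for the order confirmation page.
    var items = (digitalData.transaction && digitalData.transaction.item) || [];
    var products = [];
    for (var i = 0; i < items.length; i++) {
      var item = items[i];
      // category (empty);product;quantity;total price of all units combined
      var product = ';' + item.productInfo.productID +
                    ';' + item.quantity +
                    ';' + (item.price.basePrice * item.quantity);
      if (item.attributes && item.attributes.gift) { // hypothetical gift flag
        product += ';event48=1';
      }
      products.push(product);
    }
    return products.join(',');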

In this example, if a product is a gift, event48 should also be set.

Remember to use the correct data layer object for each case:

  • digitalData.product: PDPs and PLPs
  • digitalData.cart: add to basket event, cart and checkout pages
  • digitalData.transaction: order confirmation page

I have already described some details about digitalData.product and digitalData.cart in my post The W3C data layer – part II.

Products in rules

As you know, the recommended approach to set Adobe Analytics variables using data elements is to use the UI directly:

[Screenshot: setting Adobe Analytics variables from data elements in the DTM UI]

However, the products string and the purchaseID variable cannot be set through the UI. The only option is to set them in code, in the custom code of the rule’s Adobe Analytics section, using something similar to the following (the data element names are just examples):
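    s.products = _satellite.getVar('products: order confirmation');
    s.purchaseID = _satellite.getVar('transaction ID'); // hypothetical data element
    s.events = 'purchase';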

Please note that the events string must also be updated depending on the product-specific events. Following the previous example, event48 needs to be set only if it is set in the products string. In rules where the Adobe Analytics call is s.tl(), DTM will detect that s.products has been set and will add it to s.linkTrackVars, but not the events in it. Thus, the variable s.linkTrackEvents must also be updated. For example:
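    // Add event48 to the events lists only when it appears in the
    // products string (note the trailing "=" in the search string).
    if (s.products && s.products.indexOf('event48=') !== -1) {
      s.events = s.events ? s.events + ',event48' : 'event48';
      s.linkTrackEvents = s.linkTrackEvents ? s.linkTrackEvents + ',event48' : 'event48';
    }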

Be careful if you use “eventX” without the equals sign (=) in the call to indexOf(), as there is a small risk of setting the wrong event. For example, s.products.indexOf("event12") will detect event12, but also event120 through event129.

When to use and when NOT to use DTM

A while ago, a customer requested a call with me to discuss one issue. Usually, I get more technical questions, but this time, he wanted to have my input regarding something completely different. The developers had realised that they forgot to include a JavaScript library in the website and they could not add it immediately, due to code freeze. They thought of an alternative solution: load it through DTM. My customer, from the marketing department, was not sure whether this was possible or acceptable and, therefore, wanted to know my point of view.

I must admit that, initially, I was a bit perplexed and did not know exactly what to reply. This was the first time I had received this question and I had never thought about it. However, I quickly came up with a proper answer and this is what I suggested. It must be noted that this is my personal perspective, not Adobe’s.

In order to reply to this question, you must first think about governance. Who manages DTM? Who owns DTM in the company? What is the purpose of using DTM? I think that DTM is a marketing tool, managed by the marketing department, used to deploy marketing tags in a website. So, if the new piece of code that needs to be added to DTM is requested by the marketing department, then it will probably make sense to use DTM; on the other hand, if it is the IT department requesting the addition, I would recommend against using DTM for it.

Technically, it is possible to deliver any JavaScript (and even HTML) through DTM. However, the fact that it is possible does not mean that it should be done. Think about the following questions when adding a piece of JavaScript that was not requested by the marketing department: who owns that piece of code? Who updates it? Who fixes it if it stops working? Who is to blame if it is inadvertently removed? What happens if DTM is removed? The marketing team will not want to take any responsibility for code they do not even understand.

So, to summarise, from my point of view, here are some cases that are suitable for DTM and some that are not:

  • In DTM:
    • Web Analytics tags (like Adobe Analytics)
    • Optimisation code (for Adobe Target, for example)
    • DMP tags (AAM also has a module in DTM)
    • Third party re-marketing tags
    • On-site surveys
  • Not recommended in DTM:
    • Generic JavaScript libraries (for example: jQuery)
    • Website functionalities (chats, UI effects)
    • Code that needs to be executed in a very specific location of the page (i.e. not at the top or the bottom)
    • CMS libraries

Can you think of any other use that would fit into one of these categories? If so, please leave a comment!

Executing DTM at the top of the page

If you have been developing websites for a while, you will know that one of the typical recommendations is to execute as much JavaScript as possible at the bottom of the page. This is nothing new; Yahoo recommended it in 2007. The reason is very simple: JavaScript tends to add a delay, both when loading the JS file and when executing it; by moving it towards the bottom, you make sure the HTML is loaded and the page is rendered before any JavaScript starts executing. The user perceives the page as loaded a bit sooner than it actually is. DTM knows this very well, and this is why you have to add two pieces of code: one at the top and one at the bottom of the HTML.

This approach works very well in most cases. DTM loads first the code that needs to be at the top of the page, and then allows you to defer the rest of the loading and execution to the bottom of the page: Adobe Analytics, Adobe Audience Manager and 3rd party tags. This is the typical recommendation. The risk of this recommendation is that some page views might be lost if the user moves away too fast.

However, there is one particular case where this recommendation fails very often. I was working with a well known British newspaper, helping them with the migration from another Web analytics tool to Adobe Analytics. They wanted to run both tools, side by side, for a short period of time, to make sure the numbers did not change too much. To our dismay, Adobe Analytics was showing a much lower number of page views than the other Web analytics tool. We realised that the problem was that Adobe Analytics was executed at the bottom of the page, as per the typical recommendations. The homepage of newspapers tends to be massive, taking many seconds, even minutes, to fully load. This means that the code at the bottom has the risk of not being executed, if the user clicks on a link or closes the browser tab after quickly reading the headlines.

The only solution in this case is to reorganise the code in a slightly different way:

  • The DTM header code needs to be moved to the bottom of the <head> section, ignoring the recommendation in DTM of pushing it to the top.
  • The DTM footer code should still be at the bottom of the <body> section.
  • Add the data layer before the DTM header code.
  • Configure the Adobe Analytics tool to be executed at the top of the page.
    [Screenshot: Adobe Analytics tool configured to load at the top of the page]
  • Set all Page Load Rules that will be setting analytics variables to be executed at the top of the page.
    [Screenshot: page load rule set to trigger at the top of the page]
  • The Data Elements used for Adobe Analytics cannot use CSS selectors.

This solution guarantees that the analytics code is executed most of the time, at the expense of delaying the page load by a few hundred milliseconds.

The W3C data layer – part II

Now, looking into the standard, we will get into the different sections that make up the recommended data layer. Let’s review each of them in the following posts.

Root: digitalData

The JavaScript object should always be called digitalData.

Page Identifier Object

Although I personally do not find it very useful for Web analytics, this identifier should be completely unique. In particular:

“This value SHOULD distinguish among environments, such as whether this page is in development, staging, or production.”

Page Object

This is where you store all the information about the page. It is very well suited for the page name, section, subsection… In particular, s.pageName and s.channel are usually taken from this object. For example (a minimal sketch):
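    // A sketch of the page object for a home page (CEDDL-style names):
    window.digitalData = window.digitalData || {};
    digitalData.page = {
      pageInfo: { pageName: 'home' },
      category: { primaryCategory: 'home' }
    };

    // In DTM, via data elements or custom code:
    s.pageName = digitalData.page.pageInfo.pageName;
    s.channel = digitalData.page.category.primaryCategory;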

If you want to track additional information from the page, just add more props to the analytics object s.

Product Object

This is the start of a set of objects that can be used in various ways. In particular, digitalData.product[n] is an array of product objects. You should use this object for products that are shown on the page, irrespective of whether they have already been added to the basket. In a PLP (Product Listing Page), the contents of the array are straightforward.

However, in a PDP (Product Description/Details Page), it is not as obvious. Initially, you might think of only including one element in the array, the main product, but it might also be useful to include the other products shown: similar products, recommended products, people who bought this product also bought these others… In the latter case, you may set digitalData.product[0] as the main product and digitalData.product[n] for n>0 for the other products. This is useful to set the prodView event only on the main product.

Regarding the data that you can set, most of the elements are self-explanatory and most of them are optional. Some comments on the sub-objects and nodes of this object:

  • productInfo.productID: it does not have to be the SKU, especially if you have a unique productID for each product but the same SKU can be used for different colours, sizes… In that case, the productID is what you would use for the s.products variable
  • productInfo.productName: I would not suggest using it as the product ID in the s.products variable
  • category.primaryCategory: in version 15 of SiteCatalyst/Adobe Analytics, the limitation on the category slot in the s.products variable was fixed, although I have never seen an implementation that uses it consistently; in general, I suggest creating a merchandising eVar for the category
  • attributes: in case you want to know what kind of secondary product this is (similar products, recommended products, people who bought this product also bought these others…), you can set an attribute for it
  • linkedProduct: in the case of secondary products that are related to the main one, you could link the secondary product to the main one using this property

With all the previous comments in mind, you could use code like the following to create the s.products variable (a sketch; the merchandising eVar is just an example):
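    // Data element (custom script) for a PDP.
    var list = digitalData.product || [];
    var products = [];
    for (var i = 0; i < list.length; i++) {
      var p = ';' + list[i].productInfo.productID;
      if (i === 0 && list[i].category) {
        // category;product;quantity;price;events;merchandising eVars
        p += ';;;;eVar40=' + list[i].category.primaryCategory; // hypothetical eVar
      }
      products.push(p);
    }
    return products.join(',');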

Cart Object

Although the cart object might look similar to the Product Object, they in fact serve different purposes. As its name implies, all products that are already in the cart should be added to this object. As a consequence, it is entirely possible to have both the Product and the Cart objects on the same page, with different contents: the user has already added some products to the basket and is still browsing in order to add new products to it. It is up to the development team to decide whether to include this object on all pages or only on those pages where it makes sense; for example, you might want to remove it in the help section of the website.

Some comments on the sub-objects and nodes of this object:

  • cartID: a unique ID of the cart, usually created when the cart is opened
  • price: all details about the price of the contents of the cart; however, the values might not be 100% accurate, as some values only become known as you progress through the checkout process; the voucher and shipping details should only contain cart-wide information
  • item[n].productInfo: this is exactly the same as digitalData.product[n].productInfo
  • item[n].quantity: the total number of units for this particular item; however, remember that Adobe Analytics does not track units in the cart
  • item[n].price: where you would keep product-specific vouchers

Since you can have both Product and Cart objects, it is up to the implementation to decide which one to use on each page. For example, in a PLP, the Product Object will generally be used, but in a cart page, the Cart Object is the one to be used.

The W3C data layer – part I

This is the first post of a series in which I am going to describe the W3C data layer. A few months ago, I explained why it was a good idea to have a data layer. In this series, I am going to dive into the details of one particular data layer implementation: the W3C standard. For those of you who do not know what the W3C does, it is the international body that creates the standards we use every day on the web: HTML, CSS, Ajax… Although there are other options for data layers, like JSON-LD, I personally prefer the W3C standard; after all, this body has created some of the most important standards on the Internet.

The first thing I suggest is that you download the W3C data layer standard: http://www.w3.org/2013/12/ceddl-201312.pdf. It is completely free. Have a look at it. You will notice the number of well-known companies that contributed to this standard, including Adobe, my employer. In total, 56+ organisations and 102 individuals collaborated in its creation. So, if you choose to follow this document, you can be confident that you are not on your own.

You might have also noticed the recency of this document: it is less than two years old (at the time of writing). This is probably why many Web analysts have never heard of the concept of a data layer. That being said, the word is spreading quickly and it is starting to become the norm, rather than the exception. In fact, a few of my customers who are undergoing a major redevelopment of their websites are including a data layer, which they did not have before.

I hope that, by now, you are fully convinced of the need for a data layer and the benefits of going with the W3C standard. You should also start spreading the word within your organisation. I have found that this step can be important, as any new addition to the website will face some resistance. It must also be remembered that this data layer is not exclusive to Web analytics; other Web marketing tools, like Web optimisers and DMPs, will greatly benefit from a data layer.

The development team is probably going to be the most difficult to convince. They might have a different approach in mind or worry about the effort it will take, but my experience shows that, once they understand it, they will support the concept.

Start defining your data layer

Once you have everybody aligned, you should create a document with the contents of your particular data layer implementation. Remember to include all online marketing teams in the documenting process: Web analytics, optimisers, advertisers… I was recently involved in the creation of a data layer for a customer and it took 5 weeks to finish. This is probably an edge case, but you should be aware that this stage might take longer than initially expected.

In a future post I will explain what the content of the data layer is. For now, I suggest you review section 6 of the W3C data layer document to see what you can expect to include in the data layer. There are a couple of examples in section 7.

Location of the data layer

Before starting the development, the location of the data layer must be agreed with all parties involved. Ideally, it should be at the beginning of the <head> section of the HTML document, so that it can be used by any other JavaScript code. If this top-most location cannot be achieved, it should be located before any tool that will read the data layer is loaded. For example, if you are using a tag manager or a DMP like Adobe Audience Manager, the data layer should be placed above all of these tools.

Finally, there is one additional technical problem with placing the data layer at the top. Page-level information is usually retrieved from the CMS and can easily be cached and set in the HTML. However, depending on the CMS, some information, like user-level information, is not available on page load and requires an AJAX call. As a consequence, it is possible that the code that needs this data executes before the data is available. For example, the Web analytics code might be capturing the log-in status and will need the user-level information when executing. This problem needs to be solved on a case-by-case basis.


In future posts I will describe in greater detail other aspects of the W3C data layer:

  • Each of the JavaScript objects
  • Integration with DTM

Why a data layer is a good thing

Back in the old days, when we used the traditional division between an s_code and on-page code, the concept of a data layer made little sense. The developers had to add some code server-side to generate the on-page code. Gathering the information to be captured was a server-side issue: the CMS would have to collect the information from one or various sources (CMS DB, CRM…) and present it on-page, so that, when calling s.t(), the s object would have all needed information.

However, now that tag managers are becoming the norm, the previous approach does not work well. There is no on-page code; all code is generated in the tag manager and injected into the page through it. This means that, in order to track some data, it must already be in the HTML code or in other resources available to JavaScript, like query string parameters or cookies.

One might think that, as long as the required information is visible on the page and can be extracted using CSS selectors, we are safe. Consider the requirement to capture the city in the delivery address and markup like this (a minimal sketch):
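    <div id="delivery-address">
      <span class="name">John Smith</span>
      <span class="city">London</span>
    </div>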

Using the selector #delivery-address .city, we should be able to extract “London”. But, what if the developers decide to change the id or the class? What if there is a new requirement to completely remove this data from the web page? Our tracking will be broken and, if this happens a few months after the release, we will probably not know why.

The most reliable solution is to add a JavaScript object, completely independent of the rest of the HTML code, with all the relevant information of the web page. Then, the tag manager just needs to reference the elements in the JavaScript object directly. The developers can then change anything in the HTML code and, as long as this JavaScript object is kept intact, the tracking will continue working.

[Image: an example W3C digitalData object]

There are many ways to create a data layer, but all fall into two categories: create a custom data layer or follow a standard. I would never recommend creating a custom data layer. There are a few standards worth mentioning:

  • AEM client context. This is the de facto standard that comes with AEM. I have spoken with AEM developers and they all say that it can be used in many cases.
  • JSON-LD. I have never worked with this one, but one of my clients was already using it. More information here: json-ld.org.
  • W3C Customer Experience Digital Data Layer. I always recommend this standard, as it has been produced by the W3C (the same body that standardises the Web) and Adobe took a role in this standardisation, together with other companies like Google, IBM, Red Hat… The previous image is an example of how the data layer would look in this case. The standard itself is freely available: http://www.w3.org/2013/12/ceddl-201312.pdf.

In future posts, I will add some details about the last standard.

DTM permissions and workflow

Some time ago, I received an urgent call from a customer claiming that DTM had broken their website. They wanted to see me immediately, as it was having a huge impact on them. Fortunately, the problems only manifested in staging, but they could not move to production. Once I arrived at my client’s office, I immediately realised what had happened: the data analyst had created some data elements using JavaScript he had found on the Internet, and he had just copied and pasted the code. The code worked on his computer, so he went ahead and approved and published it. The reality was that his code only worked in IE and crashed in other browsers.

As I have already explained, there are many benefits to using a tag management solution. However, with great power comes great responsibility. Using DTM does not mean that marketers can implement anything they want and publish rules as they like. On the contrary, they must be very strict. The previous example is exactly the opposite of what must be done.

The solution I suggested to my client was to follow a very strict workflow and use the permissions wisely:

  • Anybody who needs to create new rules should be given the User role. This allows the user to create new rules and make sure they work fine using the DTM Switch plugin.
  • You need to have a group of testers with the Approver role. They are responsible for making sure the new rules not only work fine in one browser, but are cross-browser compatible, do not break the website, capture the expected data…
  • Once the rules have been approved, it is time to publish them. Very few people should have the Publisher role and publishing rules should be done very carefully, just like a typical code release.

[Diagram: DTM permissions workflow]

Some additional recommendations:

  • Publish rules at a set time every week so that, in case something goes wrong, it is easy to identify the new rules as the culprit and roll back to the previous approved version.
  • Get some developer time to create JavaScript rules. A developer will probably need five minutes per rule and will do it right the first time.
  • The testers, ideally, should be the same as the website testers, so that the tests are full tests, with the right tools and environment.

Custom conditions in DTM

When creating both page load rules and event based rules in DTM, you have the option of configuring some conditions to determine when to fire those rules: parts of the URL, cookies, browser properties, JavaScript variables… If you add more than one condition, the AND boolean operator is applied to all conditions. However, in some cases, these out-of-the-box conditions are not enough. Thankfully, DTM offers the custom condition, where you can write pure JavaScript.

The first thing to note is that the snippet of code must return a boolean. It might work with expressions that can be evaluated as true/false (like an empty/non-empty string, or zero/non-zero), but I have just not tried them. In other words, there must be a return statement within the code. In fact, the return statement must be at the very end of the code.

This code will NOT work:

[Screenshot: a custom condition that does not work]
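A rough reconstruction of that kind of snippet (assumed, for illustration):

    // This will NOT work: the return statement is not the last statement.
    var path = document.location.pathname;
    if (path.indexOf('/checkout') === 0) {
      return true;
    }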

What I always do in this situation is create a variable named ret and return it at the very end.

[Screenshot: a custom condition using a ret variable]
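Roughly (again, an assumed reconstruction):

    // Build the result in a variable and return it at the very end.
    var ret = false;
    if (document.location.pathname.indexOf('/checkout') === 0) {
      ret = true;
    }
    return ret;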

Execute JavaScript in event based rules

If you create an event based rule that fires upon the click of a link and you want to execute some JavaScript, chances are that, if you add it to the “JavaScript/Third Party Tags” section, it will never be executed. The reason is that the browser has already been instructed to unload the page and move to a new URL. It will not have time to load an external JavaScript file and execute it; in fact, I think it will not even attempt to do so.

What can you do in this case? Use the custom condition of the rule to make sure the JavaScript is executed. For example, to write a cookie:

[Screenshot: a custom condition writing a cookie]
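A sketch of such a condition (the cookie name and value are just examples):

    // Write the cookie synchronously, before the browser unloads the page;
    // _satellite.setCookie(name, value, days) is a DTM helper.
    _satellite.setCookie('lastClickedArea', 'header', 1);
    return true;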

There is no need to return true in this case. However, it is more convenient to leave it as shown, as the console will then show the rule as executed.

Capture information from the clicked element

Imagine you have an event based rule in which you want to capture the full URL of the clicked link and store it in prop1. Your first idea would be to create a data element that returns this.href:

[Screenshot: a data element returning this.href]
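The data element would contain something like:

    // Custom script data element. This does NOT work as expected:
    // "this" is not bound to the clicked element inside a data element.
    return this.href;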

However, one limitation of data elements is that they have no idea of which DOM element has been clicked, if any. If you try to use the this keyword in data elements, you will get a surprise: it will not work as expected.

One possible solution is to go to the custom code of the Analytics section and set the value there:

[Screenshot: custom code in the rule’s Analytics section]
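For example, in the custom code of the rule’s Adobe Analytics section, where “this” is the clicked element:

    s.linkTrackVars = 'prop1';
    s.prop1 = this.href;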

Remember to include s.linkTrackVars if you are making an s.tl() call.

I do not like the previous approach, as it is not elegant and you need to explicitly open the editor to see the contents of the rule. My preferred solution is to use the custom condition and set a data element on the fly; then, in the analytics section, just reference the data element as usual:

[Screenshot: a custom condition setting a data element on the fly]
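A sketch of that custom condition (the data element name is just an example):

    // Store the clicked URL in a data element on the fly.
    _satellite.setVar('clicked link URL', this.href);
    return true;

Then, in the Analytics section, prop1 simply references %clicked link URL%.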

[Screenshot: prop1 set from the data element in the rule’s Analytics section]


Do you have any other uses for the custom conditions in DTM? I would love to hear your opinions on that.

Use a tag manager

Back in the old days, the only way to add Web analytics code to a website was through manual coding. If you were using Adobe Analytics, you would need to add two pieces of code to the website: the s_code and the on-page code. The s_code is a JavaScript file with common code for Adobe Analytics (SiteCatalyst) and the on-page code contains the page-specific data. I am sure many of you are familiar with lines of code like these (a representative sketch):
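    <script src="/js/s_code.js"></script>
    <script>
      s.pageName = "home";
      s.channel = "home";
      /* ...more page-specific variables set here... */
      var s_code = s.t(); if (s_code) document.write(s_code);
    </script>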

While this does not seem like a big problem, my experience with many customers shows that this traditional solution is far from ideal. Typical issues I have found are:

  • Web developers usually have little or no knowledge of Adobe Analytics code. The moment you mention “eVar” or “prop”, they completely disconnect from the conversation until they clearly understand what these words mean. Do not get me wrong, I have been a developer myself and I have nothing against developers, but I know that it is very difficult to find a web developer who understands Adobe Analytics.
  • Changes take a very long time to be published. Even the smallest code change (just adding an eVar, for example) can take weeks, if not months, before it is live. The main reason is that any new feature must be added to a scrum backlog, a change request…
  • Disconnect between IT and marketing. These two departments tend to have very different goals. As a consequence, what is of great importance for a web analyst might be considered low priority by the scrum master.

If you search for more reasons, you will find many more.

So, what is the solution? Use a tag management solution, like Adobe Dynamic Tag Management. It is not a silver bullet that will solve all your problems, but it will help you move forward more easily. Do not even think of developing your own solution: it will take you years to reach parity with even the worst commercial solution.