The User-Agent (Part I)

The User-Agent parameter is a piece of information that all browsers attach to all HTTP(S) requests they make. In today’s post, I will demystify this HTTP parameter and explain how it works. There will be a second part, where I will explain how this parameter is used in Adobe products.

Let’s start with the basics. What is a user agent? According to the W3C:

A user agent is any software that retrieves, renders and facilitates end user interaction with Web content, or whose user interface is implemented using Web technologies.

You will have noticed that I have been using “user agent” and “User-Agent” as two distinct concepts. In case it was not clear, the former is the software (as defined above) and the latter is the HTTP parameter. I will use this convention in the rest of this post.

Types of user agents

In the most usual case, this software is your browser, the application you are using to read this post, but there are other cases. Think of it as software that gets the HTML and all the assets from the website, renders the page and executes the JavaScript, so that you can interact with the website.

The other typical case is any client software that interacts with a remote web server. Nothing prevents a developer from creating such a piece of software. Examples include:

  • Web crawlers. These are programs, used by search engines, to systematically browse the whole World Wide Web, in order to index it. The best example is Googlebot, used by Google.
  • Malicious bots. Any piece of software trying to take advantage of a website.
  • Applications like curl or wget.
  • Programming language libraries, like Apache HttpClient for Java or Python’s http.client module.
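Libraries of this kind usually announce themselves with a default User-Agent of their own. As a small illustration, here is a sketch that uses Python's standard urllib.request module (my choice for the example, not one of the libraries named above) to inspect that default without sending any request:

```python
import urllib.request

# build_opener() returns an OpenerDirector; its addheaders attribute lists
# the headers it adds to every request by default.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The value has the form "Python-urllib/<version>".
print(default_headers["User-agent"])
```

The exact version suffix depends on the Python release you run it with.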

HTTP and the User-Agent

The HTTP specification defines a request header, aptly named “User-Agent”, for the user agent software to populate. It is meant to indicate what type of software is making the request, so that the server receiving the request can tell which client is connecting to it. However, and crucially, nothing checks its content: this parameter can be any string the client chooses to send.
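To make this concrete, here is a minimal sketch of what the parameter looks like on the wire and how a server could read it. The request text, the host name and the helper user_agent_from_request are all made up for the illustration; real servers use a proper HTTP parser.

```python
from typing import Optional


def user_agent_from_request(raw_request: str) -> Optional[str]:
    """Return the User-Agent value from a raw HTTP/1.1 request, if present."""
    head, _, _body = raw_request.partition("\r\n\r\n")
    header_lines = head.split("\r\n")[1:]  # skip the request line
    for line in header_lines:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == "user-agent":
            return value.strip()
    return None


# A hypothetical request: the User-Agent value is free text.
raw = (
    "GET / HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: literally any string works here\r\n"
    "\r\n"
)
print(user_agent_from_request(raw))
```

The server simply receives whatever text the client put after "User-Agent:".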

For browsers, the typical format is something like:

Note that, for legacy reasons, it always starts with “Mozilla”, irrespective of the software vendor: Google, Microsoft, Mozilla, Opera… Other types of software use a simplified version of the previous format.

For those of you who are curious about the User-Agent value of your browser, I suggest you try WhatIsMyBrowser.com. As an example, the browser I am using right now shows:

Let’s parse it:

  • Mozilla/5.0. A legacy token meaning “Mozilla-compatible”; practically every modern browser sends it.
  • X11. I am using the X Window System.
  • Ubuntu. Obvious, but also notice that it does not tell which version of Ubuntu I am using.
  • Linux x86_64. I am running a 64-bit Linux operating system.
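  • rv:63.0. The release version of Gecko, which matches the Firefox version: I am using Firefox 63.0.
  • Gecko/20100101. Gecko is the Firefox rendering engine. On desktop, the date after it is a fixed value, not a real build date.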
  • Firefox/63. Again, stating that this is Firefox 63.
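The same breakdown can be sketched in code. The string below is assembled from the tokens listed above, so treat it as my reconstruction (I have written the last token as Firefox/63.0), and split_user_agent is a hypothetical helper that only handles this simple Firefox-style layout, not every User-Agent out there.

```python
import re

# Reconstructed from the tokens discussed above (an assumption).
UA = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"


def split_user_agent(ua: str) -> dict:
    # Layout assumed: <compat> (<platform details>) <product/version> ...
    match = re.match(r"^(\S+) \(([^)]*)\)\s*(.*)$", ua)
    if match is None:
        raise ValueError("unexpected User-Agent layout")
    compat, comment, rest = match.groups()
    return {
        "compatibility": compat,
        # The parenthesised part holds the platform details, separated by ";".
        "platform": [part.strip() for part in comment.split(";")],
        # Remaining tokens are product/version pairs such as "Firefox/63.0".
        "products": dict(token.partition("/")[::2] for token in rest.split()),
    }


parts = split_user_agent(UA)
print(parts["platform"])
print(parts["products"])
```

In practice, a maintained User-Agent parsing library is a better choice than hand-written regular expressions, because real strings vary a lot between browsers.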

In summary, there is a lot of information we can gather from this simple string. Depending on the User-Agent, we know which browser is being used and we can use it to tailor the experience.

Limitations

It is precisely the limitations of the User-Agent HTTP parameter that cause trouble for digital marketers. The main limitation is that anyone can fake it: nothing enforces its format or content. Let me illustrate it with an example:

As you can see, by default, my curl version identifies itself correctly and adds the version number. However, nothing prevents me from doing the following:

Now, curl is telling the whole world that it is a Mozilla Firefox browser! More importantly, there is nothing the server can do to verify the identity of this “browser”. You may have also noticed another interesting consequence of this change. When Yahoo detected that the client was curl, it did not set any cookie. However, when tricked into thinking that it is “talking” with a real browser, it sends back a cookie in the response (line 22).
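The same effect can be reproduced locally. The sketch below is not what Yahoo does; it is a toy server with a made-up policy (set a cookie only when the User-Agent contains "Firefox/") and a client that sends two different strings. The curl version number and the Firefox string are illustrative values.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        self.send_response(200)
        # Hypothetical policy: only clients claiming to be Firefox get a cookie.
        if "Firefox/" in ua:
            self.send_header("Set-Cookie", "session=demo")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def cookie_for(port: int, user_agent: str):
    """Send one GET with the given User-Agent and return the Set-Cookie value."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/", headers={"User-Agent": user_agent})
    response = conn.getresponse()
    response.read()
    cookie = response.getheader("Set-Cookie")
    conn.close()
    return cookie


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

honest = cookie_for(port, "curl/7.58.0")
spoofed = cookie_for(
    port,
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0",
)
server.shutdown()
server.server_close()
print(honest, spoofed)
```

The server has no way to tell that the second request came from the same script as the first; it only sees the string it was given.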

Another notable limitation shows up with Apple devices. Their User-Agent does not include the actual hardware model, just the iOS version. For example, the following User-Agent string:

only tells us that it is an iPhone running iOS 11.4.1. There is no way of knowing which iPhone model this Safari browser is running on.
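Here is a small sketch of everything that can be extracted in this case. The string is a typical Safari User-Agent for iOS 11.4.1 that I am using as an illustration (it may differ from the one discussed above), and ios_details is a hypothetical helper.

```python
import re

# Illustrative Safari on iOS 11.4.1 User-Agent (an assumption).
IPHONE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 "
    "Mobile/15E148 Safari/604.1"
)


def ios_details(ua: str):
    """Return (device family, iOS version), or None if the layout differs."""
    match = re.search(
        r"\((iPhone|iPad|iPod touch); CPU (?:iPhone )?OS ([\d_]+) like Mac OS X",
        ua,
    )
    if match is None:
        return None
    device, version = match.groups()
    # iOS versions are written with underscores inside the User-Agent.
    return device, version.replace("_", ".")


print(ios_details(IPHONE_UA))
```

The device family and the iOS version are all there is; no token in the string identifies the hardware model.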

As I said at the beginning, this is just the first part of a 2-part series. In my next post, I will explain how Adobe products use the User-Agent string and the consequence of the limitations. Stay tuned!
