These strings hold more insight than you may realize
As we browse the web, we've come to expect a seamlessly smooth experience from the websites we visit. And as website owners, we strive to improve the performance and engagement of our websites by understanding our visitors. We need to understand their different characteristics, from their location to the browser, operating system, and hardware of the device they are using to access our site.
This information supports a multitude of use cases, including website analytics and optimization, security checks when a user logs in to your site with a new device, or content adaptation tailored specifically to the device used (such as a mobile phone). These use cases all require knowledge of your user's device, browser, or operating system.
One way to gather this information is via the User Agent string – a type of HTTP header. For nearly thirty years, the User Agent string has been a well-established part of the web experience. However, a new HTTP header has been created by Google to replace the User Agent string: User Agent Client Hints (UA-CH).
We've got more information on User Agent Client Hints later in this blog. In the meantime, let's dissect the User Agent string for more detail.
Unpacking the User Agent string
A User Agent (UA) string is a type of HTTP request header that contains information on the device. For example, if you browse the web on your smartphone, your device will send an HTTP request header to the web server, saying that it is a mobile device. The website will then respond and show you the mobile version of the page.
We could just give you a list of User Agent strings and be done with it. But that wouldn’t teach you much, would it?
Instead, let’s dissect some example User Agents:
Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.62 Mobile Safari/537.36
Mozilla/5.0 appears at the start of most UA strings. It can largely be ignored as it has no relevance to the associated device. Historically, it was used to indicate compatibility with the Mozilla rendering engine, a piece of software that draws text and images on the screen.
Linux; Android 12 tells us details about the operating system. In this case, the device is running the Android operating system, which is based on Linux.
For mobile User Agents, the Pixel 6 section of the string tells us the device name or device model number. In this case, this User Agent came from a Google Pixel 6 phone. In other instances (such as for a Windows desktop device), this element of the string may define the device architecture.
AppleWebKit/537.36 indicates what browser rendering engine is used. A rendering engine is what transforms HTML into an interactive webpage on the user’s screen. The WebKit browser engine was developed by Apple and is primarily used by Safari. Chromium-based browsers such as Chrome now use Blink, a fork of WebKit, but still include the AppleWebKit token for compatibility.
(KHTML, like Gecko). This section of the string doesn’t necessarily provide more detail on the device but ensures compatibility for historical reasons. Check out the history of the User Agent string guide from Human Who Codes for more details.
And finally, Chrome/93.0.4577.62 Mobile Safari/537.36 has more detail on the browser and its version number. In this User Agent example, the device is using a mobile version of Chrome, version 93.
So, the different sections of this Chrome User Agent string are:
Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.62 Mobile Safari/537.36
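To make the breakdown concrete, here is a minimal Python sketch that splits the string into its sections. The helper names split_user_agent and platform_fields are our own for illustration, and the approach assumes a well-formed string with balanced parentheses; real-world User Agents are messier than this.

```python
import re
from typing import List

# Matches either a parenthesised comment or a run of non-space characters.
TOKEN_RE = re.compile(r"\([^)]*\)|\S+")


def split_user_agent(ua: str) -> List[str]:
    """Split a UA string into product tokens and parenthesised comments."""
    return TOKEN_RE.findall(ua)


def platform_fields(ua: str) -> List[str]:
    """Return the semicolon-separated fields of the first parenthesised comment."""
    for token in split_user_agent(ua):
        if token.startswith("(") and token.endswith(")"):
            return [field.strip() for field in token[1:-1].split(";")]
    return []


ua = (
    "Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/93.0.4577.62 Mobile Safari/537.36"
)

for part in split_user_agent(ua):
    print(part)
print(platform_fields(ua))
```

The first function gives you the seven sections discussed above in order; the second pulls the operating system and device model out of the first set of brackets.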
Let’s look at another example, this time a Firefox User Agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0
Firefox User Agents tend to follow a four-component format, whereas Chrome User Agents may include more string elements. In the above case, the rendering engine is Gecko, and rv:94.0 is its version number. Win64; x64 indicates that the device is using the 64-bit version of Windows with x64 computer architecture.
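Because the Firefox format is so regular, a single regular expression can pick it apart. The sketch below is illustrative only: the function name parse_firefox_ua and the dictionary keys are our own, and the pattern assumes the desktop four-component layout shown above.

```python
import re
from typing import Dict, Optional

# Assumed layout: Mozilla/5.0 (platform; rv:geckoversion) Gecko/trail Firefox/version
FIREFOX_RE = re.compile(
    r"^Mozilla/5\.0 \((?P<platform>[^)]*)\) "
    r"Gecko/(?P<gecko_trail>\S+) Firefox/(?P<firefox_version>\S+)$"
)


def parse_firefox_ua(ua: str) -> Optional[Dict[str, str]]:
    """Return the main components of a desktop Firefox UA, or None if it doesn't match."""
    match = FIREFOX_RE.match(ua)
    if match is None:
        return None
    fields = [f.strip() for f in match.group("platform").split(";")]
    # The Gecko version lives inside the platform section as "rv:<version>".
    rv = next((f[len("rv:"):] for f in fields if f.startswith("rv:")), "")
    return {
        "os": fields[0],
        "gecko_rv": rv,
        "gecko_trail": match.group("gecko_trail"),
        "firefox_version": match.group("firefox_version"),
    }


ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0"
print(parse_firefox_ua(ua))
```

A Chrome User Agent returns None here, which is the point: each browser family needs its own pattern.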
Why are User Agent strings like this?
The Mozilla/5.0 and (KHTML, like Gecko) parts of a User Agent string can largely be ignored, but why were they included in the first place?
Originally, the User Agent string wasn't designed to be confusing, but it was a necessary evil to ensure compatibility between new and old browsers. For new browsers to get a slice of the browser market share pie, they needed to make sure that they weren't excluded simply because they were new.
As an example, there's a new browser on the block called "VeryNewBrowser". Your User Agent string contains information that you are using VeryNewBrowser to access a website. Many sites will (incorrectly!) see that VeryNewBrowser is not PopularBrowser, so will serve you a crappy version of the website.
To counteract this, the User Agent string could contain "PopularBrowser (actually VeryNewBrowser)" which fools the web server into thinking that you are using PopularBrowser.
This addition to the User Agent string continued for many, many years, until eventually we ended up where we are today. Everyone claims to be Mozilla/5.0; Android Chrome claims to be Mobile Safari; Mobile Safari claims to be Gecko. It can get very confusing, very quickly. Check out WebAIM's blog for more information on the compatibility history of the User Agent string.
Other User Agent strings
User Agents don't necessarily follow the same format. You may see a string that looks like the following:
Dalvik/2.1.0 (Linux; U; Android 9.0; ZTE BA520 Build/MRA58K)
Looks scary, right? But when you start to look deeper, you can see that this is a ZTE Android phone. To gather more insight, you can run the UA string through our User Agent parser and it’ll decode the rest!
It’s important to note that it’s not always easy to detect a device. For example, iPad or iPhone User Agents don’t always contain a lot of information on the device. Additionally, with each new Apple device that is released, it gets harder to distinguish between the latest devices.
From a typical iPhone User Agent, we can only identify that the device is an iPhone – we can’t determine the exact device model.
Our Device Detection takes the sting out of identifying devices from User Agent strings. Since the release of iOS 12.2, we have upgraded our device detection with a two-part solution to identifying Apple devices: Image Hashing and Benchmarking.
Put our device identification to the test below with an iPhone User Agent. We’ll take the string and show you what information we can parse from it.
Let’s talk about the operating systems (OS) section of the UA string. Some of the main OS software used by various devices are Android, Windows, and Chrome OS (a Linux-based desktop OS developed by Google).
For Apple devices, we have iOS, iPadOS, macOS (previously Mac OS X and later OS X), tvOS, and watchOS. You could also include Darwin in this list, as it is an open-source Unix-like operating system first released by Apple in 2000. (Fun fact: Darwin forms the core set of components upon which Apple's other operating systems are built.)
Within a UA string, you might find information on both the operating system and its version number. Each software vendor determines their own numbering system, as well as whether each release is a minor or major increment.
Seems simple, but don’t assume a software vendor will always follow the same pattern for a new release – sometimes conventions can change, or version numbers are skipped. And if that wasn’t enough, User Agents don’t always present the browser name and version number in a consistent manner.
To keep our User Agent string database healthy, we regularly map new User Agents, especially when there are new OS versions that have recently been released. It’s imperative that we provide our customers with the most accurate and up-to-date User Agent database available.
The web browser
The UA string also contains information on the type of web browser that is used by the device. Some of the more commonly known web browsers are Chrome, Safari, Edge, Firefox, and Opera.
Like software vendors, each browser vendor (think Google for Chrome, and Mozilla for Firefox) determines their own numbering system. Browser versions can be represented in the string at various stages of the development cycle. For example, a UA string might contain a browser that is in beta, canary, or developer stage.
Again, the browser name and version number aren’t always presented in a consistent manner. For example, let’s look at this common Android User Agent:
Mozilla/5.0 (Linux; Android 10; POT-LX1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4621.2 Mobile Safari/537.36
At first glance, you may look at this UA string and assume the browser vendor is Mozilla, or even Safari on mobile (thanks to the Mozilla/5.0 and Mobile Safari/537.36 portions of the string). In fact, for this string, the browser is Chrome version 95.0.4621.2. It pays to slow down and carefully analyze a UA string!
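One way to avoid being fooled by the compatibility tokens is to check for the most specific browser token first and only fall back to Chrome or Safari at the end. The sketch below shows that ordering. The token list is a small, hand-picked assumption of ours (it is far from complete), and detect_browser is a hypothetical helper rather than part of any product.

```python
import re
from typing import Optional, Tuple

# Ordered from most specific to least specific. Illustrative, not exhaustive.
BROWSER_TOKENS = [
    ("SamsungBrowser", "Samsung Internet"),
    ("Edg", "Edge"),
    ("OPR", "Opera"),
    ("Firefox", "Firefox"),
    ("Chrome", "Chrome"),
    ("Safari", "Safari"),
]


def detect_browser(ua: str) -> Optional[Tuple[str, str]]:
    """Return (browser name, version) for the first matching token, or None."""
    for token, name in BROWSER_TOKENS:
        match = re.search(r"\b" + re.escape(token) + r"/(\S+)", ua)
        if match is None:
            continue
        if token == "Safari":
            # Safari proper reports its version in a separate Version/ token;
            # the number after Safari/ is the WebKit build.
            version = re.search(r"\bVersion/(\S+)", ua)
            return name, (version.group(1) if version else match.group(1))
        return name, match.group(1)
    return None


ua = (
    "Mozilla/5.0 (Linux; Android 10; POT-LX1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/95.0.4621.2 Mobile Safari/537.36"
)
print(detect_browser(ua))
```

Because Chrome sits above Safari in the list, the Mobile Safari/537.36 token never gets a chance to mislead the result.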
If you are ever in doubt about what information is contained within a UA string, we recommend referring to our User Agent tester to do all the hard analysis for you.
Within the UA string, there is a key element present that can be used to detect the device. This element details whether the physical device that is accessing the website is a mobile phone, tablet, e-reader, smart TV, or games console (to name but a few).
For example, the UA string below tells us that the device is a 2018 model Samsung Galaxy J7:
Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-J737F) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/14.0 Chrome/87.0.4280.141 Mobile Safari/537.36
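As a rough sketch of how the model identifier can be pulled out of a string like this one, the Python below takes the field that follows the Android version inside the first set of brackets. The function name android_device_model is ours, and the layout it relies on is an assumption that holds for Chrome-style Android strings but not for every browser. Turning the identifier into a marketing name such as "Galaxy J7" still needs a device database.

```python
import re
from typing import Optional


def android_device_model(ua: str) -> Optional[str]:
    """Return the field after 'Android <version>' in the first parenthesised section."""
    platform = re.search(r"\(([^)]*)\)", ua)
    if platform is None:
        return None
    fields = [f.strip() for f in platform.group(1).split(";")]
    for i, field in enumerate(fields):
        if field.startswith("Android") and i + 1 < len(fields):
            # Drop a trailing "Build/<id>" suffix if one is present.
            return re.sub(r"\s+Build/\S+$", "", fields[i + 1])
    return None


ua = (
    "Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-J737F) AppleWebKit/537.36 "
    "(KHTML, like Gecko) SamsungBrowser/14.0 Chrome/87.0.4280.141 Mobile Safari/537.36"
)
print(android_device_model(ua))
```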
Sometimes, it’s difficult to be certain of a device’s hardware. The biggest reason for this is that the UA string lacks information, or the device name could belong to multiple vendors.
Mozilla/5.0 (Linux; Android 9; S20_EEA Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/95.0.4638.74 Safari/537.36
In this UA, the only supplied information on the device is ‘S20’. That could refer to any of these hardware vendors:
Samsung Galaxy S20
User Agent strings aren’t always associated with a physical device. Sometimes they represent something called a crawler.
A crawler is a type of web traffic that operates without human interaction. Its purpose is to monitor the availability or performance of a website, or to retrieve information for inclusion in search engines or monitoring services.
Crawlers go by many names – bots, robots, spiders, probes, monitors – but they are typically easy to spot and not necessarily harmful to your website.
A good crawler will proudly identify itself as a crawler, often including “bot” at some point within its UA string:
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
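Spotting a self-declared crawler like this one can be as simple as a keyword check. The sketch below assumes a short keyword list of our own choosing (bot, crawler, spider) and a hypothetical helper for pulling out the information URL that many crawlers include after a plus sign. It will not catch crawlers that hide behind a normal-looking string, and it could misfire on a device name that happens to contain one of the keywords.

```python
import re
from typing import Optional

# Keywords are an assumption for illustration; extend to suit your own traffic.
CRAWLER_RE = re.compile(r"bot|crawler|spider", re.IGNORECASE)
INFO_URL_RE = re.compile(r"\+(https?://[^\s;)]+)")


def looks_like_declared_crawler(ua: str) -> bool:
    """True if the UA string openly contains a crawler keyword."""
    return CRAWLER_RE.search(ua) is not None


def crawler_info_url(ua: str) -> Optional[str]:
    """Return the '+http...' information URL from the UA string, if any."""
    match = INFO_URL_RE.search(ua)
    return match.group(1) if match else None


ua = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
print(looks_like_declared_crawler(ua), crawler_info_url(ua))
```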
However, bad crawlers may crawl your website to find email addresses to send spam to, or to try and gain access to your website development login page. Since these bad crawlers will actively try to hide their intentions, they may spoof their User Agent to look like a bog-standard string.
We recently created a method to filter out these bad crawlers from our website analytic reports. Find out how you can utilize User Agent string information to find these malicious crawlers.
Changing your User Agent string
Let’s talk more about User Agent spoofing, as mentioned above. User Agent spoofing boils down to replacing part of your User Agent string, anything from a few characters to the whole string.
It’s surprisingly easy to change your UA string – you can change your browser User Agent by pressing F12 on your keyboard to open your browser’s debug tool. Alternatively, most major browsers have plugins that can spoof the UA for you. Search for “User Agent changer” or “User Agent switcher” in your browser’s plugins and add-ons to see what’s available.
One benefit of spoofing your UA string is that you can test your website for various devices. It’s easy to change the string from a desktop to a mobile. However, there are more reliable ways to test your website for its mobile friendliness (check out our article on mobile emulators to learn more).
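The same trick works outside the browser. As a minimal sketch, this Python snippet builds a request that carries a mobile User Agent using the standard library's urllib. The URL is a placeholder and build_request is our own helper name; the request is only constructed here, not sent.

```python
import urllib.request

MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/93.0.4577.62 Mobile Safari/537.36"
)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request that presents the given User Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("https://example.com/", MOBILE_UA)
# urllib normalises header names, so the key is stored as "User-agent".
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would fetch the page as if from that device, which is handy for checking which version of a page your server returns.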
The accessibility of User Agent spoofing is great for the humble user who just wants to test their website on a different device. But in the hands of a malicious crawler, having a fake User Agent string can alter the accuracy of device detection.
When the whole string is changed, it can be near-impossible to detect whether the string belongs to a real device or not. However, when only part of the string is changed, it is still possible to detect the device with a fair level of accuracy.
For example, let’s take this spoofed User Agent:
Mozilla/5.0 (Linux;Android 3.1ipad 4 Build/AppleWebKit Gecko) Version/4.0 Safari/534
There is conflicting information in this string, as it contains both Android and iPad. An iPad User Agent tends to follow the format Mozilla/5.0 (iPad..., which isn’t the case here. Therefore, you can deduce that this is an Android device, not an iPad.
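That reasoning can be sketched in a few lines of Python. The marker list and the function names below are assumptions made for this example, and the single rule (an iPad claim only counts if the string starts in the usual iPad format) is just the deduction described above, not a full spoof detector.

```python
import re
from typing import List, Optional

# Platform keywords to look for; a deliberately tiny, illustrative list.
PLATFORM_MARKERS = {
    "Android": r"android",
    "iPad": r"ipad",
    "iPhone": r"iphone",
    "Windows": r"windows",
}


def platform_claims(ua: str) -> List[str]:
    """List every platform keyword that appears somewhere in the UA string."""
    return [
        name
        for name, pattern in PLATFORM_MARKERS.items()
        if re.search(pattern, ua, re.IGNORECASE)
    ]


def likely_platform(ua: str) -> Optional[str]:
    """Resolve conflicting claims using the iPad format rule; None if still unclear."""
    claims = platform_claims(ua)
    if "iPad" in claims and not ua.startswith("Mozilla/5.0 (iPad"):
        claims.remove("iPad")
    return claims[0] if len(claims) == 1 else None


ua = "Mozilla/5.0 (Linux;Android 3.1ipad 4 Build/AppleWebKit Gecko) Version/4.0 Safari/534"
print(platform_claims(ua), likely_platform(ua))
```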
The future of the User Agent string
You can see from this article that we know our onions when it comes to detecting information from a User Agent string.
However, change is on the horizon. With the arrival of User Agent Client Hints (UA-CH), Google's alternative to the User Agent HTTP request header, the UA string is due to be reduced on all Chrome devices in the second quarter of 2022. The reduction is planned to ship with Chrome browser minor versions.
User Agent Client Hints break down device information in a similar way to the User Agent string, but split it across separate HTTP request headers, such as Sec-CH-UA-Platform or Sec-CH-UA-Mobile.
This unilateral change pushed by only the one company has forced others to adapt, regardless of whether they want to or not. Not all the User Agent Client Hints are sent by default, so website owners will need to undertake extra work to adapt to the change. Some will inevitably drag their heels, which could cause their website to break.
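To give a feel for the extra work involved, here is a hedged Python sketch of the server side. It reads the hints that Chrome sends by default and shows an Accept-CH response header value that asks the browser for further hints on later requests. The sample header values, the function names, and the choice of which extra hints to request are all our own assumptions for illustration; the sketch also assumes the headers arrive in a plain dictionary with exactly these key names.

```python
import re
from typing import Dict, Mapping

# Value for an Accept-CH response header, asking for hints not sent by default.
ACCEPT_CH = "Sec-CH-UA-Model, Sec-CH-UA-Platform-Version"


def parse_brand_list(value: str) -> Dict[str, str]:
    """Turn a Sec-CH-UA value into a {brand: major version} mapping."""
    return dict(re.findall(r'"([^"]*)";v="([^"]*)"', value))


def read_client_hints(headers: Mapping[str, str]) -> Dict[str, object]:
    """Collect the hints we care about from a mapping of request headers."""
    return {
        "brands": parse_brand_list(headers.get("Sec-CH-UA", "")),
        "mobile": headers.get("Sec-CH-UA-Mobile") == "?1",
        "platform": headers.get("Sec-CH-UA-Platform", "").strip('"'),
        # Empty unless the browser has honoured an earlier Accept-CH response.
        "model": headers.get("Sec-CH-UA-Model", "").strip('"'),
    }


# Made-up sample request headers in the style Chrome on Android sends.
sample_headers = {
    "Sec-CH-UA": '" Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"',
    "Sec-CH-UA-Mobile": "?1",
    "Sec-CH-UA-Platform": '"Android"',
}
print(read_client_hints(sample_headers))
```

Note that the device model is missing from the first request. It only arrives once the server has asked for it, which is exactly the extra step website owners need to plan for.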
The best way to prepare in the next few months is to upgrade your services to support UA-CH.
Version 4 of our Device Detection has been upgraded to support UA-CH detection. Don’t get caught short when the User Agent is reduced in favor of User Agent Client Hints – sign up to version 4 of our solution today.
User Agent Device Detection
Gathering User Agent string data from your users allows you to gain further insights into their characteristics. If you find that 80% of your website visitors use mobile, you should probably look into mobile optimization of your website.
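As a back-of-the-envelope illustration, the Python sketch below estimates the mobile share from a list of logged User Agent strings. It uses the crude assumption that a string containing "Mobi" comes from a mobile device; the function name and the four sample strings are ours, and a proper device detection service will be far more accurate than this.

```python
from typing import Iterable


def mobile_share(user_agents: Iterable[str]) -> float:
    """Fraction of UA strings containing 'Mobi' (a rough mobile heuristic)."""
    uas = list(user_agents)
    if not uas:
        return 0.0
    mobile = sum(1 for ua in uas if "Mobi" in ua)
    return mobile / len(uas)


logged_user_agents = [
    "Mozilla/5.0 (Linux; Android 12; Pixel 6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/93.0.4577.62 Mobile Safari/537.36",
    "Mozilla/5.0 (Linux; Android 10; POT-LX1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/95.0.4621.2 Mobile Safari/537.36",
    "Mozilla/5.0 (Linux; Android 8.0.0; SAMSUNG SM-J737F) AppleWebKit/537.36 "
    "(KHTML, like Gecko) SamsungBrowser/14.0 Chrome/87.0.4280.141 Mobile Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0",
]
print(f"{mobile_share(logged_user_agents):.0%} of visits look mobile")
```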
There are so many reasons to use User Agents for device detection in industries such as AdTech, E-commerce, and Digital publishing. Find out how businesses like yours have excelled with our real-time data services.