Why Fast Device Detection Matters

Engineering

7/11/2014 2:48 PM

How 51Degrees Beats the Competition

Importance of Identifying Mobile Traffic

Over the past decade mobile technology has evolved to the point where a phone has more memory and a faster CPU than an average desktop computer had some 8 years ago. Combined with faster Internet access, better coverage and cheaper carrier charges, this has caused a substantial rise in the amount of web traffic originating from mobile devices. Data suggests we are very close to seeing a third of all web traffic originating from mobile devices. The trend is bound to continue as LTE (4G) networks spread throughout the world.

Many businesses and website owners fail to recognize this trend at all, while others recognize it but fail to adjust. However, even the most basic level of detection, determining whether the requesting device is mobile or not, can prove very beneficial.

Suppose you run a personal or business WordPress blog with a thousand unique visitors per day. While the number is not that large, about a third of them are likely to be using a mobile device. When it comes to browsing the web from a smartphone, there is hardly anything worse than loading a website that was designed with large desktop computer screens in mind, containing a lot of unnecessary elements and overloaded with graphics. Detecting a mobile device allows you to adjust the way information is presented by redirecting to a mobile version of your website or supplying a different style sheet.

At the opposite end we have businesses that handle large amounts of data each day. An ad network with millions of requests per second is one example. Mobile device detection is very beneficial in this case: mobile screens may have a high resolution, but their physical size is fairly small, and the way users interact with a mobile device is different from how they interact with a desktop computer or a laptop. By detecting that the device is mobile you can provide ads in the appropriate location and format. At the same time, having to deal with millions of requests per second means your mobile detection solution has to be fast.

The conventional way to deal with device detection is to use an XML file to store information about devices and a set of regular expressions to match the requesting device against existing devices. This is a valid approach; however, as the data file grows larger, device detection slows down. Additionally, this approach does not take into account changes in software, so the same phone using a different browser or a different version of the operating system could have a separate entry in the database for each combination.
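To make the scaling concern concrete, here is a minimal Python sketch of the conventional approach. It is an illustration only: the `RULES` list is a hypothetical stand-in for entries that would normally be loaded from an XML device file, and the patterns and device names are invented for the example.

```python
import re

# Hypothetical rules standing in for entries loaded from an XML device file.
RULES = [
    (re.compile(r"iPhone"), "Apple iPhone"),
    (re.compile(r"iPad"), "Apple iPad"),
    (re.compile(r"Android.*Mobile"), "Android phone"),
    (re.compile(r"Android"), "Android tablet"),
]


def detect(user_agent):
    """Return (device_name, rules_tried).

    Rules are tried in order until one matches, so the work done for a
    request grows with the number of rules in the list.
    """
    for tried, (pattern, name) in enumerate(RULES, start=1):
        if pattern.search(user_agent):
            return name, tried
    return "Unknown", len(RULES)
```

A user agent that matches none of the rules is the worst case: every regular expression is evaluated before the function gives up, and that cost rises with each device added to the list.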

51Degrees Device Detection

51Degrees recognizes the pitfalls of the conventional device detection methods and has developed a solution that is fast, reliable and future-proof. As you shall see, it is easy to implement and it provides consistently fast and reliable device detection across many platforms and technologies.

To detect devices we use two methods: Trie and Pattern.

  • Pattern is a fast and memory-efficient method capable of delivering detection times of less than 0.1 milliseconds. This is achieved by our patent-applied-for algorithm, which uses important parts of HTTP headers to find information about the hardware, software and browser version of the device, as well as to distinguish between search bots, crawlers and human users. Pattern matching can process well over 5 000 requests per second per CPU core and is generally sufficient for most uses. Learn how Pattern device detection works.
  • Trie requires a significant amount of memory to be allocated at initialisation, as it builds a tree of all known devices. Trie is extremely fast and outperforms hashing algorithms. It can easily process millions of requests per second even on mediocre hardware. Learn how Trie detection works.
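For readers unfamiliar with the data structure, the following Python sketch shows a generic character trie. It is not the 51Degrees implementation, which is not described here; the class and function names and the sample prefixes are assumptions chosen for illustration. The point is that a lookup walks the user agent one character at a time, so its cost depends on the length of the string rather than on how many devices are stored.

```python
class TrieNode:
    """One node of a character trie; 'device' is set where a known prefix ends."""

    __slots__ = ("children", "device")

    def __init__(self):
        self.children = {}
        self.device = None


def build_trie(entries):
    """Build a trie from (user_agent_prefix, device_name) pairs."""
    root = TrieNode()
    for prefix, device in entries:
        node = root
        for ch in prefix:
            node = node.children.setdefault(ch, TrieNode())
        node.device = device
    return root


def lookup(root, user_agent):
    """Return the device of the longest stored prefix of user_agent, or None."""
    node, best = root, None
    for ch in user_agent:
        node = node.children.get(ch)
        if node is None:
            break
        if node.device is not None:
            best = node.device
    return best


# Hypothetical sample data, for illustration only.
SAMPLE = build_trie([
    ("Mozilla/5.0 (iPhone", "iPhone"),
    ("Mozilla/5.0 (iPad", "iPad"),
    ("Mozilla/5.0 (Linux; Android", "Android"),
])
```

Building the tree up front is what costs memory and initialisation time; once it exists, each lookup is a single pass over the input string.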

Testing Trie and Pattern Detection

So, we claim we can perform millions of detections per second. Sounds hard to believe, right? Well, you can test-drive our Pattern and Trie matching for yourself. Just follow the 4 steps below.

The following tests use the 51Degrees C solution, as C is the fastest cross-platform programming language. To repeat the tests, your system needs GCC and 'make' set up so that they can be used from the command line or terminal. Additionally, the data file with test user agents should contain one user agent string per line.

Trie

  • Step 1: Download and extract the detector from http://sourceforge.net/projects/fiftyone-c/
  • Step 2: Start the terminal or the command line and navigate to the detector directory (the directory that contains the makefile).
  • Step 3: Run the 'make' command and wait a few moments while the programs build.
  • Step 4: Run PerfTrie with two parameters: the path to the Trie device data file (.trie) as the first parameter and the path to the file containing the user agents to be tested as the second. Use a free evaluation version of our Enterprise or Premium data files to achieve the best results.

Pattern

  • Steps 1 to 3 are exactly the same as for Trie (above), so if you have already done them, go to Step 4 below.
  • Step 4: Run PerfPat with two parameters: the path to the Pattern device data file (.dat) as the first parameter and the path to the file containing the user agents to be tested as the second. Use a free evaluation version of our Enterprise or Premium data files to achieve the best results.

Execution of both PerfPat and PerfTrie consists of 3 stages. The first stage prepares the data for use by reading it into memory. Then calibration takes place. Calibration verifies that the data is actually in memory and ready to be used, and also measures the time it takes to read the user agent file. Calibration time is subtracted from the total test time to remove the overhead of reading data from the results. Detection is the last stage, where user agents from the user agent file are matched against the 51Degrees device data file (Trie or Dat). Please note that the first two stages are not taken into account when measuring performance.
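The idea of subtracting a calibration pass can be sketched in a few lines of Python. This is not the PerfPat or PerfTrie source; the function names are hypothetical, and the calibration pass here only walks an in-memory list, whereas the real tools read the user agent file.

```python
import time


def timed_pass(user_agents, work):
    """Seconds taken to call work(ua) once for every user agent."""
    start = time.perf_counter()
    for ua in user_agents:
        work(ua)
    return time.perf_counter() - start


def net_rate(count, total_seconds, calibration_seconds):
    """Detections per second once the calibration time has been subtracted."""
    net = total_seconds - calibration_seconds
    if net <= 0:
        raise ValueError("detection time is too small to measure")
    return count / net


def measure(user_agents, detect):
    """Time a do-nothing pass, then a detection pass, and report the net rate."""
    calibration = timed_pass(user_agents, lambda ua: None)  # loop overhead only
    total = timed_pass(user_agents, detect)                  # overhead + detection
    return net_rate(len(user_agents), total, calibration)
```

For example, one million detections in a 2.5 second pass with 0.5 seconds of calibration overhead gives `net_rate(1_000_000, 2.5, 0.5)`, which is 500 000 detections per second.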

The results will vary depending on the hardware you employ. The hardware factors that have the greatest impact on detection rate are the number of CPUs, the number of CPU cores and the CPU clock speed. Both PerfPat and PerfTrie output the average number of detections per second and the average time it takes for a detection test pass to complete.

For our testing we used several platforms. Below you will find a brief description of each platform, with test results presented as tables.

Platform 1: Early 2011 MacBook Pro with 4-core Intel Core i7-2820QM, 2.3 GHz and 8 GB RAM (1333 MHz) under OS X 10.9. Platform original cost: 2400 USD (VAT included), mid-2014 cost: 1000 USD.

Threads  Detections (millions)  Average Time to Complete (seconds)  Average Detections per Second
2        2                      1.46                                1,369,000
4        4                      1.51                                2,649,000
8        8                      2.03                                3,940,000
10       10                     2.26                                4,440,000
12       12                     2.79                                4,424,000
14       14                     3.22                                4,340,000
16       16                     3.75                                4,266,000
32       32                     7.65                                4,183,000

Platform 2: Alienware Laptop (2013) with Intel Core i7-3740QM, 2.7 GHz and 8 GB RAM (1600 MHz) under Windows 7 x64. Platform original cost: 1250 USD (VAT included), mid-2014 cost: 900 USD.

Threads  Detections (millions)  Average Time to Complete (seconds)  Average Detections per Second
2        2                      1.75                                1,142,000
4        4                      1.46                                2,739,000
8        8                      1.45                                5,517,000
10       10                     2.55                                3,921,000
12       12                     3.5                                 3,428,000
14       14                     3.6                                 3,835,000
16       16                     4.9                                 3,274,000
32       32                     10.0                                3,200,000

Platform 3: Server 12 with Intel Xeon 5520 CPU, 2.27 GHz and 16 GB RAM under Windows 8 x64. Platform mid-2014 cost: 1000 USD.

Threads  Detections (millions)  Average Time to Complete (seconds)  Average Detections per Second
2        2                      1.95                                1,025,000
4        4                      1.56                                2,564,000
8        8                      1.25                                6,400,000
10       10                     2                                   5,000,000
12       12                     2.65                                4,528,000
14       14                     3.36                                4,166,000
16       16                     3.92                                4,081,000
32       32                     8.0                                 3,980,000

The above results confirm the expected general pattern. Once the number of software threads passes the number of hardware threads the CPU supports, you start to see diminishing returns, as the overheads of multithreading begin to outweigh the benefits.
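As a convenience for inspecting the published figures, the short Python sketch below copies the average detections per second from the three tables and finds the thread count at which each platform peaked. It is not part of the 51Degrees tooling; `RESULTS` and `peak` are names made up for this example.

```python
# Average detections per second by thread count, copied from the tables above.
RESULTS = {
    "Platform 1": {2: 1_369_000, 4: 2_649_000, 8: 3_940_000, 10: 4_440_000,
                   12: 4_424_000, 14: 4_340_000, 16: 4_266_000, 32: 4_183_000},
    "Platform 2": {2: 1_142_000, 4: 2_739_000, 8: 5_517_000, 10: 3_921_000,
                   12: 3_428_000, 14: 3_835_000, 16: 3_274_000, 32: 3_200_000},
    "Platform 3": {2: 1_025_000, 4: 2_564_000, 8: 6_400_000, 10: 5_000_000,
                   12: 4_528_000, 14: 4_166_000, 16: 4_081_000, 32: 3_980_000},
}


def peak(results):
    """Return (threads, rate) for the thread count with the highest rate."""
    threads = max(results, key=results.get)
    return threads, results[threads]


for name, results in RESULTS.items():
    threads, rate = peak(results)
    print(f"{name}: peak of {rate:,} detections/s at {threads} threads")
```

On these figures Platforms 2 and 3 peak at 8 threads and Platform 1 at 10, after which the rate falls away as more threads are added.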

Comparing results across platforms confirms that the detection rate depends on the CPU. The dedicated server (Platform 3) peaked at 6.4 million detections per second, whereas the two laptops peaked at roughly 4.4 and 5.5 million detections per second, which makes 51Degrees the fastest device detector on the market.

Graph of results for the 3 test platforms.

The above graph represents data from the previous tables. The Y-axis shows the average number of detections per second. The X-axis represents the number of threads. Each point on the graph represents the average detections per second for a given number of threads. The graph confirms the generally expected pattern, where increasing the number of software threads past the number of supported hardware threads causes diminishing returns due to the overheads of thread creation.

You can test-drive and use our software with any type of project. Our licensing is compatible with open source and commercial use for all 6 platforms that we currently support. The install process for every platform is designed to be as easy as the 4-step example above.

Best Solution For Your Needs

What makes us stand out are the algorithms we use to detect devices. They do not rely purely on regular expressions, so even a dramatic increase in the size of our data file (the number of device combinations) will only slightly increase the time needed to cache and initialise the data and will not affect detection time. Additionally, much of the data processing is done when the device data files are created, which means less work needs to be done on your servers. Fast and accurate detection also means the 51Degrees solution uses less time, and hence less power, to detect devices, which makes it the most environmentally friendly solution on the market.

We supply three kinds of device data files: Lite, Premium and Enterprise. The key difference is in the number of supported devices and how much information is available about each device.

The Lite data file is the most basic and is distributed with our solutions by default. The data file is free and contains over 30 000 devices with 40 properties for each device. This file enables you to detect mobile devices and provides information that will help you present your website in an appropriate way for mobile users. For better insight into and understanding of the requesting mobile devices, we recommend using the Premium or Enterprise data files.

The Premium data file contains over 70 000 devices with over 100 properties for each device and is recommended when you want a more in-depth understanding of the mobile devices that access your website. For example, the Lite data file only allows you to detect whether a device is mobile, whereas the Premium file can detect the type of mobile device, such as Smart Phone, Console, Reader, Tablet or TV. Premium data files are updated weekly so you will always have information on the latest devices.

The Enterprise data file contains over 150 000 devices with over 150 properties for each device. This is the ultimate choice if you want the most reliable and complete information about mobile devices. The Enterprise option contains all the benefits of Lite and Premium, with the added benefits of daily updates, scripts to enhance device detection and data representation, and professional support backed by a Service Level Agreement. The scripts include enhanced feature detection, client-side properties and bandwidth monitoring to further fine-tune the way you present data to your customers.

For a full list of supported properties visit the Property Dictionary or to compare the three device data files visit the Compare Device Data page.

In addition to presenting content to your users in an appropriate fashion, you will also have the data you need to see the full picture. An analytics system such as Google Analytics that you may already have in place can easily be augmented with the additional information you gain by using our data. You can then conduct more in-depth analysis and take steps to improve the experience for your target audience.

Even if your current device detection solution is capable of handling day-to-day requests, are you prepared for every eventuality? What if the number of requests were to increase dramatically over a short period of time, putting a strain on your resources and increasing detection time to the point where website performance degrades? Imagine a website that decides to provide live coverage of the World Cup and already has some device detection in place. This website usually experiences thousands of requests per minute, which it handles nicely. But what happens when the number of requests increases dramatically for the 2 hours the game is played? Slow device detection will become the single point of failure that worsens the user experience to the point where users are forced to use an alternative. The five million detections per second offered by the 51Degrees solution guarantee that performance will remain fast and accurate even at peak times.

Update: 14/11/2014

The original version of this blog did not provide details of the average detection time when run on a single CPU core. This addition shows how that figure was calculated using one of the test machines from the original blog.

Test environment and setup:

  • Platform: Alienware Laptop (2013) with Intel Core i7-3740QM, 2.7 GHz and 8 GB RAM (1600 MHz) under Windows 7 x64. Platform original cost: 1250 USD (VAT included), mid-2014 cost: 900 USD.
  • The Enterprise Trie data file was used as the device signature database. A list of one million user agent strings was used to perform matching.

Test parameters:

  • We compiled and used the PerfTrie program that is part of our 51Degrees C distribution to conduct this test. PerfTrie is an implementation of the C detector that is designed to measure the number of detections per second.
  • PerfTrie was compiled from the sources distributed with our C package using the supplied makefile. An -O3 optimisation flag was added to the makefile to optimise the compiled code.
  • PerfTrie was changed to use a single thread and to perform 100 passes to obtain a smoother average.

Test result:

Our test environment produced an average of 625 000 detections per second across 100 iterations. To find the time taken to detect a single device, we divide the time (1 second, or 1000 ms) by the number of detections (625 000): 1000 / 625 000 = 0.0016 ms. In other words, it takes 0.0016 ms to detect a single device.
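The same arithmetic can be written as a small Python helper, which makes it easy to convert any measured rate into a per-detection time. The function name is made up for this example.

```python
def ms_per_detection(detections_per_second):
    """Average time for a single detection, in milliseconds."""
    return 1000.0 / detections_per_second


# The single-core figure reported above: 625 000 detections per second.
print(f"{ms_per_detection(625_000):.4f} ms")  # prints "0.0016 ms"
```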