Combining Benchmarking and Image Hashing
Apple decided to modify iOS 12.2 in April 2019 to remove information vital to identifying an iPhone or iPad precisely. This is an expected but 'unintended impact' associated with the WebKit tracking prevention policy adopted for future versions of Safari.
In response, 51Degrees adopted a two-stage solution. The first stage, released initially, identifies many iOS models precisely and consolidates the others into the smallest groups possible. It utilizes differences between Apple GPU image rendering implementations, alongside more established techniques such as querying the screen's dimensions and pixel ratio. A full write-up of the innovation is available here.
The second stage utilizes benchmarking and the screen's color gamut to obtain a precise model match, or a group of two devices in some narrow situations. This blog explains the approach for web developers and advises under which conditions the solution should be used.
Shorter blogs explaining the business background, a technical summary, and some of the other techniques considered are also available.
- Multi Stage Approach to Apple iOS Device Detection
- 51Degrees Open Sources GPU Renderer Technique to Identify Apple Devices Using iOS 12.2 or Higher
- Apple iOS Degrades Device Detection Accuracy
Ratification and Standards
Apple are the only mobile operating system vendor to actively hide information about the model of device being used to access a web page. The precise iPhone or iPad model is never contained within the User-Agent or other HTTP headers of Safari. Apple do however make the model information easily available to App developers.
Apple modified Safari in iOS 12.2 to make all iPhone and iPad devices appear to support the same graphics capability. When a web page queries the device to understand its graphics support, the same information is returned for the original iPhone and the most modern iPhone XS. In doing so, a quick and simple method to identify Apple devices was removed.
The change in iOS 12.2 broke many web sites, particularly those concerned with banking and where fraud prevention is important.
Many more examples of such 'unintended impact' are to be expected as Apple have adopted a tracking prevention policy which explicitly acknowledges these possibilities.
Considering Apple are prepared to make such changes to prevent device model identification, 51Degrees have been careful to only make use of web APIs that would break many web sites on Apple products if Apple were to tamper with them.
The prospect of Apple breaking the 51Degrees solution in the future is therefore minimized, but never eliminated.
Micro-benchmarking involves running a demanding recursive algorithm on the CPU and measuring the time to completion. This completion time can then be compared against known results to identify the device in question.
In 2015 51Degrees evaluated CPU benchmarking as a method of granular Apple device identification. The full write up of the work and the data captured is available here.
The 2015 analysis concluded CPU benchmarking was not a suitable solution for two reasons: it slowed down the rendering of a web page, and there was insufficient difference between the benchmark results to identify many models with 99% or greater accuracy.
As such 51Degrees concluded benchmarking should be avoided.
The initial analysis considered executing the benchmark within the main thread. As such, the time taken to execute the benchmark would directly impact the time to render the web page and the associated user experience. Therefore, only short duration benchmarking techniques were considered which would complete in tens of milliseconds.
When 51Degrees came back to the problem in 2019, four years after the initial analysis, web workers had become consistently available across iOS versions and widely used by many web sites. Web workers enable long running computations to be performed in the background without delaying the main thread, for example fetching and preparing data whilst the user engages with the user interface. It seems highly unlikely Apple would tamper with their operation.
Whilst the web worker standard is still to be ratified by the W3C, web workers form part of the feature set of modern HTML5 web browsers. More information about web workers is available from the W3C here.
By employing web workers, the benchmark algorithm can run without delaying the main user interface and can also run for a longer period, collecting more meaningful results. The results proved surprisingly usable, with 99% or better detection accuracy obtained from normal iPhones and iPads.
The benchmark is a Tak recursive algorithm similar to the one originally evaluated in 2015. Other algorithms were also tried, including prime factorization and the Ackermann benchmark. Tak continues to provide the most consistent set of results.
Two web workers are started and instructed to run 80 iterations of the Tak recursive benchmark. 160 samples are then provided in an array for analysis. On modern Apple CPUs the entire benchmark completes in well under 2 seconds.
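The sampling loop each worker runs might look roughly like the following TypeScript sketch. The Tak arguments (18, 12, 6), the names `tak` and `collectTakSamples`, and the use of `Date.now` as the clock are assumptions made for illustration; the arguments and timer actually used by 51Degrees are not stated here.

```typescript
// Tak recursive function in the form commonly used for benchmarking.
function tak(x: number, y: number, z: number): number {
  if (y >= x) {
    return z;
  }
  return tak(tak(x - 1, y, z), tak(y - 1, z, x), tak(z - 1, x, y));
}

// Runs the benchmark `iterations` times and records the elapsed time of each
// run in milliseconds. The default arguments 18, 12, 6 are an assumption.
// Date.now has millisecond resolution; inside a browser worker,
// performance.now() would give finer timings.
function collectTakSamples(
  iterations: number,
  now: () => number = Date.now,
  args: [number, number, number] = [18, 12, 6]
): number[] {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = now();
    tak(args[0], args[1], args[2]);
    samples.push(now() - start);
  }
  return samples;
}

// Each of the two workers would run 80 iterations. Concatenating both
// arrays gives the 160 samples passed on for analysis.
const workerASamples = collectTakSamples(80);
const workerBSamples = collectTakSamples(80);
const allSamples = workerASamples.concat(workerBSamples);
```

Here both loops run one after the other on a single thread to keep the sketch self-contained. In a page each call would sit inside its own web worker and post its array back to the main thread.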
A timer is used to shut down the background benchmark if it has not completed within 4 seconds, to avoid long running worker threads if something has gone wrong.
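A watchdog of this kind can be sketched as follows. The `Terminable` and `Timer` interfaces and the `guardWorkers` function are hypothetical names introduced for this example, and the timer is passed in rather than taken from browser globals so the sketch stays self-contained.

```typescript
// Minimal shape of a worker for this sketch. The browser's Worker has a
// terminate() method and therefore satisfies it.
interface Terminable {
  terminate(): void;
}

// Hypothetical timer abstraction: start() schedules the callback and
// returns a function that cancels it.
interface Timer {
  start(callback: () => void, delayMs: number): () => void;
}

// Terminates every worker if finish() has not been called within limitMs.
function guardWorkers(
  workers: Terminable[],
  timer: Timer,
  limitMs: number = 4000
): { finish: () => void; timedOut: () => boolean } {
  let expired = false;
  const cancel = timer.start(() => {
    expired = true;
    for (const worker of workers) {
      worker.terminate();
    }
  }, limitMs);
  return {
    // Call once all benchmark results have arrived to cancel the watchdog.
    finish: () => cancel(),
    // Reports whether the watchdog fired and the workers were terminated.
    timedOut: () => expired,
  };
}
```

In a browser the `timer` argument could wrap `setTimeout` and `clearTimeout`, and `workers` could hold the two `Worker` instances running the benchmark.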
The benchmark needs to be executed on known Apple devices to establish a sufficiently broad training set of values for use in device detection. Ideally the devices should be in a state like the average state of an Apple device when accessing a typical web page. For example, the device should not be so warm as to slow the CPU, and the web page should not be running many other web workers unrelated to device detection.
At the time of publication results have been gathered across a wide spectrum of iOS 12 and 13 beta devices.
iPhone X, XS and XS Max
The results for three of the most recent versions of iPhone are shown in the following graph.
The iPhone XS and XS Max, which contain the same 6 core A12 Bionic chipset, produce average values between 13.05 and 13.82 milliseconds with over 99% consistency. The iPhone X, which shares all other attributes including screen size and graphics image hash, produces values in the range 13.9 to 15 milliseconds. This is to be expected as the A11 Bionic chip contained in the iPhone X is slightly less powerful.
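Matching an average against known ranges can be sketched as below. The ranges are the figures quoted above; the `TimingRange` type and the `mean` and `matchByAverage` functions are hypothetical names for this example, and a real training set would hold many more entries.

```typescript
interface TimingRange {
  model: string;
  minMs: number;
  maxMs: number;
}

// Average benchmark timings quoted in the text for devices that are
// otherwise indistinguishable by screen size and image hash.
const averageRanges: TimingRange[] = [
  { model: "iPhone XS / XS Max", minMs: 13.05, maxMs: 13.82 },
  { model: "iPhone X", minMs: 13.9, maxMs: 15 },
];

// Arithmetic mean of the samples.
function mean(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  let total = 0;
  for (const sample of samples) {
    total += sample;
  }
  return total / samples.length;
}

// Returns the first model whose range contains the average of the samples,
// or null when the average falls outside every known range.
function matchByAverage(samples: number[], ranges: TimingRange[]): string | null {
  const average = mean(samples);
  for (const range of ranges) {
    if (average >= range.minMs && average <= range.maxMs) {
      return range.model;
    }
  }
  return null;
}
```

An average that falls in the gap between the two ranges returns `null`, leaving the caller to decide whether to report a group or retry.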
Where devices exhibit average timings which overlap with one another, as is the case for the iPad Mini 4 and the iPad Air 2, further analysis is required.
The iPad Air 2 has a very tight set of average values ranging from 29.8 to 30.6 milliseconds. The iPad Mini 4 has a broader range of results ranging from 28 to 32.5 milliseconds. Averages alone would result in unacceptable device detection accuracy for these devices.
The standard deviation of the samples can be used to achieve the required accuracy.
The iPad Air 2 has a very narrow range of standard deviation values from 0.3 to 3.5. The iPad Mini 4 is far broader, with values in the range 6 to over 44. This additional analysis improves accuracy beyond what the average alone can achieve.
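A minimal sketch of this second check follows. It assumes the population standard deviation, since the exact formula is not stated, and it treats any value up to 3.5 as an iPad Air 2 and any value of 6 or more as an iPad Mini 4. The function names are hypothetical.

```typescript
// Population standard deviation of the samples (an assumption; a sample
// standard deviation would divide by n - 1 instead).
function standardDeviation(samples: number[]): number {
  const n = samples.length;
  if (n === 0) {
    throw new Error("no samples");
  }
  const average = samples.reduce((sum, value) => sum + value, 0) / n;
  const variance =
    samples.reduce((sum, value) => sum + (value - average) * (value - average), 0) / n;
  return Math.sqrt(variance);
}

// Separates the two iPads whose averages overlap, using the standard
// deviation thresholds taken from the ranges quoted in the text.
function separateAir2FromMini4(
  samples: number[]
): "iPad Air 2" | "iPad Mini 4" | "unknown" {
  const deviation = standardDeviation(samples);
  if (deviation <= 3.5) {
    return "iPad Air 2";
  }
  if (deviation >= 6) {
    return "iPad Mini 4";
  }
  return "unknown";
}
```

Values between 3.5 and 6 fall outside both observed ranges, so the sketch reports `"unknown"` rather than guessing.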
Sometimes it is not possible to identify a precise model as all available attributes including image hash and CPU benchmarks are identical. An example of such a group is the iPhone SE and iPhone 6s.
51Degrees group the devices and return the different marketing names of the device in the HardwareName property. The lowest specification device in the group is used to provide the single value properties such as HardwareModel. The IsHardwareGroup property can be used to determine if a single device is being returned or a group.
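Code consuming such a result might branch on the group flag as in the sketch below. Only the property names HardwareName, HardwareModel and IsHardwareGroup come from the text; the `DetectionResult` shape, the value types and the `describeDevice` function are assumptions for illustration and do not represent the 51Degrees API.

```typescript
// Hypothetical result shape. Only the three property names are taken from
// the text; the types are assumed.
interface DetectionResult {
  HardwareName: string[];
  HardwareModel: string;
  IsHardwareGroup: boolean;
}

// Produces a display label, making it explicit when the match is a group
// of candidate devices rather than a single model.
function describeDevice(result: DetectionResult): string {
  const names = result.HardwareName.join(", ");
  if (result.IsHardwareGroup) {
    return "One of: " + names;
  }
  return names;
}

// Example using the group mentioned in the text. The HardwareModel value
// is a placeholder, not a real identifier.
const grouped: DetectionResult = {
  HardwareName: ["iPhone SE", "iPhone 6s"],
  HardwareModel: "placeholder-model",
  IsHardwareGroup: true,
};
const label = describeDevice(grouped);
```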
Many Apple devices, such as the XS Max, support Zoom mode where the entire user interface is altered to provide larger icons and fonts. When operated in Zoom mode the screen size information reported by the web browser changes and can often be identical to other models of iPhone or iPad.
Desktop mode, a feature added in iPadOS 13, also alters the User-Agent HTTP header sent by the iPad.
Some devices, such as iPad Air 2, will therefore support four modes of operation. Each mode needs to be sampled individually to create a comprehensive training data set. iPhones like the XS Max don't support desktop mode and therefore only two modes for zoom and standard need to be sampled.
The CPU benchmarking technique works very well but carries a significant performance and time penalty.
The result might be susceptible to error if there are other significant activities being performed by the web page. 51Degrees considered the following when designing and testing the solution.
- Other web workers running in parallel.
- Low remaining battery degrading CPU clock speed.
- The ambient temperature of the device.
- Age of the device.
- Newer versions of iOS altering the results.
- Presence of developer tools and debugging.
Two web workers are run in parallel to avoid consuming all the available CPU cores. Two proved to be the optimum number for the four and six core devices.
Surprisingly the remaining battery capacity had no impact on the results.
The ambient temperature had very little impact. CPU generated heat did build up when the full benchmarking cycle was repeated continually. For this reason, samples were collected with enough delay to avoid CPU generated heat increasing.
Devices of a range of different ages were tested, and age does not appear to be a factor.
The version of iOS does have a bearing on the results, presumably because the context switching algorithms at the heart of the OS vary between versions. 51Degrees have procedures to gain a full set of samples from all iOS and iPadOS devices when a new version of iOS is provided for beta testing and general release.
Developer tools and other debugging solutions connected to the device significantly impact the benchmark. Real device hosting platforms such as BrowserStack cannot be used for collecting samples and will not return accurate results. It is assumed the presence of debugging tools, and the technology needed to expose the device via a web browser, materially alters the performance characteristics of the device.
All implementations created by 51Degrees store the result in cookies to avoid needing to perform the calculation for every page request. Other caching techniques could be used to avoid reprocessing.
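Caching the result in a cookie could be sketched as follows. The cookie name `device_profile`, the attribute choices and the helper names are assumptions for this example rather than the values used by 51Degrees.

```typescript
// Hypothetical cookie name for the cached detection result.
const COOKIE_NAME = "device_profile";

// Builds the string to assign to document.cookie.
function buildCookie(value: string, maxAgeSeconds: number): string {
  return (
    COOKIE_NAME + "=" + encodeURIComponent(value) +
    "; Max-Age=" + maxAgeSeconds + "; Path=/; SameSite=Lax"
  );
}

// Extracts the cached value from a cookie string such as document.cookie,
// or returns null when the benchmark has not yet been stored.
function readCookie(cookieString: string): string | null {
  const prefix = COOKIE_NAME + "=";
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    if (trimmed.indexOf(prefix) === 0) {
      return decodeURIComponent(trimmed.substring(prefix.length));
    }
  }
  return null;
}
```

In a page, `readCookie(document.cookie)` returning `null` would be the signal to run the benchmark and then store the outcome with `document.cookie = buildCookie(result, maxAge)`.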
It is usually acceptable to wait a few seconds to collect device level information for inclusion in analytics data. The web developer can therefore start the benchmarking after all other activity on the web page has been completed, collect the results and pass the values to the analytics solution. In Google Analytics custom dimensions can be used for the purpose.
If the device must be identified before the web page can be rendered, a 'splash' screen may be required.
iOS 10, released in 2016, introduced the color gamut media query to Safari. The specification is currently a candidate recommendation with the W3C's CSS working group, and it is not implemented in some modern web browsers. It seems likely that Apple's tracking prevention policy would consider the media query a candidate for removal in the future. As such, 51Degrees were inclined to avoid the technique because it is not an essential building block of the web. The earlier paragraphs on using ratified standards, or techniques Apple will find hard to remove, explain why this is so important.
Querying the color gamut, when available, avoids the overhead of executing the CPU benchmark on those devices that support it, so overall performance is improved. For this reason, the color gamut has been used in the 51Degrees solution at the time of publication. Should Apple degrade color gamut in the future, as they did with WebGL, 51Degrees will make greater use of CPU benchmarking.
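A color gamut check can be sketched as below. The `(color-gamut: …)` media queries with the values `srgb`, `p3` and `rec2020` come from the CSS media queries specification; the `detectColorGamut` function and the injected `matches` callback are hypothetical, chosen so the sketch does not depend on browser globals.

```typescript
type Gamut = "rec2020" | "p3" | "srgb";

// Returns the widest gamut the display reports, or null when the media
// query is unsupported. A wide gamut display also matches the narrower
// values, so the widest is tested first.
function detectColorGamut(matches: (query: string) => boolean): Gamut | null {
  const candidates: Gamut[] = ["rec2020", "p3", "srgb"];
  for (const gamut of candidates) {
    if (matches("(color-gamut: " + gamut + ")")) {
      return gamut;
    }
  }
  return null;
}
```

In a browser the callback could be `(query) => window.matchMedia(query).matches`. A `null` result would be the cue to fall back to the CPU benchmark.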
All other major APIs available to web workers were also evaluated as contributors to a benchmark function. Most were dismissed because Apple are in control of the implementation and therefore could change the performance characteristics in the future.
For example, the Crypto API provides the getRandomValues() method to fill arrays with random numbers. When used for benchmarking, the results were not as consistent as those from the Tak algorithm.
The 3D WebGL library was also investigated as a method of benchmarking using the GPU rather than the CPU. The results of the work and conclusions are described in a companion blog here. The solution has not been used at this time due to the lack of support for OffscreenCanvas operations with web workers. Should this change the method will be re-evaluated.
A robust set of future-proof techniques has been assembled to provide granular Apple device identification within iOS 12 and 13.
The full Apple identification solution is part of the 51Degrees device detection suite of services. Save the hassle of rolling your own solution: deploy 51Degrees today and get access to over 55,000 different device models with associated properties.