On 5 October, we released a new predictive algorithm, the result of months of observation and refinement.
Why we made the changes
Over the last few months, we’ve increasingly seen HTTP header data that does not follow the expected format. The data was muddled by broken conventions and by noise such as irrelevant characters or extra whitespace.
In response, and after extensive performance checks, we’ve updated our device detection algorithm to be more tolerant of corruption in the data structure.
What’s changed with the predictive algorithm?
For User-Agent Client Hints, we saw a lot of corruption in the Sec-CH-UA headers, including additional quotation marks or spaces. This could be due to spoofed data or to a poor conversion between OpenRTB’s Structured User-Agent and User-Agent Client Hints.
Now, our algorithm puts greater emphasis on the important substrings within the header, ignoring the structure and irrelevant characters.
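To illustrate the idea of focusing on the important substrings rather than the exact structure, here is a minimal Python sketch. It is not our production algorithm: the function name extract_brands and the regular expression are assumptions made for this example only. It pulls brand and version pairs out of a Sec-CH-UA value while tolerating doubled quotation marks, stray spaces and repeated commas.

```python
import re

# Hypothetical pattern for illustration only (not the production algorithm).
# It skips any run of quotes or whitespace around the brand and the version,
# so only the meaningful substrings are captured.
_PAIR = re.compile(r'["\'\s]*([^"\';,]+?)["\'\s]*;\s*v\s*=["\'\s]*([\d.]+)')


def extract_brands(header: str) -> dict:
    """Return a {brand: version} mapping from a possibly corrupted Sec-CH-UA value."""
    return {brand.strip(): version for brand, version in _PAIR.findall(header)}


clean = '"Chromium";v="118", "Google Chrome";v="118"'
corrupt = '""Chromium"" ; v = ""118"",,  \' Google Chrome \';v=118'

print(extract_brands(clean))    # {'Chromium': '118', 'Google Chrome': '118'}
print(extract_brands(corrupt) == extract_brands(clean))
```

Both the well-formed and the corrupted value yield the same pairs, because the sketch ignores how the pairs are quoted and separated. Brands that themselves contain a semicolon are not handled by this simplified pattern.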
For User-Agents, our new algorithm gives greater weight to smaller substrings and to high-volume substrings that have been seen over long periods of time. This allows our algorithm to differentiate between User-Agents that don't follow popular conventions.
Our algorithm will now be better equipped to handle the ever-changing data landscape with more consistent and accurate detection.
The performance graph
In phase two of our algorithm changes, we are planning to deprecate the performance graph.
Although the performance graph is approximately two milliseconds per detection faster than our predictive graph, it doesn't return the same level of accuracy. We feel that deprecating the performance graph is a trade-off worth making.
What should I expect?
Once the performance graph is deprecated, you won't need to make any changes. If your code still calls the performance graph, the call will be redirected to the predictive graph, so your system will keep running smoothly. The change will also result in a smaller data file.
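Conceptually, the redirect behaves like the small Python sketch below. The names resolve_graph and _GRAPH_ALIASES are hypothetical and exist only for this illustration; they are not part of our API, and the real redirect happens inside the product without any change on your side.

```python
# Hypothetical illustration of the redirect; not the actual API.
# Requests for the deprecated graph are mapped to the predictive graph.
_GRAPH_ALIASES = {"performance": "predictive"}


def resolve_graph(requested: str) -> str:
    """Return the name of the graph that will actually serve the request."""
    return _GRAPH_ALIASES.get(requested, requested)


print(resolve_graph("performance"))  # predictive
print(resolve_graph("predictive"))   # predictive
```

Existing code that asks for the performance graph keeps working, because the request simply resolves to the predictive graph.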
We will update this blog when the performance graph has been deprecated.
Our algorithm is now better positioned to handle unusual and unconventional HTTP headers. If you have any questions about the changes, please reach out to the support team.