
51Degrees API Documentation  4.4

Usage Sharing

Introduction

Some of the services offered by 51Degrees benefit from evidence (optionally) being sent back to 51Degrees' data processing system from live installations of the Pipeline. We use this evidence to ensure that our data is up-to-date, comprehensive and continues to provide accurate results.

Internals

To minimize any overhead of this feature, received requests are grouped and sent in batches, rather than sending each request individually.

Usage sharing is designed such that any failure within it should not impact the result of the Pipeline. If a failure does occur then usage sharing will simply be disabled and an appropriate warning logged.

In languages that support multithreading, usage sharing will typically use a producer/consumer model: the 'main' thread adds the evidence to a queue, while a background thread takes items from this queue, transforms them into the appropriate format, adds them to a message and sends the message when it is ready. This avoids blocking the thread that is processing the Pipeline.
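The producer/consumer model described above can be sketched in Python as follows. This is only an illustration of the pattern, not the 51Degrees implementation: the names `consumer`, `run_demo` and `BATCH_SIZE` are hypothetical, the "transform" step is a plain copy, and `send` stands in for whatever actually delivers a message.

```python
import queue
import threading

BATCH_SIZE = 3      # hypothetical: number of entries grouped into one message
_STOP = object()    # sentinel telling the background thread to finish


def consumer(evidence_queue, send):
    """Take evidence from the queue, group it into batches and pass each batch to send()."""
    batch = []
    while True:
        item = evidence_queue.get()
        if item is _STOP:
            break
        batch.append(dict(item))  # stand-in for transforming into the message format
        if len(batch) >= BATCH_SIZE:
            send(batch)
            batch = []
    if batch:
        send(batch)  # flush whatever is left when shutting down


def run_demo(evidence_items):
    """Queue evidence from the 'main' thread and return the batches that were sent."""
    sent = []
    evidence_queue = queue.Queue(maxsize=100)
    worker = threading.Thread(
        target=consumer, args=(evidence_queue, sent.append), daemon=True
    )
    worker.start()
    for evidence in evidence_items:
        try:
            evidence_queue.put_nowait(evidence)  # never block the 'main' thread
        except queue.Full:
            pass  # assumption: drop the entry rather than stall request processing
    evidence_queue.put(_STOP)
    worker.join()
    return sent
```

Calling `run_demo` with seven evidence dictionaries yields two full batches and one partial batch that is flushed on shutdown. Dropping entries when the queue is full is one possible design choice for keeping the producer non-blocking.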

Repeated Evidence

To avoid situations where the same evidence is sent multiple times (for example, a single user visiting multiple pages on a web site), we keep track of the evidence that has been shared over a defined time period (maximum 20 minutes by default) and only share evidence which is different to any already shared during the window.

Note that the amount of evidence tracked is also constrained based upon available memory. In high-traffic scenarios, this may mean that the time period covered by the evidence in the tracker is much smaller than the configured maximum.
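A minimal sketch of this kind of tracker is shown below, assuming a simple insertion-ordered map with a time limit and an entry limit. The class name `RepeatEvidenceTracker`, its parameters, and the use of an entry count as a proxy for available memory are all assumptions made for illustration; the real tracker may work differently.

```python
import time
from collections import OrderedDict


class RepeatEvidenceTracker:
    """Remembers recently shared evidence so that repeats can be skipped."""

    def __init__(self, interval_seconds=20 * 60, max_entries=10000, clock=time.monotonic):
        self._interval = interval_seconds
        self._max_entries = max_entries  # hypothetical stand-in for a memory limit
        self._clock = clock
        self._seen = OrderedDict()  # evidence key -> time it was shared, oldest first

    def should_share(self, evidence):
        now = self._clock()
        # Drop entries that have fallen outside the time window.
        while self._seen:
            oldest_key = next(iter(self._seen))
            if now - self._seen[oldest_key] < self._interval:
                break
            self._seen.popitem(last=False)
        key = tuple(sorted(evidence.items()))
        if key in self._seen:
            return False  # identical evidence was already shared in this window
        self._seen[key] = now
        # Enforce the size limit by evicting the oldest entry, which can make
        # the effective window shorter than the configured interval.
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)
        return True
```

With a small `max_entries` and a high rate of distinct evidence, older entries are evicted well before the interval elapses, which mirrors the high-traffic behavior described above.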

Configuration

The usage sharing feature is provided by a flow element that is added to the Pipeline. Certain pipeline builders will do this automatically. For example, the device detection pipeline builder will add the usage sharing element by default. This can be disabled using the SetShareUsage method on the builder.

There are also several configuration options when building a usage sharing element. These can be used to control what is shared and how it is collected:

Evidence Shared

The usage sharing element will not be interested in all evidence in the flow data. These are the rules for whether or not a particular piece of evidence is shared:

  • Any evidence named 'header.<name>', if <name> is not on a configured blacklist.
  • Any evidence named 'query.<name>', if <name> is on a configured whitelist.
  • Any evidence named 'cookie.<name>' is ignored, unless <name> starts with '51D_'.
  • Any other evidence is shared if it is not on a configured blacklist.

The various blacklists and whitelists can be configured using the share usage element builder.
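The rules above can be expressed as a small predicate. The sketch below is illustrative only: the function name and parameter names are hypothetical, names are compared exactly as given (whether the real element ignores case is not covered here), and the final rule is assumed to match against the full evidence key.

```python
def should_share_evidence(
    key,
    header_blacklist=frozenset(),
    query_whitelist=frozenset(),
    other_blacklist=frozenset(),
):
    """Return True if the evidence entry named by key would be shared."""
    prefix, _, name = key.partition(".")
    if prefix == "header":
        return name not in header_blacklist  # headers are shared unless blacklisted
    if prefix == "query":
        return name in query_whitelist  # query parameters are shared only if whitelisted
    if prefix == "cookie":
        return name.startswith("51D_")  # only 51Degrees cookies are shared
    return key not in other_blacklist  # assumption: matched on the full key
```

For example, `should_share_evidence("query.q")` returns `False` until `"q"` is added to the whitelist, while `should_share_evidence("header.User-Agent")` returns `True` unless `"User-Agent"` is blacklisted.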

Share Percentage

Usage sharing can be configured to only share a certain percentage of requests that pass through the Pipeline. This can be useful in very high-traffic scenarios where usage sharing is desired, but sharing every request could put too much strain on the web server.

This is based on a randomized value, so the exact amount shared may not be precisely the percentage specified. For example, if generating a number between 0 and 1, the result will be above 0.5 roughly 50% of the time but it's unlikely to be exact.
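The effect of a randomized share percentage can be illustrated with a short sketch. The function name `should_share_request` is hypothetical, and the percentage is assumed to be expressed as a fraction between 0 and 1.

```python
import random


def should_share_request(share_percentage, rng=random):
    """Decide at random whether to share this request.

    share_percentage is a fraction: 0.0 shares nothing, 1.0 shares everything.
    """
    # rng.random() returns a float in the range [0.0, 1.0).
    return rng.random() < share_percentage


if __name__ == "__main__":
    rng = random.Random()
    total = 10000
    shared = sum(should_share_request(0.5, rng) for _ in range(total))
    # The count is typically close to, but rarely exactly, half of the total.
    print(f"shared {shared} of {total} requests")
```

Because each request is decided independently, the proportion shared converges on the configured value as traffic grows, but any individual run will deviate slightly.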

Timeouts

There may be one or multiple configurable timeouts depending on the language. Typically, these are used to suspend usage sharing if its internal mechanisms are responding too slowly.

Maximum Queue Size

In languages that support multiple threads, this setting controls the size of the internal producer/consumer queue.

Minimum Entries per Message

The minimum number of evidence entries that must be added before the message will be sent to the usage sharing web service.

Repeat Evidence Interval

The maximum time period for which evidence is stored for the purpose of filtering repeat evidence.

Usage Sharing for low-level APIs

The low-level device detection APIs such as C, Nginx and Varnish do not support usage sharing out of the box. However, some customers using these technologies still want to share usage with us in order to help us improve the accuracy of results.

Our recommended approach in this situation is to have the low-level code write a log file containing the necessary evidence values from requests. This file can then be processed offline at a later date using one of the higher-level languages in order to share the data with 51Degrees.

The offline processing examples provide a good sample for how this might work. These take a YAML file where each record represents a request. For example:

---
header.User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Mobile/15E148 Safari/604.1
---
header.Sec-CH-UA-Mobile: ?0
header.Sec-CH-UA-Platform: '"Windows"'
header.Sec-CH-UA: '" Not A;Brand";v="99", "Chromium";v="98", "Google Chrome";v="98"'
header.User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36
...

You will need to modify your low-level code to output this data to a file (or memory stream, etc.). As a minimum, the values below MUST be present for each record. If any of them is missing, the record will be discarded by our backend processing system.

header.user-agent: [The value of the User-Agent HTTP header]
header.host: [The value of the Host HTTP header]
server.client-ip: [The source public IP that is making the request to your server]
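To make the record layout concrete, the sketch below formats one request as a YAML record and rejects it if a required value is absent. It is written in Python purely for illustration; in practice the equivalent logic would live in your low-level code. The function name `format_record` is hypothetical, the required keys are checked case-insensitively as an assumption, and every value is single-quoted so that characters such as `"` or `:` do not need special handling.

```python
REQUIRED_KEYS = ("header.user-agent", "header.host", "server.client-ip")


def format_record(evidence):
    """Return one YAML record (as text) for a single request's evidence."""
    present = {key.lower() for key in evidence}  # assumption: key case is not significant
    missing = [key for key in REQUIRED_KEYS if key not in present]
    if missing:
        raise ValueError("missing required evidence: " + ", ".join(missing))
    lines = ["---"]  # each record starts with a YAML document marker
    for key, value in evidence.items():
        # In a single-quoted YAML scalar, a literal quote is written as two quotes.
        quoted = "'" + str(value).replace("'", "''") + "'"
        lines.append(f"{key}: {quoted}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    record = format_record(
        {
            "header.user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
            "header.host": "example.com",
            "server.client-ip": "203.0.113.7",
        }
    )
    print(record, end="")
```

Appending the returned text to a log file for each request produces a stream of records in the layout shown earlier.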

The output will then need to be consumed by a process using one of the higher-level APIs. You can of course use whatever format you wish for transferring the data between your low-level code and the usage sharing process. If using the suggested YAML format and the offline processing example, the following changes will need to be made to the example:

  1. Configure the input stream to take the output stream that is producing the YAML formatted data.
  2. Device detection is not needed, only usage sharing, so replace the DeviceDetectionPipelineBuilder with FiftyOnePipelineBuilder and remove all the builder options that are no longer valid.
  3. Configure the setShareUsage option to true.
  4. Remove the code to get the device detection result and write an output file.

This code should now be able to consume the output from the low-level code and send the usage data back to 51Degrees for analysis.