Technical details for data integration

Blockmetry produces anonymized, hit-level web analytics data that is stored in customer-controlled databases. This document describes the details of this data integration.

Sending measurements to Blockmetry

Blockmetry can receive measurements through a number of channels, including a webpage integration that uses JavaScript to send measurements.

We plan to add to these channels in the future. Please get in touch if you have a specific use case or need.

Blockmetry measurement processing

Once a measurement is received, it is processed in datacenters in the EU (currently Ireland). We use Amazon Web Services (AWS) as our infrastructure provider. The diagram below shows the data flow using the webpage integration option, i.e. using JavaScript to send measurements to Blockmetry:

Blockmetry data flow and integration

The Blockmetry pipeline is built on serverless AWS products such as AWS Lambda. This has many technical and user-facing benefits, such as high availability and low response times for end users.

Storing hit-level data records in customer databases

The final step is to produce the analytics record with all the (anonymized) parameters.

The analytics record, called BMRecord, is produced only if the measurement is determined not to have come from a bot or spammer. This means that not every measurement Blockmetry processes will produce an analytics record.

If produced, the analytics record is stored in a customer-owned data asset, such as an Amazon S3 bucket or an AWS Kinesis stream.

Blockmetry is given write-only access to these customer-owned data assets: you truly own your data. Part of Blockmetry's privacy promise is that we cannot read back the analytics data.
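The write-only guarantee can be enforced at the storage layer. As an illustrative sketch for the S3 case (the bucket name and writer principal below are hypothetical placeholders, not Blockmetry's real identifiers), a bucket policy that grants only s3:PutObject lets the writer append new records but never read, list, or delete them:

```python
import json

# Hypothetical identifiers -- substitute your own bucket name and the
# writer's AWS principal ARN.
BUCKET = "example-analytics-bucket"
WRITER_PRINCIPAL = "arn:aws:iam::111111111111:role/example-writer-role"

# Allow only PutObject: no Get*, List*, or Delete* actions are granted,
# so the writer cannot read data back out of the bucket.
write_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WriteOnlyAccess",
            "Effect": "Allow",
            "Principal": {"AWS": WRITER_PRINCIPAL},
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

print(json.dumps(write_only_policy, indent=2))
```

The same idea applies to a Kinesis stream: grant the writer permission to put records, and nothing else.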

Sample Blockmetry record

This is a simplified Blockmetry analytics record that is sent to a customer’s database, rendered as JSON:

{
  "AnonymitySetID": <string>,
  "PropertyType": <string>,
  "Datetime": {
    "DatetimeUTC": "YYYY-MM-DD HH:MM",
    "DateUTC": "YYYY-MM-DD",
    "TimeUTC": "HH:MM",
    "HourUTC": _H,
    "MonthUTC": _M,
    "WeekdayUTC": <number>,
    "ISOWeekUTC": <number>,
    "DatetimeLocal": "YYYY-MM-DD HH:MM",
    "DateLocal": "YYYY-MM-DD",
    "TimeLocal": "HH:MM",
    "HourLocal": _H,
    "MonthLocal": _M,
    "WeekdayLocal": <number>,
    "ISOWeekLocal": <number>
  },
  "PageProperties": {
    "Title": <string>,
    "URLHost": <string>,
    "DocumentURL": <string>,
    "CanonicalURL": <string>,
    "URLScheme": <string>,
    "URLPath": <string>,
    "PathSegment1": <string>,
    "PathSegment2": <string>,
    "PathSegment3": <string>
  },
  "DeviceProperties": {
    "UAFamily": <string>,
    "UAMajor": <number>,
    "OS": <string>,
    "OSMajor": <number>,
    "DeviceFamily": <string>,
    "DeviceClass": <number>,
    "ScreenWidth": <number>,
    "ScreenHeight": <number>,
    "JSEnabled": <Boolean>,
    "IsGoogleWeblight": <Boolean>,
    "IsInAppBrowser": <Boolean>,
    "HostApp": <string>
  },
  "Locale": {
    "Country": <ISO 3166-1 alpha-2>,
    "CityGeoNameID": <number>,
    "CityNameEN": <string>,
    "Languages": {
      "<rank>": <ISO 639-1 string (optional ISO 3166-1 alpha-2 language code)>,
      ...
    },
    "TimeZone": <string from tz database>
  },
  "Acquisition": {
    "Referrer": <string>,
    "ReferrerHost": <string>,
    "AcquisitionSource": <string>,
    "AcquisitionMedium": <string>,
    "AcquisitionCampaign": <string>,
    "AcquisitionTerm": <string>,
    "AcquisitionContent": <string>,
    "AcquisitionCampaignID": <string>
  },
  "PerfMetrics": {
    "NavTimingsRaw": {
      <metric>: <number>,
      ...
    },
    "NavTimingsComputed": {
      <metric>: <number>,
      ...
    }
  }
}
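The derived fields in the Datetime block (HourUTC, WeekdayUTC, ISOWeekUTC, and so on) can all be computed from a single timestamp. A minimal sketch in Python, assuming ISO 8601 weekday numbering (Monday = 1 through Sunday = 7) and ISO week numbers; the actual BMRecord encoding may use a different convention:

```python
from datetime import datetime, timezone

def datetime_fields(dt: datetime) -> dict:
    """Derive the UTC Datetime sub-fields for one timestamp.

    Assumption: weekday and week number follow Python's ISO 8601
    conventions, which may differ from the real BMRecord encoding.
    """
    return {
        "DatetimeUTC": dt.strftime("%Y-%m-%d %H:%M"),
        "DateUTC": dt.strftime("%Y-%m-%d"),
        "TimeUTC": dt.strftime("%H:%M"),
        "HourUTC": dt.hour,
        "MonthUTC": dt.month,
        "WeekdayUTC": dt.isoweekday(),       # Monday=1 .. Sunday=7
        "ISOWeekUTC": dt.isocalendar()[1],   # ISO week number
    }

# Example: 2020-01-01 13:45 UTC was a Wednesday in ISO week 1.
fields = datetime_fields(datetime(2020, 1, 1, 13, 45, tzinfo=timezone.utc))
print(fields)
```

The Local variants are the same computation applied to the timestamp converted into the visitor's time zone (the record's Locale.TimeZone field).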

Note that not all properties may be populated in the final record. For example, with a web data integration configuration, if the webpage does not have a rel=canonical link in its <head> element (as per the HTML spec), then the PageProperties.CanonicalURL property will be an empty string.
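Consumers should therefore treat empty strings as absent values. A minimal sketch (the helper name is ours for illustration, not part of any Blockmetry API) that falls back from CanonicalURL to DocumentURL:

```python
def effective_url(record: dict) -> str:
    """Return the canonical URL of a hit, falling back to the document
    URL when the page had no rel=canonical (empty CanonicalURL)."""
    page = record.get("PageProperties", {})
    return page.get("CanonicalURL") or page.get("DocumentURL", "")

record = {
    "PageProperties": {
        "CanonicalURL": "",  # page had no rel=canonical link
        "DocumentURL": "https://example.com/a",
    }
}
print(effective_url(record))  # prints "https://example.com/a"
```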

Data security

Blockmetry writes to Amazon S3 or AWS Kinesis using HTTPS endpoints, so all data is encrypted in transit.

Once the data reaches your data stores, you, the customer, are responsible under the GDPR for its encryption, access controls, logging, auditing, and so on.
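For the S3 case, one way to meet the at-rest encryption responsibility is to enable default server-side encryption on the bucket. A sketch of the configuration (the bucket name is hypothetical; applying it needs AWS credentials, so the boto3 call is shown commented out to keep the snippet self-contained):

```python
# Default server-side encryption with S3-managed keys (SSE-S3 / AES256).
# This dict follows the shape the S3 PutBucketEncryption API expects.
encryption_config = {
    "Rules": [
        {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "AES256",
            }
        }
    ]
}

# Applying it with boto3 (requires credentials, hence commented out):
# import boto3
# boto3.client("s3").put_bucket_encryption(
#     Bucket="example-analytics-bucket",
#     ServerSideEncryptionConfiguration=encryption_config,
# )
print(encryption_config["Rules"][0]
      ["ApplyServerSideEncryptionByDefault"]["SSEAlgorithm"])
```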