Addons Datasets

Introduction

This is a work in progress. The work is being tracked here.

Data Reference

Example Queries

Sampling

The Addons dataset contains one or more records for every Main Summary record that has a non-null value for client_id. Each Addons record contains info for a single addon. If the main ping did not contain any active addons, there is a single row with nulls for all the addon fields, so that client_ids and records without any addons can still be identified.
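
The row-per-addon layout can be sketched in plain Python. This is only an illustration of the shape of the output, not the AddonsView implementation: the `active_addons` key, the `explode_addons` helper, and the sample pings are hypothetical names and data made up for the example.

```python
from typing import Any, Dict, Iterator, List

# Addon-level columns used in this sketch (a subset of the real schema).
ADDON_FIELDS = ["addon_id", "name", "version"]


def explode_addons(record: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one row per active addon in a Main Summary-like record.

    `active_addons` is a hypothetical key holding a list of addon dicts.
    Records with a null client_id yield nothing; records without addons
    yield a single row whose addon fields are all None.
    """
    if record.get("client_id") is None:
        return
    base = {
        "document_id": record.get("document_id"),
        "client_id": record["client_id"],
    }
    addons: List[Dict[str, Any]] = record.get("active_addons") or []
    if not addons:
        # Placeholder row: keeps the client visible even with no addons.
        yield {**base, **{field: None for field in ADDON_FIELDS}}
        return
    for addon in addons:
        yield {**base, **{field: addon.get(field) for field in ADDON_FIELDS}}


# Made-up sample records, for illustration only.
pings = [
    {"document_id": "doc-1", "client_id": "a", "active_addons": [
        {"addon_id": "x@example", "name": "X", "version": "1.0"},
        {"addon_id": "y@example", "name": "Y", "version": "2.1"},
    ]},
    {"document_id": "doc-2", "client_id": "b", "active_addons": []},
    {"document_id": "doc-3", "client_id": None, "active_addons": [
        {"addon_id": "z@example", "name": "Z", "version": "0.3"},
    ]},
]

rows = [row for ping in pings for row in explode_addons(ping)]
```

In this sketch the first ping produces two rows, the second produces one placeholder row with null addon fields, and the third is skipped because its client_id is null.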

Like the Main Summary dataset, no attempt is made to de-duplicate submissions by documentId, so any analysis that could be affected by duplicate records should take care to remove duplicates using the documentId field (the document_id column in the schema below).
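
One possible way to remove duplicates is sketched below in plain Python. Because a single submission yields one row per addon, this sketch assumes that duplicates should be identified by the (document_id, addon_id) pair rather than by document_id alone, so that legitimate addon rows from the same submission are kept. That choice, the `dedupe_rows` helper, and the sample rows are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple


def dedupe_rows(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first row seen for each (document_id, addon_id) pair."""
    seen: Set[Tuple[Optional[str], Optional[str]]] = set()
    unique: List[Dict[str, Any]] = []
    for row in rows:
        key = (row["document_id"], row["addon_id"])
        if key in seen:
            continue  # same submission and same addon: a duplicate
        seen.add(key)
        unique.append(row)
    return unique


# Made-up sample rows: doc-1 was submitted twice for addon x@example.
sample_rows = [
    {"document_id": "doc-1", "client_id": "a", "addon_id": "x@example"},
    {"document_id": "doc-1", "client_id": "a", "addon_id": "y@example"},
    {"document_id": "doc-1", "client_id": "a", "addon_id": "x@example"},
    {"document_id": "doc-2", "client_id": "b", "addon_id": None},
]

deduped = dedupe_rows(sample_rows)
```

The placeholder row for a client without addons (null addon_id) is treated like any other row, so it is retained once per document.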

Scheduling

This dataset is updated daily via the telemetry-airflow infrastructure. The job DAG runs every day after the Main Summary data has been generated. The DAG is here.

Schema

As of 2017-03-16, the current version of the addons dataset is v2, with the following schema:

root
 |-- document_id: string (nullable = true)
 |-- client_id: string (nullable = true)
 |-- subsession_start_date: string (nullable = true)
 |-- normalized_channel: string (nullable = true)
 |-- addon_id: string (nullable = true)
 |-- blocklisted: boolean (nullable = true)
 |-- name: string (nullable = true)
 |-- user_disabled: boolean (nullable = true)
 |-- app_disabled: boolean (nullable = true)
 |-- version: string (nullable = true)
 |-- scope: integer (nullable = true)
 |-- type: string (nullable = true)
 |-- foreign_install: boolean (nullable = true)
 |-- has_binary_components: boolean (nullable = true)
 |-- install_day: integer (nullable = true)
 |-- update_day: integer (nullable = true)
 |-- signed_state: integer (nullable = true)
 |-- is_system: boolean (nullable = true)
 |-- submission_date_s3: string (nullable = true)
 |-- sample_id: string (nullable = true)

For more detail on where these fields come from in the raw data, please look in the AddonsView code.

The fields are all simple scalar values.
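
Because every field is a scalar, rows can be aggregated directly without unnesting. As a minimal sketch, the following plain-Python example counts distinct clients per addon and skips the placeholder rows whose addon fields are null. The `clients_per_addon` helper and the sample rows are hypothetical; a real analysis would run an equivalent query against the dataset itself.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


def clients_per_addon(rows: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Count distinct client_ids for each non-null addon_id."""
    clients: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        if row["addon_id"] is None:
            continue  # placeholder row for a client without addons
        clients[row["addon_id"]].add(row["client_id"])
    return {addon_id: len(ids) for addon_id, ids in clients.items()}


# Made-up sample rows, for illustration only.
example_rows = [
    {"client_id": "a", "addon_id": "x@example"},
    {"client_id": "a", "addon_id": "x@example"},
    {"client_id": "b", "addon_id": "x@example"},
    {"client_id": "b", "addon_id": "y@example"},
    {"client_id": "c", "addon_id": None},
]

counts = clients_per_addon(example_rows)
```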