Addons Datasets
Introduction
This is a work in progress. The work is being tracked here.
Data Reference
Example Queries
Sampling
The Addons dataset contains one or more records for every Main Summary record that contains a non-null value for client_id.
Each Addons record contains info for a single addon. If the main ping did not contain any active addons, there will be a single row with nulls for all the addon fields (to identify client_ids/records without any addons).
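The placeholder-row convention described above can be illustrated with a small plain-Python sketch. The sample rows, the example addon IDs, and the helper name clients_without_addons are hypothetical, and the sketch assumes that a null addon_id is enough to recognize a placeholder row. It is not the dataset's own tooling.

```python
from collections import defaultdict

# Hypothetical sample rows; only a few of the schema's fields are shown.
rows = [
    {"client_id": "a", "document_id": "d1", "addon_id": "x@example.com"},
    {"client_id": "a", "document_id": "d1", "addon_id": "y@example.com"},
    # Placeholder row: the ping had no active addons, so addon fields are null.
    {"client_id": "b", "document_id": "d2", "addon_id": None},
]


def clients_without_addons(rows):
    """Return the client_ids that only ever appear in null-addon rows."""
    addons_by_client = defaultdict(set)
    for row in rows:
        # Touch the entry so clients with only placeholder rows are recorded.
        addon_ids = addons_by_client[row["client_id"]]
        if row["addon_id"] is not None:
            addon_ids.add(row["addon_id"])
    return {client for client, ids in addons_by_client.items() if not ids}


no_addon_clients = clients_without_addons(rows)
```

Because every qualifying Main Summary record yields at least one row, clients without addons remain visible in the dataset and do not silently drop out of counts.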
Like the Main Summary dataset, no attempt is made to de-duplicate submissions by documentId, so any analysis that could be affected by duplicate records should take care to remove duplicates using the documentId field.
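As a rough illustration of de-duplication, the sketch below keeps the first row seen for each (document_id, addon_id) pair. It assumes the documentId mentioned above corresponds to the document_id column in the schema, and that an addon appears at most once per ping. The helper name dedupe_addon_rows and the sample rows are hypothetical. Keying on document_id alone would be too aggressive here, since one ping legitimately produces one row per addon.

```python
def dedupe_addon_rows(rows):
    """Keep the first row seen for each (document_id, addon_id) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["document_id"], row["addon_id"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample: document d1 was submitted twice, so one addon row repeats.
sample = [
    {"document_id": "d1", "client_id": "a", "addon_id": "x@example.com"},
    {"document_id": "d1", "client_id": "a", "addon_id": "y@example.com"},
    {"document_id": "d1", "client_id": "a", "addon_id": "x@example.com"},
    {"document_id": "d2", "client_id": "b", "addon_id": None},
]

deduped = dedupe_addon_rows(sample)
```

The same idea carries over to a query engine by dropping duplicates on the pair of columns instead of on document_id alone.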
Scheduling
This dataset is updated daily via the telemetry-airflow infrastructure. The job DAG runs every day after the Main Summary data has been generated. The DAG is here.
Schema
As of 2017-03-16, the current version of the addons dataset is v2, and has a schema as follows:
root
|-- document_id: string (nullable = true)
|-- client_id: string (nullable = true)
|-- subsession_start_date: string (nullable = true)
|-- normalized_channel: string (nullable = true)
|-- addon_id: string (nullable = true)
|-- blocklisted: boolean (nullable = true)
|-- name: string (nullable = true)
|-- user_disabled: boolean (nullable = true)
|-- app_disabled: boolean (nullable = true)
|-- version: string (nullable = true)
|-- scope: integer (nullable = true)
|-- type: string (nullable = true)
|-- foreign_install: boolean (nullable = true)
|-- has_binary_components: boolean (nullable = true)
|-- install_day: integer (nullable = true)
|-- update_day: integer (nullable = true)
|-- signed_state: integer (nullable = true)
|-- is_system: boolean (nullable = true)
|-- submission_date_s3: string (nullable = true)
|-- sample_id: string (nullable = true)
For more detail on where these fields come from in the raw data, please look in the AddonsView code.
The fields are all simple scalar values.