Introduction

This is a work in progress. The work is being tracked here.

Data Reference

Example Queries

This is a work in progress. The work is being tracked here.

Sampling

The events dataset contains one row for each event in a main ping. This dataset is derived from main_summary, so any filters applied to main_summary affect this dataset as well.

Data is currently available from 2017-01-05 onward.
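
To illustrate the one-row-per-event layout, here is a minimal sketch in plain Python. It is not the actual derivation job: the input record, the `explode_events` helper, and the positional event layout (timestamp, category, method, object, optional value, optional extra map) are assumptions made for illustration, chosen to line up with the event_* column names of this dataset.

```python
from typing import Any, Dict, List

# Hypothetical, heavily trimmed stand-in for one main_summary row.
# The positional event layout used here is an assumption of this sketch.
summary_row = {
    "document_id": "doc-1",
    "client_id": "client-a",
    "events": [
        [1200, "navigation", "search", "urlbar", "enter", {"engine": "example"}],
        [3400, "ui", "click", "reload-button"],
    ],
}


def explode_events(row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat record per event, repeating the ping-level fields."""
    ping_fields = {k: v for k, v in row.items() if k != "events"}
    records = []
    for event in row.get("events") or []:
        # Pad short events so the optional value and extra slots exist.
        padded = list(event) + [None] * (6 - len(event))
        timestamp, category, method, obj, value, extra = padded[:6]
        records.append({
            **ping_fields,
            "event_timestamp": timestamp,
            "event_category": category,
            "event_method": method,
            "event_object": obj,
            "event_string_value": value,
            "event_map_values": extra,
        })
    return records


event_rows = explode_events(summary_row)
for r in event_rows:
    print(r["document_id"], r["event_category"], r["event_method"], r["event_object"])
```

Each output record repeats the ping-level fields, which is why a ping with no events contributes no rows at all in this sketch.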

Scheduling

The events dataset is updated daily, shortly after main_summary is updated. The job is scheduled on Airflow. The DAG is here.

Firefox events

Firefox has an API to record events, which are then submitted through the main ping. The format and mechanism of event collection in Firefox are documented here.

The full events data pipeline is documented here.

Schema

As of 2017-01-26, the current version of the events dataset is v1, with the following schema:

root
 |-- document_id: string (nullable = true)
 |-- client_id: string (nullable = true)
 |-- normalized_channel: string (nullable = true)
 |-- country: string (nullable = true)
 |-- locale: string (nullable = true)
 |-- app_name: string (nullable = true)
 |-- app_version: string (nullable = true)
 |-- os: string (nullable = true)
 |-- os_version: string (nullable = true)
 |-- subsession_start_date: string (nullable = true)
 |-- subsession_length: long (nullable = true)
 |-- sync_configured: boolean (nullable = true)
 |-- sync_count_desktop: integer (nullable = true)
 |-- sync_count_mobile: integer (nullable = true)
 |-- timestamp: long (nullable = true)
 |-- sample_id: string (nullable = true)
 |-- event_timestamp: long (nullable = false)
 |-- event_category: string (nullable = false)
 |-- event_method: string (nullable = false)
 |-- event_object: string (nullable = false)
 |-- event_string_value: string (nullable = true)
 |-- event_map_values: map (nullable = true)
 |    |-- key: string
 |    |-- value: string
 |-- submission_date_s3: string (nullable = true)
 |-- doc_type: string (nullable = true)
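
As a rough illustration of how rows with this schema might be aggregated, the sketch below works on a handful of made-up in-memory records rather than the real dataset. The sample values, the helper names `count_events` and `count_map_value`, and the YYYYMMDD format assumed for submission_date_s3 are all assumptions. Because event_string_value and event_map_values are nullable, the code guards against missing values.

```python
from collections import Counter
from typing import Any, Dict, List

# Made-up rows carrying a subset of the schema's columns.
# The YYYYMMDD date format is an assumption of this sketch.
rows: List[Dict[str, Any]] = [
    {"client_id": "a", "submission_date_s3": "20170126",
     "event_category": "navigation", "event_method": "search",
     "event_object": "urlbar", "event_map_values": {"engine": "example"}},
    {"client_id": "a", "submission_date_s3": "20170126",
     "event_category": "navigation", "event_method": "search",
     "event_object": "urlbar", "event_map_values": None},
    {"client_id": "b", "submission_date_s3": "20170126",
     "event_category": "ui", "event_method": "click",
     "event_object": "reload-button", "event_map_values": None},
    {"client_id": "b", "submission_date_s3": "20170127",
     "event_category": "navigation", "event_method": "search",
     "event_object": "searchbar", "event_map_values": {"engine": "other"}},
]


def count_events(records: List[Dict[str, Any]], day: str) -> Counter:
    """Count events per (category, method, object) for one submission date."""
    return Counter(
        (r["event_category"], r["event_method"], r["event_object"])
        for r in records
        if r["submission_date_s3"] == day
    )


def count_map_value(records: List[Dict[str, Any]], key: str) -> Counter:
    """Count values stored under `key` in event_map_values, skipping nulls."""
    counts: Counter = Counter()
    for r in records:
        value = (r["event_map_values"] or {}).get(key)
        if value is not None:
            counts[value] += 1
    return counts


daily = count_events(rows, "20170126")
engines = count_map_value(rows, "engine")
print(daily.most_common(1))
print(dict(engines))
```

The same grouping translates directly to a GROUP BY on event_category, event_method, and event_object in a SQL or Spark query against the real table.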