Introduction
This is a work in progress. The work is being tracked here.
Data Reference
Example Queries
This is a work in progress. The work is being tracked here.
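Until real examples are published, the sketch below shows the general shape of a typical query: counting events and distinct clients per category, method, and object for one submission date. It is only an illustration. It runs against an in-memory SQLite table holding made-up rows and a handful of the columns listed in the Schema section. The `YYYYMMDD` format used for `submission_date_s3` is an assumption, and the SQL dialect of the real query engine may differ.

```python
import sqlite3

# Stand-in for the real dataset: an in-memory SQLite table with a few of the
# schema's columns. All rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        client_id TEXT,
        submission_date_s3 TEXT,
        event_category TEXT NOT NULL,
        event_method TEXT NOT NULL,
        event_object TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("client-a", "20170201", "ui", "click", "back_button"),
        ("client-a", "20170201", "ui", "click", "back_button"),
        ("client-b", "20170201", "ui", "search", "search_bar"),
        ("client-b", "20170202", "ui", "click", "back_button"),
    ],
)

# Count events and distinct clients per (category, method, object) for one day.
# The date format (YYYYMMDD) is an assumption.
QUERY = """
    SELECT event_category, event_method, event_object,
           COUNT(*) AS n_events,
           COUNT(DISTINCT client_id) AS n_clients
    FROM events
    WHERE submission_date_s3 = '20170201'
    GROUP BY event_category, event_method, event_object
    ORDER BY n_events DESC, event_object
"""
daily_counts = conn.execute(QUERY).fetchall()
for row in daily_counts:
    print(row)
```

Filtering on `submission_date_s3` first keeps the query restricted to a single day of data, which is usually the cheapest way to explore a dataset that is updated daily.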
Sampling
The events dataset contains one row for each event in a main ping.
This dataset is derived from main_summary, so any of main_summary's filters affect this dataset as well.
Data is currently available from 2017-01-05 onward.
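The "one row for each event" layout can be sketched as a small flattening step. This is not the actual job's code. It assumes each event arrives as an array of the form `[timestamp, category, method, object, value, extra]` with the last two entries optional, and the `ping` dictionary layout and the `explode_events` helper are hypothetical names used only for this illustration.

```python
from typing import Any, Dict, Iterator

# Ping-level fields copied onto every event row (a small subset of the schema).
SHARED_FIELDS = ("document_id", "client_id")


def explode_events(ping: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per event in a (hypothetical) ping dictionary."""
    shared = {name: ping.get(name) for name in SHARED_FIELDS}
    for event in ping.get("events") or []:
        # Pad short events so the optional trailing fields default to None.
        padded = (list(event) + [None] * 6)[:6]
        timestamp, category, method, obj, value, extra = padded
        row = dict(shared)
        row.update(
            event_timestamp=timestamp,
            event_category=category,
            event_method=method,
            event_object=obj,
            event_string_value=value,
            event_map_values=extra,
        )
        yield row


# Made-up ping with two events; the second carries a value and an extra map.
ping = {
    "document_id": "doc-1",
    "client_id": "client-a",
    "events": [
        [2147, "ui", "click", "back_button"],
        [2892, "ui", "completion", "search_bar", "yahoo", {"querylen": "7"}],
    ],
}
rows = list(explode_events(ping))
for row in rows:
    print(row)
```

A ping without events produces no rows at all, so clients that never record events do not appear in a dataset built this way.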
Scheduling
The events dataset is updated daily, shortly after main_summary is updated.
The job is scheduled on Airflow.
The DAG is here.
Firefox events
Firefox has an API to record events, which are then submitted through the main ping. The format and mechanism of event collection in Firefox are documented here.
The full events data pipeline is documented here.
Schema
As of 2017-01-26, the current version of the events dataset is v1, and it has the following schema:
root
|-- document_id: string (nullable = true)
|-- client_id: string (nullable = true)
|-- normalized_channel: string (nullable = true)
|-- country: string (nullable = true)
|-- locale: string (nullable = true)
|-- app_name: string (nullable = true)
|-- app_version: string (nullable = true)
|-- os: string (nullable = true)
|-- os_version: string (nullable = true)
|-- subsession_start_date: string (nullable = true)
|-- subsession_length: long (nullable = true)
|-- sync_configured: boolean (nullable = true)
|-- sync_count_desktop: integer (nullable = true)
|-- sync_count_mobile: integer (nullable = true)
|-- timestamp: long (nullable = true)
|-- sample_id: string (nullable = true)
|-- event_timestamp: long (nullable = false)
|-- event_category: string (nullable = false)
|-- event_method: string (nullable = false)
|-- event_object: string (nullable = false)
|-- event_string_value: string (nullable = true)
|-- event_map_values: map (nullable = true)
|    |-- key: string
|    |-- value: string
|-- submission_date_s3: string (nullable = true)
|-- doc_type: string (nullable = true)
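When reading rows, the nullability flags in the schema matter: the four core event fields are marked `nullable = false`, while `event_string_value` and `event_map_values` may be null. The sketch below shows one way to handle that in client code. It assumes rows have been loaded as plain Python dictionaries, and the helper names `missing_required` and `map_value` are hypothetical.

```python
from typing import Any, Dict, List, Optional

# Columns marked "nullable = false" in the schema.
NON_NULLABLE = (
    "event_timestamp",
    "event_category",
    "event_method",
    "event_object",
)


def missing_required(row: Dict[str, Any]) -> List[str]:
    """Return the non-nullable columns that are absent or None in a row."""
    return [name for name in NON_NULLABLE if row.get(name) is None]


def map_value(row: Dict[str, Any], key: str,
              default: Optional[str] = None) -> Optional[str]:
    """Look up one key in event_map_values, tolerating a null map."""
    values = row.get("event_map_values") or {}
    return values.get(key, default)


# Made-up rows for illustration.
complete_row = {
    "event_timestamp": 2892,
    "event_category": "ui",
    "event_method": "completion",
    "event_object": "search_bar",
    "event_string_value": "yahoo",
    "event_map_values": {"querylen": "7"},
}
sparse_row = {
    "event_timestamp": 2147,
    "event_category": "ui",
    "event_method": "click",
    "event_object": "back_button",
    "event_string_value": None,
    "event_map_values": None,
}

print(missing_required(complete_row))
print(map_value(complete_row, "querylen"))
print(map_value(sparse_row, "querylen", default="n/a"))
```

Both keys and values of `event_map_values` are strings, so numeric extras such as the made-up `querylen` above need an explicit conversion before they can be aggregated.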