Main Summary
Introduction
The main_summary
table is the most direct representation of a main ping
but can be difficult to work with due to its size.
Prefer the longitudinal
dataset unless using the sampled data is prohibitive.
Contents
The main_summary
table contains one row for each ping.
Each column represents one field from the main ping payload,
though only a subset of all main ping fields are included.
This dataset does not include histograms.
Background and Caveats
This table is massive, and due to its size, it can be difficult to work with.
You should avoid querying main_summary
from re:dash.
Your queries will be slow to complete and can impact performance for other users,
since re:dash is on a shared cluster.
Instead, we recommend using the longitudinal
or clients_daily
dataset where possible.
If these datasets do not suffice, consider using Spark on an
ATMO cluster.
In the odd case where these queries are necessary,
make use of the sample_id
field and limit to a short submission date range.
Accessing the Data
The data is stored as a parquet table in S3 at the following address. See this cookbook to get started working with the data in Spark.
s3://telemetry-parquet/main_summary/v4/
Though not recommended, main_summary
is accessible through re:dash.
Here's an example query.
Your queries will be slow to complete and can impact performance for other users,
since re:dash is on a shared cluster.
Further Reading
The technical documentation for main_summary
is located in the
telemetry-batch-view documentation.
The code responsible for generating this dataset is here.
Adding New Fields
We support a few basic types that can be easily added to main_summary
.
Non-addon scalars are automatically added to main_summary
.
User Preferences
These are added in the userPrefsList
, near the top of the
Main Summary file.
They must be available in the ping environment
to be included here. There is more information in the file itself.
Once added, they will show as top-level fields, with the string user_pref_
prepended. For example, IntegerUserPref("dom.ipc.processCount")
becomes user_pref_dom_ipc_processcount
.
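The naming rule can be sketched in a few lines of Python. This is only an illustration: the helper name user_pref_column is hypothetical, and the rule it applies (replace every non-alphanumeric character with an underscore, then lowercase) is an assumption inferred from the example above and from the user_pref_* columns in the schema, not taken from the generating code.

```python
import re


def user_pref_column(pref_name):
    """Guess the main_summary column name for a user preference.

    Assumption: non-alphanumeric characters become "_" and the
    result is lowercased, as the documented example suggests.
    """
    return "user_pref_" + re.sub(r"[^A-Za-z0-9]", "_", pref_name).lower()


print(user_pref_column("dom.ipc.processCount"))
```

If the derived name does not show up in the table, check the Main Summary file for the authoritative transformation.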
Histograms
Histograms can simply be added to the histogramsWhitelist
near the top of the
Main Summary file.
Add the name of the histogram at its alphabetically sorted position in the list.
Each process a histogram is recorded in will have a column in main_summary
, with the string histogram_
prepended. For example, CYCLE_COLLECTOR_MAX_PAUSE
is recorded in the parent
, content
, and gpu
processes (according to the definition).
It will then result in three columns:
histogram_parent_cycle_collector_max_pause
histogram_content_cycle_collector_max_pause
histogram_gpu_cycle_collector_max_pause
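As a rough sketch of this naming pattern, the following Python helper builds the expected column names from a histogram name and the processes it is recorded in. The function name histogram_columns is hypothetical, and the pattern is inferred from the three columns listed above rather than from the generating code.

```python
def histogram_columns(histogram_name, processes):
    """Return the expected main_summary column names for a histogram.

    Assumption: each column is "histogram_<process>_<lowercased name>",
    matching the CYCLE_COLLECTOR_MAX_PAUSE example.
    """
    return [
        "histogram_{}_{}".format(process, histogram_name.lower())
        for process in processes
    ]


for column in histogram_columns(
    "CYCLE_COLLECTOR_MAX_PAUSE", ["parent", "content", "gpu"]
):
    print(column)
```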
Addon Scalars
Addon scalars are recorded by an addon. To include one of these, add the definition to the addon scalars definition file in telemetry-batch-view. Be sure to include the section:
record_in_processes:
- 'dynamic'
The addon scalars can then be found in the associated column, depending on their type:
string_addon_scalars
keyed_string_addon_scalars
uint_addon_scalars
keyed_uint_addon_scalars
boolean_addon_scalars
keyed_boolean_addon_scalars
These columns are all maps. Each addon scalar is a key within the map for its type; the key is formed by joining the scalar's top-level section in Scalars.yaml
and its name with an underscore. As an example, consider the following scalar definition:
test:
  misunderestimated_nucular:
    description: A test scalar, no soup for you!
    expires: never
    kind: string
    keyed: true
    notification_emails:
      - frank@mozilla.com
    record_in_processes:
      - 'dynamic'
For example, you could find the addon scalar test.misunderestimated_nucular
, a keyed string scalar, using keyed_string_addon_scalars['test_misunderestimated_nucular']
.
In general, use element_at
, which returns NULL
when the key is not found: element_at(keyed_string_addon_scalars, 'test_misunderestimated_nucular')
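The lookup can be imitated in plain Python to show how the key is built and why a null-safe lookup helps. This is a sketch under assumptions: the row contents are invented sample data, addon_scalar_key is a hypothetical helper, and the local element_at only mimics the "missing key gives NULL" behaviour described above.

```python
def addon_scalar_key(section, name):
    """Join the top-level section and the scalar name with an underscore."""
    return "{}_{}".format(section, name)


def element_at(mapping, key):
    """Mimic a null-safe map lookup: None when the map or key is missing."""
    if mapping is None:
        return None
    return mapping.get(key)


# Invented sample row: a keyed string scalar maps its own keys to strings.
row = {
    "keyed_string_addon_scalars": {
        "test_misunderestimated_nucular": {"some_key": "some_value"},
    }
}

key = addon_scalar_key("test", "misunderestimated_nucular")
print(element_at(row["keyed_string_addon_scalars"], key))
print(element_at(row["keyed_string_addon_scalars"], "test_not_recorded"))
```

The second lookup returns None instead of raising an error, which is the reason to prefer element_at over direct subscripting in queries.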
Other Fields
We can include other types of fields as well, for example if a specific transformation is needed. The data must be available in the Main Ping.
Data Reference
Example Queries
We recommend working with this dataset via Spark rather than sql.t.m.o
.
Due to the large number of records,
queries can consume a lot of resources on the
shared cluster and impact other users.
Queries via sql.t.m.o
should limit to a short submission_date_s3
range,
and ideally make use of the sample_id
field.
When using Presto to query the data from sql.t.m.o
,
you can use the UNNEST
feature to access items in the
search_counts
, popup_notification_stats
and active_addons
fields.
For example, to compare the search volume for different search source values, you could use:
WITH search_data AS (
  SELECT
    s.source AS search_source,
    s.count AS search_count
  FROM
    main_summary
  CROSS JOIN UNNEST(search_counts) AS t(s)
  WHERE
    submission_date_s3 = '20160510'
    AND sample_id = '42'
    AND search_counts IS NOT NULL
)
SELECT
  search_source,
  sum(search_count) AS total_searches
FROM search_data
GROUP BY search_source
ORDER BY sum(search_count) DESC
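To make the effect of CROSS JOIN UNNEST concrete, here is a small in-memory Python sketch of the same aggregation. It does not query main_summary: the rows, engine names, and counts are invented sample data, and total_searches_by_source is a hypothetical helper that flattens each row's search_counts array and sums the counts per source.

```python
from collections import Counter

# Invented sample rows; only the field used by the query is present.
rows = [
    {"search_counts": [
        {"engine": "engine-a", "source": "urlbar", "count": 3},
        {"engine": "engine-a", "source": "searchbar", "count": 1},
    ]},
    {"search_counts": None},  # dropped, like "search_counts IS NOT NULL"
    {"search_counts": [
        {"engine": "engine-b", "source": "urlbar", "count": 2},
    ]},
]


def total_searches_by_source(rows):
    """Flatten search_counts (the UNNEST step) and sum counts per source."""
    totals = Counter()
    for row in rows:
        for item in row["search_counts"] or []:
            # sum() in SQL skips NULL values, so skip missing counts here too.
            if item["count"] is not None:
                totals[item["source"]] += item["count"]
    # Highest total first, like ORDER BY sum(search_count) DESC.
    return totals.most_common()


print(total_searches_by_source(rows))
```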
Sampling
The main_summary
dataset contains one record for each main
ping
as long as the record contains a non-null value for
documentId
, submissionDate
, and Timestamp
.
We do not ever expect nulls for these fields.
Scheduling
This dataset is updated daily via the telemetry-airflow infrastructure. The job DAG runs every day shortly after midnight UTC. You can find the job definition here.
Schema
As of 2017-12-03, the current version of the main_summary
dataset is v4
, and has a schema as follows:
root
|-- document_id: string (nullable = false)
|-- client_id: string (nullable = true)
|-- channel: string (nullable = true)
|-- normalized_channel: string (nullable = true)
|-- normalized_os_version: string (nullable = true)
|-- country: string (nullable = true)
|-- city: string (nullable = true)
|-- geo_subdivision1: string (nullable = true)
|-- geo_subdivision2: string (nullable = true)
|-- os: string (nullable = true)
|-- os_version: string (nullable = true)
|-- os_service_pack_major: long (nullable = true)
|-- os_service_pack_minor: long (nullable = true)
|-- windows_build_number: long (nullable = true)
|-- windows_ubr: long (nullable = true)
|-- install_year: long (nullable = true)
|-- is_wow64: boolean (nullable = true)
|-- memory_mb: integer (nullable = true)
|-- cpu_count: integer (nullable = true)
|-- cpu_cores: integer (nullable = true)
|-- cpu_vendor: string (nullable = true)
|-- cpu_family: integer (nullable = true)
|-- cpu_model: integer (nullable = true)
|-- cpu_stepping: integer (nullable = true)
|-- cpu_l2_cache_kb: integer (nullable = true)
|-- cpu_l3_cache_kb: integer (nullable = true)
|-- cpu_speed_mhz: integer (nullable = true)
|-- gfx_features_d3d11_status: string (nullable = true)
|-- gfx_features_d2d_status: string (nullable = true)
|-- gfx_features_gpu_process_status: string (nullable = true)
|-- gfx_features_advanced_layers_status: string (nullable = true)
|-- apple_model_id: string (nullable = true)
|-- antivirus: array (nullable = true)
| |-- element: string (containsNull = false)
|-- antispyware: array (nullable = true)
| |-- element: string (containsNull = false)
|-- firewall: array (nullable = true)
| |-- element: string (containsNull = false)
|-- profile_creation_date: long (nullable = true)
|-- profile_reset_date: long (nullable = true)
|-- previous_build_id: string (nullable = true)
|-- session_id: string (nullable = true)
|-- subsession_id: string (nullable = true)
|-- previous_session_id: string (nullable = true)
|-- previous_subsession_id: string (nullable = true)
|-- session_start_date: string (nullable = true)
|-- subsession_start_date: string (nullable = true)
|-- session_length: long (nullable = true)
|-- subsession_length: long (nullable = true)
|-- subsession_counter: integer (nullable = true)
|-- profile_subsession_counter: integer (nullable = true)
|-- creation_date: string (nullable = true)
|-- distribution_id: string (nullable = true)
|-- submission_date: string (nullable = false)
|-- sync_configured: boolean (nullable = true)
|-- sync_count_desktop: integer (nullable = true)
|-- sync_count_mobile: integer (nullable = true)
|-- app_build_id: string (nullable = true)
|-- app_display_version: string (nullable = true)
|-- app_name: string (nullable = true)
|-- app_version: string (nullable = true)
|-- timestamp: long (nullable = false)
|-- env_build_id: string (nullable = true)
|-- env_build_version: string (nullable = true)
|-- env_build_arch: string (nullable = true)
|-- e10s_enabled: boolean (nullable = true)
|-- e10s_multi_processes: long (nullable = true)
|-- locale: string (nullable = true)
|-- update_channel: string (nullable = true)
|-- update_enabled: boolean (nullable = true)
|-- update_auto_download: boolean (nullable = true)
|-- attribution: struct (nullable = true)
| |-- source: string (nullable = true)
| |-- medium: string (nullable = true)
| |-- campaign: string (nullable = true)
| |-- content: string (nullable = true)
|-- sandbox_effective_content_process_level: integer (nullable = true)
|-- active_experiment_id: string (nullable = true)
|-- active_experiment_branch: string (nullable = true)
|-- reason: string (nullable = true)
|-- timezone_offset: integer (nullable = true)
|-- plugin_hangs: integer (nullable = true)
|-- aborts_plugin: integer (nullable = true)
|-- aborts_content: integer (nullable = true)
|-- aborts_gmplugin: integer (nullable = true)
|-- crashes_detected_plugin: integer (nullable = true)
|-- crashes_detected_content: integer (nullable = true)
|-- crashes_detected_gmplugin: integer (nullable = true)
|-- crash_submit_attempt_main: integer (nullable = true)
|-- crash_submit_attempt_content: integer (nullable = true)
|-- crash_submit_attempt_plugin: integer (nullable = true)
|-- crash_submit_success_main: integer (nullable = true)
|-- crash_submit_success_content: integer (nullable = true)
|-- crash_submit_success_plugin: integer (nullable = true)
|-- shutdown_kill: integer (nullable = true)
|-- active_addons_count: long (nullable = true)
|-- flash_version: string (nullable = true)
|-- vendor: string (nullable = true)
|-- is_default_browser: boolean (nullable = true)
|-- default_search_engine_data_name: string (nullable = true)
|-- default_search_engine_data_load_path: string (nullable = true)
|-- default_search_engine_data_origin: string (nullable = true)
|-- default_search_engine_data_submission_url: string (nullable = true)
|-- default_search_engine: string (nullable = true)
|-- devtools_toolbox_opened_count: integer (nullable = true)
|-- client_submission_date: string (nullable = true)
|-- client_clock_skew: long (nullable = true)
|-- client_submission_latency: long (nullable = true)
|-- places_bookmarks_count: integer (nullable = true)
|-- places_pages_count: integer (nullable = true)
|-- push_api_notify: integer (nullable = true)
|-- web_notification_shown: integer (nullable = true)
|-- popup_notification_stats: map (nullable = true)
| |-- key: string
| |-- value: struct (valueContainsNull = true)
| | |-- offered: integer (nullable = true)
| | |-- action_1: integer (nullable = true)
| | |-- action_2: integer (nullable = true)
| | |-- action_3: integer (nullable = true)
| | |-- action_last: integer (nullable = true)
| | |-- dismissal_click_elsewhere: integer (nullable = true)
| | |-- dismissal_leave_page: integer (nullable = true)
| | |-- dismissal_close_button: integer (nullable = true)
| | |-- dismissal_not_now: integer (nullable = true)
| | |-- open_submenu: integer (nullable = true)
| | |-- learn_more: integer (nullable = true)
| | |-- reopen_offered: integer (nullable = true)
| | |-- reopen_action_1: integer (nullable = true)
| | |-- reopen_action_2: integer (nullable = true)
| | |-- reopen_action_3: integer (nullable = true)
| | |-- reopen_action_last: integer (nullable = true)
| | |-- reopen_dismissal_click_elsewhere: integer (nullable = true)
| | |-- reopen_dismissal_leave_page: integer (nullable = true)
| | |-- reopen_dismissal_close_button: integer (nullable = true)
| | |-- reopen_dismissal_not_now: integer (nullable = true)
| | |-- reopen_open_submenu: integer (nullable = true)
| | |-- reopen_learn_more: integer (nullable = true)
|-- search_counts: array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- engine: string (nullable = true)
| | |-- source: string (nullable = true)
| | |-- count: long (nullable = true)
|-- active_addons: array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- addon_id: string (nullable = false)
| | |-- blocklisted: boolean (nullable = true)
| | |-- name: string (nullable = true)
| | |-- user_disabled: boolean (nullable = true)
| | |-- app_disabled: boolean (nullable = true)
| | |-- version: string (nullable = true)
| | |-- scope: integer (nullable = true)
| | |-- type: string (nullable = true)
| | |-- foreign_install: boolean (nullable = true)
| | |-- has_binary_components: boolean (nullable = true)
| | |-- install_day: integer (nullable = true)
| | |-- update_day: integer (nullable = true)
| | |-- signed_state: integer (nullable = true)
| | |-- is_system: boolean (nullable = true)
| | |-- is_web_extension: boolean (nullable = true)
| | |-- multiprocess_compatible: boolean (nullable = true)
|-- disabled_addons_ids: array (nullable = true)
| |-- element: string (containsNull = false)
|-- active_theme: struct (nullable = true)
| |-- addon_id: string (nullable = false)
| |-- blocklisted: boolean (nullable = true)
| |-- name: string (nullable = true)
| |-- user_disabled: boolean (nullable = true)
| |-- app_disabled: boolean (nullable = true)
| |-- version: string (nullable = true)
| |-- scope: integer (nullable = true)
| |-- type: string (nullable = true)
| |-- foreign_install: boolean (nullable = true)
| |-- has_binary_components: boolean (nullable = true)
| |-- install_day: integer (nullable = true)
| |-- update_day: integer (nullable = true)
| |-- signed_state: integer (nullable = true)
| |-- is_system: boolean (nullable = true)
| |-- is_web_extension: boolean (nullable = true)
| |-- multiprocess_compatible: boolean (nullable = true)
|-- blocklist_enabled: boolean (nullable = true)
|-- addon_compatibility_check_enabled: boolean (nullable = true)
|-- telemetry_enabled: boolean (nullable = true)
|-- user_prefs: struct (nullable = true)
| |-- dom_ipc_process_count: integer (nullable = true)
| |-- extensions_allow_non_mpc_extensions: boolean (nullable = true)
|-- events: array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- timestamp: long (nullable = false)
| | |-- category: string (nullable = false)
| | |-- method: string (nullable = false)
| | |-- object: string (nullable = false)
| | |-- string_value: string (nullable = true)
| | |-- map_values: map (nullable = true)
| | | |-- key: string
| | | |-- value: string (valueContainsNull = true)
|-- ssl_handshake_result_success: integer (nullable = true)
|-- ssl_handshake_result_failure: integer (nullable = true)
|-- ssl_handshake_result: map (nullable = true)
| |-- key: string
| |-- value: integer (valueContainsNull = true)
|-- active_ticks: integer (nullable = true)
|-- main: integer (nullable = true)
|-- first_paint: integer (nullable = true)
|-- session_restored: integer (nullable = true)
|-- total_time: integer (nullable = true)
|-- plugins_notification_shown: integer (nullable = true)
|-- plugins_notification_user_action: struct (nullable = true)
| |-- allow_now: integer (nullable = true)
| |-- allow_always: integer (nullable = true)
| |-- block: integer (nullable = true)
|-- plugins_infobar_shown: integer (nullable = true)
|-- plugins_infobar_block: integer (nullable = true)
|-- plugins_infobar_allow: integer (nullable = true)
|-- plugins_infobar_dismissed: integer (nullable = true)
|-- experiments: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
|-- search_cohort: string (nullable = true)
|-- gfx_compositor: string (nullable = true)
|-- quantum_ready: boolean (nullable = true)
|-- gc_max_pause_ms_main_above_150: long (nullable = true)
|-- gc_max_pause_ms_main_above_250: long (nullable = true)
|-- gc_max_pause_ms_main_above_2500: long (nullable = true)
|-- gc_max_pause_ms_content_above_150: long (nullable = true)
|-- gc_max_pause_ms_content_above_250: long (nullable = true)
|-- gc_max_pause_ms_content_above_2500: long (nullable = true)
|-- cycle_collector_max_pause_main_above_150: long (nullable = true)
|-- cycle_collector_max_pause_main_above_250: long (nullable = true)
|-- cycle_collector_max_pause_main_above_2500: long (nullable = true)
|-- cycle_collector_max_pause_content_above_150: long (nullable = true)
|-- cycle_collector_max_pause_content_above_250: long (nullable = true)
|-- cycle_collector_max_pause_content_above_2500: long (nullable = true)
|-- input_event_response_coalesced_ms_main_above_150: long (nullable = true)
|-- input_event_response_coalesced_ms_main_above_250: long (nullable = true)
|-- input_event_response_coalesced_ms_main_above_2500: long (nullable = true)
|-- input_event_response_coalesced_ms_content_above_150: long (nullable = true)
|-- input_event_response_coalesced_ms_content_above_250: long (nullable = true)
|-- input_event_response_coalesced_ms_content_above_2500: long (nullable = true)
|-- ghost_windows_main_above_1: long (nullable = true)
|-- ghost_windows_content_above_1: long (nullable = true)
|-- user_pref_dom_ipc_plugins_sandbox_level_flash: integer (nullable = true)
|-- user_pref_dom_ipc_processcount: integer (nullable = true)
|-- user_pref_extensions_allow_non_mpc_extensions: boolean (nullable = true)
|-- user_pref_extensions_legacy_enabled: boolean (nullable = true)
|-- user_pref_browser_search_widget_innavbar: boolean (nullable = true)
|-- user_pref_general_config_filename: string (nullable = true)
|-- ** dynamically included scalar fields, see source **
|-- ** dynamically included whitelisted histograms, see source **
|-- boolean_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: boolean (valueContainsNull = true)
|-- keyed_boolean_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: map (valueContainsNull = true)
| | |-- key: string
| | |-- value: boolean (valueContainsNull = true)
|-- string_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: string (valueContainsNull = true)
|-- keyed_string_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: map (valueContainsNull = true)
| | |-- key: string
| | |-- value: string (valueContainsNull = true)
|-- uint_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: integer (valueContainsNull = true)
|-- keyed_uint_addon_scalars: map (nullable = true)
| |-- key: string
| |-- value: map (valueContainsNull = true)
| | |-- key: string
| | |-- value: integer (valueContainsNull = true)
|-- submission_date_s3: string (nullable = true)
|-- sample_id: string (nullable = true)
For more detail on where these fields come from in the raw data, please look at the buildSchema function in the MainSummaryView code.
Most of the fields are simple scalar values, with a few notable exceptions:
- The search_counts field is an array of structs, each item in the array representing a 3-tuple of (engine, source, count). The engine field represents the name of the search engine against which the searches were done. The source field represents the part of the Firefox UI that was used to perform the search. It contains values such as abouthome, urlbar, and searchbar. The count field contains the number of searches performed against this engine+source combination during that subsession. Any of the fields in the struct may be null (for example if the search key did not match the expected pattern, or if the count was non-numeric).
- The loop_activity_counter field is a simple struct containing inner fields for each expected value of the LOOP_ACTIVITY_COUNTER Enumerated Histogram. Each inner field is a count for that histogram bucket.
- The popup_notification_stats field is a map of String keys to struct values, each field in the struct being a count for the expected values of the POPUP_NOTIFICATION_STATS Keyed Enumerated Histogram.
- The places_bookmarks_count and places_pages_count fields contain the mean value of the corresponding Histogram, which can be interpreted as the average number of bookmarks or pages in a given subsession.
- The active_addons field contains an array of structs, one for each entry in the environment.addons.activeAddons section of the payload. More detail in Bug 1290181.
- The disabled_addons_ids field contains an array of strings, one for each entry in payload.addonDetails that is not already reported in the environment.addons.activeAddons section of the payload. More detail in Bug 1390814. Please note that while using this field is generally OK, it was introduced to support the TAAR project and you should not count on it in the future. The field can stay in main_summary, but we might need to slightly change the ping structure to something better than payload.addonDetails.
- The active_theme field contains a single struct in the same shape as the items in the active_addons array. It contains information about the currently active browser theme.
- The user_prefs field contains a struct with values for preferences of interest.
- The events field contains an array of event structs.
- Dynamically-included histogram fields are present as key->value maps, or key->(key->value) nested maps for keyed histograms.
Time formats
Columns in main_summary
may use one of a handful of time formats with different precisions:
Column Name | Origin | Description | Example | Spark | Presto |
---|---|---|---|---|---|
timestamp | stamped at ingestion | nanoseconds since epoch | 1504689165972861952 | from_unixtime(timestamp/1e9) | from_unixtime(timestamp/1e9) |
submission_date_s3 | derived from timestamp | YYYYMMDD date string of timestamp | 20170906 | from_unixtime(unix_timestamp(submission_date_s3, 'yyyyMMdd')) | date_parse(submission_date_s3, '%Y%m%d') |
client_submission_date | derived from HTTP header: Fields[Date] | HTTP date header string sent with the ping | Tue, 27 Sep 2016 16:28:23 GMT | unix_timestamp(client_submission_date, 'EEE, dd MMM yyyy HH:mm:ss zzz') | date_parse(substr(client_submission_date, 1, 25), '%a, %d %b %Y %H:%i:%s') |
creation_date | creationDate | time of ping creation, ISO8601 at UTC+0 | 2017-09-06T08:21:36.002Z | to_timestamp(creation_date, "yyyy-MM-dd'T'HH:mm:ss.SSSXXX") | from_iso8601_timestamp(creation_date) AT TIME ZONE 'GMT' |
timezone_offset | info.timezoneOffset | timezone offset in minutes | 120 | | |
subsession_start_date | info.subsessionStartDate | hourly precision, ISO8601 date in local time | 2017-09-06T00:00:00.0+02:00 | | from_iso8601_timestamp(subsession_start_date) AT TIME ZONE 'GMT' |
subsession_length | info.subsessionLength | subsession length in seconds | 599 | | date_add('second', subsession_length, subsession_start_date) |
profile_creation_date | environment.profile.creationDate | days since epoch | 15,755 | from_unixtime(profile_creation_date * 86400) | |
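For readers who pull a few rows into Python, the same formats can be parsed with the standard library. The sketch below is an illustration, not part of the dataset tooling: the helper names are hypothetical, and each function only assumes the format shown in the Example column of the table.

```python
from datetime import date, datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

UTC = timezone.utc


def from_timestamp_ns(ns):
    """timestamp: nanoseconds since epoch, truncated to whole seconds."""
    return datetime.fromtimestamp(ns // 10**9, tz=UTC)


def from_submission_date_s3(value):
    """submission_date_s3: YYYYMMDD string."""
    return datetime.strptime(value, "%Y%m%d").date()


def from_client_submission_date(value):
    """client_submission_date: HTTP date header string."""
    return parsedate_to_datetime(value)


def from_creation_date(value):
    """creation_date: ISO8601 with milliseconds and a literal Z suffix."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=UTC)


def from_subsession_start_date(value):
    """subsession_start_date: ISO8601 in local time, converted to UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.astimezone(UTC)


def from_profile_creation_date(days):
    """profile_creation_date: days since epoch."""
    return date(1970, 1, 1) + timedelta(days=days)


print(from_timestamp_ns(1504689165972861952))
print(from_submission_date_s3("20170906"))
print(from_client_submission_date("Tue, 27 Sep 2016 16:28:23 GMT"))
print(from_creation_date("2017-09-06T08:21:36.002Z"))
print(from_subsession_start_date("2017-09-06T00:00:00.0+02:00"))
print(from_profile_creation_date(15755))
```

Note that the subsession_start_date parser relies on %z accepting an offset with a colon, which requires Python 3.7 or later.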
Code Reference
This dataset is generated by telemetry-batch-view. Refer to this repository for information on how to run or augment the dataset.