Raw Ping Data

Introduction

We receive data from our users via pings. There are several types of pings, each containing different measurements and sent for different purposes. To review a complete list of ping types and their schemata, see this section of the Mozilla Source Tree Docs.

Many pings are also described by a JSONSchema specification which can be found in this repository.

Background and Caveats

The large majority of analyses can be completed using only the main ping. This ping includes histograms, scalars, events, and other performance and diagnostic data.

Few analyses actually rely directly on the raw ping data. Instead, we provide derived datasets which are processed versions of these data, made to be:

  • Easier and faster to query
  • Organized to make the data easier to analyze
  • Cleaned of erroneous or misleading data

Before analyzing raw ping data, check to make sure there isn't already a derived dataset made for your purpose. If you do need to work with raw ping data, be aware that loading the data can take a while. Try to limit the size of your data by controlling the date range, etc.

Accessing the Data

You can access raw ping data from an ATMO cluster using the Dataset API. Raw ping data are not available in re:dash.

Further Reading

You can find the complete ping documentation. To augment our data collection, see Collecting New Data and the Data Collection Policy.

Data Reference

You can find the reference documentation for all ping types here.