Projects
The tables below are trailheads into the projects and code that make up the Firefox Data Platform.
Telemetry APIs
Name and repo | Description |
---|---|
python_moztelemetry | Python APIs for Mozilla Telemetry |
moztelemetry | Scala APIs for Mozilla Telemetry |
spark-hyperloglog | Algebird's HyperLogLog support for Apache Spark |
ETL code and Datasets
Name and repo | Description |
---|---|
telemetry-batch-view | Scala ETL code for derived datasets |
python_mozetl | Python ETL code for derived datasets |
telemetry-airflow | Airflow configuration and DAGs for scheduled jobs |
python_mozaggregator | Aggregation job for telemetry.mozilla.org aggregates |
telemetry-streaming | Spark Streaming ETL jobs for Mozilla Telemetry |
See also firefox-data-docs for documentation on datasets.
Infrastructure
Name and repo | Description |
---|---|
mozilla-pipeline-schemas | JSON and Parquet schemas for Mozilla Telemetry and other structured data |
hindsight | Real-time data processing |
lua_sandbox | Generic sandbox for safe data analysis |
lua_sandbox_extensions | Modules and packages that extend the Lua sandbox |
nginx_moz_ingest | Nginx module for Telemetry data ingestion |
puppet-config | Cloud services Puppet config for deploying infrastructure |
parquet2hive | Hive import statement generator for Parquet datasets |
EMR Bootstrap scripts
Name and repo | Description |
---|---|
emr-bootstrap-spark | AWS bootstrap scripts for Spark |
emr-bootstrap-presto | AWS bootstrap scripts for Presto |
Data applications
Name and repo | Description |
---|---|
telemetry.mozilla.org | Main entry point for viewing aggregate Telemetry data |
Cerberus & Medusa | Automatic alert system for telemetry aggregates |
analysis.t.m.o | Self-serve data analysis platform |
Mission Control | Low-latency dashboard for stability and health metrics |
Experiments Viewer | Visualization for Shield experiment results |
Re:dash | Mozilla's fork of the data query / visualization system |
TAAR | Telemetry-aware addon recommender |
Ensemble | A minimalist platform for publishing data |
Hardware Report | Firefox Hardware Report |
python-zeppelin | Convert Zeppelin notebooks to Markdown |
St. Mocli | A command-line interface to STMO |
probe-scraper | Scrape and publish Telemetry probe data from Firefox |
test-tube | Compare data across branches in experiments |
experimenter | A web application for managing experiments |
St. Moab | Automatically generate Re:dash dashboards for A/B experiments |
Reference materials
Public
Name and repo | Description |
---|---|
firefox-data-docs | All the info you need to answer questions about Firefox users with data |
Firefox source docs | Mozilla Source Tree Docs - Telemetry section |
reports.t.m.o | Knowledge repository for public reports |
Non-public
Name and repo | Description |
---|---|
Fx-Data-Planning | Quarterly goals and internal documentation |