Search Aggregates

Introduction

search_aggregates is designed to power high-level search dashboards. It's quick and easy to query, but the data are coarse. In particular, this dataset lets you segment by a limited number of client characteristics that are relevant to search markets. However, it is not possible to normalize by client count. If you need fine-grained data, consider using search_clients_daily, which breaks down search counts by client.

Contents

Each row of search_aggregates contains the standard search count aggregations for each unique combination of the following columns. Unless otherwise noted, these columns are taken directly from main_summary.

  • submission_date - yyyymmdd
  • engine - e.g. google, bing, yahoo
  • source - The UI component used to issue a search - e.g. urlbar, abouthome
  • country
  • locale
  • addon_version - The installed version of the followonsearch addon
  • app_version
  • distribution_id - NULL means the standard Firefox build
  • search_cohort - NULL except for small segments relating to search experimentation

There are three aggregation columns: sap, tagged-sap, and tagged-follow-on. Each of these columns represents a different type of search. For more details, see the search data documentation. Note that if there were no such searches in a row's segment (i.e. the count would be 0), the column value is null rather than 0.
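Because a zero count is stored as null, any code that sums an aggregation column has to treat null as 0. The following is a minimal Python sketch of that idea, assuming rows have already been loaded as dictionaries keyed by column name. The sample rows and the `total` helper are hypothetical and exist only for illustration; they are not part of the dataset or its tooling.

```python
# Hypothetical sample rows; real rows carry every segment column listed above.
rows = [
    {"engine": "google", "source": "urlbar",
     "sap": 10, "tagged-sap": 4, "tagged-follow-on": None},
    {"engine": "google", "source": "abouthome",
     "sap": None, "tagged-sap": None, "tagged-follow-on": 2},
    {"engine": "bing", "source": "urlbar",
     "sap": 3, "tagged-sap": None, "tagged-follow-on": None},
]


def total(rows, column):
    """Sum one aggregation column, counting null (None) as 0."""
    return sum(row[column] or 0 for row in rows)


totals = {c: total(rows, c) for c in ("sap", "tagged-sap", "tagged-follow-on")}
print(totals)
```

Note that the hyphenated column names (tagged-sap, tagged-follow-on) are not valid bare identifiers in most query languages, so they usually need to be quoted when referenced.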

Accessing the Data

Access to search_aggregates is heavily restricted. You will not be able to access this table without additional permissions. For more details see the search data documentation.

Data Reference

Example Queries

This query calculates daily US searches. If you have trouble viewing this query, it's likely you don't have the proper permissions. For more details see the search data documentation.
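A query of this kind can be sketched as follows. This is not the original query: it is a minimal illustration that uses an in-memory SQLite table as a stand-in for search_aggregates, with a reduced set of columns and made-up rows. The real table lives in a different query engine, so quoting rules and function names may differ there.

```python
import sqlite3

# In-memory stand-in for search_aggregates (assumption: a subset of columns only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_aggregates ("
    "submission_date TEXT, country TEXT, engine TEXT, "
    'sap INTEGER, "tagged-sap" INTEGER, "tagged-follow-on" INTEGER)'
)

# Made-up rows for illustration; NULL stands for a count of 0.
conn.executemany(
    "INSERT INTO search_aggregates VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("20180212", "US", "google", 100, 40, None),
        ("20180212", "US", "bing", None, None, 5),
        ("20180212", "DE", "google", 70, 30, 2),
        ("20180213", "US", "google", 90, None, None),
    ],
)

# Daily US SAP searches; COALESCE turns a NULL count into 0 before summing.
DAILY_US_SEARCHES = """
    SELECT submission_date, SUM(COALESCE(sap, 0)) AS sap
    FROM search_aggregates
    WHERE country = 'US'
    GROUP BY submission_date
    ORDER BY submission_date
"""

daily_us = conn.execute(DAILY_US_SEARCHES).fetchall()
for submission_date, sap in daily_us:
    print(submission_date, sap)
```

Since submission_date is a yyyymmdd string, ordering by it lexicographically also orders the days chronologically.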

Scheduling

This job is scheduled on Airflow to run daily.

Schema

As of 2018-02-13, the current version of search_aggregates is v3, with the schema shown below. The dataset is backfilled through 2016-06-06.

root
 |-- country: string (nullable = true)
 |-- engine: string (nullable = true)
 |-- source: string (nullable = true)
 |-- submission_date: string (nullable = true)
 |-- app_version: string (nullable = true)
 |-- distribution_id: string (nullable = true)
 |-- locale: string (nullable = true)
 |-- search_cohort: string (nullable = true)
 |-- addon_version: string (nullable = true)
 |-- tagged-sap: long (nullable = true)
 |-- tagged-follow-on: long (nullable = true)
 |-- sap: long (nullable = true)

Code Reference

The search_aggregates job is defined in python_mozetl.