Creating Your Own Dataset to Query in re:dash

  1. Create a Spark notebook that performs the transformations you need, either on the raw data (using the Dataset API) or on existing parquet data
  2. Output the results to an S3 location, usually telemetry-parquet/user/$YOUR_DATASET/v$VERSION_NUMBER/submission_date=$YESTERDAY/. This partitions the dataset by submission_date, so each daily run is written to a new location in S3. Do NOT also include submission_date as a column inside the parquet files themselves: a column name cannot also be the name of a partition. Partitioning is optional, but the path should always include a dataset version.
  3. Using this template, open a bug to publish the dataset (making it available in Spark and Re:dash), with the following attributes:
    • Add whiteboard tag [DataOps]
    • Title: "Publish dataset"
    • Content: the location of the dataset in S3 (from step 2 above) and the desired table name