snowplow-fractribution-redshift

This dbt package:

  • Adapted by the SageData team from Snowplow Fractribution v0.2.0 to run on Redshift (see the changelog for details)
  • Uses page view and conversion events to perform attribution modelling on your Snowplow data
  • Removes the need for a Python script or Docker image to create the final output table
  • This is a BASE package that creates the base tables; see the SageData-snowplow-attribution package for follow-up tables specific to your data needs and desired attribution model

Please refer to the doc site for a full breakdown of the original package, and to the SageData-snowplow-attribution repository for the package that creates the subsequent tables and views.

Adapter Support

This package supports Redshift.

The snowplow-fractribution v0.2.0 package (original) currently supports Snowflake, Databricks and BigQuery.

| Warehouse                        | dbt versions        | snowplow-fractribution version |
|----------------------------------|---------------------|--------------------------------|
| Snowflake, BigQuery, Databricks  | >=1.3.0 to <2.0.0   | 0.2.0                          |
| Snowflake                        | >=1.0.0 to <2.0.0   | 0.1.0                          |

Requirements

  • A dataset of web events from the Snowplow JavaScript Tracker, and familiarity with the snowplow-web dbt package
  • The snowplow_web_page_views table generated by the snowplow-web package, available in your warehouse
  • A conversions dataset available in your warehouse (containing revenue data per user)
  • A channel spend dataset available in your warehouse (containing total spend per channel)

Installation

Check dbt Hub for the latest installation instructions, or read the dbt docs for more information on installing packages.
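For reference, a fork such as this one is typically installed as a git package in packages.yml. The snippet below is only a sketch; the revision is a placeholder that you should pin to the latest release tag of the repository:

```yaml
# packages.yml
# The revision below is a placeholder -- pin it to the latest
# release tag of the repository.
packages:
  - git: "https://github.com/SageData-OOD/snowplow_fractribution"
    revision: main
```

After editing packages.yml, run dbt deps to pull the package into your project.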

Configuration & Operation

Please see below or refer to the doc site for details on how to configure and run the package; unlike the original, this adaptation does not require a Python script to create the final output table.

Models

The package contains multiple models that feed the final attribution calculation:

| Model                                              | Description                                                                             |
|----------------------------------------------------|-----------------------------------------------------------------------------------------|
| snowplow_fractribution_channel_counts              | A count of sessions per channel, campaign, source and medium                            |
| snowplow_fractribution_channel_spend               | The amount spent on advertising for each channel                                        |
| snowplow_fractribution_conversions_by_customer_id  | Each conversion and associated revenue per customer_id                                  |
| snowplow_fractribution_path_summary                | For each unique path, a summary of associated conversions, non-conversions and revenue  |
| snowplow_fractribution_paths_to_conversion         | Customer id and the paths the customer has followed that have led to conversion         |
| snowplow_fractribution_paths_to_non_conversion     | Customer id and the paths the customer has followed that have not led to conversion     |
| snowplow_fractribution_sessions_by_customer_id     | Channels per session by customer id                                                     |

Setup steps

  1. Fractribution is dependent on the snowplow_web_page_views model created by the snowplow_web dbt package. Run snowplow_web if you do not have data in the snowplow_web_page_views table for the period of time you will run fractribution for.
  2. Configure the conversion_clause macro to filter your raw Snowplow events to successful conversion events.
  3. Configure the conversion_value macro to return the value of the conversion event.
  4. Configure the default channel_classification macro to yield your expected channels. The ROAS calculations / attribution calculations will run against these channel definitions.
  5. Configure the channel_spend macro to query your own spend data if you do not wish to use the default values.
  6. Overwrite default variables provided by the package in your dbt_project.yml file, if necessary. E.g.: make sure your snowplow__page_views_source and snowplow__conversions_source are aligned to what is available in your warehouse, and update snowplow__conversion_window_start_date and snowplow__conversion_window_end_date if you don't want the default of the last 30 days.
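As an illustration of steps 2 to 4, the conversion and channel macros can be overridden in your own project's macros folder. This is only a sketch: tr_total, mkt_medium and mkt_source, and the channel rules below are assumptions, so substitute the columns and classifications that match your events table:

```sql
-- macros/snowplow_fractribution_overrides.sql
-- Sketch only: 'tr_total' (transaction revenue) and the channel
-- rules below are assumptions -- adapt them to your own schema.

{% macro default__conversion_clause() %}
    tr_total > 0  -- treat any event with revenue as a conversion
{% endmacro %}

{% macro default__conversion_value() %}
    tr_total      -- revenue attributed to the conversion event
{% endmacro %}

{% macro default__channel_classification() %}
    case
        when lower(mkt_medium) = 'cpc' then 'Paid_Search'
        when lower(mkt_medium) in ('social', 'social-network') then 'Social'
        when mkt_medium is null and mkt_source is null then 'Direct'
        else 'Other'
    end
{% endmacro %}
```

The ROAS and attribution calculations run against whatever channel labels this macro yields, so keep the labels consistent with your channel spend dataset.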
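For step 6, the variables can be overridden under the package's scope in dbt_project.yml. Every value below is a placeholder; align the sources and dates with what is available in your warehouse:

```yaml
# dbt_project.yml -- placeholder values; point the sources at the
# tables that actually exist in your warehouse.
vars:
  snowplow_fractribution:
    snowplow__page_views_source: "derived.snowplow_web_page_views"
    snowplow__conversions_source: "atomic.events"
    snowplow__conversion_window_start_date: "2023-01-01"
    snowplow__conversion_window_end_date: "2023-01-31"
```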

Running (for detailed instructions check out our Quick Start guide)

  1. Ensure the setup steps above have been completed.
  2. Run dbt run, or dbt run --select snowplow_fractribution.

Differences to Google's Fractribution

There are some changes from Google's Fractribution code that have been noted below.

  • Temporary UDFs have been converted to persistent / permanent UDFs
  • Some temporary tables have been converted to permanent tables
  • Users without a user_id are treated as 'anonymous' ('f') users and the domain_userid is used to identify these sessions
  • Users with a user_id are treated as identified ('u') users
  • Templating is now run almost entirely within dbt rather than the custom SQL / Jinja templating in the original Fractribution project
  • Channel changes and contributions within a session can be considered using the snowplow__consider_intrasession_channels variable.

Intrasession channels

In Google Analytics (Universal Analytics), a new session is started when the campaign source changes (a referrer or a campaign-tagged URL), and Fractribution relies on this behaviour. Snowplow instead uses activity-based sessionisation rather than campaign-based sessionisation. Setting snowplow__consider_intrasession_channels to false takes only the campaign information from the first page view in a given Snowplow session, and does not give credit to other channels in the converting session if they occur after the initial page view.
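If you want to restrict credit to the first page view's campaign in each session, the flag can be set in dbt_project.yml (a sketch; the variable is scoped under the package's project name):

```yaml
# dbt_project.yml
vars:
  snowplow_fractribution:
    # only the first page view's channel in a session gets credit
    snowplow__consider_intrasession_channels: false
```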

Filter unwanted channels

You can list channels in the snowplow__channels_to_exclude variable to exclude them from the analysis (if the list is left empty, all channels are kept). For example, you may want to exclude the 'Direct' channel.
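As a sketch, excluding the 'Direct' channel would look like this in dbt_project.yml:

```yaml
# dbt_project.yml
vars:
  snowplow_fractribution:
    # channels listed here are dropped from the attribution analysis
    snowplow__channels_to_exclude: ['Direct']
```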

Copyright and license

The snowplow-fractribution package is Copyright 2022 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
