Skip to content

SageData-OOD/sagedata_snowplow_attribution

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

License

sagedata-logo

snowplow-attribution-redshift

This dbt package:

  • Uses data from modified snowplow-fractribution-redshift package to do First touch, Last touch attribution, as well as define all intermediary touchpoints so you can apply whatever attribution model you like.
  • This package also uses base tables created by snowplow_web

Please refer to the SageData doc site for SageData documentation. Please refer to the Snowplow doc site for a Snowplow related documentation and full breakdown of the fractribution package.

Adapter Support

The snowplow-attribution-redshift package currently supports only Redshift.

Requirements

  • A dataset of web events from the Snowplow Javascript Tracker and familiarity with the snowplow-web dbt package
  • The snowplow_web_page_views table generated by the snowplow-web package available in your warehouse
  • The snowplow_web_sessions table generated by the snowplow-web package available in your warehouse
  • The snowplow_web_users table generated by the snowplow-web package available in your warehouse
  • The snowplow_web_user_mapping table generated by the snowplow-web package available in your warehouse
  • A conversions dataset available in your warehouse (a dataset containing revenue data by users).
  • A channel spend dataset available in your warehouse (a dataset containing total spend by channel)
  • A snowplow_fractribution_conversions_by_customer_id dataset created by snowplow-fractribution package

Installation

Check dbt Hub for the latest installation instructions, or read the dbt docs for more information on installing packages.

Configuration & Operation

Please see below or refer to the doc site for details on how to configure and run the original Snowplow package.

Models

The package contains multiple models that are created from the base snowplow-fractribution-redshift script

Model Description
sagedata_conversions_w_touchpoints Table containing all of the conversions with all the First and Last touchpoints that have led to those conversions
sagedata_channel_spend_day The amount spent on advertising for each channel by day
sagedata_page_views_to_conversion Table that shows all the pageviews that have happened prior to the conversionwithin the conversion window
snowplow_web_page_views_w_channel A view on top of snowplow_web_page_views, but with added Channel Classification from macro channel_classification
snowplow_web_sessions_w_channel A view on top of snowplow_web_sessions, but with added Channel Classification from macro channel_classification
snowplow_web_users_w_channel A view on top of snowplow_web_users, but with added Channel Classification from macro channel_classification

Setup steps

  1. Fractribution is dependent on the snowplow_web_page_views model created by the snowplow_web dbt package. Run snowplow_web if you do not have data in the snowplow_web_page_views table for the period of time you will run fractribution for.
  2. Configure the conversion_clause macro to filter your raw Snowplow events to successful conversion events.
  3. Configure the conversion_value macro to return the value of the conversion event.
  4. Configure the default channel_classification macro to yield your expected channels. The ROAS calculations / attribution calculations will run against these channel definitions.
  5. Configure the channel_spend macro to query your own spend data if you do not wish to use the default values.
  6. Overwrite default variables provided by the package in your dbt_project.yml file, if necessary. E.g.: make sure your snowplow__page_views_source and snowplow__conversions_source are aligned to what is available in your warehouse, and update snowplow__conversion_window_start_date and snowplow__conversion_window_end_date if you don't want the default of the last 30 days.​

Running (for detailed instructions check out our Quick Start guide)

  1. Ensure the setup steps have been completed above.
  2. Run dbt run, or dbt run --select sagedata_snowplow_attribution.

Differences to Google's Fractribution

There are some changes from Google's Fractribution code that have been noted below.

  • Temporary UDFs have been converted to persistent / permanent UDFs
  • Some temporary tables converted to permanent tables
  • Users without a user_id are treated as 'anonymous' ('f') users and the domain_userid is used to identify these sessions
  • Users with a user_id are treated as identified ('u') users
  • Templating is now run almost entirely within dbt rather than the custom SQL / Jinja templating in the original Fractribution project
  • Channel changes and contributions within a session can be considered using the snowplow__consider_intrasession_channels variable.

Intrasession channels

In Google Analytics (Universal Analytics) a new session is started if a campaign source changes (referrer of campaign tagged URL) which is used in Fractribution. Snowplow utilises activity based sessionisation rather than campaign based sessionisation. Setting snowplow__consider_intrasession_channels to false will take only the campaign information from the first page view in a given Snowplow session and not give credit to other channels in the converting session if they occur after the initial page view.

Filter unwanted channels

You can specify a list of channels for the variable snowplow__channels_to_exclude to exclude them from analysis (if kept empty all channels are kept). For example, users may want to exclude the 'Direct' channel from the analysis.

Copyright and license

The snowplow-fractribution package is Copyright 2022 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

About

Part 2 of SageData Snowplow Attribution designed to run on Redshift

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published