Skip to content

Conversation

@lucor
Copy link
Contributor

@lucor lucor commented Jan 17, 2025

This PR adds the WURFL Device Enrichment Module.

Overview

The WURFL Device Enrichment Module for Prebid Server enhances the OpenRTB 2.x payload
with comprehensive device detection data powered by ScientiaMobile’s WURFL device detection framework.
Thanks to WURFL's device database, the module provides accurate and comprehensive device-related information,
enabling bidders to make better-informed targeting and optimization decisions.

Key features

Device Field Enrichment

The WURFL module populates missing or empty fields in ortb2.device with the following data:

  • make: Manufacturer of the device (e.g., "Apple", "Samsung").
  • model: Device model (e.g., "iPhone 14", "Galaxy S22").
  • os: Operating system (e.g., "iOS", "Android").
  • osv: Operating system version (e.g., "16.0", "12.0").
  • h: Screen height in pixels.
  • w: Screen width in pixels.
  • ppi: Screen pixels per inch (PPI).
  • pxratio: Screen pixel density ratio.
  • devicetype: Device type (e.g., mobile, tablet, desktop).
  • js: Support for JavaScript, where 0 = no, 1 = yes

Note: If these fields are already populated in the bid request, the module will not overwrite them.

Publisher-Specific Enrichment

Device enrichment is selectively enabled for publishers based on their account ID.
The module identifies publishers through the following fields:

  • site.publisher.id (for web environments).
  • app.publisher.id (for mobile app environments).
  • dooh.publisher.id (for digital out-of-home environments).

Build prerequisites

To build the WURFL module, you need to install the WURFL Infuze from ScientiaMobile.
For more details, visit: ScientiaMobile WURFL Infuze.

Note

The WURFL module requires CGO at compile time to link against the WURFL Infuze library.

To enable the WURFL module, the wurfl build tag must be specified:

go build -tags wurfl .

If the wurfl tag is not provided, the module will compile a demo version that returns sample data,
allowing basic testing without an Infuze license.

@lucor lucor marked this pull request as ready for review January 20, 2025 17:06

// newWurflEngine creates a new Enricher
func newWurflEngine(c config) (wurflDeviceDetection, error) {
err := wurfl.Download(c.WURFLSnapshotURL, c.WURFLFileDirPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is called on PBS startup.

  1. How it will affect on PBS startup time?
  2. Does it take any timeout to download the data?
  3. Can be data downloaded async, separately from PBS initialisation?
  4. If it returns a downloaded error the PBS will crash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising these points. To avoid affecting PBS startup time or introducing additional complexity, we've updated the module to manually copy the WURFL file during the setup step. This keeps the initialization simple and avoids the need for synchronous downloads at startup.

We've also simplified the configuration: only wurfl_snapshot_url is needed if periodic updates are desired, and those updates run asynchronously after startup.

Let us know if you'd prefer a different approach or see any remaining concerns.

@lucor lucor requested a review from linux019 May 21, 2025 14:19
This commit removes the code for building Docker images with WURFL
support from the Dockerfile and documents the build steps in the README
instead.
}

// HandleEntrypointHook implements hookstage.Entrypoint.
func (m Module) HandleEntrypointHook(ctx context.Context, invocationCtx hookstage.ModuleInvocationContext, payload hookstage.EntrypointPayload) (hookstage.HookResult[hookstage.EntrypointPayload], error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (golang - receiver types): Value receivers for non-trivial types should be pointer receivers. This struct is 24 bytes and is copied on every method call, creating unnecessary memory traffic on the hot path.

Performance impact: At 100K req/s, this creates ~622 GB/day of unnecessary copying.

Go best practices recommend pointer receivers when:

  • The receiver is a struct larger than a few words (this is 24 bytes = 3 words)
  • For consistency when the type implements interfaces
  • When the receiver represents a service object with identity

Authoritative sources:

Recommended changes:

// Change interface conformity checks
var (
    _ hookstage.RawAuctionRequest = (*Module)(nil)
    _ hookstage.Entrypoint        = (*Module)(nil)
)

// Change all methods to pointer receivers
func (m *Module) HandleEntrypointHook(...) (...)
func (m *Module) HandleRawAuctionHook(...) (...)
func (m *Module) isPublisherAllowed(...) bool

// Change Builder to return pointer
func Builder(...) (interface{}, error) {
    m := &Module{  // Create pointer instead of value
        we:      we,
        extCaps: cfg.ExtCaps,
    }
    // ...
    return m, nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our implementation follows the same pattern used by other PBS modules (ortb2blocking, etc.) which use value receivers for consistency. The official module development guide examples also show value receivers.
Should we move this module to pointer receivers and potentially set a better pattern for future PBS modules, or maintain consistency with the existing codebase?

}

// EnrichDevice enriches OpenRTB 2.x device with WURFL data
func (we wurflEnricher) EnrichDevice(device *openrtb2.Device) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (golang - receiver types): Value receiver should be a pointer receiver. This struct is 16 bytes (2 words) and the code already creates &wurflEnricher at line 118 in module.go, so the methods should match.

Why pointer receivers:

  • Semantic consistency: The code creates we := &wurflEnricher{...} (line 118 module.go)
  • Performance: Eliminates 48 bytes of copying per request
  • Best practice: For consistency, if the type is used as a pointer, methods should use pointer receivers

Change all three methods:

Suggested change
func (we wurflEnricher) EnrichDevice(device *openrtb2.Device) {
func (we *wurflEnricher) EnrichDevice(device *openrtb2.Device) {

Also update:

  • Line 139: func (we *wurflEnricher) wurflExtData() ([]byte, error)
  • Line 150: func (we *wurflEnricher) makeDeviceType() adcom1.DeviceType

See Go Wiki - Receiver Type and Uber Go Style Guide for guidance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 5c6433a

func makeBrandList(brandVersions []openrtb2.BrandVersion) string {
var builder strings.Builder
first := true
for _, version := range brandVersions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance: Use iterutil to avoid copying BrandVersion structs

The openrtb2.BrandVersion struct is ~64 bytes (string + 2 slice headers). The current range loop copies each struct:

for _, version := range brandVersions {  // Copies ~64 bytes per iteration
    if version.Brand == "" {
        continue
    }
    // ...
}

Consider using iterutil.SlicePointerValues to avoid the copy:

for version := range iterutil.SlicePointerValues(brandVersions) {  // No copy, just pointer
    if version.Brand == "" {
        continue
    }
    // ...
    brandName := escapeClientHintField(version.Brand)
    builder.WriteString(brandName)
    builder.WriteString(`;v="`)
    builder.WriteString(strings.Join(version.Version, "."))
    builder.WriteString(`"`)
}

See the iterutil package documentation for more details on avoiding struct copies during iteration.

References:

  • util/iterutil/doc.go - Package documentation
  • util/iterutil/slices.go - SlicePointerValues implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in e59c190

}

func escapeClientHintField(value string) string {
return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: Incomplete escaping and missing backslash handling

The current implementation only escapes double quotes but has several issues:

func escapeClientHintField(value string) string {
    return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"`
}

Problems:

  1. Missing backslash escaping: According to RFC 9651 § 3.3.3, both " AND \ must be escaped in structured header strings. A value containing \ will break the header.

  2. Wrong escape order: Must escape backslashes BEFORE quotes, otherwise \" becomes \\\" instead of \\\".

  3. Backticks: No, backticks do NOT need escaping in RFC 9651 strings. Only " and \ are escapable.

Standard library option:

strconv.Quote() handles escaping but adds Go-specific escape sequences (\t, \n, \xFF, \u0100) which are NOT valid in RFC 9651. RFC 9651 explicitly states: "Only DQUOTE and '\' can be escaped; other characters after '\' MUST cause parsing to fail."

Recommended fix using strings.Replacer:

// Package-level initialized once
var clientHintEscaper = strings.NewReplacer(
    `\`, `\\`,  // Must escape backslash FIRST
    `"`, `\"`,  // Then escape quotes
)

func escapeClientHintField(value string) string {
    return `"` + clientHintEscaper.Replace(value) + `"`
}

Yes, a global strings.Replacer is useful here because:

  • Initialized once at package load time (no allocation overhead)
  • Optimized for multi-character replacements
  • Guarantees correct escape order
  • Hot path function called for every brand/platform/model value

Alternative - third-party library:

Consider using github.com/shogo82148/go-sfv which implements RFC 9651 correctly, but this adds a dependency.

References:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b194499 removing unnecessary escaping logic. SUA values from OpenRTB are already decoded strings that only need quoting for HTTP header format.

lucor and others added 19 commits January 7, 2026 15:27
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Removed unnecessary escaping logic. SUA values from OpenRTB are already
decoded strings that only need quoting for HTTP header format.
@lucor lucor requested a review from scr-oath January 8, 2026 15:49
}

func quoteClientHintField(value string) string {
return `"` + value + `"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical: Escaping removed entirely - quotes and backslashes will break headers

The function was renamed from escapeClientHintField to quoteClientHintField but all escaping logic was removed:

func quoteClientHintField(value string) string {
    return `"` + value + `"`  // NO ESCAPING!
}

This is a critical regression. The previous version at least escaped quotes (though incompletely):

return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"`

Problem scenarios:

  1. Brand name with quote: "My Browser" → Produces invalid header: ""My Browser""
  2. Brand name with backslash: My\Browser → Produces unparseable header: "My\Browser"

Per RFC 9651 § 3.3.3, structured header strings MUST escape both " and \ characters:

var clientHintEscaper = strings.NewReplacer(
    `\`, `\\`,  // Escape backslash FIRST
    `"`, `\"`,  // Then escape quotes
)

func quoteClientHintField(value string) string {
    return `"` + clientHintEscaper.Replace(value) + `"`
}

The rename to quoteClientHintField is good (more accurate name), but the escaping logic must be restored and completed.

References:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added proper RFC 9651 escaping (backslash and double-quote) as defensive programming in d08bff2

}

// HandleEntrypointHook implements hookstage.Entrypoint.
func (m Module) HandleEntrypointHook(ctx context.Context, invocationCtx hookstage.ModuleInvocationContext, payload hookstage.EntrypointPayload) (hookstage.HookResult[hookstage.EntrypointPayload], error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance: Module methods still use value receivers - should be pointer receivers

While wurflEnricher was correctly changed to use pointer receivers, the Module struct (24 bytes) still uses value receivers for all its methods:

func (m Module) HandleEntrypointHook(...) (...)  // Copies 24 bytes
func (m Module) HandleRawAuctionHook(...) (...)  // Copies 24 bytes  
func (m Module) isPublisherAllowed(...) bool     // Copies 24 bytes

This was identified in an earlier review comment. At 100K requests/sec, this causes ~622GB/day of unnecessary copying.

Recommended fix:

func (m *Module) HandleEntrypointHook(...) (...)  // No copy
func (m *Module) HandleRawAuctionHook(...) (...)  // No copy
func (m *Module) isPublisherAllowed(...) bool     // No copy

Module struct size breakdown:

  • we wurflDeviceDetection (interface): 16 bytes
  • allowedPublisherIDs map[string]struct{}: 8 bytes
  • extCaps bool: 1 byte (+ 7 padding)
  • Total: 24 bytes per copy

This is the same issue that was fixed for wurflEnricher - Module should follow the same pattern.

References:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our implementation follows the same pattern used by other PBS modules (ortb2blocking, etc.) which use value receivers for consistency. The official module development guide examples also show value receivers.
Should we move this module to pointer receivers and potentially set a better pattern for future PBS modules, or maintain consistency with the existing codebase?

@lucor lucor requested a review from scr-oath January 12, 2026 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants