-
Notifications
You must be signed in to change notification settings - Fork 857
New module: WURFL Device Enrichment Module #4158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
|
||
| // newWurflEngine creates a new Enricher | ||
| func newWurflEngine(c config) (wurflDeviceDetection, error) { | ||
| err := wurfl.Download(c.WURFLSnapshotURL, c.WURFLFileDirPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called on PBS startup.
- How it will affect on PBS startup time?
- Does it take any timeout to download the data?
- Can be data downloaded async, separately from PBS initialisation?
- If it returns a downloaded error the PBS will crash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising these points. To avoid affecting PBS startup time or introducing additional complexity, we've updated the module to manually copy the WURFL file during the setup step. This keeps the initialization simple and avoids the need for synchronous downloads at startup.
We've also simplified the configuration: only wurfl_snapshot_url is needed if periodic updates are desired, and those updates run asynchronously after startup.
Let us know if you'd prefer a different approach or see any remaining concerns.
This commit removes the code for building Docker images with WURFL support from the Dockerfile and documents the build steps in the README instead.
| } | ||
|
|
||
| // HandleEntrypointHook implements hookstage.Entrypoint. | ||
| func (m Module) HandleEntrypointHook(ctx context.Context, invocationCtx hookstage.ModuleInvocationContext, payload hookstage.EntrypointPayload) (hookstage.HookResult[hookstage.EntrypointPayload], error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (golang - receiver types): Value receivers for non-trivial types should be pointer receivers. This struct is 24 bytes and is copied on every method call, creating unnecessary memory traffic on the hot path.
Performance impact: At 100K req/s, this creates ~622 GB/day of unnecessary copying.
Go best practices recommend pointer receivers when:
- The receiver is a struct larger than a few words (this is 24 bytes = 3 words)
- For consistency when the type implements interfaces
- When the receiver represents a service object with identity
Authoritative sources:
- Effective Go - Pointers vs Values
- Go Wiki - Receiver Type
- Go FAQ - Methods on Values or Pointers
- Uber Go Style Guide: "Pointers to structs should almost always be used as receivers"
Recommended changes:
// Change interface conformity checks
var (
_ hookstage.RawAuctionRequest = (*Module)(nil)
_ hookstage.Entrypoint = (*Module)(nil)
)
// Change all methods to pointer receivers
func (m *Module) HandleEntrypointHook(...) (...)
func (m *Module) HandleRawAuctionHook(...) (...)
func (m *Module) isPublisherAllowed(...) bool
// Change Builder to return pointer
func Builder(...) (interface{}, error) {
m := &Module{ // Create pointer instead of value
we: we,
extCaps: cfg.ExtCaps,
}
// ...
return m, nil
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our implementation follows the same pattern used by other PBS modules (ortb2blocking, etc.) which use value receivers for consistency. The official module development guide examples also show value receivers.
Should we move this module to pointer receivers and potentially set a better pattern for future PBS modules, or maintain consistency with the existing codebase?
| } | ||
|
|
||
| // EnrichDevice enriches OpenRTB 2.x device with WURFL data | ||
| func (we wurflEnricher) EnrichDevice(device *openrtb2.Device) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (golang - receiver types): Value receiver should be a pointer receiver. This struct is 16 bytes (2 words) and the code already creates &wurflEnricher at line 118 in module.go, so the methods should match.
Why pointer receivers:
- Semantic consistency: The code creates
we := &wurflEnricher{...}(line 118 module.go) - Performance: Eliminates 48 bytes of copying per request
- Best practice: For consistency, if the type is used as a pointer, methods should use pointer receivers
Change all three methods:
| func (we wurflEnricher) EnrichDevice(device *openrtb2.Device) { | |
| func (we *wurflEnricher) EnrichDevice(device *openrtb2.Device) { |
Also update:
- Line 139:
func (we *wurflEnricher) wurflExtData() ([]byte, error) - Line 150:
func (we *wurflEnricher) makeDeviceType() adcom1.DeviceType
See Go Wiki - Receiver Type and Uber Go Style Guide for guidance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in 5c6433a
| func makeBrandList(brandVersions []openrtb2.BrandVersion) string { | ||
| var builder strings.Builder | ||
| first := true | ||
| for _, version := range brandVersions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
performance: Use iterutil to avoid copying BrandVersion structs
The openrtb2.BrandVersion struct is ~64 bytes (string + 2 slice headers). The current range loop copies each struct:
for _, version := range brandVersions { // Copies ~64 bytes per iteration
if version.Brand == "" {
continue
}
// ...
}Consider using iterutil.SlicePointerValues to avoid the copy:
for version := range iterutil.SlicePointerValues(brandVersions) { // No copy, just pointer
if version.Brand == "" {
continue
}
// ...
brandName := escapeClientHintField(version.Brand)
builder.WriteString(brandName)
builder.WriteString(`;v="`)
builder.WriteString(strings.Join(version.Version, "."))
builder.WriteString(`"`)
}See the iterutil package documentation for more details on avoiding struct copies during iteration.
References:
- util/iterutil/doc.go - Package documentation
- util/iterutil/slices.go - SlicePointerValues implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in e59c190
| } | ||
|
|
||
| func escapeClientHintField(value string) string { | ||
| return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Incomplete escaping and missing backslash handling
The current implementation only escapes double quotes but has several issues:
func escapeClientHintField(value string) string {
return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"`
}Problems:
-
Missing backslash escaping: According to RFC 9651 § 3.3.3, both
"AND\must be escaped in structured header strings. A value containing\will break the header. -
Wrong escape order: Must escape backslashes BEFORE quotes, otherwise
\"becomes\\\"instead of\\\". -
Backticks: No, backticks do NOT need escaping in RFC 9651 strings. Only
"and\are escapable.
Standard library option:
strconv.Quote() handles escaping but adds Go-specific escape sequences (\t, \n, \xFF, \u0100) which are NOT valid in RFC 9651. RFC 9651 explicitly states: "Only DQUOTE and '\' can be escaped; other characters after '\' MUST cause parsing to fail."
Recommended fix using strings.Replacer:
// Package-level initialized once
var clientHintEscaper = strings.NewReplacer(
`\`, `\\`, // Must escape backslash FIRST
`"`, `\"`, // Then escape quotes
)
func escapeClientHintField(value string) string {
return `"` + clientHintEscaper.Replace(value) + `"`
}Yes, a global strings.Replacer is useful here because:
- Initialized once at package load time (no allocation overhead)
- Optimized for multi-character replacements
- Guarantees correct escape order
- Hot path function called for every brand/platform/model value
Alternative - third-party library:
Consider using github.com/shogo82148/go-sfv which implements RFC 9651 correctly, but this adds a dependency.
References:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in b194499 removing unnecessary escaping logic. SUA values from OpenRTB are already decoded strings that only need quoting for HTTP header format.
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Co-authored-by: Sheridan C Rawlins <41922797+scr-oath@users.noreply.github.com>
Removed unnecessary escaping logic. SUA values from OpenRTB are already decoded strings that only need quoting for HTTP header format.
| } | ||
|
|
||
| func quoteClientHintField(value string) string { | ||
| return `"` + value + `"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
critical: Escaping removed entirely - quotes and backslashes will break headers
The function was renamed from escapeClientHintField to quoteClientHintField but all escaping logic was removed:
func quoteClientHintField(value string) string {
return `"` + value + `"` // NO ESCAPING!
}This is a critical regression. The previous version at least escaped quotes (though incompletely):
return `"` + strings.ReplaceAll(value, `"`, `\"`) + `"`Problem scenarios:
- Brand name with quote:
"My Browser"→ Produces invalid header:""My Browser"" - Brand name with backslash:
My\Browser→ Produces unparseable header:"My\Browser"
Per RFC 9651 § 3.3.3, structured header strings MUST escape both " and \ characters:
var clientHintEscaper = strings.NewReplacer(
`\`, `\\`, // Escape backslash FIRST
`"`, `\"`, // Then escape quotes
)
func quoteClientHintField(value string) string {
return `"` + clientHintEscaper.Replace(value) + `"`
}The rename to quoteClientHintField is good (more accurate name), but the escaping logic must be restored and completed.
References:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added proper RFC 9651 escaping (backslash and double-quote) as defensive programming in d08bff2
| } | ||
|
|
||
| // HandleEntrypointHook implements hookstage.Entrypoint. | ||
| func (m Module) HandleEntrypointHook(ctx context.Context, invocationCtx hookstage.ModuleInvocationContext, payload hookstage.EntrypointPayload) (hookstage.HookResult[hookstage.EntrypointPayload], error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
performance: Module methods still use value receivers - should be pointer receivers
While wurflEnricher was correctly changed to use pointer receivers, the Module struct (24 bytes) still uses value receivers for all its methods:
func (m Module) HandleEntrypointHook(...) (...) // Copies 24 bytes
func (m Module) HandleRawAuctionHook(...) (...) // Copies 24 bytes
func (m Module) isPublisherAllowed(...) bool // Copies 24 bytesThis was identified in an earlier review comment. At 100K requests/sec, this causes ~622GB/day of unnecessary copying.
Recommended fix:
func (m *Module) HandleEntrypointHook(...) (...) // No copy
func (m *Module) HandleRawAuctionHook(...) (...) // No copy
func (m *Module) isPublisherAllowed(...) bool // No copyModule struct size breakdown:
we wurflDeviceDetection(interface): 16 bytesallowedPublisherIDs map[string]struct{}: 8 bytesextCaps bool: 1 byte (+ 7 padding)- Total: 24 bytes per copy
This is the same issue that was fixed for wurflEnricher - Module should follow the same pattern.
References:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our implementation follows the same pattern used by other PBS modules (ortb2blocking, etc.) which use value receivers for consistency. The official module development guide examples also show value receivers.
Should we move this module to pointer receivers and potentially set a better pattern for future PBS modules, or maintain consistency with the existing codebase?
This PR adds the WURFL Device Enrichment Module.
Overview
The WURFL Device Enrichment Module for Prebid Server enhances the OpenRTB 2.x payload
with comprehensive device detection data powered by ScientiaMobile’s WURFL device detection framework.
Thanks to WURFL's device database, the module provides accurate and comprehensive device-related information,
enabling bidders to make better-informed targeting and optimization decisions.
Key features
Device Field Enrichment
The WURFL module populates missing or empty fields in
ortb2.devicewith the following data:Publisher-Specific Enrichment
Device enrichment is selectively enabled for publishers based on their account ID.
The module identifies publishers through the following fields:
site.publisher.id(for web environments).app.publisher.id(for mobile app environments).dooh.publisher.id(for digital out-of-home environments).Build prerequisites
To build the WURFL module, you need to install the WURFL Infuze from ScientiaMobile.
For more details, visit: ScientiaMobile WURFL Infuze.
Note
The WURFL module requires CGO at compile time to link against the WURFL Infuze library.
To enable the WURFL module, the
wurflbuild tag must be specified:If the
wurfltag is not provided, the module will compile a demo version that returns sample data,allowing basic testing without an Infuze license.