Replies: 1 comment
-
The poll does not have a preference system, so putting it here in case it matters:
PS: @rroupski thanks for putting this together and adding the examples, much nicer.
-
Background and Assumptions:
Proposal 0: Unix-time, single timestamp (the current format)
The number of milliseconds since the Unix epoch (1970-01-01 00:00:00 UTC).
Example:
Pros: Compute optimized, no redundancy
Cons: Loss of precision for some timestamps, loss of variability, not human friendly
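To make the precision con concrete, here is a minimal Python sketch. It assumes the source log reports microseconds; the helper names (to_unix_millis, from_unix_millis) and the sample timestamp are made up for illustration and are not part of OCSF.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_unix_millis(dt: datetime) -> int:
    """Whole milliseconds since the Unix epoch; finer digits are dropped."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding on large values.
    return (delta.days * 86_400 + delta.seconds) * 1_000 + delta.microseconds // 1_000

def from_unix_millis(millis: int) -> datetime:
    """Rebuild a UTC datetime from a millisecond count."""
    return EPOCH + timedelta(milliseconds=millis)

# Hypothetical source timestamp with microsecond precision.
source = datetime(2022, 9, 1, 12, 30, 45, 123456, tzinfo=timezone.utc)
millis = to_unix_millis(source)
restored = from_unix_millis(millis)

print(millis)                # 1662035445123
print(restored.isoformat())  # 2022-09-01T12:30:45.123000+00:00
```

The trailing 456 microseconds cannot be recovered from the stored value, which is the loss of precision listed above.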
Proposal 1: Timestamp object
Constraints: must have one of unixtime_seconds, unixtime_millis, unixtime_micros, unixtime_nanos, iso8601, or int96
Pros: Maximum flexibility, no redundancy
Cons: No guaranteed consistency across records (major con!), though implementations could enforce it manually
Example:
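A rough sketch of how an implementation might check the constraint, in Python. It reads "must have one of" as "at least one", which is an assumption, and the function name present_representations is hypothetical.

```python
ALLOWED_FIELDS = (
    "unixtime_seconds",
    "unixtime_millis",
    "unixtime_micros",
    "unixtime_nanos",
    "iso8601",
    "int96",
)

def present_representations(timestamp: dict) -> list:
    """Return the allowed representations found in a timestamp object.

    Raises ValueError when none is present (assumed reading of the constraint).
    """
    present = [name for name in ALLOWED_FIELDS if timestamp.get(name) is not None]
    if not present:
        raise ValueError("timestamp object needs at least one of: " + ", ".join(ALLOWED_FIELDS))
    return present

# Two valid records that picked different representations.
record_a = {"unixtime_millis": 1662035445123}
record_b = {"iso8601": "2022-09-01T12:30:45.123456Z"}

print(present_representations(record_a))
print(present_representations(record_b))
```

Both records pass, yet a query has to look at a different field for each one. That is the consistency con in practice.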
Proposal 2: Multiplicity of timestamps, no constraints
Each timestamp will be represented in a variety of methods:
How will this handle variable precision? Additional fields?
Example:
Pros: Consistency across records, allows for a multiplicity of use cases optimized for each method.
Cons: Requires additional actions to reconstitute timestamps (query and append/addition). Large amount of required fields for each timestamp. Ambiguity and redundancy are increased.
Proposal 3: Multiplicity of timestamps, with constraints
This one looks the same as Proposal 2 above?
Example:
Outstanding question: Will constraints be universal or specific to each class?
Pros: OCSF is not opinionated; decision left to implementation.
Cons: No guaranteed internal consistency.
Proposal 4: Timestamps with profiles, with default/required profile
Profile:
Notes
Could apply multiple profiles or just one (to mimic above with a multiplicity of timestamps)
Because "time" is a very important field, OCSF should ensure that a default "time" type of profile is utilized if no other is specified.
Example:
Pros: Maximum flexibility and personalization for custom use cases - the decision is left to the implementer, and OCSF is not opinionated.
Cons: Requires substantial rework of code base. Queries are only consistent with records that satisfy the same profile.
Proposal 5: ISO8601 or RFC3339 / Java SQL Timestamp (string)
Example:
Pros: No loss of precision; handles variability; no redundancy. Native output format for most logs, and easy to translate into many programming languages. Natural representation in many tools that support Parquet.
Cons: Not compute optimized (for some systems and operations). Does not natively handle timezones.
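A small Python sketch of the trade-off, assuming RFC 3339 style strings with a "T" separator. The helper fractional_digits is hypothetical. The string keeps whatever precision the source wrote, but any comparison or arithmetic needs a parse first.

```python
from datetime import datetime

def fractional_digits(ts: str) -> int:
    """Count the fractional-second digits written in an RFC 3339 style string."""
    time_part = ts.split("T", 1)[1]
    if "." not in time_part:
        return 0
    count = 0
    for ch in time_part.split(".", 1)[1]:
        if not ch.isdigit():
            break  # reached the offset or a trailing "Z"
        count += 1
    return count

# Hypothetical samples with different precision.
samples = [
    "2022-09-01T12:30:45+00:00",
    "2022-09-01T12:30:45.123+00:00",
    "2022-09-01T12:30:45.123456789Z",
]
for s in samples:
    print(s, fractional_digits(s))

# Comparing or subtracting requires parsing into a numeric form first.
parsed = datetime.fromisoformat("2022-09-01T12:30:45.123456+00:00")
print(parsed.timestamp())
```

Note that Python's datetime stops at microseconds, so the nanosecond sample is only inspected as text here, not parsed.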
Proposal 6: see Proposal 0
Proposal 7: Unix-time single timestamp with optional fields
time: timestamp (unixtime)
time_millis: int
time_micros: int
time_nanos: int
original_timezone: int
Example:
Pros: Compute quasi-optimized (have to do joins for precision and checks for variability)
Cons: Not human friendly
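The "joins for precision" could look roughly like the Python sketch below. The proposal does not spell out the field semantics, so this assumes that time holds whole seconds and that each optional field holds the sub-second part at its own precision. That reading and the function name to_epoch_nanos are assumptions.

```python
NANOS_PER_SECOND = 1_000_000_000

def to_epoch_nanos(record: dict) -> int:
    """Combine `time` with the finest optional sub-second field that is present.

    Assumes `time` is whole seconds and the optional fields are sub-second
    remainders (an interpretation, not something the proposal states).
    """
    base = record["time"] * NANOS_PER_SECOND
    # Check the finest precision first, then fall back to coarser fields.
    for field, scale in (
        ("time_nanos", 1),
        ("time_micros", 1_000),
        ("time_millis", 1_000_000),
    ):
        value = record.get(field)
        if value is not None:
            return base + value * scale
    return base

# Hypothetical records with different precision available.
print(to_epoch_nanos({"time": 1662035445}))
print(to_epoch_nanos({"time": 1662035445, "time_micros": 123456}))
```

Sorting and filtering on time alone stays cheap; the extra lookup is only needed when full precision matters.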
Proposal 8: Similar to the Java Instant class
— 64-bit number for UTC time in seconds
— 32-bit number the nanosecond-of-second
— Timezone offset
Note: the timezone offset should be restricted to -18:00 to 18:00 inclusive.
Optional attribute for a user friendly time:
Example:
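One possible shape for this, sketched in Python. The class name InstantLike, its attribute names, and the friendly() helper are hypothetical; only the three components and the offset restriction come from the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_OFFSET_SECONDS = 18 * 3600  # -18:00 to +18:00 inclusive

@dataclass(frozen=True)
class InstantLike:
    seconds: int             # UTC seconds since the epoch (fits in 64 bits)
    nanos: int               # nanosecond-of-second, 0..999_999_999
    offset_seconds: int = 0  # timezone offset of the source

    def __post_init__(self):
        if not -(2**63) <= self.seconds < 2**63:
            raise ValueError("seconds does not fit in a signed 64-bit integer")
        if not 0 <= self.nanos < 1_000_000_000:
            raise ValueError("nanos must be in 0..999_999_999")
        if abs(self.offset_seconds) > MAX_OFFSET_SECONDS:
            raise ValueError("offset must be within -18:00..+18:00")

    def friendly(self) -> str:
        """Render a user-friendly string in the source's local offset."""
        tz = timezone(timedelta(seconds=self.offset_seconds))
        local = (EPOCH + timedelta(seconds=self.seconds)).astimezone(tz)
        iso = local.isoformat()  # e.g. 1970-01-01T05:30:00+05:30
        # Splice the nanoseconds in between the seconds and the offset.
        return f"{iso[:19]}.{self.nanos:09d}{iso[19:]}"

print(InstantLike(0, 5).friendly())
print(InstantLike(0, 0, 19_800).friendly())
```

The friendly string is derived data, which is why it makes sense as an optional attribute rather than the stored value.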
Proposal 9: Similar to proposal 8, but a single timestamp with precision
— 64-bit number for UTC time
— 8-bit number representing precision (0, 3, 6, 9)
— Timezone offset
Note: the timezone offset should be restricted to -18:00 to 18:00 inclusive.
Optional attribute for a user friendly time:
Example:
Pros: same as option 8 but storage (and performance?) optimized
Cons: timestamp range decreases as precision increases (does it matter in the real world if we can't have a timestamp beyond 2262?)
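The 2262 figure can be checked with a quick back-of-the-envelope calculation. This Python sketch assumes a signed 64-bit count of units since 1970 and an average Gregorian year of 365.2425 days; the function name approx_last_year is made up.

```python
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 31_556_952  # 365.2425 days * 86_400 seconds

def approx_last_year(precision_digits: int) -> int:
    """Approximate last calendar year representable at a given precision.

    precision_digits is the number of sub-second decimal digits (0, 3, 6, 9).
    """
    if precision_digits not in (0, 3, 6, 9):
        raise ValueError("precision must be one of 0, 3, 6, 9")
    max_seconds = INT64_MAX // 10**precision_digits
    return 1970 + max_seconds // SECONDS_PER_YEAR

for digits in (0, 3, 6, 9):
    print(digits, approx_last_year(digits))
```

At nanosecond precision the range ends in 2262; at every coarser precision it extends hundreds of thousands of years or more, so the con only bites for nanosecond timestamps.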
Proposal 10: 2 timestamp types; timestamp_int and timestamp_string
Timestamp_int would handle ONLY the time field, and timestamp_string would be used for all other timestamps in the log. In this proposal, the expected utility and requirements of the fields can determine the timestamp type.
The time field would be in unixtime. As this is considered a core, reserved, base_event field, it doesn't need to convey the precision or accommodate variability. The precision may be further clarified by OCSF collaborators (seconds, milliseconds, microseconds, or nanoseconds), but as this is not meant to convey a field from the log, it may exist per OCSF specs only. The intent of this field is to allow for optimized quick sort and filtering of logs.
An optional field should be added to base_event, called event_time (may be renamed per OCSF discussion). It should account for variable precision. As such, it will be an RFC3339 string (no timezone information). This would be utilized when logs convey their own reported "event_time". It may correspond with time, but will be in a different format with potentially different precision.
Every other timestamp throughout the record should be normalized to RFC3339. This will allow for a standard convention that accounts for precision and allows for variability.
Queries, duration, or filtering should be optimized with a standard, normalized, universal "time" field. The additional timestamps may be most commonly displayed in queries, making a string format ideal. This does not prevent them from being queried or utilized, but should allow for an optimization of the most commonly expected use cases.
Timezone information should be captured just once, not per timestamp.
Example:
Pros: Optimized for queries and display without requiring any conversions.
Cons: Complex queries may have to account for an unexpected divergence in timestamp formats.
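A sketch of the intended split, in Python. It assumes time is in Unix milliseconds (the proposal leaves the precision open), and the record values and the in_window helper are hypothetical.

```python
# Hypothetical records: integer `time` for sorting and filtering,
# string `event_time` kept as the source reported it.
records = [
    {"time": 1662035445123, "event_time": "2022-09-01T12:30:45.123456"},
    {"time": 1662035444000, "event_time": "2022-09-01T12:30:44"},
    {"time": 1662035450000, "event_time": "2022-09-01T12:30:50.5"},
]

def in_window(rows, start_millis, end_millis):
    """Filter and sort using only the normalized integer `time` field."""
    hits = (r for r in rows if start_millis <= r["time"] < end_millis)
    return sorted(hits, key=lambda r: r["time"])

# Events from 12:30:44 up to (not including) 12:30:46 UTC.
for row in in_window(records, 1662035444000, 1662035446000):
    # The string form is displayed as-is, with no conversion.
    print(row["event_time"])
```

The query path never parses a string, and the display path never formats an integer, which is the optimization this proposal is after.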
Proposal 11: 2 timestamp types; timestamp_t (int Unix time) and timestamp_ex_t (RFC 3339 string)
The timestamp_t would be used for all time fields (presently 19), and timestamp_ex_t would be used in a core profile, Timestamp_EX. Data sources that support extended precision would apply the profile. The core profile can be automatically updated when new timestamp attributes are added to the schema, keeping things consistent.
Optionally, the profile could carry and mix in two attributes: an instant_t (int64) and a timestamp_ex_t (RFC 3339 string).
The time field would always be in timestamp_t Unix time. This is a normalized time after processing by the receiver (i.e. correcting for clock synchronization where possible). As this is considered a core, reserved, base_event field, it doesn't need to convey the precision or accommodate variability. The precision may be further clarified by OCSF collaborators (seconds, milliseconds, microseconds, or nanoseconds), but as this is not meant to convey a field from the log, it may exist per OCSF specs only. The intent of this field is to allow for optimized quick sort and filtering of logs.
An optional field should be added to base_event, called event_time (may be renamed per OCSF discussion). When variable precision is required, the Timestamp_EX profile would be applied to it, as with all other timestamps except the reserved 'time' field. This would be utilized when logs convey their own reported "event_time". It may correspond with time but, via the profile, may be in a different format with potentially different precision.
Queries, duration, or filtering should be optimized with a standard, normalized, universal "time" field. The additional timestamps may be most commonly displayed in queries, making a string format ideal. This does not prevent them from being queried or utilized, but should allow for an optimization of the most commonly expected use cases.
Timezone information should be captured just once, not per timestamp.
Example:
Pros: Consistent across core schema, optimized for space and speed. Can handle additional precision via profile when needed.
Cons: Profile will add attributes and needs to be automatically updated as new timestamp fields are added to the schema.
Alternative: rather than use timestamp_t (int), for time (or _time), use instant_t.
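The "automatically updated" part of the profile could be generated from the schema itself. Below is a minimal Python sketch: the schema dictionary, the attribute names other than time and event_time, the "_ex" suffix, and the function build_timestamp_ex_profile are all hypothetical, chosen only to show the idea.

```python
RESERVED = {"time"}  # the reserved field always stays timestamp_t

def build_timestamp_ex_profile(schema: dict) -> dict:
    """Derive the profile's attributes from every timestamp_t in the schema.

    Each non-reserved timestamp_t attribute gets a companion attribute of
    type timestamp_ex_t (the "_ex" naming is an assumption).
    """
    return {
        f"{name}_ex": "timestamp_ex_t"
        for name, type_name in schema.items()
        if type_name == "timestamp_t" and name not in RESERVED
    }

# Hypothetical slice of a schema: attribute name -> type name.
schema = {
    "time": "timestamp_t",
    "event_time": "timestamp_t",
    "created_time": "timestamp_t",
    "message": "string_t",
}
print(build_timestamp_ex_profile(schema))

# Adding a timestamp attribute to the schema extends the profile on the next run.
schema["modified_time"] = "timestamp_t"
print(build_timestamp_ex_profile(schema))
```

Generating the profile this way would keep it in step with the schema, which addresses the con above at the cost of one more build step.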
13 votes