@gpop63 gpop63 commented Jun 11, 2024

Overview

This change simplifies configuration by making the fields.yml file optional, as long as the config file provides the field definitions. The command checks that this is the case.

A new FieldMapping struct that matches the Field struct has been added in config. If a path to a field definitions file is provided, the generator uses the field definitions from that file. Otherwise, the field definitions are created directly from the config file.
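The fallback described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the trimmed-down `Field` and `FieldMapping` types, the `resolveFields` helper, and the error messages are all hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Field is a hypothetical, trimmed-down stand-in for the generator's Field struct.
type Field struct {
	Name string
	Type string
}

// FieldMapping is a hypothetical stand-in for the config-side struct that mirrors Field.
type FieldMapping struct {
	Name string
	Type string
}

// errNoFieldDefinitions is returned when neither source provides definitions.
var errNoFieldDefinitions = errors.New("no field definitions: provide a fields file or define types in the config")

// resolveFields prefers definitions loaded from a fields file and
// falls back to building them from the config entries.
func resolveFields(fromFile []Field, fromConfig []FieldMapping) ([]Field, error) {
	if len(fromFile) > 0 {
		return fromFile, nil
	}
	if len(fromConfig) == 0 {
		return nil, errNoFieldDefinitions
	}
	fields := make([]Field, 0, len(fromConfig))
	for _, m := range fromConfig {
		if m.Type == "" {
			return nil, fmt.Errorf("field %q has no type in the config", m.Name)
		}
		fields = append(fields, Field{Name: m.Name, Type: m.Type})
	}
	return fields, nil
}

func main() {
	// No fields file was given, so the definitions come from the config.
	cfg := []FieldMapping{{Name: "Region", Type: "keyword"}, {Name: "EventIngested", Type: "date"}}
	fields, err := resolveFields(nil, cfg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(fields), "fields resolved from the config")
}
```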

The Value field inside the Field struct has been changed from string to any. This eliminates the need for type conversion. Converting would be straightforward for integers and floats, but values can also be slices, so making the Value field match the type of the Value field in the Config struct seems like the more suitable approach.
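To illustrate why `any` avoids conversions, here is a minimal sketch of handling such a value with a type switch. The `Field` type below is a hypothetical, trimmed-down stand-in, `describeValue` is an illustrative helper, and the `[]any` case assumes the YAML decoder yields that type for sequences.

```go
package main

import "fmt"

// Field is a hypothetical, trimmed-down stand-in for the generator's Field struct.
type Field struct {
	Name  string
	Value any // was a string before this change
}

// describeValue reports which kind of fixed value a field carries,
// without converting anything to or from a string.
func describeValue(f Field) string {
	switch v := f.Value.(type) {
	case nil:
		return "no fixed value"
	case string:
		return fmt.Sprintf("string %q", v)
	case int, int64, float64:
		return fmt.Sprintf("number %v", v)
	case []any:
		// Assumption: sequences decode to []any.
		return fmt.Sprintf("list of %d values", len(v))
	default:
		return fmt.Sprintf("unsupported type %T", v)
	}
}

func main() {
	fields := []Field{
		{Name: "InstanceType", Value: []any{"a1.medium", "c3.2xlarge"}},
		{Name: "port", Value: 443},
		{Name: "EventIngested"},
	}
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f.Name, describeValue(f))
	}
}
```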

How I tested

Used aws.ec2_metrics/schema-b.

configs.yml

fields:
  - name: dimensionType
    type: keyword
    # no dimension: 2.5%, AutoScalingGroupName: 10%, ImageId: 5%, InstanceType: 2.5%, InstanceId: 80%
    enum: ["", "AutoScalingGroupName", "AutoScalingGroupName", "AutoScalingGroupName", "AutoScalingGroupName", "ImageId", "ImageId", "InstanceType", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId", "InstanceId"]
    cardinality: 600
      # we want every distinct "dimension identifier", regardless of its type, to always get the same fixed generated "metadata" once the cardinality kicks in
      # to do this, we order the enum lengths from highest to lowest and keep, one by one, only the lengths that do not divide the product of the ones already kept.
      # we start from the first two, multiply their values, and exclude from the ordered list the lengths that leave a remainder of 0 when dividing that product.
      # we end up with the list of enum lengths whose values, multiplied, give a common multiple of all enum lengths (not necessarily the least one): this is the value we must use for the cardinality of all fields.
      # in this case two enums remain: `dimensionType` (40) and `Region` (15), resulting in cardinality `600`
  - name: Region
    type: keyword
    enum: ["ap-south-1", "eu-north-1", "eu-west-3", "eu-west-2", "eu-west-1", "ap-northeast-3", "ap-northeast-2", "ap-northeast-1", "ap-southeast-1", "ap-southeast-2", "eu-central-1", "us-east-1", "us-east-2", "us-west-1", "us-west-2"]
    cardinality: 600
  - name: AutoScalingGroupName
    type: keyword
    cardinality: 600
  - name: ImageId
    type: keyword
    cardinality: 600
  - name: InstanceId
    type: keyword
    cardinality: 600
  - name: instanceTypeIdx
    type: long
    # we generate an index for the instance type enums, so that all the information related to a given type is properly matched
    range:
      min: 0
      max: 19
    cardinality: 600
  - name: InstanceType
    type: keyword
    value: ["a1.medium", "c3.2xlarge", "c4.4xlarge", "c5.9xlarge", "c5a.12xlarge", "c5ad.16xlarge", "c5d.24xlarge", "c6a.32xlarge", "g5.48xlarge", "d2.2xlarge", "d3.xlarge", "t2.medium", "t2.micro", "t2.nano", "t2.small", "t3.large", "t3.medium", "t3.micro", "t3.nano", "t3.small"]
  - name: instanceCoreCount
    type: keyword
    # these values map, by index, to the instance types
    value: ["1", "4", "8", "18", "24", "32", "48", "64", "96", "4", "2", "2", "1", "1", "1", "1", "1", "1", "1", "1"]
  - name: instanceThreadPerCore
    type: keyword
    # these values map, by index, to the instance types
    value: ["1", "2", "2", " 2", " 2", " 2", " 2", " 2", " 2", "2", "2", "1", "1", "1", "1", "2", "2", "2", "2", "2"]
  - name: instanceImageId
    type: keyword
    cardinality: 600
  - name: instanceMonitoringState
    type: keyword
    # enabled: 10%, disabled: 90%
    enum: ["enabled", "disabled", "disabled", "disabled", "disabled", "disabled", "disabled", "disabled", "disabled", "disabled"]
    cardinality: 600
  - name: instancePrivateIP
    type: ip
    cardinality: 600
  - name: instancePrivateDnsEmpty
    type: keyword
    # without private dns entry: 10%, with private dns entry: 90%
    enum: ["empty", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP", "fromPrivateIP"]
    cardinality: 600
  - name: instancePublicIP
    type: ip
    cardinality: 600
  - name: instancePublicDnsEmpty
    type: keyword
    # without public dns entry: 20%, with public dns entry: 80%
    enum: ["empty", "fromPublicIP", "fromPublicIP", "fromPublicIP", "fromPublicIP"]
    cardinality: 600
  - name: instanceStateName
    type: keyword
    # terminated: 10%, running: 90%
    enum: ["terminated", "running", "running", "running", "running", "running", "running", "running", "running", "running"]
    cardinality: 600
  - name: cloudInstanceName
    type: keyword
    cardinality: 600
  - name: StatusCheckFailed_InstanceAvg
    type: double
    range:
      min: 0
      max: 10
    fuzziness: 0.05
  - name: StatusCheckFailed_SystemAvg
    type: double
    range:
      min: 0
      max: 10
    fuzziness: 0.05
  - name: StatusCheckFailedAvg
    type: double
    range:
      min: 0
      max: 10
    fuzziness: 0.05
  - name: CPUUtilizationAvg
    type: double
    range:
      min: 0
      max: 100
    fuzziness: 0.05
  - name: NetworkPacketsInSum
    type: double
    range:
      min: 0
      max: 1500000
    fuzziness: 0.05
  - name: NetworkPacketsOutSum
    type: double
    range:
      min: 0
      max: 1500000
    fuzziness: 0.05
  - name: CPUCreditBalanceAvg
    type: double
    range:
      min: 0
      max: 5000
    fuzziness: 0.05
  - name: CPUSurplusCreditBalanceAvg
    type: double
    range:
      min: 0
      max: 5000
    fuzziness: 0.05
  - name: CPUSurplusCreditsChargedAvg
    type: double
    range:
      min: 0
      max: 5000
    fuzziness: 0.05
  - name: CPUCreditUsageAvg
    type: double
    range:
      min: 0
      max: 10
    fuzziness: 0.05
  - name: DiskReadBytesSum
    type: double
    range:
      min: 0
      max: 1500000
    fuzziness: 0.05
  - name: DiskReadOpsSum
    type: double
    range:
      min: 0
      max: 1000
    fuzziness: 0.05
  - name: DiskWriteBytesSum
    type: double
    range:
      min: 0
      max: 1500000000
    fuzziness: 0.05
  - name: DiskWriteOpsSum
    type: double
    range:
      min: 0
      max: 1000
    fuzziness: 0.05
  - name: EventDuration
    type: long
    range:
      min: 1
      max: 1000
  - name: partOfAutoScalingGroup
    type: long
    # we divide this value by 20 in the template, giving a 20% chance of being part of an autoscaling group: in that case we append the related aws.tags
    range:
      min: 1
      max: 100
  - name: EventIngested
    type: date

go run main.go generate-with-template ./assets/templates/aws.ec2_metrics/schema-b/gotext.tpl --config-file ./assets/templates/aws.ec2_metrics/schema-b/configs.yml --tot-events 10
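
The percentage comments in configs.yml rely on repeating enum entries to weight them. The sketch below checks such a comment against its list; it assumes the generator draws uniformly from the enum, which is not confirmed here, and `enumShares` is a hypothetical helper.

```go
package main

import "fmt"

// enumShares returns the fraction of the list taken by each distinct value,
// which equals its selection chance under a uniform draw (an assumption).
func enumShares(enum []string) map[string]float64 {
	counts := make(map[string]int)
	for _, v := range enum {
		counts[v]++
	}
	shares := make(map[string]float64, len(counts))
	for v, c := range counts {
		shares[v] = float64(c) / float64(len(enum))
	}
	return shares
}

func main() {
	// The instancePublicDnsEmpty enum from the config.
	enum := []string{"empty", "fromPublicIP", "fromPublicIP", "fromPublicIP", "fromPublicIP"}
	shares := enumShares(enum)
	fmt.Printf("empty: %.0f%%, fromPublicIP: %.0f%%\n", shares["empty"]*100, shares["fromPublicIP"]*100)
}
```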

Closes: #148

@gpop63 gpop63 requested a review from a team as a code owner June 11, 2024 19:43
@gpop63 gpop63 force-pushed the simplify_config branch from e2102af to b1b4f42 Compare June 11, 2024 19:45
@shmsr shmsr self-requested a review July 1, 2024 19:39
@shmsr shmsr added the enhancement New feature or request label Jul 1, 2024