
Resource Model Definition

Manpreet edited this page Jul 22, 2023 · 4 revisions

The resource model definition YAML/JSON input is used by the test data generator tool to understand and generate all of the resources for which the metrics/logs/traces data is generated. The input file consists of a list of resource definitions and each item in the list describes the type of resource to be generated. Live examples are available here.

In the definition of each resource the following fields must be provided:

| Field | Description |
| --- | --- |
| name | Type of the resource to be defined. |
| count | The number of resources of this type to be generated. |
| attributes | List of key-value pairs in the format <ATTRIBUTE_NAME, VALUE_EXPRESSION> where each pair defines an attribute to be sent for the specified resource type and an expression that is evaluated to generate the value for that attribute. See Attribute Value Expressions. |

Other than the above mandatory fields, the following fields can also be defined for each resource:

| Field | Description |
| --- | --- |
| childrenDistribution | Used to define a parent-child relationship between two resource types. You define the relationship on the parent resource as a list of key-value pairs in the format <CHILD_TYPE, DISTRIBUTION_EXPRESSION> where each pair defines a child type and an expression that is evaluated for each parent resource to obtain the number of child type resources to be mapped as its children. See Children Distribution Expressions. |
| attributeOperations | List of expressions that are evaluated to perform certain operations for all resources of the defined type. For example, copying attributes from parent resources. See Attribute Operation Expressions. |
| runtimeModifications | List of resource ADD, REMOVE or CHURN operations that are performed while the data generation is in progress. See Runtime Modifications. |

Attribute Value Expressions

In the OpenTelemetry model, the resource section is composed of the attributes of that resource. In our model definition, we define each resource type and specify the number of such resources to be generated. For example, we can declare a resource type cluster and ask for 5 clusters sending telemetry data. Each of these clusters has the same set of attributes, but the values must differ. To handle this succinctly, instead of hardcoding values for each resource, we use expressions that are evaluated at runtime. These expressions use the methods listed below and are evaluated with the Jakarta Expression Language. Each of these methods is deterministic, i.e. every time you trigger them in a new JVM process, they produce the same sequence of values.

| Method Name | Return Type | Description |
| --- | --- | --- |
| counter(String INPUT) | String | Returns the INPUT string followed by a counter value. The counter is stateful and is linked to the input string. For example, counter("cluster-name-") would return cluster-name-1, cluster-name-2, cluster-name-3 and so on for subsequent calls. |
| UUIDFromStringCounter(String INPUT) | String | Returns a type 3 UUID based on the counter of the INPUT string. A call to this method with "resource" generates the UUID from the value "resource1", a subsequent call generates it from "resource2", and so on. This ensures that the UUIDs are generated in sequence and are deterministic. |
| roundRobin([String VAL1, String VAL2, ...]) | String | Returns one of the strings provided in the input list in a sequential and stateful manner. Returns VAL1, VAL2, VAL3 and so on for each subsequent call. |
| alphanumericSequence(String INPUT) | String | Returns the next alphanumeric string derived from the INPUT in a stateful manner. The next alphanumeric string means the rightmost character is incremented to the next character in the sequence. For example, alphanumericSequence("abc8") → "abc8" and next calls return "abc9", "abca", "abcb" and so on. |
| alphanumericSequenceFromEnv() | String | For cases where you want to set the starting string for the alphanumeric sequence yourself. This method reads the string stored in the environment variable/system property for the ENV_ALPHANUMERIC key and uses it as the seed value for the alphanumericSequence call. |
| IPv4Sequence(String START_IP) | String | Returns the next IPv4 address derived from the START_IP in a stateful manner. The next IPv4 address means the rightmost possible octet is incremented. For example, IPv4Sequence("128.10.114.254") → "128.10.114.254" and next calls return "128.10.114.255", "128.10.115.1", "128.10.115.2" and so on. |
| getLong(String EXPRESSION) | Long | Evaluates an arithmetic expression containing the count() method call, which returns the value of a counter in the context of a specific attribute. For example, getLong("count() * 3 + 1") → 4 and next calls return 7, 10, 13 and so on. You may omit the arithmetic and just use count() inside the getLong() call. |
| getDouble(String EXPRESSION) | Double | Works exactly like getLong, including the count() method call, except that it returns double values. For example, getDouble("count() / 4") → 0.25 and next calls return 0.5, 0.75, 1.0 and so on. |
| getBoolean(String EXPRESSION) | Boolean | Evaluates an arithmetic expression containing the count() method call and returns false when the expression evaluates to 0 and true in all other cases. For example, getBoolean("count() % 2") → false and next calls return true, false, true and so on. |
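
The counter-driven behaviour of getLong and getDouble can be modelled in a few lines. The following is only an illustrative Python sketch, not the tool's implementation (the tool evaluates expressions with the Jakarta Expression Language); the class and function names are hypothetical, and it assumes count() starts at 1, as the examples in the table suggest.

```python
class AttributeCounter:
    """Hypothetical per-attribute counter backing a count() call."""

    def __init__(self):
        self.value = 0

    def _evaluate(self, expression):
        # Each evaluation advances the counter once (assumption: it starts at 1).
        self.value += 1
        current = self.value
        # eval() is used only to keep this toy sketch short.
        return eval(expression, {"__builtins__": {}}, {"count": lambda: current})

    def get_long(self, expression):
        return int(self._evaluate(expression))

    def get_double(self, expression):
        return float(self._evaluate(expression))


longs = AttributeCounter()
long_values = [longs.get_long("count() * 3 + 1") for _ in range(4)]

doubles = AttributeCounter()
double_values = [doubles.get_double("count() / 4") for _ in range(4)]

print(long_values)    # [4, 7, 10, 13]
print(double_values)  # [0.25, 0.5, 0.75, 1.0]
```

Each attribute gets its own counter in this sketch, which mirrors the statement that count() is evaluated in the context of a specific attribute.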

These methods can be used separately or can also be combined to create fairly realistic attribute values. For example:

  • Attribute name: aws.instance.id
    Value Expression: alphanumericSequence("ia1").concat("-").concat(alphanumericSequence("bims1tec7"))
    Values in sequence: ia1-bims1tec7, ia2-bims1tec8, ia3-bims1tec9, ia4-bims1teca
  • Attribute name: host.filesystem.mount_point
    Value Expression: counter("/dev/sda")
    Values in sequence: /dev/sda1, /dev/sda2, /dev/sda3
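
To make the stateful behaviour concrete, here is a rough Python model of counter and alphanumericSequence that reproduces the sequences above. It is a sketch under assumptions, not the tool's code: the function names are Python stand-ins, the character ordering (digits, then lowercase letters) is inferred from the "abc9" → "abca" example, and the carry behaviour on overflow is a guess.

```python
from collections import defaultdict

# Assumed ordering: digits first, then lowercase letters.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

_counters = defaultdict(int)
_sequences = {}


def counter(prefix):
    """Return the prefix followed by a per-prefix counter value."""
    _counters[prefix] += 1
    return f"{prefix}{_counters[prefix]}"


def _increment(text):
    chars = list(text)
    i = len(chars) - 1
    while i >= 0:
        pos = ALPHABET.find(chars[i])
        if pos == -1:
            i -= 1  # assumption: characters outside ALPHABET are skipped
            continue
        if pos + 1 < len(ALPHABET):
            chars[i] = ALPHABET[pos + 1]
            return "".join(chars)
        chars[i] = ALPHABET[0]  # assumption: wrap around and carry to the left
        i -= 1
    return "".join(chars)


def alphanumeric_sequence(seed):
    """Return the seed on the first call, then successive increments."""
    if seed not in _sequences:
        _sequences[seed] = seed
    else:
        _sequences[seed] = _increment(_sequences[seed])
    return _sequences[seed]


instance_ids = [
    alphanumeric_sequence("ia1") + "-" + alphanumeric_sequence("bims1tec7")
    for _ in range(4)
]
mount_points = [counter("/dev/sda") for _ in range(3)]

print(instance_ids)  # ['ia1-bims1tec7', 'ia2-bims1tec8', 'ia3-bims1tec9', 'ia4-bims1teca']
print(mount_points)  # ['/dev/sda1', '/dev/sda2', '/dev/sda3']
```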

For attribute values of List and Map types, you can use the same expressions inside a literal list or map value. For example:

  • List → '[ counter("abc"), IPv4Sequence("10.10.111.1"), "xyz", getLong("count() * 2") ]'
    The lists generated on subsequent calls are: ["abc1", "10.10.111.1", "xyz", 2], ["abc2", "10.10.111.2", "xyz", 4], ["abc3", "10.10.111.3", "xyz", 6] and so on.
  • Map → '{"app": alphanumericSequence("svc-vodka"), "ip": IPv4Sequence("10.10.111.1"), "version": roundRobin(["latest", "22.5.0-142"])}'
    The maps generated on subsequent calls are: {"app": "svc-vodka", "ip": "10.10.111.1", "version": "latest"}, {"app": "svc-vodkb", "ip": "10.10.111.2", "version": "22.5.0-142"}, {"app": "svc-vodkc", "ip": "10.10.111.3", "version": "latest"} and so on.
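
The "version" values in the map example cycle through the input list. A minimal Python sketch of that round-robin behaviour follows; keying the state by the list contents is an assumption made for this sketch, and the function name is a hypothetical stand-in for roundRobin.

```python
_positions = {}


def round_robin(values):
    """Return the next value from the list, wrapping around at the end."""
    key = tuple(values)  # assumption: state is tracked per distinct list
    index = _positions.get(key, 0)
    _positions[key] = index + 1
    return values[index % len(values)]


versions = [round_robin(["latest", "22.5.0-142"]) for _ in range(3)]
print(versions)  # ['latest', '22.5.0-142', 'latest']
```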

It is possible to implement and provide your own expression methods for attribute values generation. See User Defined Expressions.


Children Distribution Expressions

In the resource model, we may want to express a parent-child relationship between some resources. This is primarily useful when you want to copy attributes of the parent resource type into the child resource types: the attributes are defined once on the parent, so a later change has to be made in only one place. It is also useful when the platform receiving the data has parent-child resource relationships and you want the tool to mirror them.

In our telemetry generator, such relationships are defined on the parent resource type, and each entry describes one child resource type. For example, suppose we want a relationship between a node and a pod, and the resource model has 5 nodes and 25 pods. A simple mapping where each node gets 5 pods is easy, but it is not always that simple, and you may want a different number of child resources for each parent resource. This is handled by an expression method with the following signature:
int distribution(int BASE_VALUE, int EVERY_OTHER_INDEX, int MORE_VALUE)
The int value returned by each call of this method is the number of child resources to be mapped to the corresponding parent. The input parameters are defined as follows:

  • The BASE_VALUE is the number of child resources that should be assigned to all the parent resources. This must be greater than 0.
  • EVERY_OTHER_INDEX is a 1-based index value representing every Nth resource. For example, 3 here would mean every 3rd parent resource.
  • MORE_VALUE is the extra number of child resources that should be mapped to only the parent resources identified by EVERY_OTHER_INDEX.

Let's look at this using an example definition:

name: node
count: 3
childrenDistribution:
  pod: 'distribution(2, 2, 3)'

In this example, every node gets 2 pods and every 2nd node gets 3 extra pods. As a result, the pod count for each of the nodes in order would be:
node1 - 2 pods
node2 - 5 pods
node3 - 2 pods
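
The per-parent counts above can be reproduced with a short Python sketch. This is an illustration of the described rule rather than the tool's code: the real method takes three arguments and tracks the parent position itself, whereas this hypothetical version takes the 1-based parent index explicitly. Treating a non-positive EVERY_OTHER_INDEX as "no extra children" is an assumption.

```python
def distribution(base_value, every_other_index, more_value, parent_index):
    """Number of children for the parent at the given 1-based index."""
    # Assumption: a non-positive every_other_index means no parent gets extras.
    if every_other_index > 0 and parent_index % every_other_index == 0:
        return base_value + more_value
    return base_value


counts = [distribution(2, 2, 3, index) for index in range(1, 4)]
print(counts)  # [2, 5, 2]
```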

The pod type must itself be a defined resource in the model, and its defined count is also taken into account. In the above example, the distribution expression mapped 9 pods in total. If the count of pods in the definition file is more than 9, all the remaining unmapped child resources (pods) are mapped to the last parent (node3). If the number of pods defined is less than 9, the result is a many-to-many mapping, as follows:
Consider pod count = 5. In this case, the same distribution expression results in the following mapping:
node1 - pod1, pod2
node2 - pod3, pod4, pod5, pod1, pod2
node3 - pod3, pod4
After the child resources run out, the expression starts picking the child resources from the beginning again. In all the mapping operations, the resources are picked in order so that the mapping is always deterministic.
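
Both cases (too few and too many children) can be sketched in Python as follows. The function and variable names are hypothetical, and the code only models the behaviour described above: children are picked in order, picking wraps around when they run out, and any children left unmapped go to the last parent.

```python
def map_children(parent_names, child_names, counts):
    """Assign children to parents in order, following the per-parent counts."""
    mapping = {}
    cursor = 0
    for parent, wanted in zip(parent_names, counts):
        # Wrap around to the first child once the children run out.
        mapping[parent] = [
            child_names[(cursor + offset) % len(child_names)]
            for offset in range(wanted)
        ]
        cursor += wanted
    # Children that were never reached are mapped to the last parent.
    if cursor < len(child_names):
        mapping[parent_names[-1]].extend(child_names[cursor:])
    return mapping


nodes = ["node1", "node2", "node3"]
counts = [2, 5, 2]  # result of distribution(2, 2, 3) for 3 nodes

few_pods = map_children(nodes, [f"pod{i}" for i in range(1, 6)], counts)
many_pods = map_children(nodes, [f"pod{i}" for i in range(1, 13)], counts)

print(few_pods["node2"])   # ['pod3', 'pod4', 'pod5', 'pod1', 'pod2']
print(many_pods["node3"])  # ['pod8', 'pod9', 'pod10', 'pod11', 'pod12']
```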

It is possible to implement and provide your own expression methods for parent-child distribution. See User Defined Expressions.


Attribute Operation Expressions

This section of the resource definition lets you perform attribute operations, such as copying attributes from the parent to the child resources. Because these operations must run after the rest of the model, including the parent-child mapping, has been generated internally, they are defined in a separate section for each resource. The following two methods are currently available:

| Method Name | Description |
| --- | --- |
| copyFromParent(String PARENT_TYPE, String ATTRIBUTE_KEY) | Copies an attribute key-value pair from a parent resource type to the current (child) resource type. |
| modifyFromParent(String PARENT_TYPE, String PARENT_ATTR_KEY, String TARGET_ATTRIBUTE, String SUFFIX_EXPRESSION) | Gets the value of the PARENT_ATTR_KEY attribute from PARENT_TYPE, evaluates the SUFFIX_EXPRESSION, appends the result to that value, and sets it as the value of TARGET_ATTRIBUTE in the current resource type. |

In both of the above expressions, the PARENT_TYPE specified must be the immediate parent of the current resource, that is, the PARENT_TYPE must have the current resource as its defined child in the childrenDistribution. An example:

name: pod
count: 75
attributeOperations:
  - 'copyFromParent("node", "cluster.name")'
  - 'modifyFromParent("node", "cluster.name", "pod.name", "counter(\"-pod-\")")'

In the above example, the node resource type is the immediate parent of the pod resource type. The copyFromParent method copies the node's cluster.name attribute value to the pod. The modifyFromParent method takes the value of the node's cluster.name attribute, appends "-pod-1" ("-pod-2", "-pod-3" and so on) as a suffix, and sets the result as the value of the pod's pod.name attribute.
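
The effect of the two operations on the attribute maps can be pictured with a small Python sketch. Resources are modelled as plain dictionaries, the function names are Python stand-ins for the expression methods, and the cluster.name value "cluster-1" is made up for illustration.

```python
from collections import defaultdict

_suffix_counters = defaultdict(int)


def counter(prefix):
    """Stand-in for the counter expression used as the suffix."""
    _suffix_counters[prefix] += 1
    return f"{prefix}{_suffix_counters[prefix]}"


def copy_from_parent(parent, child, attribute_key):
    child[attribute_key] = parent[attribute_key]


def modify_from_parent(parent, child, parent_attr_key, target_attribute, suffix_fn):
    # Parent value plus the evaluated suffix becomes the child's target attribute.
    child[target_attribute] = parent[parent_attr_key] + suffix_fn()


node = {"cluster.name": "cluster-1"}  # hypothetical parent attributes
pods = [{}, {}, {}]
for pod in pods:
    copy_from_parent(node, pod, "cluster.name")
    modify_from_parent(node, pod, "cluster.name", "pod.name", lambda: counter("-pod-"))

print([pod["pod.name"] for pod in pods])
# ['cluster-1-pod-1', 'cluster-1-pod-2', 'cluster-1-pod-3']
```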


Runtime Modifications

In real telemetry use cases, there are scenarios where a resource becomes unavailable and other resources have to be spun up in its place. This is called resource churn. For example, a pod crashes and Kubernetes brings up a different pod to reconcile the desired state. By default, the telemetry data generator defines all the resources at the start and all the metrics/logs/traces data is posted against those resources. The optional runtime modifications section allows you to define scenarios where resources are added, removed or churned. The following fields are available to define each runtime modification for a resource type:

| Field | Is Mandatory? | Description |
| --- | --- | --- |
| resourceModificationType | Yes | Type of the modification. Valid values are ADD, REMOVE & CHURN. |
| modificationFrequencyMinutes | Yes | Number of minutes after which to repeat the operation. |
| modificationQuantity | Yes | Quantity of resources to ADD, REMOVE or CHURN (REMOVE and then ADD). For REMOVE and CHURN type of modifications, this number cannot be higher than the specified count of the resource. |
| startAfterMinutes | No | Instead of starting right away, start after 'N' number of minutes from the time when the data generation process was started. |
| endAfterMinutes | Yes | Minutes after which to end the modification loop running every modificationFrequencyMinutes. This number is from the start of the data generation process even if the startAfterMinutes is defined. This is mandatory because we generate these resources at the start of the data generation process due to parent-child mapping constraints and we need to calculate the total count of resources needed during the data generation process. |
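
The constraints listed in the table can be expressed as a small validation routine. The Python sketch below is hypothetical and not part of the tool: it models a runtime modification as a dictionary using the field names above and checks only the rules stated in the table.

```python
MANDATORY_FIELDS = (
    "resourceModificationType",
    "modificationFrequencyMinutes",
    "modificationQuantity",
    "endAfterMinutes",
)
VALID_TYPES = {"ADD", "REMOVE", "CHURN"}


def validate_modification(modification, resource_count):
    """Return a list of problems found in one runtime modification entry."""
    problems = []
    for field in MANDATORY_FIELDS:
        if field not in modification:
            problems.append(f"missing mandatory field: {field}")
    mod_type = modification.get("resourceModificationType")
    if mod_type is not None and mod_type not in VALID_TYPES:
        problems.append(f"invalid resourceModificationType: {mod_type}")
    quantity = modification.get("modificationQuantity")
    # REMOVE and CHURN cannot affect more resources than are defined.
    if mod_type in {"REMOVE", "CHURN"} and quantity is not None and quantity > resource_count:
        problems.append("modificationQuantity exceeds the resource count")
    return problems


churn = {
    "resourceModificationType": "CHURN",
    "modificationFrequencyMinutes": 10,
    "modificationQuantity": 15,
    "startAfterMinutes": 5,
    "endAfterMinutes": 60,
}
print(validate_modification(churn, resource_count=75))  # []
```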

Expressions implementation

The attribute value expressions and the children distribution expression are implemented here, while the attribute operation expressions are implemented here.

It is possible to implement and provide your own expression methods. See User Defined Expressions.
