Framework for setting up environment needed for ASV tests

grusev edited this page Feb 27, 2025 · 8 revisions

General

Writing ASV tests against a single, non-shared type of storage, such as LMDB, is generally straightforward. All the moving parts, including the storage, are isolated on a specific developer machine or a GitHub runner. However, tests that use shared storage and can be executed concurrently from multiple locations are harder to write and can produce unpredictable results, because they start to overwrite each other's data. Additionally, we aim to reuse the same tests with minimal changes (such as modifying a single parameter) to run against different storage types like LMDB, Amazon S3, GCP, and Azure.

Furthermore, if we want to set up similar environments for other types of tests, we cannot reuse our current approach, in which tests and setups are tightly coupled with the ASV test classes.

We therefore need an additional layer of abstraction that lets us do this safely and effectively.

What abstraction could solve the problem?

An abstraction that makes all storages appear as shared storages (including LMDB) and safeguards against potential issues, such as overwriting other test data.

To achieve this, we can use prefixes for Amazon S3 and paths for file locations in LMDB.

At the top level (root), there is:

  • one Bucket for Amazon S3
  • one root folder for LMDB storages

This root is the dedicated space for ASV tests.

From there, we can create several different spaces, dedicated to different needs.

Persistent space

This shared space is meant to be accessed by all machines and clients. It is used to store data that is written once and then read by all clients. Therefore, this space is designated with a common prefix (PERSISTENT_LIBS_PREFIX) in the bucket.

Modifiable space

This private space is meant for each client and test, isolated from others (all operations that modify data require such a space). The prefix for this space is constructed from several parts:

  • First, the common name for this prefix (MODIFIABLE_LIBS_PREFIX).
  • Then, a unique part for each machine, supplied via an environment variable (ARCTICDB_PERSISTENT_STORAGE_SHARED_PATH_PREFIX). It can be set for GitHub runners and by each developer individually.
  • The third part should be defined in each test case.

By concatenating these three parts, we create a separate unique space in the storage for tests that require private control.
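The concatenation described above can be sketched as follows. This is only an illustration: the constant's value, the "_" separator, and the helper name modifiable_prefix are assumptions, since the page names only the constant and the environment variable.

```python
import os

# Hypothetical value; the wiki only gives the constant's name.
MODIFIABLE_LIBS_PREFIX = "MODIFIABLE_LIBS"
# Environment variable named in the text above.
ENV_VAR = "ARCTICDB_PERSISTENT_STORAGE_SHARED_PATH_PREFIX"


def modifiable_prefix(test_part, env=None):
    """Join the three parts into one private prefix (illustrative helper)."""
    env = os.environ if env is None else env
    machine_part = env.get(ENV_VAR)
    if not machine_part:
        # Without a unique machine part, two clients could share a prefix.
        raise RuntimeError(f"{ENV_VAR} must be set to a unique value")
    # The "_" separator is an assumption made for this sketch.
    return "_".join([MODIFIABLE_LIBS_PREFIX, machine_part, test_part])


# Two machines running the same test end up in different spaces.
print(modifiable_prefix("BASIC_FUNCTIONS", env={ENV_VAR: "dev_alice"}))
print(modifiable_prefix("BASIC_FUNCTIONS", env={ENV_VAR: "github_runner"}))
```

Failing fast when the variable is missing is a design choice of the sketch: silently falling back to a default would put several clients in the same space, which is exactly what the prefix is meant to prevent.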

Test Space

This space behaves exactly like the persistent space. It is used to test the framework itself, to ensure that every operation working on the persistent space functions correctly. We cannot take the risk of executing unverified operations on the shared persistent space, so we need a safe space that is similar to the real one but not identical. Again, this is a shared space with a different prefix, but this time no concatenation is required to make it private.

Note that the prefix abstraction above works with LMDB too; there, the prefixes become the folder structure and folder names.
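As a rough illustration of how one prefix can address both kinds of storage, the sketch below maps a root and a prefix to a location. The Storage enum, the space_location helper, the bucket name, and the folder path are all hypothetical, and the strings use generic notation, not ArcticDB connection strings.

```python
from enum import Enum
from pathlib import PurePosixPath


class Storage(Enum):
    AMAZON = "s3"
    LMDB = "lmdb"


def space_location(storage, root, prefix):
    """Return a generic location for a space under the given root."""
    if storage is Storage.AMAZON:
        # The root is the bucket; the space is a key prefix inside it.
        return f"s3://{root}/{prefix}"
    # The root is a folder; the space is a sub-folder named after the prefix.
    return str(PurePosixPath(root) / prefix)


print(space_location(Storage.AMAZON, "asv-bucket", "PERSISTENT_LIBS"))
print(space_location(Storage.LMDB, "/tmp/asv", "PERSISTENT_LIBS"))
```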

EnvConfigurationBase class

This abstraction is the foundation for the abstract class EnvConfigurationBase, which hides all of that complexity and provides an easy and safe way to access shared storages and write ASV tests that are isolated from each other.

For persistent space operations, it provides the ASV test developer with the following methods:

  • get_library(<optional_suffix>) - Gets a library on the shared space, creating it if it doesn't exist. The library name is the name of the test plus an optional suffix.
  • get_arctic_client_persistent()
  • For obvious reasons, no delete convenience methods are available.
  • set_test_mode() - A special method that turns the 'persistent' space into a test space, allowing all operations to be executed safely. It can and should be used when writing tests of the framework.
  • The following methods only make sense for persistent space and not for modifiable space:
      • setup_environment() - Provides an implementation to check if the necessary components are present and creates them if they are not. Useful for setting up the cache.
      • setup_all() - An abstract method called by setup_environment(), intended to be implemented by subclasses.
      • check_ok() - Checks if the necessary components are present. Also used by setup_environment() and needs to be implemented because it is abstract.
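The relationship between setup_environment(), setup_all() and check_ok() can be sketched as a template method. The class below is a simplified, hypothetical stand-in and not the real EnvConfigurationBase; the dict-backed storage, the OneSymbolSetup subclass, and the final re-check are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class EnvSetupSketch(ABC):
    """Hypothetical, simplified stand-in for the base class."""

    @abstractmethod
    def check_ok(self):
        """Return True if everything the tests need is already present."""

    @abstractmethod
    def setup_all(self):
        """Create everything the tests need."""

    def setup_environment(self):
        # Only write to the shared space when something is missing.
        if not self.check_ok():
            self.setup_all()
            # Re-checking after setup is an assumption of this sketch.
            if not self.check_ok():
                raise RuntimeError("environment setup did not complete")


class OneSymbolSetup(EnvSetupSketch):
    """Example subclass that needs a single symbol in a dict-backed store."""

    def __init__(self, storage):
        self.storage = storage
        self.writes = 0

    def check_ok(self):
        return "sym" in self.storage

    def setup_all(self):
        self.storage["sym"] = [1, 2, 3]
        self.writes += 1


setup = OneSymbolSetup({})
setup.setup_environment()  # creates the symbol
setup.setup_environment()  # finds it present and writes nothing
print(setup.writes)
```

Because the check runs first, calling setup_environment() repeatedly from several clients leaves data that is already in place untouched, which fits the write-once, read-many role of the persistent space.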

For modifiable storage:

  • get_modifiable_library(<optional_suffix>) - Gets or creates a library with the name of the test, along with an optional suffix, on the modifiable space. To be used in the setup() method of an ASV test. As ASV runs tests in several processes, it is strongly recommended to pass a unique id, such as the process pid, as the suffix - self.lib = get_modifiable_library(os.getpid())
  • get_arctic_client_modifiable()
  • remove_all_modifiable_libraries() - Defined only for modifiable libraries. Safely removes all libraries on the modifiable private space of the test. Can be used in the setup_cache() method of an ASV test.
  • delete_modifiable_library(<optional_suffix>) - To be used in teardown() methods - self.lib.delete_modifiable_library(os.getpid())
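The setup()/teardown() pattern described above can be sketched with an in-memory stand-in. FakeEnv and TimeWrites are hypothetical names, the libraries here are plain dicts rather than ArcticDB libraries, and calling the delete method on the environment object is an assumption of this sketch. Only the setup, teardown and time_ method names follow ASV's conventions.

```python
import os


class FakeEnv:
    """In-memory stand-in for a concrete environment class (hypothetical)."""

    def __init__(self, test_name):
        self.test_name = test_name
        self.libraries = {}

    def _name(self, suffix):
        return f"{self.test_name}_{suffix}"

    def get_modifiable_library(self, suffix=""):
        # Get the library, creating it on first use.
        return self.libraries.setdefault(self._name(suffix), {})

    def delete_modifiable_library(self, suffix=""):
        self.libraries.pop(self._name(suffix), None)

    def remove_all_modifiable_libraries(self):
        self.libraries.clear()


class TimeWrites:
    """ASV-style benchmark class using a private library per process."""

    env = FakeEnv("TIME_WRITES")

    def setup(self):
        # The pid suffix keeps concurrent ASV processes apart.
        self.lib = self.env.get_modifiable_library(os.getpid())

    def time_write(self):
        self.lib["sym"] = list(range(1000))

    def teardown(self):
        self.env.delete_modifiable_library(os.getpid())


bench = TimeWrites()
bench.setup()
bench.time_write()
bench.teardown()
print(TimeWrites.env.libraries)
```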

This design provides the foundation for creating concrete implementations for setting up different libraries and symbol setups for various needs.

Some setups can be reusable in the future for ASV or other tests. In those cases, it makes sense to provide concrete solutions (implementations of the base class).

However, some tests have unique needs, so no setup logic can be reused beyond the base framework for accessing the shared persistent and modifiable storage spaces. For these cases, a general-purpose, no-setup class is available: GeneralUseCaseNoSetup. It provides only the framework for accessing storage spaces safely, with no additional logic.
