Commit f1fa4e0

Update documentation
1 parent 229ef0e commit f1fa4e0

2 files changed, with 80 additions and 5 deletions

Utilities/StorageFactory/README.md

Lines changed: 55 additions & 5 deletions
@@ -11,7 +11,7 @@ Factory interface for constructing `edm::storage::Storage` instances. Also provi
1111
`StorageFactory` provides two implementations of `edm::storage::Storage` classes which can be used to wrap around any other `Storage` object.
1212

1313
### `edm::storage::LocalCacheFile`
14-
Does memory mapped caching of the wrapped `Storage` object. This is only applied if `CACHE_HINT_LAZY_DOWNLOAD` is set for `cacheHint` or the protocol handling code explicit passes `IOFlags::OpenWrap` to `StorageFactory::wrapNonLocalFile`. The wrapping does not happen if the Storage is open for writing nor if the Storage is associated with a file on the local file system.
14+
Does memory mapped caching of the wrapped `Storage` object. This is only applied if `CACHE_HINT_LAZY_DOWNLOAD` is set for `cacheHint` or the protocol handling code explicitly passes `IOFlags::OpenWrap` to `StorageFactory::wrapNonLocalFile`. The wrapping does not happen if the Storage is open for writing or if the Storage is associated with a file on the local file system. Note that files using the `file:` protocol _can_ end up using `LocalCacheFile` if the path is determined to be on a non-local file system.
1515

1616
### `edm::storage::StorageAccountProxy`
1717
This wraps the `Storage` object and provides per protocol accounting information (e.g. number of bytes read) to `edm::storage::StorageAccount`. This is only used if `StorageFactory::accounting()` returns `true`.
@@ -27,16 +27,66 @@ A singleton used to aggragate statistics about all storage calls for each protoc
2727
### `edm::storage::StorageAccount::StorageClassToken`
2828
Each protocol is associated to a token for quick lookup.
2929

30+
31+
## Generic storage proxies
32+
33+
This facility resembles the `edm::storage::LocalCacheFile` and `edm::storage::StorageAccountProxy` in the way that `edm::storage::Storage` objects constructed by the concrete `edm::storage::StorageMaker` are wrapped into other `edm::storage::Storage` objects.
34+
35+
The proxies are configured via `TFileAdaptor`'s `storageProxies` `VPSet` configuration parameter. The proxies are wrapped in the order they are specified in the `VPSet`, i.e. the first element wraps the concrete `edm::storage::Storage`, the second element wraps the first, and so on. The `edm::storage::StorageAccountProxy` and `edm::storage::LocalCacheFile` wrap the last storage proxy according to their usual behavior.
36+
37+
Each concrete proxy comes with two classes, the proxy class itself (inheriting from the `edm::storage::StorageProxyBase`) and a maker class (inheriting from the `edm::storage::StorageProxyMaker`). This "factory of factories" pattern is used because a maker is created once per job (in `TFileAdaptor`), and the maker object is used to create a proxy object for each file.
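
The wrapping order and the maker-per-job, proxy-per-file split described above can be illustrated with a small stand-alone Python sketch. This is only a toy model: `ConcreteStorage`, `Proxy`, `ProxyMaker`, and `open_with_proxies` are hypothetical names invented for the illustration and are not the CMSSW classes or their interfaces.

```python
class ConcreteStorage:
    """Stand-in for the Storage created by a concrete StorageMaker (hypothetical)."""

    def __init__(self, url):
        self.url = url

    def describe(self):
        return f"Storage({self.url})"


class Proxy:
    """Stand-in for a storage proxy: it wraps another storage-like object."""

    def __init__(self, label, wrapped):
        self.label = label
        self.wrapped = wrapped

    def describe(self):
        return f"{self.label}[{self.wrapped.describe()}]"


class ProxyMaker:
    """Created once per job; creates one proxy object for each file."""

    def __init__(self, label):
        self.label = label
        self.files_seen = 0

    def wrap(self, storage):
        self.files_seen += 1
        return Proxy(self.label, storage)


def open_with_proxies(url, makers):
    """Wrap the concrete storage with proxies in the configured order."""
    storage = ConcreteStorage(url)
    for maker in makers:
        # The first maker wraps the concrete storage, the second wraps the first, ...
        storage = maker.wrap(storage)
    return storage


# The makers live for the whole job, the proxies are per file.
makers = [ProxyMaker("A"), ProxyMaker("B")]
first = open_with_proxies("file:a.root", makers)
second = open_with_proxies("file:b.root", makers)
print(first.describe())
print(second.describe())
```

In this toy model the last configured proxy ends up outermost, which is the object that further wrappers (such as the accounting proxy in the real system) would then wrap.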
38+
39+
### Concrete proxy classes
40+
41+
The convention is to use the proxy class name as the plugin name for the maker, as the proxy is really what the user cares about. The headings of the subsections correspond to the plugin names.
42+
43+
#### `StorageTracerProxy`
44+
45+
The `edm::storage::StorageTracerProxy` (and the corresponding `edm::storage::StorageTracerProxyMaker`) produces a text file with a trace of all IO operations at the `StorageFactory` level. The behavior of each concrete `Storage` object (such as further splitting of read requests in `XrdAdaptor`) is not captured in these tracers. The structure of the trace file is described in a preamble in the trace file.
46+
47+
The plugin has a configuration parameter for a pattern for the trace files. The pattern must contain at least one `%I`. The maker has an atomic counter for the files, and all occurrences of `%I` are replaced with the value of that counter for the given file.
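
As a rough illustration of the `%I` substitution, the following Python sketch mimics the described behavior with a thread-safe counter. The class name `TraceFileNamer`, the example pattern, and the assumption that the counter starts from 0 are all hypothetical; the actual maker is C++ and its details may differ.

```python
import itertools
import threading


class TraceFileNamer:
    """Toy model of a maker that numbers trace files (hypothetical helper)."""

    def __init__(self, pattern):
        # The pattern must contain at least one %I
        if "%I" not in pattern:
            raise ValueError("the pattern must contain at least one %I")
        self._pattern = pattern
        # Assumption: the counter starts from 0
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def next_name(self):
        # Take the next counter value under a lock to mimic an atomic counter
        with self._lock:
            index = next(self._counter)
        # All occurrences of %I are replaced with the counter value
        return self._pattern.replace("%I", str(index))


namer = TraceFileNamer("trace_%I.txt")
names = [namer.next_name() for _ in range(3)]
print(names)
```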
48+
49+
There is an `edmStorageTracer.py` script for doing some analyses of the traces.
50+
51+
The `StorageTracerProxy` also provides a way to correlate the trace entries with the rest of the framework via [MessageLogger](../../FWCore/MessageService/Readme.md) messages. These messages are issued with the DEBUG severity and `IOTrace` category. There are additional, higher-level messages as part of the `PoolSource`. To see these messages, compile the `Utilities/Storage` and `IOPool/Input` packages with `USER_CXXFLAGS="-DEDM_ML_DEBUG"`, and customize the MessageLogger configuration along the lines of
52+
```py
process.MessageLogger.cerr.threshold = "DEBUG"
process.MessageLogger.debugModules = ["*"]
process.MessageLogger.IOTrace = dict()
```
57+
58+
#### `StorageAddLatencyProxy`
59+
60+
The `edm::storage::StorageAddLatencyProxy` (and the corresponding `edm::storage::StorageAddLatencyProxyMaker`) can be used to add artificial latency to the IO operations. The plugin has configuration parameters for latencies of singular reads, vector reads, singular writes, and vector writes.
61+
62+
If used together with `StorageTracerProxy`, e.g. to simulate the behavior of high-latency storage systems with local files, the `storageProxies` `VPSet` should have `StorageAddLatencyProxy` first, followed by `StorageTracerProxy`.
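
One plausible reading of this ordering is that the tracer, being the outer wrapper, then observes the added latency as part of each operation. The toy Python sketch below illustrates that effect under this assumption; `FakeStorage`, `AddLatency`, and `Tracer` are hypothetical stand-ins and do not reflect the real proxy interfaces.

```python
import time


class FakeStorage:
    """Stand-in for a fast local storage (hypothetical)."""

    def read(self, size):
        return b"\0" * size


class AddLatency:
    """Sleeps for a fixed time before forwarding the read to the wrapped object."""

    def __init__(self, wrapped, latency_s):
        self.wrapped = wrapped
        self.latency_s = latency_s

    def read(self, size):
        time.sleep(self.latency_s)
        return self.wrapped.read(size)


class Tracer:
    """Records how long each read on the wrapped object takes."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.durations = []

    def read(self, size):
        start = time.perf_counter()
        data = self.wrapped.read(size)
        self.durations.append(time.perf_counter() - start)
        return data


latency_s = 0.01  # 10 ms, chosen only for the illustration

# Latency proxy first (inner), tracer second (outer): the trace includes the latency
tracer_outside = Tracer(AddLatency(FakeStorage(), latency_s))
tracer_outside.read(16)

# Reversed order: the tracer sits below the latency proxy and does not see the delay
tracer_inside = Tracer(FakeStorage())
AddLatency(tracer_inside, latency_s).read(16)

print(f"tracer outside: {tracer_outside.durations[0]:.4f} s")
print(f"tracer inside:  {tracer_inside.durations[0]:.4f} s")
```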
63+
64+
### Other components
65+
66+
#### `edm::storage::StorageProxyBase`
67+
68+
Inherits from `edm::storage::Storage` and is the base class for the proxy classes.
69+
70+
#### `edm::storage::StorageProxyMaker`
71+
72+
Base class for the proxy makers.
73+
74+
3075
## Related classes in other packages
3176

3277
### TStorageFactoryFile
3378
Inherits from `TFile` but uses `edm::storage::Storage` instances when doing the actual read/write operations. The class explicitly uses `"tstoragefile"` when communicating with `edm::storage::StorageAccount`.
3479

35-
### TFileAdaptor
36-
TFileAdaptor is a cmsRun Service. It explicitly registers the use of `TStorageFactoryFile` with ROOT's `TFile::Open` system. The parameters passed to `TFileAdaptor` are relayed to `edm::storage::StorageFactory` to setup the defaults for the job.
80+
### `TFileAdaptor`
3781

38-
### CondorStatusService
82+
`TFileAdaptor` is a cmsRun Service (with a plugin name of `AdaptorConfig`, see [IOPool/TFileAdaptor/README.md](../../IOPool/TFileAdaptor/README.md)). It explicitly registers the use of `TStorageFactoryFile` with ROOT's `TFile::Open` system. The parameters passed to `TFileAdaptor` are relayed to `edm::storage::StorageFactory` to set up the defaults for the job.
83+
84+
### `CondorStatusService`
3985
Sends condor _Chirp_ messages periodically from cmsRun. These include the most recent aggregated `edm::storage::StorageAccount` information for all protocols being used except for the `"tstoragefile"` protocol.
4086

41-
### StatisticsSenderService
87+
### `StatisticsSenderService`
4288
A cmsRun Service which sends out UDP packets about the state of the system. The information is sent when a primary file closes and includes the recent aggregated `edm::storage::StorageAccount` information for all protocols being used except for the `"tstoragefile"` protocol.
89+
90+
### `XrdAdaptor`
91+
92+
An `edm::storage::Storage` implementation for xrootd (see [Utilities/XrdAdaptor/README.md](../../Utilities/XrdAdaptor/README.md)).

Utilities/XrdAdaptor/README.md

Lines changed: 25 additions & 0 deletions
@@ -6,6 +6,31 @@ The `XrdAdaptor` package is the CMSSW implementation of CMS' AAA infrastructure.
66
* Recovery from some errors via re-tries
77
* Use of multiple XRootD sources (described further [here](doc/multisource_algorithm_design.md))
88

9+
The `XrdAdaptor` behavior can be simulated to some extent with local files using a configuration such as
10+
```py
# application-only cache hint implies similar edm::storage::Storage::prefetch()
# behavior as in XrdFile::prefetch()
process.add_(cms.Service("SiteLocalConfigService",
    overrideSourceCacheHintDir = cms.untracked.string("application-only")
))

# Add e.g. 10-millisecond latency to singular and vector reads
# If the job reads local files via TFile::Open() in addition to PoolSource,
# you want to exclude those from the latency addition
process.add_(cms.Service("AdaptorConfig",
    storageProxies = cms.untracked.VPSet(
        cms.PSet(
            type = cms.untracked.string("StorageAddLatencyProxy"),
            read = cms.untracked.uint32(10000), # microseconds
            readv = cms.untracked.uint32(10000), # microseconds
            exclude = cms.untracked.vstring(...),
        )
    )
))
```
31+
The `StorageAddLatencyProxy` is described in [`Utilities/StorageFactory/README.md`](../../Utilities/StorageFactory/README.md). Another useful component in this context is `StorageTracerProxy` (e.g. to find out which files are accessed by means other than `PoolSource`, as mentioned above).
32+
33+
934
## Short description of components
1035

1136
### `ClientRequest`
