Commit 6d55dc4

Merge pull request #218 from sy-c/master
v2.10.1
2 parents 639809b + 6552dec commit 6d55dc4

9 files changed

+58
-10
lines changed

doc/configurationParameters.md

Lines changed: 4 additions & 1 deletion
@@ -132,10 +132,11 @@ The parameters related to 3rd-party libraries are described here for convenience
132132
| equipment-dummy-* | eventMinSize | bytes | 128k | Minimum size of randomly generated event. |
133133
| equipment-dummy-* | fillData | int | 0 | Pattern used to fill data page: (0) no pattern used, data page is left untouched, with whatever values were in memory (1) incremental byte pattern (2) incremental word pattern, with one random word out of 5. |
134134
| equipment-player-* | autoChunk | int | 0 | When set, the file is replayed once, and cut automatically in data pages compatible with memory bank settings and RDH information. In this mode the preLoad and fillPage options have no effect. |
135-
| equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. Trigger orbit counter in RDH are modified for iterations after the first one, so that they keep increasing. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
135+
| equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
136136
| equipment-player-* | filePath | string | | Path of file containing data to be injected in readout. |
137137
| equipment-player-* | fillPage | int | 1 | If 1, content of data file is copied multiple time in each data page until page is full (or almost full: on the last iteration, there is no partial copy if remaining space is smaller than full file size). If 0, data file is copied exactly once in each data page. |
138138
| equipment-player-* | preLoad | int | 1 | If 1, data pages preloaded with file content on startup. If 0, data is copied at runtime. |
139+
| equipment-player-* | updateOrbits | int | 1 | When set, trigger orbit counters in all RDH are modified for iterations after the first one (in file loop replay mode), so that they keep increasing. |
139140
| equipment-rorc-* | cardId | string | | ID of the board to be used. Typically, a PCI bus device id. c.f. AliceO2::roc::Parameters. |
140141
| equipment-rorc-* | channelNumber | int | 0 | Channel number of the board to be used. Typically 0 for CRU, or 0-5 for CRORC. c.f. AliceO2::roc::Parameters. |
141142
| equipment-rorc-* | cleanPageBeforeUse | int | 0 | If set, data pages are filled with zero before being given for writing by device. Slow, but useful to readout incomplete pages (driver currently does not return correctly number of bytes written in page). |
@@ -148,9 +149,11 @@ The parameters related to 3rd-party libraries are described here for convenience
148149
| equipment-zmq-* | type | string | SUB | Type of ZMQ socket to use to get data (PULL, SUB). |
149150
| readout | aggregatorSliceTimeout | double | 0 | When set, slices (groups) of pages are flushed if not updated after given timeout (otherwise closed only on beginning of next TF, or on stop). |
150151
| readout | aggregatorStfTimeout | double | 0 | When set, subtimeframes are buffered until timeout (otherwise, sent immediately and independently for each data source). |
152+
| readout | customCommands | string | | List of key=value pairs defining some custom shell commands to be executed at before/after state change commands. |
151153
| readout | disableAggregatorSlicing | int | 0 | When set, the aggregator slicing is disabled, data pages are passed through without grouping/slicing. |
152154
| readout | disableTimeframes | int | 0 | When set, all timeframe related features are disabled (this may supersede other config parameters). |
153155
| readout | exitTimeout | double | -1 | Time in seconds after which the program exits automatically. -1 for unlimited. |
156+
| readout | flushConsumerTimeout | double | 1 | Time in seconds to wait before stopping the consumers (ie wait allocated pages released). 0 means stop immediately. |
154157
| readout | flushEquipmentTimeout | double | 1 | Time in seconds to wait for data once the equipments are stopped. 0 means stop immediately. |
155158
| readout | logbookApiToken | string | | The token to be used for the logbook API. |
156159
| readout | logbookEnabled | int | 0 | When set, the logbook is enabled and populated with readout stats at runtime. |

doc/memory.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
Here is a typical buffer organisation for a production FLP:
2+
- there is one buffer per CRU end point or CRORC channel, providing empty data pages (superpages) to the hardware for DMA transfer from the readout card to the computer memory.
3+
- there is one buffer for the FMQ channel to Data Distribution, to copy data which overlap data pages to a single block of memory. Data Distribution requires that data from a HeartBeatFrame (HBF) is shipped in a single FMQ message.
4+
5+
All buffers and data transfer operations are independent of each other. If a single buffer gets full, the whole FLP system can be affected: CRU packets dropped, incomplete timeframes, data synchronization or consistency issues.
6+
7+
Buffers are circular, pages are used in the order they are put in the buffer. On startup, the pages are in the order of their memory address.
8+
Buffer and pages size should be adapted to the throughput and data pattern at runtime.
9+
10+
11+
In case of some "buffer low" issues, there are 3 log messages for each episode:
12+
13+
1) "buffer usage is high", when reaching 90% usage.
14+
2) "buffer full" at 100% usage.
15+
3) "buffer back to reasonable" when down to below 80% usage.
16+
17+
When one of the buffers is full, other messages will start to appear, depending on the context (no page left, packets dropped, etc).
18+
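The three log messages described in doc/memory.md behave like a small hysteresis state machine. The sketch below is only an illustration of that behaviour, not the readout implementation: the class name `BufferUsageMonitor`, its `update()` method and the page-count inputs are hypothetical, while the 90% / 100% / 80% thresholds and the message texts come from the document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical monitor (not part of readout): reproduces the three messages
// of one "buffer low" episode from a stream of usage samples.
class BufferUsageMonitor {
 public:
  // Feed one usage sample and get the messages it triggers, if any.
  std::vector<std::string> update(std::size_t usedPages, std::size_t totalPages) {
    assert(usedPages <= totalPages);
    std::vector<std::string> messages;
    if (totalPages == 0) {
      return messages;
    }
    // Integer arithmetic avoids floating-point rounding at the thresholds.
    const std::size_t used100 = usedPages * 100;
    if (!high_ && used100 >= 90 * totalPages) {
      high_ = true;  // an episode starts at 90% usage
      messages.push_back("buffer usage is high");
    }
    if (high_ && !full_ && usedPages >= totalPages) {
      full_ = true;  // reported at most once per episode
      messages.push_back("buffer full");
    }
    if (high_ && used100 < 80 * totalPages) {
      high_ = false;  // the episode ends below 80% usage
      full_ = false;
      messages.push_back("buffer back to reasonable");
    }
    return messages;
  }

 private:
  bool high_ = false;  // true while an episode is ongoing
  bool full_ = false;  // true once "buffer full" was reported in this episode
};
```

The gap between the 90% and 80% thresholds keeps a buffer oscillating around a single limit from flooding the log.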

doc/releaseNotes.md

Lines changed: 4 additions & 0 deletions
@@ -447,3 +447,7 @@ This file describes the main feature changes for each readout.exe released versi
447447
- readout.flushConsumerTimeout: when set, readout waits up to this amount of time that all data pages locked by consumers are released before stopping.
448448
- Added warning message on buffers low.
449449
- Added message at end of run showing the links which have provided data for each equipment and how much per link.
450+
451+
## v2.10.1 - 20/04/2022
452+
- Updated configuration parameters:
453+
- equipment-player-*.updateOrbits: when set to zero, RDH orbits are not updated in file loop replay. This is needed for some reconstruction tests. This however creates a stream of data with inconsistent orbit ids and mismatching timeframe information.

src/DataBlock.h

Lines changed: 3 additions & 2 deletions
@@ -62,16 +62,17 @@ struct DataBlockHeader {
6262
uint32_t timeframeOrbitLast; ///< from timeframe
6363
uint8_t flagEndOfTimeframe; ///< flag to signal this is the last TF block
6464
uint8_t isRdhFormat; ///< flag set when payload is RDH-formatted
65+
uint32_t orbitOffset; ///< set when RDH orbits should be added given offset to match TFid
6566

6667
uint8_t userSpace[DataBlockHeaderUserSpace]; ///< spare area for user data
6768
};
6869

6970
// Version of this header
7071
// with DB marker for DataBlock start, 1st byte in header little-endian
71-
const uint32_t DataBlockVersion = 0x0002DBDB;
72+
const uint32_t DataBlockVersion = 0x0003DBDB;
7273

7374
// DataBlockHeader instance with all default fields
74-
const DataBlockHeader defaultDataBlockHeader = { .headerVersion = DataBlockVersion, .headerSize = sizeof(DataBlockHeader), .dataSize = 0, .blockId = undefinedBlockId, .pipelineId = undefinedBlockId, .timeframeId = undefinedTimeframeId, .runNumber = undefinedRunNumber, .systemId = undefinedSystemId, .feeId = undefinedFeeId, .equipmentId = undefinedEquipmentId, .linkId = undefinedLinkId, .timeframeOrbitFirst = undefinedOrbit, .timeframeOrbitLast = undefinedOrbit, .flagEndOfTimeframe = 0, .isRdhFormat = 1, .userSpace = { 0 } };
75+
const DataBlockHeader defaultDataBlockHeader = { .headerVersion = DataBlockVersion, .headerSize = sizeof(DataBlockHeader), .dataSize = 0, .blockId = undefinedBlockId, .pipelineId = undefinedBlockId, .timeframeId = undefinedTimeframeId, .runNumber = undefinedRunNumber, .systemId = undefinedSystemId, .feeId = undefinedFeeId, .equipmentId = undefinedEquipmentId, .linkId = undefinedLinkId, .timeframeOrbitFirst = undefinedOrbit, .timeframeOrbitLast = undefinedOrbit, .flagEndOfTimeframe = 0, .isRdhFormat = 1, .orbitOffset = undefinedOrbit, .userSpace = { 0 } };
7576

7677
// DataBlock
7778
// Pair of header + payload data
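The comment on `DataBlockVersion` says the `DB` marker ends up as the first byte of the header when stored little-endian. The following sketch illustrates that layout for the new value `0x0003DBDB`. Splitting the constant into a 16-bit version number and a 16-bit marker is an interpretation assumed here, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Value introduced by this commit (previously 0x0002DBDB).
constexpr uint32_t kDataBlockVersion = 0x0003DBDB;

// Assumed interpretation: upper 16 bits = version number, lower 16 bits = marker.
constexpr uint16_t versionNumber(uint32_t v) { return static_cast<uint16_t>(v >> 16); }
constexpr uint16_t versionMarker(uint32_t v) { return static_cast<uint16_t>(v & 0xFFFFu); }

// Byte stored at the lowest address when the value is written little-endian.
constexpr uint8_t firstByteLittleEndian(uint32_t v) { return static_cast<uint8_t>(v & 0xFFu); }

// Hypothetical sanity check a reader of raw blocks might perform.
inline void checkHeaderMarker(uint32_t v) { assert(versionMarker(v) == 0xDBDB); }
```

The version number is presumably bumped because adding `orbitOffset` changes the size and layout of `DataBlockHeader`, so consumers can tell the two layouts apart.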

src/ReadoutEquipment.cxx

Lines changed: 5 additions & 1 deletion
@@ -634,8 +634,9 @@ int ReadoutEquipment::tagDatablockFromRdh(RdhHandle& h, DataBlockHeader& bh)
634634
isError = 1;
635635
} else {
636636
// timeframe ID
637-
hbOrbit = h.getHbOrbit();
637+
hbOrbit = h.getHbOrbit() + bh.orbitOffset;
638638
tfId = getTimeframeFromOrbit(hbOrbit);
639+
// printf("orbit %X + offset %X = %X -> TFid %d\n",(int)h.getHbOrbit(), (int)bh.orbitOffset, (int)hbOrbit, (int)tfId);
639640

640641
// system ID
641642
systemId = h.getSystemId();
@@ -661,6 +662,9 @@ int ReadoutEquipment::tagDatablockFromRdh(RdhHandle& h, DataBlockHeader& bh)
661662
bh.equipmentId = equipmentId;
662663
bh.linkId = linkId;
663664
getTimeframeOrbitRange(tfId, bh.timeframeOrbitFirst, bh.timeframeOrbitLast);
665+
bh.timeframeOrbitFirst -= bh.orbitOffset;
666+
bh.timeframeOrbitLast -= bh.orbitOffset;
667+
// printf("TF %d eq %d link %d : orbits %X - %X\n", (int)bh.timeframeId, (int)bh.equipmentId, (int)bh.linkId, (int)bh.timeframeOrbitFirst, (int)bh.timeframeOrbitLast);
664668
return isError;
665669
}
666670
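The change to `tagDatablockFromRdh()` adds the offset before computing the timeframe ID, then subtracts it from the timeframe orbit range so that the range is expressed in the orbit values actually present in the RDHs. A minimal sketch of that arithmetic follows. `TimeframeClock`, `TagResult` and `tagBlock()` are hypothetical stand-ins, and the formulas inside them (timeframe IDs starting at 1, fixed number of orbits per timeframe) are assumptions rather than the actual `getTimeframeFromOrbit()` / `getTimeframeOrbitRange()` code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the timeframe bookkeeping of ReadoutEquipment.
struct TimeframeClock {
  uint32_t firstOrbit;    // orbit at which timeframe 1 starts (assumption)
  uint32_t periodOrbits;  // number of orbits per timeframe (assumption)

  uint64_t timeframeFromOrbit(uint32_t orbit) const {
    assert(periodOrbits > 0);
    return 1 + (orbit - firstOrbit) / periodOrbits;
  }

  void orbitRange(uint64_t tfId, uint32_t& first, uint32_t& last) const {
    first = firstOrbit + static_cast<uint32_t>((tfId - 1) * periodOrbits);
    last = first + periodOrbits - 1;
  }
};

struct TagResult {
  uint64_t tfId;
  uint32_t orbitFirst;
  uint32_t orbitLast;
};

// Mirrors the order of operations in the diff: add offset, derive the TF ID,
// then shift the orbit range back by the same offset.
inline TagResult tagBlock(const TimeframeClock& clock, uint32_t rdhOrbit, uint32_t orbitOffset) {
  TagResult r{};
  const uint32_t hbOrbit = rdhOrbit + orbitOffset;
  r.tfId = clock.timeframeFromOrbit(hbOrbit);
  clock.orbitRange(r.tfId, r.orbitFirst, r.orbitLast);
  r.orbitFirst -= orbitOffset;
  r.orbitLast -= orbitOffset;
  return r;
}
```

With 128 orbits per timeframe and an offset of 256, an RDH orbit of 10 is tagged as timeframe 3, while the reported orbit range stays 0 to 127, matching the unmodified orbits in the file.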

src/ReadoutEquipmentPlayer.cxx

Lines changed: 15 additions & 4 deletions
@@ -54,6 +54,7 @@ class ReadoutEquipmentPlayer : public ReadoutEquipment
5454
PacketHeader lastPacketHeader; // keep track of last packet header
5555

5656
uint32_t orbitOffset = 0; // to be applied to orbit after 1st loop
57+
int cfgUpdateOrbits = 1; // when set, all RDHs are modified to update orbit, according to orbitOffset
5758

5859
void copyFileDataToPage(void* page); // fill given page with file data according to current settings
5960
};
@@ -97,11 +98,16 @@ ReadoutEquipmentPlayer::ReadoutEquipmentPlayer(ConfigFile& cfg, std::string cfgE
9798
cfg.getOptionalValue<int>(cfgEntryPoint + ".fillPage", fillPage, 1);
9899
// configuration parameter: | equipment-player-* | autoChunk | int | 0 | When set, the file is replayed once, and cut automatically in data pages compatible with memory bank settings and RDH information. In this mode the preLoad and fillPage options have no effect. |
99100
cfg.getOptionalValue<int>(cfgEntryPoint + ".autoChunk", autoChunk, 0);
100-
// configuration parameter: | equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. Trigger orbit counter in RDH are modified for iterations after the first one, so that they keep increasing. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
101+
// configuration parameter: | equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
101102
cfg.getOptionalValue<int>(cfgEntryPoint + ".autoChunkLoop", autoChunkLoop, 0);
103+
// configuration parameter: | equipment-player-* | updateOrbits | int | 1 | When set, trigger orbit counters in all RDH are modified for iterations after the first one (in file loop replay mode), so that they keep increasing. |
104+
cfg.getOptionalValue<int>(cfgEntryPoint + ".updateOrbits", cfgUpdateOrbits, 1);
102105

103106
// log config summary
104-
theLog.log(LogInfoDevel_(3002), "Equipment %s: using data source file=%s preLoad=%d fillPage=%d autoChunk=%d autoChunkLoop=%d", name.c_str(), filePath.c_str(), preLoad, fillPage, autoChunk, autoChunkLoop);
107+
theLog.log(LogInfoDevel_(3002), "Equipment %s: using data source file=%s preLoad=%d fillPage=%d autoChunk=%d autoChunkLoop=%d updateOrbits=%d", name.c_str(), filePath.c_str(), preLoad, fillPage, autoChunk, autoChunkLoop, cfgUpdateOrbits);
108+
if ((!cfgUpdateOrbits)&&(autoChunkLoop)) {
109+
theLog.log(LogWarningDevel_(3104), "Equipment %s: RDH orbits auto-update is disabled, generated data will be inconsistent (TFid and orbit counters mismatch)", name.c_str());
110+
}
105111

106112
// open data file
107113
fp = fopen(filePath.c_str(), "rb");
@@ -205,6 +211,10 @@ DataBlockContainerReference ReadoutEquipmentPlayer::getNextBlock()
205211
// no need to fill header defaults, this is done by getNewDataBlockContainer()
206212
// only adjust payload size
207213
b->header.dataSize = 0;
214+
// and possibly set orbit offset
215+
if (!cfgUpdateOrbits) {
216+
b->header.orbitOffset = orbitOffset;
217+
}
208218

209219
if (autoChunk) {
210220
bool isOk = 1;
@@ -231,6 +241,7 @@ DataBlockContainerReference ReadoutEquipmentPlayer::getNextBlock()
231241
loopCount++;
232242
fileOffset = 0;
233243
orbitOffset = lastPacketHeader.timeframeId * getTimeframePeriodOrbits();
244+
// printf("loop %d: offset = %X\n",(int)loopCount, (int)orbitOffset);
234245
isOk = 1;
235246
}
236247
}
@@ -251,7 +262,7 @@ DataBlockContainerReference ReadoutEquipmentPlayer::getNextBlock()
251262
isOk = 0;
252263
break;
253264
}
254-
if (orbitOffset) {
265+
if (cfgUpdateOrbits) {
255266
// update RDH orbit when applicable
256267
h.incrementHbOrbit(orbitOffset);
257268
}
@@ -261,7 +272,7 @@ DataBlockContainerReference ReadoutEquipmentPlayer::getNextBlock()
261272
currentPacketHeader.linkId = (int)h.getLinkId();
262273
currentPacketHeader.equipmentId = (int)(h.getCruId() * 10 + h.getEndPointId());
263274

264-
int hbOrbit = h.getHbOrbit();
275+
int hbOrbit = h.getHbOrbit() + b->header.orbitOffset;
265276
currentPacketHeader.timeframeId = getTimeframeFromOrbit(hbOrbit);
266277

267278
// fill page metadata
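Taken together, the player changes give two ways of applying the per-loop orbit offset: patching every RDH (`updateOrbits=1`) or leaving the RDHs untouched and passing the offset in the block header (`updateOrbits=0`). The sketch below illustrates that choice under stated assumptions. `LoopReplay`, `rewind()` and `apply()` are hypothetical names, and the handling of negative `autoChunkLoop` values is inferred from the parameter description (-5 -> 5x replay), not from code shown in this diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the loop replay state kept by the player equipment.
struct LoopReplay {
  int autoChunkLoop = 0;     // 0: no loop, >0: loop forever, <0: replay |value| times (assumed)
  bool updateOrbits = true;  // mirrors the new updateOrbits parameter
  uint32_t orbitOffset = 0;  // offset applied after the first loop
  int loopCount = 0;         // number of completed passes over the file

  // Called at end of file. Returns true when the file should be replayed again.
  bool rewind(uint32_t lastTimeframeId, uint32_t periodOrbits) {
    assert(periodOrbits > 0);
    if (autoChunkLoop == 0) {
      return false;
    }
    loopCount++;
    if (autoChunkLoop < 0 && loopCount >= -autoChunkLoop) {
      return false;
    }
    // Same formula as in the diff: next loop starts after the last timeframe seen.
    orbitOffset = lastTimeframeId * periodOrbits;
    return true;
  }

  // Apply the offset to one packet: either in the RDH orbit itself,
  // or in the block header only, leaving the RDH orbit as read from file.
  void apply(uint32_t& rdhOrbit, uint32_t& headerOrbitOffset) const {
    if (updateOrbits) {
      rdhOrbit += orbitOffset;
      headerOrbitOffset = 0;
    } else {
      headerOrbitOffset = orbitOffset;
    }
  }
};
```

In the second mode the payload keeps repeating the same orbit values on every loop, which is why the commit logs a warning about mismatching timeframe IDs and orbit counters.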

src/ReadoutVersion.h

Lines changed: 1 addition & 1 deletion
@@ -9,5 +9,5 @@
99
// granted to it by virtue of its status as an Intergovernmental Organization
1010
// or submit itself to any jurisdiction.
1111

12-
#define READOUT_VERSION "2.10.0"
12+
#define READOUT_VERSION "2.10.1"
1313

src/readoutConfigEditor.tcl

Lines changed: 7 additions & 1 deletion
@@ -110,10 +110,11 @@ set configurationParametersDescriptor {
110110
| equipment-dummy-* | eventMinSize | bytes | 128k | Minimum size of randomly generated event. |
111111
| equipment-dummy-* | fillData | int | 0 | Pattern used to fill data page: (0) no pattern used, data page is left untouched, with whatever values were in memory (1) incremental byte pattern (2) incremental word pattern, with one random word out of 5. |
112112
| equipment-player-* | autoChunk | int | 0 | When set, the file is replayed once, and cut automatically in data pages compatible with memory bank settings and RDH information. In this mode the preLoad and fillPage options have no effect. |
113-
| equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. Trigger orbit counter in RDH are modified for iterations after the first one, so that they keep increasing. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
113+
| equipment-player-* | autoChunkLoop | int | 0 | When set, the file is replayed in loops. If value is negative, only that number of loop is executed (-5 -> 5x replay). |
114114
| equipment-player-* | filePath | string | | Path of file containing data to be injected in readout. |
115115
| equipment-player-* | fillPage | int | 1 | If 1, content of data file is copied multiple time in each data page until page is full (or almost full: on the last iteration, there is no partial copy if remaining space is smaller than full file size). If 0, data file is copied exactly once in each data page. |
116116
| equipment-player-* | preLoad | int | 1 | If 1, data pages preloaded with file content on startup. If 0, data is copied at runtime. |
117+
| equipment-player-* | updateOrbits | int | 1 | When set, trigger orbit counters in all RDH are modified for iterations after the first one (in file loop replay mode), so that they keep increasing. |
117118
| equipment-rorc-* | cardId | string | | ID of the board to be used. Typically, a PCI bus device id. c.f. AliceO2::roc::Parameters. |
118119
| equipment-rorc-* | channelNumber | int | 0 | Channel number of the board to be used. Typically 0 for CRU, or 0-5 for CRORC. c.f. AliceO2::roc::Parameters. |
119120
| equipment-rorc-* | cleanPageBeforeUse | int | 0 | If set, data pages are filled with zero before being given for writing by device. Slow, but useful to readout incomplete pages (driver currently does not return correctly number of bytes written in page). |
@@ -126,9 +127,11 @@ set configurationParametersDescriptor {
126127
| equipment-zmq-* | type | string | SUB | Type of ZMQ socket to use to get data (PULL, SUB). |
127128
| readout | aggregatorSliceTimeout | double | 0 | When set, slices (groups) of pages are flushed if not updated after given timeout (otherwise closed only on beginning of next TF, or on stop). |
128129
| readout | aggregatorStfTimeout | double | 0 | When set, subtimeframes are buffered until timeout (otherwise, sent immediately and independently for each data source). |
130+
| readout | customCommands | string | | List of key=value pairs defining some custom shell commands to be executed at before/after state change commands. |
129131
| readout | disableAggregatorSlicing | int | 0 | When set, the aggregator slicing is disabled, data pages are passed through without grouping/slicing. |
130132
| readout | disableTimeframes | int | 0 | When set, all timeframe related features are disabled (this may supersede other config parameters). |
131133
| readout | exitTimeout | double | -1 | Time in seconds after which the program exits automatically. -1 for unlimited. |
134+
| readout | flushConsumerTimeout | double | 1 | Time in seconds to wait before stopping the consumers (ie wait allocated pages released). 0 means stop immediately. |
132135
| readout | flushEquipmentTimeout | double | 1 | Time in seconds to wait for data once the equipments are stopped. 0 means stop immediately. |
133136
| readout | logbookApiToken | string | | The token to be used for the logbook API. |
134137
| readout | logbookEnabled | int | 0 | When set, the logbook is enabled and populated with readout stats at runtime. |
@@ -144,6 +147,9 @@ set configurationParametersDescriptor {
144147
| readout | timeStop | string | | In standalone mode, time at which to execute stop. If not set, on int/term/quit signal. |
145148
| readout-monitor | broadcastHost | string | | used by readout-status to connect to readout-monitor broadcast channel. |
146149
| readout-monitor | broadcastPort | int | 0 | when set, the process will create a listening TCP port and broadcast statistics to connected clients. |
150+
| readout-monitor | logFile | string | | when set, the process will log received metrics to a file. |
151+
| readout-monitor | logFileHistory | int | 1 | defines the maximum number of previous log files to keep, when a maximum size is set. |
152+
| readout-monitor | logFileMaxSize | int | 128 | defines the maximum size of log file (in MB). When reaching this threshold, the log file is rotated. |
147153
| readout-monitor | monitorAddress | string | tcp://127.0.0.1:6008 | Address of the receiving ZeroMQ channel to receive readout statistics. |
148154
| readout-monitor | outputFormat | int | 0 | 0: default, human readable. 1: raw bytes. |
149155
| receiverFMQ | channelAddress | string | ipc:///tmp/pipe-readout | c.f. parameter with same name in consumer-FairMQchannel-* |

src/readoutErrorCodes.h

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
3333
{ 3101, "A feature is configured but not supported by this readout build.", nullptr},
3434
{ 3102, "Syntax error in configuration.", nullptr},
3535
{ 3103, "Inconsistent parameters in configuration.", nullptr},
36+
{ 3104, "Some valid but unsafe configuration parameters are used.", nullptr},
3637
3738
{ 3210, "Logbook problem", nullptr},
3839
{ 3220, "Timeframe server problem", nullptr},
