Skip to content

Conversation

@jpreston-cern
Copy link
Contributor

@jpreston-cern jpreston-cern commented Jan 6, 2026

This is the first in a series of PR's that aim to bring GBTS up to date with ACTS coding standards.

This PR focuses on refactoring the parsing code for the LUT so that its in its own file mimicking that of the connector file as well as moving it to the constructor of GbtsSeedingAlgorithm. It also includes some code clean up, as well as moving the seedFinder constructor to initialize as well. This gives some speed increase due to not parsing the LUT every event.

--- END COMMIT MESSAGE ---

This will affect the Athena API but a branch is already ready to submit to the canary

@timadye @andiwand

@github-actions github-actions bot added this to the next milestone Jan 6, 2026
@github-actions github-actions bot added Component - Core Affects the Core module Component - Examples Affects the Examples module Seeding Track Finding labels Jan 6, 2026
@jpreston-cern jpreston-cern changed the title Ref: Refactoring of string parsing code for LUT and connection table refactor: Refactoring of string parsing code for LUT and connection table Jan 6, 2026
@jpreston-cern jpreston-cern changed the title refactor: Refactoring of string parsing code for LUT and connection table refactor: Refactoring of string parsing code for Gbts LUT and connection table Jan 6, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Jan 6, 2026

📊: Physics performance monitoring for ca2fd45

Full contents

physmon summary

Comment on lines 50 to 55
const std::vector<std::array<float, 5>>& getParsedLut() const {
return m_mlLUT;
}

private:
std::vector<std::array<float, 5>> m_mlLUT{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the rest of the implementation I would rather use a free function to model the behavior. generally I would avoid doing heavy lifting in constructors as this will be unexpected. the behavior can be easily modeled in a functional way params -> std::vector<std::array<float, 5>>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is reasonable to have something like GbtsLutParser as a holder for the lookup table, rather than using std::vector<std::array<float, 5>> directly. Perhaps it could better be called something like GbtsLookupTable. This could go all the way into GbtsDataStorage, so all usage could be via GbtsLookupTable::get() (more compact than getParsedLut()) or other methods.

As @andiwand suggests, it is probably good to move the file parsing to separate code, perhaps in a GbtsLookupTable.cpp file. It could be a private method called from the constructor - I personally don't see a problem with constructors that do a lot of work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think conceptually the class GbtsLutParser is not a parser right now. a parser should have a method to parse. currently this class is more like a GbtsLookupTable like you suggested. but even then I would not recommend to construct it from a path and letting the constructor read the file.

so ultimately I would recommend either having a class GbtsLookupTable or simply a typedef with that name to std::vector<std::array<float, 5>> in case this is the only property to hold. to get a GbtsLookupTable from file I would either have a static method on GbtsLookupTable or a free function that returns the object

Copy link
Contributor Author

@jpreston-cern jpreston-cern Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ive pushed some changes to the branch which move the creation of that LUT structure used by datastorage m_mlLut to its own function and also cpp file, although i pushed this before reading these comments so the naming hasnt been changed (although easily renamed),

I quite like the idea of some sort of parsing class being passed in to avoid copying in potentially large vectors in the future, currently it is only 60 lines of an array size 5 long but as i imagine this part of the code is still under active development this might change and could cause slow down if it has to be copied in every event.

I now have this parsing class being passed in as a shared pointer where the storage which can then pull a ref to the LUT information where it is used

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the interface ergonomics are still a bit odd. IMO a parser should return something and not change its state if the parsing is done in a single step. so I would either use a parser + data struct where the parser returns the struct or simply omit the parser as it does not have any configuration other than the bool

copying the data is no concern IMO because the std::vector can simply be std::moved and/or put into a std::shared_ptr

Copy link
Contributor Author

@jpreston-cern jpreston-cern Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice ok that makes sense, for GbtsDataStorage it is important that even if the std::vector<std::array<float, 5>>m_mlLut veriable is empty, that it is still passed in as the size of the vector is used for the logic in terms of if clusterwidth is used. would it be reasonable to either:

change to the parser style class that returns the data struct like you said which can then be used to pass in if it is available, maybe having another constructor which accepts the creation of GbtsdataStorage without passing in the data struct.

or have that gbtsLookupTable class and pass that in as a whole which can then be used by GbtsDataStorage to get the vector (empty or not)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to me the easiest solution seems to be something like this

// lookup table alias
using GbtsMLLookupTable = std::vector<std::array<float, 5>>;

// unconditionally parse the file
GbtsMLLookupTable parseGbtsMLLookupTable(const std::string &path);

then the GbtsDataStorage will take GbtsMLLookupTable by value in the constructor and move it to the member.

GbtsDataStorage(
      std::shared_ptr<const GbtsGeometry> geometry, const SeedFinderGbtsConfig& config,
      GbtsMLLookupTable mlLut) : m_geometry(std::move(geometry)), m_config(config), m_mlLut(std::move(mlLut)) {}

where parseGbtsMLLookupTable is called, it is done conditionally. if it is not parsed an empty GbtsMLLookupTable is created. then the GbtsMLLookupTable instance is std::moved into GbtsDataStorage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes sense thank you! in terms of placing this parsing function, should it be in examples the same as the connector or within core in SeedFinderGbts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the parsing function can be part of the Core functionality and be called from the Examples

@jpreston-cern
Copy link
Contributor Author

jpreston-cern commented Jan 7, 2026

i have added all the changes suggested so far, the only thing i was thinking was that as the shared pointer of both geometry and Lutparser (name to be changed) is first defined as a member function of SeedFinderGbts and is then passed in per instance of the createSeed function, if there is a lot of multi threading of events, would that cause slow down because of the atomic reference count being high/ changing a lot?

const SeedFinderGbtsConfig& m_config;
std::vector<std::array<float, 5>> m_mlLUT;
std::shared_ptr<const GbtsGeometry> m_geo;
const SeedFinderGbtsConfig m_config;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const SeedFinderGbtsConfig m_config;
SeedFinderGbtsConfig m_config;

no need to use const here as the type GbtsDataStorage itself can be used as const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GbtsDataStorage has a member veriable std::vector<GbtsEtaBin> m_etaBins which is modified as a part of the loadnodes function, if GbtsDataStorage is made const is it ok to make that veriable a mutable ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I think in this case GbtsDataStorage should just stay non-const downstream. but since m_config is hidden from the user it does not need to be `const internally

explicit GbtsDataStorage(const GbtsGeometry& geometry,
const SeedFinderGbtsConfig& config);
explicit GbtsDataStorage(std::shared_ptr<const GbtsGeometry> geometry,
const SeedFinderGbtsConfig config,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const SeedFinderGbtsConfig config,
const SeedFinderGbtsConfig& config,

this should either be passed in as a const & or by value and std::moved

Comment on lines 50 to 55
const std::vector<std::array<float, 5>>& getParsedLut() const {
return m_mlLUT;
}

private:
std::vector<std::array<float, 5>> m_mlLUT{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the interface ergonomics are still a bit odd. IMO a parser should return something and not change its state if the parsing is done in a single step. so I would either use a parser + data struct where the parser returns the struct or simply omit the parser as it does not have any configuration other than the bool

copying the data is no concern IMO because the std::vector can simply be std::moved and/or put into a std::shared_ptr

}
};

SeedContainer2 CreateSeeds(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SeedContainer2 CreateSeeds(
SeedContainer2 createSeeds(


std::vector<std::vector<SeedFinderGbts::GNN_Node>> CreateNodes(
const auto& container, int MaxLayers);
std::vector<std::vector<GbtsNode>> CreateNodes(const auto& container,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::vector<std::vector<GbtsNode>> CreateNodes(const auto& container,
std::vector<std::vector<GbtsNode>> createNodes(const auto& container,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I would avoid auto here

@andiwand
Copy link
Contributor

andiwand commented Jan 8, 2026

i have added all the changes suggested so far, the only thing i was thinking was that as the shared pointer of both geometry and Lutparser (name to be changed) is first defined as a member function of SeedFinderGbts and is then passed in per instance of the createSeed function, if there is a lot of multi threading of events, would that cause slow down because of the atomic reference count being high/ changing a lot?

GbtsDataStorage is only created once per event, no? this also contains the GbtsLutParser right now which is held by the std::shared_ptr. in this case I would not worry about performance in any way. there might be a bit more indirection but without detailed profiling this should be not acted on

in case this happens for every seed you might be able to see some performance effects but still I would expect it to be extremely small

@jpreston-cern
Copy link
Contributor Author

jpreston-cern commented Jan 8, 2026

I have added the parsing function to the SeedFinderGbts class which stores the parsed information as a member veriable, this is then passed to each instance of GbtsDataStorage . i thought this made the most sense due to the seed finder creating the instances of the GbtsDataStorage and already having a copy of the config, but i can move it if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component - Core Affects the Core module Component - Examples Affects the Examples module Seeding Track Finding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants