What is the meaning of ecole.rewards.NNodes, and can we fully access the underlying SCIP search tree? #34

cwfparsonson · 2021-07-14T14:20:10Z


Hi,

I have two questions whose answers would help me understand ecole, pyscipopt, and branch-and-bound:

Q1

In the documentation, ecole.rewards.NNodes is defined as 'the total number of nodes processed since the previous state.' What does 'nodes processed' mean in this context? My understanding of branching was that a chosen node is given to the brancher, the brancher chooses a variable at this node to branch on, then 2 new child nodes are added to the chosen parent node and 'processed' (i.e. dual and primal bounds assigned), then the process is repeated. In this context, how is the number of 'nodes processed' between each branching state ever not equal to 2?

Q2

Is it possible to fully access the underlying search tree being evolved by pyscipopt and build the full B&B search tree ourselves? To help me understand how ecole and pyscipopt are working, I'm solving a simple B&B problem (the first walk-through example given here: http://web.tecnico.ulisboa.pt/mcasquilho/compute/_linpro/TaylorB_module_c.pdf) and printing the state at each step.

import ecole
import pyscipopt
import numpy as np

Initialising the problem instance with pyscipopt:

# init instance
pyscipopt_instance = pyscipopt.Model('example_1') # pyscipmodel

# register decision variables w/ category and bounds
variables = {f'x{i}': pyscipopt_instance.addVar(name=f'x{i}', vtype='I', lb=0, ub=None) for i in range(1, 3)}

# register non-category and non-bounds constraints
pyscipopt_instance.addCons(8000*variables['x1'] + 4000*variables['x2'] <= 40000)
pyscipopt_instance.addCons(15*variables['x1'] + 30*variables['x2'] <= 200)

# register objective
pyscipopt_instance.setObjective(100*variables['x1'] + 150*variables['x2'], sense='maximize')

# turn off helpers so don't instantly solve on env.reset()
pyscipopt_instance.setPresolve(pyscipopt.SCIP_PARAMSETTING.OFF)
pyscipopt_instance.setHeuristics(pyscipopt.SCIP_PARAMSETTING.OFF)
pyscipopt_instance.disablePropagation()
pyscipopt_instance.setSeparating(pyscipopt.SCIP_PARAMSETTING.OFF)

Create an ecole branching environment:

env = ecole.environment.Branching(
    observation_function=(
        ecole.observation.NodeBipartite()
    ),
    information_function=({}),
    reward_function=({}),
)
env.seed(0) # seed the environment for reproducibility

Run a random brancher in the ecole environment until the ILP is solved or max_step is exceeded, where print_env_params() is just a function which prints out a few different search tree and current node parameters (happy to post if useful):

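# wrap the PySCIPOpt model for ecole (ecole_instance was not defined above;
# Model.from_pyscipopt is assumed to be the intended conversion)
ecole_instance = ecole.scip.Model.from_pyscipopt(pyscipopt_instance)

max_step = 100
n_step = 0
observation, action_set, reward_offset, done, info = env.reset(ecole_instance)
while not done and n_step < max_step:
    print(f'\nStep {n_step}')
    print_env_params(env)  # user-defined helper, not shown here

    # select action and move to next step
    action = np.random.choice(action_set)
    observation, action_set, reward, done, info = env.step(action)
    n_step += 1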

This results in the following output at each step:

Step 0
>>> tree params <<<
status: unknown
best primal bound: -1e+20 | best dual bound: 1055.5555555555554 | primal-dual gap: 1e+20
solving time: 11.067652 s
num lp iterations: 2
num processed nodes in current run (including focus node): 1
num processed nodes across all runs (including focus node): 1
num leaves in search tree: 0
num leaf nodes processed with feasible relaxed (dual) solutions: 0
num leaf nodes processed with infeasible (relaxed) solutions: 0
open nodes (leaves, children, siblings): ([], [], [])
best node (child, sibling, or leaf) in tree w.r.t. node selection strategy: None
>>> current node params <<<
node lower bound: -21.11111111111111
parent node: None
node number: 1
node depth: 0
node type: 0
estimated best feasible solution in sub-tree of node: -20.444444444444446
is node marked to be propagated again: False
num children of focus node: 0
num siblings of focus node: 0

Step 1
>>> tree params <<<
status: unknown
best primal bound: -1e+20 | best dual bound: 1055.5555555555554 | primal-dual gap: 1e+20
solving time: 11.070899 s
num lp iterations: 3
num processed nodes in current run (including focus node): 2
num processed nodes across all runs (including focus node): 2
num leaves in search tree: 0
num leaf nodes processed with feasible relaxed (dual) solutions: 0
num leaf nodes processed with infeasible (relaxed) solutions: 0
open nodes (leaves, children, siblings): ([], [], [<pyscipopt.scip.Node object at 0x7f82056545f0>])
best node (child, sibling, or leaf) in tree w.r.t. node selection strategy: <pyscipopt.scip.Node object at 0x7f82056545f0>
>>> current node params <<<
node lower bound: -21.0
parent node: <pyscipopt.scip.Node object at 0x7f8205654690>
node number: 2
node depth: 1
node type: 0
estimated best feasible solution in sub-tree of node: -20.666666666666668
is node marked to be propagated again: False
num children of focus node: 0
num siblings of focus node: 1

Step 2
>>> tree params <<<
status: unknown
best primal bound: -1e+20 | best dual bound: 1055.5555555555554 | primal-dual gap: 1e+20
solving time: 11.073118 s
num lp iterations: 4
num processed nodes in current run (including focus node): 3
num processed nodes across all runs (including focus node): 3
num leaves in search tree: 1
num leaf nodes processed with feasible relaxed (dual) solutions: 0
num leaf nodes processed with infeasible (relaxed) solutions: 0
open nodes (leaves, children, siblings): ([<pyscipopt.scip.Node object at 0x7f82056544b0>], [], [<pyscipopt.scip.Node object at 0x7f8205654a00>])
best node (child, sibling, or leaf) in tree w.r.t. node selection strategy: <pyscipopt.scip.Node object at 0x7f82056544b0>
>>> current node params <<<
node lower bound: -20.666666666666668
parent node: <pyscipopt.scip.Node object at 0x7f8205654a00>
node number: 5
node depth: 2
node type: 0
estimated best feasible solution in sub-tree of node: -20.499850000000002
is node marked to be propagated again: False
num children of focus node: 0
num siblings of focus node: 1

Step 3
>>> tree params <<<
status: unknown
best primal bound: -1e+20 | best dual bound: 1055.5555555555554 | primal-dual gap: 1e+20
solving time: 11.075674 s
num lp iterations: 5
num processed nodes in current run (including focus node): 4
num processed nodes across all runs (including focus node): 4
num leaves in search tree: 2
num leaf nodes processed with feasible relaxed (dual) solutions: 0
num leaf nodes processed with infeasible (relaxed) solutions: 0
open nodes (leaves, children, siblings): ([<pyscipopt.scip.Node object at 0x7f8205654960>, <pyscipopt.scip.Node object at 0x7f8205654910>], [], [<pyscipopt.scip.Node object at 0x7f8205654870>])
best node (child, sibling, or leaf) in tree w.r.t. node selection strategy: <pyscipopt.scip.Node object at 0x7f8205654960>
>>> current node params <<<
node lower bound: -20.5
parent node: <pyscipopt.scip.Node object at 0x7f8205654870>
node number: 6
node depth: 3
node type: 0
estimated best feasible solution in sub-tree of node: -20.416604166666666
is node marked to be propagated again: False
num children of focus node: 0
num siblings of focus node: 1

Done.
objective value: 1000.0
>>> tree params <<<
status: optimal
best primal bound: 1000.0 | best dual bound: 1000.0 | primal-dual gap: 0.0
solving time: 11.07781 s
num lp iterations: 7
num processed nodes in current run (including focus node): 6
num processed nodes across all runs (including focus node): 6
num leaves in search tree: 0
num leaf nodes processed with feasible relaxed (dual) solutions: 1
num leaf nodes processed with infeasible (relaxed) solutions: 0
open nodes (leaves, children, siblings): ([], [], [])
best node (child, sibling, or leaf) in tree w.r.t. node selection strategy: None

What I find is that although there seem to be 2 nodes being added at each branching step (the node ID of env.model.as_pyscipopt().getCurrentNode() seems to go up by either 1 or 2 depending on which of the new nodes was chosen), env.model.as_pyscipopt().getNTotalNodes() ('num processed nodes across all runs') only goes up by 1. Furthermore, env.model.as_pyscipopt().getNLeaves() ('num leaves in search tree') is 0 at step 0 (I expected 1 since there will be one root node with no children), 0 at step 1 (I expected 2 since 2 child nodes will have been added from branching at the root), and so on. Perhaps to keep track of the search tree, I need to also keep track of the siblings of the focus node, since those seem to be excluded from the total number of tree nodes, open nodes, etc.?

Any help in understanding how pyscipopt is evolving the search tree at each step in the ecole environment would be much appreciated!


gasse (Maintainer) · 2021-07-20T20:04:42Z

Hi @cwfparsonson ,

Here is my understanding of the B&B process and the counting of nodes in SCIP (see also this discussion in SCIP's mailing list).

During the solving process, at any point in time the B&B tree consists of:

processed nodes: nodes which have already been selected by node selection and processed. This includes the currently selected node (the focus node) as well.
open nodes: nodes which have been created by branching and are yet to be processed.

Processed nodes

Processed really means that a node has already been selected by the node selection strategy. This includes previously processed nodes, and also the currently focused node (see SCIPgetNNodes()). Assuming SCIP does not perform a restart (in which case the B&B tree is discarded and a new one is built), this is also equal to SCIPgetNTotalNodes(). When SCIP selects a node it does the following by default:

1. SCIP pre-processes the node (propagation, conflict analysis, cutting planes, etc.).
2. SCIP solves the LP relaxation.
3. SCIP decides on what to do with the node:
3.a if the LP is infeasible, prune the node (it becomes an infeasible leaf node, see SCIPgetNInfeasibleLeaves()).
3.b else if the LP solution value is above the current global upper bound, prune the node (it becomes an objective limit leaf node, see SCIPgetNObjlimLeaves()).
3.c else if the LP solution satisfies the integrality constraints, register the node as feasible (it becomes a feasible leaf node, see SCIPgetNFeasibleLeaves()).
3.d else if the LP solution does not satisfy the integrality constraints, call the branching rule, which may or may not create children nodes. In the most usual case, 2 children nodes are created. Note that these nodes are still open nodes at this point, as they have not been processed yet.
4. Finally, SCIP calls the node selector to extract the next node to be processed from the open nodes, and moves on to process that node (which will become a processed node).
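
To make the counters concrete, here is a minimal pure-Python sketch of the loop described above. It is not SCIP: it assumes a toy 0/1 knapsack (a maximization problem, so the "objective limit" test compares against the incumbent from below), uses the greedy fractional-knapsack bound in place of a real LP solve, skips pre-processing entirely, and uses depth-first node selection. The function names and the `stats` keys are made up for this illustration.

```python
def lp_relaxation(values, weights, capacity, fixed):
    """Fractional-knapsack bound for a node (stand-in for the LP relaxation).

    Returns None if the node is infeasible, else (bound, fractional_item),
    where fractional_item is None when the relaxed solution is integral.
    """
    remaining = capacity - sum(weights[i] for i, v in fixed.items() if v == 1)
    if remaining < 0:
        return None
    bound = float(sum(values[i] for i, v in fixed.items() if v == 1))
    free = sorted(
        (i for i in range(len(values)) if i not in fixed),
        key=lambda i: values[i] / weights[i],
        reverse=True,
    )
    for i in free:
        if weights[i] <= remaining:
            remaining -= weights[i]
            bound += values[i]
        elif remaining > 0:
            # item i only fits partially: this is the fractional variable
            return bound + values[i] * remaining / weights[i], i
        else:
            break  # knapsack is exactly full, solution is integral
    return bound, None


def branch_and_bound(values, weights, capacity):
    stats = {"processed": 0, "infeasible": 0, "objlim": 0,
             "feasible": 0, "branched": 0, "created": 1}
    open_nodes = [{}]            # the root node, with no variable fixed
    incumbent = float("-inf")    # best integral value found so far
    between = []                 # processed nodes between branching calls
    last = 0
    while open_nodes:
        fixed = open_nodes.pop()             # node selection (depth-first)
        stats["processed"] += 1
        relax = lp_relaxation(values, weights, capacity, fixed)
        if relax is None:                    # 3.a infeasible leaf
            stats["infeasible"] += 1
            continue
        bound, frac = relax
        if bound <= incumbent:               # 3.b objective limit leaf
            stats["objlim"] += 1
        elif frac is None:                   # 3.c feasible leaf
            stats["feasible"] += 1
            incumbent = bound
        else:                                # 3.d branching call
            stats["branched"] += 1
            between.append(stats["processed"] - last)
            last = stats["processed"]
            open_nodes.append({**fixed, frac: 0})
            open_nodes.append({**fixed, frac: 1})
            stats["created"] += 2
    between.append(stats["processed"] - last)  # nodes processed after the last branching
    return incumbent, stats, between


best, stats, between = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, stats, between)
```

Each entry of `between` counts the nodes that became processed since the previous branching call, which is the quantity the question is about. Note that, unlike SCIP, this toy never discards open nodes early: every created node is eventually selected and counted.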

Open nodes

The open nodes constitute the boundary of the open B&B search space. They are the nodes that are left to be processed (see SCIPgetNNodesLeft()). Open nodes are leaf nodes in the tree, but technically speaking they do not constitute all the leaf nodes. Some leaf nodes might have been processed without producing any child node, and thus remain leaf nodes while they are not open any more (see objective limit leaf node or infeasible leaf nodes above).

For practical reasons, SCIP further partitions the open nodes into three sets:

leaves: the set of all open nodes minus the child and sibling nodes relative to the currently focused node. Despite its name, this set does not store all the open leaf nodes in the B&B tree (see SCIPgetNLeaves()).
children: the children of the currently focused node, which are necessarily all open (see SCIPgetNChildren()).
siblings: the open nodes which are siblings to the currently focused node, that is, which have the same parent (see SCIPgetNSiblings()).
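
As an illustration of this partitioning, the sketch below splits a set of open nodes relative to a focus node, using only a child-to-parent map. It is plain Python rather than a SCIP or PySCIPOpt call, and the node numbers and parent links are assumptions chosen to be consistent with the focus node numbers printed in the question (the log itself does not show the numbers of the open nodes).

```python
def partition_open_nodes(parent, open_nodes, focus):
    """Split open nodes into (leaves, children, siblings) relative to focus.

    parent maps a node number to its parent's number; the root has no entry.
    """
    focus_parent = parent.get(focus)
    children = [n for n in open_nodes if parent.get(n) == focus]
    siblings = [
        n for n in open_nodes
        if n != focus and focus_parent is not None and parent.get(n) == focus_parent
    ]
    # "leaves" is everything open that is neither a child nor a sibling
    leaves = [n for n in open_nodes if n not in children and n not in siblings]
    return leaves, children, siblings


# Assumed tree: root 1 branched into 2 and 3, then node 2 branched into 4 and 5.
parent = {2: 1, 3: 1, 4: 2, 5: 2}

# Focus on node 2 while node 3 is still open (compare with "Step 1").
print(partition_open_nodes(parent, open_nodes=[3], focus=2))

# Focus on node 5 while nodes 3 and 4 are still open (compare with "Step 2").
print(partition_open_nodes(parent, open_nodes=[3, 4], focus=5))
```

Under these assumptions the second call puts node 3 in the leaves set and node 4 in the siblings set, which matches the one leaf and one sibling reported at Step 2 and explains why the leaves counter lags behind the number of open nodes.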

The reason for SCIP to use this partitioning to store the open nodes is I think related to diving. Indeed, it is much more efficient to process immediate siblings or children of the currently focused node next, as a lot of the data structures can be reused efficiently. Hence, node selection rules usually consider child and sibling nodes differently from the rest of the leaf nodes.

A final remark: some open nodes sometimes are just discarded by SCIP, and do not appear in any particular counter. This might happen when SCIP finds a new best primal solution, which immediately improves the current global upper bound. Any open node whose value estimate (basically the LP solution value of the parent node) falls above the new global upper bound can be safely pruned by SCIP. Those nodes could be considered objective limit nodes, but are not registered as such by SCIP since they have not been processed. The only way that I am aware of to account for those nodes, assuming that SCIP did not do a restart, is with the following formula:

<pruned nodes> = scip->stat->ncreatednodesrun - SCIPgetNNodes() - SCIPgetNNodesLeft()
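
As a worked instance of this formula, the snippet below plugs in numbers for the final state printed in the question. The processed count (6) and the open count (0) come from that output; the created count (9) is an assumption, namely the root plus two children for each of the four branching steps, and was not actually read from SCIP. The function name is made up for this example.

```python
def count_discarded_nodes(created, processed, left):
    """Nodes that were created but neither processed nor still open."""
    return created - processed - left


# Assumed: 1 root + 4 branching steps * 2 children = 9 created nodes.
created = 1 + 4 * 2
processed = 6   # "num processed nodes in current run" in the final state
left = 0        # no open nodes remain in the final state

print(count_discarded_nodes(created, processed, left))
```

If the assumption about the created count holds, three open nodes were discarded without ever being selected, which would be consistent with the solve finishing right after the first primal solution was found.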

If you unfold this process, you'll realize that in-between two branching calls there is usually only one additional processed node, as well as one additional open node (two children get in, one selected node gets out). But in some cases SCIP does not branch (infeasible node or objective limit node), which results in 2 or more additional processed nodes, and no change or a decrease in the number of open nodes. This is supposed to happen, and is actually required for SCIP to prove optimality or infeasibility (when no open node is left).
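
This can be checked against the output posted in the question. The short sketch below takes the cumulative "num processed nodes" values printed at Steps 0 to 3 and in the final state, and computes the number of nodes processed between consecutive states, which is how the documentation quoted in the question describes NNodes. The variable names are only for illustration.

```python
# Cumulative processed-node counts copied from the printed output:
# Step 0, Step 1, Step 2, Step 3, and the final "Done." state.
processed_counts = [1, 2, 3, 4, 6]

# Nodes processed since the previous state.
nodes_since_previous = [
    current - previous
    for previous, current in zip(processed_counts, processed_counts[1:])
]

print(nodes_since_previous)
```

The first three transitions each add a single processed node (the newly selected child), while the last transition adds two, because SCIP processed more than one node without calling the brancher again before the solve ended.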

Hope that helps clarify things.

@ambros-gleixner, @antoniach or @CGraczyk, please feel free to chime in and correct me if there is any mistake above.

Best,
Maxime

cwfparsonson Jul 23, 2021
Author

Thank you very much for the detailed answer Maxime!
