Reinforcement Learning

Neural Networks

SeaPearl.CPNNType
CPNN(;
    graphChain::Flux.Chain = Flux.Chain()
    nodeChain::Flux.Chain = Flux.Chain()
    globalChain::Flux.Chain = Flux.Chain()
    outputChain::Union{Flux.Dense, Flux.Chain} = Flux.Chain()
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it. The CPNN pipeline uses the features of the variable branched on (variableFeatures) and, if specified, the global features of the graph (globalFeatures). Contrary to FullFeaturedCPNN, it does not use the features of the possible values to make predictions. This pipeline is made of 4 networks:

  • graphChain: a graph convolutional network (GCN). It takes the original featured graph (states.fg) as an input,
  • nodeChain: a fully connected neural network (FCNN). It takes the features of the variable to branch on as an input,
  • globalChain: FCNN. It takes the global features of the graph as an input,
  • outputChain: FCNN. It takes the concatenation of the outputs of nodeChain and globalChain as an input.
source
SeaPearl.FullFeaturedCPNNType
FullFeaturedCPNN(;
    graphChain::Flux.Chain
    nodeChain::Flux.Chain
    globalChain::Flux.Chain
    outputChain::Flux.Dense
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.

This pipeline is made of 4 networks:

  • graphChain: a graph convolutional network (GCN). It takes the original featured graph (states.fg) as an input,
  • nodeChain: a fully connected neural network (FCNN). It is used both on variableFeatures and valueFeatures.
  • globalChain: FCNN. It takes the global features of the graph as an input,
  • outputChain: FCNN. It takes the concatenation of the outputs of nodeChain and globalChain as an input.

Like the CPNN pipeline, the FullFeaturedCPNN pipeline uses the features (variableFeatures) of the variable branched on and, if specified, the global features of the graph (globalFeatures). But contrary to CPNN, it also uses the features (valueFeatures) of the possible values that the variable can be assigned to.

FullFeaturedCPNN generates a Q-table containing the Q-value of all the values, both the possible and impossible ones. A mask then selects the possible values.

source
SeaPearl.HeterogeneousCPNNType
HeterogeneousCPNN(;
    graphChain::Flux.Chain = Flux.Chain()
    nodeChain::Flux.Chain = Flux.Chain()
    globalChain::Flux.Chain = Flux.Chain()
    outputChain::Union{Flux.Dense, Flux.Chain} = Flux.Chain()
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.

source
SeaPearl.HeterogeneousFullFeaturedCPNNType
HeterogeneousFullFeaturedCPNN(;
    graphChain::Flux.Chain
    varChain::Flux.Chain = Flux.Chain()
    valChain::Flux.Chain = Flux.Chain()
    globalChain::Flux.Chain
    outputChain::Flux.Dense
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it. This CPNN is designed to process HeterogeneousFeaturedGraphs.

This pipeline works in the following way :

  1. We apply a GNN on the input featured graph.
  2. We extract the contextualized-feature of the branching variable and pass it trought varChain
  3. We extract the contextualized-feature of each value that is part of the variable's domain of definition and pass it trought valChain.
  4. We concat the two reshaped previous results (optionnaly with a global feature) and pass it trought outputChain to generate the output Q-vector.
source
SeaPearl.HeterogeneousVariableOutputCPNNType
HeterogeneousVariableOutputCPNN(;
    graphChain::Flux.Chain
    nodeChain::Flux.Chain
    outputChain::Flux.Dense
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.

source
SeaPearl.VariableOutputCPNNType
VariableOutputCPNN(;
    graphChain::Flux.Chain
    nodeChain::Flux.Chain
    outputChain::Flux.Dense
)

This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.

source

State Representation

SeaPearl.DefaultStateRepresentationType
DefaultStateRepresentation{F, TS}

This is the default representation used by SeaPearl unless the user define his own.

It consists in a tripartite graph representation of the CP Model, with features associated with each node and an index specifying the variable that should be branched on.

Fields:

  • cplayergraph: representation of the problem as a tripartite graph.
  • nodeFeatures: Feature matrix of the nodes. Each column corresponds to a node.
  • globalFeatures: Feature vector of the entire graph.
  • variableIdx: Index of the variable we are currently considering.
  • allValuesIdx: Index of value nodes in cplayergraph.
  • valueToPos: Dictionary mapping the value of an action to its position in the one-hot encoding of the value.

The boolean corresponds to the fact that the feature is used or not, the integer corresponds to the position of the feature in the vector.

  • chosenFeatures: Dictionary of featurization options. The boolean corresponds to whether the feature is active or not,

the integer corresponds to the position of the feature in the vector. See below for details about the options.

  • constraintTypeToId: Dictionary mapping the type of a constraint to its position in the one-hot encoding of the constraint type.

The chosenFeatures dictionary specify a boolean -notifying whether the features are active or not - and a position for the following features:

  • "constraint_activity": whether or not the constraint is active for constraint nodes
  • "constraint_type": a one-hot encoding of the constraint type for constraint nodes
  • "nbinvolvedconstraint_propagation": the number of times the constraint has been put in the fixPoint call stack for constraint nodes.
  • "nbnotbounded_variable": the number of non-bound variable involve in the constraint for constraint nodes
  • "variabledomainsize": the current size of the domain of the variable for variable nodes
  • "variableinitialdomain_size": the initial size of the domain of the variable for variable nodes
  • "variableisbound": whether or not the variable is bound for variable nodes
  • "values_onehot": a one-hot encoding of the value for value nodes
  • "values_raw": the raw value of the value for value nodes
source
SeaPearl.DefaultTrajectoryStateType
DefaultTrajectoryState

The most basic state representation, with a featured graph, the index of the variable to branch on and optionnaly a list of possible values.

fg::FeaturedGraph                                   Associated tri-partite graph containing features on each node.
variableIdx::Int                                    Index of the node in the feature graph that represents the choosen variable. 
allValuesIdx::Union{Nothing, AbstractVector{Int}}   Indexes of the nodes in the feature graph that reprensents values
possibleValuesIdx::Union{Nothing, Vector{Int64}}    Indexes of the nodes in the feature graph that reprensents values in the domain of definition of the variable.
source
SeaPearl.BatchedDefaultTrajectoryStateType
BatchedDefaultTrajectoryState

The batched version of the DefaultTrajectoryState.

It contains all the information that would be stored in a FeaturedGraph but reorganised to enable simultaneous computation on a few graphs.

source
SeaPearl.HeterogeneousStateRepresentationType
HeterogeneousStateRepresentation{F, TS}

Similar to the DefaultStateRepresentation, except that the feature matrices are specific to each type of node (constraint, variable and value).

It consists in a tripartite graph representation of the CP Model, with features associated with each node and an index specifying the variable that should be branched on.

Fields:

  • cplayergraph: representation of the problem as a tripartite graph.
  • constraintNodeFeatures: Feature matrix of the constraint nodes. Each column corresponds to a node.
  • variableNodeFeatures: Feature matrix of the variable nodes. Each column corresponds to a node.
  • valueNodeFeatures: Feature matrix of the value nodes. Each column corresponds to a node.
  • globalFeatures: Feature vector of the entire graph.
  • variableIdx: Index of the variable we are currently considering.
  • allValuesIdx: Index of value nodes in cplayergraph.
  • valueToPos: Dictionary mapping the value of an action to its position in the one-hot encoding of the value.

The boolean corresponds to the fact that the feature is used or not, the integer corresponds to the position of the feature in the vector.

  • chosenFeatures: Dictionary of featurization options. The boolean corresponds to whether the feature is active or not,

the integer corresponds to the position of the feature in the vector. See below for details about the options.

  • constraintTypeToId: Dictionary mapping the type of a constraint to its position in the one-hot encoding of the constraint type.

The chosenFeatures dictionary specify a boolean -notifying whether the features are active or not - and a position for the following features:

  • "constraint_activity": whether or not the constraint is active for constraint nodes
  • "constraint_type": a one-hot encoding of the constraint type for constraint nodes
  • "nbinvolvedconstraint_propagation": the number of times the constraint has been put in the fixPoint call stack for constraint nodes.
  • "nbnotbounded_variable": the number of non-bound variable involve in the constraint for constraint nodes
  • "variabledomainsize": the current size of the domain of the variable for variable nodes
  • "variableinitialdomain_size": the initial size of the domain of the variable for variable nodes
  • "variableisbound": whether or not the variable is bound for variable nodes
  • "values_onehot": a one-hot encoding of the value for value nodes
  • "values_raw": the raw value of the value for value nodes
source
SeaPearl.HeterogeneousTrajectoryStateType
HeterogeneousTrajectoryState

The most basic state representation, with a featured graph, the index of the variable to branch on and optionnaly a list of possible values.

source
SeaPearl.MISStateRepresentationType
MISStateRepresentation{F, TS}

This is a standard MIS representation using the graph of the problem with nodes having their assigned value as label.

source
SeaPearl.TsptwStateRepresentationType
TsptwStateRepresentation{F, TS}

This is the Tsptw representation used by Quentin Cappart in Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization (https://arxiv.org/pdf/2006.01610.pdf).

source
SeaPearl.AbstractTrajectoryStateType
AbstractTrajectoryState

The TrajectoryState is a component of the AbstractStateRepresentation specifying how to store this representation in the trajectory of the RL agent for learning.

It might be necessary to implement your own when defining a new state representation or when another data formatting is necessary for learning. To implement a subtype of AbstractTrajectoryState one must provide:

  • a struct subtype of TabularTrajectoryState or NonTabularTrajectoryState depending on you usage.
  • a constructor with signature YourTrajectoryState(sr::YourStateRepresentation).
  • a function Flux.functor(::Type{YourTrajectoryState}, ts) (see Flux.functor for further information).
  • an NN model taking a YourTrajectoryState instance as an input.
source
SeaPearl.TabularTrajectoryStateType
TabularTrajectoryState

Abstract subtype of AbstractTrajectoryState for array based representations of state.

This type isn't currently used, but it will certainly be a lot clearer if one can easily distinguish between array based representations, and more exotic ones (e.g. graphs, named tuples...).

source
SeaPearl.NonTabularTrajectoryStateType
NonTabularTrajectoryState

Abstract subtype of AbstractTrajectoryState for non array based representations of state.

This type is th only one currently used, but it will certainly be a lot clearer if one can easily distinguish between array based representations, and more exotic ones (e.g. graphs, named tuples...).

source
SeaPearl.AbstractStateRepresentationType
AbstractStateRepresentation{TS}

The AbstractStateRepresentation is the abstract type of the structures representing the internal state of the CPModel and eventually the state of the search in a way that will be as expressive as possible.

It requires a specific AbstractTrajectoryState to make the bridge between the internal state representation and one the RL agent can use to decide of the value to be assigned when branching.

A user can use the DefaultStateRepresentation with the DefaultTrajectoryState provided by the package but he has the possibility to define his own.

To define a new one, the user must provide:

  • a new structure, subtype of AbstractStateRepresentation with the appropriate subtype of AbstractTrajectoryState and all its dependencies.
  • a constructor from a CPModel, with keyword argument action_space.
  • a function update_representation!(::AbstractStateRepresentation, ::CPModel, ::AbstractIntVar) to update the state representation at each step.

Look at the DefaultStateRepresentation to get inspired.

source
SeaPearl.AbstractFeaturizationType
AbstractFeaturization

Every subtype of FeaturizedStateRepresentation{F} requires an AbstractFeaturization.

This type gives the possibility to characterise these feature based representations, and thus gives the ability to easily define new ones. To implement your own featurization, one must provide:

  • a struct subtype of AbstractFeaturization (no field is used by the DefaultStateRepresentation).
  • a function featurize(::FeaturizedStateRepresentation{YourFeaturization, TS}) where TS returning a feature Matrix with features stored columnwise.
  • a function feature_length(::Type{<:FeaturizedStateRepresentation{YourFeaturization, TS} where TS} returning the size of a feature vector.
  • optionally a function update_features!(::FeaturizedStateRepresentation{YourFeaturization, TS}, ::CPModel) where TS to update your feature vectors based on statistics gathered by the CP Model.
source
SeaPearl.FeaturizedStateRepresentationType
FeaturizedStateRepresentation{F, TS}

Abstract subtype of AbstractStateRepresentation specializing the type of features used in the representation.

When a user wants to try a new featurization with the same organisation of the featurized elements, instead of having to completely redefine a new type of AbstractStateRepresentation, he can keep the same and just use a new AbstractFeaturization.

source
SeaPearl.trajectoryStateFunction
trajectoryState(sr::AbstractStateRepresentation{TS})

Return a TrajectoryState based on the present state represented by sr.

The type of the returned object is defined by the TS parametric type defined in sr.

source
SeaPearl.featurizeFunction
featurize(sr::DefaultStateRepresentation{DefaultFeaturization, TS})

Create features for every node of the graph. Can be overwritten for a completely custom featurization.

Default behavior consists in a 3D One-hot vector that encodes whether the node represents a Constraint, a Variable or a Value.

It is also possible to pass a chosen_features dictionary allowing to choose among some non mandatory features. It will be used in initChosenFeatures! to initialize sr.chosenFeatures. See DefaultStateRepresentation for a list of possible options. It is only necessary to specify the options you wish to activate.

source
featurize(sr::HeterogeneousStateRepresentation{DefaultFeaturization, TS})

Create features for every node of the graph. Can be overwritten for a completely custom featurization.

Default behavior consists in a 3D One-hot vector that encodes whether the node represents a Constraint, a Variable or a Value.

It is also possible to pass a chosen_features dictionary allowing to choose among some non mandatory features. It will be used in initChosenFeatures! to initialize sr.chosenFeatures. See HeterogeneousStateRepresentation for a list of possible options. It is only necessary to specify the options you wish to activate.

source
function featurize(sr::TsptwStateRepresentation{TsptwFeaturization})

Create nodeFeatures for every node of the graph. Supposed to be overwritten. Tsptw behavior is to call Tsptw_featurize.

source
function featurize(sr::GraphColoringStateRepresentation{GraphColoringFeaturization})

Create nodeFeatures for every node of the graph (current color of the node or -1 if the color has not been determined yet)

source
function featurize(sr::MISStateRepresentation{MISFeaturization})

Create nodeFeatures for every node of the graph (current color of the node or -1 if the color has not been determined yet)

source