Reinforcement Learning
Neural Networks
SeaPearl.CPNN
— TypeCPNN(;
graphChain::Flux.Chain = Flux.Chain()
nodeChain::Flux.Chain = Flux.Chain()
globalChain::Flux.Chain = Flux.Chain()
outputChain::Union{Flux.Dense, Flux.Chain} = Flux.Chain()
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it. The CPNN pipeline uses the features of the variable branched on (variableFeatures
) and, if specified, the global features of the graph (globalFeatures
). Contrary to FullFeaturedCPNN
, it does not use the features of the possible values to make predictions. This pipeline is made of 4 networks:
graphChain
: a graph convolutional network (GCN). It takes the original featured graph (states.fg
) as an input,nodeChain
: a fully connected neural network (FCNN). It takes the features of the variable to branch on as an input,globalChain
: FCNN. It takes the global features of the graph as an input,outputChain
: FCNN. It takes the concatenation of the outputs ofnodeChain
andglobalChain
as an input.
SeaPearl.FullFeaturedCPNN
— TypeFullFeaturedCPNN(;
graphChain::Flux.Chain
nodeChain::Flux.Chain
globalChain::Flux.Chain
outputChain::Flux.Dense
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.
This pipeline is made of 4 networks:
graphChain
: a graph convolutional network (GCN). It takes the original featured graph (states.fg
) as an input,nodeChain
: a fully connected neural network (FCNN). It is used both on variableFeatures and valueFeatures.globalChain
: FCNN. It takes the global features of the graph as an input,outputChain
: FCNN. It takes the concatenation of the outputs ofnodeChain
andglobalChain
as an input.
Like the CPNN
pipeline, the FullFeaturedCPNN
pipeline uses the features (variableFeatures
) of the variable branched on and, if specified, the global features of the graph (globalFeatures
). But contrary to CPNN
, it also uses the features (valueFeatures
) of the possible values that the variable can be assigned to.
FullFeaturedCPNN
generates a Q-table containing the Q-value of all the values, both the possible and impossible ones. A mask then selects the possible values.
SeaPearl.HeterogeneousCPNN
— TypeHeterogeneousCPNN(;
graphChain::Flux.Chain = Flux.Chain()
nodeChain::Flux.Chain = Flux.Chain()
globalChain::Flux.Chain = Flux.Chain()
outputChain::Union{Flux.Dense, Flux.Chain} = Flux.Chain()
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.
SeaPearl.HeterogeneousFullFeaturedCPNN
— TypeHeterogeneousFullFeaturedCPNN(;
graphChain::Flux.Chain
varChain::Flux.Chain = Flux.Chain()
valChain::Flux.Chain = Flux.Chain()
globalChain::Flux.Chain
outputChain::Flux.Dense
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it. This CPNN is designed to process HeterogeneousFeaturedGraphs.
This pipeline works in the following way :
- We apply a GNN on the input featured graph.
- We extract the contextualized-feature of the branching variable and pass it trought varChain
- We extract the contextualized-feature of each value that is part of the variable's domain of definition and pass it trought valChain.
- We concat the two reshaped previous results (optionnaly with a global feature) and pass it trought outputChain to generate the output Q-vector.
SeaPearl.HeterogeneousVariableOutputCPNN
— TypeHeterogeneousVariableOutputCPNN(;
graphChain::Flux.Chain
nodeChain::Flux.Chain
outputChain::Flux.Dense
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.
SeaPearl.VariableOutputCPNN
— TypeVariableOutputCPNN(;
graphChain::Flux.Chain
nodeChain::Flux.Chain
outputChain::Flux.Dense
)
This structure is here to provide a flexible way to create a nn model which respect this approach: Making modification on the graph, then extract one node feature and modify it.
State Representation
SeaPearl.DefaultStateRepresentation
— TypeDefaultStateRepresentation{F, TS}
This is the default representation used by SeaPearl unless the user define his own.
It consists in a tripartite graph representation of the CP Model, with features associated with each node and an index specifying the variable that should be branched on.
Fields:
cplayergraph
: representation of the problem as a tripartite graph.nodeFeatures
: Feature matrix of the nodes. Each column corresponds to a node.globalFeatures
: Feature vector of the entire graph.variableIdx
: Index of the variable we are currently considering.allValuesIdx
: Index of value nodes incplayergraph
.valueToPos
: Dictionary mapping the value of an action to its position in the one-hot encoding of the value.
The boolean corresponds to the fact that the feature is used or not, the integer corresponds to the position of the feature in the vector.
chosenFeatures
: Dictionary of featurization options. The boolean corresponds to whether the feature is active or not,
the integer corresponds to the position of the feature in the vector. See below for details about the options.
constraintTypeToId
: Dictionary mapping the type of a constraint to its position in the one-hot encoding of the constraint type.
The chosenFeatures
dictionary specify a boolean -notifying whether the features are active or not - and a position for the following features:
- "constraint_activity": whether or not the constraint is active for constraint nodes
- "constraint_type": a one-hot encoding of the constraint type for constraint nodes
- "nbinvolvedconstraint_propagation": the number of times the constraint has been put in the fixPoint call stack for constraint nodes.
- "nbnotbounded_variable": the number of non-bound variable involve in the constraint for constraint nodes
- "variabledomainsize": the current size of the domain of the variable for variable nodes
- "variableinitialdomain_size": the initial size of the domain of the variable for variable nodes
- "variableisbound": whether or not the variable is bound for variable nodes
- "values_onehot": a one-hot encoding of the value for value nodes
- "values_raw": the raw value of the value for value nodes
SeaPearl.DefaultTrajectoryState
— TypeDefaultTrajectoryState
The most basic state representation, with a featured graph, the index of the variable to branch on and optionnaly a list of possible values.
fg::FeaturedGraph Associated tri-partite graph containing features on each node.
variableIdx::Int Index of the node in the feature graph that represents the choosen variable.
allValuesIdx::Union{Nothing, AbstractVector{Int}} Indexes of the nodes in the feature graph that reprensents values
possibleValuesIdx::Union{Nothing, Vector{Int64}} Indexes of the nodes in the feature graph that reprensents values in the domain of definition of the variable.
SeaPearl.BatchedDefaultTrajectoryState
— TypeBatchedDefaultTrajectoryState
The batched version of the DefaultTrajectoryState
.
It contains all the information that would be stored in a FeaturedGraph
but reorganised to enable simultaneous computation on a few graphs.
SeaPearl.HeterogeneousStateRepresentation
— TypeHeterogeneousStateRepresentation{F, TS}
Similar to the DefaultStateRepresentation
, except that the feature matrices are specific to each type of node (constraint, variable and value).
It consists in a tripartite graph representation of the CP Model, with features associated with each node and an index specifying the variable that should be branched on.
Fields:
cplayergraph
: representation of the problem as a tripartite graph.constraintNodeFeatures
: Feature matrix of the constraint nodes. Each column corresponds to a node.variableNodeFeatures
: Feature matrix of the variable nodes. Each column corresponds to a node.valueNodeFeatures
: Feature matrix of the value nodes. Each column corresponds to a node.globalFeatures
: Feature vector of the entire graph.variableIdx
: Index of the variable we are currently considering.allValuesIdx
: Index of value nodes incplayergraph
.valueToPos
: Dictionary mapping the value of an action to its position in the one-hot encoding of the value.
The boolean corresponds to the fact that the feature is used or not, the integer corresponds to the position of the feature in the vector.
chosenFeatures
: Dictionary of featurization options. The boolean corresponds to whether the feature is active or not,
the integer corresponds to the position of the feature in the vector. See below for details about the options.
constraintTypeToId
: Dictionary mapping the type of a constraint to its position in the one-hot encoding of the constraint type.
The chosenFeatures
dictionary specify a boolean -notifying whether the features are active or not - and a position for the following features:
- "constraint_activity": whether or not the constraint is active for constraint nodes
- "constraint_type": a one-hot encoding of the constraint type for constraint nodes
- "nbinvolvedconstraint_propagation": the number of times the constraint has been put in the fixPoint call stack for constraint nodes.
- "nbnotbounded_variable": the number of non-bound variable involve in the constraint for constraint nodes
- "variabledomainsize": the current size of the domain of the variable for variable nodes
- "variableinitialdomain_size": the initial size of the domain of the variable for variable nodes
- "variableisbound": whether or not the variable is bound for variable nodes
- "values_onehot": a one-hot encoding of the value for value nodes
- "values_raw": the raw value of the value for value nodes
SeaPearl.HeterogeneousTrajectoryState
— TypeHeterogeneousTrajectoryState
The most basic state representation, with a featured graph, the index of the variable to branch on and optionnaly a list of possible values.
SeaPearl.GraphColoringStateRepresentation
— TypeGraphColoringStateRepresentation{F, TS}
This is a standard graphcoloring representation using the graph of the problem with nodes having their assigned value as label.
SeaPearl.MISStateRepresentation
— TypeMISStateRepresentation{F, TS}
This is a standard MIS representation using the graph of the problem with nodes having their assigned value as label.
SeaPearl.TsptwFeaturization
— TypeTsptwFeaturization <: AbstractFeaturization Custom featurization for the TSPTW problem
SeaPearl.TsptwStateRepresentation
— TypeTsptwStateRepresentation{F, TS}
This is the Tsptw representation used by Quentin Cappart in Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization (https://arxiv.org/pdf/2006.01610.pdf).
SeaPearl.AbstractTrajectoryState
— TypeAbstractTrajectoryState
The TrajectoryState
is a component of the AbstractStateRepresentation
specifying how to store this representation in the trajectory
of the RL agent for learning.
It might be necessary to implement your own when defining a new state representation or when another data formatting is necessary for learning. To implement a subtype of AbstractTrajectoryState
one must provide:
- a struct subtype of
TabularTrajectoryState
orNonTabularTrajectoryState
depending on you usage. - a constructor with signature
YourTrajectoryState(sr::YourStateRepresentation)
. - a function
Flux.functor(::Type{YourTrajectoryState}, ts)
(see Flux.functor for further information). - an NN model taking a
YourTrajectoryState
instance as an input.
SeaPearl.TabularTrajectoryState
— TypeTabularTrajectoryState
Abstract subtype of AbstractTrajectoryState
for array based representations of state.
This type isn't currently used, but it will certainly be a lot clearer if one can easily distinguish between array based representations, and more exotic ones (e.g. graphs, named tuples...).
SeaPearl.NonTabularTrajectoryState
— TypeNonTabularTrajectoryState
Abstract subtype of AbstractTrajectoryState
for non array based representations of state.
This type is th only one currently used, but it will certainly be a lot clearer if one can easily distinguish between array based representations, and more exotic ones (e.g. graphs, named tuples...).
SeaPearl.AbstractStateRepresentation
— TypeAbstractStateRepresentation{TS}
The AbstractStateRepresentation is the abstract type of the structures representing the internal state of the CPModel and eventually the state of the search in a way that will be as expressive as possible.
It requires a specific AbstractTrajectoryState
to make the bridge between the internal state representation and one the RL agent can use to decide of the value to be assigned when branching.
A user can use the DefaultStateRepresentation
with the DefaultTrajectoryState
provided by the package but he has the possibility to define his own.
To define a new one, the user must provide:
- a new structure, subtype of
AbstractStateRepresentation
with the appropriate subtype ofAbstractTrajectoryState
and all its dependencies. - a constructor from a
CPModel
, with keyword argumentaction_space
. - a function
update_representation!(::AbstractStateRepresentation, ::CPModel, ::AbstractIntVar)
to update the state representation at each step.
Look at the DefaultStateRepresentation to get inspired.
SeaPearl.AbstractFeaturization
— TypeAbstractFeaturization
Every subtype of FeaturizedStateRepresentation{F}
requires an AbstractFeaturization
.
This type gives the possibility to characterise these feature based representations, and thus gives the ability to easily define new ones. To implement your own featurization, one must provide:
- a struct subtype of
AbstractFeaturization
(no field is used by theDefaultStateRepresentation
). - a function
featurize(::FeaturizedStateRepresentation{YourFeaturization, TS}) where TS
returning a feature Matrix with features stored columnwise. - a function
feature_length(::Type{<:FeaturizedStateRepresentation{YourFeaturization, TS} where TS}
returning the size of a feature vector. - optionally a function
update_features!(::FeaturizedStateRepresentation{YourFeaturization, TS}, ::CPModel) where TS
to update your feature vectors based on statistics gathered by the CP Model.
SeaPearl.FeaturizedStateRepresentation
— TypeFeaturizedStateRepresentation{F, TS}
Abstract subtype of AbstractStateRepresentation
specializing the type of features used in the representation.
When a user wants to try a new featurization with the same organisation of the featurized elements, instead of having to completely redefine a new type of AbstractStateRepresentation, he can keep the same and just use a new AbstractFeaturization
.
SeaPearl.trajectoryState
— FunctiontrajectoryState(sr::AbstractStateRepresentation{TS})
Return a TrajectoryState based on the present state represented by sr
.
The type of the returned object is defined by the TS
parametric type defined in sr
.
SeaPearl.featurize
— Functionfeaturize(sr::DefaultStateRepresentation{DefaultFeaturization, TS})
Create features for every node of the graph. Can be overwritten for a completely custom featurization.
Default behavior consists in a 3D One-hot vector that encodes whether the node represents a Constraint, a Variable or a Value.
It is also possible to pass a chosen_features
dictionary allowing to choose among some non mandatory features. It will be used in initChosenFeatures!
to initialize sr.chosenFeatures
. See DefaultStateRepresentation
for a list of possible options. It is only necessary to specify the options you wish to activate.
featurize(sr::HeterogeneousStateRepresentation{DefaultFeaturization, TS})
Create features for every node of the graph. Can be overwritten for a completely custom featurization.
Default behavior consists in a 3D One-hot vector that encodes whether the node represents a Constraint, a Variable or a Value.
It is also possible to pass a chosen_features
dictionary allowing to choose among some non mandatory features. It will be used in initChosenFeatures!
to initialize sr.chosenFeatures
. See HeterogeneousStateRepresentation
for a list of possible options. It is only necessary to specify the options you wish to activate.
function featurize(sr::TsptwStateRepresentation{TsptwFeaturization})
Create nodeFeatures for every node of the graph. Supposed to be overwritten. Tsptw behavior is to call Tsptw_featurize
.
function featurize(sr::GraphColoringStateRepresentation{GraphColoringFeaturization})
Create nodeFeatures for every node of the graph (current color of the node or -1 if the color has not been determined yet)
function featurize(sr::MISStateRepresentation{MISFeaturization})
Create nodeFeatures for every node of the graph (current color of the node or -1 if the color has not been determined yet)