The core of a constraint programming solver and the relationship with mixed integer programming

There are many different ways to define and solve optimization problems. You can, for example, use greedy algorithms, constraint programming, mixed integer programming, genetic algorithms or local search. In this post, we are diving into constraint programming. As an example, the graph coloring problem is used to illustrate how constraint programming works.


If you need an introduction to optimization problems and search heuristics, you can read the post below.


Mathematical Optimization Heuristics Every Data Scientist Should Know: Local search, genetic algorithms and more (towardsdatascience.com)


Graph Coloring

Let’s start with the graph coloring problem. This problem is used throughout the post to illustrate concepts of constraint programming.


For a given map, you want to color every country. You have an unlimited number of colors. It is not allowed to give adjacent countries the same color. What’s the smallest number of colors you need to fill the map?


Each country is a variable, and its value is the color you give to it. The constraint is that it’s not allowed to give adjacent countries the same color. The objective is to minimize the number of colors used.


Sounds easy? In practice it can be hard! Here is a part of the solution for Africa:

Part of the graph coloring solution for the map of Africa.

Another way to visualize this problem is by using vertices and edges. Adjacent countries are connected with an edge. The vertices correspond with the countries. This is the previous example illustrated in this way:

From the map to vertices and edges. This makes it easier to detect mistakes and see adjacent countries. Image by author.
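To make the vertices-and-edges view concrete, here is a minimal Python sketch. It is not part of the original post: the four-country map (A, B, C, D) and the function names are made up for illustration.

```python
# Hypothetical toy map: four countries and their shared borders.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

def build_adjacency(edges):
    """Turn a list of borders into a vertex -> set-of-neighbors mapping."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def is_feasible(coloring, edges):
    """A coloring is feasible when no edge joins two equally colored vertices."""
    return all(coloring[u] != coloring[v] for u, v in edges)

adjacency = build_adjacency(EDGES)
coloring = {"A": "red", "B": "green", "C": "blue", "D": "red"}
print(is_feasible(coloring, EDGES))  # True
print(len(set(coloring.values())))   # 3
```

The objective from the problem statement is simply the number of distinct values in `coloring`.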

What is Constraint Programming?

The key idea of constraint programming (CP) is that it uses constraints to reduce the set of values that each variable can take. In CP, the program (or solver) keeps track of the values that can still appear. After every move, the search space is pruned: the values that can’t occur anymore are removed. It can happen that there are no more possible moves while the solver hasn’t found a feasible solution yet. In that case, the solver goes back to an earlier point where it made a decision and reconsiders it. This is called backtracking.
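The decide, prune and reconsider loop can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (a hypothetical four-country map, a function called `solve`, and domains copied at every decision so that undoing a decision is trivial), not how a production solver is implemented.

```python
def solve(adjacency, domains, assignment=None):
    """Return a feasible coloring as a dict, or None if there is none."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(adjacency):
        return assignment
    country = next(v for v in adjacency if v not in assignment)
    for color in sorted(domains[country]):
        # Decision: work on a copy so the decision can be reconsidered later.
        pruned = {v: set(values) for v, values in domains.items()}
        pruned[country] = {color}
        # Pruning: neighbors can no longer take this color.
        for neighbor in adjacency[country]:
            pruned[neighbor].discard(color)
        # Feasibility check: an empty domain means this decision fails.
        if any(not pruned[v] for v in adjacency):
            continue
        result = solve(adjacency, pruned, {**assignment, country: color})
        if result is not None:
            return result
        # No solution below this decision: try the next color (backtracking).
    return None

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
three = {v: {"red", "green", "blue"} for v in adjacency}
two = {v: {"red", "green"} for v in adjacency}
print(solve(adjacency, three) is not None)  # True
print(solve(adjacency, two))                # None
```

With two colors, the triangle A, B, C makes every branch fail, so the solver exhausts its decisions and reports that no feasible solution exists.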


Basics

The focus of CP is on feasibility, instead of optimality. After a decision, the solver checks for feasibility and prunes the search space. In the image below you can see a basic CP solver. The main parts are the search and the constraint store, which I will explain in detail later on. In short: search is where the decisions are made, and the constraint store holds the constraints together with the domain store, which contains all values each variable can still take. A move is made (e.g. you color a country red), and the domain store is pruned based on that move. There is also the feasibility check: when a move is not possible because it violates one or more constraints, it needs to be rolled back (failure).

Constraint programming. Image by author.

In the example below, you can see that when you color a country red, red has to be removed from the domains of the connected countries, because those moves are not possible anymore.

Pruning: the two adjacent countries can’t be colored red anymore.
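A single pruning step could look like the sketch below. The countries X, Y, Z and W and the function name `assign_and_prune` are hypothetical, chosen only to illustrate the idea.

```python
def assign_and_prune(domains, adjacency, country, color):
    """Fix `country` to `color` and remove `color` from its neighbors' domains."""
    new_domains = {v: set(values) for v, values in domains.items()}
    new_domains[country] = {color}
    for neighbor in adjacency[country]:
        new_domains[neighbor].discard(color)
    return new_domains

palette = {"red", "green", "blue"}
# X borders Y and Z; W is not connected to X.
adjacency = {"X": {"Y", "Z"}, "Y": {"X"}, "Z": {"X"}, "W": set()}
domains = {country: set(palette) for country in adjacency}

after = assign_and_prune(domains, adjacency, "X", "red")
for country in sorted(after):
    print(country, sorted(after[country]))
# W ['blue', 'green', 'red']
# X ['red']
# Y ['blue', 'green']
# Z ['blue', 'green']
```

Only the neighbors of X lose red. W is untouched because no constraint links it to X.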

You might wonder: how can constraint programming find an optimal solution? Especially in the graph coloring problem, it’s easy to find a feasible solution. The easiest one is to give a different color to every country. That is a feasible solution, but it is far from optimal. There are different ways to solve this. One example is to keep solving the problem repeatedly, each time adding a constraint that the new solution must use fewer colors than the previous one.
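Here is a small sketch of that idea, assuming a plain backtracking feasibility search (`find_coloring`) and a hypothetical four-country map. The names are mine and the approach is deliberately naive: real solvers reuse information between the runs.

```python
def find_coloring(adjacency, k):
    """Backtracking search for a coloring with colors 0..k-1, or None."""
    order = list(adjacency)
    coloring = {}

    def extend(i):
        if i == len(order):
            return True
        country = order[i]
        for color in range(k):
            if all(coloring.get(n) != color for n in adjacency[country]):
                coloring[country] = color
                if extend(i + 1):
                    return True
                del coloring[country]
        return False

    return dict(coloring) if extend(0) else None

def minimize_colors(adjacency):
    """Start from the trivial solution and keep asking for fewer colors."""
    best = {v: i for i, v in enumerate(adjacency)}  # one color per country
    while len(set(best.values())) > 1:
        improved = find_coloring(adjacency, len(set(best.values())) - 1)
        if improved is None:
            break  # no solution with fewer colors: `best` is optimal
        best = improved
    return best

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
best = minimize_colors(adjacency)
print(len(set(best.values())))  # 3
```

The last run, the one that fails, is what proves optimality: it shows that no feasible solution with fewer colors exists.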


Search

For searching, there are some nice rules you can follow that can drastically improve the search. The first one is the first-fail principle. This means that you first make the decision where you are most likely to fail. This makes things easier at the end and reduces the search tree the most. In graph coloring, it’s better to start with a country that’s connected to many other countries than with a country that doesn’t have many connections:


First-fail principle: start with the country with the highest number of connections.

This helps, because it’s harder to give this country a color if you already colored the surrounding countries. Chances are high that when you don’t start with this country you have to give it a new color, which is something you want to avoid.
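In code, this ordering is a one-liner. The sketch below assumes the map is stored as a dictionary from country to neighbors; the map itself and the function name are made up for illustration.

```python
def first_fail_order(adjacency):
    """Order countries from most to fewest connections."""
    return sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(first_fail_order(adjacency))  # ['C', 'A', 'B', 'D']
```

Country C has three neighbors, so it is colored first, and D with its single neighbor comes last.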


There are different types of searching you can implement. Depending on the problem, you can choose the best one (or try them all):


Variable/value labeling

With this search method, you start with the variables. In graph coloring, the countries are the variables and the colors are the values. You choose the variable (country) to assign next in a smart way. You can, for example, pick the one that has the fewest possible values left. Then, you choose the value (color) this variable will get. Often a value is chosen that leaves as many options as possible to the other variables. Below is a simple example.

Example of variable/value labeling.
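Both choices can be written as small helper functions. The sketch below is an illustration under my own assumptions: the domain store is a hypothetical snapshot taken after country A was colored red, and the function names are not from any library.

```python
def choose_variable(domains, assignment):
    """Pick the unassigned country with the fewest remaining colors."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))

def order_values(country, domains, adjacency, assignment):
    """Try first the colors that remove the fewest options from neighbors."""
    def conflicts(color):
        return sum(
            1 for n in adjacency[country]
            if n not in assignment and color in domains[n]
        )
    # Sort alphabetically first so that ties are broken deterministically.
    return sorted(sorted(domains[country]), key=conflicts)

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
assignment = {"A": "red"}
domains = {
    "A": {"red"},
    "B": {"green", "blue"},
    "C": {"green", "blue", "yellow"},
    "D": {"red", "green", "yellow"},
}
print(choose_variable(domains, assignment))               # B
print(order_values("C", domains, adjacency, assignment))  # ['blue', 'yellow', 'green']
```

For country C, green would remove an option from both B and D, so it is tried last.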

Value/variable labeling

Another way to handle search is to start with the values. You have a color and you select a country to give this color to. This is the opposite of variable/value labeling. In the graph coloring problem, you can make sets of countries that can all have the same color.
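A greedy version of this idea is sketched below: take one color at a time and give it to every country that can still receive it. The map and the function name are hypothetical, and the greedy choice does not guarantee the minimum number of colors.

```python
def color_classes(adjacency):
    """Fill one color at a time with countries that don't border each other."""
    uncolored = list(adjacency)
    classes = []
    while uncolored:
        current = []  # countries that receive the current color
        for country in uncolored:
            if all(n not in current for n in adjacency[country]):
                current.append(country)
        classes.append(current)
        uncolored = [v for v in uncolored if v not in current]
    return classes

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(color_classes(adjacency))  # [['A', 'D'], ['B'], ['C']]
```

Each inner list is one color: A and D don't share a border, so they can share the first color.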

Domain splitting

With domain splitting, you don’t assign a value to a variable directly. Instead, you split the domain (the possible values) of a variable into two or more sets. This is a weaker commitment than saying that a country should have one specific value: the country can still take any of the values in the chosen set.
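As a minimal sketch, assuming colors are numbered 0, 1, 2, ... and using a made-up helper `split_domain`, one split creates two branches of the search tree:

```python
def split_domain(domain):
    """Split a domain with at least two values into a lower and an upper half."""
    ordered = sorted(domain)
    middle = len(ordered) // 2
    return set(ordered[:middle]), set(ordered[middle:])

domains = {"A": {0, 1, 2, 3}, "B": {0, 1, 2, 3}}
lower, upper = split_domain(domains["A"])

# Two branches of the search tree: A keeps several options in each branch.
branch_1 = {**domains, "A": lower}
branch_2 = {**domains, "A": upper}
print(sorted(branch_1["A"]), sorted(branch_2["A"]))  # [0, 1] [2, 3]
```

In each branch the solver can prune and split again until every domain contains a single value.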

Symmetry breaking during search

Symmetry breaking can improve search drastically. You need to prevent symmetrical solutions from being explored, because this is a waste of time: symmetrical solutions are essentially the same solution. A simple way to start breaking symmetry in graph coloring is to fix the colors of certain countries. Better ways to break symmetry are to order the variables (countries), or to only take the colors used so far plus one new color into account.

Symmetrical solutions. Different colors, but the solutions are exactly the same.
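The rule "only the colors used so far plus one new color" can be sketched as follows. This is an illustration with my own naming, assuming colors are numbered 0, 1, 2, ... and three countries that all border each other.

```python
def candidate_colors(assignment, max_colors):
    """Colors already in use, plus at most one new color."""
    used = sorted(set(assignment.values()))
    if len(used) < max_colors:
        return used + [len(used)]
    return used

def count_colorings(adjacency, max_colors, break_symmetry):
    """Count the feasible colorings that the search visits."""
    order = list(adjacency)

    def extend(i, assignment):
        if i == len(order):
            return 1
        country = order[i]
        if break_symmetry:
            colors = candidate_colors(assignment, max_colors)
        else:
            colors = range(max_colors)
        total = 0
        for color in colors:
            if all(assignment.get(n) != color for n in adjacency[country]):
                total += extend(i + 1, {**assignment, country: color})
        return total

    return extend(0, {})

triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
print(count_colorings(triangle, 3, break_symmetry=False))  # 6
print(count_colorings(triangle, 3, break_symmetry=True))   # 1
```

Without symmetry breaking, the search visits six colorings that differ only in which color is called what. With the rule, it visits one.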

Randomization and restarts

It’s also possible to try different solutions in a random order. You just randomly pick values for variables and you check if the solution is feasible. If no solution is found after some time or number of tries, you restart the search.
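One common way to combine randomization with restarts is sketched below: a backtracking search that picks countries and colors in random order and gives up after a fixed number of dead ends, wrapped in a loop that restarts it. The function names, the failure limit and the fixed seed are assumptions made for this example.

```python
import random

def randomized_search(adjacency, k, rng, fail_limit):
    """Backtracking in random order; gives up after `fail_limit` dead ends."""
    order = list(adjacency)
    rng.shuffle(order)
    coloring = {}
    failures = 0

    def extend(i):
        nonlocal failures
        if i == len(order):
            return True
        country = order[i]
        colors = list(range(k))
        rng.shuffle(colors)
        for color in colors:
            if failures >= fail_limit:
                return False
            if all(coloring.get(n) != color for n in adjacency[country]):
                coloring[country] = color
                if extend(i + 1):
                    return True
                del coloring[country]
        failures += 1  # every color failed for this country: a dead end
        return False

    return dict(coloring) if extend(0) else None

def solve_with_restarts(adjacency, k, max_restarts=20, fail_limit=50, seed=0):
    """Restart the randomized search until it succeeds or restarts run out."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        result = randomized_search(adjacency, k, rng, fail_limit)
        if result is not None:
            return result
    return None

adjacency = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
solution = solve_with_restarts(adjacency, k=3)
print(solution is not None)  # True
```

On a map this small the restarts are not needed. They pay off on larger problems, where an unlucky early decision can trap the search in a huge dead subtree.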

The methods above are different ways to handle the search part of constraint programming. It’s possible to combine some of them.


Constraint store

You can implement different types of constraints in the constraint store. Constraints are used for feasibility checking and for pruning the domain store. The ‘normal’ constraints are, in the case of graph coloring, that two connected countries can’t have the same color. But there are other interesting constraints you can create based on the problem. The goal of adding constraints is to tighten the problem. With a tighter formulation, the optimal solution can be found faster.


Here are different types of constraints you can add to the constraint store:


Global constraints

The most important constraints you can add are global constraints. Global constraints help in pruning the search space and can detect infeasibility earlier. A global constraint helps in conveying the problem structure to the solver directly. An example of a global constraint is the alldifferent constraint. This constraint means that all variables should have different values. In graph coloring, when four countries are all connected, they should all have a different color. If we only have three colors left for these countries, the solution is infeasible: we can roll back and report failure.

The countries are all connected, so they should all have a different color.
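The counting argument from this example can be sketched as a small check. Note that this is only a simple necessary condition with a made-up function name; real solvers use stronger propagation algorithms for alldifferent.

```python
def alldifferent_is_feasible(domains, clique):
    """Mutually connected countries need at least as many colors as countries."""
    available = set().union(*(domains[v] for v in clique))
    return len(available) >= len(clique)

clique = ["A", "B", "C", "D"]  # four countries that all border each other
three_left = {v: {"red", "green", "blue"} for v in clique}
four_left = {v: {"red", "green", "blue", "yellow"} for v in clique}

print(alldifferent_is_feasible(three_left, clique))  # False
print(alldifferent_is_feasible(four_left, clique))   # True
```

With only pairwise "not equal" constraints, the solver would have to try many combinations before discovering the failure. The global view detects it immediately.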

Redundant constraints

A redundant constraint is a constraint that doesn’t exclude any solutions: everything it forbids is already forbidden by the other constraints. Redundant constraints can still be computationally significant, because they make that knowledge explicit, and this helps the solver reduce the search space earlier. An example in graph coloring: the number of colors can’t exceed the number of countries.

Surrogate constraints

By combining existing constraints, you get surrogate constraints. Surrogate constraints are helpful because they provide a more global view. An example is merging existing constraints or taking a linear combination of them.

Implied constraints

Another type of constraint you can add is an implied constraint. You get implied constraints by deriving a property from multiple existing constraints.

You can be creative in creating constraints! For a given problem, try to understand it completely so that you can add useful constraints to the constraint store. This can improve the speed of the solver by a huge amount.


What is the relation between CP and MIP?

You might have heard of mixed integer programming. This is another technique to model discrete optimization problems. MIP uses different ways to find the best solutions to a problem. Below, you can read what the similarities and differences are between CP and MIP.


What is mixed integer programming?

Mixed integer programming (MIP) is a mathematical optimization technique that finds the optimal solution to a problem with a linear objective, subject to linear equations and inequalities, where some variables must take integer values and others can be continuous. MIP focuses on optimality. To understand MIP and the math behind it properly, it’s important to know how linear programming and the simplex algorithm work. Read the article below if you want a refresher on those:


A Beginner’s Guide to Linear Programming and the Simplex Algorithm: Tackling a wide range of optimization problems (towardsdatascience.com)


A MIP solver first relaxes the problem by allowing continuous values for the integer variables (the linear relaxation). This makes it possible to use the simplex algorithm. After that, other techniques are used to find a valid solution (integer values for the integer variables). These techniques are out of scope for this post, but if you are interested, you can search for terms like branch and bound, branch and cut, and cutting planes (e.g. Gomory cuts and polyhedral cuts).
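To connect this with the running example, here is one common way to write graph coloring as a MIP. This formulation is not from the original post, and the notation is my own: V is the set of countries, E the set of borders, C a set of candidate colors, x_{v,c} = 1 if country v gets color c, and y_c = 1 if color c is used at all.

```latex
\begin{align*}
\min \quad & \sum_{c \in C} y_c \\
\text{s.t.} \quad & \sum_{c \in C} x_{v,c} = 1 && \forall v \in V \\
& x_{u,c} + x_{v,c} \le 1 && \forall (u,v) \in E,\ c \in C \\
& x_{v,c} \le y_c && \forall v \in V,\ c \in C \\
& x_{v,c},\, y_c \in \{0, 1\}
\end{align*}
```

The linear relaxation replaces the last line with 0 ≤ x_{v,c} ≤ 1 and 0 ≤ y_c ≤ 1, so a country may be colored, say, half red and half green. The techniques mentioned above are what the solver uses to get from such a fractional solution back to a valid one.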


How CP and MIP are related

In principle, you can formulate a discrete optimization problem either as a MIP or as a CP model, so in that sense they are interchangeable. But because they arrive at the optimal solution in different ways, you might want to consider which one is the best approach for your specific problem. MIP uses linear relaxation while CP uses logical inference. CP is often faster on problems with many ‘or’ constraints, while MIP tends to handle ‘and’ constraints faster. If you are using a commercial MIP solver, it probably implements CP techniques internally, like high-level constraints and logical reasoning. MIP is usually more flexible and more reliable. If you have the time, it’s always good to test different techniques to see what works best on your problem.


Conclusion

Hopefully you enjoyed this introduction to constraint programming. It is a technique that focuses on feasibility. The constraint store and the way to search for solutions are the key components of CP. You can improve the model formulation drastically if you add different types of constraints to the constraint store. The core of the CP solver propagates through the constraints and checks whether the solution is still feasible and whether pruning is possible. Pruning means removing values from the variable domains. CP solvers can find optimal solutions if you give them enough time.


Mixed integer programming is another way to formulate discrete optimization problems. It uses algebraic techniques to find an optimal solution. Depending on the problem, you can decide whether CP or MIP is the better fit.