Friday, December 30, 2011

Midyear Summary


The end of the semester was spent implementing more complex look-ahead schemas with better backtracking algorithms, including FC (Forward-Checking), FC-CBJ, and AC-2001, an improved arc-consistency algorithm.

FC is one of the best look-ahead schemas for CSPs: it is strong when dealing with CSPs of low tightness (constraints that forbid few tuples) as well as CSPs with high tightness and high density (restrictive constraints and a dense constraint graph). When FC is paired with CBJ, it can be a very powerful search tool. This is because FC is good at eliminating future paths in the search space, while CBJ is good at reducing the number of backtracks that occur during the search, making for a faster search all around.

FC is a difficult algorithm to implement at first; its methods are somewhat different from those of typical backtracking algorithms. In FC, the current assignment of variables is used to eliminate future values that are not consistent with the partial assignment. It is also important to ensure that FC backtracks properly when domain wipe-out occurs: if the current partial assignment empties the domain of some future variable, the algorithm must backtrack and form a new partial assignment. In my opinion, the most difficult part of implementing FC is managing domain values. Combining FC and CBJ is a little tricky, but it is a lot easier if both are implemented separately first.
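To make the domain management concrete, here is a minimal sketch of one forward-checking step against a toy "not equal" constraint. The class, the reductions stack, and the restore-on-wipe-out logic are illustrative stand-ins for the idea, not the solver code described here.

```java
import java.util.*;

// A minimal sketch of one forward-checking step over a binary CSP.
public class FCSketch {
    static class Variable {
        final int index;
        final List<Integer> currentDomain;
        Variable(int index, Collection<Integer> dom) {
            this.index = index;
            this.currentDomain = new ArrayList<>(dom);
        }
    }

    // Stand-in binary constraint: pairwise "not equal".
    static boolean consistent(Variable v, int a, Variable f, int b) {
        return a != b;
    }

    // Filter each future variable's domain against v = a, recording
    // removals so they can be restored on backtrack. Returns false
    // (after restoring) if some future domain is wiped out.
    static boolean checkForward(Variable v, int a, List<Variable> future,
                                Map<Integer, Variable> byIndex,
                                Deque<List<int[]>> reductions) {
        List<int[]> removed = new ArrayList<>(); // (future index, value) pairs
        for (Variable f : future) {
            Iterator<Integer> it = f.currentDomain.iterator();
            while (it.hasNext()) {
                int b = it.next();
                if (!consistent(v, a, f, b)) {   // one consistency check
                    it.remove();
                    removed.add(new int[]{f.index, b});
                }
            }
            if (f.currentDomain.isEmpty()) {     // domain wipe-out
                restore(removed, byIndex);
                return false;
            }
        }
        reductions.push(removed);                // popped when v is uninstantiated
        return true;
    }

    static void restore(List<int[]> removed, Map<Integer, Variable> byIndex) {
        for (int[] r : removed) byIndex.get(r[0]).currentDomain.add(r[1]);
    }

    public static void main(String[] args) {
        Variable v0 = new Variable(0, List.of(1, 2));
        Variable v1 = new Variable(1, List.of(1));
        Map<Integer, Variable> byIndex = Map.of(0, v0, 1, v1);
        Deque<List<int[]>> reductions = new ArrayDeque<>();
        // Assigning v0 = 1 wipes out v1's domain under "not equal":
        System.out.println(checkForward(v0, 1, List.of(v1), byIndex, reductions));
    }
}
```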

Another method of look-ahead is MAC (Maintaining Arc-Consistency). MAC performs a look-ahead in which a form of AC is run on the uninstantiated variables of the CSP. MAC is much more expensive than FC and performs better on CSPs of low density and high tightness. When graphically comparing the performances of FC and MAC, it is apparent that MAC uses more resources on problems with very low and very high tightness. MAC tends to make more constraint checks during look-ahead; these checks save on backtracking, but they do not make it a more efficient algorithm than FC overall.

We explored many other topics as well, such as search orders in CSPs. Search order can refer to variable order or value order, and the order in which variables are instantiated or values are chosen can have a great impact on the speed of an algorithm. A few ordering heuristics include min-conflict, cruciality, and promise. We implemented many ordering heuristics in our CSP solvers in order to compare the performance of our algorithms on random instances using different variable orderings.
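As one concrete example, min-conflict value ordering (as I understand it) counts, for each value of the current variable, how many values it would rule out of the future variables' domains, and tries the least conflicting value first. A sketch under a toy "not equal" constraint; all names here are illustrative:

```java
import java.util.*;

// A sketch of min-conflict value ordering: for each value a of the
// current variable, count how many values in the future variables'
// domains conflict with it, and try the least conflicting values first.
public class MinConflictOrder {
    // Stand-in binary constraint: "not equal".
    static boolean conflicts(int a, int b) { return a == b; }

    static List<Integer> orderValues(List<Integer> domain,
                                     List<List<Integer>> futureDomains) {
        Map<Integer, Integer> conflictCount = new HashMap<>();
        for (int a : domain) {
            int count = 0;
            for (List<Integer> fd : futureDomains)
                for (int b : fd)
                    if (conflicts(a, b)) count++;
            conflictCount.put(a, count);
        }
        List<Integer> ordered = new ArrayList<>(domain);
        ordered.sort(Comparator.comparingInt(conflictCount::get));
        return ordered;
    }

    public static void main(String[] args) {
        // Value 3 conflicts with nothing below, so it is tried first: [3, 2, 1]
        System.out.println(orderValues(
            List.of(1, 2, 3),
            List.of(List.of(1, 2), List.of(1))));
    }
}
```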

We also examined GAC (Generalized Arc-Consistency) and studied Régin's algorithm for applying it to a CSP with an alldifferent constraint over its variables. The algorithm requires that the CSP be represented as a bipartite graph whose two vertex sets are the variables and the union of all domain values. Then, a maximum matching must be found that covers every variable. A value can then be removed from a variable's domain if its edge is not part of the matching, its two endpoints do not lie in the same strongly connected component of the residual graph, and it does not lie on an alternating path starting from a free vertex.
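As a sketch of the first step only, here is a maximum matching on the variable-value graph grown by augmenting paths (Kuhn's algorithm); the strongly-connected-component and alternating-path analysis that actually prunes values is omitted, and all names are illustrative:

```java
import java.util.*;

// Build the bipartite variable-value graph implicitly from the domains
// and grow a maximum matching by augmenting paths.
public class AllDiffMatching {
    // domains.get(i) = candidate values of variable i.
    static int[] maximumMatching(List<List<Integer>> domains) {
        Map<Integer, Integer> valueToVar = new HashMap<>(); // value -> matched variable
        int[] varToValue = new int[domains.size()];
        Arrays.fill(varToValue, Integer.MIN_VALUE);
        for (int i = 0; i < domains.size(); i++)
            augment(i, domains, valueToVar, varToValue, new HashSet<>());
        return varToValue; // MIN_VALUE marks an unmatched variable
    }

    // Try to match variable i, recursively re-matching variables that
    // currently hold a value i needs.
    static boolean augment(int i, List<List<Integer>> domains,
                           Map<Integer, Integer> valueToVar,
                           int[] varToValue, Set<Integer> visited) {
        for (int v : domains.get(i)) {
            if (!visited.add(v)) continue;
            Integer holder = valueToVar.get(v);
            if (holder == null
                || augment(holder, domains, valueToVar, varToValue, visited)) {
                valueToVar.put(v, i);
                varToValue[i] = v;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Three variables over values {1,2,3}; a matching covering all
        // variables exists, so alldifferent is satisfiable here: [2, 1, 3]
        System.out.println(Arrays.toString(maximumMatching(List.of(
            List.of(1, 2), List.of(1), List.of(1, 3)))));
    }
}
```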

All in all, I felt that I learned much this semester. While my main struggle was debugging my code and finding time in my busy schedule to do so, I feel that my programming and reasoning skills have improved and that this research opportunity has been very positive. I look forward to continuing next semester!

-Maggie

Thursday, December 15, 2011

MAC (using FC and AC2001) Summary -- Mary

The AC2001 and Forward Checking algorithms were used to implement the Maintaining Arc Consistency algorithm. AC2001 builds off of AC3 by remembering the domain value of variable V[j] that supports the variable-value pair (V[i], a), which is stored in the data structure LAST((V[i], a), V[j]). Therefore, when the algorithm checks the variable-value pair (V[i], a) against V[j] again, it does not necessarily have to start at the beginning of the domain list of V[j]; rather, it starts from the domain value of V[j] stored in LAST((V[i], a), V[j]). In comparison to AC3 and AC1, AC2001 saves a significant number of consistency checks.

While implementing the AC2001 algorithm, I struggled with what type of data structure I should use for LAST. I eventually decided to use a hashtable with the key being the string "V[i].Name(), a, V[j].Name()" and the value being the domain value of V[j]. For example, suppose V[i] was the variable Q1, a was the value 1, and V[j] was the variable Q2; my hashtable key was "Q1,1,Q2" and the value would be the integer 3. While this data structure worked, I'm not sure it was ideal. Another part of AC2001 that I struggled with was how arcs were added to the queue; I had the same struggle when I implemented AC3.
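As an illustrative sketch of the revise step built around that LAST hashtable, using the same "Q1,1,Q2" key encoding (the "not equal" constraint is a stand-in, and for brevity this version rescans the whole domain when the old support disappears instead of resuming just past it, as true AC2001 would):

```java
import java.util.*;

// LAST maps "Vi,a,Vj" -> last support of (Vi, a) found in Vj's domain.
public class AC2001Revise {
    static Map<String, Integer> last = new HashMap<>();

    static boolean consistent(int a, int b) { return a != b; } // stand-in

    // Remove from di every value a that has no support left in dj.
    // Returns true if di was changed (so arcs must be re-queued).
    static boolean revise(String vi, List<Integer> di,
                          String vj, List<Integer> dj) {
        boolean changed = false;
        Iterator<Integer> it = di.iterator();
        while (it.hasNext()) {
            int a = it.next();
            String key = vi + "," + a + "," + vj;      // e.g. "Q1,1,Q2"
            Integer support = last.get(key);
            if (support != null && dj.contains(support))
                continue;                              // old support still valid
            Optional<Integer> fresh =
                dj.stream().filter(b -> consistent(a, b)).findFirst();
            if (fresh.isPresent()) {
                last.put(key, fresh.get());            // remember new support
            } else {
                it.remove();                           // no support: prune a
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Integer> d1 = new ArrayList<>(List.of(1));
        List<Integer> d2 = new ArrayList<>(List.of(1));
        System.out.println(revise("Q1", d1, "Q2", d2)); // true: 1 is pruned
        System.out.println(d1);                         // []
    }
}
```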
After implementing AC2001, I began integrating it with the Forward Checking algorithm. The Forward Checking algorithm solves a CSP instance by instantiating a variable, V[i], then weeding out inconsistent domain values of the uninstantiated (future) variables. This guarantees a consistent partial solution. Maintaining Arc Consistency is also a look-ahead schema for solving CSPs. The idea of Maintaining Arc Consistency is to first run a forward-checking step and then, if the instantiation is consistent with the future variables, further weed the domains of the future variables by applying an arc-consistency algorithm to the constraints that involve the future variables. Maintaining Arc Consistency does more weeding of the future variables' domains and thus visits fewer nodes, but the number of consistency checks increases.
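The control flow of one MAC step might be sketched like this, with the two filters passed in as functions so the fragment stands alone; everything here is an illustrative stand-in for the actual FC and AC2001 routines:

```java
import java.util.function.BooleanSupplier;

// One MAC step: forward checking first, then arc consistency among the
// future variables only if forward checking did not wipe out a domain.
public class MacStep {
    // Each filter returns false on a domain wipe-out.
    static boolean macStep(BooleanSupplier forwardCheck,
                           BooleanSupplier arcConsistency) {
        if (!forwardCheck.getAsBoolean())
            return false;                     // FC wiped out a future domain
        return arcConsistency.getAsBoolean(); // may still fail even if FC passed
    }

    public static void main(String[] args) {
        // FC passes but AC2001 detects a wipe-out, as in the struggle
        // described below: the instantiation must still be undone.
        System.out.println(macStep(() -> true, () -> false)); // false
    }
}
```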
In my case, I used Forward Checking and AC2001. One of my struggles with implementing Maintaining Arc Consistency was managing the reductions list. I struggled particularly with the idea that even if the instantiation of variable V[i] was consistent with Forward Checking, it might not be consistent with AC2001; that is, though Forward Checking did not wipe out any future variable's domain, AC2001 could. However, if there were still domain values left in the current variable V[i], it is only necessary to undo the current instantiation of V[i] and the reductions associated with that instantiation; a complete backtrack is not required. Another aspect of managing the reductions list that I struggled with was the case where Forward Checking did not delete any values from a future variable V[j]'s domain, so no new list relating the two variables was added to reductions. Even so, AC2001 could still delete some domain values from V[j], in which case a new list needed to be added. Originally I was simply appending the domain values removed by AC2001 to the last list in V[j]'s reductions; since this was incorrect, I had to check whether Forward Checking had created a list and, if it had not, create a new list to append to the reductions list.
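Here is a sketch of that reductions bookkeeping, under the assumption (mine, for illustration) that each future variable keeps a stack of per-level removal lists: if Forward Checking removed nothing from V[j] at the current level, AC2001's removals must open a new list rather than landing in an older level's list.

```java
import java.util.*;

// Per-variable stacks of removal lists; the top list holds the values
// removed under the current instantiation.
public class Reductions {
    // variable name -> stack of per-level removal lists
    private final Map<String, Deque<List<Integer>>> reductions = new HashMap<>();
    // variables that already have a list open at the current level
    private final Set<String> openedThisLevel = new HashSet<>();

    void recordRemoval(String vj, int value) {
        Deque<List<Integer>> stack =
            reductions.computeIfAbsent(vj, k -> new ArrayDeque<>());
        if (openedThisLevel.add(vj))       // first removal at this level:
            stack.push(new ArrayList<>()); // open a fresh list for vj
        stack.peek().add(value);           // otherwise append to this level's list
    }

    // Undo the current level: restore and discard each variable's top list.
    void undoLevel(Map<String, List<Integer>> currentDomains) {
        for (String vj : openedThisLevel) {
            List<Integer> removed = reductions.get(vj).pop();
            currentDomains.get(vj).addAll(removed);
        }
        openedThisLevel.clear();
    }

    public static void main(String[] args) {
        Reductions r = new Reductions();
        Map<String, List<Integer>> domains = new HashMap<>();
        domains.put("Q2", new ArrayList<>(List.of(1, 2, 3)));
        // FC removes nothing from Q2 at this level, then AC2001 removes 2:
        r.recordRemoval("Q2", 2);
        domains.get("Q2").remove(Integer.valueOf(2));
        r.undoLevel(domains);              // restores Q2's domain (order may differ)
        System.out.println(domains.get("Q2")); // [1, 3, 2]
    }
}
```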
After completely implementing and debugging Maintaining Arc Consistency, I compared the results on the benchmark problems to the results generated by Forward Checking alone. Even on the small problems, such as the 3-queens problem, Maintaining Arc Consistency made more consistency checks than Forward Checking. However, Maintaining Arc Consistency had fewer backtracks and fewer nodes visited than Forward Checking. On the larger problems, particularly the zebra problem, Maintaining Arc Consistency did more than double the consistency checks, but visited fewer than half as many nodes and did fewer than half as many backtracks compared to Forward Checking. The CPU time for Maintaining Arc Consistency was higher than that of Forward Checking in all cases; however, that could simply be a reflection of my implementation. Overall, it is difficult to tell whether the trade-off of more consistency checks is worth the significant drop in nodes visited and backtracks.