Monday, January 30, 2012
These last few weeks we have continued our discussion of SAT problems. After understanding the format of the benchmark problems and running into a few issues compiling MiniSAT, we finally got the benchmark problems to run on MiniSAT AND produce the correct results :). We decided our next task is to dig into the MiniSAT code and try to understand how it solves the problems. To do this, we decided to build a graph displaying what steps the solver takes to solve each benchmark. We are in the process of adding code to MiniSAT so that the graph is generated as the problem is being solved. Eventually, what we would like to do is have a sort of animation of how MiniSAT solves a problem. Right now, we are working to get the code compiling on our own computers (not on the CSE remote machine).
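Since we have not mapped MiniSAT's internals yet, here is only a toy sketch of the instrumentation idea (plain Python, not MiniSAT code): a tiny DPLL-style solver over DIMACS-style clauses that emits Graphviz DOT edges as it branches, so the search can be rendered as a graph afterwards.

```python
import itertools

_ids = itertools.count()                  # fresh node ids for the DOT graph

def solve(clauses, assignment, dot, parent):
    """clauses: list of lists of signed ints (DIMACS-style literals);
    assignment: dict var -> bool; dot: DOT lines accumulated so far."""
    def value(lit):
        v = assignment.get(abs(lit))
        return None if v is None else (v if lit > 0 else not v)
    if any(all(value(l) is False for l in c) for c in clauses):
        return False                      # some clause is fully falsified
    unassigned = {abs(l) for c in clauses for l in c} - set(assignment)
    if not unassigned:
        return True                       # everything assigned, no conflict
    var = min(unassigned)                 # naive branching heuristic
    for choice in (True, False):
        node = next(_ids)
        dot.append(f'  n{parent} -> n{node} [label="x{var}={choice}"];')
        assignment[var] = choice
        if solve(clauses, assignment, dot, node):
            return True
        del assignment[var]               # backtrack and try the other branch
    return False

clauses = [[1, 2], [-1, 2], [-2, 3]]      # (x1 or x2)(not x1 or x2)(not x2 or x3)
dot = ['digraph search {']
solve(clauses, {}, dot, next(_ids))
dot.append('}')
print('\n'.join(dot))                     # render with Graphviz to see the tree
```

The same pattern, printing an edge at every decision and conflict, is what we hope to graft onto MiniSAT's actual search loop.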
-Mary
Friday, January 13, 2012
Week One (Second Semester)
This week we started our weekly meetings again. This semester we are planning on meeting twice a week as a group, and our meetings will be more discussion-based than last semester's. This semester we are starting the SAT side of our research and including Elena and Dr. Dwyer more in our discussions. We have started comparing techniques from both fields and are noticing that a lot of the same concepts exist in both. For example, BCP (Boolean Constraint Propagation) in SAT is similar to the domino effect in CSPs. For the first week we decided to explore SAT using a SAT solver, particularly MiniSAT. As of right now, I have downloaded it and (finally) figured out how to compile it. I plan on looking at some benchmark problems and trying to understand how it works :).
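For anyone following along, the benchmark problems are plain-text DIMACS CNF files: lines starting with c are comments, the p cnf line gives the number of variables and clauses, and each clause is a list of signed literals terminated by 0. A tiny satisfiable example:

```
c (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
p cnf 3 3
1 -2 0
2 3 0
-1 -3 0
```

If I understand the usage correctly, running `minisat example.cnf result.txt` prints solver statistics plus SATISFIABLE or UNSATISFIABLE, and writes the model to the result file.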
Sunday, January 8, 2012
Kick-off meeting for Spring 2012
The entire team (two undergrad students, Maggie and Mary; two grad mentors, Elena and Robert; and two faculty sponsors, Matt and Berthe) met for a relaxed meeting on January 6, 2012 at the Oven (Indian restaurant). We discussed what was accomplished during Fall 2011, where we focused on studying fundamental CP concepts and implementing search, backtracking, and consistency algorithms. We agreed to start Spring 2012 by reverse engineering MiniSAT (a SAT solver), studying SMT solvers, then diving into the study of constraints that arise in program analysis. We will be meeting twice per week, most likely as follows: once as a team to discuss progress, and once as a smaller team to read tutorial papers. Below is a photo of the super team taken at the Oven. From left to right: Elena, Matt, Berthe, Maggie, Robert, and Mary.
Tuesday, January 3, 2012
End of semester meeting, Dec. 22, 2011
On December 22nd, 2011, we met at the Oven (Indian restaurant) for a relaxed brainstorming meeting. Dr. Dwyer was in South Africa and could not join us. Elena Sherman (graduate mentor) came to the restaurant, looked around for us, but could not find us although we were seated at the entrance (mishaps occur...), and left :-( We are sorry, Elena; we will coordinate better next time. Those able to attend were Mary, Maggie, Robert (graduate mentor), and Berthe. During the dinner, we discussed the joys and difficulties of the fall semester and decided to delay any discussion of the spring semester to another meeting that Matt and Elena can attend. After the dinner, we took the photo below, which shows, from left to right, Robert, Mary, and Maggie. Berthe took the photo.
Friday, December 30, 2011
Midyear Summary
The end of the semester was spent implementing more complex look-ahead schemas with better backtracking algorithms, including FC (Forward Checking), FC-CBJ (Forward Checking with Conflict-Directed Backjumping), and AC-2001, an improved arc-consistency algorithm.
FC is one of the best look-ahead schemas for CSPs: it is strong on CSPs of low tightness (loose constraints that forbid few value pairs) as well as on CSPs of high tightness and high density (restrictive constraints, and many constraints among the variables). When FC is paired with CBJ, it can be a very powerful search tool. This is because FC is good at eliminating future paths in the search space, while CBJ is good at reducing the number of backtracks that occur during the search, making for a faster search all around.
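For concreteness, here is a minimal sketch of the conflict-set bookkeeping behind CBJ, in Python with illustrative names rather than our solver's actual code:

```python
# conf_sets[v]: earlier variables blamed for pruning values of v;
# depth_of[v]: v's position in the current instantiation order.

def backjump_target(var, conf_sets, depth_of):
    """On a dead end at `var`, jump straight to the deepest variable in
    its conflict set, absorbing the remaining blame so that no culprit
    is forgotten across the jump."""
    culprits = conf_sets[var]
    if not culprits:
        return None                      # nothing to blame: unsolvable
    h = max(culprits, key=lambda v: depth_of[v])
    conf_sets[h] |= culprits - {h}       # pass the rest of the blame up
    return h
```

Chronological backtracking always returns to the previous variable; this jump is what skips the irrelevant ones.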
FC is a difficult algorithm to implement at first; its methods are somewhat different from those of typical backtracking algorithms. In FC, you use the current assignment of variables to eliminate future assignments that are not consistent with the partial assignment. It is also important to ensure that FC backtracks properly when a domain wipe-out occurs; that is, if the current partial assignment empties the domain of some future variable, the algorithm must backtrack and form a new partial assignment. In my opinion, the most difficult part of implementing FC is managing domain values. Combining FC and CBJ is a little tricky, but it is a lot easier if both are implemented separately first.
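Here is a minimal sketch of FC for binary CSPs, assuming set-valued domains and a `consistent(x, a, y, b)` predicate (illustrative code, not our actual implementation):

```python
def forward_check(assignment, domains, future, consistent):
    """Prune future values inconsistent with the newest assignment.
    Returns the pruned (variable, value) pairs so they can be restored
    on backtrack, or None on a domain wipe-out."""
    var, val = assignment
    pruned = []
    for y in future:
        for b in list(domains[y]):
            if not consistent(var, val, y, b):
                domains[y].remove(b)
                pruned.append((y, b))
        if not domains[y]:               # wipe-out: undo and fail
            for (z, c) in pruned:
                domains[z].add(c)
            return None
    return pruned

def fc_search(variables, domains, consistent, assignment=None):
    """Depth-first search with forward checking; returns one solution."""
    assignment = assignment if assignment is not None else {}
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    future = [v for v in variables if v not in assignment and v != var]
    for val in list(domains[var]):
        assignment[var] = val
        pruned = forward_check((var, val), domains, future, consistent)
        if pruned is not None:
            result = fc_search(variables, domains, consistent, assignment)
            if result is not None:
                return result
            for (y, b) in pruned:        # restore pruned values
                domains[y].add(b)
        del assignment[var]
    return None
```

For 4-queens, for instance, `variables` would be the four columns, each domain would be `{0, 1, 2, 3}`, and `consistent` would encode the non-attack constraints.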
Another look-ahead method is MAC (Maintaining Arc-Consistency), in which a form of arc consistency is run on the uninstantiated variables of the CSP after each assignment. MAC is much more expensive than FC and tends to pay off on CSPs of low density and high tightness. When graphically comparing the performance of FC and MAC, it is apparent that MAC uses more resources on problems of very low and very high tightness: the extra constraint checks MAC makes during look-ahead save on backtracking, but they do not necessarily make it more efficient than FC overall.
We explored many other topics as well, such as search orders in CSPs. Search order can refer to variable order or value order, and the order in which variables are instantiated or values are chosen can have a great impact on the speed of an algorithm. A few ordering heuristics include min-conflict, cruciality, and promise. We implemented many ordering heuristics in our CSP solvers in order to compare the performance of our algorithms on random instances using different variable orderings.
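As an illustration, here is the min-conflict value ordering mentioned above, plus a common fail-first dynamic variable ordering, sketched in Python (the `consistent` predicate is the same assumed helper as in the FC sketch; this is illustrative, not our solver's code):

```python
# Fail-first dynamic variable ordering: instantiate the variable with
# the smallest remaining domain next.
def smallest_domain_first(unassigned, domains):
    return min(unassigned, key=lambda v: len(domains[v]))

# Min-conflict value ordering: try the values of `var` that rule out
# the fewest values in the future variables' domains first.
def min_conflict_order(var, domains, future, consistent):
    def conflicts(val):
        return sum(1 for y in future for b in domains[y]
                   if not consistent(var, val, y, b))
    return sorted(domains[var], key=conflicts)
```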
We also examined GAC (Generalized Arc-Consistency) and studied Régin's algorithm for applying it to a CSP with an alldifferent constraint over its variables. The algorithm represents the constraint as a bipartite graph between the variables and the union of their domain values, then finds a maximum matching, one that covers every variable (by definition, no two edges of a matching share a vertex). A value can be removed from a variable's domain if its edge belongs to no maximum matching; concretely, if the edge is not in the matching found, its endpoints do not lie in the same strongly connected component of the residual graph, and it does not lie on an alternating path starting at a free vertex.
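Here is a sketch of that filtering in Python using networkx, written from the description above rather than from our own code; it assumes the variables are strings so they cannot collide with the tagged value nodes:

```python
import networkx as nx

def regin_alldiff_filter(domains):
    """domains: dict mapping variable name -> set of values.
    Returns the pruned domains, or None if alldifferent is unsatisfiable."""
    G = nx.Graph()
    variables = list(domains)
    for x, dom in domains.items():
        for v in dom:
            G.add_edge(x, ('val', v))        # tag values to avoid clashes
    # 1. A maximum matching must cover every variable.
    matching = nx.bipartite.hopcroft_karp_matching(G, top_nodes=variables)
    if any(x not in matching for x in variables):
        return None
    # 2. Residual digraph: matched edges variable -> value,
    #    unmatched edges value -> variable.
    D = nx.DiGraph()
    for x in variables:
        for v in domains[x]:
            node = ('val', v)
            if matching[x] == node:
                D.add_edge(x, node)
            else:
                D.add_edge(node, x)
    scc = {n: i for i, comp in enumerate(nx.strongly_connected_components(D))
           for n in comp}
    free = [n for n in D if isinstance(n, tuple) and n not in matching]
    reachable = set(free)                    # vertices on paths from free values
    for f in free:
        reachable |= nx.descendants(D, f)
    # 3. Keep an edge iff it is matched, lies on an alternating cycle
    #    (same SCC), or lies on an alternating path from a free value.
    result = {}
    for x in variables:
        result[x] = {v for v in domains[x]
                     if matching[x] == ('val', v)
                     or scc[x] == scc[('val', v)]
                     or ('val', v) in reachable}
    return result
```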
All in all, I felt that I learned much this semester. While my main struggle was debugging my code and finding time in my busy schedule to do so, I feel that my programming and reasoning skills have improved and that this research opportunity has been very positive. I look forward to continuing next semester!
-Maggie
Thursday, December 15, 2011
MAC (using FC and AC2001) Summary -- Mary
The AC2001 and Forward Checking algorithms were used to implement the Maintaining Arc Consistency algorithm. AC2001 builds on AC3 by remembering the domain value of variable V[j] that supports the variable-value pair (V[i], a); this support is stored in the data structure LAST((V[i], a), V[j]). Therefore, when the algorithm checks the pair (V[i], a) against V[j] again, it does not have to start at the beginning of V[j]'s domain list; rather, it resumes from the value stored in LAST((V[i], a), V[j]). In comparison to AC3 and AC1, AC2001 saves a significant number of consistency checks. While implementing AC2001, I struggled with what type of data structure to use for LAST. I eventually decided on a hashtable whose key is the string "V[i].Name(), a, V[j].Name()" and whose value is the supporting domain value of V[j]. For example, suppose V[i] was the variable Q1, a was the value 1, and V[j] was the variable Q2; the hashtable key would be "Q1,1,Q2" and the value might be the integer 3. While this data structure worked, I'm not sure it was ideal. Another part of AC2001 I struggled with was how arcs are added back to the queue; I had the same struggle when I implemented AC3.
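Here is a sketch of AC2001 along those lines, keying LAST by a tuple instead of a concatenated string; the `consistent` predicate, the neighbor lists, and ordered integer domain values are assumptions of the sketch, not my actual code:

```python
def revise2001(Vi, Vj, domains, last, consistent):
    """Remove values of Vi that have no support in Vj, resuming the
    support search just past the remembered support, as AC2001 does."""
    revised = False
    for a in list(domains[Vi]):
        prev = last.get((Vi, a, Vj))
        if prev is not None and prev in domains[Vj]:
            continue                         # old support still valid
        support = next((b for b in sorted(domains[Vj])
                        if (prev is None or b > prev)
                        and consistent(Vi, a, Vj, b)), None)
        if support is None:
            domains[Vi].remove(a)            # (Vi, a) has no support left
            revised = True
        else:
            last[(Vi, a, Vj)] = support      # remember the new support
    return revised

def ac2001(variables, neighbors, domains, consistent):
    """neighbors[Vi]: the variables sharing a constraint with Vi."""
    last = {}
    queue = [(Vi, Vj) for Vi in variables for Vj in neighbors[Vi]]
    while queue:
        Vi, Vj = queue.pop(0)
        if revise2001(Vi, Vj, domains, last, consistent):
            if not domains[Vi]:
                return False                 # domain wipe-out
            # re-enqueue every arc pointing at Vi except the one from Vj
            queue.extend((Vk, Vi) for Vk in neighbors[Vi] if Vk != Vj)
    return True
```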
After implementing AC2001, I began integrating it with the Forward Checking algorithm. Forward Checking solves a CSP instance by instantiating a variable, V[i], and then weeding the domains of the uninstantiated (future) variables, which guarantees a consistent partial solution. Maintaining Arc Consistency is also a look-ahead schema for solving CSPs. The idea is to first run forward checking and then, if the instantiation is consistent with the future variables, further weed the future variables' domains by applying an arc-consistency algorithm to the constraints that involve the future variables. Maintaining Arc Consistency does more weeding of the future variables' domains and thus visits fewer nodes, but the number of consistency checks increases.
In my case, I used Forward Checking and AC2001. One of my struggles with implementing Maintaining Arc Consistency was managing the reductions list. I struggled particularly with the idea that even if the instantiation of variable V[i] is consistent with Forward Checking, it may not be consistent with AC2001; that is, though Forward Checking did not wipe out any future variable's domain, AC2001 still could. However, if there are still domain values left for the current variable, V[i], it is only necessary to undo the current instantiation of V[i] and the reductions associated with it; a complete backtrack is not needed. Another reductions-related problem was the case where Forward Checking deleted nothing from a future variable V[j]'s domain, so no new list relating the two variables was added to the reductions, yet AC2001 did delete some of V[j]'s domain values, so a list was needed after all. Originally I simply appended the values removed by AC2001 to the last list in V[j]'s reductions; since this was incorrect, I had to check whether Forward Checking had created a list and, if it had not, create a new one to append to the reductions list.
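Here is a sketch of one MAC step combining the pieces above (`revise2001` is the function from the AC2001 sketch; the helpers and representation are assumptions, not my actual code). Pushing one fresh reductions list per future variable up front sidesteps the missing-list problem just described, because both phases record their prunings in the same place:

```python
def mac_step(var, val, domains, future, reductions, neighbors,
             consistent, last):
    """Instantiate var := val, forward check, then enforce arc
    consistency on the future subproblem with AC2001. All pruning is
    recorded in `reductions`; the caller undoes it on failure."""
    for Vj in future:
        reductions[Vj].append([])        # one fresh list per future variable
    # Phase 1: forward checking against the new instantiation.
    for Vj in future:
        for b in list(domains[Vj]):
            if not consistent(var, val, Vj, b):
                domains[Vj].remove(b)
                reductions[Vj][-1].append(b)
        if not domains[Vj]:
            return False                 # wipe-out; caller calls undo_step
    # Phase 2: AC2001 restricted to constraints among future variables.
    queue = [(Vi, Vj) for Vi in future for Vj in neighbors[Vi]
             if Vj in future]
    while queue:
        Vi, Vj = queue.pop(0)
        before = set(domains[Vi])
        if revise2001(Vi, Vj, domains, last, consistent):
            # record AC2001's prunings in the same per-level list
            reductions[Vi][-1].extend(before - domains[Vi])
            if not domains[Vi]:
                return False
            queue.extend((Vk, Vi) for Vk in neighbors[Vi]
                         if Vk in future and Vk != Vj)
    return True

def undo_step(domains, future, reductions):
    """Restore everything pruned by the matching mac_step call."""
    for Vj in future:
        for b in reductions[Vj].pop():
            domains[Vj].add(b)
```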
After completely implementing and debugging Maintaining Arc Consistency, I compared the results on the benchmark problems to the results generated by Forward Checking alone. Even on the small problems, such as the 3-queens problem, Maintaining Arc Consistency made more consistency checks than Forward Checking. However, it had fewer backtracks and nodes visited than Forward Checking. On the larger problems, particularly the zebra problem, Maintaining Arc Consistency made more than double the consistency checks, but visited fewer than half as many nodes and did fewer than half as many backtracks. Its CPU time was higher than Forward Checking's in all cases; however, that could simply be a reflection of my implementation. Overall, it is difficult to tell whether the trade-off of more consistency checks is worth the significant drop in nodes visited and backtracks.
Monday, November 28, 2011
Nov. 28, 2011
Recently I’ve been working on implementing AC2001 and MAC. The implementation of AC2001 went fairly smoothly, especially since I had already implemented AC1 and AC3 earlier this semester. I had a few debugging problems, most of which came from confusion about when I was adding arcs to the queue. I tested my implementation of AC2001 on about 15 instances (including the 3-, 4-, 5-, and 6-queens problems and a few different instances of the zebra problem). The next step was implementing MAC (Maintaining Arc Consistency) on top of my forward checking algorithm. At first it seemed simple: run FC and, if it does not fail, form a queue of all constraints between the future variables and run AC2001. This task has been a lot more challenging than I expected. I currently have the algorithm working correctly for finding one solution, both with dynamic and static ordering. I’ve noticed that in comparison to FC and FC-CBJ, MAC makes quite a few more consistency checks but has fewer nodes visited, fewer backtracks, and less CPU time. Right now I am working on finding all solutions.
-Mary