Friday, December 30, 2011

Midyear Summary


The end of the semester was spent implementing more complex look-ahead schemas paired with better backtracking algorithms: FC (Forward Checking), FC-CBJ, and AC-2001, an improved arc-consistency algorithm.

FC is one of the best look-ahead schemas for CSPs: it is strong on CSPs of low tightness (constraints that forbid few tuples) as well as on CSPs of high tightness and high density (constraints that forbid many tuples, with many pairs of variables constrained). When FC is paired with CBJ, it can be a very powerful search tool, because FC is good at pruning future paths in the search space, while CBJ is good at reducing the number of backtracks that occur during the search, making for a faster search all around.

FC is a difficult algorithm to implement at first, since its mechanics differ somewhat from those of typical backtracking algorithms. In FC, the current assignment of variables is used to eliminate future values that are inconsistent with the partial assignment. It is also important to ensure that FC backtracks properly when domain wipe-out occurs; that is, if a partial assignment empties the domain of some future variable, the algorithm must backtrack and form a new partial assignment. In my opinion, the most difficult part of implementing FC is managing domain values. Combining FC and CBJ is a little tricky, but it is a lot easier if both are implemented separately first.
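
For concreteness, here is a minimal sketch of the pruning step in Java; the names (ForwardCheck, Check, the domain lists) are illustrative placeholders, not our actual classes:

```java
import java.util.ArrayList;
import java.util.List;

class ForwardCheck {
    interface Check { boolean test(int vi, int a, int vj, int b); }

    // Prune future values inconsistent with vi = a; return the prunings so
    // they can be restored on backtracking, or null on domain wipe-out.
    static List<int[]> checkForward(int vi, int a, List<List<Integer>> domains,
                                    boolean[] assigned, Check check) {
        List<int[]> pruned = new ArrayList<>();
        for (int vj = 0; vj < domains.size(); vj++) {
            if (assigned[vj] || vj == vi) continue;
            List<Integer> dom = domains.get(vj);
            for (int k = dom.size() - 1; k >= 0; k--) {
                int b = dom.get(k);
                if (!check.test(vi, a, vj, b)) {   // (vi=a, vj=b) is inconsistent
                    dom.remove(k);
                    pruned.add(new int[]{vj, b});
                }
            }
            if (dom.isEmpty()) {                   // wipe-out: undo and fail
                undo(pruned, domains);
                return null;
            }
        }
        return pruned;
    }

    static void undo(List<int[]> pruned, List<List<Integer>> domains) {
        for (int[] p : pruned) domains.get(p[0]).add(p[1]);
    }
}
```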

Another look-ahead method is MAC (Maintaining Arc-Consistency). MAC performs a look-ahead in which a form of AC is run on the uninstantiated variables of the CSP. MAC is much more expensive than FC and performs better on CSPs of low density and high tightness. When graphically comparing the performance of FC and MAC, it is apparent that MAC uses more resources on problems of very low and very high tightness: the extra constraint checks MAC makes during look-ahead do save backtracks, but they do not make it a more efficient algorithm than FC overall.

We explored many other topics as well, such as search orders in CSPs. Search order can refer to variable ordering or value ordering, and the order in which variables are instantiated or values are tried can have a great impact on the speed of an algorithm. A few ordering heuristics include min-conflict, cruciality, and promise. We implemented several ordering heuristics in our CSP solvers in order to compare the performance of our algorithms on random instances under different variable orderings.

We also examined GAC (Generalized Arc-Consistency) and studied Régin's algorithm for enforcing it on a CSP with an alldifferent constraint over its variables. The algorithm represents the constraint as a bipartite graph between the variables and the set union of their domain values. Then a maximum matching (a largest set of edges such that no two edges share a vertex) covering every variable must be found. A value can be removed from a domain if its edge is not part of the matching, does not lie on an alternating path starting at a free vertex, and does not connect two vertices in the same strongly connected component of the associated directed graph.
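
As an illustration, here is a sketch of the matching step only, using Kuhn's augmenting-path algorithm; the alternating-path and strongly-connected-component analysis that performs the actual pruning is omitted, and all names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

class AllDiffMatching {
    // domains.get(i) lists the value indices variable i may take.
    static int[] maximumMatching(List<int[]> domains, int numValues) {
        int[] matchOfValue = new int[numValues];   // value -> variable, -1 if free
        Arrays.fill(matchOfValue, -1);
        for (int var = 0; var < domains.size(); var++) {
            boolean[] visited = new boolean[numValues];
            if (!augment(var, domains, matchOfValue, visited))
                return null;   // some variable is uncoverable: alldifferent fails
        }
        return matchOfValue;
    }

    private static boolean augment(int var, List<int[]> domains,
                                   int[] matchOfValue, boolean[] visited) {
        for (int val : domains.get(var)) {
            if (visited[val]) continue;
            visited[val] = true;
            // Take a free value, or re-route the variable currently holding it.
            if (matchOfValue[val] == -1
                    || augment(matchOfValue[val], domains, matchOfValue, visited)) {
                matchOfValue[val] = var;
                return true;
            }
        }
        return false;
    }
}
```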

All in all, I feel that I learned a great deal this semester. While my main struggle was debugging my code and finding time in my busy schedule to do so, I feel that my programming and reasoning skills have improved and that this research opportunity has been very positive. I look forward to continuing next semester!

-Maggie

Thursday, December 15, 2011

MAC (using FC and AC2001) Summary -- Mary

The AC2001 and Forward Checking algorithms were used to implement the maintaining arc consistency algorithm. AC2001 builds on AC3 by remembering the domain value of variable V[j] that supports the variable-value pair (V[i], a), stored in the data structure LAST((V[i], a), V[j]). Therefore, when the algorithm checks the pair (V[i], a) against V[j] again, it does not have to start at the beginning of the domain list of V[j]; rather, it resumes from the value stored in LAST((V[i], a), V[j]). In comparison to AC3 and AC1, AC2001 saves a significant number of consistency checks. While implementing the AC2001 algorithm, I struggled with what type of data structure to use for LAST. I eventually decided to use a hashtable with the key being the string “V[i].Name(),a,V[j].Name()” and the value being the supporting domain value of V[j]. For example, suppose V[i] was the variable Q1, a was the value 1, and V[j] was the variable Q2; my hashtable key was “Q1,1,Q2” and the value might be the integer 3. While this data structure worked, I’m not sure it was ideal. Another part of AC2001 that I struggled with was how arcs are added to the queue; I had the same struggle when I implemented AC3.
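
For illustration, here is a sketch of the revise step with a HashMap playing the role of LAST, keyed the same way; it assumes domains are kept sorted in ascending order so that the search for a new support can resume past the old one:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Revise2001 {
    interface Check { boolean test(int a, int b); }

    private final Map<String, Integer> last = new HashMap<>();

    // Remove unsupported values of vi w.r.t. the constraint between vi and vj.
    boolean revise(String vi, String vj, List<Integer> domI, List<Integer> domJ,
                   Check check) {
        boolean changed = false;
        for (int k = domI.size() - 1; k >= 0; k--) {
            int a = domI.get(k);
            String key = vi + "," + a + "," + vj;
            Integer support = last.get(key);
            if (support != null && domJ.contains(support)) continue; // still valid
            int resumeAfter = (support == null) ? Integer.MIN_VALUE : support;
            boolean found = false;
            for (int b : domJ) {
                if (b <= resumeAfter) continue;   // the AC2001 saving over AC3
                if (check.test(a, b)) { last.put(key, b); found = true; break; }
            }
            if (!found) { domI.remove(k); changed = true; }
        }
        return changed;
    }
}
```
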
After implementing AC2001, I began integrating it with the Forward Checking algorithm. Forward Checking solves a CSP instance by instantiating a variable, V[i], then weeding out inconsistent domain values of the uninstantiated ("future") variables, which guarantees a consistent partial solution. Maintaining Arc Consistency is also a look-ahead schema for solving CSPs. The idea of Maintaining Arc Consistency is to first run a forward checking algorithm and then, if the instantiation is consistent with the future variables, further weed the domains of the future variables by also applying an arc-consistency algorithm to the constraints among the future variables. Maintaining Arc Consistency does more weeding of the future variables' domains and thus visits fewer nodes, but the number of consistency checks increases.
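
To make the shape of one MAC labeling step concrete, here is a self-contained sketch; for brevity it uses a plain AC-3-style revise where my implementation uses AC2001, and all names are illustrative rather than my actual code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MacStep {
    interface Check { boolean test(int vi, int a, int vj, int b); }

    List<List<Integer>> domains;   // current domains, one list per variable
    boolean[] assigned;            // true for past (instantiated) variables
    boolean[][] constrained;       // adjacency in the constraint graph
    Check check;

    // Try vi = a. Returns the (variable, value) prunings to undo on
    // backtracking, or null if a future domain was wiped out (already undone).
    List<int[]> label(int vi, int a) {
        List<int[]> pruned = new ArrayList<>();
        List<Integer> domI = domains.get(vi);
        for (int k = domI.size() - 1; k >= 0; k--)      // reduce D(vi) to {a}
            if (domI.get(k) != a) pruned.add(new int[]{vi, domI.remove(k)});
        assigned[vi] = true;
        Deque<int[]> q = new ArrayDeque<>();
        for (int j = 0; j < domains.size(); j++)        // FC phase: neighbors of vi
            if (!assigned[j] && constrained[j][vi]) q.add(new int[]{j, vi});
        for (int i = 0; i < domains.size(); i++)        // AC phase: future-future arcs
            for (int j = 0; j < domains.size(); j++)
                if (i != j && !assigned[i] && !assigned[j] && constrained[i][j])
                    q.add(new int[]{i, j});
        while (!q.isEmpty()) {
            int[] arc = q.poll();
            if (revise(arc[0], arc[1], pruned)) {
                if (domains.get(arc[0]).isEmpty()) {    // wipe-out: undo and fail
                    undo(pruned);
                    assigned[vi] = false;
                    return null;
                }
                for (int k = 0; k < domains.size(); k++)
                    if (k != arc[0] && k != arc[1] && !assigned[k]
                            && constrained[k][arc[0]])
                        q.add(new int[]{k, arc[0]});
            }
        }
        return pruned;
    }

    // Remove values of i lacking support in j; record removals in 'pruned'.
    private boolean revise(int i, int j, List<int[]> pruned) {
        boolean changed = false;
        List<Integer> dom = domains.get(i);
        for (int k = dom.size() - 1; k >= 0; k--) {
            int a = dom.get(k);
            boolean supported = false;
            for (int b : domains.get(j))
                if (check.test(i, a, j, b)) { supported = true; break; }
            if (!supported) { pruned.add(new int[]{i, dom.remove(k)}); changed = true; }
        }
        return changed;
    }

    void undo(List<int[]> pruned) {
        for (int[] p : pruned) domains.get(p[0]).add(p[1]);
    }
}
```
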
In my case, I used Forward Checking and AC2001. One of my struggles with implementing Maintaining Arc Consistency was managing the reductions list. I struggled particularly with the fact that even if the instantiation of variable V[i] is consistent under Forward Checking, it may not be consistent under AC2001; that is, though Forward Checking did not wipe out any future variable's domain, AC2001 still could. However, if there are domain values left for the current variable V[i], it is only necessary to undo the current instantiation of V[i] and the reductions associated with that instantiation; a complete backtrack is not necessary. Another aspect I struggled with, also related to managing the reductions list, was the case where Forward Checking deleted nothing from a future variable V[j]'s domain, so no new list relating the two variables was added to the reductions, yet AC2001 did delete some of V[j]'s domain values, so a new list needed to be added. Originally I simply appended the values removed by AC2001 to the last list in V[j]'s reductions; since this was incorrect, I had to check whether Forward Checking had created a list and, if it had not, create a new list to append to the reductions list.
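
One way to organize this bookkeeping (a sketch with hypothetical names, not my actual code) is to give each variable a stack of per-level reduction lists that are created lazily, so that FC and AC2001 prunings made at the same level share one list:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class Reductions {
    private static final class Frame {
        final int level;
        final List<Integer> removed = new ArrayList<>();
        Frame(int level) { this.level = level; }
    }

    private final List<Deque<Frame>> stacks = new ArrayList<>();

    Reductions(int numVariables) {
        for (int i = 0; i < numVariables; i++) stacks.add(new ArrayDeque<>());
    }

    // Record that 'value' was pruned from vj at search level 'level'. The
    // frame is created lazily, so FC and AC2001 prunings at the same level
    // land in the same list (the exact situation discussed above).
    void record(int vj, int value, int level) {
        Deque<Frame> s = stacks.get(vj);
        if (s.isEmpty() || s.peek().level != level) s.push(new Frame(level));
        s.peek().removed.add(value);
    }

    // Undo this level's prunings for vj, restoring its current domain.
    void undo(int vj, int level, List<Integer> domain) {
        Deque<Frame> s = stacks.get(vj);
        if (!s.isEmpty() && s.peek().level == level) domain.addAll(s.pop().removed);
    }
}
```
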
After completely implementing and debugging Maintaining Arc Consistency, I compared the results on the benchmark problems to the results generated by Forward Checking alone. Even on the small problems, such as the 3-queens problem, Maintaining Arc Consistency made more consistency checks than Forward Checking. However, it had fewer backtracks and visited fewer nodes than Forward Checking. On the larger problems, particularly the zebra problem, Maintaining Arc Consistency made more than double the consistency checks, but visited less than half as many nodes and did less than half as many backtracks compared to Forward Checking. The CPU time for Maintaining Arc Consistency was higher than that of Forward Checking in all cases; however, that could simply be a reflection of my implementation. Overall, it is difficult to tell whether the trade-off of more consistency checks is worth the significant drop in nodes visited and backtracks.

Monday, November 28, 2011

Nov. 28, 2011

Recently I’ve been working on implementing AC2001 and MAC. The implementation of AC2001 went fairly smoothly, especially since I had already implemented AC1 and AC3 earlier this semester. I had a few debugging problems, most of which came from confusion about when arcs should be added to the queue. I tested my implementation of AC2001 on about 15 instances (including the 3-, 4-, 5-, and 6-queens problems and a few different instances of the zebra problem). The next step was implementing MAC (Maintaining Arc Consistency) on top of my existing forward checking algorithm. At first it seemed simple: run FC and, if it does not fail, form a queue of all constraints between the future variables and run AC2001. This task has been a lot more challenging than I expected. I currently have the algorithm working correctly for finding one solution, with both dynamic and static ordering. I’ve noticed that, in comparison to FC and FC-CBJ, MAC seems to make quite a few more consistency checks but has fewer nodes visited, fewer backtracks, and lower CPU time. Right now I am working on finding all solutions.

-Mary

Tuesday, November 8, 2011

Weeks 8 and 9


The last couple of weeks have been spent mostly implementing the FC-CBJ algorithm.  We are testing this algorithm using both static and dynamic orderings and finding one and all solutions.  This algorithm has been particularly challenging to implement since it requires a few more methods than the previous ones.  Also, one of the most difficult parts of this assignment has been the dynamic ordering portion.  Once again, I learned the importance of good design!  I originally programmed using a static list of variables: I simply ordered the list at the beginning and then used the same ordering to solve the problem (I obviously was not thinking about having to eventually do dynamic ordering!).  It would have been a better idea to have two separate structures, one for the uninstantiated variables and one for the instantiated variables, as in the sketch below.  Static ordering for finding one solution was challenging, but I did not run into many debugging issues until I got to finding all solutions using dynamic ordering.  I found that my program found one complete solution and then several partial solutions.  Again, this issue came back to the way I had originally designed my program to find all solutions by simply using the static variable list.  Overall, this has been the most beneficial task so far!  I'm excited to continue learning about CSPs and plan on designing my programs a little more carefully in the future.
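
A sketch of that two-structure design (with hypothetical names):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// 'past' holds instantiated variables in assignment order; 'future' holds the
// rest. Dynamic ordering selects the best remaining variable at each step
// instead of relying on a list ordered once at the start.
class SearchState<V> {
    final Deque<V> past = new ArrayDeque<>();
    final List<V> future = new ArrayList<>();

    // Assumes future is non-empty; smaller compares as better.
    V selectNext(Comparator<V> heuristic) {
        V best = null;
        for (V v : future)
            if (best == null || heuristic.compare(v, best) < 0) best = v;
        future.remove(best);
        past.push(best);
        return best;
    }

    void uninstantiate() {           // on backtracking, return the variable
        future.add(past.pop());      // to the future set
    }
}
```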

-Mary

Tuesday, November 1, 2011

October, 2011


Over the last several weeks, we’ve begun exploring and implementing our first CSP solvers. Our first solver implemented the simplest backtracking algorithm, BT, and used several forms of static variable ordering. We studied Patrick Prosser’s paper on types of backtracking, forward checking, and hybrids of the two.

We then began work on CBJ (Conflict-Directed Backjumping), one of the most efficient backtracking algorithms mentioned in Prosser’s paper. We are now moving on to forward checking and dynamic variable ordering, which I expect will be a lot more difficult to implement, but I am looking forward to the challenge. 
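
Here is a sketch of the conflict-set bookkeeping at the heart of CBJ, reconstructed from my understanding of Prosser's presentation rather than copied from it:

```java
import java.util.BitSet;

class ConflictSets {
    final BitSet[] confSet;   // confSet[i]: levels whose assignments clashed with level i

    ConflictSets(int numVariables) {
        confSet = new BitSet[numVariables];
        for (int i = 0; i < numVariables; i++) confSet[i] = new BitSet();
    }

    // A consistency check between level h (past) and level i (current) failed.
    void recordConflict(int i, int h) { confSet[i].set(h); }

    // Dead-end at level i: jump to the deepest level in confSet[i] and carry
    // the remaining culprits up, so no backjumping information is lost.
    int jumpBack(int i) {
        int h = confSet[i].length() - 1;   // deepest conflicting level
        if (h < 0) return -1;              // empty conflict set: search exhausted
        confSet[i].clear(h);
        confSet[h].or(confSet[i]);         // merge confSet[i] \ {h} upward
        confSet[i].clear();
        return h;
    }
}
```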

The last few weeks, I have been spending some extra time improving and debugging my AC and NC algorithms from earlier. NC reduces a problem that has unary constraints, a case not handled by my AC algorithm, which is why it is important to run NC before AC; otherwise, your domains will not be properly filtered. I caught a bug in my NC algorithm only when testing on instances of the zebra problem, our only test cases in which NC altered the resulting domains.
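
A minimal node-consistency pass might look like this sketch (the names are illustrative):

```java
import java.util.List;
import java.util.function.IntPredicate;

class NodeConsistency {
    // unary.get(i) tests whether a value is allowed for variable i, or is
    // null if variable i has no unary constraint. Run before AC so binary
    // revisions start from correctly filtered domains.
    static void enforce(List<List<Integer>> domains, List<IntPredicate> unary) {
        for (int i = 0; i < domains.size(); i++) {
            IntPredicate allowed = unary.get(i);
            if (allowed == null) continue;
            domains.get(i).removeIf(v -> !allowed.test(v));
        }
    }
}
```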

Once I corrected this, I was able to catch a bug in my implementation of AC-3 and work to correct it. Implementing algorithms is what I find to be the most time-consuming aspect of research, but also the most rewarding.

My main frustration lately has been time constraints—having a heavy course-load this semester makes it difficult to find extended periods of time to devote to coding and studying. I am happy to say that I have recently registered for spring classes, and have arranged for a lighter schedule that would allow me to devote more time to research. 

-Maggie

Tuesday, October 25, 2011

Week 7



This week, we continued working on our implementation of the CBJ algorithm.  We studied the hierarchy of algorithms based on consistency checks and also on nodes visited.  We discovered that, overall, FC-CBJ performs the best!  We also began studying the similarities and differences between CSPs and relational databases.  We went through the differences in terminology as well as examples of projection, selection, intersection, join, and union.  Some of the main differences highlighted were: CSPs have many relations while a database query involves only a few; database relations typically have higher arity than CSP constraints; and databases have much larger domains than CSPs (to name a few!).

This week we are planning to finish comparing our implementations of the CBJ algorithm and to begin implementing the FC-CBJ algorithm (first with static ordering, then with dynamic ordering).

-Mary Burke

Tuesday, October 18, 2011

Mary Burke Week 6

This last week we finished our backtrack search algorithm.  I mentioned in my last post that we had issues comparing our search algorithms due to differences in the way we implemented our search heuristics.  One of the problems we ran into was with the degree heuristic: I used the current degree, while my colleague used the future degree.  In my implementation, I simply searched through the list of variables and sorted them by the number of ‘neighbors’ each variable had.  In the other implementation, after a variable is found to have the largest number of neighbors, it is added to the list and then its neighbors’ degrees are decremented.  After we discovered this, we decided to run both types of degree orderings.  Additionally, we ran our backtrack search on random instances of varying tightness.  Since these instances were much larger, we had to obtain special accounts and run our script on the supercomputer at UNL!

Also, during the course of the week, we studied the forward checking algorithm and its hybrids more thoroughly.  We went through the pseudocode of forward checking and studied the data structures and methods used in the algorithm.  We discussed sufficient conditions for backtrack search, backjumping search, conflict-directed backjumping search, and forward checking search.  Finally, we began implementing conflict-directed backjumping and will compare results this week.

-Mary Burke

Monday, October 10, 2011

Week 5

During week 5 we began implementing the backtrack search algorithm to find one and all solutions to a CSP.  To compare results, we implemented several variable-ordering heuristics: lexicographic ordering, least-domain ordering, ordering by the degree of the variable, and ordering by the domain size divided by the degree of the variable.  Some of the issues arising in debugging appear to have come from the ordering, particularly when node consistency deleted some domain values.  We are still debugging the backtrack search algorithm, particularly on the zebra problem CSPs.  Additionally, we studied conflict-directed backjumping and backmarking, and discussed how to find all solutions using these search algorithms.
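
As an illustration, these orderings can be expressed as comparators over a hypothetical Variable interface (not our actual classes):

```java
import java.util.Comparator;

interface Variable {
    String name();
    int domainSize();   // current domain size
    int degree();       // number of neighbors in the constraint graph
}

class Orderings {
    static final Comparator<Variable> LEXICOGRAPHIC =
        Comparator.comparing(Variable::name);
    static final Comparator<Variable> LEAST_DOMAIN =
        Comparator.comparingInt(Variable::domainSize);
    static final Comparator<Variable> MAX_DEGREE =
        Comparator.comparingInt(Variable::degree).reversed();
    // dom/deg: prefer small domains and high degrees (assumes degree > 0).
    static final Comparator<Variable> DOM_OVER_DEG =
        Comparator.comparingDouble(v -> (double) v.domainSize() / v.degree());
}
```

With dynamic ordering, the chosen comparator is re-applied to the remaining uninstantiated variables at every node of the search rather than once up front.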

Monday, October 3, 2011

September, 2011

Research has been very engaging so far. Over the last month, I have been gradually learning the methods and processes of research as well as getting a lot of coding experience. First, we familiarized ourselves with the concept of the constraint satisfaction problem (CSP). We researched the structure of the problem and examined the microstructures, macrostructures, graphs, networks, and hypergraphs used in CSP representation. We explored some of the applications of this type of problem as well. We then implemented code to parse and interpret an instance of a CSP and designed a data structure for storing and manipulating it.  I used Java for this particular task, as Java lends itself well to object-oriented code and CSPs are very modular in nature.
We then developed methods for traversing the vertices of the CSP via its constraints. In my CSP data structure, I gave each variable a set of pointers to its neighboring variables in order to make node traversal simpler. We then expanded our code to include a dynamic domain for each variable of the CSP, allowing us to keep track of each variable’s initial domain as well as its current, modified domain. With this addition to the program, we were able to first implement a node-consistency algorithm to limit the domain of each variable (based on unary constraints), and then implement arc-consistency to eliminate domain values that have no support under the constraint shared by a pair of variables. For this project, we limited ourselves to binary CSPs because they represent a wide variety of problems and can be more easily conceptualized.
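
A simplified sketch of such a data structure (not my actual code) might look like this:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Each variable keeps its initial domain, a current domain that consistency
// algorithms shrink, and pointers to its constraint-graph neighbors.
class CspVariable {
    final String name;
    final List<Integer> initialDomain;        // never modified after construction
    final List<Integer> currentDomain;        // shrunk by NC/AC and by search
    final Set<CspVariable> neighbors = new HashSet<>();

    CspVariable(String name, List<Integer> domain) {
        this.name = name;
        this.initialDomain = new ArrayList<>(domain);
        this.currentDomain = new ArrayList<>(domain);
    }

    void addNeighbor(CspVariable other) {     // constraints are binary here
        neighbors.add(other);
        other.neighbors.add(this);
    }

    void resetDomain() {                      // restore before a fresh run
        currentDomain.clear();
        currentDomain.addAll(initialDomain);
    }
}
```
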
We first implemented an arc-consistency algorithm called AC-1, which removes all inconsistent domain values but is not the most efficient algorithm, as it reiterates over all variable-value pairs each time a domain is reduced. So, we implemented an algorithm called AC-3, which improves on AC-1 by reducing the number of constraint checks on CSPs of moderate tightness. AC-3 uses a dynamic queue that minimizes the number of variable domains rechecked and additionally reduces CPU time on moderately difficult problems. AC-3 was also slightly more difficult to implement.
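
The structural difference between the two algorithms can be sketched as follows, with the revise step left abstract and domain wipe-out checks omitted for brevity:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// AC-1 re-sweeps every arc whenever anything changed; AC-3 re-enqueues only
// the arcs (k, i) that a change to D(i) can actually affect. revise(i, j)
// prunes D(i) against D(j) and reports whether D(i) changed.
abstract class AcLoops {
    abstract int n();                            // number of variables
    abstract boolean constrained(int i, int j);  // does arc (i, j) exist?
    abstract boolean revise(int i, int j);       // prune D(i) against D(j)

    void ac1() {                                 // full sweeps until a fixpoint
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < n(); i++)
                for (int j = 0; j < n(); j++)
                    if (i != j && constrained(i, j) && revise(i, j))
                        changed = true;
        }
    }

    void ac3() {
        Deque<int[]> q = new ArrayDeque<>();     // seed with every arc once
        for (int i = 0; i < n(); i++)
            for (int j = 0; j < n(); j++)
                if (i != j && constrained(i, j)) q.add(new int[]{i, j});
        while (!q.isEmpty()) {
            int[] arc = q.poll();
            if (revise(arc[0], arc[1]))          // D(arc[0]) shrank: only arcs
                for (int k = 0; k < n(); k++)    // into arc[0] need another look
                    if (k != arc[0] && k != arc[1] && constrained(k, arc[0]))
                        q.add(new int[]{k, arc[0]});
        }
    }
}
```
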
We then tested our algorithms through a graphical performance analysis on hundreds of random binary constraint satisfaction problems. We graphed this data to determine trends in the number of constraint checks and computation time for both algorithms. We showed that AC-3 did, in fact, perform better than AC-1 on moderately difficult problems. We performed a t-test on the number of constraint checks done by each algorithm at different tightness levels of the random CSP instances in order to show that AC-3 performed better on average for CSPs of moderate-to-high tightness. This testing was also a useful debugging tool, as it allowed us to ensure that both algorithms removed the same values from the variable domains of each CSP instance.
As this is the first research project I have worked on, I still have much to learn about research methods and tons of background material to delve into. Nevertheless, I am looking forward to implementing new algorithms and gaining a lot from this research opportunity!

--Maggie Krause

Friday, September 30, 2011

Week 1-4


My group began our research by doing background reading on Constraint Satisfaction Problems (CSPs) and the SAT problem in Rina Dechter's book, Constraint Processing.   Our advisor, Dr. Choueiry, spent a fair amount of time during the first week making sure we had a clear understanding of these two problems.  Our next step entailed utilizing a Java CSP parser and creating classes to store the CSP.  This was a fairly simple task but came with a few issues, mostly in trying to understand the parser.

With our initial program completed and thoroughly tested, we moved on to studying arc consistency.  In particular, we studied the AC1, AC3, and AC2001 algorithms.  We went through the pseudocode of all three of these algorithms and implemented AC1 and AC3.  Each of us implemented these algorithms and recorded the number of consistency checks, how many domain values were removed, the CPU time, and other information for each CSP instance.  Once we had implemented the algorithms separately, we compared results to make sure our implementations were correct and our results were consistent.  We ran into quite a few issues with this and spent ample time debugging our programs and working through smaller problems manually to see what the actual numbers should be.
After this step, we ran both the AC1 and AC3 algorithms on 800 randomly created instances: we varied the tightness of the CSP and created 50 instances at each tightness.  We compared the CPU time and the consistency checks of AC1 and AC3, graphed the averages, and ran a t-test at each tightness.  Our results showed that AC3 indeed performs better than AC1.

Our next step was to study algorithms that search for solutions to CSP instances.  We studied Patrick Prosser's paper, "Hybrid Algorithms for the Constraint Satisfaction Problem".  We are in the process of implementing the backtrack search algorithm.

(We will be posting weekly on our progress)

-Mary Burke