CS470/570 Artificial Intelligence

Program #1: Fred Flintstone problem-solving
(a.k.a. Warming up our Python mojo)

Overview:

There's nothing intrinsically different or magical about AI programming; it's all just software...only it just happens to be specialized towards certain analytic orientations, algorithms, and problem-solving goals. Well heck, you're all rock-star programmers, right? So let's get that programming mojo warmed up with a little basic problem-solving challenge!

I'm calling this "Fred Flintstone problem-solving" because it's just that: we know nothing at this early point in the course, so we can offer only completely off-the-cuff, uninformed, "naive" solving of a problem. As we move forward, we'll soon develop a broader understanding of the intellectual "terrain" surrounding problems like these: what we are really doing, what the alternative solution approaches in this terrain are, and how to think about what will work best. But for now, we're just going to get out our caveman club and flail away... and then grunt happily when we get a solution!

The Problem:

In this first small programming exercise, we will consider how we can solve Boggle. If you haven't played in a while, here's the gist of the game: There are 16 cubes with letters on the faces. These cubes are randomly arranged in a 4x4 matrix by shaking the Boggle game. The goal of the game is to make words out of these letters by traversing adjacent (horizontal, vertical or diagonal) tiles. This "chain" of letters may snake all over the board, but you can only use each tile once, i.e., no fair using the same tile twice in a word. In a fixed amount of time, players must make as many words as possible. Words are then scored as follows: 1 point for each 3-4 letter word, 2 points for a 5-letter word, 3 points for a 6-letter word, 5 points for a 7-letter word, 11 points for an 8 (or more)-letter word.
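The scoring rule above is easy to capture in a few lines. Here's a minimal sketch; the function names score_word and score_words are made up for illustration, and treating words shorter than 3 letters as worth 0 points is an assumption (the rules above only list scores from 3 letters up).

```python
def score_word(word):
    """Return the Boggle score for one word, based on its length."""
    n = len(word)
    if n < 3:
        return 0   # assumption: words under 3 letters score nothing
    if n <= 4:
        return 1   # 3-4 letter words
    if n == 5:
        return 2
    if n == 6:
        return 3
    if n == 7:
        return 5
    return 11      # 8 or more letters


def score_words(words):
    """Total score for a collection of words, counting each word once."""
    return sum(score_word(w) for w in set(words))
```

For example, score_words(["CAT", "HORSE"]) adds 1 point for the 3-letter word and 2 points for the 5-letter word.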

The Assignment:

In this problem, you are asked to solve Boggle boards exhaustively: given a particular boggle board as input, your algorithm should enumerate all possible words that can be found in that Boggle board.

The dictionary we will use for our game of Boggle is the Tournament Scrabble Wordlist (twl06), which includes 178,691 words. I've cleaned up and provided a file of dictionary words for you here.

Your program should take the names of a dictionary file and an NxN Boggle board file as command-line arguments (e.g., python boggle.py dict.txt board.txt) and read both in. It should then discover all possible words existing in the given board and print out a summary of its findings.
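To make the input side concrete, here's one possible sketch of the loading step. It assumes a dictionary with one word per line and a board given as rows of space-separated letters; the helper names (load_dictionary, parse_board, load_board, main) are hypothetical, and converting everything to upper case is just one convenient convention, not a requirement.

```python
import sys


def load_dictionary(path):
    """Read one word per line into a set (upper-cased by assumption)."""
    with open(path) as f:
        return {line.strip().upper() for line in f if line.strip()}


def parse_board(text):
    """Turn N lines of N space-separated letters into a list of rows."""
    # Blank lines are skipped; split() ignores extra spaces on a line.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    rows = [[letter.upper() for letter in row] for row in rows]
    n = len(rows)
    if any(len(row) != n for row in rows):
        raise ValueError("board is not NxN")
    return rows


def load_board(path):
    """Read a board file and parse it."""
    with open(path) as f:
        return parse_board(f.read())


def main(argv):
    if len(argv) != 3:
        print("usage: python boggle.py dict.txt board.txt")
        return
    words = load_dictionary(argv[1])
    board = load_board(argv[2])
    print(f"{len(words)} dictionary words, {len(board)}x{len(board)} board")


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the dictionary in a set makes each "is this a word?" check fast, which matters once the solver starts checking lots of candidates.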

Detailed points:

Any dictionary used will have the same format as the twl06 dictionary, i.e., one word per line in a text file. This means that, given the appropriate dictionary file, your program can Boggle in French just as easily as in English!
The Boggle board will consist of N lines of N letters separated by spaces. You should ignore extra space at the end of a line or extra newlines at the end of the file. (Hint: check out Python's str.strip() method)
You should always assume Q is special and represents 'QU'. So "QEEN" will count as the word "queen"...but should still be listed as 'QEEN' in the found words, and count as a 4-letter word.
To ease scoring, your output should group found words by length (1-letter, 2-letter, ..., X-letter words). To ease correctness checking, your program should also print the total number of words found and provide an alpha-sorted list of all words.
To give an idea of efficiency, your program should report the time taken to run your code (Hint: time.time() in the time module) and the total number of words checked (positions explored). The time is merely of passing interest, since it will vary by machine (processor speed, memory, etc.). What is really telling is the number of words explored.
I'm providing a couple of sample boards and their outputs to help verify your results.
1. Here is a standard 4x4 Boggle board and here is my output file for it.
2. Here is a nice fat 10x10 board, along with the outputs for it. As you can see, it's not even manageable without pruning the search! The combinatorics are killer here!
Your program's output does not have to be *identical*, but should be very close, i.e., it reports the same info in the same order, nice and clean and readable.
Efficiency matters in time-based scenarios (like games). Keep efficiency close in mind as you design your code.
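A few of the points above (the special Q, grouping by length, and timing) can be sketched as small helpers. This is only an illustration: the names spelled_out, group_by_length, and report are hypothetical, the output layout is made up and is not the format of the sample output files, and the found words and explored count in the demo are placeholders standing in for whatever your own solver produces.

```python
import time
from collections import defaultdict


def spelled_out(tiles):
    """Expand board letters to dictionary spelling: every Q stands for QU."""
    return tiles.replace("Q", "QU")


def group_by_length(found):
    """Map word length -> alphabetically sorted words of that length."""
    groups = defaultdict(list)
    for word in sorted(set(found)):
        # Length is counted on the board letters, so 'QEEN' is a 4-letter word.
        groups[len(word)].append(word)
    return dict(sorted(groups.items()))


def report(found, explored, seconds):
    """Print a summary (this layout is an assumption, not the official one)."""
    for length, words in group_by_length(found).items():
        print(f"{length}-letter words: {', '.join(words)}")
    print(f"Found {len(set(found))} words, explored {explored} positions "
          f"in {seconds:.4f} seconds")
    print("All words:", ", ".join(sorted(set(found))))


if __name__ == "__main__":
    start = time.time()
    found = ["TEN", "QEEN", "NET"]   # placeholder for real solver results
    explored = 0                     # placeholder for the solver's counter
    report(found, explored, time.time() - start)
```

The idea is to look up spelled_out("QEEN"), i.e. "QUEEN", in the dictionary while still recording the word as 'QEEN' in the found list.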

This is not a particularly hard problem...provided you think it through! (Hint: elegance, recursion). Just as a reference point: my solution has one main function of about 15 lines, plus three smaller helpers to load/print out boards and stuff. Without comments, the whole thing fits on a page.

Your write-up:

In addition to your code and solution print-outs, you'll need to provide a nice write-up of your solution. Your write-up should be professionally neat and must include:
A brief description of your solution approach/strategy. You don't need to dissect all of your functions etc. here. What I want is for you to describe your algorithm abstractly: how is it that your program goes about solving the problem.
Answers to the following questions:
- Analyze the problem. How many different words did your solver explore on a 4x4 board? 3x3? How many possible combinations of letters (i.e. actual words or not) can be constructed from an NxN board? Walk through your reasoning carefully, showing how your value comes together. Let's keep it simple, just to get a decent upper bound without needing a PhD in combinatorics: ignore the detailed paths possible (or not), and just assume that every letter could be chained with every other letter on the board.
- Explain how solving Boggle is a search problem.
- Use your solver to solve a bunch of Boggle boards and reflect on the solutions you saw produced. Suppose there is a Boggle competition where players are given a sequence of boards to solve, and the time limit decreases with each board. Based on your analysis of your explorations, what strategy for finding words would a "smart" (or as we'll call it in this course, "rational") program employ?
- In the sample output provided, there are two runs on the same board shown...with drastic differences in time/resources used to find identical results. What the heck could that devious Dr. D be doing here to achieve this? Magic? Hint: put in some print statements to watch your program work...and then reflect on the implications of where effort is wasted. The difference between the two outputs in my source code is exactly one line of code...plus another easily-created resource.

To turn in:

A professional packet with the following items in exactly this order:

Cover sheet: Name, course, assignment title, date
Your write-up, typed up, cleanly formatted.
Printout of your program output running the boards given in the first sample file.
Printouts of your program running the dynamically assigned boards. THIS LINK to the testing boards will be activated shortly before the due time.
Your fully-commented code (may be duplex printed).