Homework Assignment #3

Due Tuesday, November 6, 1:00 PM (Before Class)
60 points total

1. (30 points) Augmenting a Trie to Generate Random Stylized Text - Write in Python

For this problem I have given you some starter code in python: trie.zip. The file main.py and Trie.py loads in a textfile, splits out each word, and inserts the word into a trie. A function allows you to type in words and they are looked up in the trie. There are several text files in the folder that you can work with. alice.txt is Alice in Wonderland and romeoandjuliet.txt is Romeo and Juliet. The text is from Project Gutenberg and the original, unedited files are also in the folder as 11-0.txt and pg1112.txt. There is also some C++ code for a red-black tree in redblack.txt. For testing purposes, you might want to create a text file with just a few words if you want to see how the trie is constructed.

The following diagrams depict adding the words "hi", "hot", and "pi" to the trie. Initially the trie is a single TrieNode object with an empty children dictionary. The endsOnWord variable indicates if the node marks the ending of a word:

After adding the word "hi":

After adding the word "hot":

After adding the word "pi":

Part 1: Remove the call to input and search for a word in the trie from main.py. That was only added as a demo. Write a method in Trie.py that randomly chooses a word from the trie and returns it. The method should randomly choose a child node with equal probability. As you navigate the trie, if you are at an internal node (i.e. it has children) but it is also a valid word, return the word as the random word with 50% probability. Otherwise move on to a random child. Output a 100 word "paragraph" of random words. Here is an example using Romeo and Juliet:
pleading dexterity lath, goes ?the idolatry, porter widow pipes why, [a 1. 1. feel. phaeton be nuptial rhyme, 3. zounds, gate. he sallow m. name clubs zounds, cell. -the (o 2. digging [attended], so; ?the (within) [snatches 2. scope up, immediately. yonder rapier, cynthia's up 1. hymns hire 1. rigour vast a augmenting yard keys hills. hymns fly. "where drown'd, keys (o nine. iv. garish burns o 3. ?the r. snowy 2. -the tying "where "where whore!' quivers. 'heart's dainty, omnes. errand. eight. 3. he knight nine. limit, queen untimely rose pop'rin queen zounds, debt. (mark abed 2. i

Part 2: You could get a more meaningful random paragraph by choosing a word that follows the previous word in the text. One way to do this is to augment the trie by adding a list to each node that references the TrieNode that represents the words that follows. In our example where we add the words "hi", "hot", and "pi" in that order then we would end up with the following diagram:

The Nextwords list would be empty for all other nodes except those shown. Although only one "next word" is shown in each list for this example, in general there could be many next words. Modify your program to output a second paragraph, except this time generate it by first picking a single random word. Then for the next 100 words, pick a random word that follows the previous random word. Here is an example using Romeo and Juliet:
me of fair word, we'll to mar,' quoth my love, but passion lends his remedy. jul. 'tis not so fair a sea nourish'd with flowers and so full of good quarrel, sir? the measure of my lord, we pass; but romeo may trust me, it down. give it excels your will be rank'd with wild goose in that black and found me lurk where i see where have fought on my head so tedious is love be in that of the deadly level of breath? the face. speak more dark and threat'ned me now is as't should confess to himself
The text should make a little more sense than the random single words! Experiment with the other files, alice.txt, and redblack.txt to generate a random paragraph in some other styles.

10 pts extra credit: You could get a more common sequence of random words by selecting words that are more likely to follow the previous word. One way to do this is to add a counter to each node that counts how many times that node is linked to from a previous word (not from a parent node). You can now choose the next word probabilistically based on the number of linkings. For example, suppose the next word list links to three TrieNode word objects. The first was linked to 1 timem the second was linked to 10 times, and the third was linked to 5 times. Then you should select the first word with 1/16 probability, the second node with 10/16 probability, and the last word with 5/16 probability. This will generate more commonly accepted sentences, but will have less variation in word selection. Add this technique as a third paragraph for extra credit. Here is an example using Romeo and Juliet:

zounds, a man to do i to a man? give me from the house and the house of my sweet is the peace. for that the wall and all the friar to see that they saw her to my old as i will have this is my eyesight lost. show me a montague, and i never be my mother? why, i must i will be well, i will they are a right fair juliet is your love is not to the way to be my blood of my man to the earth too good to thy lady to be a

2. (15 points) Shortest Path on a Map  - Write in Python or Java or C++

The map below represents a crude graph of places around Anchorage. Each location is a node. The red lines indicate edges between nodes. The edges have weights, which is the distance between the nodes. All of this information is stored in the data below, which you can save in a text file and read into your program.

The format of the file is:

Line 1: Number of nodes in the file

The following then repeats for the number of nodes in the file:

Number of node (e.g. 1 on the map)
Name of node (e.g. "JBER" on the map)
Number of edges from this node
NeighboringNodeNumber  WeightOfEdgeToNode
... (repeats for the number of edges)

In the data below, Tikahtnu is node number 2 and has 4 edges. It is connected to Node 1 with weight 2.71, Node 3 with weight 2.21, Node 8 with weight 1.8, and Node 9 with weight 2.39

15
1
JBER
1
2 2.71
2
Tikahtnu
4
1 2.71
3 2.21
8 1.80
9 2.39
3
Science & Nature Museum
5
2 2.21
5 1.19
7 2.60
8 2.18
13 2.40
4
Sullivan Arena
4
5 1.31
6 0.79
14 1.25
15 0.76
5
Merrill Field
3
3 1.19
4 1.31
14 1.45
6
Anchorage Museum
3
4 0.79
7 0.41
15 1.58
7
AK Railroad
2
3 2.60
6 0.41
8
Cheney Lake
4
2 1.80
3 2.18
9 1.25
13 2.09
9
Totem 8
4
2 2.39
8 1.25
11 1.89
13 2.90
10
Campbell Creek Science Center
1
11 1.61
11
Crime Lab
4
9 1.89
10 1.61
12 1.89
13 1.38
12
Golden Donuts
3
11 1.89
13 0.94
15 1.40
13
UAA
6
3 2.40
8 2.09
9 2.90
11 1.38
12 0.94
14 0.81
14
Don Jose
4
4 1.25
5 1.45
13 0.81
15 1.02
15
Steamdot Coffee
4
4 0.76
6 1.58
12 1.40
14 1.02
        
To do:  Read this data into a graph.  Let the user enter a source and target destination (by number is OK). Then run Dijkstra's algorithm to find the shortest path between the source and target, and output the total distance and path that should be traveled to reach the destination from the source.  Prompt the user if he or she wishes to repeat the calculation.

3. (15 points) Minimum Spanning Tree on a Map - Write in Python or Java or C++

Using the same graph as problem 2, write a program that calculates a Minimum Spanning Tree and output the edges and total edge weight of the MST. The MST could be used if you wanted to connect all the nodes in some way (e.g. water, electrical network connection). Your program can use any MST algorithm that you like as well as the data structures of your choice for internal implementation (e.g. making a set or queue).