Computer Science 302
Introduction to Data Structures
Spring 2005
Assignment 6
Due Thursday, May 7, 2005.
This deadline cannot be extended.
This file updated:
Mon Apr 18 17:35:26 PDT 2005

-
Redo Assignment 4, the minimum spanning tree assignment, but use
efficient data structures and algorithms for each portion of
the code.
-
The inputs for this assignment are the same as for Assignment 4, as
are the required outputs.
-
It is expected that students will discuss the assignments. But do
not turn in code written by another person. If that happens, both
persons are violating the ethics of this course. Representing the
work of another person as your own is plagiarism, and may result in
a disciplinary "F" in the course.
It is permissable (in this course, as this is up to the instructor)
for you to obtain ideas from other students or from other persons.
But you must type in the code yourself.
It is unadvisable for you to allow another student to have an electronic
copy of your work, as then you would unwittingly be a party to a breach
of these rules. Showing the other student a hard copy is less dangerous.
The above rule does not apply to copying files that I have made publicly
available. I expect that you will use those in their original form or
with appropriate modifications (made by just you, of course).
If you utilize code that you obtain from other sources, be sure to give
credit in a comment, such as:
// this code from website www.mit.edu/help/x13
-
Find the least cost telephone network connecting all villages to the
capitol city on an island.
A certain island has one city, which is the capitol, and n
villages. The Governor of the island has hired consultants from the
UNLV College of Engineering to optimally construct a data network linking
all villages to the capitol city. This network will consist of
data lines, where each data line is between two villages or between
a village and the city. A village may have a router which can
connect to any number of data lines, but no routers are allowed
outside the villages or the city.
Consultants from another department, taking into account the varied
terrain of the island, have provided you with a list of costs for running
data lines between each pair of villages and each village and the city.
Using those costs as inputs, design a minimal cost network.
-
Here are the data structures and algorithms that you must use.
-
Represent the graph as two data structures:
-
A search structure of V of towns, where the key is the name of the town.
You need to be able to do findInsert on V, and it must be designed so that
each findInsert takes O(log n) time, where n is the number of towns.
There are several possible implementations for V which fulfill the
conditions. You could use a hash table, or a treap, or perhaps one of the
other structures we've discussed or will discuss in class.
-
A structure E which contains all edges. Each edge will contain two
pointers to towns, as well as a weight. Do not store the names of
the owns in the edge. Since those names could be long, they should
be stored in only one place, namely in V.
E must be designed so that it is possible to insert an edge, and it is
also possible to delete an edge of minimum weight. Two possible
implementations of E would be a heap and a sorted array. There are
other possibilities. If you implement it as a heap,
make sure each operation takes O(log n) time. If you implement it
as a sorted array, make sure that your sorting algorithm take O(n log n)
time.
-
Use Kruskal's algorithm. We start with a subgraph S consisting of all
towns and no edges. At each step, we pick a least weight unused edge
and check to see whether the two ends of that edge belong to the same
component of S. If they do, the edge is discarded and S is unchanged
(although a path compression in F might occur). If they do not,
that edge is added to S. The algorithm stops when S is connected,
which will happen after exactly n-1 edges are added to S.
-
You must have a forest F whose nodes are in one-to-one correspondence
with the members of V.
Each node x of F must contain the following data:
-
A pointer to a member of V.
-
A boolean indicating whether x is the root of a tree of F.
-
A number indicating how many nodes are in the tree rooted at x, if
x is a root. If x is not a root, this number is ignored.
-
A pointer to the parent of x, another node of x. If x is a root of F,
then the parent of x is either itself or null, whichever you prefer.
Pointers can be implemented in at least two ways: the pointer to dynamic
variables that are built into C++, or integers that are indices of an array.
There may be other ways to implement pointers.
-
You must use union-find with path compression, as I have explained in
class.
-
Instead of having V and F as separate data structures, I recomment
a combined data structure "VF" as follows:
-
VF is a search structure, keyed by name of town, which allows the
operation "findInsert(townName)" in O(log n) time.
-
Each node of VF contains the name of the town, whatever you need
to implement the search structure (i.e., if you use a treap, you would
need a randomly chosen number "priority" and left and right pointers
to other nodes), a pointer to its parent node, a boolean indicating
whether it is the root of a component of S, and a number giving the
size of that component if it is a root of a component of S.
-
Initialize VF to be empty. For each input line, read the two town
names as well as w, the weight of the edge. Then obtain two nodes
of VF, say x and y, using findInsert. You may think of x and y as
pointers. Insert the triple (x,y,w) into E.
-
After all input has been read, sort or heapify E, whichever.
-
For each member of E, in order of increasing w, execute the following:
lx = findLeader(x) [Use path compression]
ly = findLeader(y) [Use path compression]
If lx is not equal to ly, then:
union(lx,ly) and insert (x,y,w) into S.
-
You can stop when you have done union n-1 times, where n = number of towns.
but if you don't do that check, it works anyway.
-
Print S and the total of the weights of the members of S.
-
Sample Input and Output.
-
Specifications.
-
Each input file consists of at most 1024 lines, one for each edge.
-
Each input line consists of at most 80 characters.
-
Each line consists of the name of a town followed by a comma
followed by the name of another town followed by a space followed by
a positive integer. There are no spaces in the names of the towns.
For example:
Chicago,NewYork 731
SantaAna,WestCovina 62
-
Your program should print out a minimum cost spanning tree, by listing
the edges and the total cost of the spanning tree. If there is more
than one minimal cost spanning tree, just list one of them.
-
If the graph is disconnected, there is no spanning tree.
Your program must then report that. In this case, list
all the edges, followed by the statement "The graph is disconnected."
-
The graph is planar. That implies that numberOfEdges = O(numberOfTowns).
In fact, if a planar graph has n
vertices and m edges, then
m <= 3n - 6, provided
n >= 3.
This information may or may not be useful to you. But, whether you make
use of it now or not, there is some other application in which that
information is of importance.
-
Input files needed.
Run your program on the following four input files:
-
As always, the programs must be executed on our machine, "student".
If you cannot get your program to run on that machine, even though
it works on some other computer, that will be graded the same as
non-working code, in other words,
partial credit might be given.
-
Turn in all files, including a file that shows several sample runs.
All files must be turned in electronically to the teaching assisant.
The deadline for electronic submission
is Thursday, May 7. The submission must be timestamped
before midnight. You must mail it from your engineering college account.
Be sure to include the string "CS 302 001 Hw6" in the subject field
of your email.
How I did it.
