Assignment 3, C++ Intro - Enhanced Word Class, Dictionary Class


This page last updated on: Jan. 21, 2002


Contents:


Due Date

Feb. 25, 2002 by end of day

Description

There are two classes to be implemented in this assignment:

Word Class:

Enhance the Word class by adding a default constructor (if one has not been defined yet), copy constructor, assignment operator, equality operator, 'less-than' operator, and output stream operator.

Dictionary Class:

Implement a Dictionary class. This will be one of the higher-level building blocks of the spell checker application and along with the Word class will provide most of the core functionality in the application. A Dictionary object stores multiple Word objects in a searchable structure, provides operations for searching and comparing input strings (words) against the dictionary object, and provides functionality which associates Word objects that have the same Soundex value. This allows the spell checking application to display Word objects which sound similar when a misspelled word is found in a document. A Dictionary object will have the following operations associated with it (at a minimum):

This assignment introduces the use of templates as a type parameterization for container classes. Specifically, a pre-written C++ standard library container class such as std::vector will be used, with the Word class as the template argument. Using pre-written container classes allows quicker development of applications that need to efficiently manage multiple objects. The implementation details are hidden from the application using the class, and alternate implementations or containers can be substituted with minimal changes to the source code.

The data structure and search mechanism for the Soundex related Words is also encapsulated in the Dictionary class, allowing the client application code to request a collection of Soundex related Words without concerning itself with implementation details.

Requirements

Word Class:

Add a default constructor (if one has not been defined yet), copy constructor, overloaded assignment operator, overloaded '==' operator, overloaded '<' operator (for container ordering), and overloaded '<<' operator for output stream printing.

Dictionary Class:

This implementation of the Dictionary class could use a sorted array of Word objects, stored internally as a standard C++ std::vector container, or a similar type parameterized data structure. An additional optimization is a data structure that provides associative lookup capabilities for the Soundex values.

Create a header file named dict.h containing the class declaration and an implementation file named dict.cpp. This class is a good example of a public interface and a private implementation. Since the public interface provides the only access to the functionality, an alternate implementation could be provided that has the same functionality, but different performance characteristics. For example, a hash table could be used for the implementation instead of a sorted std::vector, and the client code that uses Dictionary objects would not need to be changed (only recompiled with the alternate implementation).

Duplicate Word objects may be added to the Dictionary object - since the search logic works even with duplicates, the extra logic to first check for an existing duplicate Word is not needed.

A std library container is also a good facility to return the collection of Word objects related by Soundex values. Consider the implications of returning a collection of Word objects versus a collection of Word pointers, as well as returning the container by value versus by pointer or reference.

Both:

As usual, write the source code which implements the Word and Dictionary classes, and a test program which tests the member functions and functional logic of both classes. The Word source will be separated into word.h and word.cpp, while the Dictionary source will be separated into dict.h and dict.cpp. The test program will be called tstdict.cpp.

The test data will come from two files, each text files consisting of a set of words (one per line). The test application will create a Dictionary object from the second set of words, then check if each word from the first file is contained in the Dictionary. Use the std::ifstream capabilities built in to the C++ standard library for the file processing. The two file names will come from the command line arguments, with the first command argument being the set of words for testing, and the second being the Dictionary words. In addition, the enhanced Word class capabilities will be tested.

In particular, the test program should consist of at least the following functionality:

  1. Create a Dictionary object
  2. Insert multiple Word objects into the dictionary (one from each line in the second text file), destroying them appropriately (either automatically as stack objects, or with the delete operator)
  3. From the first text file, search the Dictionary object for matches with each Word object, and also display the collection of Soundex matches
  4. After the text file processing, create two Word objects, one on the stack, one from the heap
  5. Compare the Word objects, using the overloaded '==' and '<' operators
  6. Copy construct a third Word object from the second Word object
  7. Create an empty (using the default constructor) fourth Word object
  8. Using the overloaded '<<' operator, print all four Word objects (in one statement)
  9. Assign the first Word object into the third and fourth Word objects
  10. Using the overloaded '<<' operator, print all four Word objects again

Internals

Since the copy constructor, assignment operator, and the regular char const * constructor all have similar logic in the Word class, split this code into a separate private member function ("helper method"). A suggested interface would be something similar to:

private:
  void initWord (int len, char const* initStr);

The overloaded '<<' operator must be a non-member function since the Word object is on the right-hand side. An alternative to declaring the function as a friend function is to have the non-member function call a public output function that belongs to the Word class:

ostream& Word::streamOut (ostream& os) const {
  return (os << " " << mWordPtr << " ");
}

ostream& operator<< (ostream& os, Word const& rhs) {
  return rhs.streamOut(os);
}
Both the operator< and operator== functions should be implemented in terms of functions already written (e.g. compare and isEqual).

Important note - there is an important logic design decision to be made for implementing the default constructor. If the default constructor initializes the internal character pointer to null , then this must be handled appropriately in all of the member functions. Specifically, none of the standard C library string functions accept a null pointer (according to the C standard), so anything doing comparisons or length checks on the internal string must be guarded with conditional checks. In addition, the regular constructor must have logic to guard against a null pointer being passed in. Another approach for the default constructor logic (or null pointer passed in to the regular constructor) is to allocate a single char from the heap and initialize it to a nul char - i.e. the internal character pointer will point to an empty string. This will be handled safely and consistently with all of the other member functions in the class.

In the default constructor, all of the member data must be initialized, including the soundex char array, so that Word objects work correctly when compared, or accessed in other similar fashions.

The Dictionary class will need to encapsulate a container of Word's, along with a flag specifying whether it is sorted or not. The class should contain the following:

The sort state is needed so that the container is only sorted when necessary, instead of each time a Word is inserted. (Alternatively, an associative container could be used.) The condition that triggers a sort is the first time a search is attempted upon the container of Words. When a Word is inserted (and when the Dictionary object is initialized) set the state to unsorted, then when the first search on the container is tried, sort it and change the state to sorted. If another Word is later added, change the sort state back to unsorted, and follow the same logic.

For sorting and searching, a convenient set of algorithms have been included in the standard C++ library. These can be used by including the <algorithm> header file, and calling them as follows:

#include <vector>
#include <algorithm>
#include "dict.h"
// ... the following snippets of code will be somewhere in dict.cpp:
//   time to sort
  std::sort( mWords.begin(), mWords.end() ); // sort the whole container object
//   time to search
  std::binary_search (mWords.begin(), mWords.end(), w); // w is Word to search for, returns true or false
Note that the copy constructor and assignment operator functions can be disabled in the Dictionary class by declaring them as private.

To create a std::vector object with Word as the template type, declare the following:

  std::vector<Word> vecObject; // use an appropriate name for each file container object
std::vector provides a push_back member function, which takes an object by value (in this case, a Word object).

The size() member function returns the number of elements in the container. To loop through a std::vector container, indexes can be used as if the std::vector object is a regular C or C++ array:

  for (int i = 0; i != vecObject.size(); ++i) {
        // do something with vec[i], which will be the i'th element of the vector container
  }
The std::vector class is in the standard library namespace, so it needs to have std:: prepended.

Use standard C++ I/O functions for file handling, using the following include files:

#include <iostream>
#include <fstream> 
An I/O object of type std::ifstream can be created, passing in the name of the file as the constructor argument:
  std::ifstream inFile(fname);
Errors or end-of-file can be checked by invoking a conversion operator:
  if (inFile) { // inFile object is in a good state
The simplest approach to parsing the file is to use the std::fstream operator>> functionality to fill up a standard C++ library std::string object. As each std::string object is extracted, it will contain characters from the std::fstream up to a whitespace character:
#include <string> // C++ std lib string class, different than C library string functions
// ...
  std::string wordStr;
  while (inFile >> wordStr) { // do something with the string object
If there are one or more alphabetic characters, a Word object can then be constructed, relying on the Word constructor to skip the non-alpha characters. A std::string object provides access to the internal data as a char const * with the c_str() member function (e.g. wordStr.c_str() ).

More explanation will be given in class, as necessary.

Topics Learned

C++ features and general software development abilities used in this assignment include operator overloading, output stream overloading, object creation and destruction, name mangling, private member function design and implementation, sorting and searching, private functions as a disabling method, templatized containers, object state encapsulation, object lifetimes, private vs. public interfaces, and implementation vs. algorithm choices.

This page constructed by Cliff Green, Copyright © 1998-2002.