CS 231: Lab #8

Title image Spring 2020

Hash Tables

The goal of this lab period is to give you an opportunity to implement a hash table. Like your BSTMap from last week, it should implement the MapSet interface and store KeyValuePairs.

Video: Hashmaps

Tasks

  1. Setup

    Make a new folder for this week's project.

    Copy from last week or download the MapSet.java interface.

    Copy your KeyValuePair.java and WordCounter.java files from last week's project. You will be using a hash function on the Key to generate the index into the hash table.

  2. Create a Hashmap class that implements MapSet

    public class Hashmap<K,V> implements MapSet<K,V>

    Design your own implementation of a hash map. It is up to you whether or not to use the default hash function provided by Java, it is up to you how to handle collisions, and it is up to you to decide whether to implement open hashing or closed hashing.

    Open hashing uses a data structure like a linked list or binary search tree at each entry in the hash table so multiple entries can be stored at each position.

    Closed hashing allows only a single value to be stored at each location in the hash table and uses chaining to store colliding keys in a different location.

    A reasonable hash function implementation is to use the built-in hashCode function for the String type (lookup Java String hashCode) to generate an index given a String. Note that the hashCode function can return negative values (which is bad) so use Math.abs to compute the absolute value.

    A reasonable method of implementing open hashing is to use your BSTMap from last week to hold all of the entries at a given index in the hash table. The BSTMap will handle collisions in an efficient manner.

    The Hashmap keySet, values, and entrySet methods do not need to return the data in any particular order. The keySet and values methods should return it in the same ordering.

    Implement two constructors for the Hashmap class. One should have only the Comparator as its argument. The other should have both the Comparator and a suggested initial size for the hash table. The Comparator needs to be able to compare two things of type K.

        // Hashmap constructor that starts with default size hash table
        public Hashmap(Comparator incomp) {
            // code here
        }
    
        // Hashmap constructor that starts with the suggecsted capacity hash table
        public Hashmap( Comparator incomp, int capacity ) {
            // code here
        }

    The Hashmap's internal hash table array should be able to expand. When it should expand up to you to decide. If you are using open hashing with a BSTMap at each position in the hash table, you can allow multiple entries per location (on average) without losing efficicency. When the average number of entries per location goes above some specified value, then increase the size of the table. What the threshold should be is something you should test.

    If you implement closed hashing, then hash maps that are more than 50% full quickly become less efficient.

    Do not use an ArrayList as your hash table array. For a generic Hashmap, allocate a fixed size array of type Object[], because you cannot allocate an array of a generic type.

    Include a field in the Hashmap that counts the number of collisions that have occurred. This will be useful information when comparing different hash functions or overall performance. Please use the following definition of a collision.

    A collision occurs the first time a key is inserted and there is a different key already at the key's hashed location.

  3. Test your class thoroughly. Make sure there are no duplicate key entries. For debugging purposes, you may want to be able to print out a representation of your entire Hashmap that shows what is at each table entry (i.e. a toString method).

Once you have completed the Hashmap class, get started on the project.