10 - Hashing


10.1 - Introduction

  • Hashing = A technique that retrieves the value using the index obtained from the key without performing a search.
  • Hashing is superefficient. Takes O(1) time to search, insert and delete an element using hashing.
  • Hashing uses a hashing function to map a key to an index.
    • Key is used to search for the corresponding value.
  • Hash Table = Array that stores the values that can map a key to an index in an array.
    • Has 3 states: Occupied, Marked or Empty.
  • Hash Function = The function that maps a key to an index in the hash table.
    • Converts a search key to an integer value (hashcode)
    • Compresses the hashcode to an index in the hash table.
  • Perfect Hash Function = A function that maps each search key to a different index in the hash table. Difficult because of collision.
  • Hash Collision = When two or more keys are mapped to the same hash value.
  • HashMaps have low time complexities. (get has O(1), values/rehash has O(capacity)

10.2 - Hash Functions and Hash Codes

  • Hashcode = An integer value of the search key from an object.
  • Java`s root class Object has the hashCode method.
    • hashCode() returns the memoryAddress of the object when not overridden.
    • HashCode rules:
      1. Override the hashCode() method when overriding the equals() method to make sure two equal objects have the same hashcode.
      2. Invoking the hashCode() method multiple times during a program execution should result in the same integer when the objects data has not changed.
      3. Two unequal objects may have the same hashcode, when they are not equal. The hashCode() method needs to be overwritten.
  • Folding = Converting a 64-bit into two 32-bit hashes and using the XOR (java bitwise: ^ )operation. To create one 32-bit hashcode from which all 64-bits have been taken in consideration. (for example Long.hashCode() works like this)
  • Polynomial hash code = ???

10.3 - Handling Collisions Using Open Addressing

  • Open Addressing = The process of finding an open location in the hash table in the event of a collision. In one of 3 ways:

    • Linear Probing = Finding the next available location sequentially.
      • Generally causes clusters.
      • Collision at hashTable[(k) % n]? Then k++;
      • K is the hashvalue.
      • N is a chosen prime.
      • Will always find a location, as long as the table is not full.
    • Quadratic Probing = Finding the next available location while changing the location modifier quadratic.
      • Causes Secondary Clustering
      • Collision at hashTable[(k + j^2) % N]? then j++, starting with 1;
      • K is the hashvalue.
      • J is an integer 1,2,3,4,5, etc.
      • N is a chosen prime.
      • Possible that is does not find a spot for the key.
    • Double Hashing = Uses a secondairy hash function h`(key) on the keys to determine increments.
      • Collision at hashTable[(k+j*h`(key))]? Then j++, starting with 0.
      • h`(key) should not return 0.
  • Cluster = A group of consecutive cells in a hashtable to be occupied.

  • Secondary Clustering = Entries can collide with an occupied entry using the same probe sequence.


10.4 - Handeling Collisions Using Seperate Chaining

  • Seperate Chaining = Placing all entries with the same hash index in the same location. Each location uses a bucket (ArrayList or LinkedList) to hold the multiple values.
  • Rehashing = Reloading of entries in a new hashtable after the previous one has exceeded its load factor.
    • Load factor is displayed as the greek lambda symbol.

results matching ""

    No results matching ""