Concept of Hashing in data structure
#Tag, the concept of #(Hash) is to index the targeted content from a huge repository of diversified content, which helps to retrieve relevant content. Understand the concept of hashing in the data structure.
There were multiple data structures that were available to manage the data in memory. Like Array, Linked List, Binary Search, etc., Hash is also a kind of data structure used to manage data on memory so that operations can be performed efficiently. The technique of Hash is implemented as an array of Linked Lists. For hashing, the data storing process is done with the help of a function called a hash function.
About Hash Function
The role of the hash function is to do hashing, which means to map a large number of data items in a table called Hash Table. Hash tables associate a set of keys with a set of values. Each key-value pair is an entry in the table. By looking over the key you can find the corresponding value. To do entries in Hash-Table, the hash function uses the modulo operator to calculate an index of an array of slots. As in school you have studied the concept of function as “f(x) = y”, Similar to that “h(k) = array Index”. Let’s consider an example of a simple hash function
h(key) = key%table size
In a hash table with the size 7
h(25) = 25%7 = 4
h(49) = 49%7 = 0
h(54) = 54%7 = 5
Hash tables allow you to store a bunch of objects in such a way that you can later find them again very quickly.
In the hash table, the ratio between the number of stored items and the array’s size is considered a load factor. Whenever the load factor exceeds some threshold, rehashing is done for resizing a hash table which prevents performance degradation.
Operations performed using Hash Function
On the hash table, one can perform the following operations:
- Insert(x), and add object x to the data set.
- Delete(x), delete object x from the data set.
- Search(x), search object x in the data set.
Properties of a good hash function:
- It should provide a uniform distribution of hash values. A non-uniform distribution increased the number of collisions and the cost of resolving them.
- Choose a hash function that can be computed quickly and returns values within the range of the Hash-Table.
- Chose a hash function with a good collision resolution algorithm that can be used to compute an alternative index if the collision occurs.
- Choose a hash function that uses the necessary information provided in the key.
- It should have a high load factor for a given set of keys.
Applications of Hashing:
- Associative arrays: Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects).
- Hash Functions are used in various algorithms to make their computing faster
- Database indexing: Hash tables may also be used as disk-based data structures and database indices (such as in DBMS).
- Caches: Hash tables can be used to implement caches i.e. auxiliary data tables that are used to speed up the access to data, which is primarily stored in slower media.
- Object representation: Several dynamic languages, such as Perl, Python, JavaScript, and Ruby use hash tables to implement objects.
- Finding duplicate/similar records as similar records/strings will produce a similar hash
- Cryptography to check message integrity, data corruption detection, digital signature, authentication, etc
- When writing a chess program, it is sensible to keep track of what positions have been evaluated earlier, so you can backtrack whenever you run into the same position again. This is done using a hash table.