Hash and Hash algorithm
Home » Uncategorized  »  Hash and Hash algorithm
Hash and Hash algorithm

Hash, translated as "hash", and transliterated as "hash", 해시게임 converts an input of any length into a fixed-length output through a hashing algorithm, and the output is a hash value. Hash and Hash algorithm This transformation is a compression map, that is,

the space of the hash value is usually much smaller than the space of the input, and different inputs may be hashed to the same output.

so it is impossible to uniquely determine the input value from the hash value. Simply put, it is a function that compresses a message of any length into a message digest of a fixed length.

Hash functions have the following basic characteristics:

If the hash values ​​calculated according to the same hash function are different, then the input values ​​must also be different.

If the hash values ​​calculated according to the same hash function are the same, the input values ​​are not necessarily the same.

The phenomenon that two different input values ​​have the same hash value calculated according to the same hash function is called a collision.

Hash function:

Direct addressing method: directly use the keyword k or k plus a certain constant (k+c) as the hash address;
Numerical analysis method: extract the number with a relatively uniform value in the keyword as the hash address;

The remainder method of division: divide the keyword k by a number p not greater than the length m of the hash table, and use the resulting remainder as the address of the hash table;

Segmentation superposition method: Divide the keyword into several parts with an equal number of digits according to the number of hash table addresses, and the last part can be shorter.

Then add these parts together, and the result after discarding the highest carry bit is the hash address of the keyword;

Square method: If the distribution of each part of the keyword is uneven, you can first find its square value, and then take the middle bits as the hash address according to the requirements;

Pseudo-random number method: A pseudo-random number is used as a hash function.

An important indicator to measure the quality of a hash function is the probability of collision and the solution to the collision.

Basically, any hash function cannot completely avoid collisions. Common ways to solve collisions are as follows:

Open addressing method: The open addressing method is to look for the next empty hash address once a conflict occurs.

As long as the hash table is large enough, the empty hash address can always be found and the record is stored;

Chain address method: Take each unit of the hash table as the head node of the linked list, and all elements whose hash address is I constitute a synonym linked list. That is,

when a conflict occurs, the keyword is chained at the end of the linked list with the unit as the head node;

Re-hash method: when the hash address conflicts, use other functions to calculate another hash function address until the conflict no longer occurs;

Establish a common overflow area: Divide the hash table into two parts: the basic table and the overflow table, and put the conflicting elements into the overflow table.

There are two relatively simple data structures for storing data: arrays and linked lists.

1) The characteristics of arrays are: easy to address, difficult to insert and delete;

2) The characteristics of the linked list are: difficult to address, and easy to insert and delete.

The method to solve the conflict of hash function, the chain address method,

is to combine the array and the linked list and play the advantages of both,

which can be understood as the array of the linked list.

The left side of the above figure is an array, and each member of the array is a linked list.

All elements contained in this data structure contain a pointer for linking between elements.

Elements can be allocated to different linked lists according to their own characteristics and in turn,

it is through these characteristics that the correct linked list is found, and then the correct element is found from the linked list.

Among them, the method of calculating the subscript of the element array according to the element characteristics is the hash algorithm-hash() function.