Hamming distance calculation

This is a small snippet of how to calculate hamming distance in cpp with small bit of assembly for doing a population count.

Code

typedef unsigned long long   hash_t; 

#include 
#include 

int popcount64(const hash_t& val) noexcept
{
    int ret;
    __asm__ ("popcnt %1, %1" : "=r" (ret) : "0" (val));
    return ret;
}

int hamming_distance(const hash_t& x, const hash_t& y)
{
    auto z = x ^ y;
    auto p = popcount64(z);

#ifdef  DEBUG
    std::cout<<"size  : " << sizeof(hash_t) << std::endl;
    std::cout<<"x val : " << std::bitset<sizeof(hash_t)>(x) << std::endl;
    std::cout<<"y val : " << std::bitset<sizeof(hash_t)>(y) << std::endl;
    std::cout<<"z val : " << std::bitset<sizeof(hash_t)>(z) << std::endl;
    std::cout<<"pop   : " << p << std::endl;
#endif

    return p;
}
</sizeof(hash_t)></sizeof(hash_t)></sizeof(hash_t)>

Usage

    hash_t hash1 = 123456;
    hash_t hash2 = 123456;

    int  distance = hamming_distance(hash1, hash2);
    std::cout<<"Hamming : " << distance <<"\n";

Results for sample runs

Same hashes so we expect our distance to be 0.
hash1 = 123456
hash2 = 123456

size  : 8
x val : 01000000
y val : 01000000
z val : 00000000
pop   : 0

Hamming : 0

Small difference in hashes.
hash1 = 123456
hash2 = 123455

size  : 8
x val : 01000000
y val : 00111111
z val : 01111111
pop   : 7

Hamming : 7

Medium difference in hashes.
hash1 = 123456
hash2 = 223455

size  : 8
x val : 01000000
y val : 11011111
z val : 10011111
pop   : 10

Hamming : 10

Large difference in hashes.
hash1 = 12345678
hash2 = 23445671

size  : 8
x val : 01001110
y val : 10100111
z val : 11101001
pop   : 14

Hamming : 14

code gist

Reference :
https://en.wikipedia.org/wiki/Hamming_distance

Leave a Comment

Your email address will not be published. Required fields are marked *