mardi 1 septembre 2015

Beating binary search using CPU cache line

For educational purposes, I am trying to beat binary search using CPU cache line.

http://ift.tt/1JuoQw4

If you uncomment #define EXIT_ONLY, the search works like normal binary search, except if there are few elements, the search became linear search.

As expected this performs faster than binary search.

However I want to improve future, so if you comment #define EXIT_ONLY, then "small" linear search is made instead of accessing just the "middle" element.

In theory the values for linear search must be in CPU cache line and access must be "free of charge".

However in practice this search is way too slow than the normal binary search.

If I hardcode CACHE_COUNT_2 to be equal to 1, then speed is comparable, but still slower.

Note I never tried to unroll the for cycle in _linear().

What could be explanation of the slower execution?

Repo with all files is here:
http://ift.tt/1LQxQla

Aucun commentaire:

Enregistrer un commentaire