For educational purposes, I am trying to beat binary search
using CPU cache line.
If you uncomment #define EXIT_ONLY
, the search works like normal binary search
, except if there are few elements, the search became linear search
.
As expected this performs faster than binary search
.
However I want to improve future, so if you comment #define EXIT_ONLY
, then "small" linear search
is made instead of accessing just the "middle" element.
In theory the values for linear search must be in CPU cache line and access must be "free of charge".
However in practice this search is way too slow than the normal binary search
.
If I hardcode CACHE_COUNT_2
to be equal to 1, then speed is comparable, but still slower.
Note I never tried to unroll the for cycle in _linear()
.
What could be explanation of the slower execution?
Repo with all files is here:
http://ift.tt/1LQxQla
Aucun commentaire:
Enregistrer un commentaire