Monday, 1 July 2019

String encoding for memory optimization

I have a stream of strings in a format like a:b, d:a, t:w, i:r, etc. Because I keep appending these strings, the result eventually becomes one very large string.

I am trying to encode each string as a number, for example:

a:b -> 1
d:a -> 2
etc.

My intention is to keep the final string as small as possible to save memory. Hence I need to give the single-digit values to the strings that occur most often.

I have the following method in mind:

Create a map<string, int> that holds each string and its count. At the end, I will replace the string with the highest count with 1, the next with 2, and so on through the last element of the map.
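The method described above could be sketched roughly as follows. This is only an illustration under some assumptions that are not in the question: the stream has already been split into a std::vector<std::string>, the encoded output separates codes with a comma so it can be decoded later, and ties in frequency are broken alphabetically. The function names build_codes and encode are made up for this example, and std::unordered_map is used instead of std::map because ordered iteration is not needed for counting.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: assign codes 1, 2, 3, ... to tokens in order of
// decreasing frequency, so the most frequent tokens get the shortest codes.
// Ties are broken alphabetically (an assumption) to keep the result deterministic.
std::unordered_map<std::string, int> build_codes(const std::vector<std::string>& tokens) {
    // First pass: count how often each token occurs.
    std::unordered_map<std::string, int> counts;
    for (const auto& t : tokens) {
        ++counts[t];
    }

    // Copy the (token, count) pairs out and sort them by count, highest first.
    std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());
    std::sort(ranked.begin(), ranked.end(), [](const auto& a, const auto& b) {
        if (a.second != b.second) return a.second > b.second;
        return a.first < b.first;
    });

    // Hand out codes in rank order, starting at 1.
    std::unordered_map<std::string, int> codes;
    int next = 1;
    for (const auto& p : ranked) {
        codes[p.first] = next++;
    }
    return codes;
}

// Hypothetical helper: replace every token with its code.
// The comma separator is an assumption; without some delimiter,
// multi-digit codes could not be told apart when decoding.
std::string encode(const std::vector<std::string>& tokens,
                   const std::unordered_map<std::string, int>& codes) {
    std::string out;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (i > 0) out += ',';
        out += std::to_string(codes.at(tokens[i]));
    }
    return out;
}
```

For the tokens a:b, d:a, a:b, t:w, a:b, d:a this would map a:b to 1, d:a to 2 and t:w to 3, and produce the string 1,2,1,3,1,2. Note that this needs two passes over the data (one to count, one to encode), and that only the nine most frequent strings can receive a single-digit code.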

Currently the final string is about 100,000 characters long.

I can't compromise on speed. If anyone has a better technique to achieve this, please suggest it.
