lundi 19 octobre 2020

Succinct data structure for Trie implementation

I am trying to implement space efficient Trie data structure as mentioned in the below paper.( completion Trie )

Space-Efficient Data Structures for Top-k Completion

With above reference I did following steps while making data structure.
1) I created Trie tree ( without compressed ) with 1000 words.
2) Then I traversed tree by level order and created few vectors.

  • vector< bitset<5> > noOfChildren;
    How many children particular node has. A node can have ( 0 to 26 children) so it can be represented with 5 bits.

  • vector< bitset<5> > charsInBits;
    The character of a particular node is converted into 5 bits.

  • vector< int> startPosition
    It is a prefix sum of vector noOfChildren. Which gives the Index of 1st child of a particular node.

  • vector<bitset<1> > leafNode
    It stores whether that node is leaf node or not. 0 for no leaf and 1 for leaf.

    Now I stored all these vectors data in different files.(.txt)

( Above procedure was one time procedure. )

Now when I run the program, I fetch all the data from (.txt) files into respective vectors.
Then I process all these vectors and successfully traverse through Trie tree.

 Few questions I have:        
  
 1) Is this a compact Trie representation as I have not stored Trie node address in memory ?        
  
 2) The files in which I have stored the above vectors data is giving me more size       
    than my actual 1000 words file size. So I am confused , whether I have compressed Trie or not ?         
      
 3) For more than 1000 words my program gives me "Segmentation fault". It does not even create Trie tree. Why this is happening ?           
               

   Note : I have implemented above logic in program and my program is perfectly working      
          for 1000 words.           
       
         I have not implemented exactly what was said in above research paper.

Aucun commentaire:

Enregistrer un commentaire