Sunday, 4 September 2016

Searching for similar columns in a huge CSV file

I have a huge CSV file with 5,000 columns and 5,000,000 rows. I know that some columns in this file are exactly the same, and I want to identify them. Please note that I cannot load the whole file into memory, and runtime is also important.
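To make the constraints concrete, here is a minimal sketch of one possible approach, not a tested solution for a file of this size. It streams the file one row at a time and keeps only groups of column indices that have matched on every row so far, splitting a group whenever its columns disagree. The function name `find_identical_columns` and the `has_header` parameter are hypothetical, and the sketch assumes every row has the same number of fields and that "exactly the same" means the cell text is identical (so `1` and `1.0` count as different).

```python
import csv
from collections import defaultdict


def find_identical_columns(path, has_header=True):
    """Return lists of column indices whose cells match on every row.

    Hypothetical helper: assumes a rectangular CSV and compares cells
    as raw strings. Only one row is held in memory at a time.
    """
    groups = None
    with open(path, newline="") as f:
        reader = csv.reader(f)
        if has_header:
            # Skip the header: identical columns may have different names.
            next(reader, None)
        for row in reader:
            if groups is None:
                # Start with every column in a single candidate group.
                groups = [list(range(len(row)))]
            next_groups = []
            for group in groups:
                # Split the group by the value each column has in this row.
                buckets = defaultdict(list)
                for col in group:
                    buckets[row[col]].append(col)
                # A column alone in its bucket has no duplicate; drop it.
                next_groups.extend(g for g in buckets.values() if len(g) > 1)
            groups = next_groups
            if not groups:
                break  # no candidate duplicates remain, stop reading early
    return groups or []
```

Calling `find_identical_columns("data.csv")` (with a placeholder file name) would return something like `[[0, 2]]` if the first and third columns are identical. Because columns without a duplicate are dropped as soon as they differ from the rest of their group, the per-row work usually shrinks quickly after the first rows, although parsing 5,000,000 rows in pure Python will still take a while.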
