Problem
I have code that compares nodes in pairs of XML files. However, each pair (A.xml and B.xml) uses different names for the nodes, and I need the output on std::cout to be consistent no matter what the input is. This is an example of the program's output:
<entry>
<id><![CDATA[946757316]]></id>
<url><![CDATA[http://ift.tt/1HRvcWA]]></url>
<content><![CDATA[Specialized Dolce Sport 27 Speed]]></content>
<title><![CDATA[Bike]]></title>
<price><![CDATA[£600]]></price>
<date><![CDATA[01-AUG-13]]></date>
</entry>
Possible solution
Each time I compare A.xml and B.xml, I would define any nodes that do not follow the default mappings. For example, my standard node name for the description is "description", but in this example it is called "content", so I need to define it like this (description = content;), which means the line will be output like this:
<description><![CDATA[Specialized Dolce Sport 27 Speed]]></description>
Existing code
#include "pugi/pugixml.hpp"
#include <iostream>
#include <string>
#include <map>

int main() {
    pugi::xml_document doca, docb;
    // Nodes keyed by their <id> text: mapa ends up holding entries only in
    // a.xml, mapb entries only in b.xml.
    std::map<std::string, pugi::xml_node> mapa, mapb;

    if (!doca.load_file("a.xml") || !docb.load_file("b.xml")) {
        std::cerr << "Can't load input files\n";
        return 1;
    }

    // Index every <job> in a.xml by its id.
    for (auto& node: doca.child("jobsite_vacancies").children("job")) {
        const char* id = node.child_value("id");
        mapa[id] = node;
    }

    // An id present in both files is erased from mapa; an id found only in
    // b.xml is recorded in mapb.
    for (auto& node: docb.child("jobsite_vacancies").children("job")) {
        const char* idcs = node.child_value("id");
        if (!mapa.erase(idcs)) {
            mapb[idcs] = node;
        }
    }

    // Whatever is left in mapa was removed in b.xml.
    for (auto& ea: mapa) {
        std::cout << "Removed:" << std::endl;
        ea.second.print(std::cout);
    }

    // Whatever is in mapb was added in b.xml.
    for (auto& eb: mapb) {
        std::cout << "Added:" << std::endl;
        eb.second.print(std::cout);
    }
}
I'm new to C++, so any suggestions on how to implement this would be appreciated. I have to run this on tens of thousands of fields, so performance is key.