Converting an Open-Source Dictionary to JSON

FOLDOC provides a free online computing dictionary, but it offers no standard tool or format for importing the dictionary into a database.

The dictionary can be downloaded from http://foldoc.org/Dictionary.txt

Run the following Node.js code to dump the dictionary to output.json. Remember to remove the disclaimer lines at the top of Dictionary.txt first, or they will end up in your dictionary.

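var fs = require("fs");
var words = [];
var word, meaning = "";

// Store the current entry, skipping headwords that collected no definition text.
function saveEntry() {
	if (word !== undefined && meaning.trim().length > 0)
		words.push({title: word, definition: meaning.trim()});
}

fs.readFileSync("./Dictionary.txt").toString().split(/\r?\n/).forEach(function (line) {
	if (line.length == 0) {
		// Blank line: keep it as a paragraph break inside the definition.
		meaning += "\n";
		return;
	}
	if (line.indexOf("\t") == 0) {
		// Definition lines are indented with a tab.
		meaning += line.trim() + "\n";
	} else {
		// Any other line is a new headword, so store the previous entry first.
		saveEntry();
		word = line.trim();
		meaning = "";
	}
});
saveEntry(); // the last entry has no following headword to trigger a save

// writeFileSync overwrites, so rerunning does not append a second array to the file.
fs.writeFileSync("./output.json", JSON.stringify(words));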
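
To check the parsing rule without downloading the full file, the same logic can be wrapped in a function and run on a small sample. This is only a sketch: `parseFoldoc` is a hypothetical name, the sample text is made up, and the assumed layout (headword at column zero, definition lines indented with a tab) is inferred from the script above rather than from any FOLDOC specification.

```javascript
// Sketch: parse FOLDOC-style text into {title, definition} objects.
// Assumption: headwords start at column 0, definition lines start with a tab.
function parseFoldoc(text) {
  const entries = [];
  let title = null;
  let body = [];

  // Push the entry collected so far, if it has a title and some definition text.
  const flush = () => {
    const definition = body.join("\n").trim();
    if (title !== null && definition.length > 0) {
      entries.push({ title, definition });
    }
  };

  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith("\t")) {
      body.push(line.trim());   // definition line
    } else if (line.trim() === "") {
      body.push("");            // keep paragraph breaks
    } else {
      flush();                  // a new headword closes the previous entry
      title = line.trim();
      body = [];
    }
  }
  flush();                      // do not lose the last entry
  return entries;
}

// Made-up sample, not real FOLDOC content.
const sample = "foo\n\n\tFirst line.\n\tSecond line.\n\nbar\n\n\tOnly line.\n";
const parsed = parseFoldoc(sample);
console.log(JSON.stringify(parsed, null, 2));
```

The sample should yield two entries, `foo` and `bar`, with the two `foo` lines joined by a newline.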

If you want to import the result into MongoDB, use `mongoimport --db mad --collection foldoc --file output.json --jsonArray`

At the time of writing, FOLDOC listed 15093 terms, but the import produced 15110 documents. The extra documents are probably false positives, that is, lines the parser wrongly treated as headwords.
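
One way to look into the mismatch is to scan the parsed entries for likely false positives, such as empty definitions or repeated titles. The following is a minimal sketch that assumes the array has the `title`/`definition` shape produced above. `findSuspects` is a hypothetical helper, the data is made up, and the two checks are guesses at the cause, not a confirmed explanation.

```javascript
// Sketch: flag entries that look like parsing artifacts.
function findSuspects(entries) {
  const seen = new Map();   // title -> number of occurrences
  const empty = [];         // entries with a missing title or empty definition
  for (const e of entries) {
    if (!e.title || !e.definition || e.definition.trim() === "") {
      empty.push(e);
    }
    seen.set(e.title, (seen.get(e.title) || 0) + 1);
  }
  // Titles that occur more than once.
  const duplicates = [...seen].filter(([, n]) => n > 1).map(([t]) => t);
  return { empty, duplicates };
}

// Made-up data for illustration.
const report = findSuspects([
  { title: "foo", definition: "A term." },
  { title: "foo", definition: "Same headword again." },
  { title: "stray line", definition: "" },
]);
console.log(report.duplicates, report.empty.length);
```

To run it on the real output, pass in `JSON.parse(fs.readFileSync("./output.json", "utf8"))` instead of the sample array.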
