Optimal way of scheduling long jobs – Node.js/JS

If you are using Node.js on the backend to schedule work, you will probably hate scheduling because of two factors:

  1. You can’t set a long timeout.
  2. Timers are removed when the process restarts.

You would probably favor cron jobs for scheduling – or maybe polling the database at a regular interval. Loading a Node.js script every 5 minutes is a costly job, as developers understand, so you can’t follow the PHP-style short-interval scheduling approach.

The first problem can be solved by making a recursive call to a function with a timeout; there are many solutions on the web.
But there lies a problem – you can’t cancel the timeout after a recursive call if you need to. To solve the recursive-call problem I wrote a few lines of code.

function setLongTimeout(callback, timeout_ms, updateTimerID) {
    // setTimeout is limited to a 32-bit signed integer of milliseconds (~24.8 days)
    var timer;
    if (timeout_ms > 2147483647) {
        // wait the maximum interval, then recurse with the remaining time
        timer = setTimeout(function () {
            var id = setLongTimeout(callback, timeout_ms - 2147483647, updateTimerID);
            if (typeof updateTimerID === 'function')
                updateTimerID(id);
        }, 2147483647);
    } else {
        timer = setTimeout(callback, timeout_ms);
    }
    if (typeof updateTimerID === 'function')
        updateTimerID(timer);
    return timer;
}

The above code behaves the same as `setTimeout` if you don’t need timer-object updates; otherwise you can register for them via the third argument.
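For example, to schedule a job 30 days out while keeping the option to cancel it (a sketch, repeating the function so the snippet is self-contained; the 30-day value is just there to force the recursive branch):

```javascript
// setLongTimeout as defined above (condensed).
function setLongTimeout(callback, timeout_ms, updateTimerID) {
    var timer;
    if (timeout_ms > 2147483647) {
        timer = setTimeout(function () {
            setLongTimeout(callback, timeout_ms - 2147483647, updateTimerID);
        }, 2147483647);
    } else {
        timer = setTimeout(callback, timeout_ms);
    }
    if (typeof updateTimerID === 'function')
        updateTimerID(timer);
    return timer;
}

// Track the latest timer id; updateTimerID fires again on every recursive hop,
// so currentTimer always points at the setTimeout that is actually pending.
var currentTimer = null;
setLongTimeout(function () {
    console.log('30 days later');
}, 30 * 24 * 60 * 60 * 1000, function (id) {
    currentTimer = id;
});

// Later: cancel whichever timer is currently pending.
clearTimeout(currentTimer);
```

Plain `clearTimeout` on the original return value would stop working after the first ~24.8-day hop, which is exactly what the `updateTimerID` callback is for.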

The second problem – timers are removed when the process restarts – shouldn’t be hard either: you can persist the jobs and write a function that executes atomic tasks.
Note atomic – atomic because if your setTimeout code depends on variable state, you will need to load those variables from the database, which makes the job harder. A better way is to schedule something like `send email to user@example.com` instead of `send message to currently logged-in users`.

The solution to the second problem really depends on your situation, but if you analyze your scheduled jobs closely you will find that they are really atomic – you made them non-atomic!
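A minimal sketch of that idea, assuming jobs are persisted as rows of `{ runAt, task, payload }` (here `loadJobs` and the `sendEmail` handler are hypothetical stand-ins for your database layer):

```javascript
// On startup, reload pending jobs from durable storage and re-arm their timers.
// This array stands in for a database table of { runAt, task, payload } rows.
function loadJobs() {
    return [
        { runAt: Date.now() + 100, task: 'sendEmail', payload: 'user@example.com' }
    ];
}

// Atomic handlers: everything a job needs travels in its payload,
// so nothing depends on in-memory state from before the restart.
var handlers = {
    sendEmail: function (to) { console.log('sending email to ' + to); }
};

loadJobs().forEach(function (job) {
    var delay = Math.max(0, job.runAt - Date.now());
    setTimeout(function () {
        handlers[job.task](job.payload);
        // mark the job as done in the database here
    }, delay);
});
```

Because the payload carries the full job description, a restart only costs you the in-flight timers, and re-running this loader rebuilds them.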

If you really need your jobs to never be lost from memory, a better way is to run a stable node server and use IPC – but that would be practically hard to maintain.

JavaScript DOM-less events

Using jQuery or Node.js makes you addicted to the pub/sub (publish/subscribe) model.
Node.js has native support for event emitters, but browser JavaScript doesn’t. Events can only be bound to a DOM element, which doesn’t make sense for plain objects and also adds a performance cost.

To solve the problem I wrote a custom class that implements the `on` and `emit` methods. Yes, some methods like `once` are missing.

var tikEvent = function () {
    this.listeners = {};
};

tikEvent.prototype.on = function (event, callback) {
    if (!this.listeners.hasOwnProperty(event)) {
        this.listeners[event] = [];
    }
    this.listeners[event].push(callback);
};

tikEvent.prototype.emit = function (event) {
    if (this.listeners.hasOwnProperty(event)) {
        // drop the event name and pass the remaining arguments to listeners
        var args = Array.prototype.slice.call(arguments);
        args.shift();

        for (var i = 0; i < this.listeners[event].length; ++i) {
            try {
                this.listeners[event][i].apply(null, args);
            } catch (e) {
                console.error(e);
            }
        }
    }
};


//Some test code.

var test = new tikEvent();
test.on('emitted', function (d) {
    console.log(d);
});

test.emit('emitted', "test");
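If `once` is ever needed, it can be layered on top of `on` with a wrapper that removes itself before the first call (a sketch against the class above, repeated here so the snippet is self-contained):

```javascript
// tikEvent as defined above.
var tikEvent = function () {
    this.listeners = {};
};
tikEvent.prototype.on = function (event, callback) {
    if (!this.listeners.hasOwnProperty(event)) {
        this.listeners[event] = [];
    }
    this.listeners[event].push(callback);
};
tikEvent.prototype.emit = function (event) {
    if (this.listeners.hasOwnProperty(event)) {
        var args = Array.prototype.slice.call(arguments);
        args.shift();
        for (var i = 0; i < this.listeners[event].length; ++i) {
            try {
                this.listeners[event][i].apply(null, args);
            } catch (e) {
                console.error(e);
            }
        }
    }
};

// once: unregister the wrapper before invoking the callback, so a
// re-emit from inside the callback cannot trigger it a second time.
tikEvent.prototype.once = function (event, callback) {
    var self = this;
    var wrapper = function () {
        var list = self.listeners[event];
        list.splice(list.indexOf(wrapper), 1);
        callback.apply(null, arguments);
    };
    this.on(event, wrapper);
};

var e = new tikEvent();
var calls = 0;
e.once('ping', function () { ++calls; });
e.emit('ping');
e.emit('ping');
console.log(calls); // → 1
```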

Converting an open source dictionary to JSON

FOLDOC provides a free online dictionary of computing, but they don’t have any standard tool or format to import the dictionary into a database.

The dictionary can be downloaded from http://foldoc.org/Dictionary.txt

Run the following Node.js code; it will dump the dictionary to output.json. Don’t forget to remove the initial disclaimer lines first, or they will end up in your dictionary.

var fs = require("fs");
var words = [];
var word, meaning = "";

fs.readFileSync('./Dictionary.txt').toString().split('\n').forEach(function (line) {
    if (line.length === 0) {
        meaning += "\n";
        return;
    }
    if (line.indexOf("\t") === 0) {
        // indented lines continue the current definition
        meaning += line.trim() + "\n";
    } else {
        // a non-indented line starts a new term; flush the previous one
        if (meaning.length > 0)
            words.push({ title: word, definition: meaning.trim() });
        word = line.trim();
        meaning = "";
    }
});
// flush the last term, which the loop above never pushes
if (word && meaning.length > 0)
    words.push({ title: word, definition: meaning.trim() });

fs.writeFileSync("./output.json", JSON.stringify(words));

If you want to import into Mongo, use `mongoimport --db mad --collection foldoc --file output.json --jsonArray`

At the time of writing this article the dictionary had 15093 terms, but importing resulted in 15110 documents; this might be because of some false positives in the parsing.
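One way to check whether duplicate titles account for the gap is to count them in the parsed array (a small sketch; `countDuplicates` is a hypothetical helper, shown here on inline sample data rather than the real file):

```javascript
// Count how many entries share a title with an earlier entry.
function countDuplicates(words) {
    var seen = {}, dupes = 0;
    words.forEach(function (w) {
        if (seen.hasOwnProperty(w.title)) ++dupes;
        seen[w.title] = true;
    });
    return dupes;
}

// Inline sample standing in for JSON.parse(fs.readFileSync('./output.json'))
var sample = [
    { title: 'ABC', definition: 'first sense' },
    { title: 'XYZ', definition: 'another term' },
    { title: 'ABC', definition: 'second sense of the same term' }
];
console.log(countDuplicates(sample)); // → 1
```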

Importing a large CSV file to Mongo

Someday you will need to import a large file into Mongo or some other DBMS.
You start writing code to achieve this with Node.js, but when you run it, memory usage starts increasing and one core of your CPU sits at 100%.


Getting WhatsApp statistics – WhatsApp Web

There is no tool available for WhatsApp that tells you how many messages each user has contributed to a group, and sometimes you need that.

WhatsApp Web exposes WhatsApp to JS, which makes things easier.
I wrote a simple snippet that logs the number of messages sent by each user of a group; obviously you need the full conversation open in the web client, and you can use the Home key to load it.

The code was written in a few minutes, so it doesn’t have proper validation or automation, but it is good enough for the purpose.
Open WhatsApp Web and, after loading the complete group conversation, execute the following code in the Google Chrome developer console:

var v = document.querySelectorAll('.msg.msg-group .text-clickable');
var t, count = {};
for (var i = 0, l = v.length; i < l; ++i) {
    t = v[i].textContent.trim();
    if (typeof count[t] === 'undefined')
        count[t] = 0;
    ++count[t];
}

console.log(count);

Redis vs MongoDB

Redis and MongoDB are both considered answers to a common complaint, that “there is something wrong with RDBMS”.
So what is the difference?

  1. Redis keeps the dataset in memory and persists it to disk, so data is not lost and you enjoy the performance of your RAM (not to be confused with Memcached, which is a memory-only store), whereas MongoDB is a disk-backed database.
  2. Redis can store many types of values – strings, hashes, lists, sets, sorted sets – whereas Mongo stores schemaless JSON-like documents.
  3. Redis is harder to learn, whereas Mongo is easier to understand and get started with.
  4. Redis historically had no support for clustering (it was added in version 3.0), whereas Mongo has built-in clustering support.

Why not Node.js v4.1.1

Recently I updated my Node.js installation to v4.1.1 using

curl -sL https://deb.nodesource.com/setup_4.x | sudo -E bash -
sudo apt-get install -y nodejs

and I ended up fixing many bugs in other modules, but I couldn’t get far with this decision.

Node.js has a history of breaking compatibility with older code in every new release, so if you have a project with multiple dependencies, don’t upgrade your Node.js version to the latest. Instead, use NVM to test your code for compatibility across different versions.

Backup Server Configurations to Git

When you have multiple servers, it is a pain to remember every configuration, and it may take hours to configure the servers again in case you need to.
It is also impractical to copy each file by hand.

There are a few tools available, but they come with overhead attached; moreover, it’s fun to write custom solutions 🙂

1. Create a directory
2. cd into it
3. Initialize git
4. Add a remote
5. Create a branch specifically for that server
6. Check out that branch
7. Add a shell script
8. Make a commit and push to the server

mkdir backup
cd backup/
git init
git remote add origin git@<your-git-server>:tikaj/ConfigurationBackups.git
git branch $(hostname)
git checkout $(hostname)

# This file contains, per line, a space-delimited source path and a destination path relative to the git root.
touch ConfigurationFiles.list

vi backup.sh

Content of backup.sh

#!/bin/bash
BASEDIR=$(dirname "$0")

cd "$BASEDIR"

# Each line of ConfigurationFiles.list: <source path> <destination relative to this repo>
while read line; do
    IFS=' ' read -a A <<< "$line"
    cp "${A[0]}" "$BASEDIR/${A[1]}"
done < ConfigurationFiles.list


# Add new files and commit changes
git add --all :/
git commit -m "Update configuration files."
git push origin $(hostname)
 

Then, don’t forget to generate an SSH key and add it to your git server before the first push:

git push -u origin  $(hostname)

Now I can either set up a cron job or add a command to the daily backup script to update the configuration files, using

ssh root@<server> "bash /root/backup/backup.sh"

from a remote machine.

Cloning Server to New Hardware

You can find many ways of cloning an old server to new hardware: create a disk image and move it to the new hardware, or, if you have identical hardware, use `dd` or a similar tool to copy the disks block by block. But if you have different hardware and want to move the server without downtime and with no configuration changes, there is nearly no ready-made solution.

I used VestaCP and ejabberd on my servers, so I had to make changes even after moving the files, as the IP changes everywhere, but it is still a far better solution than installing and configuring each package manually.

We used rsync to copy the whole server to the new one, except for the following files,
because on the new network you never want to propagate the old network configuration, device-specific configuration, or installation-specific configs:

/boot/
/lib/modules
/etc/fstab
/etc/mtab
/etc/modules
/proc
/dev
/etc/network/interfaces
/root

Create a file with the directories above and then execute `rsync` with `--exclude-from`, e.g. something like `rsync -aHAXv --exclude-from=exclude.list root@oldserver:/ /` (the file name and host here are placeholders).

rsync will synchronize all your files from the remote server to the local one.

Adding ellipsis to long text

When you have a div with some long text and you don’t want to wrap the text or extend the size of the div, you can instead add `…` at the end of the text using CSS:

.wrapped {
    overflow: hidden;
    white-space: nowrap;
    text-overflow: ellipsis;
    width: 300px;
}

Example:

Text without text-overflow: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla id sem vehicula, faucibus leo id.

Text with text-overflow: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla id sem vehicula, faucibus leo id.