Node.js and CPU intensive requests


Question

I've started tinkering with Node.js HTTP server and really like to write server side Javascript but something is keeping me from starting to use Node.js for my web application.

I understand the whole async I/O concept but I'm somewhat concerned about the edge cases where procedural code is very CPU intensive such as image manipulation or sorting large data sets.

As I understand it, the server will be very fast for simple web page requests such as viewing a listing of users or viewing a blog post. However, if I want to write very CPU intensive code (in the admin back end for example) that generates graphics or resizes thousands of images, the request will be very slow (a few seconds). Since this code is not async, every requests coming to the server during those few seconds will be blocked until my slow request is done.

One suggestion was to use Web Workers for CPU intensive tasks. However, I'm afraid web workers will make it hard to write clean code since it works by including a separate JS file. What if the CPU intensive code is located in an object's method? It kind of sucks to write a JS file for every method that is CPU intensive.

Another suggestion was to spawn a child process, but that makes the code even less maintainable.

Any suggestions to overcome this (perceived) obstacle? How do you write clean object oriented code with Node.js while making sure CPU heavy tasks are executed async?

1
202
8/16/2010 9:04:31 AM

Accepted Answer

What you need is a task queue! Moving your long running tasks out of the web-server is a GOOD thing. Keeping each task in "separate" js file promotes modularity and code reuse. It forces you to think about how to structure your program in a way that will make it easier to debug and maintain in the long run. Another benefit of a task queue is the workers can be written in a different language. Just pop a task, do the work, and write the response back.

something like this https://github.com/resque/resque

Here is an article from github about why they built it http://github.com/blog/542-introducing-resque

51
8/5/2013 12:27:20 PM

This is misunderstanding of the definition of web server -- it should only be used to "talk" with clients. Heavy load tasks should be delegated to standalone programs (that of course can be also written in JS).
You'd probably say that it is dirty, but I assure you that a web server process stuck in resizing images is just worse (even for lets say Apache, when it does not block other queries). Still, you may use a common library to avoid code redundancy.

EDIT: I have come up with an analogy; web application should be as a restaurant. You have waiters (web server) and cooks (workers). Waiters are in contact with clients and do simple tasks like providing menu or explaining if some dish is vegetarian. On the other hand they delegate harder tasks to the kitchen. Because waiters are doing only simple things they respond quick, and cooks can concentrate on their job.

Node.js here would be a single but very talented waiter that can process many requests at a time, and Apache would be a gang of dumb waiters that just process one request each. If this one Node.js waiter would begin to cook, it would be an immediate catastrophe. Still, cooking could also exhaust even a large supply of Apache waiters, not mentioning the chaos in the kitchen and the progressive decrease of responsitivity.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon