I stumbled over Node.js some time ago and like it a lot. But I soon found out that it badly lacks the ability to perform CPU-intensive tasks. So I started googling and found these answers to the problem: Fibers, Web Workers and Threads (thread-a-gogo). Now which one to use is confusing, and one of them definitely needs to be used. After all, what's the purpose of having a server that is good at I/O and nothing else? Suggestions needed!
I have been thinking of an approach lately and would like suggestions on it. What I thought of was this: let's have some threads (using thread_a_gogo or maybe web workers). When we need more of them, we can create more. But there will be some limit on how many we create (not imposed by the system, but probably because of overhead). When we exceed that limit, we can fork a new Node process and start creating threads in it. This can go on until we reach some limit on processes (after all, processes have a big overhead too). When this limit is reached, we start queuing tasks. Whenever a thread becomes free, it is assigned a new task. This way, things can go on smoothly.
So, that was what I thought of. Is this idea good? I am a bit new to all this process and threads stuff, so don't have any expertise in it. Please share your opinions.
Node has a completely different paradigm, and once it is correctly understood, it is easier to see this different way of solving problems. You never need multiple threads in a Node application(1) because you have a different way of doing the same thing. You create multiple processes; but this is very different from, for example, how Apache Web Server's prefork MPM does it.
For now, let's think that we have just one CPU core and we will develop an application (in Node's way) to do some work. Our job is to process a big file running over its contents byte-by-byte. The best way for our software is to start the work from the beginning of the file, follow it byte-by-byte to the end.
-- Hey, Hasan, I suppose you are either a newbie or very old school from my Grandfather's time!!! Why don't you create some threads and make it much faster?
-- Oh, we have only one CPU core.
-- So what? Create some threads man, make it faster!
-- It does not work like that. If I create threads I will be making it slower, because I will be adding a lot of overhead to the system for switching between threads, trying to give each of them a fair amount of time, and, inside my process, trying to communicate between these threads. In addition to all this, I will also have to think about how to divide a single job into multiple pieces that can be done in parallel.
-- Okay okay, I see you are poor. Let's use my computer, it has 32 cores!
-- Wow, you are awesome my dear friend, thank you very much. I appreciate it!
Then we turn back to work. Now we have 32 CPU cores thanks to our rich friend. The rules we have to abide by have just changed. Now we want to utilize all this wealth we have been given.
To use multiple cores, we need to find a way to divide our work into pieces that we can handle in parallel. If it was not Node, we would use threads for this; 32 threads, one for each cpu core. However, since we have Node, we will create 32 Node processes.
Threads can be a good alternative to Node processes, maybe even a better way; but only in a specific kind of job where the work is already defined and we have complete control over how to handle it. Other than this, for every other kind of problem where the job comes from outside in a way we do not have control over and we want to answer as quickly as possible, Node's way is unarguably superior.
-- Hey, Hasan, are you still working single-threaded? What is wrong with you, man? I have just provided you what you wanted. You have no excuses anymore. Create threads, make it run faster.
-- I have divided the work into pieces and every process will work on one of these pieces in parallel.
-- Why don't you create threads?
-- Sorry, I don't think it is usable. You can take your computer if you want?
-- No okay, I am cool, I just don't understand why you don't use threads?
-- Thank you for the computer. :) I already divided the work into pieces and I create processes to work on these pieces in parallel. All the CPU cores will be fully utilized. I could do this with threads instead of processes; but Node has this way and my boss Parth Thakkar wants me to use Node.
-- Okay, let me know if you need another computer. :p
If I create 33 processes instead of 32, the operating system's scheduler will be pausing one of them, starting another, pausing it after some cycles, starting the other one again... This is unnecessary overhead. I do not want it. In fact, on a system with 32 cores, I wouldn't even want to create exactly 32 processes; 31 can be nicer, because it is not just my application that will run on this system. Leaving a little room for other things can be good, especially if we have 32 rooms.
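To make the "one process per core, minus one" idea concrete, here is a minimal sketch. It assumes the job is a file that can be split into byte ranges, and that a worker script exists at the hypothetical path `./worker.js`; the names `workerCount`, `splitRange` and `startWorkers` are made up for this illustration and are not part of Node.

```javascript
const os = require('os');
const { fork } = require('child_process');

// Leave one core free for the rest of the system, but always use at least one.
function workerCount(cores = os.cpus().length) {
  return Math.max(1, cores - 1);
}

// Split `totalBytes` into `parts` contiguous [start, end) ranges of near-equal size.
function splitRange(totalBytes, parts) {
  const ranges = [];
  const base = Math.floor(totalBytes / parts);
  const extra = totalBytes % parts; // the first `extra` ranges get one more byte
  let start = 0;
  for (let i = 0; i < parts; i++) {
    const size = base + (i < extra ? 1 : 0);
    ranges.push({ start, end: start + size });
    start += size;
  }
  return ranges;
}

// Hypothetical: './worker.js' is an assumed script that receives one byte range
// via process.on('message') and processes that part of the file.
function startWorkers(fileSize, workerScript = './worker.js') {
  return splitRange(fileSize, workerCount()).map((range) => {
    const child = fork(workerScript); // one Node process per range
    child.send(range);                // the range is copied to the child over IPC
    return child;
  });
}
```

On the 32-core machine from the story, `workerCount(32)` gives 31, and `startWorkers` would fork 31 processes once at startup, which is the one-time cost discussed in the dialogue that follows.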
I believe we are on the same page now about fully utilizing processors for CPU-intensive tasks.
-- Hmm, Hasan, I am sorry for mocking you a little. I believe I understand you better now. But there is still something I need an explanation for: What is all the buzz about running hundreds of threads? I read everywhere that threads are much faster to create and destroy than forked processes. You fork processes instead of creating threads, and you think that is the most you can get out of Node. Then is Node not appropriate for this kind of work?
-- No worries, I am cool, too. Everybody says these things so I think I am used to hearing them.
-- So? Node is not good for this?
-- Node is perfectly good for this even though threads can be good too. As for thread/process creation overhead; on things that you repeat a lot, every millisecond counts. However, I create only 32 processes and it will take a tiny amount of time. It will happen only once. It will not make any difference.
-- When do I want to create thousands of threads, then?
-- You never want to create thousands of threads. However, on a system that is doing work that comes from outside, like a web server processing HTTP requests; if you are using a thread for each request, you will be creating a lot of threads, many of them.
-- Node is different, though? Right?
-- Yes, exactly. This is where Node really shines. Just as a thread is much lighter than a process, a function call is much lighter than a thread. Node calls functions instead of creating threads. In the example of a web server, every incoming request causes a function call.
-- Hmm, interesting; but you can only run one function at the same time if you are not using multiple threads. How can this work when a lot of requests arrive at the web server at the same time?
-- You are perfectly right about how functions run, one at a time, never two in parallel. I mean in a single process, only one scope of code is running at a time. The OS Scheduler does not come and pause this function and switch to another one, unless it pauses the process to give time to another process, not another thread in our process. (2)
-- Then how can a process handle 2 requests at a time?
-- A process can handle tens of thousands of requests at a time as long as our system has enough resources (RAM, Network, etc.). How those functions run is THE KEY DIFFERENCE.
-- Hmm, should I be excited now?
-- Maybe :) Node runs a loop over a queue. In this queue are our jobs, i.e., the calls we started to process incoming requests. The most important point here is the way we design our functions to run. Instead of starting to process a request and making the caller wait until we finish the job, we quickly end our function after doing an acceptable amount of work. When we come to a point where we need to wait for another component to do some work and return us a value, instead of waiting for that, we simply finish our function, adding the rest of the work to the queue.
-- It sounds too complex?
-- No no, I might sound complex; but the system itself is very simple and it makes perfect sense.
Now I want to stop citing the dialogue between these two developers and finish my answer after a last quick example of how these functions work.
In this way, we are doing what the OS scheduler would normally do. We pause our work at some point and let other function calls (like other threads in a multi-threaded environment) run until we get our turn again. This is much better than leaving the work to the OS scheduler, which tries to give a fair share of time to every thread on the system. We know what we are doing much better than the OS scheduler does, and we are expected to stop when we should stop.
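A small sketch of this cooperative hand-over, under the assumption that a 10 ms timer stands in for waiting on a database or disk. `handleRequest` and `simulate` are hypothetical names for this illustration only.

```javascript
// Simulates a request handler that needs a slow external resource.
// The 10 ms timer is an assumed stand-in for a database or disk call.
function handleRequest(id, log, done) {
  log.push(`start ${id}`);
  setTimeout(() => {
    // The rest of the work, queued to run once the "resource" is ready.
    log.push(`finish ${id}`);
    done();
  }, 10);
  // The function returns here, long before the response is ready,
  // so the event loop is free to start the next request.
}

// Starts two requests back to back and reports the log when both are done.
function simulate(done) {
  const log = [];
  let pending = 2;
  const finished = () => {
    if (--pending === 0) done(log);
  };
  handleRequest(1, log, finished);
  handleRequest(2, log, finished);
}

simulate((log) => console.log(log.join(', ')));
```

Both requests are started before either one is finished, even though only one function ever runs at a time in this process.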
Below is a simple example where we open a file and read it to do some work on the data.
Synchronous way:

    Open File
    Repeat This:
        Read Some
        Do the work

Asynchronous way:

    Open File and Do this when it is ready: // Our function returns
        Repeat this:
            Read Some and when it is ready: // Returns again
                Do some work
As you see, our function asks the system to open a file and does not wait for it to be opened. It finishes itself by providing next steps after file is ready. When we return, Node runs other function calls on the queue. After running over all the functions, the event loop moves to next turn...
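The asynchronous outline can be written with Node's callback-based `fs.open` and `fs.read`. This is only a sketch: `processFile`, the 64 KiB chunk size and the `onChunk`/`done` callbacks are choices made for this illustration, and in practice a stream (`fs.createReadStream`) would often be the simpler tool.

```javascript
const fs = require('fs');

// Reads `path` chunk by chunk without ever blocking the event loop.
// `onChunk(buffer)` does the work on one chunk; `done(err, totalBytes)` runs once at the end.
function processFile(path, onChunk, done) {
  // "Open File and do this when it is ready": processFile returns right away.
  fs.open(path, 'r', (openErr, fd) => {
    if (openErr) return done(openErr);
    const buffer = Buffer.alloc(64 * 1024); // assumed chunk size
    let total = 0;

    function readSome() {
      // "Read some and when it is ready": readSome returns right away too.
      fs.read(fd, buffer, 0, buffer.length, null, (readErr, bytesRead) => {
        if (readErr) return fs.close(fd, () => done(readErr));
        if (bytesRead === 0) {
          // End of file: close the descriptor and report the total.
          return fs.close(fd, (closeErr) => done(closeErr, total));
        }
        onChunk(buffer.subarray(0, bytesRead)); // "Do some work"
        total += bytesRead;
        readSome(); // schedule the next read; other queued calls run in between
      });
    }

    readSome();
  });
}
```

Between any two of these callbacks, Node is free to run whatever else is waiting in the queue, which is exactly the hand-over described above.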
In summary, Node has a completely different paradigm than multi-threaded development; but this does not mean that it lacks things. For a synchronous job (where we can decide the order and way of processing), it works as well as multi-threaded parallelism. For a job that comes from outside like requests to a server, it simply is superior.
(1) Unless you are building libraries in other languages like C/C++ in which case you still do not create threads for dividing jobs. For this kind of work you have two threads one of which will continue communication with Node while the other does the real work.
(2) In fact, every Node process has multiple threads for the same reasons I mentioned in the first footnote. However, this is in no way like 1000 threads doing similar work. Those extra threads are for things like accepting IO events and handling inter-process messaging.
@Mark, thank you for the constructive criticism. In Node's paradigm, you should never have functions that take too long to process unless all other calls in the queue are designed to be run one after another. In the case of computationally expensive tasks, if we look at the complete picture, we see that this is not a question of "Should we use threads or processes?" but a question of "How can we divide these tasks in a well-balanced manner into sub-tasks that we can run in parallel, employing multiple CPU cores on the system?" Let's say we will process 400 video files on a system with 8 cores. If we want to process one file at a time, then we need a system that will process different parts of the same file, in which case, maybe, a multi-threaded single-process system will be easier to build and even more efficient. We can still use Node for this by running multiple processes and passing messages between them when state-sharing/communication is necessary. As I said before, a multi-process approach with Node is as good as a multi-threaded approach for this kind of task, but not better. Again, as I said before, the situation where Node shines is when these tasks come as input to the system from multiple sources, since keeping many connections open concurrently is much lighter in Node than in a thread-per-connection or process-per-connection system.
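For the other way of dividing the work (many files, a fixed number of cores), a small queue with a concurrency limit is enough. The sketch below is an assumption-laden illustration: `runPool` is a made-up helper, and each task is any function returning a promise, for example one that forks a child process to handle one video file.

```javascript
// Runs `tasks` (functions returning promises) with at most `limit` in flight.
// Resolves with the results in task order; rejects on the first failure.
async function runPool(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task waiting in the queue

  // Each slot keeps pulling the next queued task until the queue is empty.
  async function slot() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const slots = Array.from({ length: Math.min(limit, tasks.length) }, () => slot());
  await Promise.all(slots);
  return results;
}

// Usage sketch: 400 pretend "video files", 8 at a time.
// The timer is an assumed placeholder for real per-file work in a child process.
const files = Array.from({ length: 400 }, (_, i) => `video-${i}.mp4`);
const jobs = files.map((name) => () =>
  new Promise((resolve) => setTimeout(() => resolve(`${name}: done`), 1)));

runPool(jobs, 8).then((results) => console.log(results.length, 'files processed'));
```

Whenever a slot becomes free it takes the next task from the queue, so no more than 8 tasks are in flight and no core sits idle while work remains.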
On setTimeout(...,0) calls: sometimes it is necessary to take a break during a time-consuming task so that the calls in the queue get their share of processing. Dividing tasks in different ways can save you from this; but still, this is not really a hack, it is just the way event queues work. Also, using process.nextTick for this purpose is much better, since with setTimeout the elapsed time has to be calculated and checked, while process.nextTick is simply what we really want: "Hey task, go back to the end of the queue, you have used your share!" (In current Node versions that description fits setImmediate; process.nextTick callbacks now run before the event loop moves on to pending I/O.)
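A sketch of such a break, using `setImmediate`, which in current Node versions is the call that sends the rest of the work to the back of the queue (repeated `process.nextTick` calls would run before pending I/O gets a turn). `sumInSlices` and the slice size are hypothetical; the summing loop stands in for any long computation.

```javascript
// Sums the integers 0..n-1 in slices, yielding to the event loop between slices.
function sumInSlices(n, sliceSize, done) {
  let i = 0;
  let total = 0;

  function runSlice() {
    const end = Math.min(i + sliceSize, n);
    for (; i < end; i++) total += i; // one "share" of CPU work
    if (i < n) {
      setImmediate(runSlice); // "go back to the end of the queue"
    } else {
      done(total);
    }
  }

  // Start asynchronously so `done` is never called before sumInSlices returns.
  setImmediate(runSlice);
}

sumInSlices(1000000, 10000, (total) => console.log('sum:', total));
```

This keeps the process responsive, but all slices still run on one core; it is a way to share the event loop fairly, not a way to compute faster.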
(Update 2016: Web workers are going into io.js (a Node.js fork), then Node.js v7 - see below.)
(Update 2017: Web workers are not going into Node.js v7 or v8 - see below.)
(Update 2018: Web workers are going into Node.js v10.5.0 - see below.)
You can think of a web worker as a lightweight microservice that is accessed asynchronously. No state is shared. No locking problems exist. There is no blocking. There is no synchronization needed. Just like when you use a RESTful service from your Node program you don't worry that it is now "multithreaded" because the RESTful service is not in the same thread as your own event loop. It's just a separate service that you access asynchronously and that is what matters.
The same is with web workers. It's just an API to communicate with code that runs in a completely separate context and whether it is in different thread, different process, different cgroup, zone, container or different machine is completely irrelevant, because of a strictly asynchronous, non-blocking API, with all data passed by value.
As a matter of fact web workers are conceptually a perfect fit for Node which - as many people are not aware of - incidentally uses threads quite heavily, and in fact "everything runs in parallel except your code" - see:
But the web workers don't even need to be implemented using threads. You could use processes, green threads, or even RESTful services in the cloud - as long as the web worker API is used. The whole beauty of the message passing API with call by value semantics is that the underlying implementation is pretty much irrelevant, as the details of the concurrency model will not get exposed.
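As a rough illustration of that message-passing style, here is a sketch using Node's `worker_threads` module, which arrived later than the original text of this answer and whose API is similar to, but not identical with, the browser Web Worker API. It assumes a Node version where `worker_threads` is available without a flag; `fibInWorker` and the inline worker source are made up for this example.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline so the sketch is self-contained.
// It receives a copy of the message and posts a copy of the result back.
const workerSource = `
  const { parentPort } = require('worker_threads');
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.on('message', (job) => {
    parentPort.postMessage({ n: job.n, result: fib(job.n) });
  });
`;

// Computes fib(n) off the main event loop and resolves with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (reply) => {
      resolve(reply.result);
      worker.terminate(); // one job per worker in this sketch
    });
    worker.once('error', reject);
    worker.postMessage({ n }); // data is copied, not shared
  });
}

fibInWorker(20).then((result) => console.log('fib(20) =', result));
```

The calling code only sees an asynchronous request and a reply, so it would look the same if `fibInWorker` talked to a child process or a remote service instead.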
A single-threaded event loop is perfect for I/O-bound operations. It doesn't work that well for CPU-bound operations, especially long running ones. For that we need to spawn more processes or use threads. Managing child processes and the inter-process communication in a portable way can be quite difficult and it is often seen as an overkill for simple tasks, while using threads means dealing with locks and synchronization issues that are very difficult to do right.
What is often recommended is to divide long-running CPU-bound operations into smaller tasks (something like the example in the "Original answer" section of my answer to Speed up setInterval) but it is not always practical and it doesn't use more than one CPU core.
There are few modules that are supposed to add Web Workers to Node:
I haven't used any of them, but I have two quick observations that may be relevant: as of March 2015, node-webworker was last updated 4 years ago and node-webworker-threads was last updated a month ago. Also, I see in the node-webworker-threads usage example that you can pass a function instead of a file name as an argument to the Worker constructor, which seems like it may cause subtle problems if it is implemented using threads that share memory (unless the function is used only for its .toString() method and is otherwise compiled in a different environment, in which case it may be fine - I have to look more deeply into it, just sharing my observations here).
If there is any other relevant project that implements web workers API in Node, please leave a comment.
I didn't know it yet at the time of writing but incidentally one day before I wrote this answer Web Workers were added to io.js.
In Update 1 and my tweet I was referring to io.js pull request #1159 which now redirects to Node PR #1159 that was closed on Jul 8 and replaced with Node PR #2133 - which is still open. There is some discussion taking place under those pull requests that may provide some more up to date info on the status of Web workers in io.js/Node.js.
Latest info - thanks to NiCk Newman for posting it in the comments: There is the workers: initial implementation commit by Petka Antonov from Sep 6, 2015 that can be downloaded and tried out in this tree. See comments by NiCk Newman for details.
As of May 2016 the last comments on the still open PR #2133 - workers: initial implementation were 3 months old. On May 30 Matheus Moreira asked me to post an update to this answer in the comments below and he asked for the current status of this feature in the PR comments.
The first answers in the PR discussion were skeptical but later Ben Noordhuis wrote that "Getting this merged in one shape or another is on my todo list for v7".
All other comments seemed to second that, and as of July 2016 it seems that Web Workers should be available in the next version of Node, version 7.0, which is planned to be released in October 2016 (not necessarily in the form of this exact PR).
Thanks to Matheus Moreira for pointing it out in the comments and reviving the discussion on GitHub.
As of July 2016 there are a few modules on npm that were not available before - for a complete list of relevant modules, search npm for workers, web workers, etc. If anything in particular does or doesn't work for you, please post a comment.
As of January 2017 it is unlikely that web workers will get merged into Node.js.
The pull request #2133 workers: initial implementation by Petka Antonov from July 8, 2015 was finally closed by Ben Noordhuis on December 11, 2016 who commented that "multi-threading support adds too many new failure modes for not enough benefit" and "we can also accomplish that using more traditional means like shared memory and more efficient serialization."
For more information see the comments to the PR 2133 on GitHub.
Thanks again to Matheus Moreira for pointing it out in the comments.
I'm happy to announce that a few days ago, in June 2018, web workers appeared in Node v10.5.0 as an experimental feature activated with the
For more info, see:
Finally! I can make the 7th update to my 3 year old Stack Overflow answer where I argue that threading a la web workers is not against Node philosophy, only this time saying that we finally got it!