Well... I'm back to square one. I can't figure this out for the life of me.
I'm getting the following error:
FATAL ERROR: JS Allocation failed - process out of memory
I could enumerate the dozens (yes, dozens) of things I've tried to get to the root of this problem, but really it would be far too much. So here are the key points:
My assumption is that (because of the 2nd point) a leak is probably not the cause; rather, there is probably a SINGLE object that is very large. The following thread backs up this theory: In Node.js using JSON.stringify results in 'process out of memory' error
What I really need is some way to find out what the state of the memory is at the moment the application crashes, or perhaps a stack trace leading up to the FATAL ERROR.
Based on my assumption above, a 10-minute-old heap dump is insufficient, since the object would not yet have been in memory when the dump was taken.
I have to give huge props to Trevor Norris on this one for helping to modify node.js itself such that it would automatically generate a heap dump when this error happened.
Ultimately what solved this problem for me, though, was much more mundane. I wrote some simple code that appended the endpoint of each incoming API request to a log file. I waited to gather ~10 data points (crashes) and compared the endpoints which had been run in the 60 seconds before each crash. I found that in 9/10 cases, the same single endpoint had been hit just before the crash.
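The logging approach described above could look roughly like this. It is only a sketch, not the code I actually ran: the log path, the function names (formatHit, logHit, endpointLogger, hitsBeforeCrash) and the Express-style middleware signature are all assumptions made for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical log location; any writable path works.
const LOG_FILE = path.join(os.tmpdir(), 'endpoint-hits.log');

// Format one log line: ISO timestamp, HTTP method, and request path.
function formatHit(method, url, date = new Date()) {
  return `${date.toISOString()} ${method} ${url}\n`;
}

// Append synchronously so the line is written before the handler runs,
// even if the process dies shortly afterwards.
function logHit(method, url, file = LOG_FILE) {
  fs.appendFileSync(file, formatHit(method, url), 'utf8');
}

// Express-style middleware (assumed setup): app.use(endpointLogger)
function endpointLogger(req, res, next) {
  logHit(req.method, req.originalUrl || req.url);
  next();
}

// Given the log contents and a crash time, return the endpoints
// that were hit within `windowMs` (default 60 seconds) before the crash.
function hitsBeforeCrash(logText, crashTime, windowMs = 60 * 1000) {
  const crash = new Date(crashTime).getTime();
  return logText
    .split('\n')
    .filter(Boolean)
    .map((line) => {
      const [stamp, method, url] = line.split(' ');
      return { time: Date.parse(stamp), method, url };
    })
    .filter((hit) => hit.time <= crash && crash - hit.time <= windowMs)
    .map((hit) => `${hit.method} ${hit.url}`);
}
```

Running hitsBeforeCrash over the log for each crash timestamp and counting which endpoint shows up most often is essentially the comparison I did by hand.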
From there, it was just a matter of digging deeper into the code. I pared everything down -- returning less data from my MongoDB queries, passing only the necessary data from an object back to the callback, etc. Now we've gone 6x longer than average without a single crash on any of the servers, leading me to hope that it is resolved... for now.
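As a rough illustration of "passing only necessary data back to the callback", here is a minimal sketch. The pick helper, the summarizeUser function and the field names are hypothetical, not taken from my actual code.

```javascript
// Copy only the listed keys from `source` into a new, smaller object.
function pick(source, keys) {
  const out = {};
  for (const key of keys) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      out[key] = source[key];
    }
  }
  return out;
}

// Hypothetical handler: instead of handing the whole document to the
// callback, pass along only the fields the caller actually needs.
function summarizeUser(userDoc, callback) {
  callback(null, pick(userDoc, ['_id', 'name']));
}
```

The same idea applies one step earlier: asking the database for fewer fields in the first place means the large object never has to be built in memory at all.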
Just because this is the top answer on Google at the moment, I figured I'd add a solution for a case I just ran across:
I had this issue using express with ejs templates - the issue was that I failed to close an ejs block, and the file was js code - something like this:
This is obviously a super specific case; OP's solution should be used the majority of the time. However, OP's solution would not work for this (ejs stack trace won't be surfaced by