Mongoose indexing in production code


Question

According to the Mongoose documentation:

When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema. While nice for development, it is recommended this behavior be disabled in production since index creation can cause a significant performance impact. Disable the behavior by setting the autoIndex option of your schema to false.

This appears to recommend disabling auto-indexing before deploying, so that Mongoose does not tell MongoDB to churn through every index on application startup, which seems to make sense.

What is the proper way to handle indexing in production code? Should an external script generate the indexes? Or is ensureIndex unnecessary when a single application is the sole reader/writer of a collection, because the index will be updated every time a DB write occurs?

Edit: To supplement, MongoDB provides good documentation on how to do indexing, but not on why or when explicit indexing directives should be issued. It seems to me that indexes on a collection should be kept up to date automatically as applications write to it, and that ensureIndex is really more of a one-time thing (done when a new index is being applied), in which case Mongoose's autoIndex should be a no-op under a normal server restart.

1/15/2013 8:49:13 PM

Accepted Answer

I've never understood why the Mongoose documentation so broadly recommends disabling autoIndex in production. Once the index has been added, subsequent ensureIndex calls will simply see that the index already exists and then return. So it only has an effect on performance when you're first creating the index, and at that time the collections are often empty so creating an index would be quick anyway.

My suggestion is to leave autoIndex enabled unless you have a specific situation where it's giving you trouble; like if you want to add a new index to an existing collection that has millions of docs and you want more control over when it's created.
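For the specific situation where more control is wanted, one option is to decide the setting per environment instead of hard-coding it. This is only a minimal sketch: the helper name `schemaOptionsFor` and the use of a `NODE_ENV`-style value are assumptions for illustration, not part of Mongoose.

```javascript
// Hypothetical helper (not a Mongoose API): pick schema options per environment.
// `autoIndex` is the schema option named in the documentation quoted in the question.
function schemaOptionsFor(env) {
  // Keep automatic index creation on everywhere except production.
  return { autoIndex: env !== 'production' };
}

// Usage (assumes mongoose is installed and required elsewhere):
//   const userSchema = new mongoose.Schema(
//     { name: { type: String, index: true } },
//     schemaOptionsFor(process.env.NODE_ENV)
//   );
```

With this in place, development and test runs still create indexes on startup, while production leaves index creation to whatever process you choose.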

1/15/2013 9:05:15 PM

Although I agree with the accepted answer, it's worth noting that, according to the MongoDB manual, this isn't the recommended way of adding indexes on a production server:

If your application includes ensureIndex() operations, and an index doesn’t exist for other operational concerns, building the index can have a severe impact on the performance of the database.

To avoid performance issues, make sure that your application checks for the indexes at start up using the getIndexes() method or the equivalent method for your driver and terminates if the proper indexes do not exist. Always build indexes in production instances using separate application code, during designated maintenance windows.
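The startup check the manual describes could be sketched roughly as follows. The function names are hypothetical; the sketch assumes `Model` is a Mongoose model whose `schema.indexes()` returns `[fields, options]` pairs and whose `collection.indexes()` returns the server's index list in the same shape as `getIndexes()`. The comparison only handles plain ascending/descending key specs, so text or other special index types would need extra handling.

```javascript
// Return the expected key specs that have no matching existing index.
// `expected` is an array of key specs, e.g. [{ name: 1 }].
// `existing` is shaped like getIndexes() output,
// e.g. [{ key: { _id: 1 }, name: '_id_' }].
function findMissingIndexes(expected, existing) {
  // Field order is significant in compound indexes; JSON.stringify preserves it.
  const have = new Set(existing.map((index) => JSON.stringify(index.key)));
  return expected.filter((keys) => !have.has(JSON.stringify(keys)));
}

// Hypothetical startup check: throw if a schema-declared index is missing.
async function assertIndexesExist(Model) {
  const expected = Model.schema.indexes().map(([fields]) => fields);
  const existing = await Model.collection.indexes();
  const missing = findMissingIndexes(expected, existing);
  if (missing.length > 0) {
    throw new Error('Missing indexes: ' + JSON.stringify(missing));
  }
}
```

An application following the manual's advice would call something like `assertIndexesExist` once after connecting and exit if it throws, rather than building the index itself.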

Of course, it really depends on how your application is structured and deployed. If you are deploying to Heroku, for example, and you aren't using Heroku's preboot feature, then it is likely your application is not serving requests at all during startup, and so it's probably safe to create an index at that time.

In addition to this, from the accepted answer:

So it only has an effect on performance when you're first creating the index, and at that time the collections are often empty so creating an index would be quick anyway.

If you've managed to get your data model and queries nailed the first time around, this is fine, and that is often the case. However, if you are adding new functionality to your app, with a new DB query on a property that has no index, you'll often find yourself adding an index to a collection that already contains many documents.

This is the time when you need to be careful about adding indexes, and carefully consider the performance implications of doing so. For example, you could create the index in the background:

db.collection.ensureIndex({ name: 1 }, { background: true });
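The manual's advice to build indexes from separate application code could take the form of a small maintenance script. The sketch below is one possible shape: `buildIndexesInBackground` is a hypothetical name, and `collection` is assumed to expose `createIndex(keys, options)` as the MongoDB Node.js driver's collection object does. Note that `ensureIndex` has since been deprecated in favor of `createIndex`, and that MongoDB 4.2 and later ignore the `background` option.

```javascript
// Hypothetical maintenance helper, meant to be run outside the web application,
// for example during a maintenance window.
async function buildIndexesInBackground(collection, specs) {
  const names = [];
  for (const keys of specs) {
    // Build one index at a time so the load on the server stays predictable.
    const name = await collection.createIndex(keys, { background: true });
    names.push(name);
  }
  return names;
}

// Usage (assumes a connected Mongoose model named User, defined elsewhere):
//   await buildIndexesInBackground(User.collection, [{ name: 1 }]);
```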

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow