Enabling TTL on MongoDB GridFS

Anything you store in MongoDB’s GridFS is split across two collections. For instance, if your files are loaded in a GridFS bucket called ‘images’, you will see:

  • images.files
  • images.chunks

The .files collection contains the metadata about each file, but not its contents: filename, content type, file size, your own metadata, upload timestamp, and so on. The .chunks collection contains the actual file contents, stored as binary data (BSON BinData, not base64 text). Depending on the file size and the chunk size (255 KiB by default), a file has one or more chunks.
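To make the chunking arithmetic concrete, here is a minimal sketch that estimates how many documents a file will occupy in the .chunks collection. The helper name expectedChunkCount is hypothetical and not part of any driver; it only assumes the default chunk size of 255 KiB.

```javascript
// Default GridFS chunk size: 255 KiB.
const DEFAULT_CHUNK_SIZE_BYTES = 255 * 1024;

// Hypothetical helper (not a driver API): number of chunk documents
// needed to store a file of `lengthBytes` bytes.
function expectedChunkCount(lengthBytes, chunkSizeBytes = DEFAULT_CHUNK_SIZE_BYTES) {
    if (!Number.isInteger(lengthBytes) || lengthBytes < 0) {
        throw new RangeError('lengthBytes must be a non-negative integer');
    }
    if (!Number.isInteger(chunkSizeBytes) || chunkSizeBytes <= 0) {
        throw new RangeError('chunkSizeBytes must be a positive integer');
    }
    // Every chunk is full except possibly the last one.
    return Math.ceil(lengthBytes / chunkSizeBytes);
}

console.log(expectedChunkCount(1024 * 1024)); // a 1 MiB file needs 5 chunks
```

Each of those chunk documents has to be expired individually, which is why the TTL setup has to cover the .chunks collection as well.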

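The ‘files_id’ field in the .chunks collection acts as a foreign key to the ‘_id’ primary key in the .files collection. MongoDB does not enforce this relationship, so deleting a file document does not cascade to its chunks.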
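The relationship between the two collections can be sketched with plain objects. The helper assembleFile below is hypothetical, and it assumes each chunk's data has already been converted to a Node.js Buffer; string ids stand in for real ObjectIds. It only shows how files_id and the chunk sequence number n tie the chunks back to one file.

```javascript
// Hypothetical helper: rebuilds a file's bytes from plain chunk documents.
// Assumes chunk.data is a Buffer; ids are compared by their string form.
function assembleFile(fileId, chunkDocs) {
    const parts = chunkDocs
        .filter((chunk) => String(chunk.files_id) === String(fileId))
        .sort((a, b) => a.n - b.n) // n is the zero-based chunk sequence number
        .map((chunk) => chunk.data);
    return Buffer.concat(parts);
}

// Illustrative data only.
const chunks = [
    { files_id: 'file-a', n: 1, data: Buffer.from('world') },
    { files_id: 'file-b', n: 0, data: Buffer.from('other') },
    { files_id: 'file-a', n: 0, data: Buffer.from('hello ') },
];

console.log(assembleFile('file-a', chunks).toString()); // prints "hello world"
```

If the file document disappears but its chunks stay behind, nothing references them any more, and they simply take up space.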

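The .files collection contains an uploadDate field, which is ideal for a TTL index. The problem is that when the TTL process deletes a document from the .files collection, its chunks remain in the .chunks collection, because TTL deletion works on one collection at a time and knows nothing about GridFS. To have the chunks removed as well, you need to copy the uploadDate field onto each chunk document and create a TTL index on it there too.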

Because the two TTL indexes are processed independently, the chunks could be removed slightly before the file itself, leaving a file document whose contents can no longer be read. It is therefore best to set the TTL on the .chunks collection a little longer than the one on the .files collection. Allow at least a few minutes of margin, because the MongoDB TTL cleanup process runs only about once every 60 seconds and can lag further behind on a busy server.
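As a minimal sketch of that rule, the hypothetical helper below builds the two TTL index definitions for a bucket, giving the chunks a longer lifetime than the files. The function name, the shape of the returned objects and the default one-hour margin are assumptions for illustration; only the expireAfterSeconds index option is MongoDB's own.

```javascript
// Hypothetical helper: TTL index definitions for a GridFS bucket.
// The chunks expire `chunkMarginSeconds` later than the files.
function gridFsTtlIndexSpecs(bucketName, filesTtlSeconds, chunkMarginSeconds = 3600) {
    if (!Number.isInteger(filesTtlSeconds) || filesTtlSeconds < 0) {
        throw new RangeError('filesTtlSeconds must be a non-negative integer');
    }
    if (!Number.isInteger(chunkMarginSeconds) || chunkMarginSeconds < 0) {
        throw new RangeError('chunkMarginSeconds must be a non-negative integer');
    }
    return [
        {
            collection: `${bucketName}.files`,
            keys: { uploadDate: 1 },
            options: { expireAfterSeconds: filesTtlSeconds },
        },
        {
            collection: `${bucketName}.chunks`,
            keys: { uploadDate: 1 },
            options: { expireAfterSeconds: filesTtlSeconds + chunkMarginSeconds },
        },
    ];
}

const specs = gridFsTtlIndexSpecs('images', 86400);
console.log(specs.map((s) => `${s.collection}: ${s.options.expireAfterSeconds}s`));

// With a connected MongoDB driver `db` handle (assumed, not created here),
// the definitions could be applied like this:
//   for (const spec of specs) {
//       await db.collection(spec.collection).createIndex(spec.keys, spec.options);
//   }
```

With a one-day TTL on the files, this gives 86400 seconds for images.files and 90000 seconds for images.chunks.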

The code example below uses Node.js, mongoose and gridfs-promise to illustrate one way to implement this in your project.

./app.js:

import mongoose from 'mongoose';
import { GridFSPromise } from 'gridfs-promise';
import ImageFile from './models/imageFile.model.js';
import ImageChunk from './models/imageChunk.model.js';

(async () => {
    await mongoose.connect('mongodb://localhost/mydb', {useNewUrlParser: true});
    
    // The inclusion (import) of the mongoose models will trigger the TTL index
    // creation on the GridFs collections.
    console.log(`Enabling TTL on Image GridFS collection '${ImageFile.modelName}' ...`);
    console.log(`Enabling TTL on Image GridFS collection '${ImageChunk.modelName}' ...`);
    
    const imageData = Buffer.from('<base64 image data>', 'base64');
    const contentType = 'image/png';

    const gridFS = new GridFSPromise(mongoose.connection.name, null, null, 'images');
    gridFS.CONNECTION = mongoose.connection.getClient();
    let fileDoc;
    try {
        fileDoc = await gridFS.uploadFileString(imageData, 'myimage.png', contentType, {
            mymetadata: 'a random value'
        });
    } catch (error) {
        console.log(`Failed to load image: ${error}`);
        return;
    }

    try {
        // Ensure that the chunks collection also has an uploadDate field
        // upon which MongoDB's TTL can operate.
        await ImageChunk.updateMany(
            { files_id: fileDoc._id },
            { $set: { uploadDate: fileDoc.uploadDate } }
        );
        console.log('Image loaded successfully.');
    } catch (error) {
        console.log(`Failed to set uploadDate in image chunks collection: ${error}`);
    }
})();

./models/imageFile.model.js:

import mongoose from 'mongoose';

// Use GridFs "files" model so we can manage expiry date to enable
// DB managed TTL.
export default mongoose.model('ImageFile', new mongoose.Schema({
    _id: {
        type: mongoose.Schema.Types.ObjectId,
        required: true,
    },
    length: {
        type: Number,
        required: true,
    },
    chunkSize: {
        type: Number,
        required: true,
    },
    uploadDate: {
        type: Date,
        required: true,
        expires: 86400, // TTL one day.
    },
    filename: {
        type: String,
        required: true,
    },
    md5: {
        type: String,
        required: false,
    },
    contentType: {
        type: String,
        required: false,
    },
    metadata: {
        type: mongoose.Schema.Types.Mixed,
        required: false,
    },
}, {
    collection: 'images.files',
    autoCreate: false,
    autoIndex: true,
    versionKey: false,
}));

./models/imageChunk.model.js:

import mongoose from 'mongoose';

// Use GridFs "chunks" model so we can set expiry date to enable
// DB managed TTL.
export default mongoose.model('ImageChunk', new mongoose.Schema({
    _id: {
        type: mongoose.Schema.Types.ObjectId,
        required: true,
    },
    files_id: {
        type: mongoose.Schema.Types.ObjectId,
        required: true,
    },
    n: {
        type: Number,
        required: false,
    },
    data: {
        type: mongoose.Schema.Types.Mixed,
        required: true,
    },
    uploadDate: {
        type: Date,
        required: false,
        expires: 90000, // 25 hours: one hour longer than the images.files TTL.
    },
}, {
    collection: 'images.chunks',
    autoCreate: false,
    autoIndex: true,
    minimize: false,
    versionKey: false,
}));
