Full Stack File Management Part I: The Backend

2021-07-11

javascript
node.js
mongodb

A guide to implementing file uploads, downloads and access control in a full stack application. Part I focuses on the data model, the backend and other server-side concerns.

Every full stack developer will have to implement file uploads, downloads and access control at some point. There are many ways of achieving this — from just storing all files on your server to having a managed file system across multiple servers.

A day in the life of a full stack developer

In this article, we'll cover the server side of file management. In Part II of this guide, we'll look at the front-end, client-interaction side of things. The guide is aimed at intermediate-level developers.

Part I: The Backend

Let's decide what we want our file system to achieve. Imagine we have a system where users belong to organizations (think Slack, monday.com, etc.).

  • Signed-in users can upload files
  • Uploaded files are accessible by
    • only the users who created them (default)
    • invited personnel
    • entire organization
    • public
  • Only the uploader can delete the files

Storage

Once the files are uploaded into the system, we need somewhere to store them. For this example we'll use Google Cloud Storage. You'll need service account credentials to use the client library; head over to the Google Cloud docs to set this up.

Alternatively, you can use a similar hosting service such as Amazon S3 or block storage, or even store the files on your server itself.

Store your GOOGLE_APPLICATION_CREDENTIALS safely and make the path to them available to the server (preferably through an environment variable). Then create a bucket in your storage. You can do this from the command line or with the online storage browser. This bucket will be the root of where we'll put the files.

We can put all the files at the root, but that's highly disorganized. Instead, we'll create a folder for each organization, and within it a folder for the module of the app from which the file was uploaded (e.g. profile, account, messages, etc.). We'll store a reference to each file in our database.

We're now ready to use the storage API.

Modeling

We're storing references to the actual file in the database. Every file object will need to store:

  • id: unique file identifier
  • name: file/display name
  • path: actual location of the file in the bucket
  • size: file size
  • mime: type of the file (e.g. application/pdf)
  • date: date of creation
  • desc: description (if any)
  • uploader: reference to the user who uploaded the file
  • access: type of file access (private/restricted/org/public)
  • aclist: list of people having access to the file

Model the fields in your database of choice. We're using MongoDB here, specifically through the Mongoose wrapper.
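A minimal sketch of such a schema with Mongoose, not a definitive implementation. The model name File, the organization field (described later as being set on the backend) and the 'Organization' ref are assumptions; the unique id is MongoDB's automatic _id. The mongoose module is passed in as an argument only to keep the sketch free of top-level dependencies.

```javascript
// Access levels, matching the requirements listed earlier.
const ACCESS_TYPES = ['private', 'restricted', 'org', 'public'];

// Usage (assuming mongoose is installed):
//   const File = defineFileModel(require('mongoose'));
function defineFileModel(mongoose) {
  const { Schema } = mongoose;
  const { ObjectId } = Schema.Types;

  const FileSchema = new Schema({
    // _id (the unique file identifier) is added by MongoDB automatically.
    name: { type: String, required: true },
    path: { type: String, required: true },
    size: Number,
    mime: String,
    date: { type: Date, default: Date.now },
    desc: String,
    organization: { type: ObjectId, ref: 'Organization' }, // assumed model name
    uploader: { type: ObjectId, ref: 'User', required: true },
    access: { type: String, enum: ACCESS_TYPES, default: 'private' },
    aclist: [{ type: ObjectId, ref: 'User' }],
  });

  return mongoose.model('File', FileSchema);
}
```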

In this schema, the uploader and aclist fields hold ObjectId values (MongoDB's document identifier type) that reference User, our user schema.

Great, now let's accept those files!

Handling Uploads

There are multiple ways to do file uploads. You can accept the files directly on your server – in which case you will have to transfer the file from the server to the bucket (this step is skipped if you're storing the files locally).

In our case, we'll generate signed URLs and send them to the client, so the client can upload files directly into the bucket. This is especially helpful if you're going to handle a lot of large files, as it reduces network load on the server.

Low Effort Trade Offer

Let's set up our upload endpoint. First, we need to check that the user is signed in.
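A minimal sketch of the sign-in check. It assumes an authentication layer (session or token middleware, not covered here) has already attached the signed-in user to req.user; that property and the helper name requireUser are hypothetical. The res.status().json() calls follow the response API shared by Express and Next.js API routes.

```javascript
// Returns the signed-in user, or sends a 401 response and returns null.
// In a handler:
//   const user = requireUser(req, res);
//   if (!user) return;
function requireUser(req, res) {
  if (!req.user) {
    res.status(401).json({ error: 'You must be signed in.' });
    return null;
  }
  return req.user;
}
```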

Next, let's get the parameters from the request. We're restricting the endpoint to POST only, so the parameters arrive in the request body.

One of these parameters is uploadPath (an array of strings), the location of the upload within the bucket. For example, ['account', 'profile'] means the file will be uploaded to /organization_id/account/profile.
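A sketch of reading and validating the request, assuming the body carries name, mime and uploadPath. The field names other than uploadPath and the helper parseUploadRequest are assumptions. Rejecting segments such as ".." keeps a client from writing outside its organization's folder.

```javascript
// Returns { status, error } on failure or { status: 200, params } on success.
function parseUploadRequest(req) {
  if (req.method !== 'POST') {
    return { status: 405, error: 'Only POST is allowed.' };
  }

  const { name, mime, uploadPath } = req.body || {};
  if (typeof name !== 'string' || name.length === 0) {
    return { status: 400, error: 'name is required.' };
  }

  // Each segment must be a plain folder name: no slashes, no "." or "..".
  const validSegment = (s) =>
    typeof s === 'string' &&
    s.length > 0 &&
    !s.includes('/') &&
    !s.includes('\\') &&
    s !== '.' &&
    s !== '..';
  if (!Array.isArray(uploadPath) || !uploadPath.every(validSegment)) {
    return { status: 400, error: 'uploadPath must be an array of folder names.' };
  }

  return { status: 200, params: { name, mime, uploadPath } };
}
```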

Now we need to generate the file's path within the bucket. Add a random UUID to the filename to prevent duplicates, and use the path library's join() to assemble the path.
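A small sketch of that step using only Node's built-in path and crypto modules. The helper name buildFilePath and the "uuid-filename" naming pattern are assumptions; path.posix.join is used rather than plain join() so the result uses forward slashes on every operating system.

```javascript
const path = require('path');
const { randomUUID } = require('crypto');

function buildFilePath(organizationId, uploadPath, name) {
  // basename() drops any directory part a client may have put in the name.
  const safeName = path.posix.basename(name);
  // Result looks like: <organizationId>/<...uploadPath>/<uuid>-<name>
  return path.posix.join(
    String(organizationId),
    ...uploadPath,
    `${randomUUID()}-${safeName}`
  );
}
```

Because a fresh UUID is generated on every call, uploading two files with the same name to the same folder produces two distinct paths.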

Now let's generate a v4 signed URL for the upload and send it back to the client.
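A sketch of the signing step, using the getSignedUrl() method that the @google-cloud/storage library provides on file objects. The helper name createUploadUrl and the 15-minute expiry are assumptions, and the bucket is passed in as an argument so the function itself has no top-level dependencies.

```javascript
// Usage (assuming @google-cloud/storage is installed and credentials are set up):
//   const { Storage } = require('@google-cloud/storage');
//   const bucket = new Storage().bucket(process.env.BUCKET_NAME); // hypothetical env var
//   const url = await createUploadUrl(bucket, filePath, 'application/pdf');
async function createUploadUrl(bucket, filePath, contentType) {
  const [url] = await bucket.file(filePath).getSignedUrl({
    version: 'v4',
    action: 'write',
    expires: Date.now() + 15 * 60 * 1000, // valid for 15 minutes
    contentType, // the client must send the same Content-Type header
  });
  return url;
}
```

The client then sends the file with an HTTP PUT request to the returned URL, using the same Content-Type it asked for.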

The client can now upload the file directly to the bucket through the signed URL.

Worst Trade in the History

Note that we're not saving the file metadata to the database at this point, to keep request handling simple. We'll save the file metadata once the client submits the form that contains the file(s) to be uploaded.

Saving File Meta

Usually you won't be storing file meta information individually into the database. Files are referenced or embedded within other schemas.

For example, you could have a request that adds a message containing a reference to a File. In that case, you'll save the file to the files collection and save the reference to it in the messages collection.

An example request handler could look like
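A minimal sketch of such a handler, under stated assumptions: File and Message are Mongoose models passed in by the caller, the message fields (text, sender, attachment) are hypothetical, and the user object carries _id and organization. Only known fields are copied from the client's attachment, so the client cannot set uploader or organization itself.

```javascript
async function saveMessageWithAttachment({ File, Message }, user, body) {
  const { text, attachment } = body;
  const { name, desc, path, size, mime, date, access, aclist } = attachment;

  // Save the file metadata first; ownership fields come from the session.
  const file = await File.create({
    name, desc, path, size, mime, date, access, aclist,
    organization: user.organization,
    uploader: user._id,
  });

  // Then store a reference to it on the message.
  return Message.create({ text, sender: user._id, attachment: file._id });
}
```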

Here, attachment is the file object sent by the client, either on form submission or after the file upload is confirmed (with validation and checks, of course).

A typical attachment object will contain the name, desc, path, size, mime, date, access and aclist properties for the file; the organization and uploader properties are set on the backend.

File Access

There are multiple ways to handle this one too. You could fetch the file on the server and stream it back to the client, or send a reference to the locally stored file, or, as in our case, generate a read-only signed URL and redirect to it on access.

The flow for the client would be: the user clicks a linked file, then a new tab opens and loads or downloads the file (depending on the file type and browser settings).

On our GET endpoint for requesting the file, we can fetch a file by its ID (_id) or by its path.

As with uploads, we need a sign-in check, followed by availability and access checks. For availability, a simple "file not found" response suffices.
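A sketch of the lookup and availability check, assuming the ID or path arrives as query parameters named id and path (hypothetical names) and that File is the Mongoose model, passed in by the caller.

```javascript
// Resolves to the file document, or null when nothing matches.
// In a handler:
//   const file = await findRequestedFile(File, req.query);
//   if (!file) return res.status(404).json({ error: 'File not found' });
async function findRequestedFile(File, query) {
  const { id, path } = query || {};
  if (id) return File.findById(id);
  if (path) return File.findOne({ path });
  return null;
}
```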

  • For private files, access is restricted to the uploader.
  • For restricted files, access is restricted to the allowed list (set from the client side).
  • For organization-wide files, access is restricted to members of the organization.
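These rules can be sketched as a single predicate. It assumes the file document carries uploader, organization, access and aclist, and the user carries _id and organization. Letting the uploader through at every access level is an assumption, not something the rules above state.

```javascript
// ObjectIds and strings are compared by their string form.
const sameId = (a, b) => a != null && b != null && String(a) === String(b);

function canAccess(file, user) {
  if (file.access === 'public') return true;
  // Assumption: the uploader can always reach their own file.
  if (sameId(file.uploader, user._id)) return true;

  switch (file.access) {
    case 'restricted':
      return (file.aclist || []).some((id) => sameId(id, user._id));
    case 'org':
      return sameId(file.organization, user.organization);
    default:
      // 'private' or an unknown value: uploader only.
      return false;
  }
}
```

A handler would typically answer with a 403 response when canAccess() returns false.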

Once all checks are passed, we will generate the signed URL using the storage library.

And redirect to the link in our response.
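A sketch of these last two steps together, again with the bucket passed in. The helper name redirectToFile and the 5-minute expiry are assumptions; res.redirect(status, url) is available in both Express and Next.js API routes.

```javascript
async function redirectToFile(bucket, file, res) {
  const [url] = await bucket.file(file.path).getSignedUrl({
    version: 'v4',
    action: 'read', // read-only link
    expires: Date.now() + 5 * 60 * 1000, // valid for 5 minutes
  });
  res.redirect(302, url);
}
```

A short expiry is usually enough here, since the link is used immediately after the redirect.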

The browser will either preview the file or prompt the user to save it to "downloads", depending on the file type and the browser (and OS) settings.

File Deletion

Normally we would remove the files from the bucket once any object containing a file meta object is removed from the database.

Deletion starts with the same sign-in checks. Additionally, we need to check that the file to be deleted exists and was created by the user requesting the deletion.

So, after the sign-in checks, on our DELETE endpoint:

Check if the file exists in the database (and is owned by the current user)

Yeet the file

If all the checks pass, we can yeet the file from the bucket by calling delete() on the bucket's file object for that path.
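A sketch of the whole deletion flow, assuming File is the Mongoose model and bucket is the storage bucket, both passed in by the caller. The helper name deleteFile and the order of operations (bucket first, then database) are assumptions.

```javascript
// Resolves to true when the file was deleted, or false when it doesn't
// exist or doesn't belong to the requesting user.
async function deleteFile({ File, bucket }, fileId, user) {
  // Matching on uploader treats "not found" and "not yours" the same way.
  const file = await File.findOne({ _id: fileId, uploader: user._id });
  if (!file) return false;

  await bucket.file(file.path).delete(); // remove the object from the bucket
  await File.deleteOne({ _id: file._id }); // then remove its metadata
  return true;
}
```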

In the deletion call, bucket is an instance of a Cloud Storage bucket.

Conclusion

Welp, this should give you some idea of a file management backend. The levels of access control for a file may vary. Sometimes you will have to handle multiple uploads at once, manage storage across multiple buckets, or use another storage solution entirely. Signed URLs will not always be the best way of handling file storage; many a time you will have to accept uploads on your own server.

In Part II, we'll cover a way of handling files on the front end with React.

© avikantz, 2024