2021-07-11
Every full-stack developer needs to implement file uploads, downloads and access control at some point in their projects. This handy guide explores the implementation of all aspects of file management, with Part I focusing on the data, backend and server side of things.
There are many ways of achieving this, from simply storing all files on your server to running a managed file system across multiple servers.
In this article, we'll cover the server side of file management. In Part II of this guide, we'll take a look at the front-end, client-interaction side of things. This guide is aimed at intermediate-level developers.
Part I: The Backend
Let's decide what we want our file system to achieve. Imagine we have a system where users belong to organizations (think Slack, Monday, etc.).
- Signed-in users can upload files
- Uploaded files are accessible by
  - only the users who created them (default)
  - invited personnel
  - the entire organization
  - the public
- Only the uploader can delete the files
Storage
Once the files are uploaded into the system, we need somewhere to store them. For this example we'll use Google Cloud Storage. You'll need service account credentials to use the client library; head over to the Google Cloud docs to set this up.
Alternatively, you can use a similar file hosting service such as Amazon S3, use block storage, or even store the files on your server itself.
Store the credentials file safely and point GOOGLE_APPLICATION_CREDENTIALS at its path on the server (preferably in the env).
Then create a bucket in your storage. You can do this from the command line or with the online storage browser. This bucket will be the root of where we'll be putting the files.
We could put all the files at the root, but that would be highly disorganized. Instead we'll create folders for organizations, and inside them folders for the module of the app from which the file was uploaded (e.g. profile, account, messages). We'll store a reference to each file in our database.
We're now ready to use the storage API.
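As a rough sketch of that setup, assuming the official @google-cloud/storage client for Node.js: its Storage constructor reads the credentials from the file GOOGLE_APPLICATION_CREDENTIALS points to when called without options. The helper name openBucket and the bucket name are placeholders, and the Storage class is passed in as an argument only to keep the helper easy to test.

```javascript
// Sketch only: `openBucket` and the bucket name are illustrative, not from the article.
// `Storage` is expected to be the class exported by @google-cloud/storage.
function openBucket(Storage, bucketName) {
  if (!bucketName) {
    throw new Error('A bucket name is required');
  }
  // Called without options, the client picks up the key file that
  // GOOGLE_APPLICATION_CREDENTIALS points to.
  const storage = new Storage();
  return storage.bucket(bucketName);
}

// Typical wiring (needs `npm install @google-cloud/storage`):
// const { Storage } = require('@google-cloud/storage');
// const bucket = openBucket(Storage, 'my-app-files');
```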
Modeling
We're storing references to the actual file in the database. Every file object will need to store
| Field | Description |
| --- | --- |
| id | unique file identifier |
| name | file/display name |
| path | actual location of the file in the bucket |
| size | file size |
| mime | type of the file (e.g. application/pdf) |
| date | date of creation |
| desc | description (if any) |
| uploader | reference to the user who uploaded the file |
| access | type of file access (private/restricted/org/public) |
| aclist | list of people having access to the file |
Model the fields in your database of choice. We're using MongoDB here, specifically through the Mongoose wrapper.
In the schema, ObjectId is MongoDB's object ID type and User is our user schema.
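One way the fields above could map onto a Mongoose schema is sketched below. It is an illustration, not the original schema: the function name fileSchemaDefinition, the 'Organization' model name and the validation options are assumptions. The ObjectId type is passed in so the definition can be built without a database connection.

```javascript
// Sketch only: returns a plain schema definition in the shape Mongoose accepts.
// Pass mongoose.Schema.Types.ObjectId as `ObjectId` in a real app.
function fileSchemaDefinition(ObjectId) {
  return {
    // No explicit `id`: Mongoose adds an `_id` to every document.
    name: { type: String, required: true },
    path: { type: String, required: true, unique: true },
    size: { type: Number, min: 0 },
    mime: { type: String },
    date: { type: Date, default: Date.now },
    desc: { type: String, default: '' },
    uploader: { type: ObjectId, ref: 'User', required: true },
    organization: { type: ObjectId, ref: 'Organization' }, // assumed model name
    access: {
      type: String,
      enum: ['private', 'restricted', 'org', 'public'],
      default: 'private',
    },
    aclist: [{ type: ObjectId, ref: 'User' }],
  };
}

// Typical wiring (needs `npm install mongoose`):
// const mongoose = require('mongoose');
// const definition = fileSchemaDefinition(mongoose.Schema.Types.ObjectId);
// const File = mongoose.model('File', new mongoose.Schema(definition));
```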
Great, now let's accept those files!
Handling Uploads
There are multiple ways to do file uploads. You can accept the files directly on your server, in which case you will have to transfer each file from the server to the bucket (a step you can skip if you're storing the files locally).
In our case, we'll generate signed URLs and send them to the client, so the client can upload files directly into the bucket. This is especially helpful if you're going to handle a lot of large files, as it reduces network load on the server.
Let's set up our upload endpoint. Firstly we need to check if the user is signed in or not.
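The article doesn't prescribe a framework or an auth mechanism, so the following is only a sketch of the sign-in check as an Express-style middleware. It assumes an earlier session or token layer has already put the signed-in user on req.user; the name requireSignIn is made up for the example.

```javascript
// Sketch only: Express-style middleware. `req.user` is assumed to be set by
// whatever session or token layer the app already uses.
function requireSignIn(req, res, next) {
  if (!req.user) {
    // Stop here: unauthenticated requests never reach the upload logic.
    return res.status(401).json({ error: 'Sign in to continue' });
  }
  return next();
}
```

It would typically be mounted in front of every file route, for example `app.post('/files/upload', requireSignIn, handler)`, where the route path is again just a placeholder.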
Let's get the parameters from the request. We're restricting the endpoint to POST only, so the parameters arrive in the request body.
Here, uploadPath ([string], an array of strings) is the location of the upload within the bucket. For example, ['account', 'profile'] means the file will be uploaded to /organization_id/account/profile.
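Reading and validating those parameters could look roughly like this. Only uploadPath is named in the text; the other body fields (name, mime), the status codes and the function name parseUploadRequest are assumptions made for the sketch.

```javascript
// Sketch only: the body fields `name` and `mime` are assumed; `uploadPath`
// is the array of folder names described above.
function parseUploadRequest(method, body) {
  if (method !== 'POST') {
    return { ok: false, status: 405, error: 'Only POST is allowed' };
  }
  const { name, mime, uploadPath } = body || {};
  const validPath =
    Array.isArray(uploadPath) &&
    uploadPath.length > 0 &&
    uploadPath.every((part) => typeof part === 'string' && part.length > 0);
  if (typeof name !== 'string' || name.length === 0 || !validPath) {
    return { ok: false, status: 400, error: 'name and uploadPath are required' };
  }
  // Fall back to a generic binary type when the client sends none.
  return { ok: true, name, mime: mime || 'application/octet-stream', uploadPath };
}
```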
Now we need to generate the path within the bucket for the file. Add a random UUID to the filename to prevent duplicates, and use the path library's join() to build the path.
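A minimal sketch of that step follows. It swaps in Node's built-in crypto.randomUUID() for whichever UUID library you prefer, and uses path.posix.join() instead of plain join() so the result keeps forward slashes even on a Windows server. The function name and the segment validation are additions for the example.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Sketch only: `buildObjectPath` and its argument names are illustrative.
function buildObjectPath(organizationId, uploadPath, filename) {
  const segments = [String(organizationId), ...uploadPath];
  // Reject segments that could climb out of the organization's folder.
  for (const segment of segments) {
    const invalid =
      typeof segment !== 'string' ||
      segment === '' ||
      segment === '.' ||
      segment === '..' ||
      segment.includes('/');
    if (invalid) {
      throw new Error(`Invalid path segment: ${segment}`);
    }
  }
  // basename() drops any directory part hidden in the client-supplied name.
  const safeName = path.posix.basename(filename);
  // The random prefix keeps two uploads of "report.pdf" from colliding.
  return path.posix.join(...segments, `${crypto.randomUUID()}-${safeName}`);
}
```

With an organization ID of org123 and an uploadPath of ['account', 'profile'], the result has the shape org123/account/profile/&lt;uuid&gt;-report.pdf.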
Now let's generate a v4 signed URL for the upload and send it back to the client.
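Generating the URL might look like the sketch below. It assumes bucket is a Bucket object from @google-cloud/storage and relies on that library's file(...).getSignedUrl(...) call. The function name and the 15-minute lifetime are arbitrary choices for the example.

```javascript
// Sketch only: `bucket` is assumed to be a Bucket from @google-cloud/storage.
async function createUploadUrl(bucket, objectPath, contentType) {
  const [url] = await bucket.file(objectPath).getSignedUrl({
    version: 'v4',
    action: 'write',
    // Arbitrary 15-minute lifetime; pick what suits your upload sizes.
    expires: Date.now() + 15 * 60 * 1000,
    // The client must send the same Content-Type header with its PUT request.
    contentType,
  });
  return url;
}
```

The handler would then respond with both the URL and the object path, since the client needs the path later when it submits the file's metadata.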
The client can now upload files directly to the signed URL.
Note that we're not saving the file metadata to the database at this point, which keeps request handling simple. We'll save the file metadata once the client submits the form that contains the file(s) to be uploaded.
Saving File Meta
Usually you won't store file meta information on its own in the database. Files are referenced by or embedded within other schemas.
For example, you could have a request that adds to the messages collection, where each message contains a reference to File. In that case you'll save the file to the files collection and save the reference to it in the messages collection.
In such a request handler, attachment is the file object sent from the client, either on form submission or after the upload is confirmed (with validation and checks, of course).
A typical attachment object will contain name, desc, path, size, mime, date, access and aclist properties for the file; the organization and uploader properties are set on the backend.
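A handler along these lines could be sketched as follows. File and Message are assumed to be Mongoose models passed in by the caller, user is the signed-in user with _id and organization fields, and the message fields text and sender are invented for the example.

```javascript
// Sketch only: `File` and `Message` are assumed Mongoose models; the message
// fields `text` and `sender` are hypothetical.
async function saveMessageWithAttachment({ File, Message }, user, body) {
  const { name, desc, path, size, mime, date, access, aclist } = body.attachment;
  // Copy only the expected fields, then set ownership on the server so a
  // client cannot claim another user's or organization's identity.
  const file = await File.create({
    name, desc, path, size, mime, date, access, aclist,
    organization: user.organization,
    uploader: user._id,
  });
  // The message stores a reference to the file, not a copy of its metadata.
  return Message.create({ text: body.text, sender: user._id, attachment: file._id });
}
```

Picking the fields one by one, rather than spreading the whole attachment object, means a stray uploader or organization value from the client is simply ignored.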
File Access
There are multiple ways to handle this one too. You could fetch the file on the server and stream it back to the client, send a reference to the locally stored file, or, as in our case, generate a read-only signed URL and redirect to it on access.
The flow for the client would be: click a linked file, then a new tab opens and loads or downloads the file (depending on the file type and browser settings).
On our GET endpoint for requesting the file, we can fetch a file by its ID (_id) or by its path.
As with uploads, we need a sign-in check, followed by availability and access checks. A simple "file not found" response suffices for availability.
- For private files, access is restricted to the uploader
- For restricted files, access is restricted to the allowed list (set from the client side)
- For organization-wide files, access is restricted to anyone in the organization
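These rules can be sketched as a single function. It assumes the file object follows the model described earlier (uploader, organization, access, aclist) and that the signed-in user has _id and organization fields. Letting the uploader through regardless of access type is an assumption of the sketch, as is the name canAccess.

```javascript
// Sketch only: ids are compared as strings so ObjectIds and plain strings
// both work.
function canAccess(file, user) {
  const userId = String(user._id);
  // Assumption: the uploader can always reach their own file.
  if (String(file.uploader) === userId) return true;
  switch (file.access) {
    case 'public':
      return true;
    case 'org':
      return (
        file.organization != null &&
        String(file.organization) === String(user.organization)
      );
    case 'restricted':
      return (file.aclist || []).some((id) => String(id) === userId);
    default:
      // 'private' and anything unrecognized: uploader only.
      return false;
  }
}
```

Falling through to false for unknown access values means a typo or a new access type denies access instead of granting it.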
Once all checks are passed, we will generate the signed URL using the storage library.
And redirect to the link in our response.
The file will either be previewed within the browser or offered as a download, depending on the file type and the browser (and OS) settings.
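Put together, the lookup, signed URL and redirect might be sketched like this. File is assumed to be the Mongoose model, bucket a Bucket from @google-cloud/storage, and req and res Express-style objects. The route parameter id and the five-minute lifetime are placeholders.

```javascript
// Sketch only: `File`, `bucket`, `req` and `res` are supplied by the app.
async function serveFile({ File, bucket }, req, res) {
  const file = await File.findById(req.params.id);
  if (!file) {
    return res.status(404).json({ error: 'File not found' });
  }
  // Access checks for the signed-in user belong here, before any URL is issued.
  const [url] = await bucket.file(file.path).getSignedUrl({
    version: 'v4',
    action: 'read',
    expires: Date.now() + 5 * 60 * 1000, // short-lived; the value is arbitrary
  });
  return res.redirect(url);
}
```

A short lifetime is usually enough here because the browser follows the redirect immediately.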
File Deletion
Normally we would remove a file from the bucket once the object that contains its file meta object is removed from the database.
Deletion starts with the same sign-in checks. Additionally, we need to check that the file to be deleted exists and was created by the user requesting the deletion.
So, after the sign-in checks, on our DELETE endpoint, check that the file exists in the database (and is owned by the current user).
If all the checks pass, we can yeet the file from the bucket as such
Here, bucket is an instance of a Cloud Storage bucket.
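The whole deletion flow could be sketched as below, again assuming a Mongoose File model and a @google-cloud/storage Bucket. The function name and the returned status objects are made up for the example.

```javascript
// Sketch only: `File` is an assumed Mongoose model and `bucket` a
// @google-cloud/storage Bucket.
async function deleteFile({ File, bucket }, user, fileId) {
  // Matching on `uploader` folds the ownership check into the lookup, so
  // someone else's file looks the same as a missing one.
  const file = await File.findOne({ _id: fileId, uploader: user._id });
  if (!file) {
    return { status: 404, error: 'File not found' };
  }
  // Remove the stored object first; if this throws, the database record
  // survives and the deletion can be retried.
  await bucket.file(file.path).delete();
  await File.deleteOne({ _id: file._id });
  return { status: 204 };
}
```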
Conclusion
Welp, this should give you some idea of what a file management backend involves. The levels of access control for a file may vary. Sometimes you will have to handle multiple uploads at once, manage storage across multiple buckets, or use another storage solution altogether. Signed URLs will not always be the best way of handling file storage; often you will have to accept uploads on your server.
In Part II, we'll cover a way of handling files on the front end with React.