One of the first big mistakes a beginner programmer makes is to treat every HTTP request as sequential, doing all of its work before responding. It’s certainly easier to architect an application in which all the necessary actions occur in a single HTTP transaction.
When there is only one client accessing the application this works fine, but it takes just one client generating a complex report to saturate the underlying machine. This happens because a report in a real application won’t take seconds; it can take minutes or hours.
In general, API HTTP transactions are meant to be as fast as possible. Besides affecting server performance, waiting 10+ seconds on a loading screen is a bad user experience. Now imagine a user clicking on generate report, waiting 1 second with no response, and clicking it again and again. That way, a server can easily become overwhelmed and unresponsive.
There are techniques we can use while architecting applications to prevent this problem.
General rule
Before going into the details, I would like to state the general rule I use for APIs: from a resource consumption point of view, API routes should be more of a gateway than a complex computation model. For example, in API routes we should generally apply business rules and move data between disk, cache, database and network. The following are examples of duties that an API should not perform:
- Query the entire database
- Process images
- Launch Linux processes
- Contact an external service and wait for it to respond, even though we don’t need that service’s confirmation to consider the operation completed (more on that later)
- Anything that takes more than 500 milliseconds
Why can’t I do heavy tasks in the API?
Knowing that, we can now introduce techniques to offload this work from the API.
Method 1: Background Jobs
One solution is to isolate the actions needed to perform the activity into a process that runs outside the API’s process. That way, when a report is requested, the API can “call and forget” the other service responsible for generating the report.
Let’s walk through an example:
- The client asks the API to generate the report.
- The API calls the other service via HTTP, passing the client’s e-mail, the type of report to generate, and the reference date. As soon as the call is made, the API can return a success response to the client. In total, that process could take as little as 10ms.
- The other service generates the report in 30 seconds and sends it to the user via e-mail.
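The flow above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the "other service" is a worker thread in the same process, the queue is in memory, and names such as `request_report`, `generate_report` and `sent_emails` are hypothetical. In a real system the worker would be a separate process or machine.

```python
import queue
import threading
import time

# Hypothetical in-memory stand-ins for the job channel and the mail server.
jobs = queue.Queue()
sent_emails = []


def generate_report(report_type, reference_date):
    """Placeholder for the slow work (30 seconds in the example above)."""
    time.sleep(0.05)
    return f"{report_type} report for {reference_date}"


def worker():
    """Runs outside the request path: takes jobs, builds reports, 'e-mails' them."""
    while True:
        job = jobs.get()
        body = generate_report(job["report_type"], job["reference_date"])
        sent_emails.append({"to": job["email"], "body": body})
        jobs.task_done()


def request_report(email, report_type, reference_date):
    """API handler: enqueue the job and return immediately."""
    jobs.put({
        "email": email,
        "report_type": report_type,
        "reference_date": reference_date,
    })
    return {"status": "accepted"}


threading.Thread(target=worker, daemon=True).start()

response = request_report("user@example.com", "sales", "2024-01-31")
print(response)  # returned before the report exists

jobs.join()  # demo only: wait so the script does not exit before the worker finishes
print(sent_emails)
```

The handler only does a queue insert, so its cost does not depend on how long the report takes to build.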
That’s a simple case in which background jobs make a real improvement. The API now handles that request for no more than about 10ms. It no longer queries a large portion of the database, nor uses the CPU, RAM and disk needed to aggregate the results and send them to the client.
You might worry that the final user experience would be bad: the user is told the operation succeeded, but the report is not showing on the screen, nor will it be in their e-mail for the next 30 seconds. There are strategies we can implement in the front-end to mitigate this degradation.
A good approach is to show the user a message such as: “We’re collecting the data and generating your report. In about one minute, we’ll send it to your e-mail.”
You have probably remembered a service you used that showed a message like this. The user is fully aware that the process isn’t instantaneous, and can expect the processed report in their e-mail within one minute.
E-commerce sites are well known for using this technique. Due to the instability, complexity and time required for all the jobs related to a simple purchase (for example, fraud detection, payment, inventory, shipping, tax and invoice), it’s impractical to do all of this in a single API request. To resolve this, e-commerce sites generally run each process asynchronously:
- [At the client] The payment intent was registered
- [Via email, after 30 seconds] The payment was confirmed
- [Via email, after 2 minutes] The product was reserved
- [Via email, after 3 minutes] The shipping was scheduled
- [Via email, after 5 minutes] The invoice was generated and is attached to the email.
That’s only one technique. Another way to notify the user about a background job is to persist an entity that records the attempt to perform an action. That entity can be updated as progress is made, and the front-end can list all the pending processes. Some examples:
- GitHub Actions: When you launch an action, a record of it is persisted, and you can see the logs, duration and status.
- General cookie scanner services: They show a progress bar while processing.
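A persisted job entity could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Job` fields, the three status values, and the `job_store` dictionary, which stands in for a database table.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Job:
    kind: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"  # pending -> running -> done
    progress: int = 0        # percentage, 0 to 100


# Hypothetical store; a real application would use a database table.
job_store = {}


def create_job(kind):
    """Called by the API when the user requests the action."""
    job = Job(kind=kind)
    job_store[job.id] = job
    return job


def update_progress(job_id, progress):
    """Called by the background worker as it advances."""
    job = job_store[job_id]
    job.progress = max(0, min(100, progress))
    job.status = "done" if job.progress == 100 else "running"


def list_pending():
    """What the front-end would show as unfinished processes."""
    return [j for j in job_store.values() if j.status in ("pending", "running")]


job = create_job("report")
update_progress(job.id, 40)
print(job.status, job.progress)
```

The front-end can poll `list_pending` (or a route that wraps it) to draw a progress bar or a status list.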
If you don’t want to persist an entity for that, the message queue service can be used to track the job’s progress.
Communication between the API and the service
How the API communicates with the job processor is a topic for another post. For now, I would just like to mention that DigitalOcean used a simple MySQL database as a message queue for years.
A good message queue service will provide you with:
- Monitoring
- Distribution
- Prioritization
- Retry and dead letter queue
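To make the last item concrete, here is a minimal sketch of retry with a dead letter queue, using plain Python structures instead of a real queue service. The retry limit of 3 and the names `process_all` and `flaky_handler` are assumptions for the example. Real services usually add delays between attempts, which are omitted here.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit


def process_all(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Run handler on each message; retry failures, then dead-letter them."""
    pending = deque((msg, 1) for msg in messages)
    dead_letters = []
    while pending:
        msg, attempt = pending.popleft()
        try:
            handler(msg)
        except Exception as exc:
            if attempt < max_attempts:
                pending.append((msg, attempt + 1))  # put it back for a later retry
            else:
                dead_letters.append((msg, str(exc)))  # give up, keep for inspection
    return dead_letters


calls = {}


def flaky_handler(msg):
    """Hypothetical handler: 'flaky' fails once, 'bad' always fails."""
    calls[msg] = calls.get(msg, 0) + 1
    if msg == "bad":
        raise ValueError("cannot process")
    if msg == "flaky" and calls[msg] < 2:
        raise RuntimeError("temporary failure")


dead = process_all(["ok", "flaky", "bad"], flaky_handler)
print(dead)
```

Messages that fail temporarily succeed on a later attempt, while messages that can never succeed end up in the dead letter list instead of blocking the queue.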
Background jobs for persisting data
One use case for background jobs that most people don’t immediately think of is persisting data. For example, suppose your application produces audit logs that must be persisted at the exact instant an event happens. It’s easy to think that we should just write each one to some database at the time it happens.
There’s a good analogy I once read regarding remote logging: when you intend to add remote logging to your application, imagine that afterwards there will be at least one extra network call for each request the application receives.
That’s true for audit logs too. Now, for each request that probably already interacts with the database, there will be another persistence operation that must be executed in sequence. If you scale this to thousands of requests per minute, you can imagine how the application’s performance would be affected.
Also, saving that kind of log is generally not a very fast operation. It can easily take 10-20ms due to some processing or indexing.
One solution is to hand the request to save that log to a background job. Just because it’s done in the background doesn’t mean it’s not persistent or not reliable. In fact, this is used for more than just logs: even updating a user’s e-mail can be done in the background.
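A rough sketch of the idea follows. The request handler records the timestamp at the moment the event happens, then hands the entry to a background writer. The names `handle_request`, `audit_queue` and `audit_table` are hypothetical. The in-memory queue is only a stand-in: to get the reliability mentioned above, the entries would have to go through a durable queue.

```python
import queue
import threading
from datetime import datetime, timezone

audit_queue = queue.Queue()
audit_table = []  # hypothetical stand-in for the audit log table


def audit_writer():
    """Background job: performs the slow insert outside the request path."""
    while True:
        entry = audit_queue.get()
        audit_table.append(entry)  # the insert and indexing cost is paid here
        audit_queue.task_done()


def handle_request(user_id, action):
    """API handler: does its own work, then only enqueues the audit entry."""
    # ... main business logic and its own database work would go here ...
    audit_queue.put({
        "user_id": user_id,
        "action": action,
        # Captured now, so the log keeps the instant of the event,
        # not the instant it was written.
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return {"ok": True}


threading.Thread(target=audit_writer, daemon=True).start()

result = handle_request(42, "update_email")
audit_queue.join()  # demo only: wait for the writer before inspecting the table
print(audit_table)
```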
Method 2: “Microservices” and synchronous jobs
Sometimes there is a real need to wait for the job to complete before finishing the API request. In that case, there are still benefits in separating that service from the API, as the API’s CPU, RAM and disk resources would not be significantly affected.
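As an illustration, the sketch below has an API route delegate a computation to a separate service over HTTP and wait for the answer with a timeout. The `PricingService` class, the payload shape and the 5 second timeout are all assumptions. For the demo the service runs in a thread on a local port, whereas in practice it would run on its own machine or container.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PricingService(BaseHTTPRequestHandler):
    """Hypothetical separate service that does the heavy computation."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = {"total": sum(payload["items"])}  # placeholder for heavy work
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), PricingService)
threading.Thread(target=server.serve_forever, daemon=True).start()
SERVICE_URL = f"http://127.0.0.1:{server.server_address[1]}/"

# Bypass any system proxy settings for this local demo.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def api_route(items, timeout=5.0):
    """API handler: delegates the computation and waits for the answer."""
    request = urllib.request.Request(
        SERVICE_URL,
        data=json.dumps({"items": items}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read())


answer = api_route([10, 20, 12])
print(answer)
server.shutdown()
```

The API still blocks until the answer arrives, but the CPU and memory cost of the computation is paid by the other service, and the timeout keeps a slow job from holding the request open indefinitely.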