Thoughts about edge computation

April 17th, 2025

I will start this blog post by defining what "computing on the edge" refers to. Wikipedia states that

[...] [edge computing] refers to any design that pushes computation physically closer to a user, so as to reduce latency [...] (ref)

However, as I'd like to focus on web apps, I will extend the definition by the understanding of major cloud providers (CP):

Edge computing is typically accomplished by serving requests via scalable, containerized, load-balanced systems.

Running your app on the edge typically requires some sort of containerization. When a user sends a request to your service, major cloud providers spin up the container, wait for the HTTP service (inside the container) to be available and then forward the request to the (now running) container. Here's a (simplified) graphic: (not included in transcript)

The startup time for more complex HTTP services typically ranges from a few hundred milliseconds to a few hundred seconds, depending on the tech-stack used to build the service. Most (all) cloud providers know about this and thus mark the container with a TTL that is renewed on every subsequent request. Because of this, the startup latency is significantly worse for apps with few MAUs and for users that reside in regions with few MAUs, as the TTL will expire more often. As a customer of any CP, you can typically request to always have at least one running container per region. However, this will radically increase your hosting expenses, especially for apps with few MAUs, as CPs usually bill their customers per computing time.

Because the cloud is overpriced, devs should always make sure to confirm that anything that runs in the cloud really needs to stay in the cloud. To augment the cost efficiency of the cloud, CPs usually offer their customers two different billing models:

Pay-per-Request/Request-based-billing (PpR)
Customer pays a fixed amount per request plus a fixed amount per vCPU-second plus a fixed amount per GiB-second.
Pay-per-Instance/Instance-based-billing (PpI)
Customer pays a fixed amount per vCPU-second plus a fixed amount per GiB-second. The fixed amounts are usually lower compared to PpR.

You've probably already looked at Example I. This will be our scenario for showcasing the cost of the cloud. I am using the GCP Expense Calculator for the calculation of costs.

Assumptions:

2k requests per user per month
1 vCPU
300ms computing time per request
1.5 GiB memory required
1 container can handle up to 100 concurrent requests
us-central1 (Iowa, Tier 1)
southamerica-east1 (São Paulo, Tier 2)
PpI 1 instance per region

Setup	Cost/month
PpR no min-instances	$ 13.81
PpR 1 min-instance	$ 52.56
PpI 18h/day	$ 81.00
PpI 24h/day	$ 121.41

Putting the GCP and easy scalability aside, two VPSs on Hetzner Cloud with 2 vCPUs and 4 GiB memory each, running 24h/day will cost $ 9.18/month. That will provide you with >2x better specs compared to PpI 24h/day while being >13x cheaper. Ironically, avoiding CPs like the GCP instantly gets you a >10x cost efficiency improvement.

Larry Page, co-founder of Google:

It's often easier to make something 10 times better than it is to make it 10% better.

Thus, when running on the edge is a requirement, devs should strongly consider choosing VPS providers over major CPs, as the cloud is significantly overpriced and the initial manual effort can be easily paid off long term. Sadly, I feel like many devs have forgotten about what "running on the edge" really means. You absolutely do not have to pay Google, amazon or Microsoft ridiculous amounts of money. For most use cases, a few VPSs for every region will be enough.

Up until here, we have mainly focused on the costs and talked a bit about latency. But now, we will get to the real deal when it comes to edge computing: data(-bases)

To demonstrate why running on the edge is pointless when using a monolithic/single central database, let's take a look at Example II. Alice (A) resides in the US while Bob (B) resides in Russia. A talks to the NA Service (N-S) while B talks to the Asia Service (A-S). Both the A-S and the N-S need to access the central database that is being hosted in Germany. In order to talk to their services, A and B authenticate themselves using a JWT. Because the access shall be revokable at any time, N-S and A-S need to query the DB to validate the given credentials. Now assume that our service is a messaging service and A wants to send B a message. We will observe 2 actions:

A sending the message
(sequence diagram not included in transcript)
B pulling new messages
(sequence diagram not included in transcript)

Now let's make some assumptions:

database delay $d_{DB} = 100$ ms
service delay $d_s = 10$ ms
processing delay $d_p = 0$ ms
"processing is free"

With these assumptions made, it's obvious to see that RTT(1) = RTT(2), where

RTT(1) $= 2 d_s + 4 d_{DB} = 420$ ms

Thus, sending as well as pulling messages takes 420 ms in this scenario.

This example is a modified version of Example II: We still have A and B in their locations but only provide a single Main Service (M-S) this time. A still wants to send B a message and we still observe the actions 1 & 2 (with N-S and A-S replaced by M-S). However, since the scenery has changed, we need a new set of assumptions:

database delay $d_{DB} = 10$ ms
service delay $d_s = 100$ ms
processing delay $d_p = 10$ ms (unchanged)

Obviously, RTT(1) = RTT(2) still holds, but

RTT(1) $= 2 d_s + 4 d_{DB} = 240$ ms

this time, which is almost 2x faster compared to Example II. As one can clearly see, when using a single central database, the use of edge computing significantly increases the latency for end users. Thus, edge computing should not be employed in such scenarios.

Example IV

But what if we want to have both low-latency database access and low-latency services? Then we need to have the same database in different regions at the same time. The key concept we need to accomplish this, is: replication. We need to replicate our database.

Sadly, this easier said than done. Most data models are difficult to replicate. Take relational databases for example, you will face some real challenges:

ACID-Compliance
database migrations/schema changes
conflict resolution

You will have to give up some guarantees if you want to improve latency. E.g. if you wouldn't give up ACID-compliance, you would have to lock relations simultaneously for all regions, which then would worsen the latency far beyond Example II.

This is why most apps that (successfully) run on the edge employ NoSQL databases, as those are much easier to replicate. Graph databases on the other hand, despite belonging to the NoSQL family, are an exception and hard to replicate.

To round things up, I'd like to share some of the best use cases for the edge:

If you came this far, I hope you enjoyed the read :)