
# Lecture 20 - Scaling with Services

## Announcements

Come hear about the technical challenges of building APIs at scale. Lucid Software launched their Developer Platform last fall, allowing developers to build custom functionality on Lucid’s visual collaboration suite. We now need to consider how to write code that is easy to use internally and great for external developers to build on top of. Whether it’s pulling data from an external system to show in Lucidchart, building out automation in Lucidspark, or many other use cases, we are actively improving what we can support with our APIs. After diving into some of the deeper technical challenges of building out APIs, we’ll end with perspectives from an engineer who transitioned into product management and helped lead the success of our API launch.

## Scaling and Performance in General

Before we jump into anything else, I want to be clear: these are all scaling techniques, but you may not need any of them. These are tools that can be useful in your toolbox if you do run into scaling problems, and this is a rough overview of them at that.

Scaling is the idea that as more users use your site, you will need more resources to handle those users in a performant way. I’m going to go over two approaches that we’ve already touched on loosely: today we’ll talk about Services, and next time we’ll talk about Sharding.

## Scaling with Services

Microservices have somewhat taken over the idea of services, but I want to focus on a simpler idea here: if you are struggling to deal with load, and there is a portion of your app that can work independently, make that portion of the app work independently.

Note: Diagrams can be viewed independently here.

Problem: I have too many users, and my site can’t keep up anymore!

Here’s an example of the app that we’re working with:

Original State

### Vertical Scaling

Solution 1: Give my site more resources.

![Designing with Services - Vertical Scaling](</img/lecture-sb03/Designing with Services - Vertical Scaling.png>)

Benefits: Simple to implement; no changes to the application’s architecture are required.

Drawbacks: Hardware has an upper limit, cost grows quickly at the high end, and the site is still a single point of failure.

### Horizontal Scaling

Solution 2: Create multiple servers, and balance requests between them.

Benefits: Capacity grows by adding more servers, and those servers can be placed closer to users.

Drawbacks: Requests have to be distributed by a load balancer, and any shared state (sessions, caches) has to live somewhere every server can reach.

Revisiting that second benefit: if I have servers closer to a user, I can pick those by using smarter routing.

### Horizontal Scaling plus Performance Services

Solution 3: Solution 2, but also isolate performance-critical, frequently-used, or resource-intensive areas of code to their own services and servers.

Benefit: Resources are used more effectively and parts of the system can fail independently while still allowing some functionality.

Drawback: Additional complexity and cognitive load in implementation and code sharing; you need service-to-service communication and authentication.
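
That service-to-service authentication can be as simple as a shared secret that only internal services know. Here’s a minimal sketch using Express in TypeScript; the header name, environment variable, and route are assumptions for illustration, and production systems often use mutual TLS or signed tokens instead.

```typescript
import express from "express";

const app = express();

// Reject any request that doesn't carry the shared internal secret.
// (Hypothetical header and env variable, for illustration only.)
app.use((req, res, next) => {
  if (req.header("x-internal-token") !== process.env.INTERNAL_TOKEN) {
    return res.status(401).send("unauthorized");
  }
  next();
});

// A resource-intensive endpoint that only other services may call.
app.get("/render", (_req, res) => {
  res.send("expensive rendering result");
});

app.listen(4000);
```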

### Services without Load Balancing

Solution 4: Isolate performance-critical, frequently-used, or resource-intensive areas of code to their own services without load balancing.

Benefit: Critical tasks are isolated and can be executed independently of main tasks.

Drawback: Additional complexity in implementation; need to implement service communication and configure services to talk to each other.

## Communicating Between Services

Services are able to talk to each other as long as they understand how to handle each other’s requests. There are a couple of ways to do this.

### REST

You can send REST requests from one service to another, just like a client can send a REST request. This type of communication tends to follow naturally from designing RESTful APIs, especially if they don’t drift too much.
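
As a quick sketch, here’s one service calling another service’s REST endpoint, assuming Node 18+ so the global fetch is available; the internal hostname and response shape are made up for illustration:

```typescript
// Shape we expect back from the (hypothetical) users service.
interface User {
  id: string;
  email: string;
}

async function fetchUser(userId: string): Promise<User> {
  // This is the same kind of request a browser would make; a service
  // is just another REST client.
  const res = await fetch(`http://users-service.internal/users/${userId}`);
  if (!res.ok) {
    throw new Error(`users service responded with ${res.status}`);
  }
  return (await res.json()) as User;
}
```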

### Message Queue

Another popular way to balance work and exchange data between services is a message queue. Services push info to the queue and other services can consume the messages to do work. Queues can be used for round-robining tasks, for pub/sub, or for normal queue usage.
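
Here’s a minimal sketch of that pattern against RabbitMQ, using the amqplib package; the queue name and message shape are assumptions for illustration:

```typescript
import amqp from "amqplib";

// Producer: push a task onto the queue and move on.
async function produce() {
  const conn = await amqp.connect("amqp://localhost");
  const ch = await conn.createChannel();
  await ch.assertQueue("email-tasks", { durable: true });
  ch.sendToQueue(
    "email-tasks",
    Buffer.from(JSON.stringify({ to: "user@example.com" })),
    { persistent: true }
  );
  await ch.close();
  await conn.close();
}

// Consumer: pull tasks off the queue and do the work. Running several
// copies of this worker round-robins the tasks between them.
async function consume() {
  const conn = await amqp.connect("amqp://localhost");
  const ch = await conn.createChannel();
  await ch.assertQueue("email-tasks", { durable: true });
  await ch.consume("email-tasks", (msg) => {
    if (msg === null) return;
    const task = JSON.parse(msg.content.toString());
    console.log("sending email to", task.to);
    ch.ack(msg); // ack after the work so the broker can redeliver on failure
  });
}
```

For pub/sub, you’d publish to an exchange with one queue per subscriber instead of a single shared queue, so every subscriber sees every message.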

### RPC

RPC, or Remote Procedure Call, is another way to talk between services. A popular implementation is Google’s gRPC. RPC frameworks use a shared schema listing procedures and services, so that servers and clients can communicate in a standard way.
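
As a sketch of the client side in Node, using the @grpc/grpc-js and @grpc/proto-loader packages: this assumes a shared users.proto defining a users package with a UserService that has a GetUser procedure, all of which are made-up names for illustration.

```typescript
import * as grpc from "@grpc/grpc-js";
import * as protoLoader from "@grpc/proto-loader";

// Load the shared schema that both client and server agree on.
const definition = protoLoader.loadSync("users.proto");
const proto = grpc.loadPackageDefinition(definition) as any;

// Connect to the (hypothetical) user service.
const client = new proto.users.UserService(
  "localhost:50051",
  grpc.credentials.createInsecure()
);

// Call a remote procedure as if it were a local function.
client.GetUser({ userId: "42" }, (err: Error | null, reply: any) => {
  if (err) throw err;
  console.log("got user", reply);
});
```

The generated stub makes the remote call look like a local function call, which is the main appeal of RPC.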

## On Load Balancers

Load balancers are frequently used in scaling operations. They’re useful for distributing work, whether that work is operations on a database, SMS messages, emails, HTTP requests, or anything else.

Most technologies that assist with scaling (PaaS, Lambda, Kubernetes, etc.) require some sort of ingress configuration: Lambdas will give you a URL to use, Kubernetes requires a specific configuration, and PaaS offerings vary. Two of the most frequently used balancers are Nginx and HAProxy. Their configuration normally looks like a list of servers, plus optional rules like cross-region routing.
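
As an example, a minimal Nginx config that balances across three application servers could look like the following; the hostnames and ports are assumptions for illustration:

```nginx
# List of backend servers to balance between. Nginx round-robins
# across these by default; weights or least_conn can change that.
upstream app_servers {
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
}

server {
    listen 80;

    location / {
        # Forward every incoming request to one of the upstream servers.
        proxy_pass http://app_servers;
    }
}
```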

## Demo: Using a Message Queue to Do Work

We’ll be looking at this repo.

## Reading