1.1 Describe distributed applications related to the concepts of front-end, back-end, and load balancing
What are the front end and the back end
- The front end refers to what a user/client can see
- The back end refers to what runs on the company's servers, out of the user's direct view
The front end may refer to:
- A client's browser
- HTML/CSS/JS on a web page
- Images/Audio/Video a user sees
The back end may refer to:
- Load balancers, proxies, firewalls not directly accessible by a user
- Databases, caches, servers and object stores not directly accessible by a user
- Back end technologies and languages (server side languages)
What is load balancing
Without load balancing there is a direct connection between the client and a single server.
This limits performance and reliability because availability depends on that one server.
A load balancer is a middleman that appears to the client as the "server" and directs front-end traffic to the back-end servers.
Load balancing features
An example of a load balancer that I use is NGINX:
NGINX offers the following load balancing methods:
- Round Robin – Requests are distributed evenly across the servers, with server weights taken into consideration.
- Least Connections – A request is sent to the server with the least number of active connections, again with server
weights taken into consideration.
- IP Hash – The server to which a request is sent is determined from the client IP address. In this case, either the
first three octets of the IPv4 address or the whole IPv6 address are used to calculate the hash value. The method
guarantees that requests from the same address get to the same server unless it is not available.
- Generic Hash – The server to which a request is sent is determined from a user‑defined key which can be a text
string, variable, or a combination. For example, the key may be a paired source IP address and port, or a URI.
- Least Time (NGINX Plus only) – For each request, NGINX Plus selects the server with the lowest average latency and
the lowest number of active connections, where the average latency is measured according to the parameter given to the
least_time directive (for example, header or last_byte).
- Random – Each request is passed to a randomly selected server. If the "two" parameter is specified, NGINX first
randomly selects two servers, taking server weights into account, and then chooses one of these two servers.
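To make the first three methods more concrete, here is a rough Python sketch of how they could pick a server. It is not how NGINX implements them internally: the pool `SERVERS`, its weights and the function names are all made up for illustration, and the weighted round robin is the naive "repeat by weight" version.

```python
import hashlib
import ipaddress
import itertools

# Hypothetical back-end pool: server name -> weight.
SERVERS = {"app1": 2, "app2": 1}


def weighted_round_robin(servers):
    """Cycle through the servers, repeating each one according to its weight."""
    expanded = [name for name, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active, servers):
    """Pick the server with the fewest active connections per unit of weight."""
    return min(servers, key=lambda name: active.get(name, 0) / servers[name])


def ip_hash(client_ip, servers):
    """Hash the first three IPv4 octets (or the whole IPv6 address) to a server."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        key = ".".join(str(addr).split(".")[:3])
    else:
        key = addr.compressed
    digest = hashlib.sha256(key.encode()).digest()
    names = sorted(servers)
    return names[int.from_bytes(digest[:4], "big") % len(names)]
```

Because only the first three octets are hashed, every client in the same /24 network lands on the same server in this sketch, which is what keeps a session on one back end.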
There are other types of load balancing:
- Cookie marking – adds a field in the HTTP cookies which is used for balancing calculation
- Consistent IP hash – servers can be added or removed while disturbing as few users' sessions or cached entries as possible
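The idea behind consistent hashing can be sketched with a small hash ring. This is only an illustration under my own assumptions (the `HashRing` class, the `replicas` count and the server names are hypothetical), not the algorithm any particular load balancer ships.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a point on the ring (a large integer)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # virtual points per server, to even out the load
        self._points = []         # sorted ring positions
        self._owners = {}         # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owners[point] = server

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            self._points.remove(point)
            del self._owners[point]

    def lookup(self, key):
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]
```

When a server is added, the only keys that change owner are the ones that now belong to the new server; everything else stays where it was, so most sessions and caches survive the change.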
Reverse proxy features (extra credit)
I use NGINX as a reverse proxy only (not as a load balancer). It serves web apps running on localhost as well as
websites served from directories.
While it is possible to serve my website directly from the web app that runs it, that allows only a single web server
to listen on the public port at a time. NGINX adds server multiplexing, so several apps can share the same port.
HTTP proxies are commonly used with web applications for gzip encoding, static file serving, HTTP caching, SSL handling, load balancing and spoon feeding clients.
Scalability and Flexibility
As the diagram above shows, when the "Cheap VPC" gets a request for "wilyarti.com" it proxies the request to
the server or folder specified in the configuration.
This allows multiple sites to be run on a single IP address and the same server. I currently run 7 websites and 4
microservices on a single server/IP address.
The web app for "wilyarti.com" runs on localhost and is not accessible on the internet directly. On this server NGINX is
configured to proxy requests directed at "wilyarti.com" to 127.0.0.1:3000.
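The routing decision described above boils down to a lookup on the Host header. A minimal sketch, assuming a routing table of my own invention (only "wilyarti.com" and 127.0.0.1:3000 come from the setup above; the second entry and the `route` function are hypothetical):

```python
# Hypothetical routing table: Host header -> (action, target).
ROUTES = {
    "wilyarti.com": ("proxy", "http://127.0.0.1:3000"),   # web app on localhost
    "static.example.com": ("folder", "/var/www/static"),  # made-up static site
}


def route(host_header):
    """Decide where a request goes based on the Host header it arrived with."""
    host = host_header.split(":")[0].lower()  # drop any port, ignore case
    return ROUTES.get(host, ("reject", None))
```

For example, `route("wilyarti.com")` returns the localhost upstream, while an unknown host falls through to the reject entry. This is what lets many sites share one IP address.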
NGINX can also be used to redirect websites and ports in any number of ways. For example, all my websites force HTTPS by
redirecting all requests on port 80 to port 443.
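The forced-HTTPS rule is simple enough to sketch as a function. The name `https_redirect` and its return shape are my own assumptions for illustration; in practice this lives in the proxy configuration rather than in application code.

```python
def https_redirect(host, path, port):
    """Return (status, headers) for a plain-HTTP request, or None if no redirect is needed."""
    if port == 80:
        # A permanent redirect to the same host and path over HTTPS (port 443 by default).
        return 301, {"Location": f"https://{host}{path}"}
    return None
```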
NGINX manages all my SSL certificates using certbot. Reverse proxies can also serve other security functions:
- DDoS protection
- Packet sniffing (SSL is terminated at the proxy, so plaintext packets can be inspected)
I currently use two projects to monitor performance on NGINX:
Goaccess.io provides insight into many areas of server performance such as:
- Requested files
- Static requests
- Not found URLs
- Visitor hostnames/IPs
- Operating Systems
- Time Distribution
- Referrers URLs/Sites
- Google keywords
- HTTP status codes
This data is scraped from the NGINX log files.
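To give a feel for what scraping the log files involves, here is a small sketch that pulls status codes and not-found URLs out of lines in NGINX's default "combined" log format. The `summarise` function is hypothetical, and this is nothing like the full analysis GoAccess does.

```python
import re
from collections import Counter

# One line of the default "combined" access log format.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarise(lines):
    """Count HTTP status codes and the paths that returned 404."""
    statuses, not_found = Counter(), Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        statuses[match["status"]] += 1
        if match["status"] == "404":
            not_found[match["path"]] += 1
    return statuses, not_found
```

Feeding it the lines of an access log (for example `summarise(open(path))`, with `path` pointing at wherever your logs live) gives two counters that roughly correspond to the "HTTP status codes" and "Not found URLs" panels.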
Ntopng provides real-time data about traffic flows to a website. This was incredibly useful when I was trying to diagnose
spikes in internet traffic to my website.
Reverse proxies can also improve performance by handling SSL termination, caching web pages and implementing
gzip compression.
Caching pages reduces the need to generate dynamic content every time it is requested, which is resource
intensive.
Gzip compression reduces network bandwidth and speeds up the page load time.
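Both effects are easy to see in a few lines of Python. This is a toy sketch, assuming a made-up `render_page` function that stands in for an expensive dynamic page; a real proxy caches and compresses at the HTTP layer instead.

```python
import gzip
from functools import lru_cache


@lru_cache(maxsize=128)
def render_page(path):
    """Stand-in for an expensive dynamic render; the result is cached per path."""
    body = "<html><body>" + f"<p>content for {path}</p>" * 200 + "</body></html>"
    return body.encode()


def compressed_response(path):
    """Return the gzip-compressed page, rendering it only on a cache miss."""
    return gzip.compress(render_page(path))
```

Calling `compressed_response("/")` twice renders the page once (`render_page.cache_info()` shows one miss and one hit), and the compressed body is much smaller than the original because the HTML is so repetitive.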
Reverse proxies can act much like firewalls, with access control and filtering rules that prevent
misconfigured servers from leaking sensitive data.
Reverse proxies can be configured to perform authentication. This allows a developer to put static
websites or unprotected web apps behind password protection.
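As a sketch of what the proxy checks before letting a request through, here is HTTP Basic authentication in Python. The `USERS` table and `is_authorised` function are hypothetical, and the plaintext password is only for illustration; a real setup would store password hashes.

```python
import base64
import hmac

# Hypothetical credentials; real deployments keep hashed passwords, not plaintext.
USERS = {"admin": "s3cret"}


def is_authorised(authorization_header):
    """Check an 'Authorization: Basic ...' header value against USERS."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(authorization_header[6:], validate=True).decode()
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    expected = USERS.get(user)
    if not sep or expected is None:
        return False
    # Constant-time comparison to avoid leaking information through timing.
    return hmac.compare_digest(password.encode(), expected.encode())
```

Requests that fail the check would get a 401 response from the proxy and never reach the unprotected app behind it.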
What are distributed applications
Distributed applications are applications that split their functionality or resources across multiple servers.
For example, a server may store its images and videos in an AWS S3 bucket. This decouples the local file system from
the web server, allowing servers to be stood up and torn down without having to migrate data.
As another example a website might use separate services (microservices) for things such as authentication.
The website might use different servers for "shop.wilyarti.com" than it does for "wilyarti.com", but the authentication is
shared through a single authentication microservice.