What is Varnish? Is it a Reverse Proxy Cache?
- March 22, 2019
Any website, whether a blog, an e-commerce store, or anything else, needs to deliver results fast. It also needs to scale: performance should not suffer whether the site has 10 visitors, 1,000, or a million.
A website should perform consistently from the day it launches for as long as it exists. If it does not, visitors are likely to go to a competitor, and the site loses a potential conversion.
Every web page should load quickly, and where online transactions or payments are involved, every fraction of a second matters. A site owner cannot afford delays here, because many customers may be paying at the same moment and the loss can be large. A reputation built over months or years can be damaged very quickly.
If a site is popular (and it should be), its visitor numbers can grow drastically, and a single server may not be able to handle that much traffic. This is because every request asks the server to build the original page from scratch, and under heavy load those pages become slow or fail to load.
Did you know that page load time DOES affect search engine ranking?
If your site does not load as fast as expected, Google, Yahoo, Bing, and other popular search engines are less likely to rank it highly. A site owner is therefore at a real disadvantage if their pages are slower than normal.
One has to find a solution that helps the server cope with the load. The question is how. Caching is a technique in which copies of web pages are stored so that they do not have to be generated again for every request. The user receives the cached copy instead of a freshly built page, without noticing any difference, and it loads quickly as well.
So, how do we do that?
Varnish is your answer. But before we jump into that, we should first learn what forward and reverse proxies are. Below is a brief explanation of each.
Forward Proxy
A forward proxy, also known as a web proxy or simply a proxy server, sits in front of a number of client machines. It acts as a middleman and communicates with web servers on behalf of those clients. Proxies are used for many reasons, some of which are as follows.
Advantages of Forward Proxy
1. To browse sites that are restricted in one’s own state or country. For instance, websites like Facebook and YouTube are banned in countries like China. One may use a proxy server to mask one’s location and reach those sites. This works because the original client is hidden: the web server sees the IP address of the proxy server, which is most likely located elsewhere and has no relation to the original client. It would therefore appear as if the user resides in another country, such as Germany or Canada. Proxy services usually offer a long list of countries to choose from.
2. To restrict access. An organization can use a proxy to control which sites its users are allowed to reach. Educational institutions commonly do this so that each person’s access rights depend on their role. Such proxies also tend to block requests to third-party websites such as Facebook.
3. To hide one’s real identity. A person may want to stay anonymous to protect themselves from serious consequences of blog posts, audio, video, or presentations that go against the political or governmental regulations of their country. This may include extremely sensitive statements or criticism of what goes on in political circles, for which the person could face a backlash from law enforcement or from the public.
Reverse Proxy
A reverse proxy sits in front of the web servers. It receives client requests first and then forwards them to those servers. Reverse proxies improve security, reliability, and performance. A reverse proxy ensures that NO client communicates directly with the server.
Advantages of Reverse Proxy
1. Load Balancing: Any popular website is likely to see frequent traffic spikes, so load balancing becomes necessary. The site is hosted on several servers that all handle requests for the same site. The reverse proxy provides the load balancing by distributing the traffic evenly among those servers. If one server fails, the other servers can take over its share without end users noticing.
2. Safeguarding: A reverse proxy keeps the IP address of the origin server hidden from clients. This is a plus for any website behind a proxy, since protection against attacks such as DDoS is essential. Attackers can only target the reverse proxy itself, which is typically hardened and better equipped to absorb an attack.
3. GSLB: In Global Server Load Balancing, a site is distributed among servers located around the globe. The reverse proxy sends each client’s request to the server that is geographically nearest. Because requests and responses travel a shorter distance, less time is spent in transit and pages load faster.
4. Caching: For swift results, a reverse proxy serves cached content. The cached content is an exact copy of what the original site would return, so the client gets the same result as if the request had been sent directly to the server. This saves a lot of time and improves overall website performance.
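The load-balancing behavior described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `RoundRobinBalancer`, its methods, and the backend names `web1` to `web3` are hypothetical, and a real reverse proxy would detect failed servers with health checks rather than manual calls.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in turn, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._ring = cycle(self.backends)  # endless iterator over the backends

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backend available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])  # hypothetical server names
first_round = [balancer.pick() for _ in range(3)]

balancer.mark_down("web2")  # simulate a failed server
after_failure = [balancer.pick() for _ in range(4)]
```

With all servers healthy, requests rotate through them in order. Once `web2` is marked as down, the remaining two servers share the traffic and the client never sees the failure.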
What is Varnish?
Varnish Cache is an open source project written in C. “Open source” means that the source code is freely available online. The project is maintained by an active community led by Poul-Henning Kamp, and it is financially backed by Varnish Software, which does most of its development.
Varnish Configuration Language (VCL) is a domain-specific language that Varnish uses to control the cache’s behavior.
Working
Varnish can be installed either on a separate machine or on the web server itself. To clients, it looks like the server it is protecting. It usually listens on TCP port 80, the standard port for HTTP, unless it sits behind another proxy. Varnish is also configured with one or more backends. When a request arrives, Varnish first checks whether it already holds a cached copy of the response (a cache hit). If it does not (a cache miss), it fetches the response from a backend, stores it, and delivers it to the client. Objects in the cache are identified by a hash that combines the hostname or IP address with the request’s URL.
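The hit-or-miss lookup can be illustrated with a small Python sketch. It is not how Varnish is implemented (Varnish is written in C); the names `cache_key`, `TinyCache`, and `fake_backend` are hypothetical, and the key simply hashes the URL and the host together, as the paragraph above describes.

```python
import hashlib


def cache_key(host, url):
    """Combine the request URL and host into a single hash used as the cache key."""
    return hashlib.sha256((url + "\n" + host).encode()).hexdigest()


class TinyCache:
    def __init__(self, fetch_from_backend):
        self.fetch_from_backend = fetch_from_backend
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, host, url):
        key = cache_key(host, url)
        if key in self.store:
            # Cache hit: answer without touching the backend.
            self.hits += 1
            return self.store[key]
        # Cache miss: ask the backend, then remember the answer.
        self.misses += 1
        body = self.fetch_from_backend(host, url)
        self.store[key] = body
        return body


backend_calls = []


def fake_backend(host, url):
    # Stand-in for a real web server; records every call it receives.
    backend_calls.append((host, url))
    return "<html>" + host + url + "</html>"


cache = TinyCache(fake_backend)
cache.get("example.com", "/")  # miss: fetched from the backend
cache.get("example.com", "/")  # hit: served from the cache
cache.get("example.org", "/")  # miss: same URL, but a different host
```

Because the host is part of the key, the same URL on two different sites is stored as two separate objects.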
Varnish uses pthreads (POSIX threads, an execution model that lets a program run multiple flows of work concurrently) to handle many requests at once, which improves a server’s performance substantially.
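As a rough analogy only, the idea of a pool of worker threads serving many requests at once can be sketched with Python’s standard `ThreadPoolExecutor`. Varnish itself does this in C with pthreads; the `handle_request` function, the pool size of four, and the URLs below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(url):
    # Stand-in for the work done per request (cache lookup, backend fetch, ...).
    return "200 OK " + url


urls = ["/page/" + str(n) for n in range(8)]

# Four worker threads share the eight requests between them.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handle_request, urls))
```

`pool.map` returns the results in the same order as the input, even though the requests are processed by different threads.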
It is easy to set up and can speed up a site dramatically, by as much as 1000 times in favorable cases, depending on the architecture. Many popular websites, such as Wikipedia, Stack Overflow, Reddit, and Udemy, use Varnish to handle their huge traffic.
As said earlier, VCL is the language that lets developers override and extend the behavior of the various states in the Varnish finite state machine. A VCL file consists of subroutines containing VCL code. When a VCL file is loaded, it is translated to C, compiled, and linked into the running Varnish process dynamically.
There may be times when VCL alone feels limiting. In that case, one can write custom Varnish modules in C. A module is a set of functions that implement the additional behavior, and these functions extend what can be expressed in VCL.
The cache has to be updated each time the original data is modified. Admins need to write code that runs right after a change is made and invalidates the affected cached copies.
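A minimal Python sketch of this invalidation step is shown below. The class `ExpiringCache` and the `/product/42` key are hypothetical, and the time-to-live (TTL) is an added assumption that acts as a safety net in case a purge is missed. In Varnish itself, invalidation is typically configured in VCL rather than written like this.

```python
import time


class ExpiringCache:
    """Cache whose entries expire after a TTL and can also be purged explicitly."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the example can control time
        self.store = {}     # key -> (expires_at, body)

    def put(self, key, body):
        self.store[key] = (self.clock() + self.ttl, body)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self.clock() >= expires_at:
            # Entry is too old: drop it so the caller refetches.
            del self.store[key]
            return None
        return body

    def purge(self, key):
        """Remove an entry right away; returns True if something was removed."""
        return self.store.pop(key, None) is not None


now = [0.0]  # fake clock, in seconds
cache = ExpiringCache(ttl_seconds=60, clock=lambda: now[0])

cache.put("/product/42", "price: 10")
before_change = cache.get("/product/42")

# The price changes in the database, so the stale copy is purged right away.
purged = cache.purge("/product/42")
after_purge = cache.get("/product/42")
```

After the purge, the next request for that page is a cache miss and fetches the fresh data from the backend.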
Caching doesn’t magically change the computing speed. Rather, it is an architectural decision that increases the efficiency of a system if the right steps are followed.
One may feel that their firm is small or medium-sized and therefore does not need Varnish. It is true that large companies are the ones most likely to use this technology, but it would take a lot of time and effort to rework the setup and introduce Varnish AFTER the site has gained popularity.
Using Varnish as a reverse proxy cache is worth a shot in any kind of organization, since the technology is well established and widely used by many popular firms.
If you think that switching to this technology would be a costly affair, think again. There are many plugins on the market that cover most of what is needed. One of them is Apachebooster, a cPanel plugin that includes Varnish. It also includes Nginx for better server performance.