"All programming is an exercise in caching." - Terje Mathisen
This document covers the details of web-application-level caching, that is, caching data within a mod_perl process or between mod_perl processes. It does not cover proxy/caching servers, browser caching, etc.
Just about everything to do with getting performance out of computers involves caching. Even below your application layer, we have the basic hardware layers of CPU registers, L1/L2 caches, main memory, and disk, each one larger but slower than the last.
When we consider the web, there are many, many more layers: browser caches, network proxy caches, and server-side caches of various kinds.
What we're dealing with here is data caching within a mod_perl process or between several mod_perl processes. Even then, we can consider several types of caching, which we'll work through below.
In each case, the general aim is to reduce the time required to perform an operation by using more memory or disk space to store a previously calculated result.
There are two main ways to develop a web server (though hybrids are possible): multi-process and multi-threaded.
In a multi-process system, a number of individual processes are forked, any one of which may handle a new incoming request. In a multi-threaded system, there is one process, but it contains a set of threads, each of which can handle a new incoming request. Basically, you can summarise the differences as follows:
Since each process is completely isolated, if any process crashes, no other processes are affected, and a new process can be spawned to take the place of the crashed one. Some people might regard this 'robustness' as illusory, because you're basically hiding programming errors and problems that should really be fixed. On the other hand, others might invoke the 90/10 rule: if a process crashes once a day, it may not be worth the possible thousands of hours of effort required to track down the one bug. Of course, all this also depends on the stability of the OS you use, but most server OSs are extremely stable these days.
One other advantage of multi-process is that each process can be limited. You can control system resources on a process-by-process basis: if one process ends up using too much memory, too much CPU, etc., it can be killed automatically, and again, no other processes are affected.
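In the mod_perl world, one example of this is the Apache::SizeLimit module, which kills off child processes that grow too large. A minimal sketch (the limit value here is illustrative only):

    # In startup.pl, or any module loaded at server startup
    use Apache::SizeLimit;

    # Kill a child off after its current request if its total
    # size exceeds ~12MB (the value is in KB)
    $Apache::SizeLimit::MAX_PROCESS_SIZE = 12000;

    # and in httpd.conf:
    #   PerlFixupHandler Apache::SizeLimit

Because only the offending child dies, and only between requests, the rest of the server carries on unaffected.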
Since multiple threads run within one process space, if one thread crashes or overwrites memory, the entire process and web server can crash. You also have to be careful about race conditions where threads write data over each other. Some of these issues are particularly hard in C/C++, but much less so in a language like Java, which has much better support for threads and synchronisation, and also prevents threads from 'crashing' outright (it throws catchable exceptions instead). Of course, this depends on the stability of your Java environment, something some people might question.
Generally it's harder, or impossible, to impose per-thread limits similar to those described above for processes.
Since each process is isolated, no data, open files, sockets, etc. are shared implicitly (except read-only pre-forked data; see the copy-on-write discussion below). To share any data, you have to explicitly code the sharing in some way. Possible solutions include: sockets, streams, files, shared memory, and memory-mapped files. In each case, you also have to provide some locking mechanism to ensure that two processes don't try to write to the same shared area at once, as sketched below.
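For example, a common inter-process locking mechanism on Unix is flock on an agreed-upon lock file (the file name here is illustrative):

    use Fcntl qw(:flock);

    # Serialise access to some shared resource across processes
    open(LOCK, '>', '/tmp/myapp.lock') or die "Can't open lock file: $!";
    flock(LOCK, LOCK_EX);    # blocks until no other process holds the lock

    # ... read or update the shared data here ...

    flock(LOCK, LOCK_UN);    # release (closing the handle also releases it)
    close(LOCK);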
Since each thread runs within the same process, all data, open files, sockets, etc. are shared implicitly. You do still have to provide some locking mechanism to ensure that two threads don't try to write to the same shared area/item at once. Generally, intra-process locking methods are faster than inter-process ones.
Depending on your OS, the difference between swapping process contexts and swapping thread contexts might be small, or it might be large. In general (though I don't have hard evidence for this), Windows is much quicker at thread swaps than process swaps, while on Linux threads and processes are basically the same thing.
However, even below this there is another issue. Basically, all processors have a TLB (translation lookaside buffer) which maps virtual process addresses to physical addresses. The virtual-to-physical mapping is different for each process, so when there is a process switch, the TLB must be flushed. For a multi-threaded program this is not required on a thread swap, because all threads share the same virtual-to-physical mapping. This can be an issue on a heavily loaded server which has to swap processes/threads a lot.
As indicated above, because of the difference in the amount of data shared, this can have a big performance implication for how your application is coded. Generally, if lots of sharing is required, multi-threading will be a considerable win over multi-process.
Given the above discussion, Apache 1.x uses a traditional multi-process model to handle web requests (except on Windows). Apache 2.0 will have a configurable hybrid model that allows multi-thread, multi-process, or some in-between combination.
For the moment, this discussion concerns multi-process caching.
As discussed above, caching involves saving the result of a complex calculation/slow query/etc based on the assumption that another web request will soon require the results of that query. Here's a really basic example of what we might do:
    use vars qw(%Cache);

    sub GetSlowQuery {
        return $Cache{SlowQuery} ||= $dbh->selectall_arrayref('select slow_query_view');
    }
This basically runs the query once and saves the result in the %Cache hash. If we call GetSlowQuery again, it will retrieve the result from the hash rather than running the query again. (Note that ||= only works as a cache here because selectall_arrayref returns a reference, which is always true; a cached value that could legitimately be false or undef would cause the query to be re-run every time.)
A couple of important points to note:

- The cache is private to each mod_perl process; every forked child ends up running the query once and keeping its own copy of the result.
- The cached value is never invalidated, so if the underlying data changes, the process keeps serving the stale result for its lifetime.
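If staleness matters, one common hedge is to store a timestamp alongside the value and re-run the query after some interval. A small sketch of that idea (the 60-second lifetime is arbitrary):

    use vars qw(%Cache);

    sub GetSlowQuery {
        my $entry = $Cache{SlowQuery};
        # Re-run the query if we have no cached entry, or it's too old
        if (!$entry || time() - $entry->{time} > 60) {
            $entry = $Cache{SlowQuery} = {
                time => time(),
                data => $dbh->selectall_arrayref('select slow_query_view'),
            };
        }
        return $entry->{data};
    }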
Let's say now that we have some module that allows you to store data in shared memory. We would rewrite the above code as:
    use Cache::SharedMemory;
    use vars qw(%Cache);

    tie %Cache, 'Cache::SharedMemory';

    sub GetSlowQuery {
        return $Cache{SlowQuery} ||= $dbh->selectall_arrayref('select slow_query_view');
    }
In this case, Cache::SharedMemory uses Perl's tie mechanism to make %Cache look like an ordinary hash, while internally it stores the values in shared memory (locked as appropriate).
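To make the magic less magical, here is a minimal sketch of what such a tied class might look like. Cache::SharedMemory above is hypothetical, and so is this Cache::FileShared class; rather than real shared memory it shares data through a Storable file with advisory locking, but the principle of intercepting hash accesses via tie is the same:

    package Cache::FileShared;

    use strict;
    use Storable qw(lock_store lock_retrieve);

    sub TIEHASH {
        my ($class, $file) = @_;
        lock_store({}, $file) unless -e $file;    # start with an empty cache
        return bless { file => $file }, $class;
    }

    sub FETCH {
        my ($self, $key) = @_;
        return lock_retrieve($self->{file})->{$key};
    }

    sub STORE {
        my ($self, $key, $value) = @_;
        # Note: a production version would hold one lock across this
        # read-modify-write; lock_retrieve/lock_store each lock separately
        my $data = lock_retrieve($self->{file});
        $data->{$key} = $value;
        lock_store($data, $self->{file});
        return $value;
    }

    sub EXISTS { exists lock_retrieve($_[0]->{file})->{$_[1]} }

    sub DELETE {
        my ($self, $key) = @_;
        my $data = lock_retrieve($self->{file});
        my $value = delete $data->{$key};
        lock_store($data, $self->{file});
        return $value;
    }

    1;

Every process that ties %Cache to the same file then sees the same data:

    tie %Cache, 'Cache::FileShared', '/tmp/query-cache.db';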
A couple of important points to note about this:

- The data is now stored once and shared by all Apache processes, so the query only needs to be run once for the whole server, and there's only one copy of the result in memory.
- Every access to %Cache now goes through the tie layer, so each read and write pays a serialisation and locking cost; for small, frequently read items this overhead can outweigh the saving.
- Only data that can be serialised can be stored this way; things like open sockets or database handles can't be shared.
Thus these are the two main ways you can cache data: locally within each process, or in a shared store visible to all processes.
Given that the multi-process model doesn't allow any implicit data sharing, why share any data at all? This is the equivalent of keeping some sort of global variable, where each process keeps its own cached copy of the data. It works reasonably well when the data is basically static, or not modified by other Apache processes. Note, though, that this means you end up with a copy of the same data within each process: if you have 50 forked Apache servers, and each cache ends up with 1M of data, you'll really be using 50M of memory.
There is a way around this problem, which exploits the fact that most modern operating systems use a 'copy-on-write' technique to fork processes. This means that if you load data into the Apache process at startup, before it starts forking children, and that data is only ever read and never written to, then the data will automatically be shared between processes. Remember, though, that for this to work the load has to happen before the child processes are forked, so you need to know in advance the common read-only data that you want to cache.
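In mod_perl terms, this means doing the loading in a startup script pulled in with PerlRequire (or in a module loaded from httpd.conf). A minimal sketch, where the file name, package name, query, and connection details are all illustrative:

    # startup.pl -- pulled in with "PerlRequire /path/to/startup.pl"
    # in httpd.conf. This code runs in the parent Apache process, so
    # anything loaded here is shared copy-on-write by all forked
    # children, as long as they only ever read it.
    package My::StaticData;

    use strict;
    use DBI;
    use vars qw($Countries);

    my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'password');
    $Countries = $dbh->selectall_arrayref('select code, name from country');
    $dbh->disconnect;    # don't carry an open handle across the fork

    1;

Each child can then read $My::StaticData::Countries without ever running the query or duplicating the data, since the pages holding it are never written to after the fork.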