Let’s say you wanted to run two completely separate instances of Apache’s web server. (I.e. not just multiple virtual hosts listening on different ports, but actual separate httpd processes.) All you’d have to do is create a new httpd.conf (in a separate directory, or with a different name in the same directory as the current one). Then just invoke httpd with the -f option to point at this new config file. The new config file can specify the locations of all the other files (server root, logs, pid files, etc.). Couldn’t be easier.
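Here is a rough sketch of that recipe in Python. It assumes a hypothetical instance directory (/srv/httpd-b) and port, and it writes only the handful of directives that pin down per-instance locations. A real httpd.conf would need more than this, such as module loading and a document root, so treat it as an illustration of the idea rather than a working config.

```python
from pathlib import PurePosixPath
from typing import List


def render_httpd_conf(instance_root: PurePosixPath, port: int) -> str:
    """Return a minimal httpd.conf fragment that keeps per-instance files under instance_root."""
    logs = instance_root / "logs"
    pid_file = logs / "httpd.pid"
    error_log = logs / "error_log"
    lines = [
        f'ServerRoot "{instance_root}"',  # hypothetical per-instance root
        f"Listen {port}",                 # each instance needs its own port
        f'PidFile "{pid_file}"',
        f'ErrorLog "{error_log}"',
    ]
    return "\n".join(lines) + "\n"


def httpd_command(conf_file: PurePosixPath) -> List[str]:
    """Build the argv that starts httpd against a specific config file via -f."""
    return ["httpd", "-f", str(conf_file)]
```

The same httpd binary serves every instance; only the file passed to -f changes.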
On the other end of the spectrum is WebLogic 8. I just tried to do the same thing with it (made a copy of the config directories, tweaked them, fired it up), and it failed miserably. Parts of it were still pointing into the old config tree, so things like DB pool deployments were failing.
So, what’s the problem here, and what’s the solution?
In my opinion, there are three types of data associated with an application:
Static: The executable code, and anything else that gets installed off the media (and/or built at install time), and never changes thereafter.
Configuration: Any file that affects how you want the application to run.
Dynamic: Anything that changes during the running of the application, but isn’t part of your own data (e.g. not your relational database). This includes stuff like log files, pid files, lock files, and work directories. Usually you can delete this stuff when the application isn’t running, and it will happily recreate it when it restarts.
In my opinion, you should be able to create a new config file(s), execute the same static code, point it at the new config file(s), and have the newly configured instance create its own set of dynamic files.
This is what I did with apache in my example above. For apache, my classification scheme works out as follows:
Static: the bin and modules/libexec directories
Configuration: the conf directory
Dynamic: the logs directory
Why would I want to do this? Because I want to check the config files into CVS, so I can carefully manage changes to them. I don’t need to check a copy of apache itself into my CVS tree, since I know what version I downloaded and installed. If/when I upgrade to a new version of apache, I can make any required changes to the config files to support and configure the new version, and check those changes into CVS. The Dynamic files I definitely don’t want in CVS, but I might want to archive them (e.g. the logs) for future reference. A clean split between the 3 file types makes this process very simple. I just have to put a copy of the apache conf directory into CVS, and I’m done.
Now let’s look at what happened with Weblogic. I’m still doing the analysis, but here are the preliminary results:
Weblogic wants 3 directories:
BEA_HOME: stuff common to all BEA products
WLS_HOME: where the weblogic software lives
Domain directory: where the files for a “domain” go
In order to run a single weblogic server instance listening on a single port, you actually have to execute 3 processes: an Admin Server, a Node Manager, and a Managed Instance. You have to allocate each of these servers at least one port number to use to listen/talk to the others.
The Admin Server and the Managed Instance share a single config file: config.xml in the Domain directory. The Node Manager has its own config file: nodemanager.properties, which lives in the WLS_HOME directory.
By default log files get created in the Domain directory, and there’s an LDAP database that gets stored there too.
There are quite a few files that need to be changed in order to change the directory locations and port numbers assigned to the 3 processes that make up a single “instance” of Weblogic. Somehow, in my effort to simply copy the directories and create a new configuration, I missed one spot (grepping can’t find it!). Our problem seems to be with the LDAP directory and the license file location.
With Weblogic, it seems impossible to separate out the Configuration and Dynamic portions, and hard to separate the Static from the Configuration.
It looks like we’re going to have to re-install from media, and create entirely new BEA_HOME, WLS_HOME and DOMAIN directories. Then I’ll have to run and configure the software, to get it to the state I want. Then I can shut it down and do a “diff” between the various directory trees to see what actually changed in order to change the directories and ports. Then I’ll have to figure out which files are actually Configuration files and which are just Dynamic, and how to check in only the Configuration files so we can track config changes in CVS. In the worst case, some of these files (e.g. the LDAP database) will be binary.
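The “diff the directory trees” step can be roughed out in a few lines of Python. This is only a sketch: the function names are mine, and it compares content hashes rather than doing a textual diff, so it copes with binary files too but only tells you which files changed, not how.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """Report which files were added, removed, or modified between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }
```

Take one snapshot right after the fresh install and another after configuring and running the server. Files that show up as added purely from running are likely Dynamic; files modified by the configuration step are candidates for CVS.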
So, the guidelines:
Make a clear separation between the three types of files
It should be easy for me to tell which files are executable stuff and which are configuration related. If all the configuration files are stored in a single directory, then it makes it easy to put the configuration under CVS control.
I should be able to lock down the Static and Configuration directories so that they are read-only and the software will run happily (i.e. it shouldn’t be trying to write stuff into the configuration directory… unless I ask it to through some configuration utility or GUI).
The location of all Dynamic files should be configurable
I need to be able to control the location of these files (e.g. I might want my log files on a different file system). I need to be able to build a config file that will ensure that none of these Dynamic files will overlap with those from another instance running on the same CPU.
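A quick way to check the “no overlap” requirement is to list the Dynamic locations of each instance and compare them. The sketch below assumes each instance is described by a plain dict of label to path (the labels and paths are made up), and it is deliberately conservative: a path nested inside another instance’s directory counts as a clash.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def overlapping_paths(a: Dict[str, str], b: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (path from a, path from b) pairs where one path equals or contains the other."""
    clashes = []
    for path_a in a.values():
        pa = PurePosixPath(path_a)
        for path_b in b.values():
            pb = PurePosixPath(path_b)
            if pa == pb or pa in pb.parents or pb in pa.parents:
                clashes.append((path_a, path_b))
    return clashes


# Hypothetical instances: the second one still points at the first one's pid file.
main = {"logs": "/var/www-a/logs", "pid": "/var/www-a/run/httpd.pid"}
test = {"logs": "/var/www-b/logs", "pid": "/var/www-a/run/httpd.pid"}
print(overlapping_paths(main, test))
```

Port numbers could be checked the same way, but this sketch only looks at file paths.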
Configuration files should be text files
This way I can use CVS to control them and actually see via cvs diff what the changes were.
Reasons you might want to run multiple separate instances
- You want to try out some different configuration options without affecting the main server process.
- You want each developer to have their own instance that they can bounce at will.
- You want a production and a staging instance on the same box.
Other classes of data
Is there another class of data beyond the 3 I’ve identified above?
Maybe Persistent for things like user and content databases that can change at runtime.
Maybe Custom for things like HTML, JSP, .java, .class, .jar/.war/.ear files and all the other stuff that makes your site different from every other site, but doesn’t change during the execution of a server instance.
These are similar to Dynamic in that you need to be able to control the location (e.g. filesystem path, jdbc URL) that the data will be stored, so it doesn’t (or does!) overlap with another instance, and you can treat it differently than other Configuration and Dynamic data (e.g. you need to back this stuff up, but you don’t usually want to back up the Dynamic data, and the Configuration data is in CVS, so it doesn’t need backing up.)
Some programs come with a nice utility (command-line, web-based or GUI) for setting up the configuration. In many cases these utilities just write out text files that you could edit directly. This can be beneficial in several ways.
- Sometimes it’s easier to understand what the various options are and what the interactions are by running the utility and having it explain them to you, or by having a single option change in the utility change several options “in sync”. You can then do a “diff” of the before and after versions of the text files, and get a better understanding of how the text file works.
- You can check changes to the config file into CVS, even though you used a GUI to make them.
More examples, both good and bad
Tomcat: Tomcat has a directory structure that looks promising, but it’s not immediately clear to me how I would change the invocation script (catalina.sh) to point to a different configuration directory. In the worst case I could just copy the whole CATALINA_HOME tree, and that would solve the problem, but then I have multiple copies of the Static stuff as well, which seems like a waste.
MySQL: It seems that you just have to set some options on the mysqld command line, and it will happily use completely different config files and database file locations.
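For the MySQL case, a per-instance invocation might be assembled like this. It is a sketch under assumptions: the paths and port are hypothetical, and I’m relying on the mysqld options --defaults-file, --datadir, --port and --socket. Note that --defaults-file has to be the first option on the command line.

```python
from typing import List


def mysqld_command(defaults_file: str, datadir: str, port: int, socket_path: str) -> List[str]:
    """Build an argv for a separate mysqld instance with its own config and data locations."""
    return [
        "mysqld",
        f"--defaults-file={defaults_file}",  # must be the first option
        f"--datadir={datadir}",              # per-instance database files
        f"--port={port}",                    # per-instance TCP port
        f"--socket={socket_path}",           # per-instance Unix socket
    ]


# Hypothetical second instance living under /srv/mysql-b.
print(mysqld_command("/srv/mysql-b/my.cnf", "/srv/mysql-b/data", 3307, "/srv/mysql-b/mysql.sock"))
```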
Feel free to let me know about software that you’ve run across that implements a good system for separating the various file types. And let me know if there are flaws in my guidelines.
Originally Posted February 2, 2004 09:37 AM
Comment by John Barnette, February 2, 2004 05:35 PM
Here’s a bit more of a breakdown. It’s flawed; please refine.
+ Configuration
Duh. Can vary by instance.
+ Infrastructure (maps to Static)
The executables and libraries that make up the product. Can support any number of instances.
+ Persistent
An information repository that persists over multiple application runs, but can’t be rebuilt at runtime. Databases, flat files, et cetera. Can vary by instance.
+ Dynamic
Logs, temporary files, lock/pid files, and other stuff that can easily be recreated at runtime. Locations of each can vary by instance.
+ Custom
All the other stuff that you’ve created that doesn’t have the potential to be changed at runtime. Can vary by instance.