Archive for the ‘Practice’ Category

The Checklist Manifesto

Saturday, September 11th, 2010

I’ve just finished listening to the audiobook version of The Checklist Manifesto by Atul Gawande. (Book web site at thechecklistmanifesto.com). Overall I think it’s worth reading, especially for those of us in highly technical roles.

Overview

The book outlines Dr. Gawande’s experiences with checklists, and what he’s learned about how they can make the performance of complex tasks more consistently successful. He presents examples from many realms of endeavour, including medicine (he is a medical doctor), architecture/construction, aviation, and finance. In each area he shows how the creation and use of checklists has allowed those who follow them to complete tasks more consistently and successfully, while (seemingly paradoxically) allowing for more flexible reactions to unforeseen circumstances. Given the apparently obvious benefits, checklist use is surprisingly rare, even where it has been proven beneficial to colleagues. He discusses what makes a good checklist and describes his experience developing one from scratch.

Comments

I sometimes found that Dr. Gawande’s anecdotes went on a bit long. OK, building a 50-story building is complicated, we get it. Tell us how the checklists help! There are a lot of medical examples (he is a doctor), and I sometimes had to struggle to apply the lessons from the medical checklist stories to my own environment.

What I learned

Simple/Complicated/Complex: Early in the book, Dr. Gawande divides tasks into three levels of difficulty (note: these are not his divisions… someone else invented them). There are simple tasks, which, once learned, can be executed successfully most of the time. Complicated tasks take longer to learn, but can eventually be codified as well. Complexity arises when there is simply too much variability and randomness to possibly codify everything. (A good summary, with reference links: Simple vs. Complicated vs. Complex vs. Chaotic)

Hypocrisy: I’m going to recommend the use of checklists. But that doesn’t mean I’ll consistently use them myself (I will try!). I’m in good company: near the end of the book, he mentions that about 80% of the surgeons and nurses thought they would follow the checklist he had developed. But 93% would want a team operating on them to follow it.

Not forgetting the stupid stuff: There is a seeming paradox in following a checklist (codified, rigid) in a complex environment that will likely require flexibility and creativity to succeed. The author argues that the checklist works precisely because it helps you not to forget the simple, stupid things that are easily forgotten in the heat of the moment, but whose omission can cause the entire process to fail. My interpretation is that there are at least two reasons we skip such steps:

  1. Time/danger pressure: either time is short, or there is actual danger. Both encourage skipping steps.
  2. Boredom: we want to get to the interesting stuff. In technical fields, we tend not to want to spend time on the boilerplate, but to get right to the guts of the interesting problems. Again, this encourages skipping steps.

How to successfully execute tasks in a complex technical environment. Bringing together the items from the book’s many anecdotes and stories, here are some items that I think apply to the software development arena:

  • Communication: In the building-industry example, Dr. Gawande points out that there are in fact two checklists. The first is obvious: weld girder A to girder B on such-and-such a date. But the second is more subtle: force people to talk. Primes from each functional area (structure, electrical, building owner) have a defined schedule of communications. Is this task done? How did it go? Were there any problems that other areas need to know about, or that will affect the schedule? This is similar to various notions in Agile design (scrums, involving the customer), but is better defined. I’m struggling with exactly how to incorporate this into a software development process, especially given the clear difference between a building plan (which seems like classic waterfall design: you know up front exactly what you are going to build and when) and an Agile software development project. However, it seems to me there is something of value here.
  • Don’t forget the simple steps: In his surgical checklist, administering antibiotics just before an operation commences is a step that makes a large difference in the number of post-operative infections. But the step is quite frequently missed or performed too early. In software development there are similar simple things that must be done, but are easy to miss. Did you forget to re-throw the exception? Could that pointer be null? Did you deploy the updated library when you deployed the updated mainline code that calls it? I can think of several checklists I could prepare that would assist me day-to-day (a sketch of one follows this list).
  • Use a checklist: I’m just as guilty of preparing checklists for others but not following them myself. Hopefully, having read this book, I will improve on that.
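As a start, here’s a minimal sketch of what one of those day-to-day checklists might look like as a script. The items are illustrative examples of my own, not from the book:

    #!/usr/bin/env python3
    """A minimal pre-deployment checklist runner (a sketch, not a product).

    The checklist items are illustrative; substitute your own."""
    import sys

    CHECKLIST = [
        "All unit tests pass?",
        "Exceptions re-thrown or logged, not silently swallowed?",
        "Updated libraries deployed along with the mainline code that calls them?",
        "Rollback plan written down?",
    ]

    def run_checklist(items):
        for i, item in enumerate(items, 1):
            answer = input("%d. %s [y/N] " % (i, item)).strip().lower()
            if answer != "y":
                print("STOP: resolve item %d before continuing." % i)
                return False
        print("Checklist complete.")
        return True

    if __name__ == "__main__":
        sys.exit(0 if run_checklist(CHECKLIST) else 1)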

Cloud Computing Presentation

Friday, March 13th, 2009

The presentation went well, the Elluminate software worked just fine. People reported good audio quality, and the features like “raise hand” and the chat window allowed for questions to be asked and answered during the presentation.

The screen sharing seemed to behave as expected.

I’ve made the presentation slides available for download: Cloud Computing Presentation to CJUG March 2009

A recording of the actual presentation is available at CJUG’s web site.

Cloud City Background

Refactoring Dynamic Code

Monday, November 19th, 2007

This has been one of my concerns about dynamically typed languages all along.

The claim is that “enough testing” will eliminate the need for static checking by the compiler.

http://www.artima.com/weblogs/viewpost.jsp?thread=217080

Frank Sommers seems to be saying otherwise… in the long run.
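A trivial, hypothetical illustration of the concern: in a dynamic language, a typo like the one below survives import, and survives any test run that doesn’t exercise the offending line; a compiler (or an IDE compiling continuously) flags it before anything runs.

    class Account:
        def __init__(self, balance):
            self.balance = balance

        def withdraw(self, amount):
            self.balance -= amount

    def close_account(account):
        # Typo: there is no "withdrawl" method. Nothing complains at
        # import time; only a test that reaches this line will catch it.
        account.withdrawl(account.balance)

    if __name__ == "__main__":
        close_account(Account(100))  # AttributeError, but only at runtime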

So, I’m feeling justified in my decision to stick with Java, where I gain maximal benefit from IDEs and compilers.

Pet Peeve: Learning Curve Misuse

Thursday, March 15th, 2007

How many times have you heard this?

This product has a steep learning curve.

I’ve lost count. But it never ceases to annoy me. What does the above actually mean? And what is the person saying it really trying to get across?

What they’re trying to say: This product is difficult to learn.

What it really means: This product is easy to learn.

Thus you see the confusion, and my frustration. They’re actually saying the opposite of what they meant to say. Usually it’s clear from context what they really mean. But I’m a bit of a stickler when it comes to language issues. (OK, sometimes I’m a lot of a stickler.) So this misuse of the phrase “steep learning curve” really bugs me.

Where does this come from, and why is it so difficult to understand?

My theory is that “steep” seems to connote “difficult”. As in “that’s a steep hill to climb.” But we are not talking about hills, we are talking about curves, and in this case steep means “changing quickly”. And since the curve being discussed is a “learning curve”, that means “can be learned quickly”.

If that explanation makes sense, you can:

  1. stop reading now
  2. start using “steep learning curve” properly
  3. tell other people to do the same
  4. correct them when they do not

But if you’re not yet convinced, read on. Colleagues of mine from the late, lamented XOR in Boulder (you know who you are) will recognize this rant, and the graphs that accompany it.

Learning Curves

To start with, what, exactly, is a learning curve?

I first ran across learning curves in the ’70s. My father brought home a programmable HP calculator (an HP-67, as I recall) which was nearly the size of a mechanical adding machine. He wanted to see how this technology could be useful, and since I seemed to have an aptitude for these things (I was in my early teens at the time, and already showing geekish tendencies) he thought I might be able to put the machine through its paces.

The sample problem my father brought home was to calculate learning curves. Learning curves were part of the toolkit of the reductionist approach that Operations Research practitioners were applying to production line analysis at the time, along with Time-and-Motion studies, Therbligs and other such evil things.

He had a simple formula which related the time required for an individual to perform a task to the number of times that individual had previously performed that task.

T = f(T0, T∞, r, n)

That is, the time (T) to perform the task is a function of the time required for an inexperienced user to perform the task the first time (T0), the time required for an experienced user to perform the task (T∞), the “learning rate” (r), and the number of times it has been performed (n).

The difference between T0 and T∞ dictates how much opportunity for improvement there is as a result of having practiced the task many times. That is, some tasks are more amenable to learning from experience than others.

The learning rate parameter encodes the speed with which an inexperienced operator learns and becomes an experienced one. This parameter is the crux of the issue. Its value might change from person to person (some people learn more quickly than others) and from task to task (some tasks can be learned more quickly than others). As I recall there is a sort of “average” value for this parameter that is applicable to the average person and to a wide range of common assembly tasks found on widget assembly lines.

I don’t remember the form of the equation, but I’d be willing to bet it included some sort of negative exponential function that approaches T∞ asymptotically.[1]
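Purely for illustration, here’s one standard exponential form that fits that description (an assumption on my part, not necessarily the original formula):

    import math

    def task_time(t0, t_inf, r, n):
        """Exponential learning-curve model: time for the n-th repetition
        decays from t0 toward the asymptote t_inf at learning rate r."""
        return t_inf + (t0 - t_inf) * math.exp(-r * n)

    # A task taking a novice 100s and an expert 20s, learning rate 0.5:
    for n in (0, 1, 2, 5, 10):
        print(n, round(task_time(100.0, 20.0, 0.5, n), 1))
    # 0: 100.0, 1: 68.5, 2: 49.4, 5: 26.6, 10: 20.5 -- fast early gains,
    # then an asymptotic approach to the experienced-operator time.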

What does that all mean? Well, here is a picture to help clear things up.

Steep and Shallow Curves

With that background we can start to play with some variations.

The first one, which is directly related to the discussion at hand, is the classic “steep learning curve”.

Interpreted verbally: “After only a few iterations, an inexperienced operator is able to perform the task nearly as quickly as a very experienced one.” The widget factory managers like this kind of curve, because it means that when they hire a new assembly-line worker, it takes only a short amount of “training time” before the new hire becomes as proficient as the more experienced operators.

What’s the opposite of “steep”? For lack of a better word, I’ll choose “shallow”.

Interpreted verbally: “It takes many iterations before an inexperienced operator reaches the level of proficiency of an experienced one.” The widget factory managers hate this kind of curve, because it means that it is a long time before a new hire is as productive as the existing workforce.

You could also use “gradual”, or “slow” to describe this curve. But I would emphatically not use words like “gentle” or “easy”, as these connote that this is a desirable curve, when this is obviously not the case.

Better terminology

So, if “steep” is confusing and “shallow” isn’t much better, then what is preferable? How about “difficult” and “easy”? As in “This product has a difficult learning curve.” or “This API has an easy learning curve.” I believe I have actually seen this terminology “in the wild”. I wish I had kept a pointer to the reference, so I could send a congratulatory email to the author.

Other Interpretations

When I had this conversation with the XOR folks back in the mid-’90s, various objections were raised. They all deserve addressing, to ensure that there is not some interpretation of a learning curve in which “steep” means “bad”.

Objection 1: We’re not making widgets

This objection stems from the fact that in most cases in the IT industry these days, the author of the “steep learning curve” comment is not referring to a repeated activity, but rather to a single, long, drawn-out activity. That is, they are referring to learning to become proficient with a new IDE or API, which is a long and intellectually challenging task, and not to repeating the same mechanical, hand-eye-coordination-requiring task over and over.

However, I would argue that the concept maps reasonably well into this new situation. We simply replace the “Number of Repetitions” axis with “Amount of Time Spent with Tool”, and the “Task Time” axis with “Proficiency”.

This still looks steep to me. And it still shows that high proficiency is obtained quickly. So “steep” still equals “good”. Objection overruled!

Objection 2: What if the Y axis is reversed?

Figure 3a nicely covers objection 2 as well. As you can see, the value along the Y axis increases over time rather than decreasing. But neither of these changes (relabelling the axes, inverting the Y axis) has changed the fact that this is a steep curve. And the steep curve is still a good one; verbally I would read this curve as “A user new to the tool will rapidly become proficient.” An IT manager would like this kind of tool, since it means that a new hire would become as proficient as the existing coders very quickly.

Just to be sure, let’s also look at a “shallow” (i.e. “not steep”) learning curve with the Y axis inverted.

Verbally, it’s clear this graph says: “An inexperienced user will take a substantial amount of time to reach peak proficiency.” Our IT manager would be very unhappy with a tool or framework with this sort of learning curve (can you say EJB 1.0/2.0?). So flipping the Y axis hasn’t changed anything. Objection overruled!

Objection 3: What if we measure time spent along the vertical axis?

Cheap answer: who would do that? But OK, I’ll humour you and transpose the axes.

I find this graph hard to read, but if I squint at it, I can see that

  1. it is steep
  2. it indicates that high proficiency is reached after a short time

So “steep” is still “good”. This objection also doesn’t change anything. Objection overruled!

Objection 4: Steep means that you have to exert yourself to get up the hill

This objection is harder for me to formulate properly so I can deal with it, but I’ll give it a try. The idea, if I understand it correctly, is that you want to reach a certain proficiency level in a certain amount of calendar time. (Previous graphs have used “Amount of time spent with tool” as the time axis.) The argument here is that, given two different tools, one that is easy to learn and one that is hard to learn, and a goal, say, to be usefully proficient in two weeks, you would have to spend more time and effort learning the more difficult tool in those two weeks to reach equal proficiency.

With calendar time as the axis, the two curves now look equally steep, since, by definition, you’ve constrained that to be the case. However, this is misleading, as the only reason the two curves look the same is that much more effort was spent learning to use the more difficult tool. The argument seems to be something like “What we are measuring is the effort required to make the learning curve for any given product look steep.” But what does it mean to compare two products with this interpretation? Under these conditions all curves are as steep or as shallow as you want, so using “steepness” as a comparative is meaningless.

We’re back to saying that the tool has a “difficult” learning curve, as in “It’s difficult to make this product’s learning curve into a steep one.” But that hasn’t changed the fact that a steep curve is one on which a user can achieve proficiency in a short period of time. So steepness is still a good thing, not a bad one.

Other Misuses

More Bad Terminology: “Spiky” Learning Curve

I ran across this one just the other day. Some product reviewer said a product had a “spike” in the learning curve. What could he possibly be trying to say? Well, let’s assume he means that you go along for a while, learning to use the product, and then you run across something difficult, and you slow down for a while. At least that’s what I got from the context. He implied that spiky learning curves are bad.

If we took him literally, it would mean that the learning curve went up, and then went down again. Note that none of our curves has ever changed direction. Curves that started low and went higher (proficiency as the Y axis, which is presumably what this reviewer had as his mental image) never drop back again: they are monotonically increasing. What would it mean if this were not true? It would mean that at some point you become less proficient than the day before. This isn’t how learning curves usually work (although… I recall that when the EJB specification first came out my web development productivity went into the toilet for a while… but never mind).

What this guy is trying to say is that the rate of learning goes through phases where you learn more quickly or less quickly than the average. Notice that magic word “rate”. So he’s really talking about the slope of the learning curve. That is, we have to take the derivative of the learning curve equation.

For a normal learning curve (Figure 5a) a graph of the slope would look approximately like Figure 5b.

You can clearly see in Figure 5b that there is something resembling a spike: initially you are learning slowly, then you learn quickly for a while, followed by a plateau, where you are proficient, and only small additional gains are possible.

So now we have a mathematical definition of “steep”: the slope or rate of change of the learning curve. Since steep is good, that implies that a large rate of change (or slope) is good, which means that large values on the Y axis of the rate of change graph are good… which means that spiky is good. Spiky means that there are times where you are learning very quickly.
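To make that concrete, assume (purely for illustration) an S-shaped proficiency curve of the kind Figure 5a depicts; the “spike” is just its derivative:

    import math

    def proficiency(t, k=1.0, t_mid=5.0):
        """A hypothetical S-shaped (logistic) proficiency curve: slow
        start, rapid middle, plateau -- the shape of Figure 5a."""
        return 1.0 / (1.0 + math.exp(-k * (t - t_mid)))

    def learning_rate(t, dt=1e-6):
        """Numerical slope of the proficiency curve at time t."""
        return (proficiency(t + dt) - proficiency(t - dt)) / (2 * dt)

    # The rate peaks in the middle: that's the "spike" of Figure 5b.
    for t in range(0, 11, 2):
        print(t, round(proficiency(t), 2), round(learning_rate(t), 3))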

Graphical Error: Billboard

One particularly egregious misuse of learning curves was on a billboard for a local community college I spotted recently. What’s wrong with this picture?



Billboard Caption: “The learning curve of [College X] graduates”

Where do I start? I would read this as “Our graduates are mediocre and never improve.” Not exactly the message the college wants to convey.

How could we help out the college here? What if we replace their bogus graph with Figure 3a, which is our steep learning curve?

Well, there’s a bit of a minefield here, since you could read that graph as “Our graduates don’t know anything but they learn quickly.” Maybe that’s not exactly the message the college wants to convey either.

How about this:

I read this as: “Our graduates know some stuff right out of the program, and they learn quickly.”

There, problem solved. Now I can sleep at night.



References

[1] The Learning Curve. FAA Acquisition System Toolset.

Quis Custodiet Ipsos Custodes? Monitoring your monitoring system

Monday, June 21st, 2004

The group I’m in here at a major telecommunications provider has a nice little setup of HP’s OpenView and Remedy’s Action Request System.

OpenView listens for and seeks out problems with various internal and customer-owned systems. When something serious occurs, it uses some 3rd-party integration software (RemedySPI) to create a trouble ticket in Remedy. Rules fire to assign the ticket to an appropriate “triage” person. Notification is then sent to the assignee by email or text pager, depending on the severity of the problem. There is a POTS modem plugged into the Remedy box that is used to send text pages, and the Remedy box is allowed to make outgoing SMTP connections to the Internet to deliver e-mail.

Remedy and OpenView are linked so that closing the ticket removes the corresponding event in OpenView, and the OpenView event is annotated with the Remedy ticket number.

Both use an Oracle database to store information about tickets and events.

Each system (Remedy, OpenView, Oracle) runs on a separate piece of Sun hardware.

So we have three single points of failure that could cause notification of critical events to stop: OpenView (the software, or the hardware it’s running on), Remedy (likewise), or Oracle (likewise).

Additionally, failure of a single 100baseT switch or switch port would sever connectivity and take out the notification system.

I’d like to set up something that would detect a failure of one or more of the three critical systems, and notify someone.

Obviously this backup notification system has to be independent of the three in question. So it should run on some fourth piece of hardware plugged into a different switch. It would then poll or look for heartbeats from the three systems, and use its own resources to notify someone of a problem.

As far as I can see it would be OK if this backup monitoring box used the same Internet connection, since e-mail is only used to deliver lower-severity notifications, and a loss of Internet connectivity would be a Critical-level event, which would use the POTS line to deliver the notification.
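A rough sketch of the sort of thing I have in mind (hostnames, ports, and addresses are hypothetical placeholders, and the POTS-paging fallback would still need solving):

    #!/usr/bin/env python3
    """Minimal independent watchdog, run from cron on the fourth box.
    All hostnames, ports, and addresses are hypothetical placeholders."""
    import smtplib
    import socket
    from email.mime.text import MIMEText

    CHECKS = [
        ("openview-host", 22),     # is the box reachable at all?
        ("remedy-host", 22),
        ("oracle-host", 1521),     # Oracle TNS listener
    ]

    def port_open(host, port, timeout=10):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def notify(failures):
        msg = MIMEText("Heartbeat failures: %s" % ", ".join(failures))
        msg["Subject"] = "ALERT: monitoring stack unreachable"
        msg["From"] = "watchdog@example.com"
        msg["To"] = "oncall@example.com"
        smtplib.SMTP("localhost").send_message(msg)

    if __name__ == "__main__":
        down = ["%s:%d" % (h, p) for h, p in CHECKS if not port_open(h, p)]
        if down:
            notify(down)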

The main monitoring system would then be set up to monitor the backup system, to make sure it’s running.

So… finally, to the question:

What software would you use to set up this monitoring/notification system? Obviously one could install a completely parallel set of [OpenView, Oracle, Remedy], but that would be overkill, as we only need to monitor three machines and a few daemon processes.

Is there some nice Open Source project out there that would allow me to quickly solve this problem? Has anyone done something like this? Any comments on my failure-mode analysis? Am I worrying about the right things?

Comments

Three comments:

1. I’d make sure that the secondary monitoring system is monitored by your fancy-dancy system, in case it fails (of course, if both fail at once, you’re hosed, unless you want to add a third system, and so on).

2. I don’t know how complex a system you need to do the secondary monitoring. If all you want to answer is “is the software running?” you may be able to get by with a simple perl script running from cron (can I make a connection to Oracle, etc.). A quick look at SF and Google didn’t point out anything obviously relevant to more complex cases.

3. How does Remedy send the text messages over the modem? You need to consider that in your solution, for sure. Not sure how to do that in perl.

Posted by: Dan Moore at June 21, 2004 03:33 PM

We use the open source solution: Nagios (http://www.nagios.org/).

It has a simple plugin infrastructure so you can write a bit of Perl code to monitor anything.

It is simple, clean, and very easy to use.

Dion

Posted by: Dion Almaer at June 23, 2004 10:32 AM

Unix command line utility program conventions

Monday, June 7th, 2004

Sometimes a vendor supplies a command-line utility for performing some function that we want to use from within our scripts and programs.

There are some unwritten (at least as far as I can find) rules about how to write one of these utilities so it can be used properly.

Some vendors get this right. Others, not so much…

The Rules

Return an error status indicating success or failure. For bonus points, return multiple different error codes depending on what went wrong. (The Anna Karenina Principle: successful runs are all alike; every failing run should fail in its own distinguishable way.)

And you know what? That’s about it. The rest (arguments, input/output locations, etc.) really depends on the context and function of the item in question. Though the following are useful:

  • Provide a useful usage message if invalid arguments are passed.
  • Provide an explicit way (e.g. a --help option) to ask for the above usage message.
  • Use GNU-style arguments.

But these are really for human consumption, not for use in a script.
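Here’s a tiny sketch of a utility that follows the rules (the specific exit codes are arbitrary choices for the example):

    #!/usr/bin/env python3
    """frobnicate: a sketch of a script-friendly command-line utility."""
    import argparse
    import sys

    EXIT_OK = 0          # distinct, documented exit codes let callers
    EXIT_NOT_FOUND = 1   # tell failures apart (values chosen arbitrarily
    EXIT_UNREADABLE = 3  # for this sketch; argparse itself exits with 2
                         # on usage errors)

    def main():
        parser = argparse.ArgumentParser(description="Frobnicate a file.")
        parser.add_argument("file", help="file to frobnicate")
        # parse_args() prints a usage message and exits non-zero on bad
        # arguments, and provides --help for free.
        args = parser.parse_args()
        try:
            with open(args.file) as f:
                f.read()
        except FileNotFoundError:
            print("frobnicate: %s: no such file" % args.file, file=sys.stderr)
            return EXIT_NOT_FOUND
        except UnicodeDecodeError:
            print("frobnicate: %s: unreadable" % args.file, file=sys.stderr)
            return EXIT_UNREADABLE
        return EXIT_OK

    if __name__ == "__main__":
        sys.exit(main())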

Recent Failures

I won’t name names, but here are some of the failures I’ve seen recently. (And these are from Big Companies that should know better. Including one company that actually has its own version of Unix… they should really know better.)

  • utility always returns status 1. How the HECK am I supposed to know if it worked? Why are you always returning failure? Didn’t you read a single Unix man page? Didn’t you notice that non-zero exit codes mean failure?
  • -q option suppresses error messages to stdout/stderr… and suppresses the meaningful exit status as well. Take a look at diff(1) sometime: its -q option just suppresses the listing of the differences, but still returns the proper exit status.
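A calling script can then branch on the status without parsing any output. diff(1) documents its exit statuses as 0 (inputs are the same), 1 (different), and 2 (trouble), so (filenames hypothetical):

    import subprocess

    # diff -q suppresses the listing of differences, but the exit
    # status still tells the whole story.
    result = subprocess.run(["diff", "-q", "config.old", "config.new"])
    if result.returncode == 0:
        print("files are identical")
    elif result.returncode == 1:
        print("files differ")
    else:
        print("diff itself failed (missing file? permissions?)")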

Strangely enough, these rules apply equally well to Windows command-line utilities. Yes, these do exist.

Another suggestion

If it can be done with a command-line utility, then also give us an API we can use.

If you just create a simple C library, we can then wrap it in our favorite language as a Perl Extension Module or a Java Native Interface (JNI) package.

If you feel like creating a pure Java implementation of the library, that would be good too.

From Dan Moore:

On a different tack, but still touching some of the same principles, you may want to check out the Command-Line Options section of The Art of Unix Programming.

Originally Posted June 7, 2004 11:20 AM

Third party software installation woes

Monday, February 2nd, 2004

Let’s say you wanted to run two completely separate instances of Apache’s web server. (I.e. not just multiple virtual hosts listening on different ports, but actual separate httpd processes.) All you’d have to do is create a new httpd.conf (in a separate directory, or with a different name in the same directory as the current one). Then just invoke httpd with the -f option to point at this new config file. The new config file can specify the locations of all the other files (server root, logs, pid files, etc.). Couldn’t be easier.
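To make that concrete, here’s roughly what the second instance’s config might contain (all paths and the port number are hypothetical):

    # /usr/local/apache2-test/conf/httpd.conf -- hypothetical second instance
    ServerRoot   "/usr/local/apache2-test"
    Listen       8081                      # different port from the main instance
    PidFile      "logs/httpd.pid"          # relative paths resolve under ServerRoot
    ErrorLog     "logs/error_log"
    DocumentRoot "/usr/local/apache2-test/htdocs"

    # Start it with:
    #   httpd -f /usr/local/apache2-test/conf/httpd.conf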

On the other end of the spectrum is WebLogic 8. I just tried to do the same thing with it (made a copy of the config directories, tweaked them, fired it up), and it failed miserably. Parts of it were still pointing into the old config tree, so things like DB pool deployments were failing.

So, what’s the problem here, and what’s the solution?

In my opinion, there are three types of data associated with an application:

Static
The executable code, and anything else that gets installed off the media (and/or built at install time), and never changes thereafter.

Configuration
Any file that affects how you want the application to run.

Dynamic
Anything that changes during the running of the application, but isn’t part of your own data (e.g. not your relational database). This includes stuff like log files, pid files, lock files, and work directories. Usually you can delete this stuff when the application isn’t running, and it will happily recreate it when it restarts.

In my opinion, you should be able to create a new config file(s), execute the same static code, point it at the new config file(s), and have the newly configured instance create its own set of dynamic files.

This is what I did with apache in my example above. For apache, my classification scheme works out as follows:
Static: the bin and modules/libexec directories
Configuration: the conf directory
Dynamic: the logs directory

Why would I want to do this? Because I want to check the config files into CVS, so I can carefully manage changes to them. I don’t need to check a copy of apache itself into my CVS tree, since I know what version I downloaded and installed. If/when I upgrade to a new version of apache, I can make any required changes to the config files to support and configure the new version, and check those changes into CVS. The Dynamic files I definitely don’t want in CVS, but I might want to archive them (e.g. the logs) for future reference. A clean split between the 3 file types makes this process very simple. I just have to put a copy of the apache conf directory into CVS, and I’m done.

Now let’s look at what happened with Weblogic. I’m still doing the analysis, but here are the preliminary results:

Weblogic wants 3 directories:
BEA_HOME: stuff common to all BEA products
WLS_HOME: where the weblogic software lives
Domain directory: where the files for a “domain” go

In order to run a single weblogic server instance listening on a single port, you actually have to execute 3 processes: an Admin Server, a Node Manager, and a Managed Instance. You have to allocate each of these servers at least one port number to use to listen/talk to the others.

The Admin Server and the Managed Instance share a single config file: config.xml in the Domain directory. The Node Manager has its own config file, nodemanager.properties, which lives in the WLS_HOME directory.

By default log files get created in the Domain directory, and there’s an LDAP database that gets stored there too.

There are quite a few files that need to be changed in order to change the directory locations and port numbers assigned to the 3 processes that make up a single “instance” of WebLogic. Somehow, in my effort to simply copy the directories and create a new configuration, I missed one spot (grepping can’t find it!). Our problem seems to be with the LDAP directory and the license file location.

With Weblogic, it seems impossible to separate out the Configuration and Dynamic portions, and hard to separate the Static from the Configuration.

It looks like we’re going to have to re-install from media, and create entirely new BEA_HOME, WLS_HOME and DOMAIN directories. Then I’ll have to run and configure the software to get it to the state I want. Then I can shut it down and do a “diff” between the various directory trees to see what actually changed in order to change the directories and ports. Then I’ll have to figure out which files are actually Configuration files and which are just Dynamic, and how to check in only the Configuration files so we can track config changes in CVS. In the worst case, some of these files (e.g. the LDAP database) will be binary.


So, the guidelines:

Make a clear separation between the three types of files
It should be easy for me to tell which files are executable stuff and which are configuration-related. If all the configuration files are stored in a single directory, that makes it easy to put the configuration under CVS control.

I should be able to lock down the Static and Configuration directories so that they are read-only and the software will still run happily (i.e. it shouldn’t be trying to write stuff into the configuration directory… unless I ask it to through some configuration utility or GUI).

The location of all Dynamic files should be configurable
I need to be able to control the location of these files (e.g. I might want my log files on a different file system). I need to be able to build a config file that will ensure that none of these Dynamic files will overlap with those from another instance running on the same CPU.

Configuration files should be text files
This way I can use CVS to control them and actually see via cvs diff what the changes were.


Reasons you might want to run multiple separate instances

  • You want to try out some different configuration options without affecting the main server process.
  • You want each developer to have their own instance that they can bounce at will.
  • You want a production and a staging instance on the same box.

Other classes of data

Is there another class of data beyond the 3 I’ve identified above?

Maybe Persistent for things like user and content databases that can change at runtime.

Maybe Custom for things like HTML, JSP, .java, .class, .jar/.war/.ear files and all the other stuff that makes your site different from every other site, but doesn’t change during the execution of a server instance.

These are similar to Dynamic in that you need to be able to control the location (e.g. filesystem path, JDBC URL) where the data will be stored, so it doesn’t (or does!) overlap with another instance, and you can treat it differently than other Configuration and Dynamic data (e.g. you need to back this stuff up, but you don’t usually want to back up the Dynamic data, and the Configuration data is in CVS, so it doesn’t need backing up).

Configuration utilities

Some programs come with a nice utility (command-line, web-based or GUI) for setting up the configuration. In many cases these utilities just write out text files that you could edit directly. This can be beneficial in several ways.

  • Sometimes it’s easier to understand what the various options are and what the interactions are by running the utility and having it explain them to you, or by having a single option change in the utility change several options “in sync”. You can then do a “diff” of the before and after versions of the text files, and get a better understanding of how the text file works.
  • You can check changes to the config file into CVS, even though you used a GUI to make them.

More examples, both good and bad

Tomcat: Tomcat has a directory structure that looks promising, but it’s not immediately clear to me how I would change the invocation script (catalina.sh) to point to a different configuration directory. In the worst case I could just copy the whole CATALINA_HOME tree, and that would solve the problem, but then I have multiple copies of the Static stuff as well, which seems like a waste.

MySQL: It seems that you just have to set some options on the mysqld command line, and it will happily use completely different config files and database file locations.
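For example (paths and port hypothetical):

    # A second, fully independent mysqld instance from the same binaries:
    mysqld --defaults-file=/etc/mysql-test/my.cnf \
           --datadir=/var/lib/mysql-test \
           --port=3307 \
           --socket=/var/run/mysqld/mysqld-test.sock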

Feel free to let me know about software that you’ve run across that implements a good system for separating the various file types. And let me know if there are flaws in my guidelines.

Originally Posted February 2, 2004 09:37 AM
Comment by John Barnette, February 2, 2004 05:35 PM

Here’s a bit more of a breakdown. It’s flawed; please refine.

+ Configuration
Duh. Can vary by instance.

+ Infrastructure (maps to Static)
The executables and libraries that make up the product. Can support any number of instances.

+ Persistent
An information repository that persists over multiple application runs, but can’t be rebuilt at runtime. Databases, flat files, et cetera. Can vary by instance.

+ Runtime
Logs, temporary files, lock/pid files, and other stuff that can easily be recreated at runtime. Locations of each can vary by instance.

+ Content
All the other stuff that you’ve created that doesn’t have the potential to be changed at runtime. Can vary by instance.