What needs to be done?
If you have read my previous posts in this series and followed all the links that I recommended, then you have read a lot of other people’s documentation and best practices. With a little luck you have put some of them into practice already, or can at least see some opportunities to do so. You have likely ignored others, or modified them to suit your particular circumstances. Having gotten this far, it is critical that you take a deep breath and pause long enough to begin pulling together your own set of system documentation.
A complete set of system documents should include a wide variety of topics, but most importantly it should define and describe how you have met a particular set of design requirements using a particular set of vendor recommendations, best practices, and customizations. Think of it as “the story” of your system. It should include everything anyone would need to understand how and why your system does what it does. This includes how to “deliver” it, “feed” it, “exercise” it, protect it, watch it, fix it, and clean up after it when it makes a mess (and sooner or later it will make a mess).
Some examples of individual documents that you might want to consider including are:
- Design Description – A high level description of your system’s purpose and how you plan to achieve it. It should include enough technical language to convey your design concepts accurately without overwhelming the reader with too many of the gritty details. Save those details for the installation and operations guides.
- Project Plan – An overview of how you plan to implement your system, including all of the major phases like planning, purchasing, installation, configuration, testing, and final deployment and acceptance.
- Installation Guide – A step-by-step detailed set of instructions on how to install and configure the system hardware and software in your particular environment. This one is especially important: don’t just rely on vendor documentation, as it can be hard to read, include options that don’t apply to your situation, and won’t include important details that do apply. Your installation instructions should be as short and to the point as possible, including only those steps and considerations that are important for you.
- Test Plan – A detailed description of whatever acceptance and performance baseline testing you plan to do, including the tools and methods you plan to use, the purpose and expected result of each test, and the criteria for deciding whether a result is acceptable.
- Operations Guide – A simple explanation of all major administrative and maintenance operations associated with the system, and how to perform them. This guide should also include instructions for how to monitor the system for unexpected behavior.
- Security Policy – A detailed description of the roles and responsibilities of all people associated with the system, from users to administrators. This should also include policy and procedures for security-related activities like processing new and retired users, allowing network connections to and from the system, changing access controls, password requirements, and the like.
- Backup and Recovery Guide – Related to both security and operations, this is a topic that is often big enough to require its own manual, especially where databases are concerned. Describe your backup policies, the architecture of your backup infrastructure, backup scripts or commands, and examples of how to recover your system from various types of failure scenarios. These scenarios should be individually tested to ensure accuracy as part of your Test Plan.
- Development Standards – Set down in writing the standards for developing your application, whether for security, architecture, naming of objects and procedures, or making decisions in various design scenarios. This may also include a list of recommended best practices or references to other reading related to your system.
- Interface Agreements – If your system interfaces with other systems you should document the details of the connections. A description of any related subsystems, schedules, network configurations, expected data transfer protocols or system loads should be included, as well as processes for coordinating changes to the interface and points of contact.
- Service Level Agreements – It is important to manage the expectations of your users in terms of system performance, responsiveness to problems or outages, and maintenance activity.
How can I do that?
There’s no great secret to any of these documents. Examples of all of them can be found with quick Google or Yahoo searches. The hardest thing to do is just to start writing. I usually start with an outline of what I want to include, but everyone has their own style and you may have to work a bit to figure out yours. I definitely recommend doing a little research and learning by example. Take a look at what other people have done (or what is required by your organization) and then decide for yourself what best fits your needs. If your system is small, you might be able to create what you need relatively quickly; you may not even need all of the documents I described. If your system is a bit larger, you may need all of this and more. Hopefully in that case you’re not doing all of the writing yourself, though.
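If starting from an outline appeals to you, the first step can even be scripted. The following is only a minimal sketch of that idea, not a recommended tool: the document titles come from the list above, but the section headings, the file naming, and the function names are all placeholders I made up for illustration, and you should replace them with whatever fits your system or your organization's requirements.

```python
from pathlib import Path

# Document titles taken from the list earlier in this post.
DOCUMENTS = [
    "Design Description",
    "Project Plan",
    "Installation Guide",
    "Test Plan",
    "Operations Guide",
    "Security Policy",
    "Backup and Recovery Guide",
    "Development Standards",
    "Interface Agreements",
    "Service Level Agreements",
]

# Hypothetical starter headings; every document will need its own.
PLACEHOLDER_SECTIONS = ["Purpose", "Scope", "Details", "Open Questions"]


def outline_text(title):
    """Return a plain-text outline skeleton for one document."""
    lines = [title, "=" * len(title), ""]
    for number, section in enumerate(PLACEHOLDER_SECTIONS, start=1):
        lines.append(f"{number}. {section}")
        lines.append("   TODO")
        lines.append("")
    return "\n".join(lines)


def file_name(title):
    """Turn a document title into a simple file name (assumed convention)."""
    return title.lower().replace(" ", "_") + ".txt"


def write_outlines(target_dir):
    """Create one outline file per document; return the paths created."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for title in DOCUMENTS:
        path = target / file_name(title)
        # Never overwrite a document that someone has already started.
        if not path.exists():
            path.write_text(outline_text(title), encoding="utf-8")
            created.append(path)
    return created
```

Calling `write_outlines("system_docs")` would create a folder of empty outlines to fill in. The script skips files that already exist, so running it again later will not clobber anything you have written.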
Most technical people that I know (myself often included) really chafe at the idea of taking the time to create detailed documentation. We’d rather move on to the next technical challenge or work on fine-tuning our system than brain dump everything we know for someone else to not read at a later date. That is something that we all need to overcome.
How does that help?
A complete set of documentation like the one I described above can be important for all sorts of reasons. First, it can be important for us personally. Most systems – and especially Oracle databases – can be really, really complex, with a lot of “moving parts”. Think about things like network listeners, datafile management, backup and recovery, user management, performance monitoring and analysis, code reviews and enforcement of design standards, requirements analysis, security policy and implementation, and the list goes on. And that is just for the database. It is difficult, if not impossible, for any one person to keep all of the details of design, policy, installation, configuration and operations in their head. Sooner or later we’re going to forget something. When that happens it is critical for the sake of consistency in system operations to have a resource that tells us what to do and the reasons for doing it in that particular way.
Then there’s the classic “what would happen if I were hit by a bus today” scenario. What would happen if someone completely unfamiliar with your system had to come in and learn how to work with it from scratch, without the benefit of your thoughtful insights and explanations as to why things have been done (or were not done) in a particular way? What would be best for your organization? A complete and concise set of documentation can make a daunting task like that a lot easier for whoever gets tasked with covering for you. And assuming that you haven’t actually been hit by a bus and are eventually able to return to your job, it would be good for things to have been maintained as close to “your way” as possible in your absence.
There are a lot of other ways that system documentation can be described as “good” and “good for you” but I’ll limit myself to one last one. The act of writing things out in the first place forces you to slow down and consider what you are creating. It is a good opportunity to reflect on your decisions and look for opportunities for future improvement. Once you start to see those, the “story” of your system will be in good hands.