So, I'm a tech writer, not a database guy, but I've been asked to develop documentation for my company's databases. This means I have some questions.

It seems obvious to me that there must be some SQL documentation tools out there, but I have no idea what is used for the purpose. I have some ideas of what I think would be useful, but I'm not the target audience, so it's likely I'm missing something.
So there are two things I'd like some help with:

First, if anyone has used any such tools, I'd appreciate any pointers to things you've found useful. My notion is that dynamically generated documentation is better in this case than flat files or hand-coded html or something, and computers do a better job of generating graphic representations of complex relations than I do. My second notion is that I'm probably not the first person to have this idea, so something of this sort probably exists.
My employer is willing to pay for tools, but I don't really know how to evaluate them from a distance, so I could use some guidance.

Second, as a database developer, what information do you most wish you had when you approach a new position or a new project? The documentation is mostly aimed at continuity of business: the "Bob-gets-hit-by-a-bus" scenario, where you have to replace the guy who knows everything on short or no notice, so picture yourself hired to work on a big database with little or no ability to consult with the people who built it.

My instinct is to get the semantics right - what does each table represent? what do the rows represent? - and map the most important relations - where is my entry point? what am I looking for? - but push the hairy details down into the fine print, since presumably once you know the larger structure it's not so hard to fill in the fine detail. Am I on the right track, from your point of view? If not, what am I missing?

Thanks for any thoughts on this.

Check out the Embarcadero tools for reverse-engineering databases into diagrams. I've used this type of graphical tool for years. They can be a bit pricey but if you're serious about data design and documentation, they're a must. They really help you identify an overall layout of the "data landscape" and give you the ability to move stuff (tables/entities/relationship lines) around, group objects logically and enhance your ability to make some educated guesses about what relates to what.

Many years ago, I used the ERwin tools, but since they got acquired by CA they sort of "lost their mojo". The giveaway tools with MSSQL are next to worthless for anything other than trivial data structures.

Now, those things being said, trying to document an existing database is going to be challenging because most DBAs and developers don't bother to include comments (or, for that matter, useful table and column names!) in their physical implementations. Not only that but in many cases what gets implemented physically is a hodge-podge of de-normalized, de-referentialintegritied (not really a word but you know what I mean) slapdash git-er-done-today, fix-it-on-the-fly-during-development SQL cr@p that probably never even had a logical design in the first place.

Oops, sorry...I do tend to get worked up...I spent a lot of years as a data modeler and it never ceased to amaze me how lax some project teams were about documenting. These were the same guys who "look down" on maintenance programmers and analysts as if they were some sort of inferior breed.

Anyway, in reference to your last paragraph, yes your instincts serve you well. The only other thing I recommend is make a high-level pass at the entire data structure first, and keep a list of questions (TABLE BY TABLE) so you can go back later and follow up. Make judicious use of a query tool to validate your assumptions by actually looking at the data, and try to identify your "anchor tables" first so you can arrange type-code tables, many-to-many-resolution tables and parent/child tables appropriately. Also, if you can swing it, try to spend some time arranging your model so parent tables appear higher on the diagram than child tables...helps readability.

Hope this helps. Best of luck!

commented: Thanks very much. Great feedback. +11

Now, those things being said, trying to document an existing database is going to be challenging because most DBAs and developers don't bother to include comments (or, for that matter, useful table and column names!) in their physical implementations. Not only that but in many cases what gets implemented physically is a hodge-podge of de-normalized, de-referentialintegritied (not really a word but you know what I mean) slapdash git-er-done-today, fix-it-on-the-fly-during-development SQL cr@p that probably never even had a logical design in the first place.

As far as I can tell, it's not as bad as you would think. It's a big mess, as far as I can tell, and it's been under continuous cobbling since the first brick was laid, so, and there's not a stick of documentation, but aside from that, it's a reasonable structure. Basically, the whole thing is a two-stage data warehouse and the people writing the ETL packages have been pretty good, so things have been kept reasonably orderly.

In any case, it's not going to be replaced, and the only way it'll be improved is by making some documentation happen and making it reasonably dynamic so that when I move along from the job, at least there will be something to reflect the changes. And I suspect that the database guys will be willing to annotate their work if it's made easy for them. One of them actually shows some intitiative - he's instituted a pretty cool change to the way data runs through the warehouse, and then he sought me out to write it up, so it wouldn't just be in his head. Gold star for him.

I'll take a look at this embarcadero stuff. If I can whack the right guy across the nose with it at the right time, he might go for it.
Thanks for that. Good stuff!

Nice. For general data warehouse documentation, and ETL/Changed-Data-Capture documentation ideas, you might want to check out some of the documents at www.informatica.com. Their product is hugely overpriced, but they have some excellent articles and demos on their site.

No, I don't work for them. Used to, but that's a whole 'nother story. Good hunting!

Thanks, I'll take a look at that.
Cheers!

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.