Hey,
I am working on a book writing program right now and I am wondering what the best way of
saving the book(s) would be.

I would like all of the books in one place, and I need to divide between all of the chapters
in the book. So I am thinking that a DB might be the best way of saving it.

But I would like your guys' opinion on this matter. I program as a hobby and I have never
worked with DBs before, and very, very little with text files.

Thanks in advance,

- WolfShield

Just use two tables: 1) a Book table with columns BookID, BookName, etc. (the main details), and
2) a Book_Details table with BookID, ChapterID, ChapterText, etc. BookID is a foreign key to the Book table, and BookID and ChapterID together act as a combined primary key that uniquely identifies each record.
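In case it helps to see that layout in action, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the sample titles and text, and the helper name load_chapter are all my assumptions for illustration; the table and column names follow the ones above.

```python
import sqlite3

# In-memory database for illustration; pass a file path instead to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE Book (
    BookID   INTEGER PRIMARY KEY,
    BookName TEXT NOT NULL
);
CREATE TABLE Book_Details (
    BookID      INTEGER NOT NULL REFERENCES Book(BookID),
    ChapterID   INTEGER NOT NULL,
    ChapterText TEXT NOT NULL,
    PRIMARY KEY (BookID, ChapterID)  -- combined key: one row per chapter per book
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Book (BookID, BookName) VALUES (1, 'My First Book')")
conn.executemany(
    "INSERT INTO Book_Details (BookID, ChapterID, ChapterText) VALUES (?, ?, ?)",
    [
        (1, 1, "It was a dark and stormy night."),
        (1, 2, "The morning came quietly."),
    ],
)
conn.commit()


def load_chapter(conn, book_id, chapter_id):
    """Return the text of one chapter, or None if it does not exist."""
    row = conn.execute(
        "SELECT ChapterText FROM Book_Details WHERE BookID = ? AND ChapterID = ?",
        (book_id, chapter_id),
    ).fetchone()
    return row[0] if row else None


print(load_chapter(conn, 1, 2))
```

Only the requested chapter is pulled from the database, so the rest of the book never has to be loaded.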

Hi, a database can be very useful for sorting data, but it has limitations, especially for saving a large amount of data at once. Therefore, before settling on a database, it's a good idea to check how much space you need, what type of data you want to store, and how to store it. Here is some info about data types for storing large amounts of data in MS SQL.

VARBINARY: A variable-length binary data type. Declared as VARBINARY(n), it can store up to 8000 bytes (VARBINARY(MAX) raises the limit to 2^31-1 bytes). If you need to store the whole book in one row, the 8000-byte form won't be a good choice, because even some chapters can easily exceed 8000 characters.

TEXT: Non-Unicode data used for storing large pieces of string data, with a maximum length of 2,147,483,647 characters. TEXT data is typically stored out of row, and this type is not a good choice if you need to search on the value of the column.

IMAGE: This type can be used not only for images but for any kind of binary information, such as a PDF or wave file.

So,
A database has size constraints. Do text files? (Btw, it looks like most book chapters
have between 2,000 and 10,000 words. So that exceeds the 8,000 char limit.)

From the little I know about text files, I know that when you save one you
decide how it is laid out. Is there a way that I can put something at a certain
place in the text file to divide the chapters? Maybe a keyword that the
program searches for when it opens the file, and then sub-strings on it?
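For what it's worth, the keyword idea can be sketched in a few lines of Python. The marker string "=== CHAPTER ===", the file name and the function names are made up for the example; any line that never appears in the book text itself would do as a marker.

```python
import os
import tempfile

# Hypothetical marker line; it must never occur inside the chapter text itself.
CHAPTER_MARKER = "=== CHAPTER ==="
SEPARATOR = "\n" + CHAPTER_MARKER + "\n"


def save_book(path, chapters):
    """Write all chapters to one text file, with the marker on its own line between them."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(SEPARATOR.join(chapters))


def load_book(path):
    """Read the file back and split it into a list of chapters at each marker."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read().split(SEPARATOR)


# Small demonstration in a temporary folder.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "book.txt")
save_book(demo_path, ["Chapter one text.", "Chapter two,\nwith two lines."])
demo_chapters = load_book(demo_path)
print(len(demo_chapters))
```

The catch with this approach is that the whole file has to be read before it can be split, and a chapter that happens to contain the marker text would be cut in two.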

Thanks for the input,

- WolfShield

Here is what I would do.

Make a database that holds all the information in the books, including author, byte offsets of chapter divisions, book id, book title, and the path of the book data and whatever else you might need.

Now you can save the whole book as one big chunk of text to a file, and have a database that holds all of the other information. This way, if you wanted to load a chapter from the book, you could get the byte offset and the length of that chapter from the database. This would save a considerable amount of memory: instead of loading the whole book all at once, you can specify exactly which parts of the file you want loaded.
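A rough sketch of the byte-offset idea in Python, leaving the database out and just keeping the (offset, length) pairs in a list. The function names, the UTF-8 encoding and the sample text are assumptions for the example.

```python
import os
import tempfile


def write_book(path, chapters):
    """Write chapters back to back and return a (byte offset, byte length) pair for each."""
    index = []
    with open(path, "wb") as f:
        for text in chapters:
            data = text.encode("utf-8")
            index.append((f.tell(), len(data)))  # where this chapter starts, how long it is
            f.write(data)
    return index


def read_chapter(path, offset, length):
    """Load a single chapter by seeking straight to it; the rest of the file is not read."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("utf-8")


# Small demonstration in a temporary folder.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "book.dat")
chapter_index = write_book(demo_path, ["First chapter. ", "Second chapter. ", "Third."])
second = read_chapter(demo_path, *chapter_index[1])
print(second)
```

The file is opened in binary mode on purpose: the offsets are counted in bytes, not characters, so they stay correct even when the text contains characters that take more than one byte.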

For small files this might be redundant, but for a very large book let's do some math:

1000 pages with 250 words per page
1 Char = 2 bytes
1 Word = 5 char average = 10 bytes
1 Page = 2500 bytes
1000 pages = 2,500,000 bytes of memory for just the raw words, and no formatting!
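The same estimate as a few lines of Python, in case you want to plug in other numbers. The 2 bytes per character is the assumption from the figures above (it matches 16-bit characters); the function name is made up.

```python
def book_size_bytes(pages, words_per_page=250, chars_per_word=5, bytes_per_char=2):
    """Rough size of the raw text: pages x words x characters x bytes per character."""
    return pages * words_per_page * chars_per_word * bytes_per_char


one_book = book_size_bytes(1000)
print(one_book)  # 2500000
```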

Ok, so modern PCs have thousands of times more memory than one book needs. But what if you wanted to have 1000 books in your program? Would you really want to load over 2 MB of data into memory each time you want to look at a new book? The effect on performance would be pretty noticeable, I think, especially if you needed to load more than one book at a time for whatever reason.

Here's what the DB might look like:

Table Books
-------------
BookID (PK)
ChapterCount
Length
FilePath
Chapter1Offset
Chapter2Offset
Chapter3Offset
Chapter4Offset (etc etc)

That would be about the minimum to make this architecture work. Obviously, you would probably want a somewhat better database design than this.
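One possible "bit better" version, sketched with Python's built-in sqlite3 module: instead of a fixed Chapter1Offset, Chapter2Offset, ... column per chapter, each chapter gets its own row, so a book can have any number of chapters. SQLite, the Chapters table, the Title column, the file path and the helper name chapter_location are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path instead to keep the data
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Books (
    BookID   INTEGER PRIMARY KEY,
    Title    TEXT NOT NULL,
    Length   INTEGER NOT NULL,      -- total size of the book file in bytes
    FilePath TEXT NOT NULL
);
CREATE TABLE Chapters (
    BookID        INTEGER NOT NULL REFERENCES Books(BookID),
    ChapterNumber INTEGER NOT NULL,
    Offset        INTEGER NOT NULL, -- byte offset of the chapter in the file
    Length        INTEGER NOT NULL, -- byte length of the chapter
    PRIMARY KEY (BookID, ChapterNumber)
);
""")

# Hypothetical sample data: one book with three chapters.
conn.execute("INSERT INTO Books VALUES (1, 'Example Book', 37, 'books/example.dat')")
conn.executemany(
    "INSERT INTO Chapters VALUES (?, ?, ?, ?)",
    [(1, 1, 0, 15), (1, 2, 15, 16), (1, 3, 31, 6)],
)
conn.commit()


def chapter_location(conn, book_id, chapter_number):
    """Return (file path, byte offset, byte length) for one chapter, or None."""
    return conn.execute(
        """SELECT b.FilePath, c.Offset, c.Length
           FROM Chapters AS c JOIN Books AS b ON b.BookID = c.BookID
           WHERE c.BookID = ? AND c.ChapterNumber = ?""",
        (book_id, chapter_number),
    ).fetchone()


chapter_count = conn.execute(
    "SELECT COUNT(*) FROM Chapters WHERE BookID = ?", (1,)
).fetchone()[0]
print(chapter_location(conn, 1, 2), chapter_count)
```

With this layout the ChapterCount column is no longer needed, since it can be counted from the Chapters rows.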

Well,
Thanks for the great advice! I think I'll go with skatamatic's suggestion.

Does anyone have a database basics tutorial they would like to recommend? Thanks for all of
the help, everyone!

- WolfShield
