Congratulations on creating a large database-driven site! It is beautiful and very useful. Now, if only you could figure out how to get people to find it. Time for search engine optimization!
SEO means making your website easy for the search engines to crawl and convincing them that the content is unique and valuable, and thus worthy of ranking very high in the search results.
Step 1 - Bot Accessibility
The search engines discover new websites and webpages by following links with their bots. These bots are not very smart and go for quantity, not quality, so you need to make it easy for them to crawl your pages. Make sure you have a robots.txt file that does not block Google, Yahoo or MSN. If you do NOT have a robots.txt file, they will assume they can visit your site.
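One way to double check that your robots.txt is not blocking the big bots is to run its contents through Python's standard urllib.robotparser module. This is only a minimal sketch: the sample rules, the test path and the helper name blocked_bots are made up for illustration, and the user-agent tokens (Googlebot, Slurp, msnbot) are my assumption of what Google, Yahoo and MSN identify as, so check each engine's documentation for the current names.

```python
from urllib.robotparser import RobotFileParser

def blocked_bots(robots_txt: str, bots: list[str], url: str = "/") -> list[str]:
    """Return the bots that the given robots.txt content forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]

# Hypothetical robots.txt that accidentally shuts out one search engine.
SAMPLE = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Assumed user-agent tokens for Google, Yahoo and MSN.
BOTS = ["Googlebot", "Slurp", "msnbot"]

print(blocked_bots(SAMPLE, BOTS, "/reviews/joes-pizza"))  # ['msnbot']
print(blocked_bots("", BOTS))  # [] (no rules means nobody is blocked)
```

If the first list is not empty for a page you want ranked, fix the robots.txt before doing anything else.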
Remember, the search engines follow links to discover new content. Interlink your million-page site to increase the chances of the search engines finding all of your pages. Adding links within article pages to other related pages is a good idea. Create a beautiful spider web of relevant and useful links for users, and the search engine bots will have a great time.
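Since bots only find pages by following links, it can help to check which pages nobody links to. The sketch below assumes you can export your internal links as a simple mapping of page to linked pages (the LINKS data and the function name are hypothetical), and walks the site from the home page the way a crawler would.

```python
from collections import deque

def unreachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Breadth-first walk from start; return pages that no chain of links reaches."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every page we know about, whether it links out or is only linked to.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return all_pages - seen

# Hypothetical link map: page -> pages it links to.
LINKS = {
    "/": ["/ny", "/ca"],
    "/ny": ["/ny/joes-pizza"],
    "/ny/joes-pizza": ["/ny"],
    "/ca": [],
    "/ca/orphan-diner": ["/ca"],  # links out, but nothing links in
}

print(unreachable_pages(LINKS, "/"))  # {'/ca/orphan-diner'}
```

Any page in the result needs at least one link from a page the bots can already reach.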
Step 2 - Uniqueness
Great, you now have succeeded in getting the search engine bots to visit your million pages. Unless these pages are unique, the search engines will filter them all out of their index. I am not going to lie. It is hard to make a million pages on a site unique. Here are some ideas.
Make each page title as unique as possible. Instead of using an identical page title, add a few words from the database to the page title. If you are running a review site, instead of every page saying "ABC Local Reviews" have it say "Joe's Pizza Shop, New York, NY By ABC Local Reviews". By playing around with your data fields you should be able to maintain a brand presence in your page title, better describe the content on the page and make it unique.
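As a rough illustration of the title idea, here is a small sketch that builds the title from database fields and flags any titles that still collide. The field names (name, city, state) and the sample rows are assumptions for the example, not a real schema.

```python
from collections import Counter

def page_title(name: str, city: str, state: str, brand: str = "ABC Local Reviews") -> str:
    """Put the specific listing first and keep the brand at the end."""
    return f"{name}, {city}, {state} By {brand}"

def duplicate_titles(rows: list[dict[str, str]]) -> list[str]:
    """Return titles that more than one row would produce."""
    counts = Counter(page_title(r["name"], r["city"], r["state"]) for r in rows)
    return [title for title, n in counts.items() if n > 1]

# Hypothetical rows pulled from the listings table.
ROWS = [
    {"name": "Joe's Pizza Shop", "city": "New York", "state": "NY"},
    {"name": "Joe's Pizza Shop", "city": "Albany", "state": "NY"},
    {"name": "Joe's Pizza Shop", "city": "New York", "state": "NY"},
]

print(page_title("Joe's Pizza Shop", "New York", "NY"))
# Joe's Pizza Shop, New York, NY By ABC Local Reviews
print(duplicate_titles(ROWS))
# ["Joe's Pizza Shop, New York, NY By ABC Local Reviews"]
```

If duplicates show up, add another field (a street or neighborhood, for example) until each title is distinct.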
Make the actual content (by this I mean text) on the page unique. You need several lines, if not full paragraphs, of unique content. Hopefully you have this already. If you don't, think about allowing users to submit reviews or comments. User-generated content is not known for being high quality and should be monitored, but it might be an affordable way to generate unique content.
Minimize the duplicate code on each page. Search engines say they ignore the code and only index the visible text. My personal experiences make me think differently. If your page is 20kb and only 1kb is unique content, then 95% of your page is duplicate. Slim down your HTML code. It makes the page load faster, saves bandwidth and, in my opinion, also helps increase the uniqueness of a page.
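To get a feel for how heavy your markup is, you can compare the visible text on a page with the total size of the HTML. The sketch below uses Python's standard html.parser and treats visible text as a rough stand-in for unique content, which is an assumption on my part: template text such as menus also counts as visible. The sample page and function names are made up.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def visible_text(html: str) -> str:
    """Return the page text with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())

def visible_text_ratio(html: str) -> float:
    """Bytes of visible text divided by bytes of the whole page."""
    total = len(html.encode("utf-8"))
    return len(visible_text(html).encode("utf-8")) / total if total else 0.0

# Hypothetical page: a little text wrapped in markup and styling.
SAMPLE_HTML = (
    "<html><head><style>p { color: red; }</style></head>"
    "<body><p>Joe's Pizza Shop</p></body></html>"
)

print(visible_text(SAMPLE_HTML))  # Joe's Pizza Shop
print(f"{visible_text_ratio(SAMPLE_HTML):.0%} of the page is visible text")
```

By the same arithmetic, the 20kb page with 1kb of unique content comes out at 1/20 = 5% unique and 95% duplicate.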
P.S. In case you are wondering, meta tags have very little value. Some meta tags have ZERO value. Most websites will benefit much, much more if the time is spent on visible content instead of on meta tags.
Step 3 - Look at me, I'm valuable
You are doing super. You now have the search engines crawling your entire site, not filtering any pages out (or at least not filtering any pages you care about). All of that is worthless unless you can get your homepage and all your subpages to rank in the search engines. How do you help convince the search engines that your pages are valuable? Links, links and some more links.
Get other websites to link to your website with relevant anchor text. Make sure to spread the link love around your entire site. A common mistake is to have all of your links only go to your home page. That will tell the search engines only your home page is valuable. Point some inbound links from external websites to your deep categories.
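If you keep a list of the inbound links you have earned, a quick tally shows whether they all pile up on the home page. This is just a sketch: the list of target URLs is invented, and where you get that list (a backlink report, your server logs) is up to you.

```python
from collections import Counter
from urllib.parse import urlparse

def inbound_link_share(targets: list[str]) -> dict[str, float]:
    """Map each linked-to path to its share of all inbound links."""
    # A bare domain has an empty path, so treat it as the home page "/".
    counts = Counter(urlparse(t).path or "/" for t in targets)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}

# Hypothetical inbound link targets on an example domain.
TARGETS = [
    "https://example.com/",
    "https://example.com",
    "https://example.com/ny/joes-pizza-shop",
    "https://example.com/",
]

print(inbound_link_share(TARGETS))
# {'/': 0.75, '/ny/joes-pizza-shop': 0.25}
```

A home page share close to 100% is the common mistake described above, and a sign that your next links should point at deep categories.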
Remember that in the crawling step I suggested using your internal link structure to interlink your pages with relevant text. This also helps show that your pages are valuable. Search engines do not like run of site (ROS) links a lot and will devalue them to a certain extent. This does not mean ROS links are worthless. I am just saying it is better to have some internal links embedded within your content. Make sure the anchor text is targeted but not 100% identical.
Step 4 - Lean Back, Open a Beer
Yeah, right. You really think it is that easy? You need to constantly develop links to prove to the search engines your pages are valuable. Also, since you just opened your site and made it super easy for the search engines to crawl, you now need to protect your content from data miners.
Don't worry if you do not have 100% of your site indexed by the search engines. Focus on the pages that you care about the most. When time is available, work on the secondary pages. Good luck!