•
•
•
•
What is DaniWeb IT Discussion Community?
You're currently browsing the Database Design section within the Web Development category of DaniWeb, a massive community of 402,699 software developers, web developers, Internet marketers, and tech gurus who are all enthusiastic about making contacts, networking, and learning from each other. In fact, there are 2,412 IT professionals currently interacting right now! Registration is free, only takes a minute and lets you enjoy all of the interactive features of the site.
Please support our Database Design advertiser: Programming Forums
Views: 839 | Replies: 5
![]() |
•
•
Join Date: Oct 2007
Posts: 7
Reputation:
Rep Power: 0
Solved Threads: 0
Hi,
Firstly, I'm not a programmer nor a web-developer, but rather a statistician. I am doing equity-related (shares / stocks) research and I need to place my data in a database (MySQL), since it is too big to work with in a spreadsheet.
I have weekly closing price, volume, market cap, PE etc data for about 900 companies over a 16 year period. Obvioulsy, I can't throw it all into one table, so I was considering a table per company with year and week defining the primary keys. But a join accross 900 table in order to find the 20 companies with the lowest market cap in the first week of 1990 would be very cumbersome.
So I'd like to know from anyone here how they would go about designing this database.
Firstly, I'm not a programmer nor a web-developer, but rather a statistician. I am doing equity-related (shares / stocks) research and I need to place my data in a database (MySQL), since it is too big to work with in a spreadsheet.
I have weekly closing price, volume, market cap, PE etc data for about 900 companies over a 16 year period. Obvioulsy, I can't throw it all into one table, so I was considering a table per company with year and week defining the primary keys. But a join accross 900 table in order to find the 20 companies with the lowest market cap in the first week of 1990 would be very cumbersome.
So I'd like to know from anyone here how they would go about designing this database.
•
•
Join Date: Sep 2007
Location: North Bay Ontario
Posts: 174
Reputation:
Rep Power: 1
Solved Threads: 20
First I suggest you sit down with a good book on relational database design ('Database Design For Mere Mortals' by Michael Hernandez and 'MySQL' by Paul DuBois come to mind). We can help you help yourself, but can't do your work for you.
Amer Neely - Web Mechanic
"Others make web sites. We make web sites work!"
"Others make web sites. We make web sites work!"
•
•
Join Date: Oct 2007
Posts: 7
Reputation:
Rep Power: 0
Solved Threads: 0
Oh, I fully understand that you can't do the work for me and I'll have a look to see if I can't get hold of those books.
Was thinking (read "drawing diagrams") that I should maybe do a table per year. It will still result in a large number of records (less than 47 000), but fewer records than if I dumped all into one table. Making joins across 16 tables (one per year) should be easier than doing it across 900!
My big concern is the type of performance I'll get if I have a table with so 47 000 records... Is that big? Is too big?
Anyway, I'll experiment and report back on my results (might take a few days). Any further pointers are more than welcome in the mean time.
Was thinking (read "drawing diagrams") that I should maybe do a table per year. It will still result in a large number of records (less than 47 000), but fewer records than if I dumped all into one table. Making joins across 16 tables (one per year) should be easier than doing it across 900!
My big concern is the type of performance I'll get if I have a table with so 47 000 records... Is that big? Is too big?
Anyway, I'll experiment and report back on my results (might take a few days). Any further pointers are more than welcome in the mean time.
•
•
Join Date: Sep 2007
Location: North Bay Ontario
Posts: 174
Reputation:
Rep Power: 1
Solved Threads: 20
47000 records is trivial - MySQL can handle billions of records, and many companies use it for large databases and heavy traffic.
I'm not sure what kinds of queries you will be making, but one point to consider in any relational database design, is that it is driven by those anticipated queries. Maybe post a few pseudo-queries that you expect to run into, and that might get some suggestions.
You mentioned some of your data already - 'closing price, volume, market cap, PE '. And 900 companies is also trivial. I would look at holding your various points of interests in one table, along with a company ID. Then you would need a bridge table to tie that with company data in a separate table.
Or you could put all your various points of interest in separate tables (how many of these are you tracking?). Again, bridge tables would be needed to tie these to the various companies.
It depends on how you are going to be using the data.
Indexing can greatly speed up queries too.
I'm not sure what kinds of queries you will be making, but one point to consider in any relational database design, is that it is driven by those anticipated queries. Maybe post a few pseudo-queries that you expect to run into, and that might get some suggestions.
You mentioned some of your data already - 'closing price, volume, market cap, PE '. And 900 companies is also trivial. I would look at holding your various points of interests in one table, along with a company ID. Then you would need a bridge table to tie that with company data in a separate table.
Or you could put all your various points of interest in separate tables (how many of these are you tracking?). Again, bridge tables would be needed to tie these to the various companies.
It depends on how you are going to be using the data.
Indexing can greatly speed up queries too.
Amer Neely - Web Mechanic
"Others make web sites. We make web sites work!"
"Others make web sites. We make web sites work!"
•
•
Join Date: Apr 2008
Posts: 295
Reputation:
Rep Power: 1
Solved Threads: 41
hi,
storing some data (physically) in 900 different tables seems to be a rather bad idea, because you would have union them firstly to draw any useful information from them. Aside from the fact that handling 900 different tables when adding actual data every week isn't that easy-going. I would like to suggest you that give a meaningful example with concrete data, something like this:
week, clPrice, volume, marketC, PE(?), Company, ...
1, 100, 1000, 30%, ??, INTG
2, ....
...
10, ....
You should add further missing but important attributes to that big table. Don't worry about decomposition into smaller tables; this 2nd step is quite simple if one has broadly understand your data. To speed up processing your data you can add some indexes.
btw, 900x52x16 = 748800 records isn't that much data for any RDBMS, even MS Access will cope with this task.
brs, cliff
storing some data (physically) in 900 different tables seems to be a rather bad idea, because you would have union them firstly to draw any useful information from them. Aside from the fact that handling 900 different tables when adding actual data every week isn't that easy-going. I would like to suggest you that give a meaningful example with concrete data, something like this:
week, clPrice, volume, marketC, PE(?), Company, ...
1, 100, 1000, 30%, ??, INTG
2, ....
...
10, ....
You should add further missing but important attributes to that big table. Don't worry about decomposition into smaller tables; this 2nd step is quite simple if one has broadly understand your data. To speed up processing your data you can add some indexes.
btw, 900x52x16 = 748800 records isn't that much data for any RDBMS, even MS Access will cope with this task.
brs, cliff
•
•
Join Date: Oct 2007
Posts: 7
Reputation:
Rep Power: 0
Solved Threads: 0
Hi,
Thanks trudge for all your input. I couldn't get my hands on those specific books, but stumbled across some tutorials on the net that got me started in the right direction.
While my database isn't exactly 2NF it is pretty close, with slight amount of redundancy in order to facilitate the type of queries I'll do. Taking your advise of keeping the type of queries I'd make to the database really helped in the design process.
So, the end result (after much python scripting in order to populate the tables) is quite good. I have a table containing the share code (primary key), full name and status (delisted or not) for all the companies in my study. Then I have a table for each year which contains the share code, week, close, volume, cap, pe, dy and year for each company. Share code and week form the primary key. The year column, though somewhat redundant, is there to ease joins across multiple year tables.
Tesuji will notice that the above design looks quite similar to his suggestion. I appreciate his input too, but separating the company name out of the year tables seemed like the logical thing to do. I have also gotten my hands on some fundamental data, which is yearly, and separating the company details out of the yearly tables helps when adding in that kind of data.
Thanks again for all the help!
Thanks trudge for all your input. I couldn't get my hands on those specific books, but stumbled across some tutorials on the net that got me started in the right direction.
While my database isn't exactly 2NF it is pretty close, with slight amount of redundancy in order to facilitate the type of queries I'll do. Taking your advise of keeping the type of queries I'd make to the database really helped in the design process.
So, the end result (after much python scripting in order to populate the tables) is quite good. I have a table containing the share code (primary key), full name and status (delisted or not) for all the companies in my study. Then I have a table for each year which contains the share code, week, close, volume, cap, pe, dy and year for each company. Share code and week form the primary key. The year column, though somewhat redundant, is there to ease joins across multiple year tables.
Tesuji will notice that the above design looks quite similar to his suggestion. I appreciate his input too, but separating the company name out of the year tables seemed like the logical thing to do. I have also gotten my hands on some fundamental data, which is yearly, and separating the company details out of the yearly tables helps when adding in that kind of data.
Thanks again for all the help!
![]() |
•
•
•
•
•
•
•
•
DaniWeb Database Design Marketplace
•
•
•
•
Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
•
•
•
•
advice age amd avatar backup blue gene daniweb data database design dos economy encryption energy enterprise forensic hardware help ibm ibm. news information intel ibm it keylogger linux medicine memory microsoft news open source openoffice password pc ps3 recession red hat russia security server strength sun supercomputer supercomputing survey technology trends ubuntu web working x86
- Simple solution to database (C++)
- I need some advice (Database Design)
- Browser based game, database design (Database Design)
- Hi, need advice on sql statement and table structure (MySQL)
- database problem (C#)
- Database Design Advice (MySQL)
- Database Advice (Database Design)
- Database Design (PHP)
Other Threads in the Database Design Forum
- Previous Thread: ERD for Warehouse?
- Next Thread: ERD Diagram with inheritance


Linear Mode