Hi, everyone.
I'm stuck trying to solve a problem and need your help.

We run a website for online e-book reading that holds around 60,000 books and currently has 5 million registered users. We use MySQL to store all the data.

Now there's a requirement from the company's operating team: they want a small amount of data for each book (a top-20 list). It sounds easy, but it isn't, because this data can't be acquired with simple queries on the database — or it can be, but the efficiency would by no means be acceptable. Let me describe the problem in detail: a book on the website can be added to a virtual folder called a "book list" owned by a user (a user may have zero or more book lists), and a book list can hold as many books as the user adds to it. We also have a web page containing some information (name, author, etc.) for each book.

Let's say we have a book called "A", and this book was added to 3 book lists owned by 3 users: BookList1, BookList2, and BookList3. These book lists may hold other books as well; they might look like this: BookList1(A, B, C), BookList2(A, B, D), BookList3(A, B, D, H, G). To describe the problem thoroughly, I'll also add a BookList4(H, G, K) — each letter represents a book. Now for book A (which appears in 3 book lists, not counting BookList4), I would like to produce the list "B, D, C, H, G". Here is where this list comes from: it holds the books collected from the 3 book lists that contain A, sorted in descending order by how often they appear across those lists (BookList4 is not counted because it does not contain A). So, for each book, I would like to build such a top-20 list (which may hold fewer than 20 books).

My question is: how can this be done efficiently? I don't think plain SQL is an option here.

P.S.: a book may be added to as many as 80,000 book lists, and a book list can hold up to 2,000 books.
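At that scale, one common approach (an assumption on my part, not something from the thread) is to precompute all the top-20 lists in a periodic offline batch job rather than answering per request: make a single pass over every book list, accumulate pairwise co-occurrence counts, then take a top-`limit` cut per book. A hedged sketch, with all names illustrative:

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_top_lists(book_lists, limit=20):
    """Offline batch job: one pass over all book lists, accumulating
    pairwise co-occurrence counts, then a top-`limit` cut per book.
    The cost is O(sum over lists of |list|^2) pairs, so with lists of
    up to 2,000 books this is meant to run periodically (e.g. nightly),
    not on every page view."""
    cooccur = defaultdict(Counter)
    for books in book_lists.values():
        for a, b in combinations(sorted(books), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return {book: [b for b, _ in counts.most_common(limit)]
            for book, counts in cooccur.items()}

# Toy data from the example earlier in the thread.
tops = build_top_lists({
    "BookList1": {"A", "B", "C"},
    "BookList2": {"A", "B", "D"},
    "BookList3": {"A", "B", "D", "H", "G"},
    "BookList4": {"H", "G", "K"},
})
print(tops["A"])  # B first, then D, then the books seen once
```

The precomputed results could then be stored in a plain table keyed by book id, so serving the web page is a single indexed lookup. For very popular books (the 80,000-list case), the per-pair counting could be sharded across workers by book list, since the counters merge by simple addition.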

Kevin Tse

If you have that large a userbase, you should be able to hire professional consultants rather than having to ask here.

They will likely tell you that the amount of data is insignificant, that your database design sucks if you can't get it out quickly, and how to actually go about achieving your goals.

Given the minuscule amount of actual information you provide here, however, it's impossible to say anything other than: get professional help.

Well, if you ever think about it, you will learn that a database like MySQL can't handle this *efficiently*, and I don't think our database design sucks.

I was asked to do what was required, and after thinking about the problem for quite a while, I was not able to hammer out a solution, so I'm posting here hoping that someone has encountered the same or a similar problem and would like to share his/her experience. I can't just reply to the operating team that this is unachievable without ever trying (asking here).

Kevin Tse

If you already know your database can't handle it, the first step would be to step up to a real database like Oracle...
