This is more of a general approach question. I have an old ORM-like framework in this application. Basically, when it needs to load a set of objects/records, it triggers a new query for every child object in the result to fetch its data. Since the nesting can get pretty deep, that means a lot of queries to load everything. How would you approach making this more efficient?

I've left out a lot of detail to avoid blinding you with what I have (since that is not really important here).

(I'm not looking for framework suggestions.)

Question one is always: Does the loading really need to be more efficient? In other words, is there a measurable and significant wait/delay when you're loading things? If not, or if the delay is acceptably short, I wouldn't worry about it.

If there's a significant/unacceptable delay, then question two is: Where in code is that time being spent? Use a performance profiler; the results may be surprising (anecdotal evidence available if you want).

If profiling indicates the queries themselves, question three is: Do you need the data all at once, or are you using it piece by piece?

If you're using the data piece by piece, lazy loading is a decent option; it spreads the loading cost over time, and it will be far outweighed by the time spent doing whatever it is you do with the data. Also, this will avoid loading data that you don't actually use, if any.
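A minimal sketch of the lazy-loading idea, in Python with hypothetical names (`Order`, `fetch_lines` are invented for illustration): the child query runs only on first access, and the result is cached afterwards.

```python
# Lazy loading sketch: the child query runs only on first access,
# and later accesses reuse the cached result.
class Order:
    def __init__(self, order_id, fetch_lines):
        self.order_id = order_id
        self._fetch_lines = fetch_lines   # callable that would hit the database
        self._lines = None                # not loaded yet

    @property
    def lines(self):
        if self._lines is None:           # first access triggers the query
            self._lines = self._fetch_lines(self.order_id)
        return self._lines                # subsequent accesses are free

# Usage: no query happens until .lines is read.
order = Order(42, lambda oid: [("widget", 3), ("gadget", 1)])
print(order.lines)   # the (stubbed) query runs here, once
```

If the calling code never touches `.lines`, the query never runs at all, which is the "avoid loading data you don't use" part.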

If you need all of the data together, then it really comes down to the overhead of making the queries. Performance profiling will give you much better information, but off the top of my head, I can think of two things to do:

  1. Reuse the database connection for multiple queries
  2. Combine sub-queries for multiple objects

Does the loading really need to be more efficient?

Yes, it does.

Where in code is that time being spent?

As stated, in getting the child objects recursively one-by-one.

lazy loading is a decent option

I know, but I need all data in this case so it will only move the delay down the line.

Performance profiling will give you much better information

Not what I am after, but what I stated is the bottle-neck anyway.

Reuse the database connection for multiple queries

Already in place.

Combine sub-queries for multiple objects

How would I do that with deep nested child objects?

Legacy in-house. It doesn't matter; I'm looking for the logic, not a new ORM ;)

It does matter which ORM it is - is it hand-made? Are you able to add to it, does it have any extension points, or is it set in stone?

EF uses .Include with deferred execution, and that works well - maybe you can emulate it. It depends on the ORM, though!

are you able to add

It's not set in stone, that's why I'm looking for logic. I might be able to put the right method in.

EF uses .Include with deferred execution

Isn't that lazy loading? I'd prefer a method that doesn't use that (I already have it). I don't want to load the related records on demand - I need them all, via a more efficient method.

No, what I mentioned was eager loading - i.e. exactly what you need.

.Include tells the ORM what to include - i.e. which joins to generate in the SQL sent to SQL Server. It only does one database hit, and it does so using deferred execution - i.e. the query runs the first time the data is enumerated.

Oh, sorry. Saw deferred, not Include. I'll have a look.

All deferred means is that you can keep adding to the query - extra WHERE clauses and so on - but not until the query is enumerated will EF create the SQL and execute it against the database in one go.
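A rough sketch of that deferred-execution idea, in Python with invented names (this is not EF's actual implementation): each call merely records intent, and the SQL is only built and executed when the query is enumerated.

```python
# Deferred execution sketch: .where()/.include() just record what was
# asked for; SQL is generated and run once, at enumeration time.
class Query:
    def __init__(self, table, run_sql):
        self.table = table
        self.run_sql = run_sql      # callable that executes SQL, e.g. a DB driver
        self.wheres = []
        self.includes = []

    def where(self, clause):
        self.wheres.append(clause)  # just remember the clause
        return self

    def include(self, child_table):
        self.includes.append(child_table)  # remember requested joins
        return self

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        for child in self.includes:
            # naive naming convention, purely illustrative
            sql += f" JOIN {child} ON {child}.{self.table}Id = {self.table}.Id"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        return sql

    def __iter__(self):             # enumeration is the trigger: one DB hit
        return iter(self.run_sql(self.to_sql()))

# Usage: nothing runs while the query is being composed.
q = Query("Customer", run_sql=lambda sql: [sql]).include("Invoice").where("Customer.Name = 'Acme'")
results = list(q)  # only now is the SQL generated and executed
```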

Do you know of any explanations that describe how to build that one resulting query? I'm browsing/debugging the EF source, but a theoretical approach would be nice.

Does the loading really need to be more efficient?

Yes, it does.

Where in code is that time being spent?

As stated, in getting the child objects recursively one-by-one.

Performance profiling will give you much better information

Not what I am after, but what I stated is the bottle-neck anyway.

Have you actually profiled the code? It wasn't clear from your post, nor from your response.

I don't mean to pick at this; profiling is something everyone who has a performance problem should do, even if it's only to confirm what they already know about the code: nothing beats cold, hard evidence to justify spending time (== money) on this kind of improvement.

If you haven't, it might be worthwhile... I've seen some abysmally-performing .NET data access code that spent 95% of its time buried deep in the internals of DataReader digging up table metadata to determine column types, rather than on the query itself.

lazy loading is a decent option

I know, but I need all data in this case so it will only move the delay down the line.

Fair enough.

Reuse the database connection for multiple queries

Already in place.

Yay!

Combine sub-queries for multiple objects

How would I do that with deep nested child objects?

You end up with one query for each nesting level. In SQL, it ends up looking something like this:

SELECT
    ParentKey,
    ParentFields...
FROM ParentTable

SELECT
    p.ParentKey,
    c.ChildKey,
    c.ChildFields...
FROM ChildTable c
JOIN ParentTable p
    ON c.ParentKey = p.ParentKey

...modulo your data model.

If you need WHERE clauses, I'd store each SELECT statement's result into a table variable or something, then join on that, which beats filtering every table in every query.

...and you'll have to split up the child records by parent key in code, of course; that cost has to be paid somewhere.
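That splitting step can be sketched like this (Python, with made-up row shapes): group the joined child rows by parent key, then attach each group while materializing the parents.

```python
from collections import defaultdict

# Sketch of the in-code splitting step: given parent rows and joined
# child rows from the two queries, group the children by parent key.
parent_rows = [(1, "Acme"), (2, "Globex")]                 # (ParentKey, fields...)
child_rows = [(1, 10, "a"), (1, 11, "b"), (2, 12, "c")]    # (ParentKey, ChildKey, fields...)

children_by_parent = defaultdict(list)
for row in child_rows:
    children_by_parent[row[0]].append(row[1:])             # index by ParentKey

# Materialize each parent with its own slice of the child result set.
parents = [
    {"key": key, "name": name, "children": children_by_parent[key]}
    for key, name in parent_rows
]
```

One pass over each result set, so the in-code cost is linear in the number of rows returned.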

Have you actually profiled the code? It wasn't clear from your post, nor from your response.

I thought this "Not what I am after, but what I stated is the bottle-neck anyway" was an obvious yes ;)

You end up with one query for each nesting level.

EF does it in one query, wondering how to replicate that.

I know, thanks - it's just tricky to extract what you need from that.

Load-on-demand is what you are looking for. I suggest reading up on LINQ to SQL, which supports these sorts of scenarios out of the box.

Perhaps you should read the entire thread. That's not what I'm asking.

Let me have a think - working today but will post some thoughts at the weekend.

At a high level you need to provide a mechanism for client code to specify what it wants loaded.

EF with LINQ does this as follows:

var customers = context.Customers.ToList();

var customers = context.Customers.Include("Invoices").ToList();

var customers = context.Customers.Include("Invoices.InvoiceLines").ToList();

So the first says load just customers, the second says load customers and their invoices, and the last says load customers, invoices, and invoice lines.

The above usually uses lambdas rather than strings so:

var customers = context.Customers.Include(c => c.Invoices).ToList();

and so on.

So in your code you need to examine the client's specification and, when building your SQL, add joins as appropriate to get the data you need. You then need to parse the results from SQL and create OO objects from the flat data.

That is really at a very high level. Of course it is more tricky in practice, but that's the general idea, I think.
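As an illustration of that idea - a Python sketch with an invented naming convention for foreign keys; a real ORM would read these from its mappings - a dotted Include-style path can be turned into one join per nesting level:

```python
# Turn a dotted include path (e.g. "Invoices.InvoiceLines") into one
# JOIN clause per nesting level. Table/column names are illustrative:
# we assume each child table has a "<ParentTable>Id" foreign key.
def joins_for_include(root, include_path):
    joins, parent = [], root
    for child in include_path.split("."):
        joins.append(f"LEFT JOIN {child} ON {child}.{parent}Id = {parent}.Id")
        parent = child            # next level joins against this one
    return joins

# Usage: compose the final statement from the requested path.
sql = "SELECT * FROM Customers " + " ".join(joins_for_include("Customers", "Invoices.InvoiceLines"))
```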

Agreed. Using DataAnnotations and reflection I already know how to determine what is a field/FK to a table. Assuming a simple nesting level of company and contact, I can build a straightforward query. I'm not sure how to proceed if, say, the company also has a preferred transporter (also a company) in it. I know all my fields need prefixing, but how do I ensure they stay unique?

I'm happy to look at scenarios with you. Can you create some actual tables, though, and show me the SQL to create them? Then we can be sure we are looking at the right thing.

I will, but that will probably be next week. I don't have access to a computer at the moment.

Here's a simplified class hierarchy, but enough to get started I think.

// base class used for identification (reflection)
class BaseClass
{
    public int Id { get; set; } // PK, always
}

class Contact : BaseClass
{
    public string Name { get; set; }
}

class Company : BaseClass
{
    public string Name { get; set; }
    public List<Contact> Contacts { get; set; }
    public Company PreferredTransporter { get; set; }
}

SQL:

CREATE TABLE Contact (
    Id INT IDENTITY(1,1) NOT NULL,
    Name VARCHAR(100) NOT NULL,
    CompanyId INT NOT NULL)

CREATE TABLE Company (
    Id INT IDENTITY(1,1) NOT NULL,
    Name VARCHAR(100) NOT NULL,
    PreferredTransporterId INT NULL)

So say I want to build a query to get all companies with their attached contacts and transporters. A transporter will never have a transporter set, so no infinite recursion (avoided by business logic and a check constraint).
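For what it's worth, here is one way the single-query version of exactly this scenario could look - a Python/SQLite sketch with made-up sample data ('Acme', 'FastFreight', etc. are invented), where prefixed column aliases (c_, t_, ct_) keep the two occurrences of Company unique:

```python
import sqlite3

# Single-query sketch for the schema above, using SQLite for brevity.
# Prefixed aliases ("c_", "t_", "ct_") let Company appear twice
# (once as itself, once as the preferred transporter) without clashes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (Id INTEGER PRIMARY KEY, Name TEXT, PreferredTransporterId INTEGER);
    CREATE TABLE Contact (Id INTEGER PRIMARY KEY, Name TEXT, CompanyId INTEGER);
    INSERT INTO Company VALUES (1, 'Acme', 2), (2, 'FastFreight', NULL);
    INSERT INTO Contact VALUES (10, 'Alice', 1), (11, 'Bob', 1);
""")

rows = conn.execute("""
    SELECT c.Id   AS c_Id,  c.Name  AS c_Name,
           t.Id   AS t_Id,  t.Name  AS t_Name,
           ct.Id  AS ct_Id, ct.Name AS ct_Name
    FROM Company c
    LEFT JOIN Company t  ON t.Id = c.PreferredTransporterId  -- self-join, aliased
    LEFT JOIN Contact ct ON ct.CompanyId = c.Id
    ORDER BY c.Id, ct.Id
""").fetchall()

# Fold the flat, denormalized rows back into nested objects by company id.
companies = {}
for c_id, c_name, t_id, t_name, ct_id, ct_name in rows:
    comp = companies.setdefault(c_id, {"name": c_name, "transporter": t_name, "contacts": []})
    if ct_id is not None:        # LEFT JOIN yields NULLs when there are no contacts
        comp["contacts"].append(ct_name)
```

Note the transporter fields repeat on every contact row for the same company; that duplication is the usual price of the one-query approach.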

Hi

Sorry, meant to look at this yesterday but got tied up - will have a tinker tonight.

No rush. If it hadn't started raining I wouldn't have posted anything yesterday ;)

Have you actually profiled the code? It wasn't clear from your post, nor from your response.

I thought this "Not what I am after, but what I stated is the bottle-neck anyway" was an obvious yes ;)

Just making sure... it sounded like you probably had, but experience has taught me to verify in the absence of an explicit statement, especially regarding performance. I've encountered too many people who "know" what the problem is and don't see the point of actually measuring anything.

I'll leave it alone, but I notice you still haven't given an explicit "yes" :)

You end up with one query for each nesting level.

EF does it in one query, wondering how to replicate that.

Everything I can think of offhand ends up with a giant denormalized result, which could get unwieldy if there's a lot of nesting and a lot of data.

You could set up a simple object hierarchy and use a database profiler to see what EF comes up with.

Sorry, really busy on my new contract. Will definitely have a look at the weekend at the latest!

Really sorry - still uber busy!

No need to apologize, it is not urgent. I'm off next week anyway, and by then they might have seen the light and switched to EF for new projects.