Hi all,

I've been struggling with something for a number of weeks now and I just can't seem to solve it.

I've got a bit of code that looks as follows:

#include <iostream>

using std::cerr;
using std::endl;

class classBBase;  // forward declaration, since classA holds a pointer to it

class classA
{
public:
    classA( );

    ~classA( );

    classBBase* pointerToClassBBase;

protected:

private:

};

class classBBase
{
public:
    classBBase( );

    ~classBBase( );

protected:

private:
};

class classB : public classBBase
{
public:
    double testValue;

protected:

private:

};

namespace predefined_classB_objects
{

enum PredefinedClassBObjects
{
    predefinedClassBObject1,
    predefinedClassBObject2
};

classB createPredefinedClassBObject( PredefinedClassBObjects predefinedClassBObjectName )
{
    classB predefinedClassBObject_;

    switch( predefinedClassBObjectName )
    {
    case predefinedClassBObject1:
        predefinedClassBObject_.testValue = 1.0;
        break;

    case predefinedClassBObject2:
        predefinedClassBObject_.testValue = 2.0;
        break;

    default:
        cerr << "Case is not defined" << endl;
    };

    return predefinedClassBObject_;
}

}

namespace predefined_classA_objects
{

enum PredefinedClassAObjects
{
    predefinedClassAObject1,
    predefinedClassAObject2
};

classA createPredefinedClassAObject( PredefinedClassAObjects predefinedClassAObjectName )
{
    classA predefinedClassAObject_;

    classB predefinedClassBObject_;

    switch( predefinedClassAObjectName )
    {
    case predefinedClassAObject1:
        predefinedClassBObject_ = predefined_classB_objects::createPredefinedClassBObject( predefined_classB_objects::predefinedClassBObject1 );
        break;

    case predefinedClassAObject2:
        predefinedClassBObject_ = predefined_classB_objects::createPredefinedClassBObject( predefined_classB_objects::predefinedClassBObject2 );
        break;        

    default:
        cerr << "Case is not defined" << endl;
    };

    predefinedClassAObject_.pointerToClassBBase = &predefinedClassBObject_;

    return predefinedClassAObject_;
}

}

int main( )
{
    classA predefinedClassAObject;

    predefinedClassAObject = predefined_classA_objects::createPredefinedClassAObject( predefined_classA_objects::predefinedClassAObject1 );

    std::cout << predefinedClassAObject.pointerToClassBBase->testValue << std::endl;

    return 0;
}

If I run this code, it produces the right output and on the screen I see the value 1.0 as expected.

The problem however is when I then pass the predefinedClassAObject to another class which has a function that is iteratively called, which uses the testValue. When I print the value of testValue from within that function it's 0 to machine precision (my last run gave me 1.4822e-323).

So I'm confused as to where it's going wrong. There is more detail involved in how I pass the predefinedClassAObject to the other class and how I use it. Before getting into that though, I wanted to just ask if there is anything fundamentally wrong with what I've put in code here.

I'd really appreciate feedback, as I just can't seem to crack this and figure out where things are going wrong.

I can provide more details if necessary.

Thanks a lot in advance!

Cheers,

Kartik


classB predefinedClassBObject_; // This is a local variable, so when you leave the scope it no longer exists, and hence you won't be able to access it. This is definitely causing the crash/garbage value, as you are accessing invalid memory.

Second thing is:

predefinedClassAObject.pointerToClassBBase->testValue; // I am not sure whether this will succeed all the time, as testValue is not part of the base class; you could use a virtual member function to access it. On second thought, it may work as well.

>predefinedClassAObject_.pointerToClassBBase = &predefinedClassBObject_; predefinedClassBObject is destroyed when the function returns. Your pointer is dangling because it was pointing to a local object.

classB predefinedClassBObject_; // This is a local variable, so when you leave the scope it no longer exists, and hence you won't be able to access it.

Yes, I originally thought that was the case, but when I then coded this and ran the createPredefinedClassAObject() function, it yielded the right result. So I figured that I was in fact returning the classB object as well, by virtue of returning the classA object that contains a pointer to the classB object.

This is my primary problem, I guess: how do I go about solving this so that the user interface remains the same? That is, the user should only have to provide the input argument that determines which case of the switch statement is chosen, and the classA object returned should come with preloaded settings. This needs to hold because other derived classes of classBBase will be added in the future, and I want the user interface to stay unchanged; only createPredefinedClassAObject() should change.

To place this in context, I'm applying this to a library that should be able to propagate satellite orbits. So in that application, classA is CelestialBody, with the create function therefore returning predefined celestial bodies with preloaded settings. An example of a preloaded setting, i.e., classB, is the GravityFieldModel. So in other words, the create function creates a predefined Earth, for instance, which comes preloaded with a predefined gravity field model. In future, other preloaded settings will be added, such as a predefined atmosphere, predefined magnetic field, etc. Hope that clarifies why I'm trying to achieve what I am.

Second thing is:

predefinedClassAObject.pointerToClassBBase->testValue; // I am not sure whether this will succeed all the time, as testValue is not part of the base class; you could use a virtual member function to access it. On second thought, it may work as well.

Yes, this was a mistake on my part, testValue also belongs to the base class indeed.

Any tips on how to sort out the code so I can achieve what I need to with this user interface?

Thanks a lot,

Kartik

>predefinedClassAObject_.pointerToClassBBase = &predefinedClassBObject_; predefinedClassBObject is destroyed when the function returns. Your pointer is dangling because it was pointing to a local object.

Yes indeed. So I guess to summarize my question, how do I promote the object from having local scope such that it isn't destroyed once the function returns?

Thanks,

Kartik

Use pointers all the way.
Even better, use shared_ptr, or in your case maybe auto_ptr.

Use pointers all the way.
Even better, use shared_ptr, or in your case maybe auto_ptr.

Thanks, I'm looking up simple examples on how to use auto_ptr.

So should I apply the auto_ptr to dynamically allocate the predefinedClassBObject_ in createPredefinedClassAObject()? If I understand the auto_ptr examples that I've looked at correctly, wouldn't the dynamically allocated object also go out of scope at the end of the function?

Cheers,

Kartik

auto_ptr gives away its ownership to another pointer that acquires it via assignment or the copy constructor. If no one has taken ownership and it goes out of scope, it deletes the pointer. Be careful using it with containers.

shared_ptr uses a reference count; when the count reaches 0, it deletes the pointer.

The following code works:

#include <iostream>
#include <memory>
using namespace std;

auto_ptr<int> fun(){
        auto_ptr<int> p1( new int );
        *p1 = 10; // smart pointers dereference like raw pointers; no need for get()

        return p1;
}

int main() {
        auto_ptr<int> p2 = fun();
        cout << "p2 points to " << *p2 << "\n";
        return 0;
}

Well, using shared_ptr is definitely the best thing to do in this case (I use them a lot for this kind of case). The way to have the lifetime of the object under your control is to allocate it dynamically and thus deallocate it when appropriate. The problem is determining when it should be deleted, and the reference counting that is part of shared_ptr takes care of that for you, i.e., the object is deleted when it is no longer useful to any part of your code.

Don't use auto_ptr. It has fundamental flaws with respect to the upcoming C++ standard and has been marked as deprecated; its equivalent will be unique_ptr in C++11. But, in this case, I think shared_ptr is more appropriate (and worth the small overhead that it entails).


If I understand correctly, shared_ptr is only available through Boost at present right? I currently don't have Boost incorporated in my library. Since the library is for physicists/engineers rather than computer scientists, I've been cautious in adding external libraries to the mix that will complicate the installation procedure of the library. If shared_ptr is really the best solution in your opinion I guess it's time to add Boost to the mix.

Cheers,

Kartik

shared_ptr is currently part of TR1 (Technical Report 1), which is an extension to the C++ standard library. If you have a compiler that does not have this, then it must be very old. Read this to see how to use it. Basically, you #include <memory> and use std::tr1::shared_ptr<>. TR1 also includes other parts of Boost that are the most general purpose. Most of TR1 libraries and more of the Boost libraries will be incorporated (almost verbatim) in the upcoming C++ standard library, so, it is possible to just use the Boost libraries that will become standard, until they do, and then switch to the new standard libraries (this is what I'm doing, I have #ifdef statements to swap C++0x standard libraries for Boost libraries if the code is not being compiled on C++0x (experimental support)).

>>I've been cautious in adding external libraries

That's a very good thing. Personally, I mostly only add external dependencies in unit-test code (not library code), and if I do add an external dependency to library code, I never add it without writing a firewall to insulate it from every other part of my code (a "firewall" in programming terms is a thin wrapper library that encapsulates all the external dependency, usually via a form of the PImpl idiom (or Cheshire Cat)). But I would say that you can easily consider Boost libraries as an exception to this rule. It has awesome libraries that are mighty useful, almost every C++ programmer is used to it, and it is stable, cross-platform and very efficient. These are qualities you almost never find all together in a single open-source library. But, of course, it still is technically not a standard library (although, as close as you can get to that). Also note that Boost has a policy of reducing internal dependencies amongst its own libraries, so if you only need one thing out of it, it is not too hard to grab the few Boost headers you need and make them part of your own library.


Thanks for explaining that, that's great! Didn't realise that shared_ptr can be accessed like that through the standard library.

Yes, I am trying to convince others in my project team that we should be using Boost for a number of different things, precisely because its use is so widespread.

I'll try solving my problem now with shared_ptr and post back if I manage to resolve it ok.

Thanks!

Kartik

I've managed to get most of the code working, just when I tried to access the data in the classB object I get the following error:

pure virtual method called
terminate called without an active exception

The problem comes from the fact that I have a function defined in classBBase that's pure virtual. The function is implemented in classB. I have a pointer to classBBase now, and through a shared_ptr I've gotten it to point to a classB object. Only, when I then try to access the pure virtual function now through the pointerToClassBBase, it isn't able to execute the function.

I'm guessing this is an elementary problem, but not really sure how to resolve it.

Any ideas? I can post up the code if this isn't clear.

Cheers,

Kartik

Well, you need to post the code for both classes for me to have any concrete idea as to what the problem may be.

However, due to experience, I can make a reasonable educated guess. Since the compiler should prevent you from creating an object of an abstract class, then, unless you have a very strange piece of code that somehow fools the compiler into creating an object of an abstract class, the problem must be either one of two things:
1) Calling a virtual method in a constructor (specifically, the base-class constructor where the method is 'pure').
2) Calling a virtual method in the destructor (specifically, the base-class destructor where the method is 'pure').

The Dtor/Ctor are functions in which you cannot call virtual methods (at least, not unless you know exactly what you are doing). This is because in either case the object is only half-constructed (either half-way through its construction or half-way through its destruction). This means that the virtual table is not fully formed until the object is fully created (and, conversely, the virtual table gets deconstructed, or more precisely rolled back, during the destructor calls). This means that if you call a pure virtual function in a base-class constructor/destructor, the corresponding entry in the virtual table will be NULL, and the program will either check for nullity (in debug mode) and give an error like the one you got, or it will not check and produce a segmentation fault (i.e. access violation) or worse.


I don't think I'm calling any virtual methods from any constructors or destructors. Maybe it's just hidden from me.

Anyhow, I decided to rewrite the original code I posted here in terms of what I'm actually doing for my simulations, so that it's easier to visualize.

Here is the code using the shared_ptr. I'm probably using it incorrectly, but I'm not sure where. I think I'm right in the way I've used it for the predefined objects.

I'll post the various files that I have now.

celestialbody.h

#ifndef CELESTIALBODY_H
#define CELESTIALBODY_H

#include "gravityFieldModel.h"

class CelestialBody
{
public:
    CelestialBody( ){ };

    ~CelestialBody( ){ };

    void setGravityFieldModel( GravityFieldModel* pointerToGravityFieldModel )
    {
       pointerToGravityFieldModel_ =  pointerToGravityFieldModel;
    }

    GravityFieldModel* getGravityFieldModel( )
    {
        return pointerToGravityFieldModel_;
    }

protected:

private:
    GravityFieldModel* pointerToGravityFieldModel_;
};

#endif // CELESTIALBODY_H

centralGravityField.h

#ifndef CENTRALGRAVITYFIELD_H
#define CENTRALGRAVITYFIELD_H

#include "gravityFieldModel.h"

class CentralGravityField : public GravityFieldModel
{
public:
    CentralGravityField( ){ };

    ~CentralGravityField( ){ };

    void setGravitationalParameter( const double& gravitationalParameter )
    {
        gravitationalParameter_ = gravitationalParameter;
    }

    double& getGravitationalParameter( )
    {
        return gravitationalParameter_;
    }

protected:

private:
};

#endif // CENTRALGRAVITYFIELD_H

gravityFieldModel.h

#ifndef GRAVITYFIELDMODEL_H
#define GRAVITYFIELDMODEL_H

class GravityFieldModel
{
public:
    GravityFieldModel( ){ };

    ~GravityFieldModel( ){ };

    virtual void setGravitationalParameter( const double& gravitationalParameter ) = 0;

    virtual double& getGravitationalParameter( ) = 0;

protected:

    double gravitationalParameter_;

private:

};

#endif // GRAVITYFIELDMODEL_H

predefinedCelestialBodies.h

#ifndef PREDEFINEDCELESTIALBODIES_H
#define PREDEFINEDCELESTIALBODIES_H

#include <tr1/memory>
#include "celestialbody.h"
#include "predefinedGravityFieldModels.h"
#include "centralGravityField.h"

using std::tr1::shared_ptr;

using predefined_gravity_field_models::createPredefinedCentralGravityField;

namespace predefined_celestial_bodies
{

enum PredefinedCelestialBodies
{
    earth,
    mars
};

CelestialBody createPredefinedCelestialBody( PredefinedCelestialBodies predefinedCelestialBody )
{
    CelestialBody predefinedCelestialBody_;

    shared_ptr< CentralGravityField > pointerToPredefinedCentralGravityField_;

    switch( predefinedCelestialBody )
    {
    case earth:
        pointerToPredefinedCentralGravityField_ = createPredefinedCentralGravityField( predefined_gravity_field_models::earth );
        break;

    case mars:
        pointerToPredefinedCentralGravityField_ = createPredefinedCentralGravityField( predefined_gravity_field_models::mars );
        break;

    default:
        cerr << "Case is not defined" << endl;
    };

    predefinedCelestialBody_.setGravityFieldModel( pointerToPredefinedCentralGravityField_.get( ) );

    return predefinedCelestialBody_;
}

}

#endif // PREDEFINEDCELESTIALBODIES_H

predefinedGravityFieldModels.h

#ifndef PREDEFINEDGRAVITYFIELDMODELS_H
#define PREDEFINEDGRAVITYFIELDMODELS_H

#include <iostream>
#include <tr1/memory>
#include "centralGravityField.h"

using std::cerr;
using std::endl;
using std::tr1::shared_ptr;

namespace predefined_gravity_field_models
{

enum BodiesWithPredefinedCentralGravityFieldModels
{
    earth,
    mars
};

shared_ptr< CentralGravityField > createPredefinedCentralGravityField(
    BodiesWithPredefinedCentralGravityFieldModels
    bodyWithPredefinedCentralGravityFieldModel )
{
    shared_ptr< CentralGravityField >
            pointerToPredefinedCentralGravityField_(
                new CentralGravityField );

    switch( bodyWithPredefinedCentralGravityFieldModel )
    {
    case earth:
        pointerToPredefinedCentralGravityField_.get( )->setGravitationalParameter( 1.0 );
        break;

    case mars:
        pointerToPredefinedCentralGravityField_.get( )->setGravitationalParameter( 2.0 );
        break;

    default:
        cerr << "Case is not defined" << endl;
    };

    return pointerToPredefinedCentralGravityField_;
}

}


#endif // PREDEFINEDGRAVITYFIELDMODELS_H

main.cpp

#include <iostream>
#include "gravityFieldModel.h"
#include "celestialbody.h"
#include "predefinedCelestialBodies.h"

using std::cerr;
using std::endl;

int main( )
{
    CelestialBody predefinedEarth;

    predefinedEarth = predefined_celestial_bodies::createPredefinedCelestialBody( predefined_celestial_bodies::earth );

    std::cout << predefinedEarth.getGravityFieldModel( )->getGravitationalParameter( ) << std::endl;

    return 0;
}

I'm also uploading a zip file with my Qt project containing all these files.

Anyway, this has been abstracted out of my library, which I don't want to post up here now. The error I'm receiving now isn't the pure virtual function error, the program just crashes. I guess that means there's a memory leak somewhere. I tried running Valgrind but it just spits out a lot of stuff that's unintelligible to me.

Thanks for all your help!

Cheers,

Kartik

You are using shared_ptr the wrong way and it is causing the crash (or equivalently the "calling a pure virtual method" error). The problem is this. You are creating in 'createPredefinedCelestialBody' a shared_ptr to a GravityField object. This part is correct. The problem is that you are giving away, to the 'predefinedCelestialBody_' the raw pointer stored in the shared_ptr (via the get() function). At this point, you still have just one shared_ptr, so when 'pointerToPredefinedCentralGravityField_' goes out of scope, the reference counter goes to zero and the GravityField object gets deleted. But your 'predefinedCelestialBody_' still has a raw pointer to the object that got deleted, and thus, attempting to access it will cause a crash (or a call to a pure virtual function if you are "lucky").

The fix is simple: hold the GravityFieldModel as a shared_ptr in the 'CelestialBody' class. This way, it will work. Basically, the rule with any kind of smart pointer (auto_ptr, scoped_ptr, unique_ptr, shared_ptr, etc.) is that you should:
1) Always assign the pointer resulting from the dynamic allocation to a shared_ptr immediately (as you did in 'createPredefinedCentralGravityField').
2) Never, ever, use the underlying raw pointer for anything, this will nullify the whole point of using a smart pointer in the first place.

So, your CelestialBody class could look like this:

class CelestialBody
{
public:
    CelestialBody( ){ };

    ~CelestialBody( ){ };

    void setGravityFieldModel( shared_ptr<GravityFieldModel> pointerToGravityFieldModel )
    {
       pointerToGravityFieldModel_ =  pointerToGravityFieldModel;
    }

    shared_ptr<GravityFieldModel> getGravityFieldModel( )
    {
        return pointerToGravityFieldModel_;
    }

protected:

private:
    shared_ptr<GravityFieldModel> pointerToGravityFieldModel_;
};

Then the assignment would be:

predefinedCelestialBody_.setGravityFieldModel( pointerToPredefinedCentralGravityField_ ); //notice no more 'get()' call.

Finally, you do not need to use get() to actually access the object. Smart pointers act exactly as if they were the actual raw pointer, so, in theory, you should never even need to use the get() function anywhere. As for those lines:

switch( bodyWithPredefinedCentralGravityFieldModel )
    {
    case earth:
        pointerToPredefinedCentralGravityField_->setGravitationalParameter( 1.0 );
        break;

    case mars:
        pointerToPredefinedCentralGravityField_->setGravitationalParameter( 2.0 );
        break;

    default:
        cerr << "Case is not defined" << endl;
    };

Notice that get() disappears, it is not needed.


**IMPORTANT**
Always make your base-class destructor 'virtual'. This is paramount in this case, and should always be done. You forgot to do this in the GravityFieldModel class. FIX THAT.


Thanks for that explanation. I did originally think of doing it this way, but the problem is that it alters the interface for the set and get functions in the CelestialBody class doesn't it? By that I mean that if a user now creates an object of the CelestialBody class, to use the set function, the input parameter passed cannot be an ordinary pointer but must be a shared_ptr right?

The problem with this is that it complicates the user interface. For the set function it's easy enough to use function overloading to create a function that will take a regular pointer too, but how do I resolve this for get function? My understanding is that return-type overloading is not possible. How would I go about solving that? The only thing I can think of now is creating a CelestialBody duplicate class that contains the functions with regular pointers too, but that just means duplicating a lot of code, which I'd rather not do if there's a more elegant solution.

I did in any case apply the changes you suggested and the code is working now indeed. Thanks also for pointing out that I should make the destructor virtual.

Thanks for all your help!

Cheers,

Kartik

>>the problem is that it alters the interface for the set and get functions in the CelestialBody class doesn't it?

Sure, it alters the interface of the set/get functions in CelestialBody, because it actually makes it a proper interface. What I mean by that is that a raw pointer (or a reference, for that matter) does not convey enough information to constitute a good interface, which is why they should rarely be used unless it is obvious what they do. What is missing from a raw pointer is an indication of ownership, i.e., who owns the object that is pointed to? Ownership means "who will delete it". An object can only be deleted once, no more no less, and it should only be deleted once it is no longer useful to (i.e. referenced by) any part of the code.

There are several ways to define ownership, mainly: unique and shared. Either you guarantee that this pointer is unique and thus can perform a delete when it is done with the object (goes out of scope); or you have several pointers that share ownership of the object (which should be deleted once every pointer has gone out of scope). There is a third model, "no ownership", which means the pointer points to something that could already have been deleted by its owner(s); this is modelled with weak_ptr in C++.

In any case, a raw pointer does not tell the user which of these ownership models is used for the GravityFieldModel in your CelestialBody class. So, an interface which uses a raw pointer is ill-defined (e.g. if I were to do a sort of "code review" on a class with such an interface, I would flag it as "completely unacceptable"). So, it changes the interface, but it does so for the better.

>>the input parameter passed cannot be an ordinary pointer but must be a shared_ptr right?

Yes. That's the point. It tells the caller that the ownership of the pointed-to object must be shared. That's called "good programming practices". It is also very nice because it makes erroneous code (e.g. code which does not implement a shared ownership) fail at compile-time, which is much better than crashing at run-time! Don't you agree?

>>The problem with this is that it complicates the user interface.

That is one way to look at it. But the way I look at it, it makes the interface clearer and more informative.

>>how do I resolve this for get function?

Don't. Don't overload the set function either. I hope the points I made above will convince you that it is not a good idea to do so.

>>How would I go about solving that?

Only if you have a lot of existing code that relies on the "old" interface of CelestialBody, then you might consider using a conditional compilation to swap the interfaces between old and new, such that the newer code can use the improved interface, and not break old code that uses the older, crappier interface. But, I would say, you really need to have a lot of old code that uses this old interface to justify doing something like this, because it will get ugly eventually. You are better off updating the rest of the code that uses CelestialBody to use shared_ptr instead of raw pointers, it will improve those bits of code in the process.

>>the problem is that it alters the interface for the set and get functions in the CelestialBody class doesn't it?

Sure, it alters the interface of the set/get functions in CelestialBody, because it actually makes it a proper interface. What I mean by that is that a raw pointer (or a reference, for that matter) does not convey enough information to constitute a good interface, which is why raw pointers should rarely be used unless it is obvious what they do. What is missing from a raw pointer is an indication of ownership, i.e., who owns the object that the pointer points to? Ownership means "who will delete it". An object can only be deleted once, no more, no less, and it should only be deleted once it is no longer useful to (i.e. referenced by) any part of the code.

There are several ways to define ownership, mainly unique and shared. Either you guarantee that this pointer is the unique owner, and thus can perform a delete when it is done with the object (goes out of scope); or you have several pointers that share ownership of the object (which should be deleted once every owning pointer has gone out of scope). There is a third model, "no ownership", where the pointer points to something that could already have been deleted by its owner(s); this is modelled with weak_ptr in C++.

In any case, a raw pointer does not tell the user which of these ownership models is used for the GravityFieldModel in your CelestialBody class. So, an interface which uses a raw pointer is ill-defined (if I were to do a "code review" on a class with such an interface, I would flag it as completely unacceptable). So yes, it changes the interface, but it does so for the better.

>>the input parameter passed cannot be an ordinary pointer but must be a shared_ptr right?

Yes. That's the point. It tells the caller that the ownership of the pointed-to object must be shared. That's called "good programming practices". It is also very nice because it makes erroneous code (e.g. code which does not implement a shared ownership) fail at compile-time, which is much better than crashing at run-time! Don't you agree?

>>The problem with this is that it complicates the user interface.

That is one way to look at it. But the way I look at it, it makes the interface clearer and more informative.

>>how do I resolve this for get function?

Don't. Don't overload the set function either. I hope the points I made above will convince you that it is not a good idea to do so.

>>How would I go about solving that?

Only if you have a lot of existing code that relies on the "old" interface of CelestialBody, then you might consider using a conditional compilation to swap the interfaces between old and new, such that the newer code can use the improved interface, and not break old code that uses the older, crappier interface. But, I would say, you really need to have a lot of old code that uses this old interface to justify doing something like this, because it will get ugly eventually. You are better off updating the rest of the code that uses CelestialBody to use shared_ptr instead of raw pointers, it will improve those bits of code in the process.

Very interesting, this is the first time I'm hearing about this. So if that's the case, when and why would you ever use raw pointers?

I guess the way I got around the "ownership" issue is by basically not using dynamic memory allocation anywhere. I guess so far my code in general has been simple enough that it hasn't been necessary.

So, to give you an example, to use the set function in CelestialBody with a raw pointer, this is what I typically do:

CelestialBody earth;
CentralGravityField earthCentralGravityField;
earth.setGravityFieldModel( &earthCentralGravityField);

I just recently discovered, though, that passing an address like this is not passing by reference. I'm not sure if that's a problem. Anyway, using it this way means that there are no memory leaks, since nothing is dynamically allocated. Would this be a legitimate use of raw pointers? Or would you recommend in this case too changing the input argument of setGravityFieldModel() to accept a shared_ptr instead of a raw pointer?

It's definitely better if the code fails at compile-time, indeed. I guess my concern is simply from a user's perspective for the library I'm building: most people are only aware of raw pointers from the basic C++ tutorials they've followed. I'm sure no one in the project right now knows about smart pointers and their uses. This means that it increases the learning curve somewhat for users of the library, which is something we're sensitive to, as it's being developed to be used by people with no more than basic C++ knowledge.

The codebase for the project is quite large now and we've been consistently using raw pointers in the fashion I've demonstrated for set and get functions. If this really is bad practice, even though dynamic allocation is not being used, then I will definitely undertake the task of converting all the old code to make use of smart pointers. I just want to make sure.

Thanks again for your input.

Cheers,

Kartik

>>Would this be a legitimate use of raw pointers?

Ishh. Hard to tell. Raw pointers are seldom used in my code, for example. They do have legitimate uses, but those almost always fall under the hood of a library, not at the interface. In the use-case you presented, it is reasonable to use a raw pointer as a lesser evil, but I wouldn't go as far as saying it is a legitimate use of a raw pointer. In this case, you do want to avoid dynamic allocation whenever possible, so that's a given. You can turn a local variable into a shared_ptr by using a "null-deleter", so that is an option to comply with the shared_ptr interface while not having to have a dynamically allocated object, like so:

//define a callable null_deleter:
struct null_deleter
{
    void operator()(void const *) const
    { /* intentionally left empty */
    }
};

int main() {
  //..
  CelestialBody earth;
  CentralGravityField earthCentralGravityField;
  earth.setGravityFieldModel( 
    shared_ptr<GravityFieldModel>(&earthCentralGravityField,
                                  null_deleter()) );  
  //..
}

This simply has the effect of creating a shared_ptr that will never actually delete the pointed-to object.

But the real solution to this problem is RAII (Resource Acquisition Is Initialization), which is arguably the most powerful idiom in C++. Normally, a use-case like the one you presented would be implemented like this:

class BaseHelper {
  //some interface.. with virtual functions.
};

class DerivedHelper : public BaseHelper {
  //some overriding of the base-class interface.. and whatever else..
};

class Foo {
  private:
    BaseHelper& helper;
  public:
    Foo(BaseHelper& aHelper) : helper(aHelper) { }

    void do_something() { /* ...executes something and uses "helper" to do so. */ }
};

int main() {

  DerivedHelper h;
  
  Foo f(h); //construct object of RAII class Foo.

  f.do_something(); //use the object f which is using h.

  return 0;
}

The point of the above is that f is constructed with a "hard link" (via a reference) to its helper object, which naturally leads to a use-case where you have the helper object and create a local Foo object that will use it temporarily. RAII is nice in that it naturally exploits the scoping rules of C++. But let's not get carried away: your use of raw pointers is a lesser evil in that it is more intuitive for beginners (or shall I say, OOP programmers, possibly coming from a background in another language than C++).

Note also that procedural programming is an option as well. I don't want to get into the details of it, but just know that OOP is, in my opinion, more often than not, not the best programming paradigm for a given problem. C++ offers many more programming paradigms than OOP; restricting yourself to one is never a good idea. Just throwing this out there.

Oh and,

>>most people are only aware of raw pointers from basic C++ tutorials they've followed.

Plus

>>The codebase for the project is quite large now and we've been consistently using raw pointers in the fashion I've demonstrated for set and get functions.

Equals:

Disaster waiting to happen... in my opinion.

Sorry I disappeared for a few days. Been busy with other things. Also been looking in a few things that you've pointed out.

>>Would this be a legitimate use of raw pointers?

Ishh. Hard to tell. Raw pointers are seldom used in my code, for example. They do have legitimate uses, but those almost always fall under the hood of a library, not at the interface. In the use-case you presented, it is reasonable to use a raw pointer as a lesser evil, but I wouldn't go as far as saying it is a legitimate use of a raw pointer. In this case, you do want to avoid dynamic allocation whenever possible, so that's a given. You can turn a local variable into a shared_ptr by using a "null-deleter", so that is an option to comply with the shared_ptr interface while not having to have a dynamically allocated object, like so:

//define a callable null_deleter:
struct null_deleter
{
    void operator()(void const *) const
    { /* intentionally left empty */
    }
};

int main() {
  //..
  CelestialBody earth;
  CentralGravityField earthCentralGravityField;
  earth.setGravityFieldModel( 
    shared_ptr<GravityFieldModel>(&earthCentralGravityField,
                                  null_deleter()) );  
  //..
}

This simply has the effect of creating a shared_ptr that will never actually delete the pointed-to object.

That's a simple enough solution. However, isn't the benefit of the shared_ptr then also nullified in a sense? I understood shared_ptr to be pertinent particularly when you do have dynamic memory allocation.

But the real solution to this problem is RAII (Resource Acquisition Is Initialization), which is arguably the most powerful idiom in C++. Normally, a use-case like the one you presented would be implemented like this:

class BaseHelper {
  //some interface.. with virtual functions.
};

class DerivedHelper : public BaseHelper {
  //some overriding of the base-class interface.. and whatever else..
};

class Foo {
  private:
    BaseHelper& helper;
  public:
    Foo(BaseHelper& aHelper) : helper(aHelper) { }

    void do_something() { /* ...executes something and uses "helper" to do so. */ }
};

int main() {

  DerivedHelper h;
  
  Foo f(h); //construct object of RAII class Foo.

  f.do_something(); //use the object f which is using h.

  return 0;
}

The point of the above is that f is constructed with a "hard link" (via a reference) to its helper object, which naturally leads to a use-case where you have the helper object and create a local Foo object that will use it temporarily. RAII is nice in that it naturally exploits the scoping rules of C++. But let's not get carried away: your use of raw pointers is a lesser evil in that it is more intuitive for beginners (or shall I say, OOP programmers, possibly coming from a background in another language than C++).

Yes, this option I understand well, although the problem I see with respect to the functionality I need is that the link can only be made when the object is declared, since it is through the constructor, right? One of the conscious choices I've taken is to make use of explicit, verbose set functions, to make the code read almost like prose. This is necessary from the perspective of the students that are working with the library. As more and more objects are added to the constructor list, the input arguments become unwieldy. In addition to this, there are times when an object needs to be set at a point after construction. The set functions allow me to (re)define this. Is there any way this is possible using your methodology? Or is it only possible once, through the constructor?

Note also that procedural programming is an option as well. I don't want to get into the details of it, but just know that OOP is, in my opinion, more often than not, not the best programming paradigm for a given problem. C++ offers many more programming paradigms than OOP; restricting yourself to one is never a good idea. Just throwing this out there.

I'm not sure what procedural programming is. I'll have to Google it. I'm myself a novice at C++ I'd say, so the only knowledge I have is of the basics, which is always presented from an OOP perspective (based on the resources I've used so far).

Oh and,

>>most people are only aware of raw pointers from basic C++ tutorials they've followed.

Plus

>>The codebase for the project is quite large now and we've been consistently using raw pointers in the fashion I've demonstrated for set and get functions.

Equals:

Disaster waiting to happen... in my opinion.

That doesn't sound good. I'm definitely interested in finding out better ways to restructure all of this. Do you know of any good resources that you could point me to, to get a better grasp of what I really should be doing instead?

Thanks for all your invaluable feedback!

Kartik

>>However, isn't the benefit of the shared_ptr then also nullified in a sense?

That's true, the purpose of shared_ptr is, to some extent, nullified by using a null_deleter functor. However, like many things in C++, there is a philosophy of verbosity that makes the code and its intent clear. For example, in C++, a static_cast is essentially equivalent to a C-style cast in most cases, but static_cast is preferred (and explicitly encouraged via compiler warnings) because it is considered better style to use the more verbose form to get rid of any ambiguity (a C-style cast can mean a static_cast or a reinterpret_cast; it is ambiguous and can cause silent bugs). I think you understand that already. This is why it can be preferable to use a shared_ptr even if it means explicitly nullifying its effect: in those cases, breaking the shared-ownership model is explicit and intentional, and it is still safer than a raw pointer.
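To make the cast ambiguity concrete, here is a minimal sketch (not code from the library under discussion):

```cpp
// A C-style cast silently picks whichever named cast "fits", which can hide
// a reinterpret_cast behind innocent-looking syntax. The named casts state
// the intent, and the compiler rejects everything else.
double as_double(int i) {
    return static_cast<double>(i);   // clearly a value conversion
}

bool same_result(int i) {
    double c_style = (double)i;      // compiles, but the reader must guess
                                     // which named cast it stands for
    // static_cast<char*>(&i);       // would NOT compile: the compiler
                                     // catches the mistake, whereas
                                     // (char*)&i silently reinterprets
                                     // the bytes of i.
    return c_style == as_double(i);
}
```

The commented-out line is the whole argument in one place: the verbose form turns a silent bug into a compile error.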

>> the link can only be made when the object is declared

That's right. If there are ways to avoid that restriction, they are hacks that should be avoided anyways.

>>One of the conscious choices I've taken is to make use of explicit, verbose set functions, to make the code read almost like prose.

That's certainly a wise choice. My purpose in pointing alternatives out is mostly to make sure your choices are conscious and informed. Nothing is black-and-white; there are many factors to consider (like those you have mentioned) that are yours to weigh.

>>As more and more objects are added to the constructor list, the input arguments become unwieldy.

There are idioms that can make constructors nicer and more self-documenting (like parameter-lists, the named-constructor idiom and the named-parameter idiom). But, in general, I find that rigorous doxygen-style comments are enough; that's what most mainstream libraries use anyway.
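For instance, the named-constructor idiom can be sketched like this (the Planet class and the values are hypothetical stand-ins, not your library's actual classes):

```cpp
// Named-constructor idiom: the constructor is private, and the public way
// to build an object is a set of static factory functions whose names
// document what the arguments mean.
class Planet {
public:
    static Planet fromMass(double massKg) { return Planet(massKg); }
    static Planet earthLike() { return Planet(5.9736e24); }  // assumed value
    double mass() const { return mass_; }
private:
    explicit Planet(double massKg) : mass_(massKg) {}
    double mass_;
};
```

At the call site, `Planet mars = Planet::fromMass(6.4185e23);` reads like prose in a way that a bare `Planet(6.4185e23)` does not, which seems to fit the style you are aiming for.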

>>In addition to this, there are times when an object needs to be set at a point after construction.

This is often considered an indication that there is something ill-defined with your classes or class hierarchy. Fundamentally, an object is defined by its state, and its functionality should be tightly, and almost exclusively, coupled to its state. Having to intervene, some time after construction, to modify the state means that there is something fishy in that class (I'm sorry for being vague, but this is a quite general argument).

One possibility is that, prior to setting the state, you have a zombie state in your object, i.e., a living-dead object which probably shouldn't exist yet (delay construction until you actually have what you need to fully construct it). Another possibility is that you are essentially reusing an object to do something completely different (e.g. setting it up again for a new simulation or model or whatever), which usually mandates that a new object be created. Finally, the last possibility I can think of is that the data element you need to reset really does not belong in the class in question. This last possibility is a very common mistake, and it relates to my comment about procedural programming often being more appropriate. If you have any doubts about whether a data member should be part of a class or not, it probably should not.
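A minimal sketch of the zombie-state point, with hypothetical names chosen only for illustration:

```cpp
// Two-phase initialization leaves a "zombie" window: the object exists,
// but is not in a usable state until a later set-call completes it.
struct TwoPhaseIntegrator {
    TwoPhaseIntegrator() : stepSize(0.0) {}     // alive, but not usable yet
    void setStepSize(double h) { stepSize = h; } // must remember to call this
    double stepSize;
};

// Delaying construction until all the state is known removes the window:
// every Integrator that exists is valid from birth.
class Integrator {
public:
    explicit Integrator(double h) : stepSize_(h) {}
    double stepSize() const { return stepSize_; }
private:
    double stepSize_;
};
```

With the second form, "forgot to call the setter" is simply not a representable program state.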

>>I'm not sure what procedural programming is.

Procedural programming: "They describe, step by step, exactly the procedure that should, according to the particular programmer at least, be followed to solve a specific problem." -wikipedia

Basically, if you are familiar with C programming, you are familiar with procedural programming. Many C++ programmers tend to get trapped in this fantasy world of OOP and are led to believe that the only acceptable code has exactly one free function (the main() function), with everything else in classes and their data members and member functions. This is a big fallacy. Many, if not all, of the most prominent C++ gurus have advocated free functions over member functions as the default choice, Herb Sutter and Scott Meyers for example.

My rule for this is quite simple. If you are implementing something that is fundamentally of algorithmic nature, do it with free functions (or mostly so). If you are implementing something that is fundamentally about data structuring, then do it with a proper class hierarchy and object tree. Don't mix the two.

Another interesting idiom in the kind of things I think you are doing (numerical simulations, which is what I do too) is "IoC" (Inversion of Control). Ironically, it is often said to be the antithesis of procedural programming, but, in fact, it is a procedural programming idiom, and I would argue it is amongst the nicest ones. The idea is to hand over the necessary objects for an algorithm to run, call that algorithm, and let it control the flow. This may seem abstract to you now, but it is great, believe me. One example is essentially all the graph algorithms in the Boost.Graph Library, which rely on generic visitors, a very nice form of IoC.
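Here is IoC in miniature, in the spirit of those Boost.Graph visitors (all names are illustrative, not from any real library):

```cpp
#include <vector>

// Inversion of control: the free function owns the loop (the generic part)
// and hands control back to a caller-supplied visitor at each step (the
// customized part).
template <typename StepVisitor>
void integrate(double y0, double h, int steps, StepVisitor visit) {
    double y = y0;
    for (int i = 0; i < steps; ++i) {
        y += h * y;     // a fixed, generic Euler step for dy/dt = y
        visit(i, y);    // the caller decides what to do with each result
    }
}

struct Recorder {       // one possible visitor: record the trajectory
    std::vector<double>* out;
    void operator()(int /*step*/, double y) const { out->push_back(y); }
};
```

Usage: `std::vector<double> traj; Recorder r = { &traj }; integrate(1.0, 0.5, 2, r);`. The same `integrate` loop serves a plotter, a logger, or a convergence check, just by swapping the visitor.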

Let me put procedural programming into an example. Say, I need to make a simulator for one particle and some force model (gravity or whatever). A naive, but often seen, approach is as follows (very much in OOP style):

class ParticleModel {
  //..
};

class ForceModel {
  //..
};

class SimulationManager {
  private:
    ParticleModel* particle;
    std::vector<ForceModel*> force_fields;
  public:
    void run() {
      //some implementation.
    };
};

The above approach is very common; it is the classic model of having one class that manages all the execution of the code. A main() function would normally construct this manager object, add a whole bunch of data to it, and then execute. This is crazy, if you ask me (but I admit, I have done this in the past; for a long time it was my default software design model). This manager class is really just describing a function, an algorithm, and its data structure (the data members) is really just temporary: it is set up before execution and is only useful for the execution. Why is this in a class? There is no point. You can replace it with this:

//just a free function:
void run_particle_simulation(ParticleModel& particle, std::vector<ForceModel*>& force_fields) {
  //.. same implementation.
}

The big differences between those two approaches are:
- The free function is much simpler, with less code at the call site.
- The free function allows more RAII-like, clear object lifetimes and ownership.
- The free function can be overloaded for any other set of parameters you like (and also combined with the parameter-lists I mentioned earlier).
- The free function can be made highly modular and very efficient using generic programming techniques.
- By passing visitor objects and functors as parameters to the free function, you can implement the generic parts of the algorithm in the free function and have all the specific, customized parts encoded in the parameters; this is inversion of control.

Of course, there is an intermediate, which is to just have some data members and some parameters to the "run()" function of a class. This is an in-between that can be a good compromise, but usually having to do this indicates that there should be a split (that class is too big or ill-defined).

>>Do you know of any good resources that you could point me to, to get a better grasp of what I really should be doing instead?

I would recommend you go through all of the Parashift C++ FAQ-Lite first (Marshall Cline gives great design advice throughout). Then, the book you cannot live without if you do any kind of non-trivial software design in C++ is "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices" by Sutter and Alexandrescu. And certainly, anything by Scott Meyers or Herb Sutter is great.

Hi Mike,

I've been swamped with work the last week and been travelling, so my apologies for not responding to your msg earlier. Thanks a lot for all the feedback. There's definitely some very thought-provoking stuff for me you've put down.

>>However, isn't the benefit of the shared_ptr then also nullified in a sense?

That's true, the purpose of shared_ptr is, to some extent, nullified by using a null_deleter functor. However, like many things in C++, there is a philosophy of verbosity that makes the code and its intent clear. For example, in C++, a static_cast is essentially equivalent to a C-style cast in most cases, but static_cast is preferred (and explicitly encouraged via compiler warnings) because it is considered better style to use the more verbose form to get rid of any ambiguity (a C-style cast can mean a static_cast or a reinterpret_cast; it is ambiguous and can cause silent bugs). I think you understand that already. This is why it can be preferable to use a shared_ptr even if it means explicitly nullifying its effect: in those cases, breaking the shared-ownership model is explicit and intentional, and it is still safer than a raw pointer.

That makes sense indeed. I'll have to consider this carefully as it would require a lot of rewriting of the library that I have now.

>> the link can only be made when the object is declared

That's right. If there are ways to avoid that restriction, they are hacks that should be avoided anyways.

That's good to know.

>>One of the conscious choices I've taken is to make use of explicit, verbose set functions, to make the code read almost like prose.

That's certainly a wise choice. My purpose in pointing alternatives out is mostly to make sure your choices are conscious and informed. Nothing is black-and-white; there are many factors to consider (like those you have mentioned) that are yours to weigh.

>>As more and more objects are added to the constructor list, the input arguments become unwieldy.

There are idioms that can make constructors nicer and more self-documenting (like parameter-lists, the named-constructor idiom and the named-parameter idiom). But, in general, I find that rigorous doxygen-style comments are enough; that's what most mainstream libraries use anyway.

Very interesting. The named parameter lists would be a great alternative. The only problem I can think of is that certain things that are set in my classes are optional. Would I then have to overload the constructor for every single combination of the optional parameters? Or is there a way by which I can just have one long parameter list in the constructor, such that if the user doesn't pass a specific optional parameter, the compiler knows how to parse the rest of the parameter list using just one constructor definition?

>>In addition to this, there are times when an object needs to be set at a point after construction.

This is often considered an indication that there is something ill-defined with your classes or class hierarchy. Fundamentally, an object is defined by its state, and its functionality should be tightly, and almost exclusively, coupled to its state. Having to intervene, some time after construction, to modify the state means that there is something fishy in that class (I'm sorry for being vague, but this is a quite general argument).

One possibility is that, prior to setting the state, you have a zombie state in your object, i.e., a living-dead object which probably shouldn't exist yet (delay construction until you actually have what you need to fully construct it). Another possibility is that you are essentially reusing an object to do something completely different (e.g. setting it up again for a new simulation or model or whatever), which usually mandates that a new object be created. Finally, the last possibility I can think of is that the data element you need to reset really does not belong in the class in question. This last possibility is a very common mistake, and it relates to my comment about procedural programming often being more appropriate. If you have any doubts about whether a data member should be part of a class or not, it probably should not.

All very good points, and I'm leaning towards thinking that a lot of my classes might now be ill-defined, unfortunately. A good example of where something is modified after the fact in my library is my use of predefined planets. I have, for instance, a class Planet which takes an enum as the parameter for its constructor. The enum contains all the planet names, and I use a switch statement in the constructor to set the object to predefined settings for each planet.

Now, it can happen that a user requires the predefined Earth object with all the predefined settings, except for a custom value for the mass of the Earth. The way I've set things up now, the user creates the predefined Earth, with all the preloaded settings, and then uses a set-function to reset the value of the mass. Do you have a suggestion for a better way to go about doing something like this?

The point you make about different models or simulations mandating different objects is a good point and something that I have tried to take into account so far.

>>I'm not sure what procedural programming is.

Procedural programming: "They describe, step by step, exactly the procedure that should, according to the particular programmer at least, be followed to solve a specific problem." -wikipedia

Basically, if you are familiar with C programming, you are familiar with procedural programming. Many C++ programmers tend to get trapped in this fantasy world of OOP and are led to believe that the only acceptable code has exactly one free function (the main() function), with everything else in classes and their data members and member functions. This is a big fallacy. Many, if not all, of the most prominent C++ gurus have advocated free functions over member functions as the default choice, Herb Sutter and Scott Meyers for example.

My rule for this is quite simple. If you are implementing something that is fundamentally of algorithmic nature, do it with free functions (or mostly so). If you are implementing something that is fundamentally about data structuring, then do it with a proper class hierarchy and object tree. Don't mix the two.

Yes, I live in that fantasy world, as I've kind of assumed that everything that can go in classes should, for code reusability. I guess I have to relearn this and figure out which parts of the library are in fact procedural in that case.

Another interesting idiom in the kind of things I think you are doing (numerical simulations, which is what I do too) is "IoC" (Inversion of Control). Ironically, it is often said to be the antithesis of procedural programming, but, in fact, it is a procedural programming idiom, and I would argue it is amongst the nicest ones. The idea is to hand over the necessary objects for an algorithm to run, call that algorithm, and let it control the flow. This may seem abstract to you now, but it is great, believe me. One example is essentially all the graph algorithms in the Boost.Graph Library, which rely on generic visitors, a very nice form of IoC.

Let me put procedural programming into an example. Say, I need to make a simulator for one particle and some force model (gravity or whatever). A naive, but often seen, approach is as follows (very much in OOP style):

class ParticleModel {
  //..
};

class ForceModel {
  //..
};

class SimulationManager {
  private:
    ParticleModel* particle;
    std::vector<ForceModel*> force_fields;
  public:
    void run() {
      //some implementation.
    };
};

The above approach is very common; it is the classic model of having one class that manages all the execution of the code. A main() function would normally construct this manager object, add a whole bunch of data to it, and then execute. This is crazy, if you ask me (but I admit, I have done this in the past; for a long time it was my default software design model). This manager class is really just describing a function, an algorithm, and its data structure (the data members) is really just temporary: it is set up before execution and is only useful for the execution. Why is this in a class? There is no point. You can replace it with this:

//just a free function:
void run_particle_simulation(ParticleModel& particle, std::vector<ForceModel*>& force_fields) {
  //.. same implementation.
}

The big differences between those two approaches are:
- The free function is much simpler, with less code at the call site.
- The free function allows more RAII-like, clear object lifetimes and ownership.
- The free function can be overloaded for any other set of parameters you like (and also combined with the parameter-lists I mentioned earlier).
- The free function can be made highly modular and very efficient using generic programming techniques.
- By passing visitor objects and functors as parameters to the free function, you can implement the generic parts of the algorithm in the free function and have all the specific, customized parts encoded in the parameters; this is inversion of control.

Of course, there is an intermediate, which is to just have some data members and some parameters to the "run()" function of a class. This is an in-between that can be a good compromise, but usually having to do this indicates that there should be a split (that class is too big or ill-defined).

This is precisely what I've been doing. Looks like I need a major rethink of my code in that case. One of the advantages of making the simulation manager a class, though, as I have used it so far, is that the simulation manager can be called by other simulation managers; and where they fall within a family of such methods, I've been using a base class that they inherit from, with polymorphic pointers. This has lent great flexibility to the code.

For instance, I have a Propagator class that propagates orbits of satellites around planets. The Propagator class assembles a lot of the data required for the simulation to run. To run the simulation, the Propagator class makes use of an Integrator class object. The Integrator object is in itself a simulation manager, as all it does is integrate a set of first-order differential equations. So, in the Propagator class I have a set-function to set a polymorphic pointer to an Integrator object to be used to propagate a satellite orbit. Setting it up this way, the Propagator class can take any of the derived classes of the Integrator base class as the integrator for the propagation of the orbit.

Is such polymorphic behaviour possible if you're using procedural programming and free functions? Additionally, I have standard functions, often set- and get-functions, that are shared amongst, for instance, the integrators. So having a base class that the integrators all inherit from also means that I can maximise code reusability. Is it possible to have some sort of inheritance structure for free functions?

>>Do you know of any good resources that you could point me to, to get a better grasp of what I really should be doing instead?

I would recommend you go through all of the Parashift C++ FAQ Lite first (Marshall Cline gives great design advice all throughout). Then, the book you cannot live without if you do any kind of non-trivial software design in C++ is "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices" by Sutter and Alexandrescu. And certainly, anything by Scott Meyers or Herb Sutter is great.

Great! I will look for those books.

Thanks for all your tips and advice. It's helped me rethink a lot of what I'm doing.

Cheers,

Kartik

Would I then have to overload the constructor for every single combination of the option parameters? Or is there a way by which I can just have one long parameter list in the constructor, so that if the user doesn't pass a specific optional parameter, the compiler knows how to parse the rest of the parameter list using just one constructor definition?

This is exactly what the Boost Parameter Library does. You can pass parameters in a simple "identifier = value" comma-separated list, where order doesn't matter, and you can make any of the parameters optional with some default value. Normally, you should be able to program a single function; you won't need overloads. Of course, it is somewhat verbose when you create a function to be called this way, because it requires several macros that are very complex under-the-hood (it uses the Boost Preprocessor library). But the point is that it makes the call-site syntax simple and clear (which is the point of writing library code: put all the complex constructs under-the-hood to expose the simplest and clearest possible interface to the user).

The way I've set things up now is that the user creates the predefined Earth, with all the preloaded settings, and then uses a set-function to reset the value of the mass. Do you have a suggestion as to a better way to go about doing something like this?

Oh yes. There are better ways to do this. Personally, I just use serialization to take care of this. Serialization means to take your object hierarchy and save it to a file (or files) to be reloaded later. For example, I build multi-body dynamics models (i.e. robots): I usually make a simple program that builds the "default" model as an object hierarchy (e.g. links and joints, and motors, etc.) with default system parameters, then I serialize it to a file, and then I can copy and modify the file for any alternative parameter sets. Then, I have a simple "run-simulation" program that loads whatever model it is given and runs a simulation with it. One example of a serialization library is Boost.Serialization.

Another option is to have some kind of standard way by which to load configuration data. Boost.Program-Options is a good example of that.

I've kinda assumed that everything that can go in classes should for code reusability.

I will quote Alexandrescu and Sutter on that (from "C++ Coding Standards"):

Rule 37: Public inheritance is 'substitutability'. Inherit, not to reuse, but to be reused.

Which means that you should not make derived classes just because they can reuse the code that is in the base class (of course, they can reuse the base-class code, but that is no justification for deriving a class; prefer composition for that). The real justification for deriving a class is so that code which uses the base class can be reused with the derived class, not the other way around. I just wanted to be sure you understood that; it is the fundamental concept that defines polymorphism. Then there is the question of whether you need polymorphism at run-time, or whether you are OK with polymorphism at compile-time: in the former case, you use OOP (dynamic polymorphism), and in the latter, you can use Generic Programming (static polymorphism).

In fact, while I'm at it, I think I should quote a few more rules that I find to be of critical importance:

5. Give one entity one cohesive responsibility
6. Correctness, simplicity, and clarity come first
10. Minimize global and shared data
11. Hide information
13. Ensure resources are owned by objects. Use explicit RAII and smart pointers
14. Prefer compile- and link-time errors to run-time errors
32. Be clear what kind of class you're writing
33. Prefer minimal classes to monolithic classes
34. Prefer composition to inheritance
36. Prefer providing abstract interfaces
44. Prefer writing nonmember nonfriend functions
53. Explicitly enable or disable copying
56. Whenever it makes sense, provide a no-fail swap (and provide it correctly)
59. Don't write namespace usings in a header file or before an #include
60. Avoid allocating and deallocating memory in different modules
64. Blend static and dynamic polymorphism judiciously
65. Customize intentionally and explicitly
72. Prefer to use exceptions to report errors
75. Avoid exception specifications
79. Store only values and smart pointers in containers

Is such polymorphic behaviour possible if you're using procedural programming and free functions?

First of all, remember that procedural programming and object-oriented programming are not mutually exclusive; far from it, they are complementary. If you have a justification for using inheritance (more precisely, dynamic polymorphism), then you can do so. And if it is more convenient to use free functions for something, then do so. And, of course, you can call free functions from polymorphic types, and conversely, you can use polymorphic types within free functions. No problem there.

In fact, there is a much more powerful form of polymorphism available in generic programming, at least, in my opinion. But this requires knowledge of function templates and class templates. Consider this:

//You can define generic integration functions like so:

template <typename StateRateEvaluator, typename StateVector>
void euler_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end) {
  while(t < t_end) {
    x += f(x,t) * t_step;
    t += t_step;
  };
};

template <typename StateRateEvaluator, typename StateVector>
void runge_kutta4_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end) {
  //...
};

template <typename StateRateEvaluator, typename StateVector>
void rkf45_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end, double t_step_min, double t_step_max, double tolerance) {
  //...
};

//... other integrators you can think of, or care to implement.


//then you define a specific system, with its state vector class:

struct one_dof_state_vector {
  double x;
  double x_dot;
  struct rate {
    double x_dot;
    double x_dotdot;
    friend one_dof_state_vector operator *(const rate& dv_dt, double dt) {
      one_dof_state_vector v;
      v.x = dv_dt.x_dot * dt;
      v.x_dot = dv_dt.x_dotdot * dt;
      return v;
    };
    friend one_dof_state_vector operator *(double dt, const rate& dv_dt) {
      one_dof_state_vector v;
      v.x = dv_dt.x_dot * dt;
      v.x_dot = dv_dt.x_dotdot * dt;
      return v;
    };
  };
  one_dof_state_vector& operator +=(const one_dof_state_vector& dv) {
    x += dv.x; x_dot += dv.x_dot;
    return *this;
  };
};

//and its system function and parameters:

struct mass_spring_system {
  double mass;
  double stiffness;
  one_dof_state_vector::rate operator()(const one_dof_state_vector& v, double t) const {
    one_dof_state_vector::rate dv_dt;
    dv_dt.x_dot = v.x_dot;
    dv_dt.x_dotdot = - v.x * stiffness / mass;
    return dv_dt;
  };
};

#include <fstream>  // needed below for std::ofstream

int main() {
  one_dof_state_vector v;
  v.x = 1.0; v.x_dot = 0.0;
  double t = 0;
  mass_spring_system sys;
  sys.mass = 1.0;
  sys.stiffness = 0.1;
  std::ofstream out("results.dat");
  out << "% t x x_dot";
  out << std::endl << t << " " << v.x << " " << v.x_dot;
  for(int i = 0; i < 1000; ++i) {
    euler_integration(sys, v, t, 0.0001, t + 0.01);
    out << std::endl << t << " " << v.x << " " << v.x_dot;
  };
  return 0;
};

Of course, the above example is crude and simplified; it can be made much nicer. The point is that you can achieve polymorphism without requiring any kind of specific objects to be given to the integrators (they are generic parameters, and thus the name "generic programming"). The requirements are implicit: the integrator requires that the "StateRateEvaluator" be callable with the current state and the current time, and that it return the rate of change of the state vector. This is equivalent to having an implicit "abstract interface", just as a base class explicitly defines an abstract interface. Note that the interface can also be made explicit in generic programming using what are called "concepts", which are analogous to base classes. The disadvantage of the above approach is that things have to be determined at compile-time (types and functions to call have to be known at compile-time), which is why this is called "static polymorphism". Fortunately, you can wrap this in dynamic polymorphism very easily:

//say you have a base-class to evaluate a model's equations of motion:
class ModelEvaluator {
  public:
    virtual void evaluate(const double* state_vector, double* state_rate, double t) = 0;
  //...
};

//.. then you can have any derived classes you want...

//then you create a simple wrapper functor to call the base-class pointer's evaluation function:
struct ModelEvaluatorFunctor {
  ModelEvaluator* mdl; //this could also be a smart-pointer, but it is not necessary since this class is not part of the interface (should not be).
  state_rate_vector operator()(const state_vector& x, double t) const {
    mdl->evaluate(/*....*/);
    return ...;
  };
};

You can also wrap the integrator calls in a set of base and derived classes (with parameters associated to the integrator objects that are used when the integration calls are made). The advantage of this whole thing is that you can use static polymorphism (which is much more efficient and generic than dynamic polymorphism) whenever appropriate. And when you need dynamic polymorphism (and you will be surprised how seldom you need it), you can wrap the GP part into an OOP framework with minimal effort. The other advantage is that code reusability is the main motivation for GP, and it is much more scalable and flexible. This allows you to create a generic code-base to perform the under-the-hood calculations in a way that truly reduces the amount of coding you have to do to create new elements and reuse other algorithms, without the often-awkward problem of trying to fit things correctly into a class hierarchy. Then, you can expose a thin OOP wrapper that contains not much more than composition of some minimalistic classes and calls to some generic functions.


I think that your whole problem really boils down to a careful design of the abstractions that you are using, not so much the matter of PP vs GP vs OOP. For this purpose, read the "C++ Coding Standards" book that I referred to above. I believe that "Effective C++" is also very good for this. I'll reiterate aspects of designing your abstractions:

5. Give one entity one cohesive responsibility
13. Ensure resources are owned by objects. Use explicit RAII and smart pointers
32. Be clear what kind of class you're writing
33. Prefer minimal classes to monolithic classes
34. Prefer composition to inheritance
36. Prefer providing abstract interfaces
44. Prefer writing nonmember nonfriend functions

If you like, you might want to take a look at the library I am coding, here.


This is exactly what the Boost Parameter Library does. You can pass parameters in a simple "identifier = value" comma-separated list, where order doesn't matter, and you can make any of the parameters optional with some default value. Normally, you should be able to program a single function; you won't need overloads. Of course, it is somewhat verbose when you create a function to be called this way, because it requires several macros that are very complex under-the-hood (it uses the Boost Preprocessor library). But the point is that it makes the call-site syntax simple and clear (which is the point of writing library code: put all the complex constructs under-the-hood to expose the simplest and clearest possible interface to the user).

Great! That's good motivation indeed then to make use of Boost libraries. Gives me a strong argument to present to the rest of the group as to why we should include Boost.

Oh yes. There are better ways to do this. Personally, I just use serialization to take care of this. Serialization means to take your object hierarchy and save it to a file (or files) to be reloaded later. For example, I build multi-body dynamics models (i.e. robots): I usually make a simple program that builds the "default" model as an object hierarchy (e.g. links and joints, and motors, etc.) with default system parameters, then I serialize it to a file, and then I can copy and modify the file for any alternative parameter sets. Then, I have a simple "run-simulation" program that loads whatever model it is given and runs a simulation with it. One example of a serialization library is Boost.Serialization.

I'm not familiar with serialization yet, so forgive me if I state something wrong. Basically, if I understand correctly after perusing a Boost example, what this permits is that the state of a class object is written to file. This file can then be used at a later time to recover that state of the object. So, that would mean in my predefined planets, I would serialize an object of the Planet class with all the settings that pertain to my predefined Earth, for example. That would yield a file that basically describes all the settings of the Planet class to make it a predefined Earth, e.g., mass, atmosphere, gravity field etc. Then, through deserialization at a later stage, the state of the predefined Earth can be recovered for use. Is that correct?

Given that that's correct, is the idea that you serialize your object once and store the generated file in the library so that users only have to deserialize? Or do you serialize and deserialize every time you run code in the library?

As a general question, why is this better than storing the serializable data in the class itself? Is it because it's intrusive to do so? The reason I ask is that interfacing with files seems like something unstable to me, since files, particularly text files, can be corrupted easily. Not sure if this is true, but feels like it is to me. I understand that it does have the benefit of not having to have a switch statement in your class for all the different instances of the class settings, but I'm guessing the fact that you have to read from a file means that the performance gain by not using the switch statement doesn't have to be significant.

Another option is to have some kind of standard way by which to load configuration data. Boost.Program-Options is a good example of that.

This one went right over my head :P. Think I have to spend some more time looking at the tutorial example cause at first look I can't really make head or tail of the code example given.

I will quote Alexandrescu and Sutter on that (from "C++ Coding Standards"):

Which means that you should not make derived classes just because they can reuse the code that is in the base class (of course, they can reuse the base-class code, but that is no justification for deriving a class; prefer composition for that). The real justification for deriving a class is so that code which uses the base class can be reused with the derived class, not the other way around. I just wanted to be sure you understood that; it is the fundamental concept that defines polymorphism. Then there is the question of whether you need polymorphism at run-time, or whether you are OK with polymorphism at compile-time: in the former case, you use OOP (dynamic polymorphism), and in the latter, you can use Generic Programming (static polymorphism).

That just made my head spin :P.

In fact, while I'm at it, I think I should quote a few more rules that I find to be of critical importance:

Great list. I need to think a bit more about the items in that list with respect to my own code.

First of all, remember that procedural programming and object-oriented programming are not mutually exclusive; far from it, they are complementary. If you have a justification for using inheritance (more precisely, dynamic polymorphism), then you can do so. And if it is more convenient to use free functions for something, then do so. And, of course, you can call free functions from polymorphic types, and conversely, you can use polymorphic types within free functions. No problem there.

In fact, there is a much more powerful form of polymorphism available in generic programming, at least, in my opinion. But this requires knowledge of function templates and class templates. Consider this:

//You can define generic integration functions like so:

template <typename StateRateEvaluator, typename StateVector>
void euler_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end) {
  while(t < t_end) {
    x += f(x,t) * t_step;
    t += t_step;
  };
};

template <typename StateRateEvaluator, typename StateVector>
void runge_kutta4_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end) {
  //...
};

template <typename StateRateEvaluator, typename StateVector>
void rkf45_integration(StateRateEvaluator f, StateVector& x, double& t, double t_step, double t_end, double t_step_min, double t_step_max, double tolerance) {
  //...
};

//... other integrators you can think of, or care to implement.


//then you define a specific system, with its state vector class:

struct one_dof_state_vector {
  double x;
  double x_dot;
  struct rate {
    double x_dot;
    double x_dotdot;
    friend one_dof_state_vector operator *(const rate& dv_dt, double dt) {
      one_dof_state_vector v;
      v.x = dv_dt.x_dot * dt;
      v.x_dot = dv_dt.x_dotdot * dt;
      return v;
    };
    friend one_dof_state_vector operator *(double dt, const rate& dv_dt) {
      one_dof_state_vector v;
      v.x = dv_dt.x_dot * dt;
      v.x_dot = dv_dt.x_dotdot * dt;
      return v;
    };
  };
  one_dof_state_vector& operator +=(const one_dof_state_vector& dv) {
    x += dv.x; x_dot += dv.x_dot;
    return *this;
  };
};

//and its system function and parameters:

struct mass_spring_system {
  double mass;
  double stiffness;
  one_dof_state_vector::rate operator()(const one_dof_state_vector& v, double t) const {
    one_dof_state_vector::rate dv_dt;
    dv_dt.x_dot = v.x_dot;
    dv_dt.x_dotdot = - v.x * stiffness / mass;
    return dv_dt;
  };
};

#include <fstream>  // needed below for std::ofstream

int main() {
  one_dof_state_vector v;
  v.x = 1.0; v.x_dot = 0.0;
  double t = 0;
  mass_spring_system sys;
  sys.mass = 1.0;
  sys.stiffness = 0.1;
  std::ofstream out("results.dat");
  out << "% t x x_dot";
  out << std::endl << t << " " << v.x << " " << v.x_dot;
  for(int i = 0; i < 1000; ++i) {
    euler_integration(sys, v, t, 0.0001, t + 0.01);
    out << std::endl << t << " " << v.x << " " << v.x_dot;
  };
  return 0;
};

Of course, the above example is crude and simplified; it can be made much nicer. The point is that you can achieve polymorphism without requiring any kind of specific objects to be given to the integrators (they are generic parameters, and thus the name "generic programming"). The requirements are implicit: the integrator requires that the "StateRateEvaluator" be callable with the current state and the current time, and that it return the rate of change of the state vector. This is equivalent to having an implicit "abstract interface", just as a base class explicitly defines an abstract interface. Note that the interface can also be made explicit in generic programming using what are called "concepts", which are analogous to base classes. The disadvantage of the above approach is that things have to be determined at compile-time (types and functions to call have to be known at compile-time), which is why this is called "static polymorphism". Fortunately, you can wrap this in dynamic polymorphism very easily:

//say you have a base-class to evaluate a model's equations of motion:
class ModelEvaluator {
  public:
    virtual void evaluate(const double* state_vector, double* state_rate, double t) = 0;
  //...
};

//.. then you can have any derived classes you want...

//then you create a simple wrapper functor to call the base-class pointer's evaluation function:
struct ModelEvaluatorFunctor {
  ModelEvaluator* mdl; //this could also be a smart-pointer, but it is not necessary since this class is not part of the interface (should not be).
  state_rate_vector operator()(const state_vector& x, double t) const {
    mdl->evaluate(/*....*/);
    return ...;
  };
};

You can also wrap the integrator calls in a set of base and derived classes (with parameters associated to the integrator objects that are used when the integration calls are made). The advantage of this whole thing is that you can use static polymorphism (which is much more efficient and generic than dynamic polymorphism) whenever appropriate. And when you need dynamic polymorphism (and you will be surprised how seldom you need it), you can wrap the GP part into an OOP framework with minimal effort. The other advantage is that code reusability is the main motivation for GP, and it is much more scalable and flexible. This allows you to create a generic code-base to perform the under-the-hood calculations in a way that truly reduces the amount of coding you have to do to create new elements and reuse other algorithms, without the often-awkward problem of trying to fit things correctly into a class hierarchy. Then, you can expose a thin OOP wrapper that contains not much more than composition of some minimalistic classes and calls to some generic functions.

Wow, very neat. Not sure I understand all the details, but think I get the general idea. It's quite clean indeed. Right now, I've tried to avoid the use of templates, because hardly anyone in the group is familiar with the uses of templates other than the basic ones that are generally outlined in basic C++ tutorials. If it does indeed make the code more scalable and flexible and minimizes the code addition that's necessary if new models etc. are added, then it's definitely worth considering for implementation in some form or another.

I think that your whole problem really boils down to a careful design of the abstractions that you are using, not so much the matter of PP vs GP vs OOP. For this purpose, read the "C++ Coding Standards" book that I referred to above. I believe that "Effective C++" is also very good for this. I'll reiterate aspects of designing your abstractions:

5. Give one entity one cohesive responsibility
13. Ensure resources are owned by objects. Use explicit RAII and smart pointers
32. Be clear what kind of class you're writing
33. Prefer minimal classes to monolithic classes
34. Prefer composition to inheritance
36. Prefer providing abstract interfaces
44. Prefer writing nonmember nonfriend functions

Thanks, I'm gonna try to pick the book(s) up today at a local bookstore. Something that I don't quite understand is the choice of member functions versus nonmember functions. Doesn't point 44 kinda contradict the idea of using classes in the first place? Isn't a class without member functions just a data structure? If so, then why do classes exist at all? After all, there is also the "structure" object. It's kinda got me confused as to when you would choose to use a struct and when a class.

If you like, you might want to take a look at the library I am coding, here.

Great! That's really neat, thanks! Had a browse and there are some very interesting things you have there that directly pertain to things I'm working on, particularly the integrators and optimizers. Multi-body dynamics is also part of what is being worked on within my group, particularly with respect to the dynamics of re-entry vehicles, with the dynamics of the capsule and parachute included in a multi-body fashion. So, there might be more direct overlap also with some of the dynamics in your library, which is interesting from the perspective of benchmarking what we're doing.

The library I'm working on is unfortunately not open-source at the moment, but we're working our way towards that (I'm pushing for it). If you're interested though, I can ask the group if it's ok to give you access to our repository with the code so you can browse it too. We've been talking about having a computer scientist, or someone with the insights of a computer scientist, have a look at the library to see what we're doing "wrong" and in what ways we can improve our code. I know it's a big ask, but based on the feedback you've given me so far, I think you could probably point out some pretty obvious things we're doing wrong or not entirely correctly. Anyway, if you're interested in taking a look, let me know and I'll ask the rest of the group. At the least, I can show you the code that I've authored in the library, if the rest of the group doesn't favour exposing the whole repository at this time.

Thanks for the feedback once again!

Cheers,

Kartik

I'm not familiar with serialization yet, so forgive me if I state something wrong. Basically, if I understand correctly after perusing a Boost example, what this permits is that the state of a class object is written to file. This file can then be used at a later time to recover that state of the object. So, that would mean in my predefined planets, I would serialize an object of the Planet class with all the settings that pertain to my predefined Earth, for example. That would yield a file that basically describes all the settings of the Planet class to make it a predefined Earth, e.g., mass, atmosphere, gravity field etc. Then, through deserialization at a later stage, the state of the predefined Earth can be recovered for use. Is that correct?

That is correct.

Given that that's correct, is the idea that you serialize your object once and store the generated file in the library so that users only have to deserialize? Or do you serialize and deserialize every time you run code in the library?

The idea is to store the generated file so that the user only has to deserialize. Of course, you can also provide a program that generates all the predefined model files, so that the user can generate fresh ones if needed.

why is this better than storing the serializable data in the class itself?

Because it does not put "magic numbers" in your code. If you are not happy with the predefined parameters you have (either as a user or as a library coder), how do you fix it if it is coded into the class? You have to either recode the class or make a habit of resetting each parameter you are not happy with. Either solution is a maintainability nightmare. It is almost always better to put all "magic numbers" or model parameters in a set of configuration files (or serialized data). You don't require recompilation to change predefined parameters or add predefined models, and you provide an easy way for the user to tweak parameters to his liking.

The reason I ask is that interfacing with files seems like something unstable to me, since files, particularly text files, can be corrupted easily.

Code can easily be corrupted too. If the user is required to use special functions (set/get) to tweak the parameters to his liking, he has a good opportunity to screw up. This is also true with tweaking the file, but in my experience, the chance is less, if it is done right. And, the two are not mutually exclusive either, the user can load a pre-defined file and then tweak inside the code.

but I'm guessing the fact that you have to read from a file means that the performance gain by not using the switch statement doesn't have to be significant.

Performance is not a concern. You should not be loading your models (whether loading from file or from a switch-statement) within a performance-critical loop. So this is for load-time code, for which the performance hit from reading from the file to load the model parameters is not really relevant.

Doesn't point 44 kinda contradict the idea of using classes in the first place? Isn't a class without member functions just a data structure?

The rule to "prefer nonmember nonfriend functions" does not preclude the use of member functions. The point of making classes is mainly to encapsulate and to create abstractions. Basically, you create a class to provide some sort of functionality, and you hide all the data and helper functions it needs to implement that functionality; now the functionality is abstract and the data is encapsulated. This usually involves making a lot of data private (or protected), having private (or protected) member functions, and exposing an abstract interface. To do this, you need member functions. The point is you _need_ member functions, so you _use_ member functions. If you have any additional functionality you want to add to a class that is not part of the basic abstract interface, then you should prefer to implement it as a non-member function, and non-friend if you can. This allows you to reduce the abstract interface to its essentials and leave it clean of any accessory functionality. It promotes smaller, more cohesive classes that implement a clear and simple abstraction. Nonmember functions have several advantages beyond that, but I will leave it at that for now.

If so, then why do classes exist at all?

Classes don't exist to have member functions, member functions exist to have classes. Without member functions you cannot create abstractions (which is what classes are), whether that abstraction is meant to be a polymorphic interface or not.

After all, there is also the "structure" object.

Well, first of all, structures in C++ are the same as classes (only the default access rights and default inheritance are different): they can have member functions, constructors, base classes, overloaded operators, nested classes, be templated, etc. But I understand what you mean by "structure object"; the term used for this is POD (Plain-Old-Data) type. Considering what I mentioned above about reducing your interface to a minimal cohesive abstraction: if you go through that thought process for a class and realize that it reduces down to just being a simple data structure, then the data is the interface and a POD class (C-style struct) is appropriate.

Multi-body dynamics is also part of what is being worked on within my group, particularly with respect to the dynamics of re-entry vehicles, with the dynamics of the capsule and parachute included in a multi-body fashion.

Seems like interesting stuff. Your MBD guys might want to take a look at Kinetostatic Transmission Elements (KTEs), this is a very neat and simple abstraction for dynamics elements. Basically, all my dynamics elements have three functions doMotion(), clearForce(), and doForce(), and that's it, very neat way to abstract the essentials of what a dynamic/kinematic element does (propagate (or generate) motion and forces). The original development is from Prof. Andres Kecskemethy (also a mexican).

I'm also working on a Space application (on-orbit servicing and active space debris removal, using a robotic manipulator on a servicing spacecraft). So, there is also some interest on my part to eventually develop some orbit mechanics and attitude dynamics code (which will be quite trivial to add to my KTE-based MBD code, but it is not the highest priority for me right now).

The library I'm working on is unfortunately not open-source at the moment, but we're working our way towards that (I'm pushing for it).

A good argument might also be that putting a project open-source under the GPL does not give away your IP rights. You can always redistribute it under a commercial license for commercial use (dual-licensing, as Qt does, for example).

If you're interested though, I can ask the group if it's ok to give you access to our repository with the code so you can browse it too.

That could be interesting. I'm too busy to do a comprehensive code review with a real deadline, but I can take a look and give some recommendations. Of course, the code is only a part of it. I would want to look at three things:
- Code documentation (doxygen-like documentation of the code)
- The project's "coding guidelines" document (if you don't have one, it is about time you get busy making one, this is the very first issue (issue 0) that is described in "C++ Coding Standards")
- The code
Of course, doing this for free implies that you cannot expect a deadline, and that you should give proper credit for any contributions I make. Open-source is very much preferred too. PM me to go forward with this (but further questions on your code should remain in this thread, since they can help other people too). BTW, what organization are you working with?

That is correct.


The idea is to store the generated file so that the user only has to deserialize it. Of course, you can also provide a program that generates all the predefined model files, so that the user can generate fresh ones if needed.


Because it does not put "magic numbers" in your code. If you are not happy with the predefined parameters that you have (either as a user or as a library coder), how do you fix them if they are coded into the class? You have to either recode the class or make a habit of resetting each parameter you are not happy with. Either solution is a maintainability nightmare. It is almost always better to put all "magic numbers" or model parameters in a set of configuration files (or serialized data). You don't require recompilation to change predefined parameters or add predefined models, and you provide an easy way for the user to tweak parameters to his liking.


Code can easily be corrupted too. If the user is required to use special functions (set/get) to tweak the parameters to his liking, he has a good opportunity to screw up. This is also true with tweaking the file, but in my experience, the chance is less, if it is done right. And, the two are not mutually exclusive either, the user can load a pre-defined file and then tweak inside the code.


Performance is not a concern. You should not be loading your models (whether from a file or from a switch-statement) within a performance-critical loop. So this is load-time code, for which the performance hit of reading the model parameters from a file is not really relevant.

Ok, then it's as I guessed, in which case I agree that it's much better than meddling with code in the class itself. I understand that it's more robust and makes the code more modular. I guess the only question is how much effort it costs to set up the code to be serializable, as code complexity is definitely something we are averse to, given that the students starting on the project are generally not well-versed in C++.

The point to "prefer nonmember nonfriend functions" does not preclude the use of member functions. The point of making classes is mainly to encapsulate, to do data abstraction, and to create abstractions. Basically, you create a class to provide some sort of functionality, and you hide all the data and helper functions it needs to implement that functionality; now the functionality is abstract and the data is encapsulated. This usually involves making a lot of data private (or protected), having private (or protected) member functions, and exposing an abstract interface. To do this, you need member functions. The point is you _need_ member functions, so you _use_ member functions. If you have any additional functionality you want to add to a class which is not part of the basic abstract interface, then you should prefer to implement it as nonmember functions, and nonfriend if you can. This allows you to reduce the abstract interface to its essentials and leave it clean of any accessory functionality, which promotes smaller, more cohesive classes that implement a clear and simple abstraction. Nonmember functions have several advantages beyond that as well, but I will leave it at that for now.


Classes don't exist to have member functions, member functions exist to have classes. Without member functions you cannot create abstractions (which is what classes are), whether that abstraction is meant to be a polymorphic interface or not.


Well, first of all, structures in C++ are the same as classes (only the default access rights and default inheritance are different): they can have member functions, constructors, base classes, overloaded operators, nested classes, be templated, etc. But I understand what you mean by "structure object"; the term used for this is POD (Plain-Old-Data) type. Considering what I mentioned above about reducing your interface to a minimal cohesive abstraction, if you go through that thought process for a class and realize that it reduces down to just being a simple data structure, then the data is the interface and a POD class (a C-style struct) is appropriate.

Yes, ok, that makes sense too. I guess we've been making the mistake of putting as many things in classes as possible, as opposed to the strategy you're elucidating, which is to actually minimize doing so to the bare essentials. /me thinks that means having a good, long, hard look at the code we've set up already.

Seems like interesting stuff. Your MBD guys might want to take a look at Kinetostatic Transmission Elements (KTEs); this is a very neat and simple abstraction for dynamics elements. Basically, all my dynamics elements have three functions, doMotion(), clearForce(), and doForce(), and that's it: a very neat way to abstract the essentials of what a dynamic/kinematic element does (propagate (or generate) motion and forces). The original development is from Prof. Andres Kecskemethy (who is also Mexican).

I'm also working on a Space application (on-orbit servicing and active space debris removal, using a robotic manipulator on a servicing spacecraft). So, there is also some interest on my part to eventually develop some orbit mechanics and attitude dynamics code (which will be quite trivial to add to my KTE-based MBD code, but it is not the highest priority for me right now).

Interesting. In fact, there are two MSc students in the project working on space debris right now. They're not dealing with the robotics side of things that you're talking about; they're more interested in performing conjunction and risk analysis for debris collisions, and the subsequent debris created from a collision. So that involves propagation of Two-Line Elements and statistical analysis of the data. I know there have been space debris projects in my group in the past too, and there probably will be in the future.

A good argument might also be that putting a project open-source under the GPL does not give away your IP rights. You can always redistribute it under a commercial license for commercial use (dual-licensing, as Qt does, for example).

Yes, true, and I am a strong advocate of this. There are just some concerns that some of the work the students are doing is fairly novel, and in those cases the profs want to keep the code in-house. So what we'll probably end up doing is having an internal version within the group that includes that code, which won't be available as open-source to the rest of the world. At most, though, I think this amounts to about 20% of the code we have now. Anyway, we have a major internal release coming up at the end of August, and after that I'm going to push for the next big release to finally be open-source.

That could be interesting. I'm too busy to do a comprehensive code review with a real deadline, but I can take a look and give some recommendations. Of course, the code is only a part of it. I would want to look at three things:
- Code documentation (doxygen-like documentation of the code)
- The project's "coding guidelines" document (if you don't have one, it is about time you get busy making one, this is the very first issue (issue 0) that is described in "C++ Coding Standards")
- The code
Of course, doing this for free implies that you cannot expect a deadline, and that you should give proper credit for any contributions I make. Open-source is very much preferred too. PM me to go forward with this (but further questions on your code should remain in this thread, since they can help other people too). BTW, what organization are you working with?

Yes, I fully understand that there will be no deadline. Any feedback on the actual code we have, based on what you've told me so far, would be valuable, I think. In addition, to clarify: giving proper credit is something I'm big on, so you will certainly be credited appropriately.

We have all our code Doxygen'd. In addition, we have an "N Commandments" document that we've set up with best practices/golden rules, which we provide to new developers on the project. We have a Users' Manual and a Developers' Manual, and we also have quite a good wiki site. So I think the documentation is quite good at present.

I'm glad you are interested in having a look at the project. I understand fully that you can't provide deadlines for code review, and that the code review will also not be exhaustive/extensive. I'm actually a PhD student, so I fully understand time pressures.

The only thing that I need to still do is to ask the rest of the group if they are ok with me exposing the code to the outside world at this moment. I'm hoping/guessing that that's a formality once I explain to them how helpful you've been so far.

I'll send you a PM after this reply with more details.

And yes, I'll post any code related questions here so others can benefit from the discussions too.

Thanks a lot!

Kartik
