Hello,

I am making a number of libraries to help me with my projects and I am torn between a few approaches. I will show you what each looks like. I have no idea which would be considered 'better' in general.

Locals:

#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus
extern "C" {
#endif

typedef struct { /*something here*/ } myNewTypeTAG, *myNewType;

myNewType function(myNewType);  // for example

#ifdef __cplusplus
}

class myNewTypeCppMode
{ 
  //private:
    myNewType t;
  public:
    myNewTypeCppMode &function(myNewType o) {
      t = ::function(o);  // call the C function, not this member
      return *this;
    }
};

#endif

#endif

Context (weightless):

#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus
extern "C"{
#endif

typedef struct { /*something here*/ } myNewTypeTAG__, *myNewType__;

typedef size_t myNewType;  // objects are referred to by index

typedef struct { 
  myNewType__ *x;  // array of pointers to the real objects
  size_t s;
} myNewContextTAG, *myNewContext;

myNewContext getContext();

myNewType function(myNewType, myNewContext);  // uses context->x[object] to access the data

#ifdef __cplusplus
}

class myNewTypeCppMode
//see above example

#endif

//optional main override:
#ifdef OVERRIDE_MAIN

int myNewMain( /*any args I wish to pass*/ , myNewContext);

#define main(X,Y) \
  main(X,Y) { \
    myNewContext con = getContext(); \
    return myNewMain( /*args again*/ , con); \
  } \
  int myNewMain( /*args*/ , myNewContext con__)

#define REALfunction(X) function(X,con__)

#endif // OVERRIDE_MAIN


#endif

Context (heavy):

#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus
extern "C"{
#endif

typedef struct { /*something here*/ } myNewTypeTAG__, *myNewType__;

typedef size_t myNewType;  // objects are referred to by index

typedef struct {
  myNewType__ *x;  // array of pointers to the real objects
  size_t s;
} myNewContextTAG, *myNewContext;

myNewContext current_context__;

myNewContext newContext(); // also sets the context

void setContext(myNewContext);

myNewType function(myNewType); // uses current_context__->x[object] to access the data

#ifdef __cplusplus
}

class myNewTypeCppMode
//see above example

#endif

//optional main override:
#ifdef OVERRIDE_MAIN

int myNewMain( /*any args I wish to pass*/ );

#define main(X,Y) \
  main(X,Y) { \
    myNewContext con = newContext(); \
    return myNewMain( /*args*/ ); \
  } \
  int myNewMain( /*args*/ )

#endif // OVERRIDE_MAIN


#endif

I tend to prefer the heavy context method, but that breaks the 'rule' that libraries should be weightless (even if it is only 1 pointer too heavy). I do not like the locals idea because then if people don't delete every object they make, you can end up with a memory leak. Which way is considered 'best'?

Edited 3 Years Ago by mike_2000_17: Fixed formatting

You are going to have to explain what you are actually trying to achieve, because it's really not clear at all from the code.

Then, there are a few obvious problems, of course.

First, seeing stuff like this:

typedef struct { /*something here*/ } *myNewType;

makes steam come out of my ears. Don't "hide" pointer types behind a typedef, especially not raw pointers, and it's even worse when you don't even mention the fact that it is a pointer in the name of the typedef. Just don't do it. I can live with people doing things like typedef std::shared_ptr< MyClass > MyClassPtr;, because there is a benefit in shortening the type name and passing a shared-pointer around is quite safe. For raw pointers, I say, no way José!

Second, defining a MACRO to replace the main function, well... that's just terrible. Any solution that requires this (or similar nasty hacks) is a bad solution, period.

Third, I'm trying to understand whether you are trying to wrap a C library with some C++ code or whether you are trying to expose a C++ library with a C interface. Although the latter is more usually the case, your code is very odd if that's what you are trying to do. The sort of canonical code for exposing a C++ library via a C-compatible interface goes something like this:

#ifndef MY_CLASS_H
#define MY_CLASS_H

#ifdef __cplusplus

class MyClass {

  /* some C++ code, as usual */
  public: 
    void foo();  // some member function.

};

#else

typedef struct MyClass MyClass;  // forward-declaration, for C interface (the typedef lets C code write MyClass* too).

#endif


// Then, the C interface:

#ifdef __cplusplus
extern "C" {
#endif

MyClass* MyClassCreate();       // allocate / construct / return pointer
void MyClassDestroy(MyClass*);  // take pointer / destruct / deallocate

// and then, member functions like this:
void MyClassFoo(MyClass*);

#ifdef __cplusplus
}
#endif

#endif // MY_CLASS_H

// MyClass.cpp:

#include "MyClass.h"

MyClass* MyClassCreate() {
  return new MyClass();
}

void MyClassDestroy(MyClass* p) {
  delete p;
}

void MyClassFoo(MyClass* p) {
  if(p)
    p->foo();
}

The above is pretty much the easiest and most usual way to export C++ classes as a C interface. In other words, provide functions to allocate and deallocate the resource (object), wrap all the members of the class as free functions taking the object pointer as first argument, and expose only opaque pointers (or handles) either through a forward-declaration (like above) or through a void-pointer that you cast back and forth.

What your code seems to be doing is the opposite. You create some C-style struct to store all the important info of the class, expose that as a C struct at the interface, and then store it inside your C++ class (i.e., a wrapper for the C library code). I don't know why you would deliberately want to do this; you get the worst of both worlds. You have to go by the rules of C for all the relevant parts of the code, and then somehow make a pretty package of C++ on top. This kind of pattern is what you see when people are stuck with a C library and need to incorporate it into some C++ OOP-ish code, and generally, you do it a bit differently too (i.e., as a PImpl / Cheshire Cat).

Fourth, I don't see any significant difference between the "heavy" and "light" context versions, except that the heavy version exposes unnecessary implementation details and is vulnerable to initialization order problems. In other words, the heavy version is just a bad implementation of the "light" version, as far as I can tell. The one thing that is better about the "context" versions is that you expose a "handle" (an opaque token) to the user, which is much better than exposing a pointer to a struct that you have actually defined in the interface (even classic C APIs like Win32, POSIX, the X Window System, or OpenGL generally avoid that). So, I guess I would have to prefer the "context" version just on that basis.

I tend to prefer the heavy context method, but that breaks the 'rule' that libraries should be weightless

Since when are libraries supposed to be weightless? Whatever that means to you. A golden rule of libraries is that they should take care of themselves and not expose vulnerable implementation details to the outside. Usually, this means that they expose only opaque pointers and specific collections of information, and they manage their own objects internally. If you have to give the library some "weight" to be able to do this, then by all means do so.

I do not like the locals idea because then if people don't delete every object they make, you can end up with a memory leak.

I would say, go with the flow. C-style code doesn't have managed memory or managed contexts and stuff like that. You create / allocate stuff by calling some specific function and you destroy / deallocate stuff by calling a corresponding function. This is the way it is, this is the way C programmers expect things to be, this is what you have to deal with when wrapping a C-library resource within a C++ RAII wrapper class. Going with what is expected and idiomatic is often much safer than trying to come up with some odd scheme to "solve" the problem. Providing a C interface doesn't just mean to provide a C-compilable header file and library, it means exposing a C-style interface, that is, following the idioms and established practices of that language.

Again, I'm not really sure what you are trying to do.

Wow... I had no idea you could do that little struct myClass thing and actually use a class in there for the c implementation. That makes my work a whole lot easier! Also I didn't realize that it was considered 'acceptable' for a c library to make you clean up your own garbage; I was trying to find a way to make it so that all my types, c or c++, would auto-delete. Anyways, I guess your solution is clearly best... But what exactly are the rules for that... can you just pass a void pointer to the c-style functions and then cast it to a class in the c++ definition of the function? I would have guessed that would be impossible. Anyways I still do not fully understand exactly how the syntax would work... would it be something like this:

//myLib.h
#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus
class myClass
{
    //privates
    public:
    //publics
    myClass();
    myClass &someFunction();//to better understand how this would work
    int operator[](int)const;//again... for understanding purposes
};
#else
typedef void *myClass;
myClass myClassConstructor();
const myClass myClassSomeFunction(myClass);//since a const pointer can act like a reference if necessary
const int myClassOperatorSubscript(int,const myClass);
#endif

#endif

//myLib.cpp
#include "myLib.h"
myClass::myClass(){/*something*/}
myClass &myClass::someFunction(){/*something*/}
int myClass::operator[](int i){/*something*/}
void *myClassConstructor(){return new myClass;}
const void *myClassSomeFunction(void *x){return &((myClass*)x)->someFunction;}
int myClassOperatorSubscript(int i, const void *x){return ((myClass*)x)[i];}

Then just compile it as a static link library and I'm done?

I had no idea you could do that little struct myClass thing and actually use a class in there for the c implementation.

In C++, struct and class can be interchanged at will. The only difference between the two is the default access rights (private for class, and public for struct) and the default inheritance (private for class, and public for struct), but these only matter for the declaration of the class/struct (i.e., where you have all the data members and functions). When the same class name appears in multiple places, it is usually because there is one declaration and then a number of other things like forward declarations and friend declarations. In those multiple places, it doesn't matter (strictly speaking) whether you write struct or class. However, some compilers might complain with a warning.

can you just pass a void pointer to the c-style functions and then cast it to a class in the c++ definition of the function? I would have guessed that would be impossible.

There is a requirement in the C++ standard that states that any pointer type can be cast to a void* and that for any type A, a pointer to an object of type A (let's call it a_ptr) can be cast using void* v_ptr = reinterpret_cast< void* >(a_ptr); and then cast back to a pointer to type A, yielding the same as the original pointer, i.e., ( a_ptr == reinterpret_cast< A* >(v_ptr) ) will always be true.

This is pretty much the only "safe" use of void-pointers, that is, casting a pointer to a void-pointer and then casting it back to the original pointer type. Casting to a void-pointer is always well-defined; it is the cast back that is only well-defined when you cast back to the exact original type. That cast-to-void-ptr-and-cast-back round trip is well-defined, and useful in this particular case (and that's essentially why this requirement exists).

Personally, however, for C interfaces, I prefer to just use the forward-declaration as I demonstrated in the last post, it's much cleaner that way. And because there is only a forward-declaration, the type remains "incomplete" which means that it is literally as opaque as a void-pointer.

would it be something like this:

Not quite, you made a few mistakes. Here's the corrected version:

//myLib.h
#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus

class myClass
{
    //privates
  public:
    //publics
    myClass();
    ~myClass();
    myClass& someFunction();   //to better understand how this would work
    int operator[](int) const;   //again... for understanding purposes
};

#endif   // Notice, the remainder is compiled in either C or C++:

typedef void* myClassHandle;    // AARRGH! Don't use the name 'MyClass' to mean a pointer!

// You forgot the extern "C" declaration:
#ifdef __cplusplus
extern "C" {
#endif

myClassHandle myClassConstructor();
void myClassDestructor(myClassHandle);  // <-- don't forget the destroy / free function!

myClassHandle myClassSomeFunction(myClassHandle);
int myClassOperatorSubscript(int, const myClassHandle);

#ifdef __cplusplus
}
#endif

#endif


//myLib.cpp

#include "myLib.h"

myClass::myClass() { 
  /*something*/
}

myClass::~myClass() { 
  /*something*/
}

myClass& myClass::someFunction() {
  /*something*/
  return *this;
}

int myClass::operator[](int i) const {   // <-- the constness of the function must match!
  /*something*/
}


myClassHandle myClassConstructor() {
  return reinterpret_cast< void* >( new myClass );
}

void myClassDestructor(myClassHandle p) {
  delete reinterpret_cast< myClass* >( p );
}

myClassHandle myClassSomeFunction(myClassHandle p) {
  return reinterpret_cast< void* >( & ( reinterpret_cast< myClass* >( p )->someFunction() ) );
}

int myClassOperatorSubscript(int i, const myClassHandle p) {
  return reinterpret_cast< const myClass* >( p )->operator[](i);
}

However, there is a much nicer way to do the above. One thing to understand about C is that it is not type-safe at all, and one of the manifestations of that is the fact that you can't overload functions (and that you have to use extern "C" when writing C-compatible functions in C++). In other words, the C language implicitly forgets the type of the parameters passed at the call-site and just assumes that they had the right type when entering the function body. And there is no "conversion" involved anywhere. So, you can use that to avoid all those casts by simply telling the C compiler that the functions take a void* and then program them to take a myClass* on the C++ side. This gives the following:

//myLib.h
#ifndef SOME_GUARD
#define SOME_GUARD VERSION_NUMBER

#ifdef __cplusplus

class myClass
{
    //privates
  public:
    //publics
    myClass();
    ~myClass();
    myClass& someFunction();   //to better understand how this would work
    int operator[](int) const;   //again... for understanding purposes
};

typedef myClass* myClassHandle;  // on C++ side, a handle is a pointer to myClass.

#else

typedef void* myClassHandle;    // on C side, a handle is a void-pointer.

#endif

#ifdef __cplusplus
extern "C" {                    // on C++ side, this will strip type information
#endif                          // from the symbol-table.

myClassHandle myClassConstructor();
void myClassDestructor(myClassHandle);

myClassHandle myClassSomeFunction(myClassHandle);
int myClassOperatorSubscript(int, const myClassHandle);

#ifdef __cplusplus
}
#endif

#endif


//myLib.cpp
//  This code is C++ code, so 'myClassHandle' is a 'myClass*' type.

#include "myLib.h"

myClass::myClass() { 
  /*something*/
}

myClass::~myClass() { 
  /*something*/
}

myClass& myClass::someFunction() {
  /*something*/
  return *this;
}

int myClass::operator[](int i) const {   // <-- the constness of the function must match!
  /*something*/
}


myClassHandle myClassConstructor() {
  return new myClass;
}

void myClassDestructor(myClassHandle p) {
  delete p;
}

myClassHandle myClassSomeFunction(myClassHandle p) {
  return &( p->someFunction() );
}

int myClassOperatorSubscript(int i, const myClassHandle p) {
  return (*p)[i];
}

As you can see, things could hardly get any simpler. But still, I prefer the forward-declaration version because it avoids having to define this myClassHandle type, and works the same otherwise.

Then just compile it as a static link library and I'm done?

Yes, either as a static link library or as a dynamic link library (DLL or .so). Of course, in the DLL case, you'll have to add some specifiers to the C interface functions, such as __declspec( dllimport ) and __declspec( dllexport ), as well as __stdcall (to follow Windows "traditions").

I was assuming that this is for a DLL. Because there isn't much reason to go through all this trouble in other cases. When using a Microsoft-style static link library (.lib), you can use C++ constructs because the static link library will have to have the correct binary compatibility anyways. In all other cases (Unix-like systems, and Mac OSX), binary compatibility is guaranteed by the GNU/Intel Itanium C++ ABI specification (which hasn't changed in a decade now), so it's safe to use C++ throughout. And, in Unix-like environments, static and dynamic libraries work essentially the same from a programmer's perspective. So, the only real case remaining is when writing DLLs (on Windows). But, of course, this code works in all cases, and will be usable from pure C code.

Okay... I am still a little confused. I have written DLLs and SLLs? before, but I never fully understood them. For instance I never understood what extern "C" { does exactly. Also, what do you mean that with a .lib, C++ constructs are fine? Wouldn't I still have to define c-style ones too if I want it to be useable in c?

I have written DLLs and SLLs? before, but I never fully understood them.

Ok. Let's take it from the top. When the compiler compiles each c/cpp file, it generates an object file (usually with extension .o or .obj, depending on the compiler). These object files contain, among other things, a section of compiled code for each function. That section of code is marked with a symbol to identify it (the symbol is usually the function name, possibly with some extra decoration). Any function call in the code usually remains in the object file as a symbol reference, that is, a kind of "this needs to be replaced by a jump to the final address of that function" marker.

Then, the job of the linker is to take all the object files, collect all the symbols into a massive symbol table (i.e., associate each symbol with where its corresponding code can be found), and then go through all the symbol references and replace them with a jump to the appropriate code (finding the symbol in the table is called resolving the reference (hence errors like "unresolved reference to ..."), and replacing the reference with a jump (or call) is called linking).

A static-link library means that the linking is done completely before generating the final executable or DLL. Essentially, a static-link library is just a collection of object files packed together in one file. Linking to a static-link library is pretty much identical to having all the cpp files of that library added to your "project", except that they are pre-compiled. This also means that all the code of the static libraries is put into the final executable / DLL, which is then a stand-alone executable (doesn't need the library to be installed on the system to be able to run the executable).

A dynamic-link library means that the final linking is delayed until the executable is loaded (i.e., starts running). What this means is that the linker still resolves the symbol references, but instead of linking them to the code in the DLL, it creates code to do so automatically when the executable is loaded (run). This causes a number of interesting things. First, this means that the DLL must be installed on the system (or somewhere where the executable can find it) in order to run the executable. Second, the advantage of using DLLs is that the code does not need to be incorporated into the executable, i.e., if you have many applications that use the same library of code, it's more economical to have many small executables and one large DLL as opposed to many large executables. Third, DLLs can be shared between applications, and I mean, one DLL can be used by many applications at the same time. This means that DLLs are loaded (when the first application that needs them is loaded) on their own on the system, which implies a number of technical issues that I won't get into here. Finally, this means that you can't control the version of the DLL, meaning that the actual DLL on the system might not be the same version as the one you originally wrote your code for; it also means that the DLL might not have been compiled with the same compiler (or the same version or options used). And that final issue is where a lot of the trouble comes from.

Because the DLL version cannot be controlled by the executable (at least, not in any practical way), this means that the interface (or API) of the DLL must be very stable and very strictly enforced by the people writing the library. One part of the interface is what is called the application binary interface (or ABI), which is about the way different code constructs are actually represented in memory. Things like how a C-style struct looks in memory (e.g., in what order members appear, how the bytes are aligned to the native word size (32bit vs. 64bit), etc.), and things like how a function is called (which is called the calling convention). One issue with C++ is that the C++ standard does not specify the ABI for C++ constructs (classes, member functions, inheritance, virtual tables, etc.), which is left to the implementer (i.e., the compiler vendor). On the other hand, C's constructs are so much simpler that each platform specifies a single, stable C ABI that all compilers follow. This is why it is conventional and effectively necessary to restrict yourself to a C-style interface for DLLs, because that is the only way to guarantee that the ABI will be consistent between any executable and any DLL. That's why C is the fallback language for virtually every other programming language that exists.

For instance I never understood what extern "C" { does exactly.

In C++, you can do function overloading (functions with the same name, different parameters). In C, you cannot. This means that, in C++, when the compiler creates the symbols while compiling the code, it must, somehow, encode information about the number of parameters and their types such that each symbol is unique to each function overload. The same goes for function templates and other "fancy" C++ constructs. This encoding that the compiler does is called "name-mangling", and it usually generates some weird-looking gibberish to name each function. If you want to make a C++ function callable from C code, you have to turn this off because, in C, there is no need for this type of name-mangling (no overloading, no fancy things) and the function names, unmodified, usually serve as the symbols directly (sometimes with a few added characters to encode the calling convention used). What extern "C" does is tell the compiler to generate C-compatible symbols for the functions, instead of the usual C++ name-mangling (i.e., it translates to "make these functions look like C functions"). This also means that functions marked with extern "C" will have their type information disappear (in the eyes of the linker) and cannot be overloaded.

Also, what do you mean that with a .lib, C++ constructs are fine?

It means that static-link libraries are essentially collections of object files, which are usually incompatible between compilers (versions and options) regardless of whether you use C++ or C, so restricting yourself to a C interface buys you nothing there. From my earlier explanation of the issues with DLLs that force you to use C interfaces, you can see that those issues don't apply to static-link libraries, and thus, it is safe to use C++ interfaces there.

Wouldn't I still have to define c-style ones too if I want it to be useable in c?

Yes, of course. If you want to call the code from C code, then yes, you need to make it a C interface. I was just assuming that you wanted to call the code from C++ code, which, when using a DLL, usually involves creating a C interface for the DLL and then wrapping the C interface in C++ code at the other end. If you want to statically link to some C++ code from some C code, you can certainly do so, as long as the C++ code only exposes a C interface. Also, for any other programming language, C interfaces are required as well (all languages can call C code and expose a C interface, i.e., it's the universal inter-language medium).

It's just not as usual to "downgrade" like you do (i.e. call some C++ library-code from C code), I usually see the reverse problem or language-to-language cases. ;)

Sorry about the massive question load, I just really want to ensure I understand. What you are saying is that a DLL with a properly written c-interface can have its functions (or classes if in c++) called by pretty much any language that supports calling c functions and on most operating systems, whereas a static link library would only really work for people using MY compiler (or a similar one). If this is true I think I want a DLL despite versioning issues (especially since I can always store the version in a small global variable) since a DLL will allow me to 'always' be able to use the code, although sometimes it may take some work. Particularly I am creating a little 'package' of 3 libraries that contain functionality that I find myself using a lot. I have a graphics library that simplifies my OpenGL work (no more tedious window set-up, etc), an integer library that does arbitrary precision arithmetic on integers (and extends to rationals as well), and a file handling library designed to get the raw data out of compressed or complicated file formats. I want to ensure the following conditions on the libraries:

1) I can use any number of them in any combination without an issue. (IE: I can use the graphics and math libraries at the same time)

2) I can use any number of the libraries with c OR c++. (or it seems even other languages???)

3) I can run any two programs that use the same libraries at the same time.

4) Using a library is as simple as including the correct header and then doing something simple with the file.

It seems that a DLL would satisfy this, but a SLL would have an issue with 2 since it wouldn't necessarily be c/c++ compatible, especially if I use different compilers.

What you are saying is that a DLL with a properly written c-interface can have its functions (or classes if in c++) called by pretty much any language that supports calling c functions and on most operating systems

Well, if by "most operating systems" you mean "Windows-only" (and probably a relatively recent version of it), then yes. DLLs are only for Windows, and they could use code (via OS calls) that is not supported on older versions. In every other kind of operating system (all Unix-like environments), we don't call them DLLs, we call them "shared object files" (with extension ".so"), and they are similar to DLLs, but they are also very different (and far more powerful, actually).

whereas a static link library would only really work for people using MY compiler (or a similar one).

Not exactly. Here again the world is split as Microsoft vs. the World. If you use the Microsoft compiler (MSVC) (with .lib extensions for static libraries), then, yes, static libraries are not compatible between compiler versions, but there are really only a few versions (2008 (MSVC9), 2010 (MSVC10), 2012 or 2012-nov-update (MSVC11)) that anyone would reasonably still use today. All other compilers (except for older Borland compilers) use the Itanium C++ ABI standard, and all the C/C++ static libraries (and dynamic libraries) are compatible across all versions since the adoption of that ABI standard. So, in the non-Microsoft world, all these issues we've been discussing don't really exist.

Another thing: when you compile code that uses your DLL, you still need to link with what is called an "import library", which is also a ".lib" file (paired with the DLL) that the linker uses to be aware of the symbols that exist in the DLL and to generate the code to load the DLL. And that import library is a static library, subject to the same issues. So, using a DLL doesn't really solve the problem of having to provide a static library for each compiler. If you distribute your code as a library for other people to use in their code, you still have to distribute an import library for each supported version of the Microsoft compiler. Oh, and for all other compilers, you don't need import libraries; the linker can understand the DLLs directly.

(if you ever wonder why people hate the Microsoft compilers, you have your answer)

I can always store the version in a small global variable

Sure, but that's not a really practical solution in the long run. Also, DLLs don't allow you to export global variables by default (i.e., "extern" global variables are not exported by the DLL). So, you have to use a function instead, i.e., call it to retrieve the value. The same goes for any "global data" that you want to share.

In Unix-like environments, with .so files, global variables are shared. That's one of the important differences between DLLs and .so files.

So in summary you are saying that for my purposes a static library, compiled with something other than MSVC (which is fine since I use either MinGW or g++ depending on the situation), would probably be best. Then all I need to do is include my header files and link to my library to get access to all my functions? And I can also have a global variable or two in the header file that my library functions can access?

Then all I need to do is include my header files and link to my library to get access to all my functions?

Yes.

And I can also have a global variable or two in the header file that my library functions can access?

If your headers declare some extern global variables and you have the definition of those global variables in your library's cpp files, then you can access them if you compiled your library as a static link library. Under Windows, you cannot access those variables if the library is compiled into a dynamic link library, regardless of the compiler you use (it's part of the design of the DLL system in Windows).
