Sending and Interpreting Commands over a socket

Question

zachattack05 70 Posting Pro in Training

13 Years Ago

I've been struggling with this for a while.

It's not so much a "problem" as it is an approach. I've been working on my client/server socket program and I've reached the point in my planning (which isn't very far in) where I really need to think about how the server will receive commands and respond to them.

Since the server will accept commands from network clients, interpret them, and (if necessary) run a query on a local SQL server and return the results to the client, string-based commands don't seem optimal to me; serialized objects sound better.

I was hoping someone could take a minute and give me some feedback on how I plan to approach this.

I know that protocols are all different, and I'm not really asking about that; I have that part. The actual data being transmitted is where I'm stuck.

My current idea is to have one class for each command type (get data, save data, update data, delete data, etc.), with each class publicly accessible to both the server and the client. The advantage is that each class would contain only the data needed to perform that specific task. An update, for example, would contain a string holding a catalog name and an integer for the row to update, or maybe just a datarow and the catalog name. The command type could be transmitted with the object over the socket so that the server doesn't have to work out what type of command it is; it would read the type from the socket stream and then send the object to a queue to be processed.

Is there any downside to this approach that anyone can see? Most socket server examples I've seen are string based: get a string on the socket, show it in the terminal, send the same data back to the sender, close the socket... boring, and honestly quite useless.

I'd appreciate any feedback on sending objects over the network socket stream. Anything I should be wary of?

mcriscolo 47 Posting Whiz in Training

13 Years Ago

The main reason you see strings in all of these examples is that strings require the least amount of marshaling to push them across a connection. A string-based protocol is not necessarily a bad implementation; however, if you are in control of the client and server, then you can be a bit more elegant.

You may be able to use .NET object serialization. You declare your class then mark it as [Serializable]. You can then use the BinaryFormatter class to serialize and de-serialize your class to a stream. There are a lot of examples of .NET serialization on the web to examine.

Another approach is to create your own classes that wrap up the details of formatting your data into a byte array suitable for transmission over a socket. You can create a basic object (call it a "NetMsg" object) that handles the basics of wrapping a message up into a byte array; for example, every message will probably have a distinct message type and a length. You can then derive a subclass from this class and add the details for the specifics of that type of message. You could have a "SaveDataNetMsg" object, for example. This message would override an "encode" method from the parent that would take the specifics of a "SaveData" message and encode them into a byte array (the base class would take care of encoding the length and the message type).

Once you have the object's data in a byte array, you can squirt that across your socket to the server. When the server gets it, it will have to examine the first few bytes so it can un-marshal the data and reconstruct the object on the server side. This assumes you lead the binary data with the length, then the type: 2 integers, 8 bytes in total. Your derived class will override a "decode" method that takes the byte array and reassembles the class.
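A minimal sketch of the base-class idea described above. It is not the code from any real system: the member names, the type id value 2, and the two integer fields are hypothetical, and it assumes the length field counts the whole message including the 8-byte header. Both header integers are written big-endian with BinaryPrimitives so the bytes are the same on any host.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical base class: owns the 8-byte header (length, then type).
public abstract class NetMsg
{
    public const int HeaderSize = 8;

    // Each subclass reports its own message type id.
    public abstract int MsgType { get; }

    // Subclasses encode and decode only their own fields.
    protected abstract byte[] EncodeBody();
    protected abstract void DecodeBody(ReadOnlySpan<byte> body);

    public byte[] Encode()
    {
        byte[] body = EncodeBody();
        byte[] wire = new byte[HeaderSize + body.Length];
        // Assumption: the length covers header plus body.
        BinaryPrimitives.WriteInt32BigEndian(wire.AsSpan(0, 4), wire.Length);
        BinaryPrimitives.WriteInt32BigEndian(wire.AsSpan(4, 4), MsgType);
        body.CopyTo(wire, HeaderSize);
        return wire;
    }

    public void Decode(byte[] wire)
    {
        if (wire.Length < HeaderSize)
            throw new ArgumentException("Message is shorter than its header.");
        int length = BinaryPrimitives.ReadInt32BigEndian(wire.AsSpan(0, 4));
        int type = BinaryPrimitives.ReadInt32BigEndian(wire.AsSpan(4, 4));
        if (length != wire.Length || type != MsgType)
            throw new InvalidOperationException("Header does not match this message.");
        DecodeBody(wire.AsSpan(HeaderSize));
    }
}

// Hypothetical "save data" message carrying two integers.
public sealed class SaveDataNetMsg : NetMsg
{
    public const int TypeId = 2;
    public override int MsgType => TypeId;

    public int CatalogId { get; set; }
    public int RowId { get; set; }

    protected override byte[] EncodeBody()
    {
        byte[] body = new byte[8];
        BinaryPrimitives.WriteInt32BigEndian(body.AsSpan(0, 4), CatalogId);
        BinaryPrimitives.WriteInt32BigEndian(body.AsSpan(4, 4), RowId);
        return body;
    }

    protected override void DecodeBody(ReadOnlySpan<byte> body)
    {
        CatalogId = BinaryPrimitives.ReadInt32BigEndian(body.Slice(0, 4));
        RowId = BinaryPrimitives.ReadInt32BigEndian(body.Slice(4, 4));
    }
}
```

Calling Encode() on a SaveDataNetMsg yields 16 bytes here (8 header, 8 body); passing those bytes to Decode() on a fresh instance restores both fields.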

I use this as the basis of a system that pushes a fairly large number of messages per second across 8 machines.

Why not use .NET object serialization? In my case, 4 of the 8 machines are Solaris servers – their side of the code is written in C++ - the other 4 are in .NET/C# on Windows. The latter approach gives you the flexibility to do that – use sockets to connect to machines of different types and pass objects over them (you will have to be cognizant of byte-ordering for the integers, however!).

Hope this helps!

mcriscolo 47 Posting Whiz in Training

13 Years Ago

Attached is some code that implements the base concept. Using the DLL, declare a new class, and have it implement the "IMsg" interface, and subclass the "BaseMsg" class, like so:

public class MyDataMsg : BaseMsg, IMsg

You then implement the "encode" and "decode" methods to put your object's specific data into the data stream. Finally, you use code like this:

byte[] baData = oMsg.encode();

to extract the data for your object as a byte array. You can push this over a socket, write it to a file, etc. When you want to decode it, you read the data into a byte array, then execute code like this:

MyDataMsg oNewMsg = new MyDataMsg();
oNewMsg.decode(byteArrayOfData);

Of course, there are always provisos. When you read the data from a socket, file, etc., how do you know what type of object it was? The first 8 bytes of the data are the length and the "command id", respectively. Both are 4-byte integers, encoded onto the stream in network (big-endian) format. When you read in the data, you extract the 2 ints and convert them to host format. You use the length to allocate a byte array large enough to hold all of the data, then use the "command id" to allocate the correct descendant of "BaseMsg". Finally, you can call the "decode" method to un-marshal the data and produce a new object with the data intact.
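The read side of that header could be sketched roughly as follows. This is not taken from the attachment: FrameReader, ReadFully, and the 1 MiB size cap are hypothetical, and it assumes the length field counts the 8 header bytes. The loop matters on sockets, where a single Read call may return fewer bytes than requested.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class FrameReader
{
    public const int HeaderSize = 8;
    public const int MaxFrameSize = 1 << 20; // hypothetical sanity cap (1 MiB)

    // Keeps reading until the buffer is full; one Read may return less.
    private static void ReadFully(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int n = stream.Read(buffer, offset, buffer.Length - offset);
            if (n == 0)
                throw new EndOfStreamException("Stream ended mid-message.");
            offset += n;
        }
    }

    // Returns the command id and the bytes that follow the header.
    public static (int CommandId, byte[] Body) ReadFrame(Stream stream)
    {
        byte[] header = new byte[HeaderSize];
        ReadFully(stream, header);

        // Both integers arrive in network (big-endian) order.
        int length = BinaryPrimitives.ReadInt32BigEndian(header.AsSpan(0, 4));
        int commandId = BinaryPrimitives.ReadInt32BigEndian(header.AsSpan(4, 4));

        // Reject lengths that cannot be right before allocating anything.
        if (length < HeaderSize || length > MaxFrameSize)
            throw new InvalidDataException("Implausible message length: " + length);

        byte[] body = new byte[length - HeaderSize];
        ReadFully(stream, body);
        return (commandId, body);
    }
}
```

The returned command id is what the receiver would use to pick which message class to construct before handing it the body to decode.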

The sample program creates an object, writes it to a file, reads it back into a second object and prints the contents.

This is, of course, a heavily modified version of the code I use for the system I maintain (there are IP restrictions that prevent me from disclosing the code completely), but hopefully you'll find it useful. It may not be perfect, but in various forms over the last 17 years (C, C++, Java, .NET and a fledgling version in Objective-C), it's served me pretty well. Truth be told, you may find some horrid constructs in the code, but again, I pulled it together quickly. Let me know if you have any problems with the attachment.

Good luck!

NetMsg.zip (3.55 KB)

mcriscolo 47 Posting Whiz in Training

13 Years Ago

That's exactly how I do it. I actually have a class that just has the enums in it. We have just over a hundred different commands, so yeah, just trying to remember the IDs would be a hassle.

mcriscolo 47 Posting Whiz in Training

13 Years Ago

Well, we have a strict policy that new messages go at the end of the list. I don't know how many folks you'll have modding your code, but I'd try to go with something like that.

Strings - you really have to specify a length for each one in your classes. Think of a pure C/C++ struct - you *could* use "char *", but really, the structs will have a field and a defined length, like "char myString[24]".

In C#, you just need to set sizes for each string. I use constants at the top of each class, and in the encode/decode methods, use those constants to set the size of the strings as you put them on and take them off the wire.
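One way the fixed-size string idea might look in C#. The helper name FixedString is hypothetical, and two choices here are assumptions rather than anything stated in the thread: the text is ASCII, and unused bytes are zero-filled. Unlike a C "char myString[24]", this sketch lets a string use all 24 bytes with no terminator.

```csharp
using System;
using System.Text;

public static class FixedString
{
    // Writes value into exactly `size` bytes starting at `offset`,
    // zero-padding the remainder. Throws if the text does not fit.
    public static void Write(byte[] buffer, int offset, int size, string value)
    {
        byte[] raw = Encoding.ASCII.GetBytes(value);
        if (raw.Length > size)
            throw new ArgumentException("String does not fit in " + size + " bytes.");
        Array.Clear(buffer, offset, size);
        Array.Copy(raw, 0, buffer, offset, raw.Length);
    }

    // Reads back up to `size` bytes, stopping at the first zero byte.
    public static string Read(byte[] buffer, int offset, int size)
    {
        int end = Array.IndexOf(buffer, (byte)0, offset, size);
        int length = end < 0 ? size : end - offset;
        return Encoding.ASCII.GetString(buffer, offset, length);
    }
}
```

A message class would then keep a constant such as CatalogNameSize = 24 at the top and pass it to both calls, so the encode and decode sides cannot drift apart.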

zachattack05 70 Posting Pro in Training · Answer 1 · 2011-03-16T19:24:03+00:00

Hmmm...

I'll have to re-read your post after I've had more coffee. I think I gather what you are saying, and it sounds similar to what I am already doing (or had in mind I suppose).

Thanks for the help! Would you, if you have time of course, post a "shell" of the concept? Some code that I can work through just to illustrate and better understand the idea. I would appreciate it greatly! :)

zachattack05 70 Posting Pro in Training · Answer 2 · 2011-03-23T19:25:00+00:00

I know this thread is getting moldy, but I wanted to say thanks and ask a followup question.

Do you think that having an enumeration of commands would be easier to maintain than having a list of commands that are integer based?

Granted, they would still be integer based (enums are integers after all), but do you think it might help with programming and readability, and with making sure you don't type 10001 instead of 100001 as the command (or whatever)?

Do you see any downside to this?

The only downside I can see is someone altering the enum list (which would move the integer values around), but even that could be fixed by specifying a value for each enumeration item, like so:

enum Command
{
     NoCommand = 0x0000,
     FirstCommand = 0x0001,
     SecondCommand = 0x0002
}

That way even if someone inserted a new command between the first and second one in the list, the command numbers don't change.

What do you think?

zachattack05 70 Posting Pro in Training · Answer 3 · 2011-03-23T19:46:27+00:00

Do you normally explicitly define the integer in your command enum, use the enum "string" name in the switch, or just leave the default values (0, 1, 2, 3, 4, etc.)?

Sending a variable-length string would be a nightmare, I would imagine (I would probably shoot myself), but an integer is a set size. I would guess, though, that if you are using a switch statement with integer values, you would HAVE to explicitly define the enum values, unless you only ever add new commands to the end of the list. Otherwise, if you wanted them in alphabetical order and added one, the command numbers would change and the wrong methods would be called: the server would think command 5 is still command 5, not realizing that the real command 5 moved down 2 places because 2 new commands were added above it.

See what I mean? Does that make sense or am I worrying about nothing?
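A small sketch of how explicit values and a switch on the enum names can work together. The command names and the CommandRouter class are hypothetical; the point is only that the integer read off the wire is checked against the enum before it is trusted, and that a command added later takes the next unused value rather than shifting the others.

```csharp
using System;

public enum Command
{
    NoCommand = 0x0000,
    GetData = 0x0001,
    SaveData = 0x0002,
    DeleteData = 0x0003 // added later, at the end, with the next free value
}

public static class CommandRouter
{
    // wireValue is the integer taken from the message header.
    public static string Route(int wireValue)
    {
        // An unknown number should not be cast and silently accepted.
        if (!Enum.IsDefined(typeof(Command), wireValue))
            return "error: unknown command " + wireValue;

        // The switch uses names, so reordering the enum source
        // cannot change which branch a given number reaches.
        switch ((Command)wireValue)
        {
            case Command.GetData: return "running get";
            case Command.SaveData: return "running save";
            case Command.DeleteData: return "running delete";
            default: return "nothing to do";
        }
    }
}
```

Because every member has an explicit value, the declarations could be sorted alphabetically without changing what goes over the wire.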

zachattack05 70 Posting Pro in Training · Answer 4 · 2011-03-24T19:27:10+00:00

Good points.

I think I'm really hitting a different wall though. After talking this over with you in this discussion, most of what you are saying makes sense: some of it I had thought of already, some is a different take on things I've thought of, and there are new ideas too.

I think the problem is the actual design of the system. I think it's a conceptual thing.

Maybe you can help shed some light or offer some advice on that end? The thing is, I'm having problems coming up with the objects (serialized classes) to send.

I had ideas of creating a class for each command, each containing its own set of fields and methods to help the client and server process the incoming and outgoing data stream. The sheer number of classes I would have to create seems enormous and, to be honest, I'm scared to start creating classes just to find out on the 50th one that oops! I should actually do it *this* way instead... which has happened before.

I was hoping to have one class that the client could serialize with information and send to the server, the server would deserialize it, find the field that contains the command and based on that command would look for other fields. Process the command, create a new class instance with the response and send it back to the client.

Two classes: command and response. I just can't figure out how to implement it, since each command could require any number of variables.

Getting data from a server is easy "Hey! Lemme see this..."

But updating data is tricky it seems "Okay, update this and this and this, but not this."

It all just seems a bit daunting, and I'm scared to start in one place only to find out it's not the best way.

mcriscolo 47 Posting Whiz in Training · Answer 5 · 2011-03-24T23:46:09+00:00

All valid concerns. However, at some point, you'll have to get started with things :). I don't think there's a single system I've ever developed where I didn't want to go back and redo some aspect of it after the fact. That "hindsight is 20-20" thing is very prevalent in software development!

All that said, I would suggest going with a simpler approach on the messages, rather than a message that would have a whole lot of data in it - and you may only be using a portion of it. That is, I'd treat the network as a precious resource - and only send over data that you know you are going to use. Some may argue that today's networks have tons of bandwidth - and they do - but if you design your software like that you almost always run into trouble later down the road.

Yes, a lot of smaller messages may be more of a management issue, but if you pile everything into one message - and then have to make changes to it - you risk possibly breaking code that was dependent on an earlier version of that message.

Even if you have to send multiple small messages to accomplish a task - that may be better from an "atomicity" (is that a word?) standpoint than sending a larger data package that you have to pick through. Perhaps at a later time, if you find that for a certain transaction, you are always sending the same 3 messages each time, you could combine those particular ones into a single message.

Hope this helps.

zachattack05 70 Posting Pro in Training · Answer 6 · 2011-03-25T03:38:21+00:00

Would you say then that serializing and deserializing classes is a bad approach? I'm not sure how much bandwidth something like that would take. I guess I could always serialize my classes to a file and get an idea of their size, but I'm really not sure.

mcriscolo 47 Posting Whiz in Training · Answer 7 · 2011-03-25T18:36:39+00:00

Not at all. However, when you talk about "serializing", I'm assuming that you mean a good chunk of the discussion we've had in this thread, not necessarily the .NET version of serialization. First, I'm not as familiar with that, as I've stated earlier, but I feel certain that you can do that over a socket (any stream, for that matter).

The approach I've used is a custom serialization process. I wouldn't be able to use .NET serialization, since I have UNIX processes as some of the endpoints, and didn't want to have to "invent" the serializer/deserializer on that platform, or perhaps, attempt to use Mono or something like that (OK, the real truth is the UNIX side of the software was written before .NET was even invented). It works for me, and should work in most applications where you're having to send data over sockets to various processes.

As far as size is concerned, you'll have to do some tests to see how large your objects are and determine if that's the best approach. However, how else would you do it? If you have to get the data to a process on another machine, it's going over the network somehow - either directly, via your soon-to-be-developed solution, or via FTP/SCP/Samba/E-Mail - i.e., file transfer - but one way or another, the data's got to get there.

Start small, run some tests and see what results you get. Hopefully, you can gather enough data to help make a good choice.
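A quick size test does not need a file; packing into a MemoryStream and checking the length is enough. In this sketch BinaryWriter is only a stand-in for whatever serializer ends up being used, and SizeProbe and the "update" payload (a row number plus a catalog name) are hypothetical.

```csharp
using System;
using System.IO;

public static class SizeProbe
{
    // Packs a hypothetical "update" payload and returns the raw bytes.
    public static byte[] PackUpdate(int row, string catalog)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(row);     // 4 bytes
            writer.Write(catalog); // length prefix followed by UTF-8 bytes
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

For example, SizeProbe.PackUpdate(42, "Customers").Length is 14: 4 bytes for the integer, 1 byte for the string's length prefix, and 9 bytes of text. Running the same measurement on real objects gives a per-message figure to multiply by the expected message rate.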

zachattack05 70 Posting Pro in Training · Answer 8 · 2011-03-26T03:26:29+00:00

You know, the more I think about it, the more I'm thinking that I need to create a class for each command. Like you said (which is true), the data has to get there. Instead of having the client send part of it, the server receive it and ask for another part, and the client respond, you just send everything at once in one big chunk; it's over and done with, and the server and client both have to do less work. If the server gets the object and finds that information it needs is missing, it logs an error and sends an error response back.

I think I enjoy making my life more complicated than it needs to be.

A single class for each command and response would be simple to follow: send the length of the whole message, send an ID (that corresponds with an enum) so the receiver knows what kind of command/response it is getting, then send the actual object. Easy. The receiver can get the class object and put it back together, and it already knows the class type because it was included in the message. Easy as pie (well, I think).

Also, to be honest when I say serializing...I mean using a binary formatter and getting the byte array of a class, and sending that array over the network where the recipient will put it back together again.

Is that a bad idea?

mcriscolo 47 Posting Whiz in Training · Answer 9 · 2011-03-26T04:03:54+00:00

Sounds like a plan. One class per command keeps it straightforward. Good luck!