Rather than do that, how about going with worker threads and I/O completion ports?
They're very lightweight, and you can associate each socket with a user value. Since a few threads are handling a multitude of sockets, that user value (an index, for example) can be used to connect the traffic to the correct data buffers.
The very quick brief: first you find out how many processors are in your computer, multiply by two, and launch that many threads. You'll have to do a bit more to micro-manage them for processor distribution if you want better balance. The threads get put to sleep waiting for an event delivered by the I/O completion port, and the I/O completion port in turn waits for events generated by your sockets. (IOCPs can be used for other event types too!) When an I/O event occurs, one thread is woken and handed the user value (the completion key) that tells it which object caused the event. So the thread wakes up, checks the status of that socket, deals with it accordingly, and then goes back to sleep.
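To make the flow concrete, here is a rough, portable sketch of the pattern. It is not the Win32 API. On Windows the real calls are CreateIoCompletionPort and GetQueuedCompletionStatus; here a plain blocking queue stands in for the completion port so the sketch runs anywhere. The names Completion, CompletionQueue, Connection and run_pool are made up for illustration, and the per-connection mutex is my own assumption about how you might guard a buffer that two workers could touch at once.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one completed I/O operation.
struct Completion {
    std::size_t key;      // user value: index of the connection that fired
    std::string payload;  // bytes that "arrived"
    bool shutdown;        // true tells the receiving worker to exit
};

// Hypothetical model of a completion port: a blocking FIFO queue.
class CompletionQueue {
public:
    void post(Completion c) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(c));
        }
        cv_.notify_one();  // wake one sleeping worker
    }
    Completion wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });  // sleep until work arrives
        Completion c = std::move(q_.front());
        q_.pop();
        return c;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Completion> q_;
};

// Per-connection state, looked up by the user value.
struct Connection {
    std::mutex lock;     // two workers may get events for the same key
    std::string buffer;
};

// Feeds `events` to a pool of workers and returns each connection's buffer.
std::vector<std::string> run_pool(std::size_t connections,
                                  const std::vector<Completion>& events) {
    unsigned cores = std::thread::hardware_concurrency();
    if (cores == 0) cores = 1;          // the call may report 0 if unknown
    const unsigned workers = cores * 2; // the "processors times two" rule

    CompletionQueue port;
    std::vector<Connection> table(connections);

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) {
        pool.emplace_back([&port, &table] {
            for (;;) {
                Completion c = port.wait();       // asleep until an event
                if (c.shutdown) return;
                assert(c.key < table.size());
                Connection& conn = table[c.key];  // key picks the buffer
                std::lock_guard<std::mutex> guard(conn.lock);
                conn.buffer += c.payload;
            }
        });
    }

    for (const Completion& e : events) port.post(e);
    // One shutdown message per worker; each worker consumes exactly one.
    for (unsigned i = 0; i < workers; ++i) port.post({0, "", true});
    for (std::thread& t : pool) t.join();

    std::vector<std::string> out;
    for (Connection& c : table) out.push_back(c.buffer);
    return out;
}
```

Note that when several workers are awake at once, two events for the same connection can be processed out of order, so anything order-sensitive needs its own sequencing on top of this.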
That's the short brief, but it's a very cool mechanism. I have used it to run thousands of sockets on quad-core and sixteen-core systems.