Is there any chance that faulty hardware (routers, switches, NICs) or bad drivers can cause packet duplication or packet corruption?
Yes, there are many reasons why packets get retransmitted on the network. A faulty switch port can easily cause CRC errors, and so can bad cables, crosstalk, etc.
JorgeM
Industrious Poster
4,159 posts since Dec 2011
Reputation Points: 297
Solved Threads: 564
Skill Endorsements: 119
In the TCP/IP conceptual model, retransmission and error correction happen at various layers, from the NIC up to the application layer. I'm not sure how to advise you on your particular client/server implementation.
The only thing I'm sure of is that packets aren't always going to make it to the destination, and they may not always get there intact either.
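Since packets may not arrive intact, an application that cannot rely on the lower layers alone can add its own integrity check. Here is a minimal sketch, assuming a made-up framing where a 4-byte CRC-32 trailer is appended to each payload; the `frame` and `unframe` names are hypothetical, not part of any protocol mentioned in this thread.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (hypothetical framing)."""
    return payload + struct.pack("!I", zlib.crc32(payload))


def unframe(data: bytes):
    """Return the payload if the trailer matches, otherwise None."""
    if len(data) < 4:
        return None  # too short to even hold the trailer
    payload, trailer = data[:-4], data[-4:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(payload) != expected:
        return None  # corrupted in transit: caller should drop it
    return payload


if __name__ == "__main__":
    good = frame(b"hello")
    bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit
    print(unframe(good))
    print(unframe(bad))
```

A receiver would simply drop anything for which `unframe` returns `None` and let whatever retransmission scheme is in place recover the data.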
JorgeM
From what you say, if you are receiving duplicate packets, I can only assume that you are using UDP or another datagram-based protocol, as opposed to a stream-oriented one such as TCP. TCP will discard duplicates - your application should never see them. UDP will not discard them, so it is up to you to detect and discard them.
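One common way to detect duplicates over UDP is to have the sender prefix each datagram with a sequence number and have the receiver remember which numbers it has already seen. The following is only a sketch under that assumption: the 4-byte header layout, the `DuplicateFilter` class, and the `history` size are all invented for illustration, and sequence-number wraparound is ignored.

```python
import struct
from collections import deque


class DuplicateFilter:
    """Drops datagrams whose sequence number was seen recently.

    Assumes each datagram starts with a 4-byte big-endian sequence
    number (a hypothetical application-level header).
    """

    def __init__(self, history: int = 1024):
        self._seen = set()
        self._order = deque()   # oldest sequence number first
        self._history = history

    def accept(self, datagram: bytes):
        """Return the payload for a new datagram, or None for a duplicate."""
        if len(datagram) < 4:
            return None  # malformed: no room for the header
        (seq,) = struct.unpack("!I", datagram[:4])
        if seq in self._seen:
            return None  # duplicate: discard
        self._seen.add(seq)
        self._order.append(seq)
        if len(self._order) > self._history:
            # Forget the oldest entry so memory use stays bounded.
            self._seen.discard(self._order.popleft())
        return datagram[4:]
```

In a real receiver, each buffer returned by `socket.recvfrom()` would be passed through `accept()` before being handed to the rest of the application.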
rubberman
Posting Maven
2,676 posts since Mar 2010
Reputation Points: 378
Solved Threads: 316
Skill Endorsements: 53
There is no guarantee that the ACK gets back to the sender, as in the lost-ACK scenario. However, each segment carries a sequence number and a checksum, so the receiving network stack keeps track of what has arrived, including any gaps between the segments it got. If it gets a duplicate, it discards the new copy, yet ACKs it as well so that more don't get sent (hopefully). TCP itself has no NAK: a segment that arrives with a bad checksum is silently dropped, and the sender retransmits it after a timeout or a run of duplicate ACKs. If the connection is really flaky, a duplicate may arrive after the original has already been delivered to the receiving application; however, the stack will STILL know that the current window has moved past the duplicate, and it will ACK and discard it.
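The receiver behaviour described above can be sketched in a few lines. This is a deliberately simplified model, not real TCP: it numbers whole segments rather than bytes, has no window limit or checksum handling, and the `Receiver` class and its method names are made up for illustration.

```python
class Receiver:
    """Toy model of a receiver that sends cumulative ACKs."""

    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet delivered
        self.buffered = {}       # out-of-order segments waiting for a gap to fill
        self.delivered = []      # what the application has been given, in order

    def on_segment(self, seq: int, data: bytes) -> int:
        """Process one segment and return the ACK (next expected number)."""
        if seq < self.next_expected or seq in self.buffered:
            # Duplicate, possibly of data already handed to the application:
            # discard it, but still ACK so the sender stops resending.
            return self.next_expected
        self.buffered[seq] = data
        # Deliver every segment that is now contiguous with what came before.
        while self.next_expected in self.buffered:
            self.delivered.append(self.buffered.pop(self.next_expected))
            self.next_expected += 1
        return self.next_expected
```

Note that the duplicate branch returns the same ACK whether or not the original copy has already been delivered, which is the point being made: the application never sees the second copy.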
This is a problem at the network stack level, not at the application level. I do network programming in high-volume, high-speed environments (multiple thousands of interacting systems worldwide, with 10 gigabit network connections internally and multi-gigabit connections to the internet). This is NOT a problem! :-) It is a good school problem, however, and one that will impact hardware and firmware designers. My suggestion is that you develop a rigorous finite-state-machine representation of how to deal with this situation. I always find that it clarifies such edge cases very nicely.
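As a small illustration of the finite-state-machine approach, here is a sketch of the receiver side of the textbook alternating-bit protocol written as an explicit transition table. It is a much simpler protocol than TCP, chosen only because it fits in a few lines; the state, event, and action names are assumptions made for this example.

```python
from enum import Enum


class State(Enum):
    EXPECT_0 = 0
    EXPECT_1 = 1


EVENTS = ("seg0", "seg1", "corrupt")

# (state, event) -> (next state, action). On a duplicate or a corrupt
# segment the receiver re-ACKs the last segment it accepted.
TRANSITIONS = {
    (State.EXPECT_0, "seg0"): (State.EXPECT_1, "deliver, ack 0"),
    (State.EXPECT_0, "seg1"): (State.EXPECT_0, "discard duplicate, ack 1"),
    (State.EXPECT_0, "corrupt"): (State.EXPECT_0, "discard, ack 1"),
    (State.EXPECT_1, "seg1"): (State.EXPECT_0, "deliver, ack 1"),
    (State.EXPECT_1, "seg0"): (State.EXPECT_1, "discard duplicate, ack 0"),
    (State.EXPECT_1, "corrupt"): (State.EXPECT_1, "discard, ack 0"),
}


def run(events, state=State.EXPECT_0):
    """Feed a sequence of events through the table; return final state and actions."""
    actions = []
    for event in events:
        state, action = TRANSITIONS[(state, event)]
        actions.append(action)
    return state, actions


# Every state must say what to do for every event, otherwise an edge
# case has been missed.
missing = [(s, e) for s in State for e in EVENTS if (s, e) not in TRANSITIONS]
```

Writing the table out forces a decision for every state/event pair, and the `missing` list makes any forgotten combination visible immediately.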
FWIW, I implemented a full TCP/IP stack for real-time embedded systems in the past (around 1990), so this is an issue I am intimately familiar with. That software runs a lot of US Navy gear - network glitches are not well received when you are running a warship far out at sea and have to deal with incoming threats! :-)
BTW, when I say I implemented the full stack, it was not from existing source code, but from the RFCs in the DDN (Defense Data Network) Protocol Handbook (the famous set of White Books) from the US DARPA (Defense Advanced Research Projects Agency) that define TCP/IP (thanks Vint Cerf and friends at BBN!). I had to model all interactions using state machines in order to prove that the code would work as designed.
rubberman
Question Answered as of 3 Months Ago by rubberman and JorgeM