Has the network already been set up and configured? For example, have you verified network connectivity? If so, whatever you are trying to accomplish regarding listening to or recording "voice" (maybe you mean audio/video files) will be done at the application level. For example, if you want to stream an audio or video file from a central source on your network to one of the endpoints, you'll need software running on the endpoints that can connect back to the central source over the network. Usually, it's a streaming app at the endpoint that connects back to the central resource via SMB (a file share).
If this is not what you are asking about, sorry about that; let me know.
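As a rough way to do the connectivity check mentioned above, here is a minimal Python sketch that tests whether an endpoint can open a TCP connection to the central source. The function name `can_reach` and the host name in the usage comment are placeholders I made up; port 445 is the standard TCP port for SMB, but your central source may be listening on something else.

```python
import socket


def can_reach(host: str, port: int = 445, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only shows that the port accepts connections; it does not prove
    that the share is readable or that you have credentials for it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical host name, replace with your own):
#   can_reach("fileserver.example")
```

If this returns False from an endpoint, the problem is still at the network level (routing, firewall, or the service is not running) rather than in the streaming application.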
OK, understood. In that case you'll need software that can connect to the remote devices. A quick Google search returns plenty of results on this topic. The first result I came across is software that can connect to a remote PC and control its webcam and audio input. If the webcams are not connected to PCs but are IP-based cameras, you just need management software that can connect directly to the camera's IP address, since many IP-based cameras run a small web service of their own.
Realize that there are rules/laws governing whether this is legal and to what extent you are allowed to listen to traffic on the network. You should be well aware of these rules/laws before you decide to continue down this path.
Instead of looking for an active way to interact with the devices themselves, why not just listen to the network traffic passively at a common node in the network (a gateway device or router, for example)?
From that vantage point you can record all traffic on the network and slice it up later into the streams you are interested in. From the sliced streams you can recover whatever original data was sent over the wire.
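To give an idea of what "slicing" the recording into streams could look like, here is a small Python sketch that groups already-captured Ethernet frames by protocol, source, and destination. It is only an illustration under some assumptions: the frames are raw untagged Ethernet carrying IPv4 with TCP or UDP, fragmentation is ignored, and the names `flow_key` and `slice_by_flow` are made up for this example. In practice a capture tool such as Wireshark or tcpdump does this (and stream reassembly) for you.

```python
import struct
from collections import defaultdict


def flow_key(frame: bytes):
    """Return (proto, src_ip, src_port, dst_ip, dst_port) for an IPv4
    TCP/UDP Ethernet frame, or None for anything else."""
    if len(frame) < 14 + 20:
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:  # not IPv4 (e.g. ARP, IPv6, VLAN-tagged)
        return None
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
    proto = ip[9]
    if ihl < 20 or proto not in (6, 17) or len(ip) < ihl + 4:
        return None
    src = ".".join(str(b) for b in ip[12:16])
    dst = ".".join(str(b) for b in ip[16:20])
    # TCP and UDP both start with source port, then destination port.
    sport, dport = struct.unpack("!HH", ip[ihl:ihl + 4])
    return ("tcp" if proto == 6 else "udp", src, sport, dst, dport)


def slice_by_flow(frames):
    """Group frames by flow key; frames that are not IPv4 TCP/UDP are dropped."""
    flows = defaultdict(list)
    for frame in frames:
        key = flow_key(frame)
        if key is not None:
            flows[key].append(frame)
    return dict(flows)
```

Note that the key is directional, so the two halves of one conversation end up under two keys; to treat them as a single stream you would normalize the key (for example, by sorting the two endpoints).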