Using a class A network with its default mask (255.0.0.0) would be crazy, and also inefficient because of the size of the broadcast domain. A single network would have room for over 16 million devices, which would be a waste of addresses. So when using a class A address like 10.0.0.1, it's logical to use a 16-bit (255.255.0.0) or 24-bit (255.255.255.0) mask instead. That makes the network more efficient because it reduces the size of the broadcast domain. It also stops you running out of IP addresses, since the range gets split into many smaller subnets instead of being used up by one huge network.
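To put numbers on this, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `usable_hosts` is just an illustration (it is not part of any library), and it assumes the usual convention that the network and broadcast addresses are not assignable to hosts.

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Return the number of assignable host addresses in a network.

    Assumes the network and broadcast addresses are reserved,
    so two addresses are subtracted from the total.
    """
    # strict=False lets us pass a host address such as 10.0.0.1/16
    net = ipaddress.ip_network(cidr, strict=False)
    return net.num_addresses - 2

# Compare the default class A mask with the 16-bit and 24-bit masks.
for prefix in (8, 16, 24):
    net = ipaddress.ip_network(f"10.0.0.1/{prefix}", strict=False)
    print(f"{net.netmask}  ->  {usable_hosts(str(net))} usable hosts")
```

With the default mask that works out to 16,777,214 hosts in one broadcast domain, against 65,534 with a 16-bit mask and 254 with a 24-bit mask.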