Hello I have a homework question and although I'm not looking for someone to answer this for me, a guide would be appreciated from senior programmers here.

A disc can transfer only one 512 byte block with each access. Student records are 80 bytes longs, are to be organized into hash buckets. How many records should each bucket contain?


You want fast responses and a small filesize.

You can't get fast responses if you can't fit all your records into one hash bucket -- you'll have to read a second one, and so sometimes you'll double your latency. Low latency is basically the most improtant goal.

To make the probability of overflowing a single bucket small, you might try to average one record per bucket. But if then you get an extra-large filesize.

That also actually hurts performance, because probably the buckets are going to be cached in RAM (i mean you'd be crazy not to do so) which means filesize is quite important because you don't want to overflow RAM. (However, you could also cache in a way that collapses small buckets together, which involves some extra in-memory copying when loading and storing buckets from disk, since, when reading/writing from disk, things need to be aligned on 512-byte boundaries. And since your records are exactly 80 bytes long (which is just a ridiculous fake contrived scenaro) doing this is actually easy.