Hello Ben,
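I do not remember the computer science specifics from class, but if you can sort your data in memory, then a binary search tree (what I am calling a B-tree here) should give you the fastest search: a balanced one needs only about log2(N) comparisons to find one item out of N. You will have to design your code so that the tree stays evenly balanced, though. Of course, if the data is random and you cannot sort it, then good luck to you there. But if you can sort it, I would B-tree it.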
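To make the balanced-tree idea concrete, here is a rough sketch in C, not a finished implementation. It assumes the keys are plain ints already sorted in an array, and the names (struct node, build_balanced, tree_search, tree_height, free_tree) are made up for illustration. Picking the middle element as the root at every level is one simple way to get an evenly balanced tree out of sorted data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the int key stands in for the real data. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Build a balanced tree from sorted[lo..hi) by using the middle element
 * as the root and recursing on each half. Returns NULL for an empty
 * range or if malloc fails (a real program should handle that better). */
struct node *build_balanced(const int *sorted, size_t lo, size_t hi)
{
    if (lo >= hi)
        return NULL;
    assert(sorted != NULL);

    size_t mid = lo + (hi - lo) / 2;
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;

    n->key = sorted[mid];
    n->left = build_balanced(sorted, lo, mid);
    n->right = build_balanced(sorted, mid + 1, hi);
    return n;
}

/* Walk down from the root, going left or right at each node.
 * Returns the matching node, or NULL if the key is not present. */
const struct node *tree_search(const struct node *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}

/* Number of nodes on the longest path from the root down to a leaf. */
size_t tree_height(const struct node *n)
{
    if (n == NULL)
        return 0;
    size_t l = tree_height(n->left);
    size_t r = tree_height(n->right);
    return 1 + (l > r ? l : r);
}

/* Release every node in the tree. */
void free_tree(struct node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

With seven sorted keys, build_balanced(data, 0, 7) puts the fourth key at the root and the tree is three levels deep, so any lookup touches at most three nodes.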
Your worst case is going to be a linked-list situation, where the item you want is stored in the last node and you have to walk through every node to reach it. Then again, if you build a lopsided B-tree, you are left with a linked list anyway.
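The lopsided case is easy to reproduce. The sketch below is only an illustration, again assuming int keys and using made-up names (naive_insert, tree_height, free_tree). It inserts keys with no balancing at all, so feeding it keys that are already in order makes every new node hang off the right side of the previous one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the int key stands in for the real data. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Plain insert with no rebalancing. Smaller keys go left, everything
 * else goes right. Returns 0 on success, -1 if malloc fails. */
int naive_insert(struct node **root, int key)
{
    assert(root != NULL);

    /* Follow the links down until we reach an empty slot. */
    struct node **link = root;
    while (*link != NULL)
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->left = NULL;
    n->right = NULL;
    *link = n;
    return 0;
}

/* Number of nodes on the longest path from the root down to a leaf.
 * Recursive, so only suitable for small demo trees like this one. */
size_t tree_height(const struct node *n)
{
    if (n == NULL)
        return 0;
    size_t l = tree_height(n->left);
    size_t r = tree_height(n->right);
    return 1 + (l > r ? l : r);
}

/* Release every node in the tree. */
void free_tree(struct node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Inserting 1 through 7 in order gives a tree seven levels deep, which is a linked list in all but name. Inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives a tree only three levels deep.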
If possible, you could also build a repetition field into your data bucket. Then, while you are loading your data structure, whenever you hit an exact repeat of data you already have, you increment a repeat counter instead of storing it again. That way you know right away that your data set had X instances of the same information.
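A repetition field might look something like the following. This is a sketch under assumptions: the data is reduced to an int key, the counter holds the total number of times the key was seen (so it starts at 1), and struct bucket, bucket_add, bucket_count and bucket_free are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical data bucket with a repetition field. */
struct bucket {
    int key;
    unsigned count;          /* how many times this exact key was loaded */
    struct bucket *left;
    struct bucket *right;
};

/* Load one key. If an identical key is already in the tree, bump its
 * counter instead of adding a node. Returns the bucket holding the key,
 * or NULL if malloc fails. No rebalancing is done here. */
struct bucket *bucket_add(struct bucket **root, int key)
{
    assert(root != NULL);

    struct bucket **link = root;
    while (*link != NULL) {
        if (key == (*link)->key) {
            (*link)->count++;        /* exact repeat: just count it */
            return *link;
        }
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }

    struct bucket *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->key = key;
    b->count = 1;                    /* first time we have seen this key */
    b->left = NULL;
    b->right = NULL;
    *link = b;
    return b;
}

/* How many times was this key loaded? Returns 0 if it never appeared. */
unsigned bucket_count(const struct bucket *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root != NULL ? root->count : 0;
}

/* Release every bucket in the tree. */
void bucket_free(struct bucket *b)
{
    if (b == NULL)
        return;
    bucket_free(b->left);
    bucket_free(b->right);
    free(b);
}
```

Loading the keys 5, 3, 5, 8, 5, 3 creates only three buckets, and bucket_count reports three instances of 5, two of 3 and one of 8. A side benefit is that duplicates no longer make the tree any deeper.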
Good luck coding it. And if your compiler supports profiling (generating run-time statistics), you might find out some interesting things about where the time actually goes.
Christian