Can Arrays outperform Hashsets for small numbers in JVM? No.

2 min readNov 2, 2021

Everyone knows that hashmaps and hashsets are *the* go-to data objects for storing and retrieving objects when you have large amounts of data.

But what about when you have small amount of data? When does the performance cost of that extra hash function outweigh the benefit of iterating? Imagine if you had only one object — checking only that object may be cheaper than running a hash on the object value, right?

Thus my initial question, “when, if at all, will Array.contains() outperform Hashmap.containsKey()”. No one seemed to answer this on the interwebs, so I set out to test it myself.

Results

Basically, never. :(

Surprisingly for me, the time taken for hashmaps/hashsets for string and int retrievals was roughly equal, which is odd — doesn’t the hash function of string take much longer than of int?

Code can be found here, I think the methodology (create data structure with n objects, choose 10% of them for retrieval, generic data structure / class functions) was the real learning process here.

Publication bias is the real guilty party for me not being able to find this information myself, and so if this saves you half an hour of messing around with data structure comparisons, that’s a good enough result for me.

Can Arrays outperform Hashsets for small numbers in JVM? No.

Results

Written by Yair Morgenstern

Responses (1)