The Benchmark documentation says:
CAVEATS
Comparing eval'd strings with code references will give you inaccurate results: a code reference will show a slightly slower execution time than the equivalent eval'd string.
The real time timing is done using time(2) and the granularity is therefore only one second.
Short tests may produce negative figures because perl can appear to take longer to execute the empty loop than a short test; try:
    timethis(100,'1');
The system time of the null loop might be slightly more than the system time of the loop with the actual code and therefore the difference might end up being < 0.
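The caveats quoted above can be observed with a small script. What follows is only a sketch, not part of the Benchmark documentation: the iteration count and the `join` workload are arbitrary choices made for illustration, and the numbers printed will differ between machines and Perl versions. It uses `timeit`, which returns a Benchmark object with the null-loop time already subtracted, and `timestr` to format it. The `:hireswallclock` import tag asks Benchmark to take wallclock times from Time::HiRes when that module is available, which works around the one-second granularity of time(2).

```perl
use strict;
use warnings;
# :hireswallclock switches the wallclock measurement to Time::HiRes
# when it is installed; CPU times are not affected by it.
use Benchmark qw(:hireswallclock timeit timestr);

# Arbitrary iteration count, chosen only for illustration.
my $count = 100_000;

# The same workload, once as an eval'd string and once as a code reference.
my $as_string  = timeit($count, 'my $x = join ",", 1..20;');
my $as_coderef = timeit($count, sub { my $x = join ",", 1..20; });

print "string:  ", timestr($as_string),  "\n";
print "coderef: ", timestr($as_coderef), "\n";

# A test that is too short: the null loop can appear to take as long as
# (or longer than) the code itself, so the figures may be zero or negative.
my $too_short = timeit(100, '1');
print "short:   ", timestr($too_short), "\n";
```

Because of the first caveat, the two results are best read as a rough indication only; when comparing alternatives, pass all of them in the same form (all strings or all code references).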