Borrowed idea from Go's usage of this
optimization for ReadFrom() on client
side, we should re-use the 32k buffers
io.Copy() allocates for generic copy
from a reader to writer.
the performance increase for reads for
really tiny objects is at this range
after this change.
> * Fastest: +7.89% (+1.3 MiB/s) throughput, +7.89% (+1308.1) obj/s