Thursday 21st November 2013: 6.01am. Location: Waterloo, Ontario.
My 'coding sprint' of two days is over, as I have a complex systems conference for the next two days and this weekend will be busy. My sprint was to further evolve the design of my async hash engine, which needs to spot when hash op rounds can be coalesced into SIMD 4-Hash rounds, or whether to throw (hyper)threads at the work queue instead. It must do this very quickly and on a data structure simultaneously in use by other threads - the atomic ordering and locking semantics are proving quite tricky - while also handling arbitrary digest lengths and hash termination. An earlier design simply ran a wall clock and synchronised all workers after each hash round, which hugely eases the scheduling, but it couldn't be expected to scale well once the memory prefetchers (one per SIMD stream) start tripping over one another as you max out main memory. This much looser design ought to scale much better, if I can get it free of race conditions. Hopefully I'll grab a few more hours before Thanksgiving.
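
As a rough illustration of the coalescing decision described above, here is a minimal C++ sketch of a lock-free claim step: a worker grabs four pending hash ops at once when enough are queued (a SIMD 4-Hash round), and otherwise takes a single op for scalar processing. Every name here (`WorkQueue`, `Claim`, `Dispatch`, `claim`) is hypothetical and not taken from the actual engine, and the sketch deliberately ignores arbitrary digest lengths, hash termination and the hyperthread hand-off.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; this is not the engine's real API.
enum class Dispatch { None, Scalar, Simd4 };

struct Claim {
    Dispatch kind;      // how the claimed ops should be run
    std::size_t first;  // index of the first claimed op
    std::size_t count;  // number of ops claimed: 0, 1 or 4
};

// A shared cursor over a fixed array of pending hash ops. Workers claim
// ops by advancing `next` with a compare-exchange loop, so no lock is held.
struct WorkQueue {
    std::atomic<std::size_t> next{0};
    std::size_t total;

    explicit WorkQueue(std::size_t n) : total(n) {}

    Claim claim() {
        std::size_t cur = next.load(std::memory_order_relaxed);
        for (;;) {
            assert(cur <= total);  // the cursor never overshoots the queue
            std::size_t remaining = total - cur;
            if (remaining == 0) return {Dispatch::None, cur, 0};

            // Coalesce into a 4-wide round only when four ops are available.
            std::size_t want = remaining >= 4 ? 4 : 1;

            if (next.compare_exchange_weak(cur, cur + want,
                                           std::memory_order_acq_rel,
                                           std::memory_order_relaxed)) {
                return {want == 4 ? Dispatch::Simd4 : Dispatch::Scalar,
                        cur, want};
            }
            // On failure `cur` holds the freshly observed value; retry.
        }
    }
};
```

With six ops queued, successive calls to `claim()` from one thread hand out one four-op SIMD round followed by two scalar ops. The single atomic cursor is the simplest possible arrangement; a real scheduler under contention would likely need something less centralised.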