OpenSSL doesn't have any interface for doing just the low level raw block hash part of SHA256. SHA256 begins by wrapping your data in a specially formatted buffer. Setting up the buffer takes an order of magnitude longer than the actual hashing if you're only hashing one or two blocks like we do. It's intended that the time is amortised if you were hashing many KB or MB of data. In BitcoinMiner, we format the buffer once and keep reusing it.
If you can find SHA256 code that's faster (with MinGW/GCC) than what we've got, that would be really great! (although, keep licensing in mind) The one we have is the only one I tried, so there's significant chance for improvement.
When I wrote it more than 2 years ago, there were screaming hot SHA1 implementations but minimal attention to SHA256. That's a lot of time for them to come up with better stuff. SHA256 was a lot slower than the fastest SHA1 at the time than I thought it should be. Obviously SHA256 should be slower than SHA1 by a certain amount, but not by as much as I saw.
(hope you don't mind I renamed your thread, SHA-256 optimisation is something important that I keep forgetting about)