BitcoinTalk
Bitcoin x64 for Windows

So, after a lot of experimentation, pulling out of hair, cursing of the developer, I finally managed to get a build of Bitcoin compiled under MSVC.

all optimizations are on including SSE2, LTCG and favouring of Intel64 (well hey, that's my processor)

Performance difference? The two builds I made (namely 32 and 64 bit) are practically equal in terms of performance - however, their performance is not equal to the stock build currently available.

On my quad-core with the stock Windows binary available from bitcoin.org I get about 1700k hashes a second. With the builds I produced under MSVC I get 2500K a second. Anyone interested?
I am... would it be even faster on SSE3?
sure, I'd like to see if it removes my 20% difference from the Linux binaries.
I've got a spare server with an 8-core 64-bit Intel running 64-bit Windows 2003 Server. The machine currently does about 5000 khash/s on the stock build; I'd be curious whether the native 64-bit build would do better.
I am... would it be even faster on SSE3?

potentially yes, I compiled it with the Microsoft compiler, I'm in the process of getting the Intel compiler so that could well offer a significant performance boost given that it supports far more instruction sets than MS's does.

Anyhow, I've uploaded the 2 binaries and the required DLLs to make them work and you can get them from right here. After a bit of testing I found that the x64 binary does outperform the 32 bit by 100-200k or so but it does depend a lot on what other crap you have running on your system.

And if anyone feels generous... 18em7jEuKe1W74ChAZMFShUuqmwudWmpgu
see my previous post.

Feedback on performance is appreciated, bitcoins even more so.
I keep getting a "MSVCR100.dll" not found error when trying to start it. I copied the file over from another system into the system32 folder, but the program still can't find it for some reason. Do we need the full Microsoft dev environment installed to test it?
How do we know this is not a scam?  Give us your IP to donate with or something so we can make sure you are reputable.  I am scared to install it, it might steal my bitcoins.
I'll guinea pig, I'm running it on a test system that has no balance yet, well trying to run, it won't start.
It's not a scam, I decided to install it and see. Unless it's a very complex scam which waits for a while before stealing my coins Wink

This version has between 200-700 more khash per second on my machine.
No missing DLL errors when you try to start it?
correct, it's not a scam, the code is completely vanilla, I didn't obfuscate it so you can feel free to examine it in the disassembler of your choice.

so, with my build you've gone from 5000k to 5700k?
Get visual C 2010 redist from Microsoft -- pointers here:

http://www.mydigitallife.info/2010/04/17/visual-c-2010-runtime-redistributable-package-x86-x64-ia64-free-download/

The new code is full of win.. hash rate (4 core) went from 1350 to 1880 khash/s.  Sweet!
For those that get the DLL error and don't want to install the full dev environment, you need to download the Visual C++ 2010 Redistributable x64 located
http://www.microsoft.com/downloads/details.aspx?FamilyID=BD512D9E-43C8-4655-81BF-9350143D5867&displaylang=en
I recommend getting the redist from the official source:

x86 here: http://www.microsoft.com/downloads/details.aspx?FamilyID=a7b7a05e-6de6-4d3a-a423-37bf0912db84

x64 here: http://www.microsoft.com/downloads/details.aspx?FamilyID=bd512d9e-43c8-4655-81bf-9350143d5867

If there's demand for it, I can compile an Itanic Itanium build... for a reasonable consideration Smiley
How do we know this is not a scam?  Give us your IP to donate with or something so we can make sure you are reputable.  I am scared to install it, it might steal my bitcoins.

If I were scamming, what incentive would I have to ask for donations knowing that those who used it would have their account emptied anyway. Frankly though, my username is far more of a verification of my authenticity than my IP would ever be Smiley

I can assure you the code is 100% clean.
Get visual C 2010 redist from Microsoft -- pointers here:

http://www.mydigitallife.info/2010/04/17/visual-c-2010-runtime-redistributable-package-x86-x64-ia64-free-download/

The new code is full of win.. hash rate (4 core) went from 1350 to 1880 khash/s.  Sweet!


that's very interesting, what processor/frequency are you on? x86 or x64?
Win7 64-bit
Intel Xeon 5130 (2.0GHz dual-proc)

I'll send you the first 50.0 it finds. Smiley
Nice job, this is just my initial testing, but the 64 bit compile speeds up hashing by over 28% so far over the 32 bit counterpart.

So for example, my 8-core system normally does about 4800 khash/s (600 khash/s per core), but now averages 5700 khash/s.

I'll certainly be sending some bitcoins your way  Wink
Testing with the 32bit client: sad to say, your 32bit build crashes, on a stock system (Windows XP) anyway. Going to test later on Vista, 7, etc. and report back.
do you have the Visual C++ 2010 runtime installed? if not, that's why.
Does that apply for the x86 one also, since that's what I was using to test the 32bit build he had?

[edit] I just checked, it was good for the 32bit  Roll Eyes
no, that's definitely correct.

there is another possibility however... what processor is in your XP machine? if it doesn't support SSE2, it'll crash.
Yes, straight from the download link on the previous page for x86.

I looked through the error log a little, didn't spot anything that stood out.
Tested it on a Celeron machine (1.1 GHz)

Good point, I'll see what the others do.
How do we know this is not a scam?  Give us your IP to donate with or something so we can make sure you are reputable.  I am scared to install it, it might steal my bitcoins.

If I were scamming, what incentive would I have to ask for donations knowing that those who used it would have their account emptied anyway. Frankly though, my username is far more of a verification of my authenticity than my IP would ever be Smiley

I can assure you the code is 100% clean.


Sorry to be so skeptical, but there was a victim on this site when someone claimed to have compiled a CUDA client that used the graphics processor to hash.  One guy fell for it and lost some bitcoins.

But yes, if I get a chance I will look at the dlls I suppose before running it.

Or, maybe you can release the project files and we can compile it ourselves?
Point well taken, but the CUDA never materialized, this one actually does what he said it does. So if it is a scam, it sure is taking a lot of effort.  Smiley
I think he should really present this to the devs to get credit for it and some BC, because I'm going to send some to him for raising my khash/s up and beyond the already insane levels that I have.  Grin
For 32bit clients, this makes a huge speed increase (if your PC supports it)

I was using a little netbook to test with, it could manage about 185 khash/s but his compile does 238 khash/s so over a 28% increase (what I saw in the 64 bit clients), so another thumbs up for this build. (Exe size is smaller too  Wink )
Here's a weird question.. has anyone actually generated a block with this faster version?

I have a few machines that used to regularly generate, and since switching to this version -- zip.  I know these things are subject to random variation, and it could be a dry spell, and the difficulty is going up, but... could it be a bug?
Don't worry, getting BTC now requires supercomputers, clusters and/or botnets  Wink

Counterexamples, with Khash/s noted, welcome  Smiley
Ichi: I got 2 blocks in a row back when I was running 1200khash.  I'm up to 2500 now, as per the update, so we'll see if I get some more blocks. =)
I did the same thing with VC++ 2008: http://bitcointalk.org/index.php?topic=453.0
As I wrote there, the results were slower than stock.

Haven't yet tried x64 or 2010 because rebuilding the dependencies/setting up is a hassle. Your builds give improvement over stock on my system though.

Curious.

It should be the sha.cpp module that matters. Did you modify the source in any way, or specify any special defines?
Hey, I just made this improvement that will vastly improve your khash/s!

In main.cpp, ~2751, change:
   string strStatus = strprintf("    %.0f khash/s", dHashesPerSec/1000.0);
to:
   string strStatus = strprintf("    %.0f khash/s", dHashesPerSec/300.0);

Then on ~2758, change:
  printf("hashmeter %3d CPUs %6.0f khash/s ", vnThreadsRunning[3], dHashesPerSec/1000.0);
to:
  printf("hashmeter %3d CPUs %6.0f khash/s ", vnThreadsRunning[3], dHashesPerSec/300.0);


(DISCLAIMER: THIS IS A JOKE.)
Don't worry, getting BTC now requires supercomputers, clusters and/or botnets  Wink

Counterexamples, with Khash/s noted, welcome  Smiley

Regular old BTC client running on ubuntu (2150-2500 khash/sec) managed to produce a block last night Wink

My 300 khash laptop got lucky and produced one the first day I used Bitcoin.
That's good news.  Thanks  Smiley
I grabbed one on a 480-500 khash/sec machine last night too, so it's still doable. Also, that same machine is in the 530-550 range with this new build (32 bit). Sending some bit-love to Olipro right now. The effort's gotta be worth something. Nice work.
So --- Bitquux, that was Olipro's binary that found the coin?
Negative, that was before the switch. And at this rate, it will be a long time before I can tell if it will. Even if it's a dirty bin it's clever enough to earn a donation.
It shouldn't be dirty, the SHA256 is untouched and I don't think the compiler has introduced any errors.
Yes, let me clarify. This is not a dirty bin. There is no reason so far to believe it's anything other than what Olipro says it is. The performance of my previously mentioned machine has even gone up since I first posted those numbers.
Here's a weird question.. has anyone actually generated a block with this faster version?

I have a few machines that used to regularly generate, and since switching to this version -- zip.  I know these things are subject to random variation, and it could be a dry spell, and the difficulty is going up, but... could it be a bug?

Yes, about 10 so far. The other "regular" clients acknowledge test Coin transfer between the two, so far it appears good. I also have a packet sniffer running on my test machine and it's basically doing what the other clients do, nothing out of the ordinary so far.
This worked perfectly! On linux and windows if you compile yourself. I went from 380khash/s on one core to over 1200khash/s!

yeah, because dividing a number by 300 instead of 1000 makes it look a lot bigger, doesn't *actually* improve performance.
OK, I've made a new build of Bitcoin, this one is compiled using the Intel compiler which is considerably more advanced than the standard MS compiler.

Please note this build is 64-bit only since I see no reason to compile a build for 32-bit when most 32-bit processors lack the newer SSE instructions anyway.

Performance? my MSVC build averages about 2400K for me, this averages 2900K so you're looking at an improvement of about 125k per core or thereabouts although this does come at the cost of a larger EXE, improved performance is worth it in my opinion Smiley

My next goal is to see if this PolarSSL SHA256 algo really is faster. for now however, download Bitcoin x64 ICC optimised build here
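
The per-core figure above is simple arithmetic on the quoted averages. A minimal sketch, assuming the 2400K and 2900K numbers are totals in khash/s on the quad-core mentioned in the first post; the function names are made up for illustration and are not part of the Bitcoin source:

```cpp
#include <cassert>

// Average gain per core when total throughput rises from `before` to `after`
// (both in khash/s) on a machine with `cores` cores.
inline double PerCoreGain(double before, double after, int cores) {
    return (after - before) / cores;
}

// Relative speedup of `after` over `before`, in percent.
inline double SpeedupPercent(double before, double after) {
    return (after - before) / before * 100.0;
}

// Sanity check against the numbers quoted in the post:
// (2900 - 2400) / 4 cores = 125 khash/s per core.
inline void CheckGain() {
    assert(PerCoreGain(2400.0, 2900.0, 4) == 125.0);
}
```

The same two helpers can be pointed at any before/after pair reported in this thread to compare results on an equal footing.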
Runs about the same speed as the VC++ version here (Xeon, Win7 64-bit).
Awesome, gives a 41% speed boost over the stock binary.

My experience so far.

Stock = Stock
Last x64 Build = 28% speed increase over stock
This x64 Build = 41% speed increase over stock

Nice stuff!
Runs about the same speed as the VC++ version here (Xeon, Win7 64-bit).

which Xeon, there's quite a few.

I don't think I need to point this out but here goes anyway: the significance of the performance benefit from using any of my builds will depend entirely on how many cores your computer has.

Also, in case anyone missed it: the PolarSSL algo is *not* faster.

next step... CUDA.
What would I expect for two Xeon 5570?
a 4 core processor with hyperthreading... and two of them? a pretty significant performance increase.
OK, I've made a new EXE, this one seems to get me an extra 100-200k (or about 25-50k per core).

Difference? I modified the ByteSwap function to operate on 64-bit integers; it does this by using the bswap intrinsic on a 64 bit register followed by rotate right through 32 bits to put the result in the correct order. it also initializes the SHA256 vectors using unsigned 64 bit values (however, the actual hashing still uses 32-bit so I doubt this is making much of a difference) and yes, I did convert the 32 bit numbers to 64 bit correctly (i.e. 0x12345678UL 0xabcdef0UL -> 0xabcdef012345678ULL) if that appears wrong to you, think about how little endian machines store 32 bit integers in memory.
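
For anyone who wants to see the trick spelled out, here is a rough sketch of it. It is not the code from the build: it uses a portable byte reversal in place of the bswap intrinsic so it compiles anywhere, and the names `ByteReverse64`, `ByteSwap32` and `ByteSwapPair` are made up for illustration. On a little endian machine, two adjacent 32-bit words read as one 64-bit value put the first word in the low half, which is why 0x12345678 and 0xabcdef0 pack to 0x0abcdef012345678.

```cpp
#include <cassert>
#include <cstdint>

// Portable reversal of all eight bytes (stands in for a 64-bit bswap intrinsic).
inline uint64_t ByteReverse64(uint64_t x) {
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) | ((x >> 8) & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

// Reference implementation: byte-swap a single 32-bit word.
inline uint32_t ByteSwap32(uint32_t x) {
    return (x << 24) | ((x << 8) & 0x00FF0000U) | ((x >> 8) & 0x0000FF00U) | (x >> 24);
}

// Byte-swap two 32-bit words packed into one 64-bit value:
// reverse all eight bytes, then rotate by 32 bits so each (now swapped)
// word lands back in the half it started in.
inline uint64_t ByteSwapPair(uint64_t packed) {
    const uint64_t reversed = ByteReverse64(packed);
    return (reversed >> 32) | (reversed << 32);
}

// Check the packed result against swapping each word separately.
inline void CheckByteSwapPair() {
    const uint32_t lo = 0x12345678U;
    const uint32_t hi = 0x0abcdef0U;
    const uint64_t packed = (static_cast<uint64_t>(hi) << 32) | lo;
    const uint64_t expected =
        (static_cast<uint64_t>(ByteSwap32(hi)) << 32) | ByteSwap32(lo);
    assert(packed == 0x0abcdef012345678ULL);
    assert(ByteSwapPair(packed) == expected);
}
```

The point of the design is that one reversal plus one rotate handles two words per call instead of one.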

anyway, grab it here
Wow, that tweaked one... I'm getting stunning rates.  I have a quad core Intel laptop.

I have two different number sets, 1 while operating as usual, with other programs running, and one while everything but bitcoin (including explorer.exe) is shut down.
                  Stock       x64 v1      x64 v2      x64 v2 Tweaked
Standard Usage:   500-1200    1000-1800   750-1500    2200-2700
Optimized Usage:  1500-1800   2000-2500   1500-2000   2700-3400

O_O
May I kindly ask you, mighty Olipro, to patch bitcoin 0.3.1 or maybe 0.3.2? There were some issues in 0.3
I'll give this one a run, the last build would crash randomly after a few hours  Wink
May I kindly ask you, mighty Olipro, to patch bitcoin 0.3.1 or maybe 0.3.2? There were some issues in 0.3

my builds are from the latest SVN source code, of course, had you checked the about page, you would notice it clearly states itself as 0.3.2
hmmmmm, that's strange... because I am experiencing the problem already fixed in http://bitcoin.svn.sourceforge.net/viewvc/bitcoin?view=revision&revision=102
then either it got re-broken or he also modified something in wxWidgets itself which I don't have.

on an unrelated note: I have CUDA working for SHA256 hashing... currently the host has to iterate the nonce so performance is terrible, working on moving the whole kit and caboodle into the card and then performance should be pretty damn sexy.
DEFINITELY in to test CUDA!
Oli: One alternative is to start the CUDA part hashing with a nonce of MAXINT and subtract one, and the host client start at one and go upwards.  Then, either one could find a hash and you're not repeating work.
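
A rough sketch of that idea, simulated in a single thread: one searcher walks the nonce up from 1 (the host), the other walks down from the top (the card), and they stop when they meet, so no nonce is tried twice. Everything here is hypothetical: the names are invented, "MAXINT" is assumed to mean the largest 32-bit unsigned value since the block nonce is 32 bits, and `meetsTarget` stands in for "hash the header with this nonce and compare against the target".

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

struct SearchResult {
    bool found;      // true if some nonce satisfied the predicate
    uint32_t nonce;  // the winning nonce (only meaningful when found)
};

// Alternate between an upward walker starting at 1 and a downward walker
// starting at UINT32_MAX. Nonce 0 is skipped because the suggestion starts
// the host at one. In a real miner the two walkers would run concurrently.
inline SearchResult SearchFromBothEnds(
        const std::function<bool(uint32_t)>& meetsTarget) {
    uint32_t lo = 1;
    uint32_t hi = UINT32_MAX;
    while (lo <= hi) {
        if (meetsTarget(lo)) return {true, lo};
        if (lo == hi) break;  // walkers met; this nonce was just tested
        if (meetsTarget(hi)) return {true, hi};
        ++lo;
        --hi;
    }
    return {false, 0};
}
```

With a predicate that accepts only nonce 5, the upward walker finds it on its fifth step; with one that accepts only `UINT32_MAX - 2`, the downward walker finds it on its third.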
Wow, that tweaked one... I'm getting stunning rates.  I have a quad core Intel laptop.

I have ... one while everything but bitcoin (including explorer.exe) is shut down.
                  Stock       x64 v1      x64 v2      x64 v2 Tweaked
<snip>
Optimized Usage:  1500-1800   2000-2500   1500-2000   2700-3400

On a Quad-Core AMD Opteron 2376 server running Ubuntu 10.04 Desktop x64 (wubi) and Bitcoin 0.3.0 x64, I get ~2,200 khash/s.  With Bitcoin 0.3.0 x86, I get ~2,000 khash/s.  What would I expect running Windows Server 2008 x64 and Olipro's x64 v2 Tweaked?

Is it faster than the stock Linux x64 build?  Could the Linux build be similarly tweaked?
Ichi: Probably pretty fast, I saw a huge increase in performance to the tweaked version.  No tellin until you try it though.
I'll give this one a run, the last build would crash randomly after a few hours  Wink

same here,
MSVC build didnt work at all (missing DLL even after vcredist_x64 install),
Intel build seems to work fine for 1-2hours and then crashes.

I get around 1600 with the regular client, Intel build ~2150, latest Intel tweaked ~2220.

Update: this one seems to crash even faster, 30minutes first run, almost 1hour second run
the missing DLL is in my first release (libeay32.dll)
Vista x64 here

From regular version ~1700khash/s to this version ~2300khash/s (last version posted here)

Really nice, but I think it needs to be more stable
given that it's based on stock code and I doubt there's an issue with the compiler, it's more likely there's a bug in the SVN source, I'll make a build from the last stable production version just as soon as I've finished this CUDA code off.
thats not the one causing MSVC build not to work, that'd be MSVCR100.dll.
anyway, the Intel build does work, although it crashes.  Undecided

keep coding, i'm looking forward to test a cuda-version on my gtx260.  Grin
My experience is similar, the first release was rock solid like the original client, the other releases, while still faster, randomly crash after minutes/hours/days  so it's kind of random.

So it's a trade off between stability and speed. The faster it gets, the less stable it seems.  Grin

Of course, I have mine on a batch to restart if there is a crash and to count how many times the program crashed during the day for example. So far up to about 2 crashes a day.
There is a speedup for me, but it is not faster than the ubuntu 64-bit version. I am surprised; with SSE2 I expected it to fly :S
Yeah, my Linux 64bit systems still have the leg up on my windows servers, even with this optimization that is made that speeds them up by 50%, seems Linux still rules the roost for coin generation speed.

Most of my coin generation comes from my Linux servers more than my windows servers, it's about a 1 to 4 ratio, for every 4 blocks made by my Linux servers, 1 will be made by one of the windows servers.
that's not an indicator of performance mind, block generation is pure luck, it's the hashes per second that mean something.
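
To put a number on that: the average time to find a block depends only on hash rate and network difficulty, roughly difficulty × 2^32 hashes per block. A minimal sketch, with an invented function name; the difficulty and hash rate passed in are whatever the reader supplies, nothing here comes from measurements in this thread:

```cpp
#include <cassert>

// Approximate average time to find one block, in seconds.
// Assumes roughly difficulty * 2^32 hashes are needed per block on average;
// any single block can still arrive far sooner or later than this.
inline double ExpectedSecondsPerBlock(double difficulty, double hashesPerSec) {
    const double kHashesPerDifficulty = 4294967296.0;  // 2^32
    return difficulty * kHashesPerDifficulty / hashesPerSec;
}

// Doubling the hash rate halves the expected wait at a fixed difficulty.
inline void CheckScaling() {
    assert(ExpectedSecondsPerBlock(1.0, 2000000.0) * 2.0 ==
           ExpectedSecondsPerBlock(1.0, 1000000.0));
}
```

This is only an average, so a count of blocks found over a few days says very little about which build is faster.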
Yes, sorry, I left that part out, the Linux servers (same hardware) always generate higher khash/s than the windows machines, at least mine do. Your mileage may vary.
On my desktop, I have a dual boot between Ubuntu and Windows 7. The SSE2 build brings Windows 7 speeds up to around the same as the Ubuntu build.
OK.  Could someone please explain the why of that, in terms that a non-programmer can understand?  Why is the stock Windows build slower than the stock Ubuntu build?
Optimizations mainly. When the program is being compiled on Windows, certain optimizations make the program more efficient.

On Linux, we have both 32bit and 64bit builds to take advantage of the 64bit arch of the system. On windows, there was only a 32bit build.
Thank you.  I'm glad that I'm moving to Ubuntu.
I still prefer windows 7 for many things, but I am liking Ubuntu more and more. If only some things weren't such a pain in the ass to use (Tor: Windows 7? 1 minute install. Ubuntu? WTF...)
I really like Windows Server 2008, which is pretty close to Windows 7.  It'll be a while before I give that up.

OTOH, I'm thinking very seriously re going with Ubuntu for my next primary desktop.  As a test, I've recently installed Ubuntu 10.04 Server x64 + GNOME on an old Core 2 Quad machine.  Everything except boot lives on an encrypted RAID5 array managed by LVM.  Losing Excel 2007 might be a problem, though.

Although I don't use Tor, I'm sure that there must be many setup guides for Ubuntu.

This is way off topic.  Sorry.
I absolutely hate Ubuntu, mostly because of the community surrounding it.  I much prefer gentoo, Debian, etc.
mind if i ask?
satoshi has released a couple of updates recently, mostly about security concerns as far as i can see.
how are the 64bit versions keeping up with these developments? Smiley
Here are several builds based on the latest SVN code (0.3.3). The package contains the following:

Visual Studio Builds in x86 and x64 flavour.

Intel Builds in x64 only, one with stock code and the other with the 64bit byte reversal/state init tweak since it seems to squeeze a few more drops of speed out of it.

grab it here
OK, I've made a new build now; this version uses the 64bit SHA256 Assembler code from Crypto++ which means the Byteswap function is now only used for re-ordering the resulting hash - and of course it's using 64-bit ASM to create the hash in the first place.

Performance? I've gone from an average of 2900k hashes on my previous builds up to a pretty stellar 3300k hashes (an improvement of about 100k per core) - I would be very surprised if the Linux builds outperform this.

Grab x64 Asm Optimised Bitcoin here
Can you release the source for these builds, please? I would very much like to look over the changes that you made.
If you want the code, get Crypto++
OK, so given that the SHA256 is now 100% assembler code, I figured I might as well just build it entirely using Visual Studio, so I did just that and performance was exactly the same.

So, for those of you who have found the VS builds to be more stable, click here to get it
after tinkering around a bit and reinstalling the vcredist, i even got the VS build to work now.

on the regular client i get around 1600khash/sec, your latest VS build currently runs at ~2750khash/sec,
+70% that's an outstanding performance!

Yeah, runs every bit as fast as the Intel tweaked ones from what I could tell in testing, plus the program is half the size compiled.  Smiley
OK, now for some absolutely incredible performance.

Credit to tcatm for the caching part of the SHA context - this offers absolutely brilliant performance. Additionally, the Intel compiler really comes into its own here as its parallelisation abilities give a massive performance boost over Visual Studio.

Performance: 4700khash/s on 4 cores, I think that speaks for itself.

I've included both the VS and Intel build, but there's really no comparison, the Intel build craps all over VS.

Grab SHA state caching Bitcoin here
Wow, all I can say is once again the magic of optimizations and asm come through again. I went from 1300khash/s with the stock 0.3.3 to 3200khash/s with this latest build. My machine is running a dual-core Celeron 3300 @ 3.8ghz.
Um.. Wow.  That last one was a bit of a leap.

Intel Core i7-870 (2.93 GHz) running 4950 khash/s here. (4 cores Turbo'd to 3.2 GHz)
(Intel version)

Anyone seeing a speed difference between VS and Intel?
Wow.  Thank you  Smiley

Windows Server 2008 x64 VM in Hyper-V
4 cores
8 connections

stock 0.3.0 build => ~2,250 khash/s
Intel x64 build => ~5,600 khash/s
It was indeed a large jump, not sure how stable it will be in the long run but, we will find out. The stock client ran fine 24 on this machine so I will post back some time tomorrow. I am getting less khash/s with the vc build, around 2600khash/s. The intel build for me is pushing the most khash/s.
yes, as I said, the Intel compiler produces far better code than the VS compiler does.
I'm a total newb. I tried to use Olipro's file and it failed because I was missing MSVCR100.dll. I got it, but now it says "The application was unable to start correctly." Should I just not mess with this? Is it supposed to be all good to go?

Just saw Olipro's new thread. Got it working, so sweet.
If your OS is 64-bit then this has far superior performance. You need this runtime: http://www.microsoft.com/downloads/details.aspx?FamilyID=bd512d9e-43c8-4655-81bf-9350143d5867&displaylang=en
OS is 64-bit. I loaded the runtime. It works now, but is the same khash as the x86.
The more cores your computer has, the more significant the performance boost. You should also use the EXE from the Intel folder; the VS one is inferior but included for completeness.
I'm confused now, I don't see a VS or an Intel folder. When I unzipped vcredist_x64 I got a bunch of numbered folders: 1028, 1031, etc. I ran Setup, and when I rerun it, it just wants to repair Microsoft Visual C++ x64 Redistributable to its original state each time.

I've also just noticed that it's getting incoming blocks like 3 or 4 at a time. It could be they're coming in close together, but I think I'm getting them like simultaneously.

EDIT: Okay, I see it now.
Wow, this is the biggest jump I've ever seen. Nearly 2.5 times the speed of the stock version, amazing. Now let's see how stable it is.

I hit 2700, up from 1250 stock.

I have 7 cores, how do I tell how many it's using? Can I control it?
Task Manager, and yes, the options in Bitcoin allow you to control how many cores it's using.
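
If you would rather check the core count from a script than from Task Manager, here is a minimal Python sketch. It has nothing to do with how the Bitcoin client's own option is implemented; `pick_worker_count` and its parameters are hypothetical names, and `os.cpu_count()` reports logical cores as seen by the OS (which inside a VM is whatever the VM exposes).

```python
import os


def pick_worker_count(requested=None, keep_free=0):
    """Clamp a requested worker count to the logical cores the OS reports.

    requested: how many workers the user asked for (None means "as many as allowed")
    keep_free: cores to leave idle, e.g. 1 to keep the desktop responsive
    """
    total = os.cpu_count() or 1          # cpu_count() may return None
    limit = max(1, total - keep_free)    # always allow at least one worker
    if requested is None:
        return limit
    return max(1, min(requested, limit))


print("logical cores reported:", os.cpu_count())
print("workers when leaving one core free:", pick_worker_count(keep_free=1))
```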
Long-time visitor, first-time poster. How can you possibly have 7 cores? What processor are you running? Cores come in even numbers... I would check your processor is working correctly. How do you even know that you have 7 cores when you don't know how to control them? I'm guessing you're a Windows user.

I had to register to rebel against some of the crap I see being posted here!
They do physically, but if you use a virtual machine, you can set environments to have 3 for example. Windows will work with 3 cores or 4 cores just fine, it doesn't care how many it has.
Every time he posts I want to poke him in the eye, the one conveniently posted to the left of his friendly well-meaning nonsense.
VMware only permits even numbers of processors/cores although I believe you can expose 8 cores and then configure the OS to only see 7.

If he really does have this setup, I'm going to bet that he's opted for more processors/cores than his CPU actually has (yep, you can do this, but it will have a pretty negative effect on performance).
This is going off topic completely, but there is a very slim theoretical chance that you could have 7 cores on an AMD machine. You could disable a core on a dual quad-core system, or it might be possible to get another core or two by core unlocking (using the extra cores that are sometimes present on AMD CPUs but "locked"). Based on how many khash/s they are getting, they either do not have 7 cores or are running in a VM. If they do have 7 cores they should post a CPU-Z screenshot, because it would seem that we have an outlier in the performance boosts and it would be good to figure out why.

Back on topic: my desktop has been crunching away since last night at 06:57 am forum time (9 hours) using the Intel build. No crashes, blocks are coming in, and I have 29 connections.
I've only had one post... so far! OliPro, love the work you're doing with BitCoin.
I was thinking about VirtualBox when I wrote that.
I've thought about dialing BitCoin down to 7 active cores on my desktop machine, just to keep one free for general lightweight UI use.
The BitCoin threads are automatically assigned low priority; as soon as any other process wants to use the CPU, they will automatically lose CPU time.
Is that still starting from Crypto++? Let's get this into the main source code.