BitcoinTalk
bitcoin generation broken in 0.3.8? (64-bit)

View Satoshi only

External link

I was starting to wonder when my systems seemed to quit generating coins if there was something going on. They went from about 1 block / day to none in a week.

ArtForz in irc suggested I run a test isolated net with two nodes only connected to each other with empty wallet dir. I took a couple of quad core systems and set them up. they have produced no blocks in about 90 minutes now while hashing at a combined rate of over 8000 khash/sec. Is this evidence of a problem yet or is it more bad luck?

The systems are Linux 64 bit one Intel quad q6600 and one AMD quad phenom II 940.

The difficulty went up again just after 3.8 was out. I have not gotten any with 4000khash in about 15 days, so 90 minutes of 8000k doesn't mean much. I'm sure we'd know if 3.8 wasn't generating any blocks.
The difficulty went up again just after 3.8 was out. I have not gotten any with 4000khash in about 15 days, so 90 minutes of 8000k doesn't mean much. I'm sure we'd know if 3.8 wasn't generating any blocks.


This is on an isolated test net with just two machines, connections = 1, difficulty = 1
The difficulty went up again just after 3.8 was out. I have not gotten any with 4000khash in about 15 days, so 90 minutes of 8000k doesn't mean much. I'm sure we'd know if 3.8 wasn't generating any blocks.

maybe only a problem with 64 bit linux or something

I haven't had any problems with 0.3.8 release, generated some coin on windows and Linux clients (32/64)bit just fine this week.
The difficulty went up again just after 3.8 was out. I have not gotten any with 4000khash in about 15 days, so 90 minutes of 8000k doesn't mean much. I'm sure we'd know if 3.8 wasn't generating any blocks.


This is on an isolated test net with just two machines, connections = 1, difficulty = 1


Oh, that is odd then, that would be like not finding one in over 500 hours, possible I guess, but getting out there
I haven't had any problems with 0.3.8 release, generated some coin on windows and Linux clients (32/64)bit just fine this week.

I guess I want to know how long I need to run 8000 khash/s at difficulty 1.0 to have any reasonable evidence of a problem. ArtForz said it should tell me in an hour or so.
At ~ 8Mhash/s combined not generating a block at 1.0 diff in over 1h is pretty unlikely.
As in, under 0.1% probability unlikely (50% is ~6 min, 1% ~42 min)
<edit>
Forgot to mention, he's running a custom compile of stock 0.3.8 sources.
I haven't had any problems with 0.3.8 release, generated some coin on windows and Linux clients (32/64)bit just fine this week.

I am wondering if its something odd in the way I have my systems set up or Huh just various plain ubuntu installs so far as I know.
I'm having a similar problem with my own compile of the SVN trunk. I just reverted to the stock builds to test and see what happens. I should try setting up my own network of two nodes with my builds.
Ok chatting with lachesis in irc he tried this and I get the same result: We added some prints in main.cpp at the SHA calls like so :

Code:
       loop
        {
            SHA256Transform(&tmp.hash1, (char*)&tmp.block + 64, &midstate);
            printf("mid hash = ");
            for (int i = 0; i < 8; i++)
              printf(" %08x", ((unsigned *)&tmp.hash1)[i]);
            printf(" ");
            SHA256Transform(&hash, &tmp.hash1, pSHA256InitState);
            printf("full hash = ");
            for (int i = 0; i < 8; i++)
              printf(" %08x", ((unsigned *)&hash)[i]);
            printf(" ");

            if (((unsigned short*)&hash)[14] == 0)

and then in the log we get:

mid hash =
 6a09e667 bb67ae85 3c6ef372 a54ff53a 510e527f 9b05688c 1f83d9ab 5be0cd19
full hash =
 6a09e667 bb67ae85 3c6ef372 a54ff53a 510e527f 9b05688c 1f83d9ab 5be0cd19
mid hash =
 6a09e667 bb67ae85 3c6ef372 a54ff53a 510e527f 9b05688c 1f83d9ab 5be0cd19
full hash =
 6a09e667 bb67ae85 3c6ef372 a54ff53a 510e527f 9b05688c 1f83d9ab 5be0cd19

repeating! The hash call isn't doing anything!
(he maybe got a different repeating value, I don't know)
There is one behavior that I've noticed.

Block sync stalling.

When the client (be it Windows or Linux) has been running for a few days, *sometimes* they get stuck in block limbo. What happens is your client keeps trying to solve the current block and loses track of the entire block system. So as time goes by, other blocks are solved and your PC is still stuck on the same block. The problem is, if you stop and restart the client, it just picks up where it left off on the same stuck block. The only way I've seen to fix this is to delete the block chain so that the client will re-download it.

I've had to clear out a few block chains for a few servers because of this, I'll notice they are stuck on block 70,000 for example, while the rest of the network is working on block 70,839 for example.

That might explain why you go weeks without winning a block with a 24/7 PC going.

The server farm I had running before were spitting out blocks all day, but now that I'm just down to a few dozen servers and the difficulty is way up, I barely see about 3 or 4 blocks a day spread across all of them. But I know the 0.3.8 version is working like it should, except for the occasional block freeze.
This seems to be a different problem. The blocks do not seem to be "stuck" on my systems. The getinfo shows them up to date

It seems the sha256 code is not getting built right for linux 64. Not sure if/how it could work on some and not on others.

There is one behavior that I've noticed.

Block sync stalling.

When the client (be it Windows or Linux) has been running for a few days, *sometimes* they get stuck in block limbo. What happens is your client keeps trying to solve the current block and loses track of the entire block system. So as time goes by, other blocks are solved and your PC is still stuck on the same block.

...

Did you compile it yourself? There's a define in makefile.unix. Something like -DCRYPTOPP_DISABLE_SSE2. Remove that. Else the SHA256_Transform will return the initstate. I had the same problem when switchting to 0.3.6.
Did you compile it yourself? There's a define in makefile.unix. Something like -DCRYPTOPP_DISABLE_SSE2. Remove that. Else the SHA256_Transform will return the initstate. I had the same problem when switchting to 0.3.6.

That seems to be the root of the problem. I think even the bundled binary for Linux 64 in 0.3.8 was compiled wrong then.

That seems to be the root of the problem. I think even the bundled binary for Linux 64 in 0.3.8 was compiled wrong then.
That appears to fix it for me too.

EDIT:
I've uploaded corrected Linux builds to http://www.alloscomp.com/bitcoin/binaries/release-r123-2010-08-08/. Enjoy!
Did you compile it yourself? There's a define in makefile.unix. Something like -DCRYPTOPP_DISABLE_SSE2. Remove that. Else the SHA256_Transform will return the initstate. I had the same problem when switchting to 0.3.6.

That seems to be the root of the problem. I think even the bundled binary for Linux 64 in 0.3.8 was compiled wrong then.

I just tested it out, I get the same results. Repeating values with stock, more random values with the flag modification. Probably needs to be fixed for the next release, though I'm not sure how mine are generating anything then?
@lachesis: 404 not found

@knightmb: What about a fixed CentOS build?
@lachesis: 404 not found

@knightmb: What about a fixed CentOS build?
Should be simple enough for CentOS, I was going to test it out first to see if those suffered the same issue, thought it seems like all the builds would suffer this issue if the compile flag is what makes the difference.
This seems to be a different problem. The blocks do not seem to be "stuck" on my systems. The getinfo shows them up to date

It seems the sha256 code is not getting built right for linux 64. Not sure if/how it could work on some and not on others.

There is one behavior that I've noticed.

Block sync stalling.

When the client (be it Windows or Linux) has been running for a few days, *sometimes* they get stuck in block limbo. What happens is your client keeps trying to solve the current block and loses track of the entire block system. So as time goes by, other blocks are solved and your PC is still stuck on the same block.

...

I'm starting to wonder if the two are connected now since the hashing seems to get stuck, maybe that's why it gets stuck on a block.
After testing the 32bit and 64bit builds, the problem I can confirm what was already posted here, it just affects the 64bit builds. When I go back and trace what's being produced, it's the 64bit Linux machines (the 64bit windows machines don't appear to have this problem) that aren't producing any coin since the latest releases. All of the 32bit builds (stock or custom) appear to be just fine.
Did you compile it yourself? There's a define in makefile.unix. Something like -DCRYPTOPP_DISABLE_SSE2. Remove that. Else the SHA256_Transform will return the initstate. I had the same problem when switchting to 0.3.6.
This works for the 64bit builds, strange that it has no affect on the 32bit builds? Should it be disabled for 32bit builds as well?
I don't really know what it changes. I wonder why the define was made in the first place. The code should be able to determine whether it's running on a SSE2 CPU or not. I discovered it while comparing my 4xSSE2 hash function to the original one in 0.3.6 so it really means, that no 64bit build since 0.3.6 was able to generate any coins. Maybe satoshi can tell us why he put the define in the makefile.
Just had a "fun" gdb session with the official 0.3.8 linux 64 bit binary on debian sid.
Same bug, output state on return from SHA256Transform always == initial state.
So... did someone really generate a block running the official 0.3.8 64 bit linux binary (the one in bin/64)?
Oh, and from a quick glance at the svn changelog, that bug probably has been there since r118 = 0.3.6.
Oh, and who THE FUCK thought stripping the official binary was a good idea? I just wasted half an hour hunting down SHA256Transform in a disassembler.
Just had a "fun" gdb session with the official 0.3.8 linux 64 bit binary on debian sid.
Same bug, output state on return from SHA256Transform always == initial state.
So... did someone really generate a block running the official 0.3.8 64 bit linux binary (the one in bin/64)?
Oh, and from a quick glance at the svn changelog, that bug probably has been there since r118 = 0.3.6.
Oh, and who THE FUCK thought stripping the official binary was a good idea? I just wasted half an hour hunting down SHA256Transform in a disassembler.


Not on 64bit linux, just 32bit and 64bit everything else in between (Linux/Windows)

I didn't notice until I checked the generation dates against the release dates of the software.

If you don't strip the debug symbols the binary is 8 times it's normal size.
Just had a "fun" gdb session with the official 0.3.8 linux 64 bit binary on debian sid.
Same bug, output state on return from SHA256Transform always == initial state.
So... did someone really generate a block running the official 0.3.8 64 bit linux binary (the one in bin/64)?
Oh, and from a quick glance at the svn changelog, that bug probably has been there since r118 = 0.3.6.
Oh, and who THE FUCK thought stripping the official binary was a good idea? I just wasted half an hour hunting down SHA256Transform in a disassembler.
Nice work mate! Time to PM Satoshi.

Sent you \u0e3f5.01 for your troubles.
The 32 bit Linux build seems ok for those who don't care to try to build it themselves. It's only a few percent slower than the 64 when built right.

I guess that flag was put in for old 32 bit machines that might not have sse2. Unfortunatly there is no such thing as a 64 bit x86_64 without sse2 so the conditional compilation produced an empty body which did exactly nothing.
I guess that flag was put in for old 32 bit machines that might not have sse2. Unfortunatly there is no such thing as a 64 bit x86_64 without sse2 so the conditional compilation produced an empty body which did exactly nothing.
Yet another argument for cmake or similar.
I found that SSE2 only added a slight 2% speedup, which didn't seem worth the incompatibility.  I was trying to take the safer option.

It doesn't look to me like Crypto++ could be deciding whether to use SSE2 at runtime.  There's one place where it detects SSE2 for deciding some block count parameter, but the SSE2 stuff is all #ifdef at compile time and I can't see how that would switch at runtime.  Maybe I'm not looking in the right place.

Should we enable SSE2 in all the makefiles?  It seems like we must in case someone compiles with 64-bit.

I will recompile the 64-bit part of the Linux 0.3.8 release.
I found that SSE2 only added a slight speedup, about 2%, which didn't seem worth the incompatibility. 
I do the see point because a non-SSE2 machine, the client would crash?
Quote
It doesn't look to me like Crypto++ could be deciding whether to use SSE2 at runtime.  There's one place where it detects SSE2 for deciding some block count parameter, but the SSE2 stuff is all #ifdef at compile time and I can't see how that would switch at runtime.  Maybe I'm not looking in the right place.

Should we enable SSE2 in all the makefiles?  It seems like we must in case someone compiles with 64-bit.

I will recompile the 64-bit part of the Linux 0.3.8 release.
Depends on which is easier really. If you enable something that is going to break compatibility, the only people who will feel the pain are the non-technical users of the client. From their perspective, it just doesn't work for some reason.

I think some builds notes would be better, example (If compiling on a 64bit system, be sure to do this part).

Have one branch of source that cross-compiles across multiple operating systems is always tricky business.  Grin
Yet another argument for cmake or similar.

It's a fact.

I intend to eventually sort out all building hassles and have a reliable build procedure across all win/unix 32/64-bit platform combinations.

On that note, I'm currently tackling x64 build for windows and notice that for 64-bit MSVC, the X86_SHA256_HashBlocks function is deferred to an external definition that is not present in the project. In the original CryptoPP library it seems to be in a separate asm module. I wonder how are people building x64 on windows, are they setting defines to use the C-source sha version?
I uploaded 0.3.8.1 for Linux with re-built 64-bit.  I ran a difficulty 1 test with it and it has generated blocks.

http://bitcointalk.org/index.php?topic=765.0

Download:
http://sourceforge.net/projects/bitcoin/files/Bitcoin/bitcoin-0.3.8/bitcoin-0.3.8.1-linux.tar.gz/download