Acgq75VpCWjdsJaa5abe9JeX3I (don't worry, this isn't a real password to anything)
…I fed this through an online entropy calculator and got 4.29 bits of Shannon entropy
That calculator is giving you bits *per character*.
You can see this several ways (each one is checked in the sketch after this list):
1. Double the message and the bits per character (bpc) don’t change, because the relative symbol frequencies, and with them the source alphabet, stay exactly the same.
2. Add a dollar sign to the message, and bpc goes up a bit. (This conflicts with your report that adding a special character didn’t change it, but it did for me.)
3. Turn on the calculator’s case folding option and the bpc value goes down a bit, since folding merges upper- and lowercase letters into a smaller alphabet.
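For concreteness, here is a minimal sketch of what such a calculator presumably computes: order-0 Shannon entropy over the observed symbol frequencies. The function name shannon_bpc is my own, not the calculator’s, but the formula reproduces the 4.29 figure and all three effects above.

    from collections import Counter
    from math import log2

    def shannon_bpc(text: str) -> float:
        """Order-0 Shannon entropy of a string, in bits per character."""
        n = len(text)
        return -sum(c / n * log2(c / n) for c in Counter(text).values())

    s = "Acgq75VpCWjdsJaa5abe9JeX3I"
    print(shannon_bpc(s))          # ~4.29, matching the calculator
    print(shannon_bpc(s * 2))      # identical: frequencies are unchanged
    print(shannon_bpc(s + "$"))    # slightly higher: one new symbol
    print(shannon_bpc(s.lower()))  # lower: case folding shrinks the alphabet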
One key realization you should get from this calculator is that ASCII text does not have 7 or 8 bits of entropy per character. It simply does not, because not all characters in the source text are equally likely; many code points may never be used at all in a given corpus.
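A quick check of that, reusing shannon_bpc from the sketch above: a repeated pangram keeps English-like letter frequencies, and its measured rate lands nowhere near the 8 bits each byte could carry.

    # Repetition doesn't change the symbol frequencies, so this is
    # effectively the rate of the sentence itself: about 4.3 bpc.
    english = "the quick brown fox jumps over the lazy dog " * 50
    print(shannon_bpc(english))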
Another realization is that a random blob of hex noise should asymptotically approach 4 bpc, since each hex character carries log2(16) = 4 bits of data, provided the data are evenly distributed across the code space.
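That claim is easy to test with the same function, using os.urandom as a stand-in for a good noise source:

    import os

    # 100,000 random bytes rendered as hex: 16 equiprobable symbols,
    # so the measured rate should sit just under log2(16) = 4.
    hex_noise = os.urandom(100_000).hex()
    print(shannon_bpc(hex_noise))  # very close to 4.0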
Here’s what happens with some noise from grc.com/pass: the first sample reads 3.91, and pasting in more fresh samples pushes the value up toward 4, suggesting it’s got pretty good entropy. (Pasting the *same* blob in repeatedly wouldn’t move the number, per item 1 above.)
Now paste in an equivalent number of ‘a’ characters, and you get 0 bits of entropy. Strictly speaking, that zero is exact, not a rounding artifact: the lone symbol occurs with probability 1, and -(1 × log2(1)) = 0.
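The sketch above confirms the exact zero:

    # A one-symbol message: its only symbol has probability 1, so the
    # entropy sum collapses to -(1 * log2(1)) = 0, with no rounding.
    print(shannon_bpc("a" * 64) == 0)  # True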
dd if=/dev/random bs=100 count=1|od -c
and the result only gave 5.00 bits
That’s plausible. With a much larger sample, the result should approach 7, 8, 16, or 21, depending on your local character set size. (Respectively: pure ASCII, ISO 8859 or similar, UCS-2, and full Unicode.)
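Of those, the ISO 8859 case is the easiest to check with the sketch above, since every one of the 256 byte values decodes to exactly one character:

    import os

    # Random bytes decoded as ISO 8859-1 form a 256-symbol alphabet,
    # so the rate should approach log2(256) = 8 bits per character.
    blob = os.urandom(100_000).decode("latin-1")
    print(shannon_bpc(blob))  # close to 8.0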
Now see if you can guess the asymptotic ideal for this slightly different command:
$ dd if=/dev/random bs=100 count=1 | od
3, because the output is restricted to the eight octal digits, and log2(8) = 3 bpc.
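Simulating that idealized answer (a uniform stream of octal digits, ignoring od’s offset column and whitespace):

    import secrets

    # Uniformly random digits 0-7: an 8-symbol alphabet, so the
    # measured rate should approach log2(8) = 3 bits per character.
    octal = "".join(secrets.choice("01234567") for _ in range(100_000))
    print(shannon_bpc(octal))  # close to 3.0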