Is KeePassXC accurate when calculating the entropy of a password?

Does anyone know if KeePassXC is accurate when calculating the entropy of a password?
I mean if you want to encrypt something, you may use KeePassXC to type the password and see the entropy score. Is it accurate? I mean, this would be better than using some of these websites, since you do not want to type your password on a website obviously. I just don’t know if you can trust it.

If yes, how much entropy would you recommend so no one can crack it?

Would it be possible to memorize a password with impossible to crack entropy score or you would forget it?

I can’t assess the KeePassXC entropy calculation, but I know from their GitHub repo that they use tsyrogit/zxcvbn-c (a C/C++ version of the zxcvbn password strength estimator), so maybe you can find some pointers there. Also, here is a nice explanation of how password entropy is calculated: “How to Calculate Password Entropy?” (Password Generator).

Hope this can help :grin:

Entropy is the base 2 logarithm of the total number of passwords that could be generated by your chosen method. The generation being random is crucial to this, so you can’t rely on human choice. Therefore, you can’t measure the entropy of an arbitrary password that has an unknown generation method.

In KeePassXC, the password generation menu only shows a measure of the password’s guessability with zxcvbn as linked above, and should pretty much be ignored. The passphrase generation menu shows the actual entropy of the generator, but does not indicate that the entropy is no longer valid after you manually edit the passphrase. Having both of these things labelled as “entropy” in similar menus is inconsistent and only serves to confuse users who don’t know the difference. They could easily show the actual entropy as a value for the generator and then show the guessability under the password as you change it. But they would rather mislabel guessability as entropy and display invalid entropy for manually edited passphrases.

The maintainer has a lack of understanding for what entropy is. They seem to think entropy is something other than a measure of the password generator, and continue to stifle discussion on it.

The entropy you should target depends on whether your password will be put through a key derivation function or is otherwise rate limited by a secure hardware element and/or a remote server. I’ll just recommend 80, but don’t use the value in KeePassXC for that. Do log_2((size of character set)^(password length)) for your method of generation. Passphrases should be easier to remember, and you should replace character by word and password length with number of words in the entropy calculation.
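As a rough sketch of that calculation: the function name and the example pool sizes (94 printable ASCII characters, a 7776-word list) are my own illustrative assumptions, and the result only holds when every character or word is picked uniformly at random and independently.

```python
import math


def generator_entropy_bits(pool_size: int, length: int) -> float:
    """Entropy in bits of `length` independent, uniformly random picks
    from a pool of `pool_size` symbols (characters or words)."""
    if pool_size < 2 or length < 1:
        raise ValueError("need at least 2 symbols and a length of at least 1")
    # log2(pool_size ** length) is the same as length * log2(pool_size)
    return length * math.log2(pool_size)


# Assumed example pools, not values taken from KeePassXC:
password_bits = generator_entropy_bits(94, 13)      # 13 random characters
passphrase_bits = generator_entropy_bits(7776, 6)   # 6 random words

print(f"13 random characters from 94 symbols: {password_bits:.1f} bits")
print(f"6 random words from 7776 words:       {passphrase_bits:.1f} bits")
```

The number describes the generator, not the string it happened to produce, so it stops being valid as soon as you edit the result by hand.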

LLMs can not explain math. They are fancy lie generators.

Trying to guess the character set size of a given password is not going to be reliable. Even if you could, the metric provided by zxcvbn (KeePassXC’s number) takes into account far more than your attempt did (namely, the fact that the password had artifacts of non-random generation). So the 69.52 number is a much better estimate than 150.8, but it is merely a measure of guessability that attempts to reflect how easy the password is to crack (not entropy; they mislabel it). 69.52 still seems to be a vast overestimate though, which is always going to happen because you can’t account for all the patterns that human-created passwords contain.

If your use case is measuring passwords of unknown origin, then using KeePassXC is as good of a method as you are going to get. If you instead want to produce reliably strong passwords, then pick an entropy target, pick a character set, figure out what length it needs to be (solve X*log2(character-set-size)>=entropy-target, where X is the length), then randomly generate it with KeePassXC or another tool. You may have to use a predefined character set for some tools.
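Here is a minimal sketch of that procedure, assuming an 80-bit target and a 62-symbol alphabet (both just example choices). The helper names are hypothetical, and Python’s standard `secrets` module stands in for KeePassXC or whatever generator you actually use.

```python
import math
import secrets
import string


def required_length(pool_size: int, target_bits: float) -> int:
    """Smallest X such that X * log2(pool_size) >= target_bits."""
    return math.ceil(target_bits / math.log2(pool_size))


def generate_password(alphabet: str, target_bits: float) -> str:
    """Pick each character uniformly at random with a CSPRNG.
    Assumes `alphabet` contains no duplicate characters."""
    length = required_length(len(alphabet), target_bits)
    return "".join(secrets.choice(alphabet) for _ in range(length))


ALPHABET = string.ascii_letters + string.digits  # 62 symbols (example choice)
TARGET_BITS = 80                                 # example target

length_needed = required_length(len(ALPHABET), TARGET_BITS)
password = generate_password(ALPHABET, TARGET_BITS)
print(f"{length_needed} characters needed for {TARGET_BITS} bits")
print(password)
```

Rounding the length up means the real entropy lands slightly above the target rather than below it.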

Entropy and guessability both attempt to represent the number of attempts an attacker should have to make at guessing your password (2^(either of them)). The reason entropy is more reliable is because you assume the attacker knows exactly how the password was generated (your character set), while a guessability estimator tries, and often fails, to look for human patterns in an arbitrary password to guess at how many attempts an attacker would take. Entropy is just a far simpler metric that is rigorously defined, while guessability is beholden to whatever complicated implementation someone makes to estimate it.

Please refer to this.

But what’s the point of knowing raw entropy if the “KeePassXC entropy value”, aka “guessability/implementation someone makes to estimate it” is more accurate than the raw entropy value?

Perhaps they should make it clear that the value is not raw entropy but a customized benchmark, but at the end of the day, the user wants to know how hard it is to crack their password, so whatever is more accurate at guesstimating this is more useful to know to the user.

Since squealer answered your first question, here are answers to your others:

If you want impossible, then go for 128 bits of entropy in a randomly generated password (see Squealer’s comments), which is as “strong” as the encryption algorithm that is most commonly used for symmetric encryption (AES-128) in the first place.

Yes, it is possible with Diceware! Please see:

A 10-word randomly generated Diceware passphrase is considered the “gold standard” for sensitive information in some communities, because its entropy is roughly 128 bits and it’s fairly easy to remember. Brute-force searches over 128 bits are essentially impossible with classical computers.
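For illustration, here is a sketch of how a Diceware-style generator works in software. Everything in it is an assumption for demonstration: the function names are made up, the dice are simulated with Python’s `secrets` module instead of physical dice, and the word list is a placeholder of numbered dummy words. In practice you would load a real 7776-entry list keyed by five dice rolls.

```python
import math
import secrets
from itertools import product


def roll_dice_key(dice: int = 5) -> str:
    """Simulate `dice` rolls of a six-sided die with a CSPRNG, e.g. '41536'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(dice))


def diceware_passphrase(wordlist: dict[str, str], words: int = 10) -> str:
    """Look up one word per five-dice roll and join them with spaces."""
    return " ".join(wordlist[roll_dice_key()] for _ in range(words))


# Placeholder list: 6**5 = 7776 dummy entries keyed '11111' .. '66666'.
# Replace with a real Diceware word list before using this for anything.
PLACEHOLDER_LIST = {
    "".join(key): f"word{i}"
    for i, key in enumerate(product("123456", repeat=5))
}

phrase = diceware_passphrase(PLACEHOLDER_LIST, words=10)
bits = 10 * math.log2(len(PLACEHOLDER_LIST))
print(phrase)
print(f"{bits:.1f} bits of entropy for 10 words")
```

The entropy figure depends only on the list size and the number of words, which is why it can be stated before the passphrase is even generated.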

If you want something that’s physically impossible to brute-force (as in, not enough energy in the universe to do it), you can memorize a 20-word Diceware passphrase (>256 bits), but 10 is more than enough. Dr. Schneier wrote a gem about the infeasibility of breaking such large keys in his book Applied Cryptography:

One of the consequences of the second law of thermodynamics is that a certain amount of energy is necessary to represent information. To record a single bit by changing the state of a system requires an amount of energy no less than kT, where T is the absolute temperature of the system and k is the Boltzmann constant. (Stick with me; the physics lesson is almost over.)

Given that k = 1.38×10^-16 erg/°Kelvin, and that the ambient temperature of the universe is 3.2°Kelvin, an ideal computer running at 3.2°K would consume 4.4×10^-16 ergs every time it set or cleared a bit. To run a computer any colder than the cosmic background radiation would require extra energy to run a heat pump.

Now, the annual energy output of our sun is about 1.21×10^41 ergs. This is enough to power about 2.7×10^56 single bit changes on our ideal computer; enough state changes to put a 187-bit counter through all its values. If we built a Dyson sphere around the sun and captured all its energy for 32 years, without any loss, we could power a computer to count up to 2^192. Of course, it wouldn’t have the energy left over to perform any useful calculations with this counter.

(This is what it would take to successfully perform a brute-force search on a 10-word Diceware passphrase, which has >128 bits of entropy.)

But that’s just one star, and a measly one at that. A typical supernova releases something like 10^51 ergs. (About a hundred times as much energy would be released in the form of neutrinos, but let them go for now.) If all of this energy could be channeled into a single orgy of computation, a 219-bit counter could be cycled through all of its states.

These numbers have nothing to do with the technology of the devices; they are the maximums that thermodynamics will allow. And they strongly imply that brute-force attacks against 256-bit keys will be infeasible until computers are built from something other than matter and occupy something other than space.
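If you want to check the arithmetic in the quote yourself, here is a quick sketch using only the constants quoted above (the variable names are mine). It reproduces the 187-bit and 192-bit counter figures for the sun.

```python
import math

# Constants exactly as quoted from Applied Cryptography above.
K_BOLTZMANN = 1.38e-16       # erg per kelvin
TEMPERATURE = 3.2            # kelvin
SUN_ERG_PER_YEAR = 1.21e41   # annual energy output of the sun, in ergs

# Minimum energy to set or clear one bit: kT.
energy_per_bit = K_BOLTZMANN * TEMPERATURE

# How many single-bit changes one year of solar output could pay for.
flips_per_year = SUN_ERG_PER_YEAR / energy_per_bit

# Largest counter (in bits) that could be cycled through all its states.
counter_bits_1_year = math.log2(flips_per_year)
counter_bits_32_years = math.log2(32 * flips_per_year)

print(f"energy per bit change: {energy_per_bit:.2e} erg")
print(f"bit changes per year:  {flips_per_year:.2e}")
print(f"counter size, 1 year:   {math.floor(counter_bits_1_year)} bits")
print(f"counter size, 32 years: {math.floor(counter_bits_32_years)} bits")
```

Multiplying the energy by 32 adds exactly 5 bits (log2 of 32), which is where the jump from 187 to 192 comes from.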

To answer your question, if you want a passphrase that is mathematically and physically outside of human reach to be able to brute-force, you can memorize a 10-word diceware passphrase. If for some reason you’re concerned about alien perfectly-efficient supernova-powered quantum computers, you could memorize a 20-word passphrase. But really, after 10 words it would just be superfluous.

Because that isn’t what I said? This is the second time you have completely misinterpreted what I said and I can only assume you are trolling.

You did not calculate entropy. I told you that you can’t calculate entropy for an arbitrary password and explained why. You estimated guessability in a much cruder way than KeePassXC does.

So if you come up with 10 random words that you are comfortable with remembering, is that less random than the Diceware method? Or can you just not measure the entropy of words you chose yourself? So you cannot know how safe a password is if you invented it? Interesting. Perhaps a password you invented can be harder to crack than the dice-rolling method for all we know, and with a lower chance of forgetting it because it’s your invention and ingrained in your brain’s muscle memory, as it were, rather than something external.

Humans are terrible at being “random”, especially if that “random” is something that they’re comfortable with remembering.

With Diceware you can calculate the entropy. If you’re choosing the words yourself, the selection is not random, and while the entropy can’t be calculated, it’s bound to be less than with an actually random method.

Another thing that humans are terrible at is remembering. You should create an emergency sheet in case your fallible human memory betrays you (simple memory lapse, brain trauma, old age, and so on). An uncrackable password/passphrase is useless if you lock yourself out.

Absolutely not. There’s a wealth of research into this subject, so instead of just trusting us, you should see what cryptologists say on the matter.

Remember that this isn’t 1990. Passphrase crackers now use AI that is trained on billions of cracked passphrases, made by people who also thought they were “clever”, which enables passphrase recovery/forensics companies to break very long and “seemingly complex” passphrases. But actual computer-randomized or dice-randomized passphrases aren’t affected by this threat.

So why do we still hear stories of people being thrown in jail because they refused to disclose a password?

Probably because they used these kinds of passwords:

Or the government doesn’t take kindly to you not following their orders even if they have other means like brute force, usually. IIRC there have been cases where people have been thrown in jail for contempt for not disclosing a password, the government successfully brute forces it anyway, and the person remains in jail for contempt afterwards.

Note that (in my non-lawyer understanding) in some jurisdictions in the US it may be considered a constitutional right to not disclose passwords, but there are conflicting federal rulings from different circuits on this so it remains not fully settled until someone appeals a case related to it up to the Supreme Court.

There’s so much nonsense circulating about this subject as some people here already mentioned, but I’ll try to keep it concise. If we’re discussing randomly generated passwords and passphrases, yes, it’s pretty accurate. I don’t understand why people choose to be overly clever instead of taking a few hours or 1-2 days max to remember just a couple of randomly generated words. It really isn’t a big deal. You’re remarkably secure with six words and with seven words it realistically becomes impossible to break your passphrase. Let’s consider this from a different perspective:

Suppose you chose a randomly generated 7-word passphrase, which translates to approximately 90 bits of entropy. If the password manager developer was a complete idiot who has lived behind the moon for the last 15 years and simply hashed your password using MD5 (essentially nothing), then it should still take more than 100k years to break the passphrase using 1000 RTX 5090s. And again, we are talking about MD5 here. A single RTX 5090 GPU can calculate approximately 220 billion MD5 hashes per second. The default KDF in KeePassXC is Argon2id, which is memory-hard and very bad for GPUs. You can increase the memory parameter to 1 gigabyte or even more, which would effectively gimp any brute force attempt down to a few hashes per second, rendering it pointless to even try. KDFs buy time and entropy is what counts in the end.
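A back-of-the-envelope version of that estimate, as a sketch. The hash rate and GPU count are the figures from this post, the 7776-word list size is the Diceware assumption used elsewhere in the thread, and the variable names are just illustrative.

```python
import math

WORDLIST_SIZE = 7776         # Diceware-sized list (assumption)
WORDS = 7
MD5_PER_GPU_PER_SEC = 220e9  # figure quoted in this post
GPU_COUNT = 1000             # figure quoted in this post
SECONDS_PER_YEAR = 365.25 * 24 * 3600

keyspace = WORDLIST_SIZE ** WORDS             # number of possible passphrases
entropy_bits = math.log2(keyspace)            # roughly 90 bits
guesses_per_sec = MD5_PER_GPU_PER_SEC * GPU_COUNT

# Exhausting the whole keyspace, and the average case (half of it).
years_worst_case = keyspace / guesses_per_sec / SECONDS_PER_YEAR
years_average = years_worst_case / 2

print(f"entropy:        {entropy_bits:.1f} bits")
print(f"full keyspace:  {years_worst_case:,.0f} years")
print(f"average crack:  {years_average:,.0f} years")
```

Each extra word multiplies these times by 7776, while each halving of the attacker’s hash rate (for example through a slow KDF) only doubles them.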

But this is all theoretical. No one, probably not even a nation-state, would attempt to brute-force for more than a few days because it’s simply too expensive and time-consuming. Once they’ve exhausted simple dictionary attacks, I figure they’ll stop. It’s far more effective to pressure you with other stuff.

Bottom line: don’t try to be clever. Remember 5 - 6 words, 7 if you’re paranoid and you’ll be safe from brute force attempts in the foreseeable future.

In my opinion the key is access to your password manager’s database, since it contains the rest of your passwords. So ideally you want to be able to remember this one, and not depend on external devices that you could lose, like YubiKeys, or on some file you have to keep.

What would you recommend for the password manager’s password if you want to remember it? You mentioned 5-6 words or 7 for good measure. Bitcoin wallets use I think 12 words by default for the seed and they aren’t getting their wallets hacked, however remembering 12 words is pretty difficult.

Could you be more specific about the words? Like, do you think up the 6 words yourself, or do you pick them from the Diceware list? And you wouldn’t bother with extra special characters, just 6 words with spaces?

Thanks

I’ll dare to summon @phoerious (one of the maintainers) for all KeePassXC-specific questions.
I hope it’s fine with you sir. :hugs:

I was actually referring to the master password of your password manager. For passwords stored within your password manager, you can use extremely high entropy passwords or passphrases – as you won’t need to remember them anyway. Better yet, use passkeys.

Naturally, when I mentioned randomly generated passphrases, I was referring to diceware passphrases. It doesn’t really matter how you separate them – you may not even separate them at all. Going for 12 words would be a ridiculous overkill even without key derivation.

Also, if I recall correctly, KeePassXC uses zxcvbn to estimate password entropy.

I’m mildly confused by this thread. What is the actual question? Should I explain what entropy is or why we don’t use the naive log2(l^n) calculation? I’ve explained that already in one of the issues linked above as us (me?) supposedly “having a lack of understanding what entropy is”, but I can reiterate if it’s not clear.

Bitcoin wallets don’t use the Diceware EFF list, so there are fewer possible words. The Bitcoin BIP39 English word list is 2048 words, whereas the Diceware EFF word list is 7776 words. Diceware is roughly 12.9 bits of entropy per word whereas BIP39 is exactly 11, and a significant part of the 12th word in BIP39 is a checksum. You only need 10 Diceware words for more than 128 bits of entropy, and you probably don’t actually need a full 128 bits if a strong password hashing function like Argon2 is used. For protecting a password manager database, for most people, 6-8 words is perfectly fine.
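To put numbers on that comparison, here is a small sketch. The list sizes come from this post; the 4-bit checksum for a 12-word mnemonic is taken from the BIP39 specification rather than from this thread, and the variable names are only illustrative.

```python
import math

DICEWARE_LIST_SIZE = 7776   # EFF long list
BIP39_LIST_SIZE = 2048      # BIP39 English list
BIP39_CHECKSUM_BITS = 4     # for a 12-word mnemonic, per the BIP39 spec

diceware_bits_per_word = math.log2(DICEWARE_LIST_SIZE)  # about 12.9
bip39_bits_per_word = math.log2(BIP39_LIST_SIZE)        # exactly 11

# 12 BIP39 words encode 132 bits, but 4 of them are a checksum.
bip39_12_word_entropy = 12 * bip39_bits_per_word - BIP39_CHECKSUM_BITS

# Fewest Diceware words that reach at least 128 bits.
diceware_words_for_128 = math.ceil(128 / diceware_bits_per_word)

print(f"Diceware: {diceware_bits_per_word:.2f} bits/word")
print(f"BIP39:    {bip39_bits_per_word:.2f} bits/word")
print(f"12-word BIP39 mnemonic: {bip39_12_word_entropy:.0f} bits")
print(f"Diceware words for 128 bits: {diceware_words_for_128}")
for words in (6, 7, 8, 10):
    print(f"{words} Diceware words: {words * diceware_bits_per_word:.1f} bits")
```

So a 12-word seed phrase and a 10-word Diceware passphrase land at about the same strength, which is why 12 Diceware words would be overkill.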

I think there is already sufficient explanation in this thread about this. Use random words, always.