How much entropy is lost alphabetising your mnemonics?


TL;DR: you get an attack speedup of around a factor 500 million for 12 words, which does not quite bring it into the realm of feasible attacks, but it's getting close. For 24 words, the speedup is around a factor of half a septillion, but that still leaves it far beyond the realm of feasible attacks.

Repetition of words or not?

First I need to ask: are words permitted to occur multiple times within the same sorted phrase? If not, Mike D's answer is correct. If they are, the question just got a lot more complicated, at least if you care about the exact numbers.

The difficulty with allowing repetition

Once you allow repetition, the distribution you obtain by sorting uniformly chosen phrases is no longer uniform. For example, the sorted phrase "act act act act act act act act act act act zoo" can be obtained from 12 distinct unsorted phrases ("act ... act zoo", "act ... act zoo act", "act ... act zoo act act", ...). The sorted phrase "mad mad mad mad mad mad mad mad mad mad mad mad" can only be obtained in one way. Thus, if you start with a random unsorted phrase and sort it, the "act ... act zoo" phrase is 12 times more likely than the "mad ... mad" phrase.
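As a quick sanity check of those two counts, here is a small Python sketch (the helper name is just for illustration) using the standard multinomial formula k! / (m1! · m2! · ...) for the number of distinct orderings of a multiset of words:

```python
from collections import Counter
from math import factorial

def orderings(words):
    """Number of distinct unsorted phrases that sort to this sorted phrase."""
    result = factorial(len(words))
    for multiplicity in Counter(words).values():
        result //= factorial(multiplicity)
    return result

print(orderings(["act"] * 11 + ["zoo"]))  # 12
print(orderings(["mad"] * 12))            # 1
```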

This non-uniformity has a subtle but complicated impact on our question. If the phrases are uniformly distributed, with N possibilities, then it holds that:

  1. The number of distinct phrases is N.
  2. The entropy (in bits) of a sampled phrase is log2(N).
  3. The average number of tries an attacker needs to brute-force guess a sampled phrase is (N + 1) / 2.

When the phrases are not uniform (as is the case when sorting uniform phrases with repetition), the answers to (2) and (3) can no longer be expressed simply as a function of N. They depend on exactly how non-uniform the distribution is. Worse, you can't even compute (3) from (2) or the other way around, because entropy is not a measure of how secure something is, but of how compressible it is. The two are strongly correlated, but not exactly the same, and the difference between them again depends on the exact distribution.
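To see this concretely at a scale where everything can be enumerated, here is a toy sketch assuming a hypothetical vocabulary of N=4 words and phrases of k=3 words (nothing BIP39-sized): it sorts every possible unsorted phrase and shows that the number of possibilities, the entropy, and the mean number of tries for a guesser who tries the most likely phrases first no longer line up the way they would for a uniform distribution.

```python
from collections import Counter
from itertools import product
from math import log2

# Toy parameters (assumption: tiny vocabulary, not the 2048-word BIP39 list).
N, k = 4, 3
words = [f"w{i}" for i in range(N)]

# Sort every possible unsorted phrase and tally how often each sorted phrase occurs.
counts = Counter(tuple(sorted(phrase)) for phrase in product(words, repeat=k))
probs = sorted((c / N**k for c in counts.values()), reverse=True)

distinct = len(probs)                                        # metric 1
entropy = -sum(p * log2(p) for p in probs)                   # metric 2
mean_tries = sum((i + 1) * p for i, p in enumerate(probs))   # metric 3

print(distinct, log2(distinct))  # 20 sorted phrases, log2(20) ~ 4.32
print(entropy)                   # ~4.14 bits: less than log2(20)
print(mean_tries)                # 8.0: less than (20 + 1) / 2 = 10.5
```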

This is all to say that the answers to these three questions are distinct, and which one matters depends on what you're interested in:

  1. How many distinct sorted phrases are there?
  2. How much entropy do phrases obtained by sorting uniform unsorted phrases have?
  3. How hard is it for attackers to guess a phrase obtained by sorting a uniform unsorted phrase?

1. How many distinct sorted phrases are there?

The number of distinct multisets of k elements, drawn from a set of N elements, is the binomial coefficient (N + k - 1 choose k).

  • For k=12 and N=2048 that is 11737947382912650038702322439680, or ~2^103.211.
  • For k=24 and N=2048 that is 54640990776272965318633918350257589558711775320768326400, or ~2^185.156.

Note that these numbers only differ slightly from the corresponding numbers when repetition is not allowed (2^103.118 and 2^184.767 respectively).
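These counts are easy to reproduce, for example with Python's math.comb (a minimal sketch of the formula above):

```python
from math import comb, log2

N = 2048
for k in (12, 24):
    with_repetition = comb(N + k - 1, k)   # distinct multisets of k words
    without_repetition = comb(N, k)        # distinct k-word subsets
    print(k, with_repetition)
    print(k, log2(with_repetition), log2(without_repetition))
# k=12: ~2^103.211 (with repetition) vs ~2^103.118 (without)
# k=24: ~2^185.156 (with repetition) vs ~2^184.767 (without)
```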

2. How much entropy do phrases obtained by sorting uniform unsorted phrases have?

The entropy of a distribution is how many bits on average an optimal compression scheme needs per element to encode a (long) list of randomly sampled elements from that distribution. For a discrete distribution taking on possibilities a1, a2, ..., an, with respective probabilities p1, p2, ..., pn, that is -(p1·log2(p1) + p2·log2(p2) + ... + pn·log2(pn)). By applying this formula to the probability distribution of sorted phrases exactly, you get:

  • For 12 words: ~103.196 bits
  • For 24 words: ~185.097 bits
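The answer doesn't show how these numbers were computed, but the sum is tractable if you group sorted phrases by their repetition pattern, i.e. the integer partition of k given by the word multiplicities: every sorted phrase in a group has the same probability, and there are only 77 patterns for k=12 and 1575 for k=24. Here is a Python sketch of that grouping approach (an assumption about methodology, not necessarily how the original numbers were obtained):

```python
from collections import Counter
from math import comb, factorial, log2

N = 2048

def partitions(k, max_part=None):
    """Yield all integer partitions of k as non-increasing tuples."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def pattern_stats(k):
    """For each repetition pattern, yield (number of sorted phrases, probability of each)."""
    for pattern in partitions(k):
        r = len(pattern)                       # number of distinct words used
        # Choose which r words occur, then divide out interchangeable equal-sized parts.
        count = comb(N, r) * factorial(r)
        for c in Counter(pattern).values():
            count //= factorial(c)
        # Probability of one such sorted phrase = (# of orderings) / N^k.
        orderings = factorial(k)
        for m in pattern:
            orderings //= factorial(m)
        yield count, orderings / N**k

for k in (12, 24):
    entropy = -sum(cnt * p * log2(p) for cnt, p in pattern_stats(k))
    print(k, entropy)   # ~103.196 and ~185.097 bits
```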

3. How hard is it for attackers to guess a phrase obtained by sorting a uniform unsorted phrase?

The optimal strategy for an attacker that tries to brute force, and who knows the probability distribution, is to try values from most likely to least likely. The expected number of tries under such a strategy is 1·p1 + 2·p2 + ... + n·pn, assuming the pi are sorted from high to low. Applying this formula to the exact probability distribution in question here, we get:

  • For 12 words: ~(2^103.166 + 1) / 2
  • For 24 words: ~(2^184.985 + 1) / 2

In other words, the average number of tries an attacker needs to guess these sorted-phrase distributions is as if they were uniform distributions with ~103.166 and ~184.985 bits of entropy respectively (but, as demonstrated in the previous section, that is not exactly the entropy the sorted phrase distributions have).
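The same grouping makes this computation tractable too: sort the pattern groups from most likely to least likely per phrase, and the expected rank within each group has a closed form. A continuation of the previous sketch (it reuses pattern_stats and log2 from there):

```python
def expected_tries(k):
    """Expected number of guesses for an attacker trying sorted phrases most-likely-first."""
    groups = sorted(pattern_stats(k), key=lambda g: g[1], reverse=True)
    tried = 0        # phrases already covered by more likely groups
    expected = 0.0
    for count, p in groups:
        # This group occupies ranks tried+1 .. tried+count, each hit with probability p.
        expected += p * (count * tried + count * (count + 1) / 2)
        tried += count
    return expected

for k in (12, 24):
    e = expected_tries(k)
    # Express the result as (2^v + 1) / 2 to compare with a uniform distribution.
    print(k, log2(2 * e - 1))   # ~103.166 and ~184.985
```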

Summary

The following tables summarize the numbers for all 4 scenarios (repetition allowed or not, sorted or not), under all 3 metrics (number of possibilities, entropy, and expected number of tries for an attacker):

| 12 words                      | # possibilities | entropy       | mean tries           |
|-------------------------------|-----------------|---------------|----------------------|
| Normal, repetition            | 2^132           | 132 bits      | (2^132 + 1) / 2      |
| Normal, no repetition         | ~2^131.953      | ~131.953 bits | ~(2^131.953 + 1) / 2 |
| After sorting, repetition (*) | ~2^103.211      | ~103.196 bits | ~(2^103.166 + 1) / 2 |
| After sorting, no repetition  | ~2^103.118      | ~103.118 bits | ~(2^103.118 + 1) / 2 |

| 24 words                      | # possibilities | entropy       | mean tries           |
|-------------------------------|-----------------|---------------|----------------------|
| Normal, repetition            | 2^264           | 264 bits      | (2^264 + 1) / 2      |
| Normal, no repetition         | ~2^263.805      | ~263.805 bits | ~(2^263.805 + 1) / 2 |
| After sorting, repetition (*) | ~2^185.156      | ~185.097 bits | ~(2^184.985 + 1) / 2 |
| After sorting, no repetition  | ~2^184.767      | ~184.767 bits | ~(2^184.767 + 1) / 2 |

(*) = Note that only the rows for after sorting with repetition have distinct numbers in their expressions for the three columns. For all others, they follow the pattern 2^v, v bits, (2^v + 1) / 2 for some value v perfectly, because they induce uniform distributions.

