Hierarchical key generation

Alexey Shepelev
16 min readDec 28, 2020

--

In this article, we will talk about deterministic wallets, hierarchical key generation, as well as how this works mathematically and in what cases this is convenient to apply in practice. This material will be helpful for specialists whose activities relate to payment gateways, Bitcoin wallets, and other coin storages. Also, the material will be of interest to those keen on elliptic-curve cryptography and digital signature key deployment schemes.

Deterministic wallet

First, let’s define what a deterministic wallet is. When we talk about key generation, we often use the word “wallet” because in the context of cryptocurrencies, possessing a private key is proof of owning coins, and in this case, a wallet and a key have a similar meaning.

A deterministic wallet is a wallet in which all used private keys were generated from one secret shared by all keys. The peculiarity is that it is possible to generate any number of key pairs for digital signature from one secret. You can use new addresses for each incoming payment and change.

Conveniently, the keys of such wallet can be easily transferred to another device, backed up and then restored because, in fact, only the master secret needs to be backed up. Besides, all private keys generated from the master secret are not related to each other in any way. It is also impossible to find relations between generated addresses (determine that they all belong to the same user). What’s more, if you possess a generated private key, you cannot restore the shared secret.

Master secret encoding

Now let’s talk about master secret encoding. There is some standardized approach that was defined in BIP39. This is the so-called Check Encoding, i.e. encoding the master secret into a mnemonic phrase — a set of words that is easy to write down on paper and, if necessary, memorize. When entering, it is possible to check the checksum, i.e. there is a fairly high probability to identify an error if any.

Master secret encoding — BIP39

How does it work? In fact, you have the master secret (Entropy) — the data from which all private keys of the wallet are deployed. This secret is available in various lengths. As for the checksum: there is 1 bit of the checksum for every 32 bits of Entropy, i.e. the Checksum is calculated using the following formula: the length of Entropy in bits divided by 32.

The Entropy is concatenated with the checksum calculated as a double SHA-256 hash (SHA-2 is 256 bits long), after which the required number of bits are stripped off. The concatenated data is converted into another numeral system: from binary to base-2048 (as you can see, 2048 is 211). And if you add the length of the Entropy bits and the checksum, you’ll get a multiple of 11. Thus, we get the number of words in the output mnemonic phrase.

In fact, the data is “sliced” in 11-bit chunks. There is a dictionary of 2048 words (211) to which certain requirements apply. The default language of the dictionary is English, but any language can be used. The words must not exceed a certain length (usually up to 7 characters). They must all be encoded in UTF-8 with some normalization of all characters. The uniqueness of each word for the first four characters is mandatory.

The first four characters uniquely define the word in the dictionary, and the remaining characters are used to complete this word to a convenient form for reading, memorizing, etc. Thus, each piece of data consisting of 11 bits gets a one-to-one correspondence as a word from dictionary. If the Entropy of your secret is 256 bits, then the data to encode will be 264 bits, and your mnemonic phrase will contain 24 words. This is the main approach to encoding the wallet secret in BIP39, and it is most often used in practice.

To make a backup copy in the future and use it, you are offered to write this phrase to an external medium. Paper that you store in a safe place is best. This way you can regain full access to all your keys.

Types of deterministic wallets

Deterministic wallets are of two types. Let’s consider their main differences.

The first one is the simplest. The master secret here is concatenated with the index of the child key we want to get, after which the concatenated data is hashed. This is most often done using the SHA-256 hash function.

The second type includes hierarchical deterministic wallets (HD wallets). Defined in BIP32, their principles are a very common approach to hierarchical key generation.

Deterministic generation

Let’s consider the differences between these types of wallets in the diagram.

A typical deterministic wallet has some seed from which a huge variety of private keys are directly generated. Their number can only be limited by the size of the index, which is concatenated to the secret before hashing. These are usually 4 bytes, which means that the candidate space allows for about 4 billion unique keys for a deterministic wallet. In practice, this should be sufficient for any situation.

Hierarchical deterministic generation

Let’s move on to a hierarchical deterministic wallet, the key deployment scheme of which is so far presented in a simplified form. There is a seed, from which a pair of master keys is directly obtained. Whereas we get one private key in an ordinary deterministic wallet, here we get a pair of keys. What’s more, there are hierarchy levels, at each of which we calculate an index to generate a child key. We can also build public key branches and private key branches.

Hierarchy levels

Regarding HD wallets, it is worth noting that according to BIP32 rules, the parent node has three objects at each level of hierarchy: a private key, a public key, and a chain code, which is used to generate the next level of hierarchy.

Hierarchical generation scheme

Let’s consider in more detail the scheme for generating keys using BIP32.

It all starts with the seed. It is also called the master seed, from which the zero level of hierarchy is calculated — a pair of master keys and a chain code.

A huge variety of key pairs with specific indices can be generated from a master key pair. A new level of hierarchy is formed, and it is used for generating accounts. Let’s say one user has a seed and he wants to create several addresses that will be different from each other. The coins of these addresses will not be mixed, will not be published together, and it will not be possible to find relations between them in finished transactions. These keys will be used completely separately from each other. In one of the accounts, a group of keys will be used for the working budget, in the other, for the personal budget, and another account will be used for false accounting. The coins will not be mixed.

The next level of hierarchy defines different key generation chains. Chains with indices 0 and 1 are most commonly used. The chain with index 0 will generate final keys to form an address for incoming payments, while the chain with index 1 will generate wallets to which coins sent by the user to himself, i.e. change, will come. This is necessary for the wallet to programmatically distinguish payments sent from outside from change, calculate changes in the balance of each transaction, and compile a visual list with the history of all payments. This makes it easier to develop the wallet and use it for routine payments.

Hash-based message authentication code

Now let’s move on to the math behind the hierarchical key generation processes. Let’s start with considering the hash-based message authentication code. This is a different class for calculating hash functions. It differs in that it accepts two values ​​as input, and not one. The first value is the secret key and the second is the message itself.

K — key
m — message
opad, ipad — some constant values ​​needed to generate keys, which differ from each other, at different stages of hashing.
SHA-512 is used as a hash function.

The peculiarity is that to use HMAC, you need to possess the secret key so that you can get the correct hash value of the message.

So, to calculate the hash value using HMAC, the key value is XORed with the ipad constant value, after which the result is hashed. The message is concatenated to this value. After that, the key is XORed with the opad constant value, concatenated with the hash value, and then hashed again. As a result, we get 512 bits of the hash value.

Derivation functions

Let’s consider a few functions that are used in calculating hierarchical keys.

Derivation functions

First of all, we are interested in converting the master seed into a master key pair. After that, you need to derive the private child key and the public child key from the private parent key. In the second case, we use exactly the same function as in the first, but the multiplication of the number by the base point is added, so it will not be considered separately. This is followed by deriving the public child key from the public parent key. It is worth noting that it is impossible to derive the private child key from the public parent key. This limitation is due to certain properties inherent in HD wallets that will be considered next.

So, let’s consider each of the derivation functions separately.

To derive the master key from the master seed, the HMAC function is used, where the constant string “Bitcoin seed” is used as a key, and the data of the master secret is used as a message. Thus, we get a hash value of 512 bits, which we consider as two parts: left and right. The left side is the master private key and the right side is the chain code. Further, these values ​​will be used to generate subsequent levels of child keys.

To derive the master public key, you need to multiply the value of the base point by the value of the master private key. As you can see, this happens in the same way as with usual keys in a group of points on an elliptic curve.

Now let’s see how to derive the private child key from the private parent key.

Let’s use the HMAC function again. We use the chain code of the current level of hierarchy as a key, and as a message, we use a concatenation, where the first part is the public parent key multiplied by the base point. In fact, it is bringing to a point and serializing that point. The concatenation occurs with the index of the child key, which we want to get, serialized to 32 bits, i.e. to 4 bytes.

Based on the result of the HMAC function, we get the I value, and we also consider its two parts separately: the left and right output values ​​of 256 bits each. Then we calculate the private child key ki by adding the private parent key value to the left output IL value. We perform the addition modulo n, where n is the order of the elliptic curve base point, in order not to exceed the maximum value of the private key. Thus, we have derived the private child key.

Accordingly, the child chain code will be equal to the right output value of the HMAC function, i.e. IR. If we want to get the public child key from the private parent key, we multiply the value of ki by the value of the elliptic curve base point. This will give us the public key Ki.

How do we get the public child key from the public parent key?

This calculation will be slightly different. We use the chain code of the current level of hierarchy as a key and send it to the HMAC function, after which we serialize the public parent key and concatenate it with the desired index, serialized to 32 bits. The input data is received in exactly the same way as in the previous case. To calculate the public key, we take the left side of the output value of the HMAC function and consider it as 256 bits, taken modulo the order of the base point, bring it to a point on an elliptic curve, i.e. multiply by the base point, and then add the result and the public parent key. The addition will also result in a point, and this will be the public child key. The right side of the output value of the HMAC function will be the chain code for this key.

Matching keys to each other

A logical question may arise about how the private key and the public key obtained in different ways will correspond to each other. Is it possible to get exactly the same value from a publicly generated public key by taking a private key obtained in another way and multiplying it by the base point? It’s easy to check.

Key matching

The operation of adding points on an elliptic curve is additive.

If we recall how we calculated the private child key and multiply it by the base point, i.e. bring to the point function, and then recall how we calculated the public child key and compare these calculations, we will realize that if we consider the public parent key as the multiplication of the private parent key by the base point, we perform the same calculations, just in a different order.

In one case, we added the private keys and multiplied by the base point, and in the second case, we first multiplied the values ​​by the base point, and then we added them up and got the result. Based on the fact that the operation of adding points on an elliptic curve is additive, we can say that these two values ​​are equal — we will get the same public key calculated in two ways.

Public key example

Out of curiosity, we can take a look at the example of a public key that was obtained for BIP32 test values ​​and calculations. If our Entropy consisted of 128 bits, then in hexadecimal it will look like the image below.

Key example

This same value, encoded into a BIP39 mnemonic phrase, would appear as the 12 words shown. If you use this mnemonic phrase as a master seed for hierarchical key generation, you will get the following master private key with the corresponding 256 bits chain code.

Extended keys

There are also concepts such as extended public key and extended private key. How are they used? To better understand this, let’s describe an illustrative situation.

Let’s say we have a specific user and some service. The service transfers payments with a certain frequency, for example, in bitcoins, to the user. Both the user and the service are interested in using a new address for each payment in order to confuse an outside observer and obscure the user interaction history.

In the simplest case, it would look like this: for each incoming payment, the user generates a new pair of keys, gets the address that he sends to the service, after which the service can form a transaction and make a payment. However, it is inconvenient for either party if the intensity of these payments is high.

Extended public key

In such a situation, the extended public key (xPubKey) helps get rid of inconvenience. The user can enable a third-party service to generate such addresses instead of himself. The addresses will be known to the service, but only the user will possess private keys. The service can generate any number of addresses without the user’s knowledge and send funds to them, and the user can, at his convenience, deploy private keys and gain access to any of these addresses.

How does it work? The user needs to generate a new account at the second level of key hierarchy, calculate the public key for it, as well as the chain code for the current level. After that, he needs to transfer both the public key and the chain code to the service. For convenience, Base58Check Encoding, which we have talked about, is introduced (here is a special version). Next, the public key, chain code, and checksum are concatenated. All this is encoded into base-58, and we get a public extended key already encoded according to a certain standard. It starts with the characters “xpub”, which is easy to recognize. It will look like the image shown.

Extended public key (xPubKey)

The service can accept such a key and deploy public keys for the user using BIP32, as well as get addresses from these keys and make payments to them. However, only the user can get corresponding private keys.

Hardened derivation

In hierarchical key generation, there is such a concept as hardened derivation. This is an approach that prevents public child keys from being calculated from a corresponding public parent key. The difference is that in normal derivation, the HMAC message used is the concatenation of the serialized point as the public parent key, and in hardened derivation, we use the serialization of the private parent key.

Besides, the calculation of the index is different. In normal derivation, the index is directly serialized to 32 bits, and in hardened derivation, it is somewhat transformed: a constant value of 231 is added to it, thus setting the most significant bit to 1 (it is easier to distinguish between derivation types). Accordingly, the candidate space of possible keys is the same for both normal derivation and hardened derivation and is equal to 231.

Hardened derivation

Thus, given the public parent key and hardened derivation, it is impossible to calculate the public child keys. If an attacker gets the public parent key, he will not be able to figure out the child keys. Therefore, he will not be able to get the addresses and data on their relation to the received parent key. In the case of normal derivation, he can use this function and find relations between the addresses.

Derivation paths

Let’s talk about the paths through which keys can be generated.

Derivation path

At each level of hierarchy, there is a specific index that determines the aspects of key generation. The path from the Master key to the final key can be written in the form of slash-separated indices. If we’re talking about a private key, the line begins with a small “m”, and if it’s about generating a public key, then with a capital “M”. If the index is indicated by an apostrophe, then it should be understood that this refers to hardened derivation. If there is no apostrophe, it is normal derivation.

Let’s consider one of the popular ways of generating keys. It is used in BIP32, where hierarchical keys have been determined.

BIP32

It uses a path where the master key is the zero level of hierarchy. It is followed by the indices of the accounts that denote the same user. After that, there are chains, where there may be chains of addresses that are published externally to accept incoming payments. And those chains to which the user sends payments to himself (most often, change) will be created with index 1. The final index will be used to generate the keys from which the addresses will be calculated.

To calculate the very first key with index 0 according to the BIP32 standard, we will have m, 0 with hardened derivation, chain — 0, index — 0 (m/0’/0/0). This gives us the path for the first hierarchically generated key.

There is a Bitcoin improvement proposal, which is called BIP43. It involves writing the number of the improvement, which suggests a new derivation path, to the first level of hierarchy (m/bip_number ‘/ *).

BIP43

This was how BIP44 appeared. It used the feature of the previous proposal, i.e. index 44 was written for the first level of hierarchy, and proposed the following improvements: to write a specific value, which will correspond to the type of coin used for the wallet, in the index of the second level of hierarchy. Keys for different currencies can now be deployed and used within one wallet.

BIP44

For Bitcoin, the path will look like “m/44’/0’/0’/0/0”, for Bitcoin testnet — “m/44’/1’/0’/0/0”, for Litecoin — “m/44’/2’/0’/0/0”, for Dash — “m/44’/5’/0’/0/0”. Interestingly, Ethereum uses exactly the same elliptic curve for calculating keys and digital signatures, and for its wallet, the path will look like “m/44’/60’/0’/0/0”.

There is one more improvement, which is BIP45. The improvement is aimed at defining the rules for generating keys if they are used in multisignature wallets and forming addresses according to BIP16, i.e. P2SH. It involves the BIP43 proposal and indicates index 45 at the first level of hierarchy, while it requires the indication of the cosigner at the second level of hierarchy.

BIP45

For example, there is a 3-of-5 multisignature rule. Therefore, there are 5 cosigners, but you need at least 3 signatures to spend coins. Thus, each of the cosigners will have an HD wallet with his master seed, and he will indicate his sequence number in his path. The sequence number can be calculated as an index when sorting the keys that each user has generated at the first level of hierarchy. Let’s say each user has generated keys at the first level, they exchange them with each other, sort them, and find out each other’s index for the second level of hierarchy. This is necessary to further eliminate the need to interact in this way, but immediately generate addresses correctly and know your sequence number.

This means you can exchange the extended public key once to form multisignature addresses and accept payments to them independently of other group members.

--

--