Build Your Own: TOTP-based MFA
“Please input the six-digit code from your authenticator app” - we are seeing these lines at an increasing frequency day-by-day, and for a good reason. Let’s understand what goes behind the scenes and how these two seemingly disconnected parties come to agree on the same six-digit code every time. We’ll take a look at the Time-based one-time password (TOTP) standard that is commonly used by our MFA (Mutli-Factor Authentication) authenticator apps to generate ephemeral codes or one-time passwords (and no, you don’t need to be online for the MFA to work on your authenticator).
If you provide the right code, the service you are trying to sign into lets you in. How does the service validate your code though? Especially when it doesn’t know what authenticator service you are using? The core idea behind TOTP-based MFA is that both the parties calculate the same number based on a standard set of parameters. To ensure that only the two parties in question can do this, a secret is introduced (which ideally is known only by the two parties) in this calculation. The Internet Engineering Task Force (IETF) has adopted TOTP as a standard and it has been described in RFC 6238.
The Algorithm
The TOTP algorithm is very straightforward. It involves the use of HMACs (hash-based message authentication code) to generate an bytestream, a specific part of which is extracted and converted to an integer and presented as the output. HMAC requires two input parameters to generate a bytestream - a ‘key’ and a ‘message’ on which the HMAC is calculated. The first parameter of the HMAC input, the ‘key’ is the secret we talked about earlier. The service requiring verification generates a random string, and encodes it in base32. This is communicated to the client either by the user typing the key in, or by the use of a QR code (format: otpauth://<TYPE>/<ISSUER?>:<ACCOUNT>?secret=<SECRET>
- see this for more information on the format).
TOTP builds on top of HOTP (HMAC-based OTP - RFC 4226), in which a counter was used as the ‘message’ to generate new codes, leading to the problem of the two parties being out-of-sync after long periods of time. To prevent this, TOTP uses the current UNIX timestamp (the number of seconds elapsed since a given point in time, usually Jan 1, 1970, 00:00:00) as the ‘message’. To provide a reasonable time window for operation, a new code is generated only every 30 seconds.
So, the secret is communicated to the user’s authenticator app and the timestamp is calculated as:
$$ C_T = \bigg ⌊ \frac {T-T_0} {T_x} \bigg ⌋ $$
where CT is the ‘message’ timestamp, which is calculated as the floor integer from the current timestamp (T), relative to an epoch (T0, usually 0), taken in Tx intervals (usually 30).
The TOTP calculation then involves calculating the HOTP of the two, using the algorithm notified by the verifier in the previous exchange along with the secret (SHA1 by default).
$$ TOTP\text{\textunderscore}interim = HOTP(secret, C_T) $$ $$ TOTP = truncated(TOTP\text{\textunderscore}interim) $$
This gives us a long bytestream to work with (20 bytes for SHA1). How do we then get the six-digit code from this? This ’truncation’ is done following the HOTP truncation algorithm, which goes as follows:
- First get the last 4 bits from the HMAC bytestream. This would give us a number between 0 and 15.
- Using this number as an ‘offset’ index, we extract 4 bytes from the HMAC bytestream -
TOTP_interim[offset:offset+4]
. We ignore the most significant bit (to avoid signed/unsigned issues). This would give us an 8-digit number, which can be then truncated to 6 or 7 digits if required (by considering only the last 6 or 7 digits of the number)
Thus, we now have a number that can be generated by both parties independently, and one that refreshes itself after 30 seconds. You go ahead and put this number in the webpage, and you are in!
TOTP in practice
Dealing with network and human latencies
TOTPs are ephemeral, with a default refresh period of 30 seconds. But imagine you reading the TOTP at t=23s, typing it out at t=25s, hitting enter at t=27s and then it reaching the server at t=31s, at which point, since the refresh period has been exceeded, the server will now calculate a different TOTP, and it doesn’t match. Worse, you might input the TOTP quite early in the cycle, but your network isn’t fast enough to ensure that it reaches the server within the refresh cycle! Neither of these are valid reasons to deny you access!
To prevent such problems, the validating servers usually validate your TOTP in a rolling window rather than just the current timestamp. Usually, the server will use a maximum of two past time cycles to generate TOTPs, and if your TOTP is one of the three provided, you attempt won’t be failed. It goes without saying, once a code is used, the server doesn’t allow reuse of the same TOTP.
Security considerations
The biggest security consideration with TOTP is securing the secret used, since all other parameters are either standardised or known by everyone. Therefore, it is necessary that the secrets are only known by the two parties involved. It is recommended that all communications happen over secure channels (TLS/IPSec), keys are generated with sufficient pseudo-randomness. The keys should be encrypted for as long as possible, with storage in a secure form with access restricted only to the programs that require it, and exposing them only when required - and for as short a time as possible. Keys are often decrypted for code generation/validation and immediately re-encrypted to limit exposure in memory snapshots.
These security considerations clearly explain the architecture of the authenticator programs we use - until recently, none of them had the ability to use multiple devices to generate the same code, i.e., by sharing of the secrets across devices, since they were always stored locally. This also meant that losing your phone/second factor would simply lock you out of your account if you didn’t have backup codes to login and change devices (i.e., register a new factor using a new secret). On the other hand, since there is no communication involved between the authenticator program and the validator service, there is no need for your second factor to be online and connected to the internet to generate your TOTPs. This has enabled the growth of dedicated hardware ’tokens’ (e.g., FIDO keys) that are used as the second factor to generate your TOTPs, reducing the risk of phone compromise.
This is how easy it is to understand how your authenticator apps work. I put together a quick implementation of the TOTP algorithm here. Though this is a simple script, it is in no manner the most secure way of generating TOTPs. An application that utilizes this application will have to take into consideration the security aspects of secrets storage and communication. I might take a stab at building a proof-of-concept later.