Jan 13, 2024

Storing WebAuthn challenge values without a database in Elixir

I’ve been messing around with WebAuthn and Passkeys. It’s been a whole ass journey to implement, considering the technology is still very early in the adoption cycle. There’s little support, almost no libraries, all that.

One of the key requirements for the challenge-response authentication in passkeys is the brief storage of a challenge string. The spec authors explain it better than I can:

As a cryptographic protocol, Web Authentication is dependent upon randomized challenges to avoid replay attacks. Therefore, the values of both PublicKeyCredentialCreationOptions.challenge and PublicKeyCredentialRequestOptions.challenge MUST be randomly generated by Relying Parties in an environment they trust (e.g., on the server-side), and the returned challenge value in the client’s response MUST match what was generated. This SHOULD be done in a fashion that does not rely upon a client’s behavior, e.g., the Relying Party SHOULD store the challenge temporarily until the operation is complete. Tolerating a mismatch will compromise the security of the protocol.

If you’re not familiar, a replay attack is when someone captures a valid authentication request and re-sends it to the server later, potentially gaining access to your account.

Here are the two requirements for challenge strings:

  1. Invalidatable: after use, they should be invalidated (crucial for preventing replay attacks)
  2. Expirable: they should expire after a set time

This is easy for any website using sessions.

  • Generate random challenge string.
  • Store in session variable (e.g. session[:current_challenge])
  • At login time, validate that the user-supplied challenge matches the session variable.
  • Destroy the session variable (invalidate the token).
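
Here’s a rough sketch of those four steps. To keep it runnable without Plug, the session is modelled as a plain map; the module and function names are made up for illustration. In a real Phoenix app you’d reach for the session helpers on the conn instead.

```elixir
defmodule SessionChallengeSketch do
  # Hypothetical stand-in: `session` is a plain map, not a real Plug session.

  # Steps 1-2: generate a random challenge and remember it in the session.
  def issue(session) do
    challenge =
      :crypto.strong_rand_bytes(64)
      |> Base.url_encode64(padding: false)

    {challenge, Map.put(session, :current_challenge, challenge)}
  end

  # Steps 3-4: pop the stored challenge (so it is gone whether or not it
  # matched), then compare it with what the client supplied.
  def verify(session, supplied) do
    {expected, session} = Map.pop(session, :current_challenge)

    result =
      if is_binary(expected) and expected == supplied do
        :ok
      else
        {:error, :invalid_challenge}
      end

    {result, session}
  end
end

{challenge, session} = SessionChallengeSketch.issue(%{})
{:ok, session} = SessionChallengeSketch.verify(session, challenge)

# A replay fails because the challenge was removed on first use.
{{:error, :invalid_challenge}, _session} =
  SessionChallengeSketch.verify(session, challenge)
```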

However, if you’re building an API, this becomes complicated. API-centric applications avoid solutions like sessions to ensure the backend remains stateless.

What do?

Signed tokens?

The first solution that came to mind was using signed tokens via Phoenix.Token. In essence, this can take any piece of arbitrary data, cryptographically sign it and assign an expiry date to it (max_age).

seed = :crypto.strong_rand_bytes(64) 
token = Phoenix.Token.sign(Timeline.Endpoint, "stateless_challenge", seed)
Base.url_encode64(token, padding: false)
# => "3qn945..."

Aside: you might notice I’ve Base64 encoded the challenge. WebAuthn carries challenges as base64url (URL-safe Base64 without padding) in the client data, so I encode it the same way.

At verification time, this is straightforward:

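# The client sends back the Base64url-encoded value, so decode it into the signed token first.
{:ok, token} = Base.url_decode64(challenge, padding: false)
# 3 minutes expiry
max_age = 60 * 3
Phoenix.Token.verify(Timeline.Endpoint, "stateless_challenge", token, max_age: max_age)
# => {:ok, data}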

This is a stateless solution. There is no server-side storage required. Using signed tokens, it’s possible to issue challenges that reduce the possibility of replay attacks by shortening the vulnerable window of time.
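
To make the idea concrete, here’s a stdlib-only sketch of a signed, expiring token. It is not Phoenix.Token’s actual format or key derivation, just the same general idea (HMAC over the data plus a timestamp). The module name, the `now` parameter and the single `:invalid` error are my own assumptions for illustration.

```elixir
defmodule SignedChallengeSketch do
  # Illustrative only: not Phoenix.Token's wire format or key derivation.

  # Sign `data` together with the time of signing.
  def sign(secret, data, now \\ System.system_time(:second)) do
    payload = :erlang.term_to_binary({data, now})
    mac = :crypto.mac(:hmac, :sha256, secret, payload)
    Base.url_encode64(mac <> payload, padding: false)
  end

  # Check the signature first, then the age. No server-side state involved.
  def verify(secret, token, max_age, now \\ System.system_time(:second)) do
    with {:ok, <<mac::binary-size(32), payload::binary>>} <-
           Base.url_decode64(token, padding: false),
         expected = :crypto.mac(:hmac, :sha256, secret, payload),
         true <- secure_equal?(mac, expected),
         {data, signed_at} <- :erlang.binary_to_term(payload, [:safe]),
         true <- now - signed_at <= max_age do
      {:ok, data}
    else
      _ -> {:error, :invalid}
    end
  end

  # Constant-time comparison of two equally sized binaries.
  defp secure_equal?(a, b) when byte_size(a) == byte_size(b) do
    diff =
      Enum.zip(:binary.bin_to_list(a), :binary.bin_to_list(b))
      |> Enum.reduce(0, fn {x, y}, acc -> :erlang.bor(acc, :erlang.bxor(x, y)) end)

    diff == 0
  end

  defp secure_equal?(_a, _b), do: false
end

secret = :crypto.strong_rand_bytes(32)
challenge = :crypto.strong_rand_bytes(64)
token = SignedChallengeSketch.sign(secret, challenge, 1_000)

{:ok, ^challenge} = SignedChallengeSketch.verify(secret, token, 180, 1_030)
# Nothing is stored, so the same token verifies again until it expires.
{:ok, ^challenge} = SignedChallengeSketch.verify(secret, token, 180, 1_060)
{:error, :invalid} = SignedChallengeSketch.verify(secret, token, 180, 1_181)
```

The second successful `verify` is the catch: a token that has already been used still checks out until the clock runs down.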

Note that I said reduce. Assuming the user takes 20-30 seconds to sign in, there’s a window between that time and the remaining three minutes for mayhem to occur. The expiry time can be shortened more, but then it’d degrade user experience: what if the challenge expires while they look around the login page?

Furthermore, it’s impossible to invalidate a challenge that has already been issued until it expires. It’s the same problem JWTs and other forms of stateless authentication face.

Truthfully: odds of this being exploited are close to nil. Any sane risk assessment would make you understand that this isn’t a likely scenario, especially with such a short attack window. But why not do things the right way?

I didn’t want to bring out the database. Bringing up an Ecto schema, Postgres table, multiple queries for login... it all seemed so heavyweight. So much cruft for something that a simple key-value store would fix.

OTP enters the chat

A database solution felt heavyweight. A stateless token solution compromised on security.

I decided to OTP my way out of the problem. I knew I could use Elixir/OTP GenServers to create lightweight key-value stores for random purposes. I’ve done this before.

I present to you the challenge store, written in pure OTP:

defmodule WebAuthn.ChallengeStore do
  use GenServer
  require Logger

  @table_name :webauthn_challenge_store
  # Let tokens expire after 1 hour.
  @max_age_seconds 60 * 60
  # Prune every 12 hours.
  @prune_interval_seconds 60 * 60 * 12

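  @impl true
  def init(arg) do
    :ets.new(@table_name, [
      :set,
      :public,
      :named_table,
      {:read_concurrency, true},
      {:write_concurrency, true}
    ])

    # Kick off the periodic prune; handle_info/2 reschedules it after each run.
    schedule_prune()

    Logger.info("Initialized WebAuthn.ChallengeStore.")

    {:ok, arg}
  end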

  def start_link(arg) do
    GenServer.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def create_challenge() do
    challenge_str =
      :crypto.strong_rand_bytes(64)
      |> Base.url_encode64(padding: false)

    # challenge => timestamp
    put(challenge_str, DateTime.to_unix(DateTime.utc_now()))

    {:ok, challenge_str}
  end

  def validate_challenge(challenge_str) do
    current_timestamp = DateTime.to_unix(DateTime.utc_now())

    case get(challenge_str) do
      {:ok, inserted_at} when current_timestamp - inserted_at < @max_age_seconds ->
        {:ok, challenge_str}

      {:ok, _} ->
        {:error, :expired}

      _ ->
        {:error, :not_found}
    end
  end

  def invalidate_challenge(challenge_str) do
    :ets.delete(@table_name, challenge_str)
    :ok
  end

  @doc """
  Erase all challenges older than @max_age_seconds.
  """
  def prune_old_challenges() do
    time_now = DateTime.to_unix(DateTime.utc_now())
    # We want to delete all challenges older than now - @max_age_seconds.
    time_to_compare = time_now - @max_age_seconds

    records_deleted =
      :ets.select_delete(@table_name, [{{:_, :"$1"}, [{:<, :"$1", time_to_compare}], [true]}])

    Logger.info("Pruned challenges: #{records_deleted}")
  end

  def peek() do
    :ets.tab2list(@table_name)
  end

  def clear() do
    :ets.delete_all_objects(@table_name)
  end

  @impl true
  def handle_info(:prune, state) do
    prune_old_challenges()

    schedule_prune()
    {:noreply, state}
  end

  defp get(key) do
    case :ets.lookup(@table_name, key) do
      [] ->
        {:error, :not_found}

      [{_key, value}] ->
        {:ok, value}
    end
  end

  defp put(key, value) do
    :ets.insert(@table_name, {key, value})
    :ok
  end

  defp schedule_prune() do
    Process.send_after(self(), :prune, @prune_interval_seconds * 1000)
  end
end

It’s elegant. A GenServer owns an ETS table that maps each challenge string to the Unix timestamp of its insertion. With simple timestamp arithmetic, keys expire after a fixed number of seconds. Deleting the key invalidates that challenge string on demand.
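
One thing worth noting: validating and invalidating are two separate calls, so two concurrent requests carrying the same challenge could both pass validation before either one deletes it. A possible way around that is `:ets.take/2`, which returns and removes an entry in one atomic step. The sketch below is a hypothetical variation using its own table and made-up names, not part of the module above.

```elixir
defmodule ConsumeChallengeSketch do
  # Hypothetical table name and max age, chosen only for this sketch.
  @table :consume_challenge_sketch
  @max_age_seconds 60 * 60

  # Create the table once; calling setup/0 again is a no-op.
  def setup do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:set, :public, :named_table])
    end

    :ok
  end

  def put(challenge, inserted_at) do
    :ets.insert(@table, {challenge, inserted_at})
    :ok
  end

  # Look up and delete in one atomic step, then check the age.
  # Only one caller can ever get the entry back for a given challenge.
  def consume(challenge, now \\ System.system_time(:second)) do
    case :ets.take(@table, challenge) do
      [{^challenge, inserted_at}] when now - inserted_at < @max_age_seconds ->
        {:ok, challenge}

      [{^challenge, _inserted_at}] ->
        {:error, :expired}

      [] ->
        {:error, :not_found}
    end
  end
end

:ok = ConsumeChallengeSketch.setup()
:ok = ConsumeChallengeSketch.put("abc", 1_000)
{:ok, "abc"} = ConsumeChallengeSketch.consume("abc", 1_010)
# The second attempt finds nothing: the first call already removed the entry.
{:error, :not_found} = ConsumeChallengeSketch.consume("abc", 1_010)
```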

Plug it into your supervision tree and you’re done.
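
For reference, this is roughly what that looks like. In a real app you’d add `WebAuthn.ChallengeStore` to the children list in your application module; `DemoStore` below is a hypothetical stand-in so the sketch runs on its own.

```elixir
defmodule DemoStore do
  # Hypothetical stand-in for WebAuthn.ChallengeStore.
  use GenServer

  def start_link(arg) do
    GenServer.start_link(__MODULE__, arg, name: __MODULE__)
  end

  @impl true
  def init(arg), do: {:ok, arg}
end

# In a real application this list usually lives in Application.start/2.
children = [
  # `use GenServer` defines child_spec/1, so the bare module name is enough.
  DemoStore
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)
```

Keep in mind that the ETS table belongs to the GenServer process. If that process crashes, the table goes with it, and the supervisor restarts the store with an empty table, so any outstanding challenges are lost.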

{:ok, challenge} = ChallengeStore.create_challenge()
# => {:ok, "4qw8od7yr9q23dfyr..."}
ChallengeStore.validate_challenge(challenge)
# => {:ok, "4qw8od7yr9q23dfyr..."}
ChallengeStore.invalidate_challenge(challenge)
# => :ok

Using OTP primitives, it periodically prunes any old entries and keeps itself clean.
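
The match spec passed to `:ets.select_delete/2` is the least readable part, so here it is in isolation. The module name, table name and timestamps are made up for the example.

```elixir
defmodule PruneSketch do
  # Delete every {key, timestamp} row whose timestamp is older than the cutoff.
  # Returns the number of rows deleted.
  def prune(table, now, max_age_seconds) do
    cutoff = now - max_age_seconds

    match_spec = [
      {
        # Head: match any two-element tuple and bind the timestamp to $1.
        {:_, :"$1"},
        # Guard: only rows where $1 < cutoff.
        [{:<, :"$1", cutoff}],
        # Body: returning true tells select_delete to remove the row.
        [true]
      }
    ]

    :ets.select_delete(table, match_spec)
  end
end

table = :ets.new(:prune_sketch, [:set, :public])
:ets.insert(table, [{"stale", 1_000}, {"fresh", 9_000}])

# With now = 10_000 and a max age of 3_600, the cutoff is 6_400.
1 = PruneSketch.prune(table, 10_000, 3_600)
[{"fresh", 9_000}] = :ets.tab2list(table)
```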

It’s inspectable via IEx:

ChallengeStore.peek
# [ {"SFMyNTY.g2g ...", 1705111407} ]

OTP is awesome

Solutions like this remind me of why I fell in love with Elixir in the first place. I’m unsure what other programming language has the primitives that allow for building elegant solutions like this.

References

This article has been super helpful for learning how to implement passkeys from scratch. Check it out!