Building an Arbitrage Bot on Starknet: Part 1 - The basics

2024-08-08

Introduction

This article is part of a series on building an arbitrage bot between centralised (CEX) and decentralised (DEX) exchanges on Starknet.

In this article I will cover what the bot does, how it works, the libraries used, and the underlying mathematics.

What is an arbitrage bot?

There are many resources describing what an arbitrageur is and what arbitrage bots do. However, I will try to give my own perspective on the subject.

In a market where many different entities offer asset-exchange services, the same asset may trade at different prices. This is because each service operates separately and its users determine the price through their behaviour.

Popular services attract more users, which means more liquidity and a more stable price. On the other hand, services with less liquidity may have a more volatile price. Fewer trades also update the price more slowly. These differences are known as market inefficiencies. An arbitrageur is an entity that tries to make a profit from these inefficiencies by buying the asset at a lower price and selling it at a higher price.

Arbitrageurs do not exist only in the crypto space but are also a fundamental part of traditional finance.

CEX vs DEX

In the blockchain space, we have two main types of exchanges: centralised and decentralised. A centralised exchange (CEX) is basically a trading platform run by one company that lets people buy and sell crypto assets. They usually have an order book where users can enter their orders and the engine will execute them.

A decentralised exchange (DEX) lives on a blockchain in the form of smart contracts, sometimes governed by a DAO. DEXes are a backbone of decentralised finance (DeFi). Automated Market Makers (AMMs) are a type of DEX that uses a mathematical formula to determine the price. Unlike CEXes, AMMs consist of liquidity pools, typically with two or three crypto assets. The best-known designs are Uniswap V2, which follows the formula $$x * y = k$$, and Uniswap V3, which adds concentrated liquidity.
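To make the constant-product formula concrete, here is a minimal sketch of how the output of a swap follows from $$x * y = k$$. The function name, the reserves and the 0.3% fee are illustrative assumptions and do not describe any specific pool.

```python
def constant_product_out(amount_in: int, reserve_in: int, reserve_out: int,
                         fee_bps: int = 30) -> int:
    """Amount received from an x * y = k pool (integer math, fee in basis points)."""
    if amount_in <= 0 or reserve_in <= 0 or reserve_out <= 0:
        raise ValueError("amounts and reserves must be positive")
    # The fee is taken from the input before it reaches the reserves.
    amount_in_after_fee = amount_in * (10_000 - fee_bps)
    # Solve (x + dx) * (y - dy) = x * y for dy, rounding down.
    return (amount_in_after_fee * reserve_out) // (reserve_in * 10_000 + amount_in_after_fee)


# Hypothetical pool holding 1,000 units of token X and 2,000,000 units of token Y.
small = constant_product_out(1, 1_000, 2_000_000)
large = constant_product_out(100, 1_000, 2_000_000)

# The larger trade receives fewer Y per unit of X: this is the price impact.
print(small, large / 100)
```

The same trade size matters later on, when the bot simulates a swap to build a ticker for the DEX.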

DeFi products are not just AMMs. The most famous order book product is probably dYdX.

Risks

An arbitrage bot has two big risks:

  • Token exposure or inventory risk
  • Failed swaps

Token exposure

This risk is not related to the bot but to the tokens themselves. An arbitrage bot is not interested in the value of the tokens it trades, but it still holds them: if any of them goes down in value, the value of the profits goes down too, reducing the profitability of the bot. The only solutions are to avoid very risky pairs or to sell the profits immediately.

Failed swaps

Trades are not atomic, so one of them may be executed at a different price or fail. In addition, centralised exchanges may have hidden orders, or an order may not execute (unless the market order type is selected).

For DEXes it is even more complicated:

  • Bots must constantly monitor the mempool to see if there are trades that change the pool price, but with a private or encrypted mempool this is no longer possible.
  • If there are many trades before ours, ours may fail because the swapped price is lower than minPrice (or the slippage is higher than the one set).
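The slippage check in the last point can be sketched as follows. This is a minimal illustration, assuming the tolerance is expressed in basis points and amounts are raw integers; the function names are hypothetical and not part of any DEX API.

```python
def min_amount_out(expected_out: int, slippage_bps: int) -> int:
    """Smallest output we accept, given a tolerance in basis points (1 bps = 0.01%)."""
    if not 0 <= slippage_bps <= 10_000:
        raise ValueError("slippage_bps must be between 0 and 10000")
    # Integer math, rounded down, because on-chain amounts are integers.
    return expected_out * (10_000 - slippage_bps) // 10_000


def swap_would_fail(actual_out: int, expected_out: int, slippage_bps: int) -> bool:
    """True when the executed amount falls below the accepted minimum."""
    return actual_out < min_amount_out(expected_out, slippage_bps)


# With a 0.5% tolerance on an expected output of 1,000,000 units:
limit = min_amount_out(1_000_000, 50)
```

A tighter tolerance protects the profit but makes the swap more likely to fail when other trades land first.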

The solution is to build an atomic arbitrage bot that takes the opportunity in a single transaction, so that if it fails, none of the swaps are executed. Fortunately, as I wrote in the previous article, Starknet does not have these problems.

Terminology

Before describing how the bot works, let’s look at some definitions. If you are already familiar with them, skip this section.

  • Token: a representation of an asset that has been tokenised, e.g. USDC or WBTC
  • Pair or trading pair: describes two tokens that are traded on an exchange, e.g. BTC and USD
  • Base: the first token that appears in a pair
  • Quote: the second token that appears in a pair. Also known as counter currency
  • Symbol: a unique combination of letters and numbers that represents a pair. It usually consists of the base and quote of the pair, e.g. BTC/USD
  • Spread: a difference or gap between two prices
  • Bid: the highest price a buyer will pay to buy a token
  • Ask: the lowest price a seller will accept to sell a token

How the bot works

The bot consists of several steps:

  • it fetches the tickers from both exchanges
  • it calculates the spread between the two tickers
  • if there is a profit, it tries to capture it by placing a buy/sell market order on the exchanges.

Here is a sequence diagram showing an interaction of the bot:

[Figure: Bot sequence diagram]
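The three steps can also be sketched as a single iteration of a loop. This is only a skeleton under my own assumptions: the three callables are hypothetical stand-ins for the real fetching, spread and order logic described in the following sections.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")  # whatever structure holds the tickers
O = TypeVar("O")  # whatever structure describes an opportunity


def run_once(
    fetch_tickers: Callable[[], T],
    find_opportunity: Callable[[T], Optional[O]],
    send_orders: Callable[[O], None],
) -> bool:
    """Run one iteration of the bot; return True if orders were sent."""
    tickers = fetch_tickers()                # step 1: fetch tickers from both exchanges
    opportunity = find_opportunity(tickers)  # step 2: compute the spread
    if opportunity is None:
        return False                         # no profit: try again on the next update
    send_orders(opportunity)                 # step 3: sell on one exchange, buy on the other
    return True


# Example with stand-in functions and made-up prices.
sent = []
traded = run_once(
    fetch_tickers=lambda: {"cex_bid": 101.0, "dex_ask": 100.0},
    find_opportunity=lambda t: "sell_cex_buy_dex" if t["cex_bid"] > t["dex_ask"] else None,
    send_orders=sent.append,
)
```

Passing the steps in as functions keeps the loop itself trivial and lets each step be tested in isolation.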

Step 1: Getting data

The bot needs to fetch the token prices from the exchanges continuously. Centralised exchanges expose an endpoint called ticker that returns a message like this:

{
  "symbol": "BTCUSDT",
  "priceChange": "-1067.94000000",
  "priceChangePercent": "-1.619",
  "weightedAvgPrice": "66057.25597271",
  "prevClosePrice": "65957.82000000",
  "lastPrice": "64889.88000000",
  "lastQty": "0.00109000",
  "bidPrice": "64889.88000000",
  "bidQty": "3.24493000",
  "askPrice": "64889.89000000",
  "askQty": "0.16658000",
  "openPrice": "65957.82000000",
  "highPrice": "66849.24000000",
  "lowPrice": "64632.00000000",
  "volume": "20494.74928000",
  "quoteVolume": "1353826899.28544630",
  "openTime": 1722370827899,
  "closeTime": 1722457227899,
  "firstId": 3709477216,
  "lastId": 3711014970,
  "count": 1537755
}

We are only interested in the price and quantity of the bid and ask (bidPrice, bidQty, askPrice, askQty from the example).
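As a small illustration, the four fields can be pulled out of the message like this. The `TopOfBook` class and `parse_ticker` function are names I made up for the sketch; only the JSON keys come from the ticker above.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TopOfBook:
    bid_price: Decimal
    bid_qty: Decimal
    ask_price: Decimal
    ask_qty: Decimal


def parse_ticker(raw: str) -> TopOfBook:
    """Extract the four fields the bot cares about from a ticker JSON message."""
    data = json.loads(raw)
    # The exchange sends numbers as strings; Decimal keeps them exact.
    return TopOfBook(
        bid_price=Decimal(data["bidPrice"]),
        bid_qty=Decimal(data["bidQty"]),
        ask_price=Decimal(data["askPrice"]),
        ask_qty=Decimal(data["askQty"]),
    )


raw = (
    '{"symbol": "BTCUSDT", "bidPrice": "64889.88000000", "bidQty": "3.24493000", '
    '"askPrice": "64889.89000000", "askQty": "0.16658000"}'
)
book = parse_ticker(raw)
```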

AMMs work in a different way and they do not produce a ticker, so we need to simulate it. There are two ways we can get the price from a pool:

  • Ask the price directly from the pool (e.g. Uniswap V3 pools) or calculate it from the liquidity (e.g. Uniswap V2 pools).
  • Simulate the trade with a specific amount and then calculate the price from the amounts.

The latter is better because it is more precise. In an AMM the amount you are swapping affects the price. The higher the quantity, the more you unbalance the pool (and the less you get), and the more the price changes. Thus, we can ask the pool to simulate a trade, calculate the price and then simulate the ticker:

  # `pool` stands for whichever pool client is used to simulate the swap
  initial_amount = 1 * 10**18 # amount that we are interested in swapping
  amount_out = pool.swap(initial_amount) # the pool returns the swapped amount
  price = amount_out / initial_amount
  # build a fake ticker message
  ticker = {
    "bidPrice": price,
    "bidAmount": initial_amount,
    "askPrice": price,
    "askAmount": initial_amount,
  }

A fundamental concept to know is that prices are ALWAYS IN BASE. For this reason, in the snippet, bid and ask are the same.
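One detail worth keeping in mind when simulating the ticker: on-chain amounts are integers scaled by each token's decimals, so dividing raw amounts only gives a meaningful price when both tokens use the same scale. Here is a minimal sketch, assuming a swap of 1 ETH (18 decimals) into USDC (6 decimals); the swapped amount is made up and the helper names are mine.

```python
from decimal import Decimal


def to_units(raw_amount: int, decimals: int) -> Decimal:
    """Convert a raw on-chain integer amount into whole-token units."""
    return Decimal(raw_amount) / (Decimal(10) ** decimals)


def simulated_price(amount_in_raw: int, decimals_in: int,
                    amount_out_raw: int, decimals_out: int) -> Decimal:
    """Price of the input token, expressed in output tokens per input token."""
    return to_units(amount_out_raw, decimals_out) / to_units(amount_in_raw, decimals_in)


# 1 ETH in (18 decimals), a hypothetical 2,500 USDC out (6 decimals).
price = simulated_price(10**18, 18, 2_500_000_000, 6)
```

Without the scaling step, the same numbers would give a price of 0.0000000025, which cannot be compared with the CEX ticker.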

Once the tickers are fetched, we need to find the best bid and ask among them. An important thing to understand is that we are trading against these values. This means that we will sell at the best bid and buy at the best ask.

If we only trade on two exchanges, all we need to do is compare the two tickers and pick the best prices. However, with multiple exchanges, the formula is more general. Let $$N$$ be the number of exchanges we use and $$bid_i,\ ask_i$$ with $$i \in \{1, \dots, N\}$$ the bid and ask of exchange $$i$$. The $$bestBid$$ and $$bestAsk$$ are defined as

\[ bestBid = \max(bid_{1...N}) \] \[ bestAsk = \min(ask_{1...N}) \]
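In code, the two formulas reduce to a `max` and a `min` over the exchanges. The sketch below assumes each exchange is represented by a `(bid, ask)` tuple keyed by a name; the exchange names and prices are invented for the example.

```python
from typing import Dict, Tuple

Quote = Tuple[float, float]  # (bid, ask)


def best_bid_ask(quotes: Dict[str, Quote]) -> Tuple[Tuple[str, float], Tuple[str, float]]:
    """Return (exchange, price) for the highest bid and for the lowest ask."""
    bid_exchange = max(quotes, key=lambda name: quotes[name][0])
    ask_exchange = min(quotes, key=lambda name: quotes[name][1])
    return (bid_exchange, quotes[bid_exchange][0]), (ask_exchange, quotes[ask_exchange][1])


quotes = {
    "cex": (64889.88, 64889.89),
    "dex": (64950.00, 64950.00),
    "other": (64900.00, 64905.00),
}
best_bid, best_ask = best_bid_ask(quotes)
```

Keeping the exchange name next to each price tells the bot where to sell (best bid) and where to buy (best ask).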

Why not use the order book?

Compared with the ticker, the order book gives us much more information about how much liquidity there is in the pair. This information can be useful in later steps when the bot has to decide on the size of the orders.

I chose the ticker because it keeps the order-sizing logic simple: given a predefined amount, the bot tries to create an order with that amount. In addition, when a trade is made, the ticker is updated and, if the opportunity is still there, we can continue with our swaps.

Step 2: Calculate the spread

Once the bot has fetched the best bid and ask, it can proceed to calculate the spreads. As I wrote in the terminology section, the spread is the difference between the bid/ask of one exchange and the ask/bid of the other. If $$A$$ and $$B$$ are two different exchanges, the spread is:

\[ spread = Bid_{A} - Ask_{B} \] \[ spread\% = \frac{spread}{Bid_{A}} \]

Two spreads are calculated, one for each direction (sell on $$A$$ and buy on $$B$$, and vice versa). If one of the spreads is $$> 0$$ or greater than a threshold, the bot moves to the next step: sending the orders.
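Here is a minimal sketch of this step, assuming each ticker is a dict with `bidPrice` and `askPrice` and the threshold is a fraction (0.005 means 0.5%). The function names and the prices are illustrative.

```python
from typing import Optional, Tuple


def spread(bid: float, ask: float) -> Tuple[float, float]:
    """Return (absolute spread, spread as a fraction of the bid)."""
    absolute = bid - ask
    return absolute, absolute / bid


def find_opportunity(
    a: dict, b: dict, threshold: float = 0.0
) -> Optional[Tuple[str, str, float]]:
    """Return (sell_on, buy_on, spread fraction) for the better direction, or None."""
    candidates = [
        ("A", "B", spread(a["bidPrice"], b["askPrice"])[1]),  # sell on A, buy on B
        ("B", "A", spread(b["bidPrice"], a["askPrice"])[1]),  # sell on B, buy on A
    ]
    best = max(candidates, key=lambda c: c[2])
    return best if best[2] > threshold else None


exchange_a = {"bidPrice": 100.0, "askPrice": 100.1}
exchange_b = {"bidPrice": 101.0, "askPrice": 101.0}
opportunity = find_opportunity(exchange_a, exchange_b, threshold=0.005)
```

In practice the threshold should at least cover the trading fees of both exchanges, otherwise a positive spread can still lose money.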

Step 3: Sending orders

At this point we know there is an opportunity and a profit to be made. We smell 💵 💵 money 💵 💵. The bot needs to send the two orders to the exchanges.

For the centralised exchange, we will send a MARKET order. For the DEX, we will build a transaction containing the swap. As I mentioned in my previous article, in Starknet transactions are executed immediately so we do not need to wait for the new block to be mined.

But what is the order size? The strategy is very simple and consists of trading a pre-defined amount. However, the trade may not be successful for two reasons:

  • Bid or ask amounts may be lower than the pre-defined amount
  • The account or wallet may hold fewer tokens than needed

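We can solve the problem with the following formula. Let $$k$$ be the amount of base token we want to swap, $$ask$$ the ask price, $$amount_{ask}$$ the maximum amount we can trade at the $$ask$$ price and $$wallet_{ask}$$ the funds available on the exchange where we buy. $$bid$$, $$amount_{bid}$$ and $$wallet_{bid}$$ are defined in the same way for the exchange where we sell. Assuming prices are expressed in quote token per unit of base, $$wallet_{bid}$$ is held in base and $$wallet_{ask}$$ in quote, so the latter is converted to base by dividing by the ask price. Therefore, the maximum value $$A$$ we can trade is:

\[ A = \min\left(k,\ amount_{ask},\ amount_{bid},\ wallet_{bid},\ \frac{wallet_{ask}}{ask}\right) \]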
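The sizing rule can be sketched as below. I am assuming that prices are quoted in quote token per unit of base, that the wallet on the selling exchange holds base and that the wallet on the buying exchange holds quote, so the quote balance is converted to base by dividing by the ask price. Adapt the conversion if your balances are denominated differently. The function name and the balances are made up.

```python
from decimal import Decimal


def order_size(
    k: Decimal,
    ask: Decimal,
    amount_ask: Decimal,
    amount_bid: Decimal,
    wallet_bid: Decimal,
    wallet_ask: Decimal,
) -> Decimal:
    """Largest base amount we can both buy at the ask and sell at the bid."""
    if ask <= 0:
        raise ValueError("ask price must be positive")
    affordable = wallet_ask / ask  # quote balance expressed in base units
    return min(k, amount_ask, amount_bid, wallet_bid, affordable)


ask = Decimal("64889.89")
# Plenty of funds: the size is capped by the liquidity available at the ask.
capped_by_book = order_size(Decimal("0.5"), ask, Decimal("0.16658"),
                            Decimal("3.24493"), Decimal("1"), Decimal("100000"))
# Only 5,000 quote tokens available: the size is capped by the wallet instead.
capped_by_wallet = order_size(Decimal("0.5"), ask, Decimal("0.16658"),
                              Decimal("3.24493"), Decimal("1"), Decimal("5000"))
```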

Once the value to swap has been determined, the bot sends the orders, waits for the response and starts again from the first step.

Project setup

I chose Python to develop the bot. For this type of bot, a faster language like Rust is not needed because the bottleneck is the latency of the exchanges. Because Python is a high-level language, we can develop much faster, and big numbers are not a concern: Python integers have arbitrary precision, and the standard decimal module covers exact decimal arithmetic.

As a dependency manager I use poetry:

$ poetry new stark-arbitrage

Then install the dependencies:

$ poetry add python-dotenv rich ccxt starknet-py

Note that starknet-py requires additional external packages. See Installation for more information. The library is required to connect to Starknet and retrieve account balances, and to sign and broadcast the transaction.

The centralised exchange that we will use is Binance through ccxt. For the decentralised exchange we will use AVNU to get the prices and to swap.

CCXT

CCXT (CryptoCurrency eXchange Trading Library) is a cryptocurrency trading library with support for many exchange markets, including Binance. The library is multi-language (JS, Python, PHP and C#) and provides a unified API across exchanges.

We are interested in the pro version because it offers WebSocket support.
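As a first taste, this is roughly how a ticker stream looks with the pro API. It is a sketch rather than the bot's final code: `watch_ticker` and the `bid`, `ask`, `bidVolume` and `askVolume` keys come from CCXT's unified ticker, while `top_of_book` and `stream_ticker` are helper names I made up. Some exchanges leave these fields empty, so real code should check for `None`.

```python
def top_of_book(ticker: dict) -> dict:
    """Map a CCXT unified ticker to the four values the bot uses."""
    return {
        "bidPrice": ticker["bid"],
        "bidQty": ticker["bidVolume"],
        "askPrice": ticker["ask"],
        "askQty": ticker["askVolume"],
    }


async def stream_ticker(symbol: str = "BTC/USDT", updates: int = 3) -> None:
    """Print the top of the book for a few ticker updates, then disconnect."""
    # Imported here so the helper above can be used without ccxt installed.
    import ccxt.pro as ccxtpro

    exchange = ccxtpro.binance()
    try:
        for _ in range(updates):
            # Resolves every time the exchange pushes a new ticker.
            ticker = await exchange.watch_ticker(symbol)
            print(top_of_book(ticker))
    finally:
        await exchange.close()  # always release the WebSocket connection
```

Running `asyncio.run(stream_ticker())` requires network access; no API key is needed for public market data.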

AVNU

AVNU is a decentralised exchange protocol (a DEX aggregator) designed to provide the best execution. The service has a nice API that returns the price and a simulated price (in this case it provides the best route for our trade) for all supported exchanges.

We will use AVNU to get the pool prices and ask it to create the transaction, which we will then sign and broadcast.

Conclusions

This article ends the long initial explanation of how Starknet and an arbitrage bot work. In the next article I will start to explain how to develop the steps described in the previous sections.

As I mentioned in the first article of the series, there is a group to discuss MEV on Starknet. If you are interested in the topic, want to discuss it or have some interesting information, you can find me there. Feel free to join the group.