Code Optimization Example

Let's try what we learned in a concrete example. We define a small application that holds a set of NFTs. Each NFT has a type and some underlying data. An NFT can be owned by a single user. We can model this in the following way.

This corresponds to the following Rell code:

entity nft {
  key id: integer;
  index type: nft_type;
  data: byte_array;
  mutable expired: boolean = false;
}

enum nft_type { A,B }

struct nft_a {
  condition: boolean;
  data: integer;
}

struct nft_b {
  data: byte_array;
}

entity nft_owner {
  key nft;
  index owner: user;
}

entity user {
  key name;
}

Each NFT gets a unique ID and a type. Each type corresponds to a structure nft_a or nft_b with different underlying data structures. Each user is required to have a unique name, and the nft_owner that holds the two entities together has a unique constraint on the NFT and is a joint constraint with the NFT and the user so that a user can own several NFTs, but an NFT can only have a single owner.

Let's now say that we want to create a query that gets all NFTs of type A where the condition holds true for a particular user. A very tempting way to write such a query could be:

query get_valid_a(user_name: name): list<(id: integer, data: integer)> {
  val user = user @ { user_name };
  val nfts = nft_owner @* { user }.nft;
  return nfts @* { is_valid_a($) } (.id, data = nft_a.from_bytes($.data).data);
}

function is_valid_a(nft): boolean {
  if (nft.type != nft_type.A) return false;
  if (nft.expired) return false;
  val data = nft_a.from_bytes(nft.data);
  return data.condition;
}

The query fetches a user and then all its NFTs. For each NFT, we check if it is valid and then return the results as a list of named tuples linking the IDs with the data. The conditions are checked in a separate function to reduce bloat.

To investigate the database footprint of this query, we define a few operations to help us:

operation create_user(name) {
  create user(name);
}

operation create_nft(id: integer, type: nft_type, data: byte_array, owner_name: name) {
  val owner = user @ { owner_name };
  when (type) {
    nft_type.A -> nft_a.from_bytes(data);
    nft_type.B -> nft_b.from_bytes(data);
  }
  val nft = create nft(id, type, data);
  create nft_owner(nft, owner);
}

We then create a test case where we create a user and 10 NFTs before doing the query:

function test_nft() {
    print("Creating user");
    rell.test.tx().op(create_user("Alice")).run();
    print("Creating nfts");
    for (id in range(10)) {
        rell.test.tx().op(create_nft(id, nft_type.A, nft_a(true, id*10).to_bytes() ,"Alice")).run();
    }
    print("Perform query");
    get_valid_a("Alice");
}

We can now run the test with SQL logging on:

$ chr test --sql-log --tests test_nft

Looking at the output after the line Perform query in the output, we see:

Perform query
SELECT A00."rowid" FROM "c0.user" A00 WHERE A00."name" = ? ORDER BY A00."rowid"
SELECT A02."rowid" FROM "c0.nft_owner" A00 INNER JOIN "c0.user" A01 ON A00."owner" = A01."rowid" INNER JOIN "c0.nft" A02 ON A00."nft" = A02."rowid" WHERE A01."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."type" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"
SELECT A00."expired" FROM "c0.nft" A00 WHERE A00."rowid" = ? ORDER BY A00."rowid"

This query made 22 database roundtrips. It first makes a query to get the rowid of the user followed by a really long query which joins the three tables. After this, two lines gets repeated over and over again. This means that this query will scale proportionally in number of database requests with the number of nfts for a user.

If we instead join all tables to start with, and select all fields from the nft directly, we could optimize this query to make a single database query. Here is an updated version of the query:

query get_valid_a(user_name: name): list<(id: integer, data: integer)> {
  val nfts = (user, nft_owner) @* { // <-- 1.1 Join the tables directly
    user.name == user_name,         // <-- 2. Criteria sorting, most distinct condition first
    user == nft_owner.owner         // <-- 1.2 Join cont.
  } (
    id = nft_owner.nft.id, 
    nft = nft_owner.nft.to_struct() // <-- 3. select all fields from the nft (be more specific if needed)
    ); 
  return nfts @* { is_valid_a(.nft) } (.id, data = nft_a.from_bytes(.nft.data).data);
}

function is_valid_a(nft: struct<nft>): boolean { // <-- 4. Takes a struct as argument
  if (nft.type != nft_type.A) return false;
  if (nft.expired) return false;
  val data = nft_a.from_bytes(nft.data);
  return data.condition;
}

Let's see what we did:

We joined the tables user and nft_owner because the user will return, at most, a single value; this is fine.
We put the most specific criteria first.
We used the to_struct() to select all attributes directly.
We ensured that our function takes an in-memory version of the NFT so that we do not make separate selects in the function.

Rerunning the test produces the following result:

Perform query
SELECT A03."id", A03."id", A03."type", A03."data", A03."expired" 
    FROM "c0.user" A00, "c0.nft_owner" A01 
    INNER JOIN "c0.user" A02 ON A01."owner" = A02."rowid" 
    INNER JOIN "c0.nft" A03 ON A01."nft" = A03."rowid" 
    WHERE (A00."name" = ?) AND (A00."rowid" = A02."rowid") 
    ORDER BY A00."rowid", A01."rowid"

We successfully improved our query to make a single database roundtrip, regardless of how many NFTs are in the table.