In this new series of blog posts, we will be introducing Coq and Mi-Cho-Coq for proving Tezos smart contracts written in or compiled to Michelson. This series is structured as follows:
Now that we’ve proven a few trivial contracts, let’s prove something a little bit more complicated, a contract containing an if statement. More precisely, we’ll be proving a contract that fails if it is not called by the contract owner, defined by the storage. Moreover, we won’t be crafting the contract in Michelson like we have been until now, but we’ll be using Ligo, a high level language that compiles to Michelson.
Let’s implement our simple if contract. We’ll define some storage type as a Ligo record that has an owner key of type address. Let’s also say it’s part of a larger contract that also keeps track balances and some supply. We won’t be using them, but it’ll make contract slightly more difficult to model and prove in Michocoq. Without further ado, here is the Ligo code we’ll be proving:
Very simple contract that does nothing but throw if not called by the owner. Let’s compile it to Michelson, we can use the Nyx docker container to have a Ligo compiler ready in seconds:
We can now hop into the container and compile our file:
Which outputs the following Michelson contract where we’ve shown what the stack looks like after each instructions in the comments:
As you can see, the output is significantly harder than that of the previous contracts we’ve proved. Moreover, we’re introduced to new Michelson instructions which might no speak for themselves:
-1 if the stack top element is less than the second stack element
0 if the stack top element is equal than the second stack element;
+1 if the stack top element is greater than the second stack element;
As always, you can have a more formal definition of these instructions by looking into the Michelson specification. Let’s quickly have a look at this Michelson so we can understand what the Ligo compiler did. Essentially, it pushes the owner from the storage as follows: DUP ; CDR ; DUP ; CAR ; CDR ;. The first DUP saves the parameter / storage pair on the stack, the CDR replaces our pair’s copy with its second element, the storage. We the call DUP again to duplicate the storage on the stack, which we can then retrieve the owner from with CAR ; CDR (CAR will retrieve the balance / owner pair from the storage, and CDR the owner from the isolated pair). Once the owner has been pushed on the stack, we push the sender’s address with SENDER, and compare both addresses. If the condition fails, we fail with an error, otherwise we continue executing the contract. The Ligo compiler then retrieves the the storage with DIG 1 ; DUP ; DUG 2, creates the operation list / new storage pair, and cleans the extra stack elements by dropping all unneeded elements. Hence, as you can see, the Ligo compiler uses DIG and DUG quite a bit to store and retrieve Ligo variables on the Michelson stack.
Let’s now see how we might prove this contract.
As we’ve been doing in the previous chapter, in order to prove our contract, we’ll need to set it up in Coq. We’ll first need to define our parameter, storage, and convert the Michelson instructions in Coq, before we can define the contract specification formally and prove that it implements the specification.
Let’s first import some modules we’ll be needing:
As you can see, we’ve imported a new module: comparable.v, which we’ll be needing since our contract compares addresses to know if the sender is the contract owner. In this module, we can find both definitions for comparable types, along with some Lemmas provided to us that’ll make our life simpler when proving more complex complex contracts.
Let’s now define our contract parameter and storage. Since they are quite complex, we define some simpler types first that we reuse to define parameter_ty and storage_ty.
We can now define some modules we’ll be needing:
Let’s now port the contract instructions to Michocoq’s syntax. Since we are now using instructions that operates on elements not directly on the top of the stack (DIG, DUG, DIP an DROP), we’ll need to use a syntax slightly more complex than raw Michelson. Here’s how the Michelson translates to Coq, we’ll analyze it right bellow:
Up until the IF statement, everything translates directly to Coq. But notice that DIG instruction right bellow the IF, it has this funny syntax: DIG (S1 := [_]) 1 eq_refl. Let’s have a look at Michocoq’s syntax.v file to get an idea of what’s going on. As can be read in the Instructions section of Michocoq’s Readme, “We use a dependent inductive type to rule out ill-typed instructions”. The inductive definition can be found here, where the A and B indexes are the instruction input and output stack types. We can find in the inductive definition of instruction all the Michelson instructions implemented by Michocoq. Many of them simply take implicit arguments, such as DUP or CAR, which is why we’ve been able to use them as such. PUSH however, also takes explicit arguments, namely the type of the data to push on the stack, and the concrete_data to push. Which is exactly how we used it the the previous chapter: PUSH nat (nat_constant 1).
Let’s now look at the DIG :
DIG takes one explicit parameter, n, and a few explicit ones, and the input and output stacks respectively have types (S1 +++ (t ::: S2)) and (t ::: S1 +++ S2). Hence Michocoq types the stack as a list of instructions with a subset of the stack, S1, above the element t to bring to the top, and a subset of the stack S2 bellow the element. Notice also that DIG returns an implication that guarantees that the length of S1 in indeed equal to n. Let’s look into it a bit more:
We’d like now to “recover” the instruction. Since we can view the resulting implication as a function taking as single argument the evidence that length ?S1 = 1, we need to pass it eq_refl which Coq gives us as the only evidence available for an equality. When eq_ref takes an argument x, it returns the evidence that x = x. Since we want to show that length ?S1 = 1, let’s try the following:
It is not working because Coq says that 1 = 1 is not an evidence for length ?S1 = 1, which is normal it does not know S1’s length. What we can do however, is explicitly pass the S1 implicit argument as follows:
Now we have to give evidence that length [?a] = 1, which eq_refl 1 is since Coq can now work out the length of S1:
As it turns out, the eq_refl can also be inferred, leading to what is the simplest syntax for Michelson’s DIG 1n: Compute (DIG (S1 := [_]) 1 eq_refl). The instructions DUG and DROP work similarly.
If you are getting started or felling a little bit rusty in Coq, having a look at Software Foundation Volume I’s chapter on inductively defined propositions to get a better understanding of why Michocoq’s syntax.v is defined as such. Moreover, we’ve passed around evidences of propositions in implications around as if the implications were functions, if you found that a bit funky, make sure to have a look at University of Pennsylvania’s Working with Explicit Evidence in Coq and Software Foundations Volume 1’s chapter on the Curry-Howard correspondence.
Now that our Michelson has been ported to Coq, we can start proving that it works as expected!
Finally, we can start proving our contract. As we’ve been doing, the first step is to define the contract specification as we’ll be proving that the implementation meets the specified requirements. The specification is very simple:
For now however, we’ll focus on the first point only. The specification looks as follows:
This specification speaks for itself, but notice the let ... in clause after the forall statement. This is slightly different from what we’ve been doing so far which was to include a storage_in in the forall statement. However, since our storage consists of nested pairs, expressing the fact that we require the owner to be equal to the sender’s address would be verbose and potentially more complex to prove. Hence, we do not introduce storage_in in the universal quantifier, but rather, the components of the input storage (balances, owner and supply) which we use to construct storage_in with the let ... in clause. This gives us the owner variable available to compare with the sender’s address, (sender env).
Now that the contract specification has been defined, we can we can start proving it. Looking at the contract implementation in Michelson, we can see that we have an if condition we’ll need to prove both branches of, both cases being:
Hence, after introducing the universal quantified variables, storage_in defined in the let ... in clause, the fuel hypothesis and doing some re-writting with the Michocoq helpers, we’ll step through the instructions with more_fuel ; simpl until we reach the condition to destruct:
We now want to perform a case analysis on the if ... then ... else clause which we can do as follows: destruct (negb (comparison_to_int (address_compare (sender env) owner) =? 0)%Z). This will introduce a new hypothesis that we’ll name condition by destructing with destruct ... eqn:condition. In the first subgoal, the condition hypothesis will be negb (comparison_to_int (address_compare (sender env) owner) =? 0)%Z = false, whereas in the second subgoal it will be negb (comparison_to_int (address_compare (sender env) owner) =? 0)%Z = true. In the first subgoal, we need to step into the condition body and through the FAIL instruction, we do that using do 2 more_fuel ; simpl. So far, our proof looks as follows:
At this point, we have to prove the if and only if statement, that we can split into two more subgoals:
The first subgoal is trivial, we introduce the hypothesis False, which we can then solve with the contradiction tactic. The second subgoal will require some more work:
At this point, we now that we’ve succeeded since negb (comparison_to_int (address_compare (sender env) owner) =? 0)%Z = true implies that sender env <> owner, but we need to show that! Let’s introduce a Lemma that we’ll admit for now:
The subgoal proof is now trivial, we apply our newly defined Lemma to the condition hypothesis, and use the intuition tactic to solve the subgoal:
Let’s now handle the second subgoal introduced by destructing the condition, which now becomes condition : negb (comparison_to_int (address_compare (sender env) owner) =? 0)%Z = false. Since the condition branch now analyzed does not throw, we can step until the end of the contract with do 6 more_fuel ; simpl.:
Again, since we have an if and only if proposition to prove, we can perform a case analysis on both directions of the implication with the split tactic. We can define a Lemma, that we’ll admit for now, to rewrite our condition hypothesis as follows:
After apply the above Lemma to our first subgoal we can use the intuition tactic to solve it. The second and final subgoal is event simpler to solve by using only the intuition tactic. The final proof is as follows:
Let’s now prove both Lemmas we have introduced. Note that if we cannot prove them, we’ll have to iterate on the above proof since that will mean that it relies on some wrong assumptions. We’ll start by proving the sender_is_owner Lemma as we can make use of other Lemmas provided by the Coq standard library and Michocoq’s util.v. As often, we start by introducing the universally quantified variables:
We’d like now to rewrite negb x = false with x = true, we can make use of the following Lemma provided by Coq:
This leads to the following goal:
We’d now like to rewrite (comparison_to_int (address_compare owner sender) =? 0)%Z = true with owner = sender. Michocoq provides us with the following Lemma in util.v we can make use of:
As can be seen, this Lemma takes a comparable_type parameter which in our case is an address. Proving sender_is_owner is then trivial:
Onto proving sender_is_not_owner! This Lemma is slightly more complex as we’ll need to further define some of our own Lemmas. Our proof starts as follows:
Notice that we now need to rewrite (comparison_to_int (address_compare owner sender) =? 0)%Z = false into owner <> sender, which we cannot use the eqb_eq Lemma for anymore. But we can make use of Coq’s Z.eqb_eq Lemma which simply states that if we ask if n and m are equal and the answer is false, then n and m are not equal:
Our goal now becomes:
We then unfold the comparison_to_int in our goal, setup some new hypotheses with intuition, and destruct the new hypothesis leading to:
We can now rewrite H0 in the goal leading to the following goal:
Let’s now define and admit (for now) the following Lemma, which states that comparing the same addresses leads to Eq which will allow us to rewrite our goal:
The sender_is_not_owner Lemma proof is now concluded as follows:
We still have a final Lemma to prove, namely address_compare_Eq. We can perform an induction on the address and simplify what we get, leading to the following goal:
Michocoq provides us with a string_compare_Eq_correct Lemma, available in comparable.v, that we can use to rewrite our goal:
Applying reflexivity afterwards then proves our final Lemma:
Thankfully, we’ve been able to prove all the Lemmas that we have introduced to prove our contract. Note that we’ve made available the final proof file.
In this rather long chapter, we’ve proved our first non-trivial contract. We’ve learned quite a bit on the way, in particular, we’ve:
From now on, you should be comfortable enough navigating the Michocoq sources in order to port Michelson code to Michocoq and find relevant Lemmas to help you out in your proofs. In the next chapter, we’ll look into how to work with Michelson’s maps, which are a very frequently used data structure when developing smart contracts.