.. _synthesis:
Synthesis
=========
Software synthesis offers an alternative way of developing trustworthy
programs. At a high level, program specifications describe **what** the
function should do, and its corresponding implementation describes **how** it
will do it. While verification ensures that both views match, synthesis
consists of translating these specifications (or expectations) to executable
programs which realises them.
As we have seen previously, relatively precise specifications of complex
operations can be written concisely. Synthesis can thus reduce development time
and allows the user to focus on high-level aspects.
Deductive Synthesis Framework
-----------------------------
The synthesizer takes a synthesis problem that it extracts from ``choose`` or
holes (``???``) found in the program. Formally, we define a **synthesis problem** as
.. math::
[[ ~~ \bar{a} ~~ \langle ~~ \Pi \rhd \phi ~~ \rangle ~~ \bar{x} ~~ ]]
which carries the following information:
* a set of input variables :math:`\bar{a}`, initially all variables in scope of the ``choose``,
* a set of output variables :math:`\bar{x}`, corresponding to the values to synthesize,
* a path-condition :math:`\Pi`, constraining :math:`\bar{a}`,
* the specification :math:`\phi` relating input variables to output variables.
The job of the synthesizer is to convert this problem into a solution :math:`[ ~ P ~ | ~ T ~ ]`, which consists of
* a executable program term :math:`T` (an expression that may contain input variables),
* a precondition :math:`P` under which this program is valid
To illustrate, we consider the following program:
.. code-block:: scala
def foo(a: BigInt, b: BigInt): Int = {
if (a > b) {
choose ( (x: BigInt) => x > a)
} else {
0
}
}
From this program, we will extract the following initial synthesis problem:
.. math::
[[ ~~ a, b ~~ \langle ~~ a > b ~ \rhd ~ x > a ~~ \rangle ~~ x ~~ ]]
A possible solution to this problem would be:
.. math::
[ ~ \top ~ | ~ a + 1 ~ ]
To solve this problem, the synthesizer will apply a series of rules which will
try to either
1. Decompose this problem into sub problems, which may be simpler to solve
2. Immediately find a solution
This corresponds to an and-or graph of rule applications, which Leon will
explore.
Decomposing Rules
-----------------
Leon defines several rules that decompose a synthesis problem (a ``choose``)
into sub-problems that may be simpler to solve. Such rules also define a way to
generate a solution for the original problem, when provided with solutions for
all of the sub-problems. These rules thus both decompose the problems and
recompose the solutions. Leon defines several of such decomposing rules:
Equality Split
^^^^^^^^^^^^^^
Given two input variables `a` and `b` of compatible types, this rule
considers the cases where `a = b` and `a != b`. From:
.. code-block:: scala
choose(res => spec(a, b, res))
this rule generates two sub-chooses, and combines them as follows:
.. code-block:: scala
if (a == b) {
choose(res => spec(a, a, res))
} else {
choose(res => spec(a, b, res))
}
Inequality Split
^^^^^^^^^^^^^^^^
Given two input variables `a` and `b` of numeric type, this rule
considers the cases where `a < b`, `a == b`, and `a > b`. From:
.. code-block:: scala
choose(res => spec(a, b, res))
this rule generates three sub-chooses, and combines them as follows:
.. code-block:: scala
if (a < b) {
choose(res => spec(a, b, res))
} else if (a > b) {
choose(res => spec(a, b, res))
} else {
choose(res => spec(a, a, res))
}
ADT Split
^^^^^^^^^
Given a variable `a` typed as an algebraic data type `T`, the rules decomposes
the problem in cases where each case correspond to one subtype of `T`:
.. code-block:: scala
abstract class T
case class A(f1: Int) extends T
case class B(f2: Boolean) extends T
case object C extends T
choose(res => spec(a, res))
this rule generates three sub-chooses, in which the input variable `a` is
substituted by the appropriate case, and combines them as follows:
.. code-block:: scala
a match {
case A(f1) => choose(res => spec(A(f1), res))
case B(f2) => choose(res => spec(B(f2), res))
case C => choose(res => spec(C, res))
}
Int Induction
^^^^^^^^^^^^^
Given an Int (or BigInt) variable `a`, the rules performs induction on `a`:
.. code-block:: scala
choose(res => spec(a, res))
this rule generates three sub-chooses, one for the base case and one for each inductive case (we allow negative numbers):
.. code-block:: scala
def tmp1(a: Int) = {
if (a == 0) {
choose(res => spec(a, res))
} else if (a > 0) {
val r1 = tmp1(a-1)
choose(res => spec(a, res))
} else if (a < 0) {
val r1 = tmp1(a+1)
choose(res => spec(a, res))
}
}
tmp1(a)
This allows Leon to synthesize a well-structured recursive function.
One Point
^^^^^^^^^
This syntactic rule considers equalities of an output variable at the top level of the
specification, and substitutes the variable with the corresponding expression in
the rest of the formula. Given the following specification:
.. math::
res1 = expr \land \phi
and assuming :math:`expr` does not use :math:`a`, we generate the alternative and
arguable simpler specification:
.. math::
\phi[res1 \rightarrow expr]
Assert
^^^^^^
The `Assert` rule scans the specification for predicates that only constraint
input variables and lifts them out of the specification. Since these are
constraints over the input variables, they typically represent the
precondition necessary for the ``choose`` to be feasible.
Given an input variable `a`:
.. code-block:: scala
choose(res => spec(a, res) && pred(a))
will become:
.. code-block:: scala
require(pred(a))
choose(res => spec(a, res))
Case Split
^^^^^^^^^^
This rule considers a top-level disjunction and decomposes it:
.. code-block:: scala
choose(res => spec1(a, res) || spec2(a, res))
thus becomes two sub-chooses
.. code-block:: scala
if (P) {
choose(res => spec1(a, res))
} else {
choose(res => spec2(a, res))
}
Here we note that ``P`` is not known until the first ``choose`` is solved, as it
corresponds to its precondition.
Equivalent Input
^^^^^^^^^^^^^^^^
This rule discovers equivalences in the input variables in order to eliminate
redundancies. We consider two kinds of equivalences:
1) Simple equivalences: the specification contains :math:`a = b` at the top
level.
2) ADT equivalence the specification contains :math:`l.isInstanceOf[Cons]
\land h = l.head \land t = l.tail` which entails :math:`l = Cons(h, t)` and
thus allows us to substitute :math:`l` by :math:`Cons(h, t)`.
Eliminating equivalences prevents explosion of redundant rule instantiations.
For instance, if you have four integer variables where three of them are
equivalent, Leon has 6 ways of applying `Inequality Split`. After
eliminating equivalences, only one application remains possible.
Unused Input
^^^^^^^^^^^^
This rule tracks input variables (variables originally in scope of the
``choose``) that are not constrained by the specification or the
path-condition. These input variables carry no information and are thus
basically useless. The rule consequently eliminates them from the set of input
variables with which rules may be instantiated.
Unconstrained Output
^^^^^^^^^^^^^^^^^^^^
This rule is the dual of ``Unused Input``: it tracks output variable (result
values) that are not constrained. Such variables can be trivially synthesized
by any value or expression of the right type. For instance:
.. code-block:: scala
choose ((x: Int, y: T) => spec(y))
becomes
.. code-block:: scala
(0, choose ((y: T) => spec(y)))
Leon will use the simplest value of the given type, when available. Note this
rule is not able to synthesize variables of generic types, as no literal values
exist for these. While ``null`` may be appropriate in Scala, Leon does not
define it.
..
Unification.DecompTrivialClash,
Disunification.Decomp,
ADTDual,
CaseSplit,
IfSplit,
DetupleOutput,
DetupleInput,
InnerCaseSplit
Closing Rules
-------------
While decomposing rules split problems in sub-problems, Leon also defines rules
that are able to directly solve certain synthesis problems. These rules are
crucial for the synthesis search to terminate efficiently. We define several
closing rules that apply in different scenarios:
Ground
^^^^^^
This rule applies when the synthesis problem has no input variables. If the
specification is satisfiable, its model corresponds to a valid solution. We
rely on SMT solvers to check satisfiability of the formulas. For instance:
.. code-block:: scala
choose ((x: Int, y: Int) => x > 42 && y > x)
can trivially be synthesized by ``(1000, 1001)``.
If the specification turns out to be UNSAT, the synthesis problem is impossible
and we synthesize it as an ``error`` with a ``false`` precondition.
Optimistic Ground
^^^^^^^^^^^^^^^^^
This rule acts like `Ground`, but without the requirement on the absence of input
variables. The reasoning is that even though the problem has input variables,
the solutions might still be a constant expression.
`Optimistic Ground` also tries to satisfy the specification, but it also needs
to validate the resulting model. That is, given a valuation of output
variables, it checks whether it exists a valuation for input variables such that
the specification is violated. The model is discarded if such counter-example
is found. If no counter-example exist, we solve the synthesis problem with the
corresponding values.
The rule tries at most three times to discover a valid value.
CEGIS
^^^^^
`CEGIS` stands for Counter-Example-Guided Inductive Synthesis, it explores the
space of small expressions to find valid solutions. Here we represent the space
of programs by a tree, where branches are determined by free boolean variables.
For instance, a tree for small integer operations could be:
.. code-block:: scala
def res(b, a1, a2) = if (b1) 0
else if (b2) 1
else if (b3) a1
else if (b4) a2
else if (b5) c1(b, a1, a2) + c2(b, a1, a2)
else c1(b, a1, a2) * c2(b, a1, a2)
def c1(b, a1, a2) = if (b7) 0
else if (b8) 1
else if (b9) a1
else a2
def c2(b, a1, a2) = if (b10) 0
else if (b11) 1
else if (b12) a1
else a2
At a high-level, it consists of the following loop:
1. Find one expression and inputs that satisfies the specification:
:math:`\exists \bar{b}, a1, a2. spec(a1, a2, res(\bar{b}, a1, a2))`.
If this fails, we know that the solution is not in the search space.
If this succeeds, we:
2. Validate the expression represented by :math:`M_\bar{b}` for all inputs by
searching for a counter-example: :math:`\exists a1, a2. \lnot spec(a1, a2, res(M_\bar{b}, a1, a2))`.
If such counter-example exists, start over with (1) with this program
excluded. If no counter-example exists we found a valid expression.
The space of expressions our CEGIS rule considers is small expressions of
bounded depth (3), which contain for each type: a few literals, functions and
operations returning that type that do not transitively call the function under
synthesis (to prevent infinite loops), and recursive calls where one argument is
decreasing.
TEGIS
^^^^^
This rule uses the same search space as `CEGIS` but relies only on tests
(specified by the user or generated) to validate expressions. It is thus a
generally faster way of discovering candidate expressions. However, these
expressions are not guaranteed to be valid since they have only been validated
by tests. Leon's synthesis framework supports untrusted solutions which trigger
an end-to-end validation step that relies on verification.
String Conversion
^^^^^^^^^^^^^^^^^
This rule applies to pretty-printing problems given a non-empty list of examples, i.e. of the type:
.. code-block:: scala
choose ((x: String) =>
(input, x) passes {
case InputExample1 => "Output Example1"
case InputExample2 => "Output Example2"
}
)
It will create a set of functions and an expression that will be consistent with the example. The example need to ensure the following properties:
* **Primitive display**: All primitive values (int, boolean, Bigint) present in the InputExample must appear in the OutputExample. The exception being Boolean which can also be rendered differently (e.g. as "yes" and "no")
* **Linearization**: The output example must use the same order as the definition of InputExample; that is, no Set, Map or out of order rendering (e.g. ``case (1, 2) => "2, 1"`` will not work)
To further optimize the search, it is also better to ensure the following property
* **Constant case class display**: By default, if a hierarchy of case classes only contains parameterless variants, such as
.. code-block:: scala
abstract class StackThread
case class T1() extends StackThread
case class T2() extends StackThread
it will first try to render expressions with the following function:
.. code-block:: scala
def StackThreadToString(t: StackThread) = t match {
case T1() => "T1"
case T2() => "T2"
}
CONST + StackThreadToString(t) + CONST
where CONST will be inferred from the examples.