Demo: generate a program for reversing an array.
generate
stop
reset
Input/output examples
Generated program
Evaluation result

Nov 2020  

Inductive program synthesis in Javascript

Demo: iterative program generation
step
reset

Any compiler or transpiler is doing some form of program synthesis. But “proper” program synthesis usually means generation of high level programs from scratch, given only high level problem descriptions.

This post is about one of the most simple and straightforward methods of automatic program synthesis: inductive synthesis based on input-output examples specification and brute-force enumerative search.

Synthesis task specification

Let's start with a simple example. Suppose that we want to automatically generate a program that reverses an array.

How to formulate this task for the automatic program generator? One option is to just describe it in natural language. But it would be quite hard to automatically parse and understand. A simpler way is to provide examples of inputs and expected outputs:

[1, 2, 3, 4, 5, 6] → [6, 5, 4, 3, 2, 1]

[5, 2, 7, 6] → [6, 7, 2, 5]

It's similar to a programmer working on a task using test-driven development, with a set of self-explanatory tescases.

Brute-force search

How to find a program that satisfies some input-output examples? There are several possible search strategies, but the simplest one is just with a brute force enumeration.

The idea is to generate all programs (within some size liimts and a limited language subset), and pick the first one that works correctly for all specified examples.

This is probably the least efficient approach of all, but with some additional optimizations and heuristics for reducing the search space it can actually be practical.

For example, an algorithm that won last year's synthesis competition SyGuS-Comp 2019 in programming-by-examples track, is based on brute-force enumerative search strategy. As well as some previous years winners (EUSOLVER, Enumerative CEGIS Solver). There are even some practical applications, for example brute-force search can be used for synthesising distributed systems protocols.

Generating Javascript

For generating and manipulating programs, code has to be represented as a data structure. This is not a problem in homoiconic languages like Lisp, but in Javascript it’s harder.

One simple idea (used, for example in miniMAL project https://github.com/kanaka/miniMAL), is to use JSON for representing programs. In this case, JSON arrays are serving as an analog of Lisp s-expressions.

Similar to an s-expression, each JavaScript expression is represented as a JSON array. First element is a function name, and the rest of elements are for function arguments.

For example:

Javascript JSON
Math.pow(x, 2) ["Math.pow", "x", 2]
Math.pow(Math.sin(x), 2) [“Math.pow”, [“Math.sin”, “x”], 2]

For operators, can use prefix notation:

x = 2 + 2 ["=", x, ["+", 2, 2]]

There are some special expressions formats: control flow, variables declaration.

Javascript JSON
let x = 1 ["let", "x", 1]
for (let i = 0; i < 10; i += 1) { console.log(i) } ["for”, ”i”, ”0”, ”10”, ”1”,["block", [“console.log”, “i”]]]
if (i > 1) { i = 1; } ["if", [“>“, “i”, 1], ["block", [“=“, “i”, 1]]]

Also need syntax conventions for calling methods, accessing properties, and indexing arrays:

let a = new Array(0, 1, 2) ["let", "a", ["new Array", 0, 1, 2]]
a.shift() [".shift", "a"]
let n1 = a.length ["let", "n1", [".length_", "a"]]
let n2 = a[1] ["let", "n2", ["get_index", a, 1]
a[1] = 2 ["set_index", a, 0, 0]

This allows to represent a wide range of possible JavaScript expressions, and covers all language features needed for a simple demo.

Code for transforming from JSON representation to Javascript is fairly simple. It just has to traverse arrays, print out expressions elements in correct order, and deal with code formatting.

Simple DSL

For brute-force search to be feasible, search space has to be limited to some restricted domain-specific language subset, which depends on the kind of the target problem.

For example, can design a simple DSL targeted for simple array processing. It should have functions for array access and manipulation, simple arithmetics, and control flow functions.

function Javascript JSON
subtract number = number?? - number?? ["-", "number??", "number??", "number"]
add number = number?? + number?? ["+", "number??", "number??", "number"]
greater than boolean = number?? > number?? [">", "number??", "number??", "boolean"]
less than boolean = number?? < number?? ["<", "number??", "number??", "boolean"]
equal boolean = number?? == number?? ["==", "number??", "number??", "boolean"]
array length number = array??.length [".length_", "array??", "number"]
get array element at index number = array??[number??] ["get_index", "array??", "number??", "number"]
set array element at index array??[number??] = number ["set_index", "output", "number??", "number??", ""]
for loop for (let i in array??) { ?? } ["for", "i", ["block", "??"], "in", "array??", ""]
If condition if (boolean??) { ?? } ["if", "boolean??", ["block", "??"], ""],

In the DSL description, "??" stands for an undefined value, that can be replaced with an expression or a variable. This notation is taken from SKETCH language, where "??" represents a hole, a placeholder that the synthesizer can replace with an integer constant.

Some undefined values have annotated types. For example "number??" means that the argument should be replaced with an expression that has "number" return value. This allows for additional limitations on the search space.

Algorithm for program generation

How to generate all programs that can be represented with the DSL? The algorithm is fairly simple: start with some undefined values, and iteratively replace them with all possible options picked from the list of DSL expressions. Selected and pasted expressions would contain some undefined values of their own, so the process is repeated until all the options are used, or until expressions tree depth reaches some limit.

It’s like iterative template rewriting, and it’s easier to demonstrate with an example:

1. Start with an empty/undefined program ["??", "??", "??", "??"]
2. Traverse the expressions tree, and pick the first occurring undefined value ["??", "??", "??", "??"]
3. List all options for replacing this value (expressions that have an empty return type) ["set_index", "output", "number??", "number??", ""]
["if", "boolean??", ["block", "??"], ""]
["if", "boolean??", ["block", "??"], ""]
4. Repace undefined value with the first option [["set_index", "output", "number??", "number??", ""], "??", "??" ,"??"]

Steps 2-4 are repeated iteratively. Expressions tree is traversed depth-first, and each found leaf corresponds to a unique program. Press “next” button in the demo below, to see next iterations.

Demo: iterative program generation
step
reset

Generated programs are verified with the given input-output examples:

  1. Translate from JSON represenation to Javascript
  2. Fill pre-defined input array variable with an example
  3. Evaluate the program with eval()
  4. Check if contents of the pre-defined output variable correspond to the output example.

The demo below shows program generation for "array reversal" problem. Press “generate” button to see continuous generation, or use “next” to see generation steps. It generates and checks around a million of short programs per minute, and is able so solve the problem in ~10-20s. I tweaked the DSL to have a bare minimum of available expressions, for search to be faster. Size of the program is also limited by number of used varialbes (limited to 4). It would still work with larger more generic DSL, just would take more time.

Demo: generate a program for reversing an array.
step
generate
stop
reset
Input/output examples
Generated program
Evaluation result
DSL specification

Some more examples are show below. They also have custom DSLs tuned for each problem, so that it does not take too much time (~10-20s).

Example: generate a program that finds sum of array elements

Demo: generate a program that finds sum of array elements
step
generate
stop
reset
Input/output examples
Generated program
Evaluation result
DSL specification

Example: generate a program that finds max element in an array

Demo: generate a program that finds max element in an array
step
generate
stop
reset
Input/output examples
Generated program
Evaluation result
DSL specification

References & links

(book) Program Synthesis
Sumit Gulwani, Oleksandr Polozov, Rishabh Singh

(course materials) Introduction to Program Synthesis
http://people.csail.mit.edu/asolar/SynthesisCourse/Lecture2.htm

Papers

Dimensions in Program Synthesis
Sumit Gulwani

Inductive Programming (A Survey of Program Synthesis Techniques)
Emanuel Kitzelmann


Scaling Enumerative Program Synthesis via Divide and Conquer
Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa

CVC4SY for SyGuS-COMP 2019
Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, Cesare Tinelli


The Sketching Approach to Program Synthesis
Armando Solar-Lezama

Blog posts

Software that writes software: on program synthesis
Adithya Ganesh

A Program Synthesis Primer
Aws Albarghouthi

Program Synthesis in 2017-18
Alex Polozov

Program Synthesis is Possible
Adrian Sampson

Homoiconicity, Lisp, and Program Synthesis
Rajesh Jayaprakash

Program Synthesis in 2019
Program Synthesis Explained
Building a Program Synthesizer
Building Your First Program Synthesizer
James Bornholt