# The Varphi Language Reference

Varphi is a domain-specific language (DSL) designed for defining, simulating, and studying Turing Machines. Unlike traditional formal definitions found in textbooks, Varphi employs a concise yet fully expressive syntax that scales from simple single-tape machines to complex multi-tape algorithms.

This document serves as the formal specification for Varphi programs.

***

### 1. Program Structure

A Varphi program describes the state transition diagram of a Turing Machine. The file structure is flat and line-oriented.

* **Transitions:** The program is a sequence of **transition lines**. Each line represents a directed edge in the Turing Machine's state graph.
* **Entry Point:** The machine represented by a Varphi program initializes execution at the **current state of the first transition line in the file**.
* **Newlines:** Newlines act as the statement terminator.
* **Whitespace:** All non-newline whitespace is completely ignored. You may indent code as you please.

#### Comments

Varphi supports C-style comments for documentation.

```
// This is a single-line comment.

/* This is a 
   multi-line comment.
*/
```

***

### 2. Transition Lines

Every valid transition line (i.e., line of code) in Varphi must contain exactly **five components** in a specific order. This strict structure ensures readability and parsing consistency.

#### Syntax

```
<Current State> <Read Tuple> <Next State> <Write Tuple> <Shift Tuple>
```

{% hint style="info" %}
The above transition line template (although not syntactically valid) is read as follows:

> If the machine's current state is `<Current State>`, and the heads of the machine read `<Read Tuple>`, then switch the machine's state to `<Next State>`, make the heads of the machine write `<Write Tuple>`, and shift the heads of the machine in the directions given in `<Shift Tuple>`.&#x20;
> {% endhint %}

#### Component Breakdown

| Component         | Description                                            | Example                   |
| ----------------- | ------------------------------------------------------ | ------------------------- |
| **Current State** | The state the machine must be in to trigger this line. | `q0`, `start`             |
| **Read Tuple**    | The pattern of symbols to match on the tape(s).        | `(0)`, `($x, BLANK)`      |
| **Next State**    | The state the machine transitions to.                  | `q1`, `halt`              |
| **Write Tuple**   | The symbols to overwrite on the tape(s).               | `(1)`, `($x, 1)`          |
| **Shift Tuple**   | The direction to move the tape head(s).                | `(RIGHT)`, `(STAY, LEFT)` |

{% hint style="warning" %}
The elements of the Read, Write, and Shift tuples of all transition lines must be enclosed in parenthesis, even if only one element is present.
{% endhint %}

#### Example

A simple incrementer state:

```
/* 
- If the machine is on state q0, and "0" is seen on the current tape,
then stay on q0, write back the "0", and move the head right
- If the machine is on state q0, and "1" is seen on the current tape,
then stay on q0, write back the "1", and move the head right
- If the machine is on state q0, and the current tape cell is blank,
then switch to q1, write a "1", and keep the head stationary.
- That is, traverse the tape until a blank cell is found, then write a
"1" to the cell.
*/
q0  (0)      q0    (0)  (RIGHT)
q0  (1)      q0    (1)  (RIGHT)
q0  (BLANK)  q1    (1)  (STAY)
```

{% hint style="info" %}
In the above program, when a blank cell is observed, the program writes "1" to the cell and switches to state `q1`, leaving the head stationary. Since no transition applies for when the machine is on state `q1` and "1" is observed, the machine *halts*. A state that always results in the machine halting is called a *halting* (or *final*) *state*.
{% endhint %}

***

### 3. The Tape Model & Symbols

Varphi abstracts the machine's tape as an infinite strip of cells. Inside the Read and Write tuples, three types of symbols function as the alphabet.

#### A. Literals

*Alphanumeric* characters represent exact values on the tape. They are used for exact matching (so `0` will match only "0" on the tape)

* **Examples:** `0`, `1`, `a`, `B`, `x`.

#### B. The `BLANK` Keyword

The keyword `BLANK` represents an empty tape cell (often denoted as $$\sqcup$$ in textbooks). This is often used when detecting end-of-input or clearing tape cells.

#### C. Variables (Pattern Matching)

{% hint style="info" %}
Variables in Varphi are purely syntactic sugar, and thus any Varphi program that uses variables can technically be converted to one that does not use them.
{% endhint %}

Variables allow a single transition to handle multiple symbols, significantly reducing code size.

* **Syntax:** A `$` followed by alphanumerical characters (e.g., `$x`, `$val`, `$1`).
* **Semantics:**
  * **On Read:** The variable *binds* to the symbol currently under the corresponding tape head.
  * **On Write:** The variable evaluates to the symbol it bound during the read phase.

**Variable Binding Examples**

The following transition reads *any* symbol, remembers it as `$x`, transitions to `q1`, writes `$x` back (no change), and moves right.

```
// Identity operation; leaves input untouched
q0  ($x)  q1  ($x)  (RIGHT)
```

**Constraint:** Variables are local to the transition line they appear in. You cannot use a variable in the `write_tuple` unless it was defined (bound) in the `read_tuple` on that same line. Further, variables with the same name that are used across different lines are totally independent.

***

### 4. Multi-Tape Support

Varphi supports $$k$$-tape Turing Machines. The dimension $$k$$ is inferred from the first line of the program (based on the lengths of the Read, Write, and Direction tuples).

#### Consistency Rule

Every transition in the file must have a tuple length exactly equal to $$k$$.

* If line 1 uses a tuple of length $$i$$ for its Read tuple, then $$k=i$$.
* If any other tuple in the program has a length that is not $$k$$, a compilation error will be thrown.

#### Example: 2-Tape Swap Machine

This machine swaps the symbols on two tapes until it reaches a blank tape cell on either one, demonstrating the concepts of variables and multi-tape support.

```
// Swap the contents of two tapes
copy ($x, $y) copy ($y, $x) (RIGHT, RIGHT)
copy (BLANK, $ignore) all_done (BLANK, $ignore) (STAY, STAY)
copy ($ignore, BLANK) all_done ($ignore, BLANK) (STAY, STAY)
```

{% hint style="warning" %}
Notice that the second and third rules will always take precedence over the first one, since they are more specific. We discuss this more [below](#specificity-and-execution-priority).
{% endhint %}

{% hint style="info" %}
By now, you should be able to recognize that `all_done` is a halting state in the above program.
{% endhint %}

***

### 5. Control Flow & Nondeterminism

#### Specificity & Execution Priority

In Varphi, if multiple transitions match the current configuration (state and read symbols), the machine selects the "best" match based on **Specificity**.

The compiler calculates a **Specificity Score** based on `(Unique Variables, Total Variables)`. Lower scores are prioritized.

1. **Priority 1: Fewest Unique Variables.**
   * Literals are more specific than variables.
   * Reusing a variable (constraints) is more specific than using distinct variables.
2. **Priority 2: Fewest Total Variables.**
   * If unique counts are equal, the rule with fewer variable slots (more literals) wins.

**Resolution Table for the 2-Tape Case**

<table><thead><tr><th width="129">Rule Pattern</th><th width="130" align="center">Unique Vars</th><th width="115" align="center">Total Vars</th><th width="101">Priority</th><th>Interpretation</th></tr></thead><tbody><tr><td><code>(0, 0)</code></td><td align="center">0</td><td align="center">0</td><td><strong>Highest</strong></td><td>Exact literal match.</td></tr><tr><td><code>($x, 0)</code></td><td align="center">1</td><td align="center">1</td><td>High</td><td>One variable, one literal.</td></tr><tr><td><code>($x, $x)</code></td><td align="center">1</td><td align="center">2</td><td>Medium</td><td>Constraint: Both heads must read the <em>same</em> symbol.</td></tr><tr><td><code>($x, $y)</code></td><td align="center">2</td><td align="center">2</td><td><strong>Lowest</strong></td><td>Catch-all: Matches any two symbols.</td></tr></tbody></table>

#### Nondeterminism

If two rules have the exact same Specificity Score and both match the tape state, the machine exhibits **Nondeterministic behavior**. The runtime will randomly select one of the valid transitions.

Below is an example of a machine where the machine stochastically picks a branch at runtime.

```
// Nondeterministic Branching
q0  (0)  state_A  (0)  (LEFT)
q0  (0)  state_B  (0)  (RIGHT)
```

***

### 6. Directions

The Shift Tuple of a transition line dictates head movement. There are exactly three valid keywords that can be used in a Shift Tuple:

* `LEFT`: Move head one cell to the left.
* `RIGHT`: Move head one cell to the right.
* `STAY`: Do not move the head.

***

### 7. Safety & Constraints

#### Write Safety

A variable cannot appear in the Write Tuple of a transition line unless it was defined (i.e., appears) in the Read Tuple of the same transition line.

```
// VALID
q0  ($x)  q1  ($x)  (RIGHT)

// INVALID: $y is undefined
q0  ($x)  q1  ($y)  (RIGHT)
```

#### Keyword Safety

While the parser allows keywords like `LEFT` or `BLANK` to be used as state names and variable names, this is strongly discouraged as it creates hard-to-read code.

{% hint style="info" %}
Pedantically speaking, `LEFT`, `RIGHT`, `STAY`, and `BLANK` are not keywords in Varphi, since they only function differently in certain contexts (i.e., certain tuples). As such, we can call them *context-specific keywords*.
{% endhint %}
