Skip to content

Interpreter Design Pattern

Video Lecture

Section Video Links
Interpreter Overview Interpreter Overview Interpreter Overview 
Interpreter Use Case Interpreter Use Case Interpreter Use Case 
String Slicing String Slicing String Slicing 
__repr__ Method __repr__ Dunder Method __repr__ Dunder Method 

Overview

The Interpreter pattern helps to convert information from one language into another.

The language can be anything such as words in a sentence, numerical formulas or even software code.

The process is to convert the source information, into an Abstract Syntax Tree (AST) of Terminal and Non-Terminal expressions that all implement an interpret() method.

A Non-Terminal expression is a combination of other Non-Terminal and/or Terminal expressions.

Terminal means terminated, i.e., there is no further processing involved.

An AST root starts with a Non-Terminal expression and then resolves down each branch until all expressions terminate.

An example expression is A + B.

The A and B are Terminal expressions and the + is Non-Terminal because it depends on the two other Terminal expressions.

The Image below, is an AST for the expression 5 + 4 - 3 + 7 - 2

Abstract Syntax Tree Example

The official Interpreter pattern described in the original GoF Design Patterns book does not state how to construct an Abstract Syntax Tree. How your tree is constructed will depend on the grammatical constructs of your sentence that you want to interpret.

Abstract Syntax Trees can be created manually or dynamically from a custom parser script. In the first example code below, I construct the AST manually.

Once the AST is created, you can then choose the root node and then run the Interpret operation on that, and it should interpret the whole tree recursively.

Terminology

  • Abstract Expression: Describe the method(s) that Terminal and Non-Terminal expressions should implement.
  • Non-Terminal Expression: A composite of Terminal and/or Non-Terminal expressions.
  • Terminal Expression: A leaf node Expression.
  • Context: Context is state that can be passed through interpret operations if necessary.
  • Client: Builds or is given an Abstract Syntax Tree to interpret.

Interpreter UML Diagram

Interpreter UML Diagram

Source Code

In this example, I interpret the string 5 + 4 - 3 + 7 - 2 and then calculate the result.

The grammar of the string follows a pattern of Number ⇾ Operator ⇾ Number ⇾ etc.

I convert the string into a list of tokens that I can refer to by index in the list.

I then construct the AST manually, by adding a

  1. Non-Terminal Add row containing two Terminals for the 5 and 4,
  2. Non-Terminal Subtract row containing the previous Non-Terminal row and the 3
  3. Non-Terminal Add row containing the previous Non-Terminal row and the 7
  4. Non-Terminal Subtract row containing the previous Non-Terminal row and the 2

The AST root becomes the final row that was added, and then I can run the interpret() method on that, which will interpret the full AST recursively because each AST row references the row above it.

./interpreter/interpreter_concept.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
# pylint: disable=too-few-public-methods
# pylint: disable=arguments-differ
"The Interpreter Pattern Concept"

class AbstractExpression():
    "All Terminal and Non-Terminal expressions will implement an `interpret` method"
    @staticmethod
    def interpret():
        """
        The `interpret` method gets called recursively for each
        AbstractExpression
        """

class Number(AbstractExpression):
    "Terminal Expression"

    def __init__(self, value):
        self.value = int(value)

    def interpret(self):
        return self.value

    def __repr__(self):
        return str(self.value)

class Add(AbstractExpression):
    "Non-Terminal Expression."

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        return self.left.interpret() + self.right.interpret()

    def __repr__(self):
        return f"({self.left} Add {self.right})"

class Subtract(AbstractExpression):
    "Non-Terminal Expression"

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        return self.left.interpret() - self.right.interpret()

    def __repr__(self):
        return f"({self.left} Subtract {self.right})"

# The Client
# The sentence complies with a simple grammar of
# Number -> Operator -> Number -> etc,
SENTENCE = "5 + 4 - 3 + 7 - 2"
print(SENTENCE)

# Split the sentence into individual expressions that will be added to
# an Abstract Syntax Tree (AST) as Terminal and Non-Terminal expressions
TOKENS = SENTENCE.split(" ")
print(TOKENS)

# Manually Creating an Abstract Syntax Tree from the tokens
AST: list[AbstractExpression] = []  # Python 3.9
# AST = []  # Python 3.8 or earlier
AST.append(Add(Number(TOKENS[0]), Number(TOKENS[2])))  # 5 + 4
AST.append(Subtract(AST[0], Number(TOKENS[4])))        # ^ - 3
AST.append(Add(AST[1], Number(TOKENS[6])))             # ^ + 7
AST.append(Subtract(AST[2], Number(TOKENS[8])))        # ^ - 2

# Use the final AST row as the root node.
AST_ROOT = AST.pop()

# Interpret recursively through the full AST starting from the root.
print(AST_ROOT.interpret())

# Print out a representation of the AST_ROOT
print(AST_ROOT)

Output

python ./interpreter/interpreter_concept.py
5 + 4 - 3 + 7 - 2
['5', '+', '4', '-', '3', '+', '7', '-', '2']
11
((((5 Add 4) Subtract 3) Add 7) Subtract 2)

SBCODE Editor

<>

Example Use Case

The example use case will expand on the concept example by dynamically creating the AST and converting Roman numerals to integers as well as calculating the final result.

The Image below, is an AST for the expression 5 + IV - 3 + VII - 2

Abstract Syntax Tree Example

Example UML Diagram

Interpreter Pattern Overview

Source Code

./interpreter/client.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
"The Interpreter Pattern Use Case Example"

from sentence_parser import Parser

# The sentence complies with a simple grammar of
# Number -> Operator -> Number -> etc,
SENTENCE = "5 + IV - 3 + VII - 2"
# SENTENCE = "V + IV - III + 7 - II"
# SENTENCE= "CIX + V"
# SENTENCE = "CIX + V - 3 + VII - 2"
# SENTENCE = "MMMCMXCIX - CXIX + MCXXII - MMMCDXII - XVIII - CCXXXV"
print(SENTENCE)

AST_ROOT = Parser.parse(SENTENCE)

# Interpret recursively through the full AST starting from the root.
print(AST_ROOT.interpret())

# Print out a representation of the AST_ROOT
print(AST_ROOT)

./interpreter/abstract_expression.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
"An Abstract Expression"
# pylint: disable=too-few-public-methods
class AbstractExpression():
    """
    All Terminal and Non-Terminal expressions will implement an
    `interpret` method
    """
    @staticmethod
    def interpret():
        """
        The `interpret` method gets called recursively for
        each AbstractExpression
        """

./interpreter/number.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
"A Number. This is a leaf node Expression"
from abstract_expression import AbstractExpression

class Number(AbstractExpression):
    "Terminal Expression"

    def __init__(self, value):
        self.value = int(value)

    def interpret(self):
        return self.value

    def __repr__(self):
        return str(self.value)

./interpreter/add.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
"Add Expression. This is a Non-Terminal Expression"
from abstract_expression import AbstractExpression

class Add(AbstractExpression):
    "Non-Terminal Expression."

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        return self.left.interpret() + self.right.interpret()

    def __repr__(self):
        return f"({self.left} Add {self.right})"

./interpreter/subtract.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
"Subtract Expression. This is a Non-Terminal Expression"
from abstract_expression import AbstractExpression

class Subtract(AbstractExpression):
    "Non-Terminal Expression"

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        return self.left.interpret() - self.right.interpret()

    def __repr__(self):
        return f"({self.left} Subtract {self.right})"

./interpreter/roman_numeral.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
# pylint: disable=too-few-public-methods
"Roman Numeral Expression. This is a Non-Terminal Expression"
from abstract_expression import AbstractExpression
from number import Number

class RomanNumeral(AbstractExpression):
    "Non Terminal expression"

    def __init__(self, roman_numeral):
        self.roman_numeral = roman_numeral
        self.context = [roman_numeral, 0]

    def interpret(self):
        RomanNumeral1000.interpret(self.context)
        RomanNumeral100.interpret(self.context)
        RomanNumeral10.interpret(self.context)
        RomanNumeral1.interpret(self.context)
        return Number(self.context[1]).interpret()

    def __repr__(self):
        return f"{self.roman_numeral}({self.context[1]})"

class RomanNumeral1(RomanNumeral):
    "Roman Numerals 1 - 9"
    one = "I"
    four = "IV"
    five = "V"
    nine = "IX"
    multiplier = 1

    @classmethod
    def interpret(cls, *args):

        context = args[0]

        if not context[0]:
            return Number(context[1]).interpret()

        if context[0][0: 2] == cls.nine:
            context[1] += (9 * cls.multiplier)
            context[0] = context[0][2:]
        elif context[0][0] == cls.five:
            context[1] += (5 * cls.multiplier)
            context[0] = context[0][1:]
        elif context[0][0: 2] == cls.four:
            context[1] += + (4 * cls.multiplier)
            context[0] = context[0][2:]

        while context[0] and context[0][0] == cls.one:
            context[1] += (1 * cls.multiplier)
            context[0] = context[0][1:]

        return Number(context[1]).interpret()

class RomanNumeral10(RomanNumeral1):
    "Roman Numerals 10 - 99"
    one = "X"
    four = "XL"
    five = "L"
    nine = "XC"
    multiplier = 10

class RomanNumeral100(RomanNumeral1):
    "Roman Numerals 100 - 999"
    one = "C"
    four = "CD"
    five = "D"
    nine = "CM"
    multiplier = 100

class RomanNumeral1000(RomanNumeral1):
    "Roman Numerals 1000 - 3999"
    one = "M"
    four = ""
    five = ""
    nine = ""
    multiplier = 1000

./interpreter/sentence_parser.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
"A Custom Parser for creating an Abstract Syntax Tree"

from number import Number
from add import Add
from subtract import Subtract
from roman_numeral import RomanNumeral

class Parser:
    "Dynamically create the Abstract Syntax Tree"

    @classmethod
    def parse(cls, sentence):
        "Create the AST from the sentence"

        tokens = sentence.split(" ")
        print(tokens)

        tree = []  # Abstract Syntax Tree
        while len(tokens) > 1:

            left_expression = cls.decide_left_expression(tree, tokens)

            # get the operator, make the token list shorter
            operator = tokens.pop(0)

            right = tokens[0]

            if not right.isdigit():
                tree.append(RomanNumeral(tokens[0]))
                if operator == '-':
                    tree.append(Subtract(left_expression, tree[-1]))
                if operator == '+':
                    tree.append(Add(left_expression, tree[-1]))
            else:
                right_expression = Number(right)
                if not tree:
                    # Empty Data Structures return False by default
                    if operator == '-':
                        tree.append(
                            Subtract(left_expression, right_expression))
                    if operator == '+':
                        tree.append(
                            Add(left_expression, right_expression))
                else:
                    if operator == '-':
                        tree.append(Subtract(tree[-1], right_expression))
                    if operator == '+':
                        tree.append(Add(tree[-1], right_expression))

        return tree.pop()

    @staticmethod
    def decide_left_expression(tree, tokens):
        """
        On the First iteration, the left expression can be either a
        number or roman numeral. Every consecutive expression is
        reference to an existing AST row
        """
        left = tokens.pop(0)
        left_expression = None
        if not tree:  # only applicable if first round
            if not left.isdigit():  # if 1st token a roman numeral
                tree.append(RomanNumeral(left))
                left_expression = tree[-1]
            else:
                left_expression = Number(left)
        else:
            left_expression = tree[-1]
        return left_expression

Output

python ./interpreter/client.py
5 + IV - 3 + VII - 2
['5', '+', 'IV', '-', '3', '+', 'VII', '-', '2']
11
((((5 Add IV(4)) Subtract 3) Add VII(7)) Subtract 2)

SBCODE Editor

<>

New Coding Concepts

String Slicing

Sometimes you want part of a string. In the example code, when I am interpreting the Roman numerals, I am comparing the first one or two characters in the context with IV or CM or many other Roman numeral combinations. If the match is true then I continue with further commands.

The format is

1
string[start: end: step]

E.g., the string may be

MMMCMXCIX

and I want the first three characters,

1
2
test = "MMMCMXCIX"
print(test[0: 3])

Outputs

MMM

or I want the last 4 characters

1
2
test = "MMMCMXCIX"
print(test[-4:])

Outputs

XCIX

or I want a section in the middle

1
2
test = "MMMCMXCIX"
print(test[2:5])

Outputs

MCM

or stepped

1
2
test = "MMMCMXCIX"
print(test[2:9:2])

Outputs

MMCX

or even reversed

1
2
test = "MMMCMXCIX"
print(test[::-1])

Outputs

XICXMCMMM

The technique is very common in examples of Python source code throughout the internet. So, when you see the [] with numbers and colons inside, e.g., [:-1:], it is likely to do with extracting a portion of a data structure.

Note that the technique also works on Lists and Tuples.

1
2
3
4
5
6
7
test = [1,2,3,4,5,6,7,8,9]
print(test[0: 3])
print(test[-4:])
print(test[2:5])
print(test[2:9:2])
print(test[::-1])
print(test[:-1:])

Outputs

[1, 2, 3]
[6, 7, 8, 9]
[3, 4, 5]
[3, 5, 7, 9]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
[1, 2, 3, 4, 5, 6, 7, 8]

Dunder __repr__ Method

It is used in a very similar way to the __str__ dunder method. You can override it to produce a string representation of a class.

1
2
3
4
5
6
7
class class_a():

    def __repr__(self):
        return "I am __repr__"

A = class_a()
print(A)

Outputs

I am __repr__

If you have a class that also overrides the __str__ dunder method already, then printing it will default to use the __str__ override instead.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
class class_a():

    def __str__(self):
        return "I am __str__"

    def __repr__(self):
        return "I am __repr__"

A = class_a()
print(A)

Outputs

I am __str__

You should prefer to use the __str__ method to print human-readable versions of your classes instead. If you require something with more information that could alternatively be used for debugging, or for object recreation using eval(), and not intended for reading by users, then it is generally recommended implementing the __repr__ method instead of __str__ for these more programmatic purposes.

To specifically use a __repr__ output, where both __repr__ and __str__ are both implemented in the same class, then use object.__repr__() or repr(A),

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
class class_a():

    def __str__(self):
        return "I am __str__"

    def __repr__(self):
        return "I am __repr__"

A = class_a()
print(A.__repr__())
#or
print(repr(A))

Summary

  • ASTs are hard to create and are an enormous subject in themselves. My recommended approach is to create them manually first using a sample sentence to help understand all the steps individually, and then progress the conversion to be fully dynamic one step at a time ensuring that the grammatical constructs still work as you continue to progress.

  • The Interpreter pattern uses a class to represent each grammatical rule.

  • ASTs consist of multiple Non-Terminal and Terminal Expressions, that all implement an interpret() method.

  • Note that in the sample code above, the interpret() methods in the Non-Terminal expressions, all call further interpret() recursively. Only the Terminal expressions interpret() method returns an explicit value. See the Number class in the above code.