The Tali Forth 2 Manual (ALPHA)
Scot W. Stevenson
June 19, 2018

Abstract
Tali Forth 2 is a bare-metal ANSI(ish) Forth for the 65c02 8-bit MPU. It aims to be,
roughly in order of importance:

Easy to try.

Download the source (or even just the binary) and you can immediately run
it in an emulator. This lets you experiment with a working 8-bit Forth for the 65c02
without any special configuration.

Simple.

The subroutine-threaded (STC) design and happily overcommented source code give
hobbyists the chance to study a working Forth at the lowest level. The manual (this
document) explains structure and code in detail. The aim is to make it easy to port
Tali Forth 2 to various 65c02 hardware projects.

Specific.

Many Forths available are `general' implementations with a small core adapted to
the target processor. Tali Forth 2 was written as a "bare metal Forth" for the 65c02
8-bit MPU and that MPU only, with its strengths and limitations in mind.

Standardized.

Most Forths available for the 65c02 are based on ancient, outdated templates
such as FIG Forth. Learning Forth with them is like trying to learn modern English by
reading Chaucer. Tali Forth (mostly) follows the current ANSI Standard.

Tali Forth is hosted at GitHub at https://github.com/scotws/TaliForth2. The discussion
thread is at 6502.org at http://forum.6502.org/viewtopic.php?f=9&t=2926.


Contents

I Introduction

1 Why
  1.1 The big picture
    1.1.1 The 6502 MPU
    1.1.2 Forth
  1.2 Writing your own Forth
    1.2.1 FIG Forth
    1.2.2 A modern Forth for the 65c02

2 Overview of Tali Forth
  2.1 Design considerations
    2.1.1 Characteristics of the 65c02
    2.1.2 Cell size
    2.1.3 Threading technique
    2.1.4 Register use
    2.1.5 Data Stack design
    2.1.6 Dictionary structure
    2.1.7 Deeper down the rabbit hole

II User Guide

3 Installing
  3.1 Downloading
    3.1.1 Downloading Tali Forth
    3.1.2 Downloading the py65mon Simulator
  3.2 Running the binary

4 Running
  4.1 Booting
  4.2 Available words
    4.2.1 History
    4.2.2 Standards
    4.2.3 Tali Forth special words
  4.3 Native compiling
  4.4 Underflow detection
  4.5 Restarting
  4.6 Gotchas
  4.7 Reporting a problem

5 The Editor

III Developer Guide

6 How Tali Forth works
  6.1 Stack
    6.1.1 Single cell values
    6.1.2 Underflow detection
    6.1.3 Double cell values
  6.2 Dictionary
    6.2.1 Elements of the Header
    6.2.2 Structure of the Header List
  6.3 Memory Map
  6.4 Input
    6.4.1 Starting up
    6.4.2 The Command Line Interface
    6.4.3 evaluate
  6.5 create/does>
  6.6 Control Flow
    6.6.1 Branches
    6.6.2 Loops
  6.7 Native Compiling
    6.7.1 Return Stack special cases
    6.7.2 Underflow stripping
  6.8 cmove, cmove> and move

7 Developing
  7.1 Adding new words
  7.2 Deeper changes
    7.2.1 The Ophis Assembler
    7.2.2 General notes
    7.2.3 Coding style
  7.3 Code Cheat Sheet
    7.3.1 The Stack Drawing
    7.3.2 Coding idioms
    7.3.3 vi shortcuts

8 Future plans

A FAQ
  A.1 What happened to Tali Forth 1?
  A.2 Why does Tali Forth take so long to start up?
  A.3 Why `Tali' Forth?
  A.4 Who is `Liara'?

B Testing Tali Forth

C Thanks

List of Figures
  1.1 The 65c02 MPU. Photo: Anthony King, released in the public domain

List of Tables
  2.1 The classic Forth registers
  6.1 DSP values for underflow testing
  6.2 Header flags

Part I

Introduction


Chapter 1

Why
Forth is well suited to resource-constrained situations. It doesn't need lots of memory
and doesn't have much overhead. It can take full advantage of whatever hardware or
interfaces exist.
-- Charles Moore, `Chuck Moore: Geek of the Week', redgate Hub 2009

1.1 The big picture
This section provides background information on Forth, the 6502 processor, and why anybody
would want to combine the two. It can be safely skipped if you already know all those things.

1.1.1 The 6502 MPU
It is a well-established fact that humanity reached the apex of processor design with the 6502 in
1976. Created by a team including Chuck Peddle and Bill Mensch, it was the engine that powered
the 8-bit home computer revolution of the 1980s.^1 The VIC-20, Commodore PET, Apple II, and
Atari 800 all used the 6502, among others.

Figure 1.1: The 65c02 MPU. Photo: Anthony King, released in the public domain

More than 40 years later, the processor is still in production by the Western Design Center.
Apart from commercial uses, there is an active hobbyist scene centered on the website 6502.org.
Quite a number of people have built their own 8-bit computers based on this chip and the
instructions there, including a primer by Garth Wilson. It is for these systems that Tali Forth 2
was created.

The most important variant of the 6502 produced today is the 65c02, a CMOS chip with some
additional instructions. It is for this chip that Tali Forth 2 was written.

But why program in 8-bit assembler at all?^2 The 65c02 is fun to work with because of its
clean instruction set architecture (ISA). This is not the place to explain the joys of assembler. The
official handbook for the 65c02 is Programming the 65816, including the 6502, 65C02 and 65802 [6].

^1 Rumor has it that there was another MPU called `Z80', but it ended up being a mere footnote.

1.1.2 Forth
If C gives you enough rope to hang yourself, Forth is a flamethrower crawling with
cobras.
-- Elliot Williams, Forth: The Hacker's Language

Forth is the enfant terrible of programming languages. It was invented by Charles `Chuck'
Moore in the 1960s to do work with radio astronomy, way before there were modern operating
systems or programming languages.^3 As a language for people who actually need to get things
done, it lets you run with scissors, play with fire, and cut corners until you've turned a square
into a circle. Forth is not for the faint-hearted: It is trivial, for instance, to redefine 1 as 2 and
true as false. Though you can do really, really clever things with few lines of code, the result
can be hard for other people to understand, leading to the reputation of Forth being a `write-only
language'. However, Forth excels when you positively, absolutely have to get something done with
hardware that is really too weak for the job.

It should be no surprise that NASA is one of the organizations that use Forth. The Cassini
mission to Saturn used a Forth CPU, for instance. It is also perfect for small computers like the
8-bit 65c02. After a small boom in the 1980s, more powerful computers led to a decline of the
language. The `Internet of Things' with embedded small processors has led to a certain amount of
renewed interest in the language. It helps that Forth is easy to implement: It is stack-based, uses
reverse Polish notation (RPN) and a simple threaded interpreter model.
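As a small illustration of the stack-based, reverse Polish style, the following lines can be typed at the prompt of any standard Forth. The word name `square` is made up for this sketch and is not part of Tali Forth.

```forth
\ Reverse Polish notation: operands first, then the operator.
2 3 + .                          \ prints 5

\ A new word is built from existing ones and used like any other.
: square ( n -- n*n )  dup * ;   \ hypothetical example word
7 square .                       \ prints 49
```

Each word takes its arguments from the Data Stack and leaves its results there, which is why no parentheses or operator precedence rules are needed.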
There is no way this document can provide an adequate introduction to Forth. There are quite
a number of tutorials, however, such as A Beginner's Guide to Forth by J.V. Noble [9] or the classic
(but slightly dated) Starting Forth [2] by Leo Brodie. Gforth, one of the more powerful free Forths,
comes with its own tutorial.^4

1.2 Writing your own Forth
Even if the 65c02 is great and Forth is brilliant, why go to the effort of writing a new, bare-metal
version of the language? After almost 50 years, shouldn't there be a bunch of Forths around
already?

1.2.1 FIG Forth
In fact, the classic Forth available for the whole group of 8-bit MPUs is FIG Forth (`FIG' stands
for `Forth Interest Group'). Ported to various architectures, it was originally based on an incarnation
for the 6502 written by Bill Ragsdale and Robert Selzer. There are PDFs of the 6502 version from
September 1980 freely available (Forths are traditionally placed in the public domain) and more
than one hobbyist has revised it to his machine.

However, Forth has changed a lot in the past three decades. There is now a standardized
version called ANSI Forth, which includes such basic changes as how the do loop works.
Learning the language with FIG Forth is like learning English with The Canterbury Tales.

^2 Wilson answers this question in greater detail as part of his 6502 primer.
^3 A brief history of Forth can be found at https://www.forth.com/resources/forth-programming-language/
^4 Once you have understood the basics of the language, do yourself a favor and read Thinking Forth
by Brodie [3], which deals with the philosophy of the language. Like Lisp, exposure to Forth will
change the way you think about programming.


1.2.2 A modern Forth for the 65c02
Tali Forth was created to provide an easy to understand modern Forth written especially for the
65c02 that anybody can understand, adapt to their own use, and maybe actually work with. As
part of that effort, the source code is heavily commented, and this document tries to explain the
internals in more detail.


Chapter 2

Overview of Tali Forth
2.1 Design considerations
When creating a new Forth, there are a bunch of design decisions to be made.^1 Spoiler alert: Tali
Forth ended up as a subroutine-threaded variant with a 16-bit cell size and a dictionary that keeps
headers and code separate. If you don't care and just want to use the program, skip ahead.

2.1.1 Characteristics of the 65c02
Since this is a bare-metal Forth, the most important consideration is the target processor. The
65c02 only has one full register, the accumulator A, and two secondary registers X and Y. All
are 8 bits wide. There are 256 bytes that are more easily addressable on the Zero Page. A single
hardware stack is used for subroutine jumps. The address bus is 16 bits wide for a maximum of
64 KiB of RAM and ROM. For the default, simple setup, we assume 32 KiB of each.

2.1.2 Cell size
The 16-bit address bus suggests the cell size should be 16 bits as well. This is still easy enough
to realize on an 8-bit MPU, though not as comfortable as working with the 65816, the 65c02's big
brother, with an actual 16-bit register size.
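The practical effect of the cell size can be checked from the Forth prompt. The lines below are a sketch using standard words only; the variable name `counter` is invented for the example, and the values in the comments assume 16-bit cells as described above.

```forth
decimal
1 cells .            \ bytes per cell: 2 with 16-bit cells
1 cells 8 * .        \ bits per cell: 16

variable counter     \ hypothetical variable, reserves one cell
65535 counter !      \ largest unsigned value that fits into 16 bits
counter @ 1+ u.      \ wraps around to 0 with 16-bit cells;
                     \ a Gforth on a 64-bit host prints 65536 instead
```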

2.1.3 Threading technique
A `thread' in Forth is simply a list of addresses of words to be executed. There are four basic
threading techniques:^2

Indirect threaded (ITC): The oldest, original variant, used by FIG Forth. All other versions
are modifications of this model.

Direct threaded (DTC): Includes more assembler code to speed things up, but slightly larger
than ITC.

Token threaded (TTC): The reverse of DTC in that it is slower, but uses less space than the
other Forths. Words are created as a table of tokens.

Subroutine threaded (STC): This technique converts the words to a simple series of jsr
combinations.

Our lack of registers and the goal of creating a simple and easy to understand Forth make
subroutine threading the most attractive solution. We will try to mitigate the pain caused by the
12 cycle cost of each and every jsr/rts combination by including a relatively large number of
native words.
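To make the idea of subroutine threading more concrete, here is a sketch of a colon definition together with the code it would conceptually turn into. The word `cube` is invented for the example, and the assembler labels in the comments (`xt_dup`, `xt_star`) are assumptions chosen for illustration, not necessarily the labels used in the Tali source. The picture also assumes that none of the words is copied inline by native compiling.

```forth
\ A colon definition made of four calls to existing words.
: cube ( n -- n^3 )  dup dup * * ;    \ hypothetical example word

\ Under subroutine threading the body of cube is conceptually
\ nothing more than a run of subroutine calls and a return:
\
\     jsr xt_dup      ; label names are assumptions
\     jsr xt_dup
\     jsr xt_star
\     jsr xt_star
\     rts

3 cube .    \ prints 27
```

Every `jsr` here ends in an `rts` inside the called word, which is where the 12 cycles per call mentioned above come from.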

^1 The best introduction to these questions is found in Design Decisions in the Forth Kernel by Brad Rodriguez.
^2 For the 8086 MPU, Guy Kelly compared various Forth implementations in 1992 [7].

2.1.4 Register use
The lack of registers (and 16-bit registers at that) becomes apparent when you realize that Forth
classically uses at least four `virtual' registers:

Register   Function
W          Working register
IP         Interpreter Pointer
DSP        Data Stack Pointer
RSP        Return Stack Pointer

Table 2.1: The classic Forth registers

On a modern processor like a RISC-V RV32I CPU with 32 registers of 32 bits each, this wouldn't
be a problem. In fact, we'd be trying to figure out what else we could keep in a register. On the
65c02, at least we get the RSP for free with the built-in stack pointer. This still leaves three
registers. We cut that number down by one through subroutine threading, which gets rid of the
IP. For the DSP, we use the 65c02's Zero Page indexed addressing mode with the X register. This
leaves W, which we put on the Zero Page as well.

2.1.5 Data Stack design
We'll go into greater detail on how the Data Stack works in a later chapter when we look at
the internals. Briefly, the stack is realized on the Zero Page for speed. For stability, we provide
underflow checks in the relevant words, but give the user the option of stripping them out for native
compilation.

2.1.6 Dictionary structure
Each Forth word consists of the actual code and the header, which holds the meta-data. Part of
this data is the link that forms the singly linked list of words that is searched.
In contrast to Tali Forth 1, which kept the header and body of the words together, Tali Forth
2 keeps them separate. This lets us play various tricks with the code to make it more effective.

2.1.7 Deeper down the rabbit hole
This concludes our overview of the basic Tali Forth 2 structure. For those interested, a later
chapter will provide far more detail.

Part II

User Guide


Chapter 3

Installing
3.1 Downloading
Tali Forth was created to be easy to get started with. In fact, all you should need is the
ophis.bin binary file and the py65mon simulator.

3.1.1 Downloading Tali Forth
The newest version of Tali Forth 2 lives on GitHub at https://github.com/scotws/TaliForth2. You
can either clone the code with git or simply download it. To just try the program, all you need
is the taliforth-py65mon.bin binary.

3.1.2 Downloading the py65mon Simulator
Tali was written to run out of the box on the py65mon simulator from
https://github.com/mnaberez/py65. This is a Python program that should run on various
operating systems.

To install py65mon on Linux, use the command

    sudo pip install -U py65

If you don't have PIP installed, you will have to add it first with

    sudo apt-get install python-pip

There is a setup.py script as part of the package.

3.2 Running the binary
To start the emulator, run:

    py65mon -m 65c02 -r taliforth-py65mon.bin

Note that the option -m 65c02 is required, because Tali Forth makes extensive use of the
additional commands of the CMOS version and will not run on a stock 6502 MPU.

Chapter 4

Running
One doesn't write programs in Forth. Forth is the program.
-- Charles Moore, Masterminds of Programming [1]

4.1 Booting
Out of the box, Tali Forth boots a minimal kernel to connect to the py65mon simulator.
By default, this stage ends with a line such as

Tali Forth 2 default kernel for py65mon (18. Feb 2018)

When you port Tali Forth to your own hardware, you'll have to include your own kernel (and
probably should print out a different line).
Tali Forth itself boots next, and after setting up various internal things, compiles the high level
words. This causes a slight delay, depending on the number and length of these words. As the last
step, Forth should spit out a boot string like

Tali Forth 2 for the 65c02
Version ALPHA 07. Mar 2018
Copyright 2014-2018 Scot W. Stevenson
Tali Forth 2 comes with absolutely NO WARRANTY
Type 'bye' to exit

Because these are the last high-level commands Tali Forth executes, this functions as a primitive
self-test. If you have modified the high level Forth words in either forth_words.fs or
user_words.fs, the boot process might fail with a variant of the error message `unknown word'.
The built-in, native words should always work. For this reason, dump is a built-in word: it is very
useful for testing.

4.2 Available words
Tali Forth comes with the following Forth words out of the box:

see within to d.r d. ud.r ud. .r u.r */mod */ mod /mod /
action-of is defer@ defer! while until repeat else then
if .( ( drop dup swap ! @ over >r r> r@ nip rot -rot tuck
, c@ c! +! execute emit type . u. ? false true space 0 1
2 2dup ?dup + - abs dabs and or xor rshift lshift pick char
[char] char+ chars cells cell+ here 1- 1+ 2* = <> < > 0=
0<> 0> 0< min max 2drop 2swap 2over 2variable 2r@ 2r> 2>r
invert negate dnegate c, bounds spaces bl -trailing /string
refill accept unused depth key allot create does> variable
constant value s>d d>s d- d+ erase blank fill find-name '
['] name>int int>name name>string >body defer latestxt
latestnt parse-name parse source source-id : ; compile, [ ]

0branch branch literal sliteral ." s" postpone immediate
compile-only never-native always-native nc-limit abort
abort" do ?do i j loop +loop exit unloop leave recurse quit
begin again state evaluate base digit? number >number hex
decimal count m* um* * um/mod ud/mod sm/rem fm/mod \ move
cmove> cmove pad >in <# # #s #> hold sign output input cr
page at-xy marker words wordsize aligned align bell dump .s
find word cold bye
(Call words in Tali Forth for the current list.)

Though the list might look unsorted, it actually reflects the priority in the dictionary, that is,
which words are found first. For instance, the native words (those coded in assembler) start with
drop and end with bye, which is the last word that Tali Forth will find.^1 The words before
drop are those that are defined in high-level Forth. For more information on the words, use the
see command.

Note that the built-in words are lower case. Newly defined words can be in any case and will
be distinct: `KASUMI' is a different word than `Kasumi'.
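The case rule can be tried directly at the prompt. This is a sketch that follows the statement above; the two definitions exist only for the example, and the printed strings are arbitrary.

```forth
: KASUMI ( -- )  ." upper case" ;
: Kasumi ( -- )  ." mixed case" ;

KASUMI        \ on Tali Forth, expected to print: upper case
Kasumi        \ on Tali Forth, expected to print: mixed case
see KASUMI    \ show what Tali knows about the first word
```

Keep in mind that Gforth treats word names as case-insensitive, so there the second definition simply redefines the first and both lines print the same text. Code that relies on case to tell words apart is therefore not portable between the two systems.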

4.2.1 History
Tali's command line includes a simple, eight-element history function. To access the previous
entries, press CONTROL-p; to go forward to the next entry, press CONTROL-n.

4.2.2 Standards
Tali Forth is oriented towards ANSI Forth, but (currently) doesn't contain the complete set of even
the core words. Tali also adopted some words from Gforth such as bounds. In practical terms,
Tali aims to be a subset of Gforth: If a program runs on Tali, it should run on Gforth the same
way or have a very good reason not to.
In addition, there are a few words that are specific to Tali Forth such as nc-limit.

4.2.3 Tali Forth special words
Tali Forth includes a number of words not found in Gforth or ANSI Forth.

• 0branch ( f -- ) Take branch if TOS is zero. Used internally for branching commands
such as if. This is usually replaced by cs-pick and cs-roll in modern Forths; Tali Forth
might switch to this model in the future.

• 0 ( -- 0 ) Push the number 0 on the Data Stack.

• 1 ( -- 1 ) Push the number 1 on the Data Stack.

• 2 ( -- 2 ) Push the number 2 on the Data Stack.

• always-native ( -- ) Mark latest word so that it is always natively compiled.

• bell ( -- ) Ring the terminal bell (print ASCII 07).

• branch ( -- ) Always take branch. Used internally for branching commands such as if.
This is usually replaced by cs-pick and cs-roll in modern Forths; Tali Forth might
switch to this model in the future.

• compile-only ( -- ) Mark latest word as compile only.

• digit? ( char -- u f | char f ) If character is a digit, convert and set a success
flag, otherwise return the character and a failure flag.

• input ( -- addr ) Return the address where the vector for the input routine is stored
(not the vector itself). Used for input redirection for key and others.

• int>name ( xt -- nt ) Given the execution token of a word, return the name token.

• latestnt ( -- nt ) Return the last used name token. The Gforth version of this word is
called latest.

• nc-limit ( -- addr ) Return the address where the threshold value for native compiling
is kept. To check the value of this parameter, use nc-limit ?. The default is 20.

• never-native ( -- ) Mark most recent word so it is never natively compiled.

• number ( addr u -- u | d ) Convert a string to a number. Gforth uses s>number?
instead and returns a success flag as well.

• output ( -- addr ) Return the address where the vector for the output routine is stored
(not the vector itself). Used for output redirection for emit and others.

• uf-strip ( -- addr ) Return the address where the flag is kept that decides if the
underflow checks are removed during native compiling. To check the value of this flag, use
uf-strip ?.

• wordsize ( nt -- u ) Given the name token of a Forth word, return its size in bytes.
Used to help tune native compiling.

^1 If you're going to quit, speed can't be that important.
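Several of the special words can be combined at the prompt. The following is a sketch that assumes a running Tali Forth 2 and relies only on the stack effects listed above. The word `twice` is invented for the example, and no sizes or addresses are given in the comments because they depend on the build.

```forth
: twice ( n -- 2n )  2* ;        \ hypothetical example word
never-native                     \ mark twice: it is always called, never inlined

latestnt name>string type        \ name of the most recent word, expected: twice
' twice int>name wordsize .      \ size of the code of twice in bytes

output @ u.                      \ the vector stored at output, i.e. the routine
                                 \ that emit and friends end up calling
```

Note that `output` and `input` return the address of the vector, so a `@` is needed to see the vector itself and a `!` to redirect it.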

4.3 Native compiling
As the name says, subroutine threaded code encodes the words as a series of subroutine jumps.
Because of the overhead caused by these jumps, this can make the code slow. Therefore, Tali Forth
enables `native compiling', where the machine code from the word itself is included instead of a
subroutine jump.
The parameter nc-limit sets the limit of how small words have to be to be natively compiled.
To get the current value (usually 20), check the value of the system variable:

nc-limit ?

To set a new limit, save the maximal allowed number of bytes in the machine code like any other
Forth variable:

40 nc-limit !

To completely turn off native compiling, set this value to zero.
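The effect of the threshold can be observed by compiling the same definition under two settings and comparing the sizes that wordsize reports. This is a sketch for a running Tali Forth 2; the words `add2a` and `add2b` are invented, and whether `1+` is actually copied inline depends on its size in the build at hand, so no byte counts are claimed.

```forth
nc-limit ?                      \ current threshold, 20 by default

: add2a ( n -- n+2 )  1+ 1+ ;   \ compiled with native compiling enabled
' add2a int>name wordsize .     \ size in bytes

0 nc-limit !                    \ turn native compiling off
: add2b ( n -- n+2 )  1+ 1+ ;   \ same source, now built from subroutine jumps
' add2b int>name wordsize .     \ compare with the size of add2a

20 nc-limit !                   \ restore the documented default
```

The setting only affects words compiled after the change; existing definitions keep the code they were built with.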

4.4 Underflow detection
When a word tries to access more values on the stack than it is holding, an `underflow' error occurs.
Whereas Tali Forth 1 didn't check for these errors, this version does.
However, this slows the program down. Because of this, the user can turn off underflow detection
for words that are natively compiled into new words. To do this, set the system variable
uf-strip to true. Note this does not turn off underflow detection in the built-in words. Also, words
with underflow detection which are not included in new words through native compiling will also
retain their tests.
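In practice the flag is switched on around the definitions that should run without checks and switched off again afterwards. The sketch below assumes a running Tali Forth 2; the word `discard3` is invented, and it is an assumption that `drop` is small enough to be natively compiled under the current nc-limit.

```forth
uf-strip ?               \ show the current state of the flag

true uf-strip !          \ from now on, strip checks from natively compiled code
: discard3 ( a b c -- )  drop drop drop ;   \ hypothetical word, no checks inside
false uf-strip !         \ later definitions keep their checks again

1 2 3 discard3           \ fine: three values were on the stack
```

Calling `discard3` with fewer than three values on the stack would now go unnoticed, which is the price paid for the extra speed.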

4.5 Restarting
Tali Forth has a non-standard word cold that resets the system. Note that this doesn't erase
any data in memory, but just moves the pointers back. When in doubt, you might be better off
quitting and restarting completely.

4.6 Gotchas
Tali has a 16-bit cell size (use 1 cells 8 * . to get the cell size in bits with any Forth), which
can trip up calculations when compared to the de facto standard Gforth with 64 bits. Take this
example:

( Gforth )       decimal 1000 100 um* hex swap u. u.    186a0 0 ok
( Tali Forth )   decimal 1000 100 um* hex swap u. u.    86a0 1 ok

Tali has to use the upper cell of a double-celled number to correctly report the result, while Gforth
doesn't. If the conversion from double to single is only via a drop instruction, this will produce
different results.
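The same product can be handled in a way that gives identical output on both systems by treating the result as the double-cell number it is. The lines below are a sketch using standard words only; the values in the comments follow from 1000 * 100 = 100000 = $186A0.

```forth
decimal
1000 100 um* d.         \ prints 100000 on both systems: d. uses both cells

1000 100 um* drop u.    \ keeps only the low cell:
                        \ 34464 with 16-bit cells, 100000 with 64-bit cells

1000 100 um* nip 0= .   \ is the high cell empty?
                        \ 0 (false) with 16-bit cells, -1 (true) with 64-bit cells
```

Checking the high cell before dropping it, as in the last line, is a simple way to detect that a result no longer fits into a single cell.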

4.7 Reporting a problem
The best way to point out a bug or make any other form of a comment is on Tali Forth's page on
GitHub at https://github.com/scotws/TaliForth2. There, you can `open an issue', which allows
other people who might have the same problem to help even when the author is not available.


Chapter 5

The Editor
(Currently, there is no editor installed.)


Part III

Developer Guide


Chapter 6

How Tali Forth works
6.1 Stack
Tali Forth 2 uses the upper end of the first half of the Zero Page ($0000 to $007F) for the Data
Stack (DS). This leaves the second half ($0080 to $00FF) for any kernel stuff the user might require.
The DS grows towards $0000 and therefore towards the initial user variables. See the file
definitions.asm for details. Because of the danger of underflow, it is recommended that the user
kernel's variables are kept closer to $0100 than to $007f.

The X register is used as the Data Stack Pointer (DSP). It points to the least significant byte
of the current top element of the stack (`Top of the Stack', TOS).^1

Initially, the DSP points to $78, not $7F as might be expected. This provides a few bytes as
a `floodplain' in case of underflow. The initial value of the DSP is defined as dsp0 in the code.

6.1.1 Single cell values
Since the cell size is 16 bits, each stack entry consists of two bytes. They are stored little endian
(least significant byte first). Therefore, the DSP points to the LSB of the current TOS.^2

Because the DSP points to the current top of the stack, the byte it points to after boot (dsp0)
will never be accessed: The DSP is decremented first with two dex instructions, and then the
new value is placed on the stack. This means that the initial byte is garbage and can be considered
part of the floodplain.

          +--------------+
          |          ... |
          +-            -+
          |              |   ...
          +-  (empty)   -+
          |              |   FE,X
          +-            -+
   ...    |              |   FF,X
          +==============+
  $0076   |           LSB|   00,X   <-- DSP (X Register)
          +-    TOS     -+
  $0077   |           MSB|   01,X
          +==============+
  $0078   |  (garbage)   |   02,X   <-- DSP0
          +--------------+
  $0079   |              |   03,X
          + (floodplain) +
  $007A   |              |   04,X
          +--------------+

Snapshot of the Data Stack with one entry as Top of the Stack (TOS). The DSP has been
decremented by one cell (two bytes) and the value written.

Note that the 65c02 system stack (used as the Return Stack (RS) by Tali) pushes the MSB
on first and then the LSB (preserving little endian), so the basic structure is the same for both
stacks.
Because of this stack design, the second entry (`next on stack', NOS) starts at 02,X and the
third entry (`third on stack', 3OS) at 04,X.

^1 In the first versions of Tali, the DSP pointed to the next free element of the stack. The new system makes
detecting underflow easier and parallels the structure of Liara Forth.
^2 Try reading that last sentence to a friend who isn't into computers. Aren't abbreviations fun?
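The byte layout of the Data Stack can be modelled in Forth itself. This is only a sketch: the buffer `zp`, the variable `xreg` and the words `push16` and `tos` are invented here, and they merely imitate what Tali does in assembler with the real Zero Page and the X register. The starting value 120 is $78, the dsp0 value given in the text.

```forth
decimal
create zp 128 allot            \ stand-in for Zero Page addresses $00 to $7F
variable xreg   120 xreg !     \ model of X; 120 = $78 = dsp0

: push16 ( u -- )
   -2 xreg +!                               \ two dex: make room for one cell
   dup 255 and        zp xreg @ +     c!    \ LSB goes to 00,X
   8 rshift 255 and   zp xreg @ 1+ +  c! ;  \ MSB goes to 01,X

: tos ( -- u )                 \ rebuild the top cell from its two bytes
   zp xreg @ + c@   zp xreg @ 1+ + c@   8 lshift or ;

4660 push16                    \ 4660 = $1234
xreg @ .                       \ prints 118, which is $76
zp 118 + c@ .                  \ prints 52, which is $34 (the LSB)
zp 119 + c@ .                  \ prints 18, which is $12 (the MSB)
tos .                          \ prints 4660
```

After the push, the model matches the snapshot above: the pointer sits at $76, the LSB is stored there, the MSB follows at $77, and the byte at $78 was never written.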

6.1.2 Underflow detection
In contrast to Tali Forth 1, this version contains underflow detection for most words. It does this
by comparing the Data Stack Pointer (X) to values that it must be smaller than (because the stack
grows towards 0000). For instance, to make sure we have one element on the stack, we write

        cpx #dsp0-1     ; compare DSP (X) with the limit for one cell
        bmi okay        ; X is below the limit: at least one cell is present
        jmp underflow   ; otherwise report the underflow
okay:
        (...)

For the most common cases, this gives us:

Test for   Pointer offset
1 cell     dsp0-1
2 cells    dsp0-3
3 cells    dsp0-5
4 cells    dsp0-7

Table 6.1: DSP values for underflow testing

Underflow detection adds seven bytes to the words that have it. However, it increases the
stability of the program enormously.
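The offsets in Table 6.1 follow a simple rule: the limit for n cells is dsp0 minus (2n - 1). The rule can be checked with a few lines of Forth. This is a model only; the words `limit` and `enough?` are invented for the sketch, and `dsp0` is set to 120 ($78) as given in the text.

```forth
decimal
120 constant dsp0                            \ $78, initial DSP value

: limit ( cells -- u )  2* 1-  dsp0 swap - ; \ 1 -> dsp0-1, 2 -> dsp0-3, ...
: enough? ( x cells -- f )  limit < ;        \ true if DSP value x is below the limit

1 limit .  2 limit .  3 limit .  4 limit .   \ prints 119 117 115 113

118 1 enough? .    \ one cell on the stack, one needed: prints -1 (true)
118 2 enough? .    \ one cell on the stack, two needed: prints 0 (false)
120 1 enough? .    \ empty stack: prints 0 (false)
```

The comparison `x < limit` plays the role of the cpx/bmi pair in the assembler fragment above.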

6.1.3 Double cell values
A double cell value occupies two cells on the stack, with the most significant cell on top. Note
this places the sign bit at the beginning of the byte that follows the DSP (1,x).

      +---------------+
      |               |
      +===============+
      |            LSB|   $0,x   <-- DSP (X Register)
      +-+  Top Cell  -+
      |S|          MSB|   $1,x
      +-+-------------+
      |            LSB|   $2,x
      +- Bottom Cell -+
      |            MSB|   $3,x
      +===============+

Tali Forth 2 does not check for overflow, which in normal operation is too rare to justify the
computing expense.
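The order of the two cells can be seen from the prompt with standard double-number words. The following sketch should behave the same on Tali Forth and on Gforth.

```forth
decimal
5 s>d d.             \ prints 5
-5 s>d . .           \ prints -1 -5 : the top (most significant) cell comes
                     \ off first and carries the sign, then the low cell
-5 s>d dnegate d.    \ prints 5
-5 s>d d>s .         \ prints -5 : d>s removes the top cell again
```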

6.2 Dictionary
Tali Forth 2 follows the traditional model of a Forth dictionary: a linked list of words terminated
with a zero pointer. The headers and code are kept separate to allow various tricks in the code.

6.2.1 Elements of the Header
Each header is at least eight bytes long:

nt_word ->
+0
+2
+4
+6
+8

+n

8 bit
8 bit
LSB
MSB
+--------+--------+
| Length | Status |
+--------+--------+
| Next Header
| nt_next_word
+-----------------+
| Start of Code
| xt_word
+-----------------+
| End of Code
| z_word
+--------+--------+
| Name
|
|
+--------+--------+
|
|
|
+--------+--------+
|
| ...
|
+--------+--------+

Each word has a name token (nt, nt_word in the code) that points to the first byte of the
header. This is the length of the word's name string, which is limited to 255 characters.
The second byte in the header (index 1) is the status byte. It is created by the flags defined
in the file definitions.asm:

Flag    Function
CO      Compile Only
IM      Immediate Word
NN      Never Native Compile
AN      Always Native Compile
UF      Underflow detection

Table 6.2: Header flags

Note there are currently three bits unused. The status byte is followed by the pointer to the
next header in the linked list, which makes it the name token of the next word. A 0000 in this
position signals the end of the linked list, which by convention is the word bye.
This is followed by the current word's execution token (xt, xt_word) that points to the
start of the actual code. Some words that have the same functionality point to the same code
block. The end of the code is referenced through the next pointer (z_word) to enable native
compilation of the word if allowed.
The name string starts at the eighth byte. The string is not zero-terminated. By default, the
strings of Tali Forth 2 are lower case, but case is respected for words the user defines, so `quarian'
is a different word than `QUARIAN'.
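The header layout can be tried out in a few lines of Python. This is a sketch under assumptions, not Tali's own code: the helper names, the addresses $0010 and $0020, and the xt and z values are all made up for illustration. Only the field order (length, status, next header, start of code, end of code, name) and the 0000 terminator come from the description above.

```python
import struct

HEADER = "<BBHHH"  # length, status, next nt, xt, z (little-endian)


def build_header(name, next_nt, xt, z, status=0):
    """Assemble one header: eight fixed bytes followed by the name."""
    raw_name = name.encode("ascii")
    return struct.pack(HEADER, len(raw_name), status, next_nt, xt, z) + raw_name


def parse_header(memory, nt):
    """Decode the header that the name token nt points to."""
    length, status, next_nt, xt, z = struct.unpack_from(HEADER, memory, nt)
    name = bytes(memory[nt + 8:nt + 8 + length]).decode("ascii")
    return {"name": name, "status": status, "next": next_nt, "xt": xt, "z": z}


def word_names(memory, nt):
    """Follow the linked list from nt until the 0000 terminator."""
    names = []
    while nt != 0:
        header = parse_header(memory, nt)
        names.append(header["name"])
        nt = header["next"]
    return names


def demo_memory():
    """Two made-up headers: drop at $0010 linking to bye at $0020."""
    memory = bytearray(64)
    drop = build_header("drop", next_nt=0x0020, xt=0x8000, z=0x8002)
    bye = build_header("bye", next_nt=0x0000, xt=0x8010, z=0x8011)
    memory[0x10:0x10 + len(drop)] = drop
    memory[0x20:0x20 + len(bye)] = bye
    return memory


if __name__ == "__main__":
    print(word_names(demo_memory(), 0x0010))
```

Since the name is not zero-terminated, the parser has to rely on the length byte, just as the real dictionary search does.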

6.2.2 Structure of the Header List
Tali Forth 2 distinguishes between three different list sources: The native words that are
hard-coded in the file native_words.asm, the Forth words which are defined as high-level words and
then generated at run-time when Tali Forth starts up, and user words in the file user_words.asm.

Tali has an unusually high number of native words in an attempt to make the Forth as fast as
possible on the 65c02. The first word in the list (the one that is checked first) is always drop,
the last one (the one checked for last) is always bye. The words which are (or are assumed to be)
used more than others come first. Since humans are slow, words that are used more interactively
like words come later.

The list of Forth words ends with the intro strings. This functions as a primitive form of a
self-test: If you see the string and only the string, the compilation of the Forth words worked.


6.3 Memory Map
Tali Forth 2 was developed with a simple 32 KiB RAM, 32 KiB ROM design.

    $0000  +-------------------+  ram_start, zpage, user0
           |  User variables   |
           +-------------------+
           |                   |
           |  ^  Data Stack    |  <-- dsp
           |  |                |
    $0078  +-------------------+  dsp0, stack
           |                   |
           |   (Reserved for   |
           |      kernel)      |
           |                   |
    $0100  +===================+
           |                   |
           |  ^  Return Stack  |  <-- rsp
           |  |                |
    $0200  +-------------------+  rsp0, buffer, buffer0
           |  |                |
           |  v  Input Buffer  |
           |                   |
    $0300  +-------------------+  cp0
           |  |                |
           |  v  Dictionary    |
           |       (RAM)       |
           |                   |
           ~~~~~~~~~~~~~~~~~~~~~  <-- cp
           |                   |
           |                   |
           +-------------------+
           |                   |
           |  ACCEPT history   |
           |                   |
    $7FFF  #####################  ram_end
    $8000  |                   |  forth, code0
           |                   |
           |                   |
           |    Tali Forth     |
           |     (24 KiB)      |
           |                   |
           |                   |
    $E000  +-------------------+
           |                   |
           |      Kernel       |
           |                   |
    $F000  +-------------------+  kernel_putc, kernel_getc
           |   I/O addresses   |
           +-------------------+
           |                   |
           |      Kernel       |
           |                   |
    $FFFA  +-------------------+
           |   65c02 vectors   |
    $FFFF  +-------------------+

Note that some of these values are hard-coded into the test suite; see the file definitions.txt
for details.


6.4 Input
Tali Forth 2, like Liara Forth, follows the ANSI input model with refill instead of older forms.

There are up to four possible input sources in Forth (see C&D p. 155):
1. The keyboard (`user input device')
2. A character string in memory
3. A block file
4. A text file
To check which one is being used, we first call blk which gives us the number of a mass
storage block being used, or 0 for the `user input device' (keyboard). In the second case, we use
SOURCE-ID to find out where input is coming from: 0 for the keyboard, -1 ($FFFF) for a string
in memory, and a number n for a file-id. Since Tali currently doesn't support blocks, we can skip
the blk instruction and go right to source-id.

One gotcha with Tali Forth's input is that currently it only sees spaces, but not other whitespace,
as delimiters. This means that Forth text files that are fed to Tali should not contain tabs. This
behavior might be changed in the future.
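The consequence of the space-only rule is easy to demonstrate. The Python function below is a deliberately naive sketch of splitting a line at spaces only; it is not how Tali's parser is written, and the name tali_words is invented for this example.

```python
def tali_words(line):
    """Split a line at spaces only, the way the text above describes.

    Tabs are not treated as delimiters, so they stay glued to the
    neighbouring characters.
    """
    return [word for word in line.split(" ") if word]


if __name__ == "__main__":
    print(tali_words("1 2 + ."))     # four separate words
    print(tali_words("1\t2 + ."))    # the tab fuses 1 and 2 into one word
```

A source file indented with tabs would therefore produce "words" that cannot be found in the dictionary, which is why converting tabs to spaces before feeding text to Tali is the safe option.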

6.4.1 Starting up
The initial commands after reboot flow into each other: cold to abort to quit. This is the same
as with pre-ANSI Forths. However, quit now calls refill to get the input. refill does different
things based on which of the four input sources (see above) is active:

Keyboard entry  This is the default. Get line of input via accept and return true even if the
input string was empty.
evaluate string  Return a false flag.
Input from a buffer  Not implemented at this time.
Input from a file  Not implemented at this time.


6.4.2 The Command Line Interface
Tali Forth accepts input lines of up to 256 characters. The address of the current input buffer is
stored in cib. The length of the current buffer is stored in ciblen; this is the address that
>in returns. source by default returns cib and ciblen as the address and length of the input
buffer.

6.4.3 evaluate
evaluate is used to execute commands that are in a string. A simple example would be:


s" 1 2 + ." evaluate

Tali Forth uses evaluate to load high-level Forth words from the file forth_words.asc and
extra, user-defined words from user_words.asc.

6.5 create/does>
create/does> is the most complex, but also most powerful part of Forth. Understanding how it

works in Tali Forth is important if you want to be able to modify the code. In this text, we walk
through the generation process for a subroutine threaded code (STC) such as Tali Forth. For a more
general take, see Brad Rodriguez' series of articles at http://www.bradrodriguez.com/papers/moving3.htm.
There is a discussion of this walkthrough at http://forum.6502.org/viewtopic.php?f=9&t=3153.
We start with the following standard example, the Forth version of constant:

: constant create , does> @ ;

We examine this in three phases or "sequences", following Rodriguez (based on [5]):


Sequence 1: Compiling the word constant
constant is a `defining word', one that makes new words. In pseudocode, and ignoring any
compilation to native 65c02 assembler, the above compiles to:

        jsr CREATE
        jsr COMMA
        jsr (DOES>)         ; from DOES>
a:      jsr DODOES          ; from DOES>
b:      jsr FETCH
        rts

To make things easier to explain later, we've added the labels `a' and `b' in the listing. does>
is an immediate word that adds not one, but two subroutine jumps, one to (does>) (see footnote 3)
and one to dodoes, which is a pre-defined system routine like dovar. We'll discuss those later.
In Tali Forth, a number of words such as defer are `hand-compiled', that is, instead of using
Forth such as

: defer create ['] abort , does> @ execute ;

we write an optimized assembler version ourselves (see the actual defer code). In these cases, we
need to use (does>) and dodoes instead of does> as well.


Sequence 2: Executing the word constant/creating life
Now when we execute

42 constant life

this pushes the rts of the calling routine (call it `main') to the 65c02's stack (the Return Stack,
as Forth calls it), which now looks like this:

(1) RTS                 ; to main routine

Without going into detail, the first two subroutine jumps of constant give us this word:

(Header "LIFE")
jsr DOVAR               ; in CFA, from LIFE's CREATE
4200                    ; in PFA (little-endian)

Next, we jsr to (does>). The address that this pushes on the Return Stack is the instruction
of constant we had labeled `a'.

(2) RTS to CONSTANT ("a")
(1) RTS to main routine

Now the tricks start. (does>) takes this address off the stack and uses it to replace the dovar
jsr target in the CFA of our freshly created life word. We now have this:

        (Header "LIFE")
        jsr a           ; in CFA, modified by (DOES>)
c:      4200            ; in PFA (little-endian)

Note we added a label `c'. Now, when (does>) reaches its own rts, it finds the rts to the
main routine on its stack. This is a Good Thing(TM), because it aborts the execution of the rest of
constant, and we don't want to do dodoes or fetch now. We're back at the main routine.

Sequence 3: Executing life
Now we execute the word life from our `main' program. In a STC Forth such as Tali Forth, this
executes a subroutine jump.

(Footnote 3: This example uses the word (does>), which in Tali Forth 2 is actually an internal
routine that does not appear as a separate word. This version is easier to explain.)


jsr LIFE
The first thing this call does is push the return address to the main routine on the 65c02's stack:

(1) RTS to main
The CFA of life executes a subroutine jump to label `a' in constant. This pushes the rts of
life on the 65c02's stack:


(2) RTS to LIFE ("c")
(1) RTS to main
This jsr to a lands us at the subroutine jump to dodoes, so the return address to constant
gets pushed on the stack as well. We had given this instruction the label `b'. After all of this, we
have three addresses on the 65c02's stack:

(3) RTS to CONSTANT ("b")
(2) RTS to LIFE ("c")
(1) RTS to main
dodoes pops address `b' off the 65c02's stack and puts it in a nice safe place on Zero Page, which
we'll call `z'. More on that in a moment. First, dodoes pops the rts to life. This is `c', the
address of the PFA of life, where we stored the payload of this constant. Basically, dodoes
performs a dovar here, and pushes `c' on the Data Stack. Now all we have left on the 65c02's
stack is the rts to the main routine.

(1) RTS to main

This is where `z' comes in, the location in Zero Page where we stored address `b' of constant.
Remember, this is where constant's own PFA begins, the fetch command we had originally
coded after does> in the very first definition. The really clever part: We perform an indirect
jmp (not a jsr!) to this address.


jmp (z)
Now constant's little payload program is executed, the subroutine jump to fetch. Since we
just put the PFA (`c') on the Data Stack, fetch replaces this by 42, which is what we were aiming
for all along. And since constant ends with a rts, we pull the last remaining address off the
65c02's stack, which is the return address to the main routine where we started. And that's all.
Put together, this is what we have to code:

does>: Compiles a subroutine jump to (does>), then compiles a subroutine jump to dodoes.

(does>): Pops the stack (address of subroutine jump to dodoes in constant), increase this by
one, replace the original dovar jump target in life.

dodoes: Pop stack (constant's PFA), increase address by one, store on Zero Page; pop stack
(life's PFA), increase by one, store on Data Stack; jmp to address we stored in Zero Page.
Remember we have to increase the addresses by one because of the way jsr stores the return
address for rts on the stack on the 65c02: It points to the third byte of the jsr instruction itself,
not the actual return address. This can be annoying, because it requires a sequence like:

        inc z
        bne +
        inc z+1
*
        (...)
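
The off-by-one can be checked with a small Python model. It is a sketch for illustration only: the function names are invented, and the addresses in the usage lines are arbitrary. The first two functions describe what jsr pushes and where rts continues, the third mirrors the inc z / bne + / inc z+1 sequence on a 16-bit pointer held as two bytes.

```python
def jsr_pushed_address(jsr_address):
    """Address that a jsr located at jsr_address pushes on the stack:
    the third (last) byte of the three-byte instruction."""
    return (jsr_address + 2) & 0xFFFF


def rts_target(pushed_address):
    """rts pulls the address and adds one before continuing."""
    return (pushed_address + 1) & 0xFFFF


def increment_pointer(lo, hi):
    """Model of `inc z / bne + / inc z+1` on a little-endian pointer."""
    lo = (lo + 1) & 0xFF
    if lo == 0:                 # bne + is not taken only on wrap-around
        hi = (hi + 1) & 0xFF
    return lo, hi


if __name__ == "__main__":
    pushed = jsr_pushed_address(0x07D8)
    print(hex(pushed), hex(rts_target(pushed)))
    print(increment_pointer(0xFF, 0x07))
```

Anything that takes a return address off the stack by hand, as (does>) and dodoes do, has to add that one itself before using the value as a real address.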

Note that with most words in Tali Forth, as any STC Forth, the distinction between PFA and CFA
is meaningless or at least blurred, because we go native anyway. It is only with words generated
by create/does> where this really makes sense.


6.6 Control Flow
6.6.1 Branches
For if/then, we need to compile something called a `conditional forward branch', traditionally
called 0branch (see footnote 4). Then, at run-time, if the value on the Data Stack is false (flag is
zero), the branch is taken (`branch on zero', therefore the name). Except that we don't have the
target of that branch yet; it will later be added by then. For this to work, we remember the address
after the 0branch instruction during the compilation of if. This is put on the Data Stack, so
that then knows where to compile its address in the second step. Until then, a dummy value is
compiled after 0branch to reserve the space we need (see footnote 5).

In Forth, this can be realized by

: if postpone 0branch here 0 , ; immediate

and

: then here swap ! ; immediate

Note then doesn't actually compile anything at the location in memory where it is at. Its job is
simply to help if out of the mess it created. If we have an else, we have to add an unconditional
branch and manipulate the address that if left on the Data Stack. The Forth for this is:

: else postpone branch here 0 , here rot ! ; immediate

Note that then has no idea what has just happened, and just like before compiles its address
where the value on the top of the Data Stack told it to, except that this value now comes from
else, not if.
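
The backpatching described above can be played through in Python. This is a simplified model, not Tali's implementation: the dictionary is a plain list, here is its current length, words are strings, and the function names are invented. The three steps follow the Forth definitions of if, else, and then given in this section.

```python
def compile_if_else_then(true_part, false_part=None):
    """Compile `if true_part [else false_part] then` into a flat list."""
    code = []    # dictionary memory under construction
    stack = []   # compile-time Data Stack

    # if: postpone 0branch here 0 ,
    code.append("0branch")
    stack.append(len(code))          # here: address of the dummy value
    code.append(0)                   # dummy branch target
    code.extend(true_part)

    if false_part is not None:
        # else: postpone branch here 0 , here rot !
        code.append("branch")
        else_slot = len(code)        # here
        code.append(0)               # dummy branch target
        code[stack.pop()] = len(code)  # patch the slot that if left behind
        stack.append(else_slot)      # then will patch this one instead
        code.extend(false_part)

    # then: here swap !
    code[stack.pop()] = len(code)
    return code


def run(code, flag):
    """Tiny inner interpreter to see which words would execute."""
    executed, pc = [], 0
    while pc < len(code):
        word = code[pc]
        if word == "0branch":
            pc = code[pc + 1] if flag == 0 else pc + 2
        elif word == "branch":
            pc = code[pc + 1]
        else:
            executed.append(word)
            pc += 1
    return executed


if __name__ == "__main__":
    compiled = compile_if_else_then(["A"], ["B"])
    print(compiled)
    print(run(compiled, -1), run(compiled, 0))
```

Note that the final patch is the same line of code with or without else, which is exactly the point made above: then does not need to know who left the address on the stack.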

6.6.2 Loops
Loops are far more complicated, because we have do, ?do, loop, +loop, unloop, and leave to
take care of. These can call up to three addresses: One for the normal looping action (loop/+loop),
one to skip over the loop at the beginning (?do) and one to skip out of the loop (leave).
Based on a suggestion by Garth Wilson, we begin each loop in run-time by saving the address
after the whole loop construct to the Return Stack. That way, leave and ?do know where to
jump to when called, and we don't interfere with any if/then structures. On top of that address,
we place the limit and start values for the loop.

The key to staying sane while designing these constructs is to first make a list of what we want
to happen at compile-time and what at run-time. Let's start with a simple do/loop.

do at compile-time:

• Remember current address (in other words, here) on the Return Stack (!) so we can later
  compile the code for the post-loop address to the Return Stack

• Compile some dummy values to reserve the space for said code

• Compile the run-time code; we'll call that fragment (do)

• Push the current address (the new here) to the Data Stack so loop knows where the loop
  contents begin

do at run-time:

• Take limit and start off Data Stack and push them to the Return Stack

Since loop is just a special case of +loop with an index of one, we can get away with considering
them at the same time.

(Footnote 4: Many Forths now use the words cs-pick and cs-roll instead of the branch variants, see
http://lars.nocrew.org/forth2012/rationale.html#rat:tools:CS-PICK. Tali Forth might switch to this
construction in the future.)
(Footnote 5: This section and the next one are based on a discussion at
http://forum.6502.org/viewtopic.php?f=9&t=3176, see there for more details. Another take on this
subject that handles things a bit differently is at
http://blogs.msdn.com/b/ashleyf/archive/2011/02/06/loopty-do-i-loop.aspx)

loop at compile time:

• Compile the run-time part (+loop)

• Consume the address that is on top of the Data Stack as the jump target for normal looping
  and compile it

• Compile unloop for when we're done with the loop, getting rid of the limit/start and
  post-loop addresses on the Return Stack

• Get the address on the top of the Return Stack which points to the dummy code compiled
  by do

• At that address, compile the code that pushes the address after the loop construct to the
  Return Stack at run-time

loop at run-time (which is (+loop)):

• Add loop step to count

• Loop again if we haven't crossed the limit, otherwise continue after loop

At one glance, we can see that the complicated stuff happens at compile-time. This is good,
because we only have to do that once for each loop.
In Tali Forth, these routines are coded in assembler. With this setup, unloop becomes simple
(six pla instructions: four for the limit/count of do, two for the address pushed to the stack just
before it) and leave even simpler (four pla instructions for the address).

6.7 Native Compiling
In a pure subroutine threaded code, higher-level words are merely a series of subroutine jumps.
For instance, the Forth word [char], formally defined in high-level Forth as

: [char] char postpone literal ; immediate

in assembler is simply

        jsr xt_char
        jsr xt_literal

as an immediate, compile-only word. There are two obvious problems with this method: First,
it is slow, because each jsr/rts pair consumes four bytes and 12 cycles overhead. Second, for
smaller words, the jumps use far more bytes than the actual code. Take for instance drop, which
in its naive form is simply

        inx
        inx

for two bytes and four cycles. If we jump to this word as is assumed with pure subroutine threaded
Forth, we add four bytes and 12 cycles: double the space and three times the time required by
the actual working code. (In practice, it's even worse, because drop checks for underflow. The
actual assembler code is

        cpx #dsp0-1
        bmi +
        lda #11         ; error code for underflow
        jmp error
*
        inx
        inx

for eleven bytes. We'll discuss the underflow checks further below.)
To get rid of this problem, Tali Forth supports native compiling. The system variable nc-limit
sets the threshold up to which a word will be included not as a subroutine jump, but as
machine language. Let's start with an example where nc-limit is set to zero, that is, all words
are compiled as subroutine jumps. Take a simple word such as

: aaa 0 drop ;

and check the actual code with see

see aaa
nt: 7CD xt: 7D8
size (decimal): 6

07D8  20 52 99 20 6B 88  ok

(The actual addresses might be different, this is from the ALPHA release). Our word aaa consists
of two subroutine jumps, one to zero and one to drop. Now, if we increase the threshold to 20,
we get different code, as this console session shows:

20 nc-limit ! ok
: bbb 0 drop ; ok
see bbb
nt: 7DF xt: 7EA
size (decimal): 17

07EA  CA CA 74 00 74 01 E0 77  30 05 A9 0B 4C C7 AC E8
07FA  E8 ok

Even though the definition of bbb is the same as aaa, we have totally different code: The number
0000 is pushed to the Data Stack (the first six bytes), then we check for underflow (the next nine
bytes), and finally we drop by moving X, the Data Stack Pointer. Our word is definitely longer,
but we have just saved 12 cycles.
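To read the two dumps without an opcode chart at hand, a throwaway disassembler is enough. The Python sketch below only knows the eight 65c02 opcodes that occur in the see output above and is not meant as a general tool; the names OPCODES, disassemble, AAA and BBB are made up for this example.

```python
# Only the opcodes found in the two dumps: opcode -> (mnemonic, operand bytes)
OPCODES = {
    0x20: ("jsr", 2),
    0x30: ("bmi", 1),
    0x4C: ("jmp", 2),
    0x74: ("stz zp,x", 1),
    0xA9: ("lda #", 1),
    0xCA: ("dex", 0),
    0xE0: ("cpx #", 1),
    0xE8: ("inx", 0),
}

AAA = "20 52 99 20 6B 88"
BBB = "CA CA 74 00 74 01 E0 77 30 05 A9 0B 4C C7 AC E8 E8"


def disassemble(hex_dump):
    """Turn a hex dump into a list of (mnemonic, operand) pairs.

    Operands are little-endian; instructions without an operand get None.
    """
    data = bytes.fromhex(hex_dump)
    listing = []
    i = 0
    while i < len(data):
        mnemonic, size = OPCODES[data[i]]
        operand = int.from_bytes(data[i + 1:i + 1 + size], "little") if size else None
        listing.append((mnemonic, operand))
        i += 1 + size
    return listing


if __name__ == "__main__":
    for name, dump in (("aaa", AAA), ("bbb", BBB)):
        print(name)
        for mnemonic, operand in disassemble(dump):
            print("   ", mnemonic, "" if operand is None else hex(operand))
```

For aaa this gives two jsr instructions; for bbb it shows the dex/dex and the two stz instructions that store a zero cell, the cpx #$77 underflow check, and the final inx/inx of drop.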
To experiment with various parameters for native compiling, the Forth word words&sizes is
included in user_words.fs (but commented out by default). The Forth is:

: words&sizes ( -- )
    latestnt
    begin
        dup
    0<> while
        dup name>string type space
        dup wordsize u. cr
        2 + @
    repeat
    drop ;

An alternative is see, which also displays the length of a word. One way or another, changing
nc-limit should show differences in the Forth words.

6.7.1 Return Stack special cases
There are a few words that cause problems with subroutine threaded code: Those that access the
Return Stack such as r>, >r, r@, 2r>, and 2>r. For them to work correctly, we first have to
remove the return address on the top of the stack, only to replace it again before we return to
the caller. This mechanism would normally prevent the word from being natively compiled at all,
because we'd try to remove a return address that doesn't exist.
This becomes clearer when we examine the code for r> (comments removed):

xt_r_from:
        pla
        sta tmptos
        ply
        ; --- CUT FOR NATIVE CODING ---
        dex
        dex
        pla
        sta 0,x
        pla
        sta 1,x
        ; --- CUT FOR NATIVE CODING ---
        phy
        lda tmptos
        pha
z_r_from:
        rts

The first three and last three instructions are purely for housekeeping with subroutine threaded
code. To enable this routine to be included as native code, they are removed when native compiling
is enabled by the word compile,. This leaves us with just the six actual instructions in the center
of the routine to be compiled into the new word.

6.7.2 Underflow stripping
As described above, every underflow check adds seven bytes to the word being coded. Stripping
this check by setting the uf-strip system variable to true simply removes these seven bytes
from new natively compiled words.
It is possible, of course, to have lice and fleas at the same time. For instance, this is the code
for >r:

xt_to_r:
        pla
        sta tmptos
        ply
        ; --- CUT HERE FOR NATIVE CODING ---
        cpx #dsp0-1
        bmi +
        jmp underflow
*
        lda 1,x
        pha
        lda 0,x
        pha
        inx
        inx
        ; --- CUT HERE FOR NATIVE CODING ---
        phy
        lda tmptos
        pha
z_to_r:
        rts

This word has both native compile stripping and underflow detection. However, both can be
removed from newly native code words, leaving only the eight byte core of the word to be compiled.
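
The byte counts can be verified by adding up the instruction sizes of the listing. The Python table below is a sketch for that purpose: the role labels are invented, and the two-byte size of the tmptos instructions assumes that tmptos lives in Zero Page, which the text does not state explicitly.

```python
# (instruction, size in bytes, role) for the >r listing
TO_R = [
    ("pla", 1, "housekeeping"),
    ("sta tmptos", 2, "housekeeping"),   # assumes tmptos is in Zero Page
    ("ply", 1, "housekeeping"),
    ("cpx #dsp0-1", 2, "underflow"),
    ("bmi +", 2, "underflow"),
    ("jmp underflow", 3, "underflow"),
    ("lda 1,x", 2, "core"),
    ("pha", 1, "core"),
    ("lda 0,x", 2, "core"),
    ("pha", 1, "core"),
    ("inx", 1, "core"),
    ("inx", 1, "core"),
    ("phy", 1, "housekeeping"),
    ("lda tmptos", 2, "housekeeping"),   # assumes tmptos is in Zero Page
    ("pha", 1, "housekeeping"),
]


def native_size(listing, uf_strip):
    """Bytes copied into a new word when the routine is compiled natively.

    The housekeeping outside the CUT markers is always dropped; the
    underflow check is dropped only if uf_strip is true.
    """
    keep = {"core"} if uf_strip else {"core", "underflow"}
    return sum(size for _, size, role in listing if role in keep)


if __name__ == "__main__":
    print(native_size(TO_R, uf_strip=False), native_size(TO_R, uf_strip=True))
```

The difference between the two results is the seven bytes of the underflow check, and the stripped version is the eight byte core mentioned above.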

6.8 cmove, cmove> and move
The three moving words cmove, cmove>, and move show subtle differences that can trip up new
users and are reflected by different code under the hood. cmove and cmove> are the traditional
Forth words that work on characters (which in the case of Tali Forth are bytes), whereas move is
a more modern word that works on address units (which in our case is also bytes).
If the source and destination regions show no overlap, all three words work the same. However, if
there is overlap, cmove and cmove> demonstrate a behavior called propagation or clobbering:
Some of the characters are overwritten. move does not show this behavior. This example shows
the difference:

create testbuf char a c, char b c, char c c, char d c, ( ok )
testbuf 4 type ( abcd ok )
testbuf dup char+ 3 cmove ( ok )
testbuf 4 type ( aaaa ok )

Note the propagation in the result. move, however, doesn't propagate. The last two lines
would be:

testbuf dup char+ 3 move ( ok )
testbuf 4 type ( aabc ok )

In practice, move is usually what you want to use.

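The difference comes down to the direction of the copy loop. The following Python functions are a behavioral sketch, not a translation of Tali's assembler: cmove copies from low to high addresses, cmove_up (standing in for cmove>) from high to low, and move is modelled as a copy that reads the whole source before writing.

```python
def cmove(mem, src, dst, count):
    """Copy byte by byte, starting at the lowest address."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def cmove_up(mem, src, dst, count):
    """Model of cmove>: copy byte by byte, starting at the highest address."""
    for i in reversed(range(count)):
        mem[dst + i] = mem[src + i]


def move(mem, src, dst, count):
    """Overlap-safe copy: the source slice is read before anything is written."""
    mem[dst:dst + count] = mem[src:src + count]


if __name__ == "__main__":
    for copy in (cmove, cmove_up, move):
        testbuf = bytearray(b"abcd")
        copy(testbuf, 0, 1, 3)      # testbuf dup char+ 3 <word>
        print(copy.__name__, testbuf.decode("ascii"))
```

With the destination one byte above the source, only the low-to-high loop propagates the first character; copying in the other direction would show the same effect when the destination lies below the source.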

Chapter 7

Developing

Programming computers can be crazy-making.
    Leo Brodie, Thinking Forth [3]


7.1 Adding new words
The simplest way to add new words to Tali Forth is to include them in the file
forth_code/user_words.fs. This is the place to put them for personal use.
To add words to the permanent set, it is best to start a pull request on the GitHub page of
Tali Forth. The use of git and GitHub is beyond the scope of this document; we'll just point out
that they are not as complicated as they look, and they make experimenting a lot easier.
Internally, Tali Forth 2 tends to follow a sequence of steps for new words:

• If it is an ANSI Forth word, review the standard online. In some cases, there is a reference
  implementation that can be used.

• Otherwise, check other sources for a high-level realization of the word. This can be Jonesforth
  or Gforth. A direct copy is usually not possible (or legally allowed, given different licenses),
  but gives hints for a Tali Forth version of the word.

• After the new word has been tested interactively, add a high-level version to the file
  forth_code/forth_words.

• Add tests for the new word to the test suite. Ideally, there will be test code included in the
  ANSI specification.

• If appropriate, convert the word to assembler, adding an entry to headers.asm and the
  code itself to native_words.asm. In this first step, it will usually be a simple sequence of
  JSR subroutine jumps to the existing native Forth words.

• If appropriate, rewrite all or some of the subroutine jumps in direct assembler.

Note that if you are contributing code, feel free to happily ignore this sequence and just submit
whatever is working.

7.2 Deeper changes
Tali Forth was not only placed in the public domain to honor the tradition of giving the code away
freely. It is also to let people play around with it and adapt it to their own machines. This is also
the reason it is (perversely) overcommented.
To work on the internals of Tali Forth, you will need the Ophis assembler.


7.2.1 The Ophis Assembler
Michael Martin's Ophis Cross-Assembler can be downloaded from http://michaelcmartin.github.io/Ophis/.
It uses a slightly different format than other assemblers, but is in Python and therefore will run
on almost any operating system. To install Ophis on Windows, use the link provided above. For
Linux:

git clone https://github.com/michaelcmartin/Ophis
cd Ophis/src
sudo python setup.py install

Switch to the folder where the Tali code lives, and run the Makefile with a simple make
command. This also updates the file listings in the docs folder.

Ophis has some quirks. For instance, you cannot use math symbols in label names, because it
will try to perform those operations. Use underscores for label names instead.

7.2.2 General notes
• The X register should not be changed without saving its pointer status.

• The Y register is free to be changed by subroutines. This means it should not be expected
  to survive subroutines unchanged.

• All words should have one point of entry (the xt_word link) and one point of exit at
  z_word. In many cases, this means a branch to an internal label done right before z_word.

• Because of the way native compiling works, the usual trick of combining jsr/rts pairs to
  a single jmp (usually) doesn't work.

7.2.3 Coding style
Until I get around to writing a tool for Ophis assembler code that formats the source file the way
gofmt does for Go (golang), I work with the following rules:

• Actual opcodes are indented by two tabs

• Tabs are eight characters long and converted to spaces

• Function-like routines are followed by a one-tab indented `function doc' based on the Python
  3 model: Three quotation marks at the start, three at the end in its own line, unless it is
  a one-liner. This should make it easier to automatically extract the docs for them at some
  point.

• The native words have a special commentary format that allows the automatic generation of
  word list by a tool in the tools folder, see there for details.

• Assembler mnemonics are lower case. I get enough uppercase insanity writing German, thank
  you very much.

• Hex numbers are also lower case, such as $FFFE

• Numbers in mnemonics are as stripped-down as possible to reduce visual clutter: lda 0,x
  instead of lda $00,x.

• Comments are included like popcorn to help readers who are new both to Forth and 6502
  assembler.

7.3 Code Cheat Sheet
7.3.1 The Stack Drawing
This is your friend and should probably go on your wall or something.

               +--------------+
               |          ... |
               +-            -+
               |              |   ...
               +-  (empty)   -+
               |              |  FE,X
               +-            -+
         ...   |              |  FF,X
               +==============+
        $0076  |           LSB|  00,X   <-- DSP (X Register)
               +-    TOS     -+
        $0077  |           MSB|  01,X
               +==============+
        $0078  |  (garbage)   |  02,X   <-- DSP0
               +--------------+
        $0079  |              |  03,X
               + (floodplain) +
        $007A  |              |  04,X
               +--------------+

7.3.2 Coding idioms
While coding a Forth, there are certain assembler fragments that get repeated over and over again.
These could be included as macros, but that can make the code harder to read for somebody only
familiar with basic assembly.
Some of these fragments could be written in other variants, such as the `push value' version,
which could increment the DSP twice before storing a value. We try to keep these in the same
sequence (a "dialect" or "code mannerism" if you will) so we have the option of adding code
analysis tools later.

drop cell of top of the Data Stack

        inx
        inx

push a value to the Data Stack. Remember the Data Stack Pointer (DSP, the X register of the
65c02) points to the LSB of the TOS value.

        dex
        dex
        lda $<LSB>      ; or pla, jsr kernel_getc, etc.
        sta 0,x
        lda $<MSB>      ; or pla, jsr kernel_getc, etc.
        sta 1,x

pop a value off the Data Stack

        lda 0,x
        sta $<LSB>      ; or pha, jsr kernel_putc, etc
        lda 1,x
        sta $<MSB>      ; or pha, jsr kernel_putc, etc
        inx
        inx
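
For readers who think better in a high-level language, the three idioms can be mirrored in Python. The class below is a sketch, not part of Tali Forth: Zero Page is a 256-byte array, the attribute x plays the X register, and the start value $78 for dsp0 is an assumption taken from the stack drawing above.

```python
class DataStack:
    """Model of the Data Stack in Zero Page with X as the stack pointer."""

    def __init__(self, dsp0=0x78):
        self.zp = bytearray(256)   # Zero Page
        self.x = dsp0              # X register, empty stack

    def push(self, value):
        self.x = (self.x - 2) & 0xFF                      # dex, dex
        self.zp[self.x] = value & 0xFF                    # sta 0,x (LSB)
        self.zp[(self.x + 1) & 0xFF] = (value >> 8) & 0xFF  # sta 1,x (MSB)

    def pop(self):
        lsb = self.zp[self.x]                             # lda 0,x
        msb = self.zp[(self.x + 1) & 0xFF]                # lda 1,x
        self.x = (self.x + 2) & 0xFF                      # inx, inx
        return lsb | (msb << 8)

    def drop(self):
        self.x = (self.x + 2) & 0xFF                      # inx, inx


if __name__ == "__main__":
    stack = DataStack()
    stack.push(0x1234)
    print(hex(stack.x), hex(stack.zp[0x76]), hex(stack.zp[0x77]))
    print(hex(stack.pop()), hex(stack.x))
```

As in the assembler, drop and pop leave the old bytes in memory; they become the `garbage' shown in the drawing.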

7.3.3 vi shortcuts
One option for these is to add abbreviations to your favorite editor, which should of course be vim,
because vim is cool. There are examples for that further down. They all assume that auto-indent
is on and we are two tabs in with the code, and use # at the end of the abbreviation to keep them
separate from the normal words. My ~/.vimrc file contains the following lines for work on .asm
files:


ab drop# inx<tab><tab>; drop<cr>inx
ab push# dex<tab><tab>; push<cr>dex<cr>lda $<LSB><cr>sta $00,x<cr>lda $<MSB><cr>sta $01,x
ab pop# lda $00,x<tab>; pop<cr>sta $<LSB><cr>lda $01,x<cr>sta $<MSB><cr>inx<cr>inx

Chapter 8

Future plans
(See the file TODO.txt)


Appendix A

FAQ
A.1 What happened to Tali Forth 1?
Tali Forth 1, formerly just Tali Forth, was my first Forth. As such, it is fondly remembered as a
learning experience. You can still find it online at GitHub at https://github.com/scotws/TaliForth.
When Tali Forth 2 entered BETA, Tali Forth was discontinued and does not receive bug fixes either.

A.2 Why does Tali Forth take so long to start up?
After the default kernel string is printed, you'll notice a short pause that didn't occur with Tali
Forth 1. This is because Tali Forth 2 has more words defined in high-level Forth (see
forth-words.asm) than Tali did. The pause happens because they are being compiled on the fly.

A.3 Why `Tali' Forth?
I like the name, and we're probably not going to have any more kids I can give it to.
(If it sounds vaguely familiar, you're probably thinking of Tali'Zorah vas Normandy, a character
in the `Mass Effect' universe created by EA/BioWare. This software has absolutely nothing to do
with either the game or the companies and neither do I, except that I've played the games and
enjoyed them, though I do have some issues with Andromeda. Like what happened to the quarian
ark?)

A.4 Who is `Liara'?
Liara Forth is a STC Forth for the big sibling of the 6502, the 65816. Tali Forth 1 came first, then
I wrote Liara with that knowledge and learned even more, and now Tali 2 is so much better for
the experience. Oh, and it's another Mass Effect character.


Appendix B

Testing Tali Forth
Tali Forth 2 comes with a test suite in the tests folder. It is based on code by John Hayes and
was first adapted by Sam Colwell. To run it, switch to that folder and start the talitest.py
program with Python3. The tests can take around half an hour to run and produce a lot of output,
including, at the end, a list of words that didn't work. A detailed list of results is saved to the file
results.txt. Talitest.py requires the pexpect package.

(During development, testing was done by hand with a list of words that has since been placed
in the old folder.)


Appendix C

Thanks
Tali Forth would never have been possible without the help of a very large number of people, very
few of whom I have actually even met.
First, there is the crew at 6502.org who not only helped me build a 6502 computer, but also introduced me to Forth. Tali Forth would not exist without their inspiration, support, and feedback.
Thanks, guys.
Special thanks go out to Mike Barry and Lee Pivonka, who both suggested vast improvements
to the code in size, structure, and speed.


Bibliography
[1] Federico Biancuzzi, Masterminds of Programming, O'Reilly Media, 1st edition, 2009.
[2] Leo Brodie, Starting Forth, New edition 2003, https://www.forth.com/starting-forth/
[3] Leo Brodie, Thinking Forth, 1984, http://thinking-forth.sourceforge.net/#21CENTURY
[4] Edward K. Conklin, Elizabeth D. Rather, Forth Programmer's Handbook, 3rd Edition 2010.
[5] Mitch Derick, Linda Baker, Forth Encyclopedia, Mountain View Press 1982.
[6] David Eyes, Ron Lichty, Programming the 65816, including the 6502, 65C02 and 65802
    (Currently not available from the WDC website)
[7] Guy Kelly, `Forth Systems Comparisons', Forth Dimensions V13N6 March/April 1992,
    http://www.forth.org/fd/FD-V13N6.pdf
[8] Lance A. Leventhal, 6502 Assembly Language Programming, OSBORNE/McGRAW-HILL 1979.
[9] J.V. Noble, A Beginner's Guide to Forth,
    http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm.

40



