\ A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
\ By Richard W.M. Jones <rich@annexia.org> http://annexia.org/forth
\ This is PUBLIC DOMAIN (see public domain release statement below).
-\ $Id: jonesforth.f,v 1.14 2007-10-10 13:01:05 rich Exp $
+\ $Id: jonesforth.f,v 1.18 2009-09-11 08:32:33 rich Exp $
\
\ The first part of this tutorial is in jonesforth.S. Get if from http://annexia.org/forth
\
\ SPACE prints a space
: SPACE BL EMIT ;
-\ The 2... versions of the standard operators work on pairs of stack entries. They're not used
-\ very commonly so not really worth writing in assembler. Here is how they are defined in FORTH.
-: 2DUP OVER OVER ;
-: 2DROP DROP DROP ;
-
\ NEGATE leaves the negative of a number on the stack.
: NEGATE 0 SWAP - ;
( Some more complicated stack examples, showing the stack notation. )
: NIP ( x y -- y ) SWAP DROP ;
-: TUCK ( x y -- y x y ) DUP ROT ;
+: TUCK ( x y -- y x y ) SWAP OVER ;
: PICK ( x_u ... x_1 x_0 u -- x_u ... x_1 x_0 x_u )
1+ ( add one because of 'u' on the stack )
4 * ( multiply by the word size )
SWAP ( width u )
DUP ( width u u )
UWIDTH ( width u uwidth )
- -ROT ( u uwidth width )
+ ROT ( u uwidth width )
SWAP - ( u width-uwidth )
( At this point if the requested width is narrower, we'll have a negative number on the stack.
Otherwise the number on the stack is the number of spaces to print. But SPACES won't print
DUP 0< IF
NEGATE ( width u )
1 ( save a flag to remember that it was negative | width n 1 )
- ROT ( 1 width u )
- SWAP ( 1 u width )
+ SWAP ( width 1 u )
+ ROT ( 1 u width )
1- ( 1 u width-1 )
ELSE
0 ( width u 0 )
- ROT ( 0 width u )
- SWAP ( 0 u width )
+ SWAP ( width 0 u )
+ ROT ( 0 u width )
THEN
SWAP ( flag width u )
DUP ( flag width u u )
UWIDTH ( flag width u uwidth )
- -ROT ( flag u uwidth width )
+ ROT ( flag u uwidth width )
SWAP - ( flag u width-uwidth )
SPACES ( flag u )
: ? ( addr -- ) @ . ;
( c a b WITHIN returns true if a <= c and c < b )
+( or define without ifs: OVER - >R - R> U< )
: WITHIN
- ROT ( b c a )
+ -ROT ( b c a )
OVER ( b c a c )
<= IF
> IF ( b c -- )
Notice that the parameters to DUMP (address, length) are compatible with string words
such as WORD and S".
+
+ You can dump out the raw code for the last word you defined by doing something like:
+
+ LATEST @ 128 DUMP
)
: DUMP ( addr len -- )
- BASE @ ROT ( save the current BASE at the bottom of the stack )
+ BASE @ -ROT ( save the current BASE at the bottom of the stack )
HEX ( and switch to hexadecimal mode )
BEGIN
CR
DUP 1- 15 AND 1+ ( addr len linelen )
- DUP ( addr len linelen linelen )
- ROT ( addr linelen len linelen )
+ TUCK ( addr linelen len linelen )
- ( addr linelen len-linelen )
- ROT ( len-linelen addr linelen )
- + ( len-linelen addr+linelen )
- SWAP ( addr-linelen len-linelen )
+ >R + R> ( addr+linelen len-linelen )
REPEAT
DROP ( restore stack )
: R/W ( -- fam ) O_RDWR ;
: OPEN-FILE ( addr u fam -- fd 0 (if successful) | c-addr u fam -- fd errno (if there was an error) )
- ROT ( fam addr u )
+ -ROT ( fam addr u )
CSTRING ( fam cstring )
SYS_OPEN SYSCALL2 ( open (filename, flags) )
DUP ( fd fd )
: CREATE-FILE ( addr u fam -- fd 0 (if successful) | c-addr u fam -- fd errno (if there was an error) )
O_CREAT OR
O_TRUNC OR
- ROT ( fam addr u )
+ -ROT ( fam addr u )
CSTRING ( fam cstring )
- 420 ROT ( 0644 fam cstring )
+ 420 -ROT ( 0644 fam cstring )
SYS_OPEN SYSCALL3 ( open (filename, flags|O_TRUNC|O_CREAT, 0644) )
DUP ( fd fd )
DUP 0< IF ( errno? )
;
: READ-FILE ( addr u fd -- u2 0 (if successful) | addr u fd -- 0 0 (if EOF) | addr u fd -- u2 errno (if error) )
- ROT SWAP -ROT ( u addr fd )
+ >R SWAP R> ( u addr fd )
SYS_READ SYSCALL3
DUP ( u2 u2 )
updates the header so that the codeword isn't DOCOL, but points instead to the assembled
code (in the DFA part of the word).
- We provide a convenience macro NEXT (you guessed the rest).
+ We provide a convenience macro NEXT (you guessed what it does). However you don't need to
+ use it because ;CODE will put a NEXT at the end of your word.
The rest consists of some immediate words which expand into machine code appended to the
definition of the word. Only a very tiny part of the i386 assembly space is covered, just
enough to write a few assembler primitives below.
)
+HEX
+
+( Equivalent to the NEXT macro )
+: NEXT IMMEDIATE AD C, FF C, 20 C, ;
+
: ;CODE IMMEDIATE
+ [COMPILE] NEXT ( end the word with NEXT macro )
ALIGN ( machine code is assembled in bytes so isn't necessarily aligned at the end )
LATEST @ DUP
HIDDEN ( unhide the word )
[COMPILE] [ ( go back to immediate mode )
;
-HEX
-
-( Equivalent to the NEXT macro )
-: NEXT IMMEDIATE AD C, FF C, 20 C, ;
-
( The i386 registers )
: EAX IMMEDIATE 0 ;
: ECX IMMEDIATE 1 ;
RDTSC ( writes the result in %edx:%eax )
EAX PUSH ( push lsb )
EDX PUSH ( push msb )
- NEXT
;CODE
(
+ INLINE can be used to inline an assembler primitive into the current (assembler)
+ word.
+
+ For example:
+
+ : 2DROP INLINE DROP INLINE DROP ;CODE
+
+ will build an efficient assembler word 2DROP which contains the inline assembly code
+ for DROP followed by DROP (eg. two 'pop %eax' instructions in this case).
+
+ Another example. Consider this ordinary FORTH definition:
+
+ : C@++ ( addr -- addr+1 byte ) DUP 1+ SWAP C@ ;
+
+ (it is equivalent to the C operation '*p++' where p is a pointer to char). If we
+ notice that all of the words used to define C@++ are in fact assembler primitives,
+ then we can write a faster (but equivalent) definition like this:
+
+ : C@++ INLINE DUP INLINE 1+ INLINE SWAP INLINE C@ ;CODE
+
+ One interesting point to note is that this "concatenative" style of programming
+ allows you to write assembler words portably. The above definition would work
+ for any CPU architecture.
+
+ There are several conditions that must be met for INLINE to be used successfully:
+
+ (1) You must be currently defining an assembler word (ie. : ... ;CODE).
+
+ (2) The word that you are inlining must be known to be an assembler word. If you try
+ to inline a FORTH word, you'll get an error message.
+
+ (3) The assembler primitive must be position-independent code and must end with a
+ single NEXT macro.
+
+ Exercises for the reader: (a) Generalise INLINE so that it can inline FORTH words when
+ building FORTH words. (b) Further generalise INLINE so that it does something sensible
+ when you try to inline FORTH into assembler and vice versa.
+
+ The implementation of INLINE is pretty simple. We find the word in the dictionary,
+ check it's an assembler word, then copy it into the current definition, byte by byte,
+ until we reach the NEXT macro (which is not copied).
+)
+HEX
+: =NEXT ( addr -- next? )
+ DUP C@ AD <> IF DROP FALSE EXIT THEN
+ 1+ DUP C@ FF <> IF DROP FALSE EXIT THEN
+ 1+ C@ 20 <> IF FALSE EXIT THEN
+ TRUE
+;
+DECIMAL
+
+( (INLINE) is the lowlevel inline function. )
+: (INLINE) ( cfa -- )
+ @ ( remember codeword points to the code )
+ BEGIN ( copy bytes until we hit NEXT macro )
+ DUP =NEXT NOT
+ WHILE
+ DUP C@ C,
+ 1+
+ REPEAT
+ DROP
+;
+
+: INLINE IMMEDIATE
+ WORD FIND ( find the word in the dictionary )
+ >CFA ( codeword )
+
+ DUP @ DOCOL = IF ( check codeword <> DOCOL (ie. not a FORTH word) )
+ ." Cannot INLINE FORTH words" CR ABORT
+ THEN
+
+ (INLINE)
+;
+
+HIDE =NEXT
+
+(
NOTES ----------------------------------------------------------------------
DOES> isn't possible to implement with this FORTH because we don't have a separate