Original article appeared in Forth Dimensions Volume IV, Issue 6

Other articles in this series: Laxen meta-compiling two and Laxen meta-compiling three

Meta Compiling I

Henry Laxen

Meta-compiling is an often-heard term in Forth, and yet most people approach it with fear and anxiety. This is unfortunate, since it is really not that difficult, and it is extremely powerful. Many purposes have been attributed to meta-compiling, such as generating new Forth systems, creating a custom application, cross compiling code for a different target machine, removing the names (headers) from the code, and generating ROMable code. All of these are benefits of the meta-compiling process, but meta-compiling may or may not be the only way to accomplish each of them. For example, FIG allowed people to create new Forth systems by supplying assembly language listings of Forth, which people could enter into their computers and assemble with their own assemblers. No meta-compiler ever entered the picture. Before exploring meta-compiling in detail, let’s first look at the dictionary definition of the word META.

meta a prefix meaning 1. changed, transposed [metamorphosis, metathesis]; 2. after, beyond, higher [metaphysics]

Meta-compiling in Forth combines attributes of both of the above definitions. It occurs at a “higher level” than ordinary compiling and involves a change from one environment to another. In one sentence, meta-compiling in Forth is a process in which Forth code is compiled in one environment and executed in another. The environment in which the code is compiled is called the host system. The environment in which the code compiled by the meta-compiler will finally execute is called the target system. One of the main difficulties encountered in meta-compiling is the confusion that naturally arises out of the interactions between the different environments. Many words in the meta-compiler have totally different meanings depending on the context in which they are used.

In the first part of this exposition we will look in detail at one of the central issues of meta-compiling, namely storage allocation. We will leave the issue of context for the next article. Think of a meta-compiler as a machine into which Forth source code is cranked and out of which target object code is cranked. In any software project, one of the main resource allocation problems is how to allocate memory. The same is true in meta-compiling, and this article will address the issue of memory allocation for meta-compilers. The problem, then, is to set aside a region in which the target image can reside, and to find a convenient way of manipulating that image. Instead of reinventing the wheel, let’s do it the way Forth does it. Forth has a set of words that read and write memory, as well as allocate and initialize space in the dictionary. Presumably we will need the same functions for the target image. The difference is that while the ordinary Forth words that read and write memory, namely @ and !, operate on host addresses, our new read and write words will have to operate on target addresses. What we need is a word which maps target addresses into host addresses. Let’s call this word THERE; it must behave as follows:

  THERE ( target-address -- host-address )

Using THERE, we can define the read and write memory words as

  : @-T   THERE @ ;
  : !-T   THERE ! ;
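
The byte-sized variants follow the same pattern; a minimal sketch:

  : C@-T   THERE C@ ;
  : C!-T   THERE C! ;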

We append the -T suffix to indicate that we are fetching and storing into a target address; C@-T and C!-T, sketched above, are defined in the same way. Next we want to implement something analogous to a dictionary in the target system. The amount of space that has been allocated in an ordinary Forth system is held in a variable called DP.

We can analogously define a variable called DP-T to hold the amount of space allocated in the target system. Armed with that definition we can define the dictionary words as follows:

  VARIABLE DP-T   0 DP-T !   ( next free target address)
  : HERE-T   DP-T @ ;
  : ALLOT-T   DP-T +! ;
  : ,-T   HERE-T !-T  2 ALLOT-T ;

(Why is there an ordinary @ in the definition of HERE-T and a !-T in the definition of ,-T?)
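
One more word will prove useful shortly: a byte-sized analogue of ,-T, built on the same pattern. A minimal sketch:

  : C,-T   HERE-T C!-T  1 ALLOT-T ;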

Why have we gone through such an elaborate ritual? Let’s take a quick look at what we can do with these words. Perhaps you recall how Forth assemblers work. (If not, wait for a future issue and I will discuss them in this column.) The main idea behind Forth assemblers is that you define a set of Forth words whose names are the opcodes for your particular machine. When these words are executed, they assemble their machine language binary opcode into the dictionary, along with whatever parameters are required. For example, the jump instruction on the 8080 is a hex C3 byte followed by the 16-bit address of where to jump to. The JMP word in the Forth assembler is thus defined as:

  : JMP   C3 C, , ;

The C, assembles the opcode into the dictionary and the , assembles the address that must have been left on the stack. Notice that the compiled code is inline in the dictionary. Using the -T definitions above, we can now assemble code which will execute from a different address than the one at which it was assembled. The corresponding definition for jump would be:

  : JMP   C3 C,-T ,-T ;

This would assemble the opcode at the next available location in the target system, not in the host system. Furthermore, the jump will go to the specified target address when it is executed, not to a host address. What we have done is turn a Forth assembler, which assembles inline code words, into a cross assembler, which assembles code that will execute in an environment other than Forth.
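
To make this concrete, here is a hypothetical sequence using the words defined above (a HEX base is assumed, as in the assembler examples):

  0 DP-T !    ( begin assembling at target address 0)
  0100 JMP    ( target image now holds C3 00 01 on a byte-swapped host)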

If the significance of what has just been discussed has escaped you, don’t feel bad. It escaped me the first six times also. Don’t be fooled by the simplicity of the implementation. The mere fact that we can assemble or compile code in a different memory area than the one we are executing out of is very powerful. It is one of the cornerstones of the meta-compiling process.

It now remains only to define the mapping word THERE, which takes a target address and returns a host address. The simplest approach, if you have enough user memory, is simply to define THERE to add a constant offset, as follows:

  20000 CONSTANT TARGET-OFFSET
  : THERE   TARGET-OFFSET + ;
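
With this mapping in place, the -T words behave exactly like their host counterparts, merely displaced by the offset; for example:

  1234 0 !-T   ( stores 1234 at host address 20000)
  0 @-T .      ( prints 1234)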

You can’t get much simpler than that. However, there are times when memory is tight or the application program is just too large to fit. What do you do then? In most other programming languages you either give up or start the entire application over from scratch. We in Forth have the luxury of redefining a few words, and the rest of the application will never know the difference. Let’s take a look at how we can provide a mapping from target to host addresses without taking up any room in the host dictionary. The answer, of course, is to use BLOCK as a means of mapping target memory addresses onto disk. Consider the following:

  10 CONSTANT TARGET-BLOCK
  : THERE ( target-addr -- host-addr)
     1024 /MOD  TARGET-BLOCK +  BLOCK + ;

We first divide by 1024, the number of bytes per block, and get back a quotient and a remainder; /MOD leaves the remainder beneath the quotient. The quotient is the block number, and the remainder is the byte index into that block. All we have to do is add in the beginning block number, TARGET-BLOCK, and call our friend BLOCK to perform the mapping of a block number into a buffer address. Finally, we add the byte index to the returned address and we are done. Or are we? There are two bugs in the above code as it relates to meta-compiling. See if you can find them.

The first bug will probably not bite you for a while, but when it does it will produce very dramatic results, and it will be obvious how to fix it. The problem is that when dealing with addresses, you should be very careful about what kind of arithmetic you perform. Addresses are unsigned quantities, while /MOD deals with signed quantities. The above code works fine as long as the target address is less than 32K. As soon as it is larger, the address looks negative to /MOD, which returns a signed quotient and remainder, and we end up passing BLOCK a very strange block number. I will leave it to you to rewrite THERE to avoid this 32K problem.
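
One possible rewrite, assuming your system provides the unsigned divide U/ ( ud u -- urem uquot ), as figForth does (Forth-83 systems spell it UM/MOD):

  : THERE ( target-addr -- host-addr)
     0 1024 U/   ( extend to an unsigned double, then divide)
     TARGET-BLOCK +  BLOCK + ;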

The second bug is far more subtle, and in fact does not lie in the word THERE at all. You don’t discover this one until you have crashed many, many times. Recall the definitions of @-T and !-T:

  : @-T   THERE @ ;
  : !-T   THERE ! ;

Well, 1023 out of 1024 times this will work just fine. You see, if we call THERE with a target address that is congruent to 1023 modulo 1024, then THERE will return the address of the last byte in a block buffer. Since @ and ! act on 2-byte, 16-bit entities, the second byte of the access falls outside the buffer, and the wrong results will be read or written. The rule of usage for THERE is that it takes a target byte address and returns a host byte address. Only a single byte address is returned; there is no guarantee that target address + 1 maps into host address + 1. That is a false assumption on the user’s part. Anyway, how do we fix it? It really is quite simple; namely, we must construct @-T and !-T out of C@-T and C!-T, which access only one byte at a time and therefore never straddle a buffer boundary.

At this point we get into a small mess because many microcomputers are “byte-swapped”, meaning that for a 16-bit word, the low order 8-bit half is stored first. The 8080 and 6502 are prime examples of byte-swapped machines. The newer 68000 processor is an example of a non-byte-swapped machine. Anyway, to construct @-T and !-T out of C@-T and C!-T we must be aware of the byte swapping. Let’s suppose we are on a byte-swapped machine, and take a look at how to implement @-T. I will leave the implementation of !-T as an exercise. Consider:

  : @-T ( target-addr -- value)
     DUP C@-T        ( addr low)
     SWAP 1 + C@-T   ( low high)
     256 * +         ( value) ;

Notice that only C@-T is used, so our rule of usage is not violated. This is rather slow on most machines because of the multiply, but it will certainly work. What would be nicer is to define a CODE word, say FLIP, which exchanges the high and low halves of a 16-bit word. Then we could replace the 256 * phrase with FLIP, and it would be much faster. If the machine were not byte-swapped, then we would place the FLIP or the 256 * after the first C@-T instead of the second. See if you can implement !-T in an analogous way.
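
For reference, here is one way the exercise might come out, again on a byte-swapped machine and again assuming the unsigned divide U/ so that values with the high bit set split correctly:

  : !-T ( value target-addr --)
     OVER OVER C!-T        ( value addr     low byte to addr)
     1 +  SWAP             ( addr+1 value)
     0 256 U/  SWAP DROP   ( addr+1 high    unsigned upper byte)
     SWAP C!-T ;           (                high byte to addr+1)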

What we have really done is implement a disk-resident virtual memory system. It turns out to be very useful in many applications, not just meta-compilation. Any time you need a very large array that will not fit in memory, the same technique will work. For instance, a hypothetical array of 16-bit cells could ride on the words we just defined:
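
  : ARRAY@ ( index -- value)    2 * @-T ;
  : ARRAY! ( value index -- )   2 * !-T ;

Next time we will look deeper into the meta-compiling process and address the issue of how to actually generate the target image code, now that we have a place to put it. Until then, good luck and may the Forth be with you.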


Copyright © 1983 by Henry Laxen. All rights reserved.
