Improved serial port for the Replica 1 TE

Overview

The Briel Computers Replica 1 is a functional clone of the Apple 1 computer created by Steve Wozniak in 1976.

The Replica 1 TE ("third edition") includes a number of improvements to the original Apple 1 hardware including an RS232 port.

Unfortunately, in the standard implementation the serial port is very slow (2,400 baud) and unreliable, often dropping input characters. Some implementations (link 1, link 2) have improved this to 19,200 baud by adding improved hardware, but they require software changes to the Woz monitor and to every other application that talks directly to the 6821 (Krusader, BASIC, etc.) because they provide a new serial interface rather than fixing the existing one.

This page sets out my solution to this problem, which gives reliable input and output at up to 115,200 baud with full hardware handshaking using the original 6821 (and thus requires no software changes). I also make an attempt to explain the problem itself.

If you're already familiar with the Apple 1 and Replica 1 you might want to skip ahead to the good bit or, if you're very impatient, to the instructions.

The Apple 1

The Apple 1 was an early personal computer, one of the first.

The Apple 1 was a really simple machine by modern standards. It had a MOS Technology 6502 processor, a Motorola 6821 peripheral interface adapter, a tiny (256 byte!) ROM which contained the "Woz monitor" program, 4KB of RAM, some glue logic, and an innovative video output circuit to draw characters on a standard television. Input was via an ASCII keyboard. The RAM could be expanded up to 48KB (although this much RAM would have cost thousands of dollars in 1976). The machine originally cost $666.66 and around 200 were made. So few original machines survive today (around 30–50) that they now change hands for large sums: in November 2010 one sold for £133,250 at a Christie's auction.

With no mass storage, one had to load software into the machine by typing it into the "Woz monitor", the program that occupies the 256 bytes of ROM at the top of the machine's memory. It allows the user to program values into the machine's memory, read values from the memory, and have the processor jump and start executing at a particular location. So for example, to tell the monitor to program sixteen bytes at memory locations 6000 through 600F (all numbers are expressed in hexadecimal) you would type in:

6000: 4C A5 7C 4C EA 62 CD 66
6008: DB 65 A8 6A 90 67 63 69

Typically when you want to load some software you type it into the monitor using this syntax, which loads the machine code into RAM, and then enter something like 6000R to tell the processor to start executing the program you just loaded into RAM. Given that programs can be a few thousand bytes long this can be a lot of typing!
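
To save the typing, a text file in this syntax can be generated from a binary image. The following is a minimal Python sketch of that idea, not a tool from this page: the function names and the choice of eight bytes per line are assumptions made for illustration.

```python
def woz_monitor_lines(start, data, per_line=8):
    """Format bytes as Woz monitor entry lines such as '6000: 4C A5 ...'."""
    lines = []
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{start + offset:04X}: {hex_bytes}")
    return lines


def run_command(address):
    """Monitor command that starts execution at the given address."""
    return f"{address:04X}R"


# The sixteen example bytes shown above, loaded at 6000 (hex).
program = bytes.fromhex("4CA57C4CEA62CD66DB65A86A90676369")

for line in woz_monitor_lines(0x6000, program):
    print(line)
print(run_command(0x6000))
```

Run as written, this prints the two example lines from above followed by 6000R.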

An optional audio cassette interface board was available which included another small ROM program which could modulate bytes read from memory into an audio signal, or demodulate an audio signal into bytes which were then stored in memory. This audio signal was recorded or played back from audio cassette. By modern standards this is somewhat archaic.

The Replica 1

The Replica 1 is a functional clone of the Apple 1 using a contemporary microcontroller to re-implement the input/output subsystem using fewer chips and adding some contemporary I/O interfaces.

The Replica 1 uses the original 6502 and 6821 combination, but expands the RAM to 32KB and the ROM to 8KB. The extra space in the ROM contains a copy of Apple I BASIC (originally distributed on tape) and Ken Wessen's Krusader interactive symbolic assembler. The original Woz monitor code is still present.

The original Apple 1 video output circuitry, which used a large number of simple chips, is replaced by a single Parallax Propeller chip which is basically a microcontroller with 8 cores executing in parallel. The propeller chip generates composite video output for a television and also accepts input from a modern PS/2 keyboard and translates it to the ASCII keyboard protocol that the Apple 1 expects. Finally it includes an RS232 serial port which one can use as a user interface instead of the video monitor and keyboard.

Uploading over serial

It's fun to type your first program into the Woz monitor and see it do something, but it goes rapidly downhill from there. No-one wants to sit at a keyboard typing thousands of keystrokes into the Woz monitor — it takes hours and hours and is a process prone to human error. Plus it's very dull. Four see, ay five, seven see, four see, eee ay, six two, see dee, six six. Zzzzz.

The RS232 serial port is very useful for uploading software through the Woz monitor. With the Replica 1 you can simply upload a text file containing the software through the serial port, and the propeller chip makes the characters received over serial look like keypresses.

Problem solved! Except ... it doesn't work very well. Even plodding along at 2,400 baud one very quickly runs into a problem with sending data into the machine — it quite frequently "misses" characters. On a modern PC, every keystroke generates an interrupt that tells the processor to stop doing whatever it's doing, read the keypress and buffer it somewhere, and then carry on as before.

The Apple 1 doesn't use interrupts, though. Instead it sits in a tight loop waiting for a keypress to arrive. Here's the relevant code from the Woz monitor:

NEXTCHAR        LDA     $D011           Load keyboard input status register
                BPL     NEXTCHAR        No key yet? Jump back and try again
                LDA     $D010           Read character from input register

When a keypress does arrive, the 6502 goes off and does some processing in reaction to it. While it's processing it isn't listening for keypresses. If a keypress arrives while it's busy processing, it can be missed. With a real keyboard this is fine because the user probably can't type fast enough to overflow this one character buffer, and if a keypress is missed the user will just wait and try again. With a serial port, though, missed keypresses cannot be corrected in this way. One quickly gets into a mess, and the software upload can become corrupted and fail. Sad faces all around.

The standard solution to this is to insert delays between characters sent over the serial line, typically slowing output to at most 50 characters per second, with an extra delay after a carriage return since that is when the monitor interprets the buffered line of input and programs memory, which takes longer. This can slow things down considerably, but is workable if the processing delays are of predictable durations (as they are with the monitor). But when entering a program into the BASIC interpreter the processing delays vary depending on what is being typed, and they also get longer and longer as the size of the program in memory increases. You have to increase the delays, and end up with a solution that is still faster than typing the program in manually, but it still feels horribly slow!
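
The delay-based workaround can be sketched in a few lines of Python. This is only an illustration: the 50 characters per second figure comes from the paragraph above, while the 0.25 second pause after a carriage return and the function names are assumed values, not settings from any real terminal program.

```python
CHAR_DELAY = 1 / 50   # at most 50 characters per second (figure from the text)
LINE_DELAY = 0.25     # ASSUMED extra pause after a carriage return


def paced(text, char_delay=CHAR_DELAY, line_delay=LINE_DELAY):
    """Yield (character, pause_after) pairs for a delay-paced upload."""
    for ch in text:
        extra = line_delay if ch == "\r" else 0.0
        yield ch, char_delay + extra


def estimated_seconds(text, **kwargs):
    """Total time that the pauses alone add to an upload."""
    return sum(pause for _, pause in paced(text, **kwargs))


listing = "6000: 4C A5 7C 4C EA 62 CD 66\r6000R\r"
print(f"{len(listing)} characters, "
      f"about {estimated_seconds(listing):.2f} s spent waiting")
```

With a real port, each character would be written and then followed by time.sleep(pause). The weakness is visible in the constants: they have to be guessed, and they have to be large enough for the slowest case.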

Maybe you feel this is part of the "vintage experience". Personally I want to get on with things.

6821 Handshaking

You may have noticed that there's no apparent problem with missed characters on the output. Why is this?

When the software on the 6502 processor wants to output a character to the screen, it writes it to the 6821 peripheral interface adapter's output register. The 6821 puts the data onto an output bus and strobes a signal line to indicate that data is available. The propeller chip sees the strobe signal, reads the data from the output bus, displays it on the screen, and also puts it into an internal memory buffer for transmission through the RS232 port. There is a signal that the propeller asserts to tell the 6821 when it is busy processing a character and when it is ready to accept more. While it's updating the screen, or when the internal serial buffer fills up, this line is used to signal to the 6821 and 6502 that they have to wait. When the internal buffer has drained through the RS232 port and space is available, the signal indicates that the 6502 can send further characters.

This practice is known as "handshaking" and essentially it's just a signal from the propeller chip to the processor indicating when it's safe to send another character.

When characters arrive from the RS232 port the propeller writes them onto the ASCII keyboard bus and strobes a line indicating to the processor that a character is ready to be read through the 6821. The missed character problem arises because there is no signal back to the propeller to indicate when the character actually has been read. This means that we have to guess when it's safe to send the next character, and hence the requirement for the introduction of delays between characters.
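
The effect of the missing acknowledgement can be shown with a toy simulation. This is a deliberately simplified model, not a description of the real timing: time is counted in abstract ticks, the single-character input is modelled as a latch that the sender overwrites on a fixed schedule, and the processing costs (1 tick per character, 10 ticks after a carriage return) are invented for illustration.

```python
def process_time(ch):
    """ASSUMED cost model: a carriage return takes much longer to handle."""
    return 10 if ch == "\r" else 1


def receive_without_ack(chars, send_interval):
    """Sender writes a one-character latch every send_interval ticks.
    The receiver only looks at the latch when it is not busy."""
    received = []
    latch = None        # the single pending character, if any
    busy_until = 0      # tick at which the receiver is free again
    total_ticks = (len(chars) * send_interval
                   + max(process_time(c) for c in chars) + 1)
    for t in range(total_ticks):
        index = t // send_interval
        if t % send_interval == 0 and index < len(chars):
            latch = chars[index]          # overwrites anything unread
        if latch is not None and t >= busy_until:
            received.append(latch)
            busy_until = t + process_time(latch)
            latch = None
    return "".join(received)


fast = receive_without_ack("0\r0\r", send_interval=2)
slow = receive_without_ack("0\r0\r", send_interval=12)
print(repr(fast), repr(slow))
```

Sent quickly, the second "0" is overwritten while the first carriage return is still being processed. Sent slowly enough, everything arrives, which is exactly the guesswork that fixed delays rely on.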

A new hope

Somewhat fed up with slow or corrupted uploads, I sat down and read the 6821 datasheet. The datasheet says that after the propeller raises the CA1 signal to indicate the presence of input data on the bus, the 6821 will raise a CA2 signal. This CA2 signal will stay high until the 6502 has read the input register on the 6821. Fortunately the original Woz ROM code configures the 6821 to make this signal available, so no ROM changes are required.

I got out my logic analyser, sent a keypress and took a look at what happened:

It looked like it could be used for input handshaking!

The CA2 signal was unused in the Replica 1 design (since a good deal of effort has been made to stick to the original Apple 1 design), and fortunately there were spare pins on the Propeller chip. I soldered up a wire to patch the CA2 signal to a spare pin on the Propeller. Briel Computers had kindly provided the source to the firmware that runs on the Propeller so I quickly put together an updated Propeller firmware that removed the fixed 20 millisecond delay used to strobe the CA1 line and instead held CA1 high only until it saw the CA2 signal fall low. This immediately improved the speed with which one could load data into the system by allowing the Propeller to determine when the 6502 had successfully read an input character.
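
The logic of the change can be modelled in Python. This is a toy model of the behaviour described above, not the Propeller firmware (which is written in Spin and assembly): the class, the tick-based timing and the processing costs are all assumptions made for illustration.

```python
class PiaInputModel:
    """Toy model of the 6821 input side as described in the text."""

    def __init__(self):
        self.ca2 = False    # high while a character is waiting to be read
        self.data = None

    def strobe_ca1(self, ch):
        """Propeller side: present a character and raise CA1; CA2 goes high."""
        self.data = ch
        self.ca2 = True

    def cpu_read(self):
        """6502 side: reading the input register drops CA2."""
        self.ca2 = False
        return self.data


def process_time(ch):
    """ASSUMED cost model: a carriage return takes much longer to handle."""
    return 10 if ch == "\r" else 1


def send_with_handshake(chars):
    pia = PiaInputModel()
    pending = list(chars)
    received, busy_until, t = [], 0, 0
    while pending or pia.ca2:
        # Propeller: present the next character only once CA2 is low.
        if pending and not pia.ca2:
            pia.strobe_ca1(pending.pop(0))
        # 6502: polls the 6821 only when it is not busy processing.
        if pia.ca2 and t >= busy_until:
            ch = pia.cpu_read()
            received.append(ch)
            busy_until = t + process_time(ch)
        t += 1
    return "".join(received), t


text, ticks = send_with_handshake("0\r0\r")
print(repr(text), "in", ticks, "ticks")
```

Because the sender waits on CA2 instead of on a timer, nothing is overwritten and no time is wasted on a worst-case delay.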

The Propeller chip itself turned out to be easy to reprogram, since the RESET line on the chip is wired up through the RS232 port and the Propeller chip's EEPROM can be reprogrammed "in circuit" over the RS232 line. It takes just a few seconds to compile and install a new firmware.

I made a second trace with the logic analyser to confirm it was working correctly. I sent "0<CR>0<CR>" over the serial line. This is a good test because after each "<CR>" the monitor will read address 0 in memory and print it out, which takes a little while:

0                    My input
0000: 00             Monitor response
0                    My input
0000: 00             Monitor response

The second trace clearly shows that the first two characters are accepted very quickly, but the third character has to wait while the first command ("read from address 0") is processed and the corresponding output is generated. The CA1 line is only held high for the minimum period required, and the next character is sent from the serial buffer straight away. So the input handshaking works!

Yet more handshaking

Unfortunately this is not all that's required. The propeller chip implements a 64 byte buffer on both the serial input and output. When characters arrive over the serial port and the 6502 isn't ready for them, they wait in this buffer. If too many characters arrive, the buffer runs out of space and overflows, losing characters.

Fortunately the RS232 protocol already contains a solution to this: RTS/CTS handshaking. The CTS signal on the RS232 line is used by the receiving equipment (the Replica 1 in this case) to signal to the serial terminal (i.e. your computer) whether it is able to accept more data. The RTS line is similarly used by the serial terminal to indicate whether it is able to accept more data.

The standard Replica 1 does not implement RTS/CTS handshaking, but it turns out to be easy enough to add it. A MAX232 chip is used in the Replica 1 to shift the native TTL voltages (typically 0V and +5V) to RS232 voltages (typically -10V and +10V) and vice versa. Fortunately the MAX232 has two drivers and two receivers, and only one of each was used in the Replica 1 design.

I soldered up a couple of patch leads from the RS232 DB9 connector to the MAX232's second RS232 input and output pins, and another couple of patch leads from the MAX232's second TTL input and output pins across to another spare pair of pins on the propeller.

I modified the propeller code further to check RTS to confirm the serial terminal is ready before sending data to it, and to use the CTS signal to indicate that the serial terminal should stop sending when the buffer in the propeller exceeds 80% capacity and can start sending again when the buffer is below 50% capacity. This required learning a smattering of Propeller assembly but this turned out to take just a couple of hours.
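
The two thresholds form a simple hysteresis. A minimal Python sketch of that logic follows. It is not the Propeller assembly itself: the 64 byte buffer size and the 80% and 50% marks come from the text, while the class name and the rounding of the thresholds to whole bytes are assumptions.

```python
BUFFER_SIZE = 64                        # bytes, from the text
HIGH_WATER = BUFFER_SIZE * 80 // 100    # 51 bytes (rounding is an assumption)
LOW_WATER = BUFFER_SIZE * 50 // 100     # 32 bytes


class InputFlowControl:
    """Decides whether the serial terminal is allowed to keep sending."""

    def __init__(self):
        self.sender_may_send = True

    def update(self, fill):
        """Call with the current number of bytes waiting in the buffer."""
        if fill > HIGH_WATER:
            self.sender_may_send = False    # tell the terminal to stop
        elif fill < LOW_WATER:
            self.sender_may_send = True     # tell the terminal to resume
        # Between the two marks the previous state is kept.
        return self.sender_may_send


flow = InputFlowControl()
for fill in (10, 52, 40, 31, 40):
    print(fill, flow.update(fill))
```

The gap between the two marks stops the handshake line from toggling on every byte, and the headroom above the high mark absorbs characters already in flight when the terminal is told to stop.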

Results

With this relatively small set of changes (just five wires and a new firmware for the propeller) one can reliably send and receive data from the Replica 1 over the serial port at 115,200bps. No delays are required between characters or lines since there is now full hardware handshaking on the serial line and on both the input and output interfaces on the 6821. No software changes are required on the 6502.

To give you an example, the Applesoft RAM BASIC program is 25,141 bytes of text and programs nearly 8KB of RAM. It takes just 5.4 seconds to upload using the upgraded serial line, versus several minutes with the standard system.
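
Some rough arithmetic puts these figures in context. The sketch below assumes ten bits on the wire per character (one start bit, eight data bits, one stop bit) and takes the 50 characters per second pacing mentioned earlier as typical for the standard system. Both are assumptions for the estimate, not measurements.

```python
BITS_PER_CHAR = 10    # ASSUMED 8N1 framing: start + 8 data + stop


def line_limit_cps(baud):
    """Maximum characters per second the serial line itself can carry."""
    return baud / BITS_PER_CHAR


upload_bytes = 25_141      # size of the Applesoft listing, from the text
upload_seconds = 5.4       # measured upload time, from the text

achieved_cps = upload_bytes / upload_seconds
utilisation = achieved_cps / line_limit_cps(115_200)
best_case_2400 = upload_bytes / line_limit_cps(2_400)
paced_50cps = upload_bytes / 50

print(f"achieved: {achieved_cps:.0f} characters per second")
print(f"share of the 115,200 baud line used: {utilisation:.0%}")
print(f"2,400 baud with no delays at all: {best_case_2400:.0f} s")
print(f"paced at 50 characters per second: {paced_50cps / 60:.1f} min")
```

On these assumptions the upgraded port uses well under half of the line's capacity, so the serial line is not the limiting factor.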

Instructions

To make the same changes to your Replica 1 you will need only five bits of wire. I used some wires from an old Cat-5 cable. Heat up your soldering iron and ...

I suggest you locate and mark the relevant pins with a pen before you start soldering.

A picture of my modified board is below (click picture for larger image). As you can see my soldering is pretty terrible, but it works despite that! Note that on my board there is an additional patch from the 6821 /RESET line to pin 15 on the Propeller. While I was working out how to do this I thought it might be useful but in the end it was not required. It is unused.

Be careful soldering on your Replica 1. If you assembled it from a kit in the first place you should have all the skills to make the changes without problems. But still, if you break it, you get to keep both halves.

You'll also need my modified firmware for the propeller. I have produced two versions. One supports all the interfaces on the board (ASCII keyboard, PS/2 port, video output, RS232 port) and so is a little slower (11.9 sec to upload Applesoft BASIC). The second supports only the RS232 port (the other interfaces are disabled and should probably be disconnected) but is more than twice as fast (5.4 sec to upload Applesoft BASIC). Source and compiled code are included.

Update 2010-02-08: I've had a report of a problem where the "Everything" firmware does not work but the RS232-only firmware works fine. I need to look into this. The hardware mods don't prevent the stock firmware from working so you can re-flash the firmware to choose between 2400bps with no flow control but all hardware, or 115200bps with flow control but no video output or ASCII keyboard.

In both cases you should configure your terminal for 115200bps, 8 bits, no parity, 1 stop bit, and RTS/CTS hardware flow control.

To program the Propeller under Linux I use Brad's Spin Tool compiler (bstc). The command line I use is:

$ bstc -p2 -d /dev/ttyUSB0 replica1-te.spin

replica1-term

I also wrote a quick Python script to act as a serial terminal for the Replica 1 under Linux. It's called replica1-term and it's ultra-minimal but works well for me. It requires pySerial, which is packaged as python-serial under Debian-based distributions (including Ubuntu).
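
For anyone writing their own uploader, the port settings given above map directly onto pySerial. The following is a minimal sketch and is not the replica1-term script: the function names are hypothetical, and converting newlines to carriage returns assumes the listing is a Unix text file.

```python
# Settings from the text: 115200bps, 8 bits, no parity, 1 stop bit, RTS/CTS.
REPLICA1_SETTINGS = {
    "baudrate": 115200,
    "bytesize": 8,      # same value as serial.EIGHTBITS
    "parity": "N",      # same value as serial.PARITY_NONE
    "stopbits": 1,      # same value as serial.STOPBITS_ONE
    "rtscts": True,     # hardware flow control
}


def open_replica1(device, timeout=1.0):
    """Open the serial port; pySerial is imported here so that the rest of
    the sketch can be read and tested without it installed."""
    import serial
    return serial.Serial(device, timeout=timeout, **REPLICA1_SETTINGS)


def upload(port, text):
    """Send a listing with no inter-character delays.
    ASSUMPTION: the file uses newlines, the monitor wants carriage returns."""
    port.write(text.replace("\n", "\r").encode("ascii"))
    port.flush()


# Example use (needs the modified hardware and firmware attached):
#     port = open_replica1("/dev/ttyUSB0")
#     upload(port, open("program.txt").read())
```

No pacing is needed here because, with rtscts enabled, the operating system's serial driver pauses the transfer whenever the Replica 1 signals that its buffer is filling up.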

Further work

I have a feeling that the Propeller is still a bottleneck and it should be possible to go faster still. The Propeller takes a little while to drive the next character onto the bus and re-assert the CA1 strobe after CA2 drops (you can see the delay in the second logic analyser trace above). Translating the code that runs on the serial input cog into assembler (and also the corresponding "rx" method on the FullDuplex object) would allow further performance improvements by reducing this potentially wasted time.

There is probably some scope to slightly improve the speed of output by translating the cog code into Propeller assembler in a similar fashion. Five minutes with a logic analyser would confirm if the serial line is being saturated by the 6502.

Build your own Replica 1

The Replica 1 is available from the Briel Computers Store in either fully assembled or kit form. I bought the kit and had great fun soldering it together, and I would certainly recommend the kit over an assembled unit. It's a good product. The assembly instructions are very thorough and I should think that anyone could build one even with no previous soldering experience. It's fun, educational, and I thoroughly recommend it for anyone interested in learning how computers work.

Feedback

Get in touch and let me know if it works for you!