socz80: A Z80 retro microcomputer for the Papilio Pro FPGA board

Overview

I built a small FPGA microcomputer for the Papilio Pro board. I've ported a few operating systems to run on it. These 8-bit machines have very minimal features but (somewhat unexpectedly) I found they can run a multi-user, multi-tasking UNIX operating system.

The hardware specification is:

Z80 compatible T80 CPU core at 128MHz
4KB paged MMU (64KB virtual, 64MB physical address space)
8MB SDRAM (at 128MHz), with 16KB direct-mapped cache
4KB ROM with monitor program
4KB SRAM
UART with deep receive FIFO
Optional second UART with FIFO and hardware flow control
1MHz Timer
SPI master connected to SPI flash ROM
SPI master connected to optional SD card socket
GPIO

I've ported the following operating systems:

CP/M-2.2
MP/M-II
UZI

The project is open source and distributed under the GNU General Public License version 3.

Introduction

My first computer, in 1989, was a PC. It was a 16-bit 80286 machine. I missed out on the whole 8-bit generation, but I've always been interested in those machines, from the somewhat mythical golden age when people hacked up their own computer design on their kitchen table, before every manufacturer just cloned the IBM PC architecture ad infinitum. So when my wife bought me an FPGA for my birthday I decided to build my own 8-bit machine for it, in order to learn about them and the software they ran.

A lot of people say to me, "What is an FPGA? And why am I asking this question?"

An FPGA is basically a computer chip which can be reprogrammed with a different circuit so that it behaves differently. You can make them into all sorts of different chips, simple or complex. Over time their cost has fallen and you can now build formidable circuits inside these devices. Just $10 will buy an FPGA big enough for a computer.

This was only my second FPGA project and was also my first attempt at writing code for a Z80, so the quality of my code is probably not brilliant! The machine works well though and I've had a great deal of fun with it.

The Papilio Pro is a great board and I thoroughly recommend it. It has a Xilinx Spartan 6 LX9 FPGA, 8MB of SDRAM, 8MB of SPI flash memory, and an FTDI USB interface that is used to connect JTAG and UART to a host PC. Everything works great under Linux. My main criticism of the board is that the serial link via the UART has no flow control lines hooked up to the FPGA -- the FTDI has a deep FIFO (kilobytes) and you can build a receive FIFO inside the FPGA, but at high data rates you will inevitably overflow these eventually.

I also have a Pipistrello FPGA board which is based on the same Papilio form factor. It has the UART flow control lines hooked up, a larger and faster DDR SDRAM chip, and a much larger LX45 FPGA. You can use the Xilinx on-chip memory controller block to drive the DDR SDRAM chip. It has HDMI (in or out). The power supply on the Papilio Pro is more efficient but otherwise the Pipistrello is the better board, albeit at a higher cost. I've not had time to get this project working on it yet.

The Papilio form factor is very hardware-hacker friendly; all the IO pins are broken out on 0.1" headers so you can easily pop a bit of veroboard on top and solder up a MAX3232 or SD card or LEDs or whatever.

About this time in the conversation those same people say to me, "Are you detaining me? Am I free to go? Please?"

Hardware

I started my Z80 system with the open-source T80 CPU core, a UART that I'd written for an earlier project, and some of the on-chip block SRAM for memory. I then wrote a simple monitor program for it, my first Z80 program after "Hello world".

Xilinx have a "data2mem" tool that you can use to quickly replace the data loaded into a block RAM without resynthesising the FPGA design (a tedious process), so you can assemble your monitor program, use data2mem to have the code loaded into block RAM, then reprogram the FPGA which will run the code when it comes out of reset. This affords a very quick edit/compile/test cycle, about three seconds from hitting enter to running code.

Once I had a monitor program running I imported Mike Field's brilliant Simple SDRAM Controller to drive the 8MB SDRAM chip on the board. Having the monitor in reliable SRAM made it easy to test the SDRAM and work out the bugs just by using deposit and examine memory commands.

The SDRAM gave me access to 128 times more memory than the Z80 could address, so I added a 4K paged MMU to translate the 16-bit (64K) logical address space into a 26-bit (64MB) physical address space. Each 4KB logical page can be mapped independently to any 4KB physical page.
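
The translation described above can be sketched in a few lines of Python. This is only a model of the idea, not the VHDL from the project: the class name, method names and the identity-mapped reset state are assumptions made for illustration. The top 4 bits of the 16-bit logical address select one of 16 page entries, each entry holds a 14-bit physical page number, and the low 12 bits pass through unchanged.

```python
PAGE_SHIFT = 12                      # 4KB pages
PAGE_SIZE = 1 << PAGE_SHIFT
NUM_PAGES = 16                       # 64KB logical space / 4KB per page
PHYS_BITS = 26                       # 64MB physical space
NUM_FRAMES = 1 << (PHYS_BITS - PAGE_SHIFT)   # 16384 physical pages


class Mmu:
    """Toy model of a 16-entry page table (names are hypothetical)."""

    def __init__(self):
        # Assumed reset state: logical page N maps to physical page N.
        self.frames = list(range(NUM_PAGES))

    def map_page(self, logical_page, physical_frame):
        if not 0 <= logical_page < NUM_PAGES:
            raise ValueError("logical page out of range")
        if not 0 <= physical_frame < NUM_FRAMES:
            raise ValueError("physical page out of range")
        self.frames[logical_page] = physical_frame

    def translate(self, logical_addr):
        if not 0 <= logical_addr <= 0xFFFF:
            raise ValueError("logical address out of range")
        page = logical_addr >> PAGE_SHIFT          # top 4 bits
        offset = logical_addr & (PAGE_SIZE - 1)    # low 12 bits
        return (self.frames[page] << PAGE_SHIFT) | offset


mmu = Mmu()
mmu.map_page(0xF, 0x7FF)             # top logical page -> last 4KB of 8MB
print(hex(mmu.translate(0xF123)))    # 0x7ff123
```

Remapping a single entry is all it takes to move a 4KB window anywhere in physical memory.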

The SDRAM takes on the order of 10 cycles to supply data after a read request, so I implemented a 16KB direct-mapped cache using the FPGA block RAM to conceal this latency. This works very well. The FPGA block RAM is 36 bits wide, which allows for a 4-byte wide cache line plus 4 bits to indicate the validity of each byte.
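
As a rough illustration of how a cache of this shape behaves, here is a small Python model. It is a sketch under stated assumptions rather than a description of the real hardware: the write-through policy, the separate tag store, the exact bit split and all names are assumed. With 16KB of cache and 4-byte lines there are 4096 lines, so address bits 1..0 pick the byte, bits 13..2 pick the line, and the remaining upper bits form the tag.

```python
LINE_BYTES = 4
NUM_LINES = 16 * 1024 // LINE_BYTES      # 4096 lines


class Cache:
    """Toy direct-mapped cache with one valid bit per byte (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory                      # bytearray standing in for SDRAM
        self.tags = [None] * NUM_LINES            # assumed separate tag store
        self.data = [bytearray(LINE_BYTES) for _ in range(NUM_LINES)]
        self.valid = [0] * NUM_LINES              # 4-bit mask, one bit per byte
        self.hits = 0
        self.misses = 0

    @staticmethod
    def split(addr):
        offset = addr & (LINE_BYTES - 1)
        index = (addr >> 2) & (NUM_LINES - 1)
        tag = addr >> 14
        return tag, index, offset

    def read(self, addr):
        tag, index, offset = self.split(addr)
        if self.tags[index] == tag and self.valid[index] & (1 << offset):
            self.hits += 1
            return self.data[index][offset]
        # Miss: fetch the whole 4-byte line from memory.
        self.misses += 1
        base = addr & ~(LINE_BYTES - 1)
        self.data[index][:] = self.memory[base:base + LINE_BYTES]
        self.tags[index] = tag
        self.valid[index] = 0b1111
        return self.data[index][offset]

    def write(self, addr, value):
        tag, index, offset = self.split(addr)
        self.memory[addr] = value                 # assumed write-through
        if self.tags[index] != tag:
            # Claim the line for the new tag; only the written byte is valid.
            self.tags[index] = tag
            self.valid[index] = 0
        self.data[index][offset] = value
        self.valid[index] |= 1 << offset
```

In this model the per-byte valid bits let a single-byte write claim a line without first reading the other three bytes from memory.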

Debugging the cache was a pain. I ended up writing several programs to exercise and test the memory in various ways; when I found a fault it often took some head-scratching to determine if it was a bug in the hardware or the software! This is doubly hard when the software is itself executing from unreliable memory, so I added a 4K SRAM using FPGA block RAM and used the MMU to map that wherever I wanted.

The MMU also has what I call the "17th page" which allows you to access any physical memory address without mapping it into the CPU virtual address space. It has a 26-bit pointer in the MMU and an I/O port that translates I/O cycles into memory cycles. The pointer is automatically incremented after each cycle, so you can use the INIR instruction with it to do block copies of unmapped physical memory to/from mapped memory.
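
The auto-incrementing pointer is easiest to see in a small model. The Python below is an assumption-laden sketch: the class and function names are invented, and wrapping the pointer at 26 bits is a guess. The inir() helper mimics the Z80 INIR loop, which reads a byte from a port, stores it at the destination, and repeats until the count reaches zero.

```python
POINTER_MASK = (1 << 26) - 1          # 26-bit physical pointer


class PhysicalWindow:
    """Toy model of the pointer register and data port (hypothetical names)."""

    def __init__(self, memory):
        self.memory = memory          # bytearray standing in for physical memory
        self.pointer = 0

    def set_pointer(self, addr):
        self.pointer = addr & POINTER_MASK

    def port_read(self):
        value = self.memory[self.pointer]
        self.pointer = (self.pointer + 1) & POINTER_MASK   # assumed wrap
        return value

    def port_write(self, value):
        self.memory[self.pointer] = value & 0xFF
        self.pointer = (self.pointer + 1) & POINTER_MASK


def inir(window, dest, dest_addr, count):
    """Copy `count` bytes (1..256) from the port into dest, like Z80 INIR."""
    if not 1 <= count <= 256:
        raise ValueError("INIR transfers between 1 and 256 bytes")
    for i in range(count):
        dest[dest_addr + i] = window.port_read()


physical = bytearray(range(256)) * 16  # 4KB of recognisable test data
mapped = bytearray(64)                 # stands in for CPU-visible memory
window = PhysicalWindow(physical)
window.set_pointer(0x0305)
inir(window, mapped, 0, 8)
print(mapped[:8].hex())                # 05060708090a0b0c
```

Because the pointer advances on every port access, the copy loop never has to touch the MMU mapping.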

The Xilinx synthesis tools tell me my design is good for about 70MHz. I've always run it at 128MHz without problems. The Z80 is rather fast at 128MHz and even the simple cache is surprisingly effective at keeping it fed with data.

Operating Systems

Once I had the hardware working I had a lot of fun writing software for it, extending the hardware capabilities as the software grew more sophisticated. I ported three operating systems to the platform, in each case porting them before I had ever used them!

I wrote a CP/M-2.2 BIOS. This wasn't too hard: the original documentation is very good, and having access to a modern computer certainly makes it much less arduous than it would have been in the 1970s.

There's so much RAM in the system that I just used the top 6MB as three 2MB RAM disks, which hugely simplified writing storage drivers. For persistent storage I decided to copy the RAM disk to and from the unused space in the flash ROM on the Papilio Pro board. I wrote SPI master hardware and some routines in the monitor ROM for the copying.
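
To show why a RAM disk keeps the storage driver simple, here is a sketch of the address arithmetic in Python. CP/M transfers 128-byte records, and the top 6MB of the 8MB SDRAM starts at physical address 0x200000. The linear layout and the figure of 128 records per track are assumptions chosen for illustration, and the function name is hypothetical.

```python
SDRAM_SIZE = 8 * 1024 * 1024
DISK_SIZE = 2 * 1024 * 1024
NUM_DISKS = 3
RAMDISK_BASE = SDRAM_SIZE - NUM_DISKS * DISK_SIZE   # 0x200000
RECORD_SIZE = 128            # CP/M logical record size
SECTORS_PER_TRACK = 128      # hypothetical geometry, not taken from the source


def ramdisk_address(drive, track, sector):
    """Return the physical address of a 128-byte record on a RAM disk."""
    if not 0 <= drive < NUM_DISKS:
        raise ValueError("no such drive")
    if not 0 <= sector < SECTORS_PER_TRACK:
        raise ValueError("sector out of range")
    record = track * SECTORS_PER_TRACK + sector
    offset = record * RECORD_SIZE
    if not 0 <= offset < DISK_SIZE:
        raise ValueError("record beyond end of disk")
    return RAMDISK_BASE + drive * DISK_SIZE + offset


print(hex(ramdisk_address(0, 0, 0)))       # 0x200000
print(hex(ramdisk_address(2, 127, 127)))   # 0x7fff80
```

A sector read or write then reduces to a 128-byte memory copy at the computed address.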

Once I had CP/M working I found out about its multi-tasking multi-user big brother, MP/M. Again the original Digital Research documentation was invaluable when writing an MP/M-II XIOS and getting MP/M-II running. I added interrupt driven consoles, a second UART so a second user can use the machine concurrently, and a simple interval timer for pre-emptive multi-tasking. I was really very impressed with MP/M-II. I had not realised that these Z80 systems could multitask and support multiple concurrent users (and all before I was even born!)

Once I saw that multi-tasking was feasible on this hardware I got a little bit ambitious and decided to port UZI, Doug Braun's 8-bit UNIX-like operating system. UZI runs multiple processes with pre-emptive multi-tasking and supports multiple consoles like MP/M. It presents the standard UNIX system calls to processes, which (in my implementation) each have 62KB memory available to them. You can dynamically mount filesystems. UZI has its own filesystem format. UZI is free of AT&T code but offers features similar to the 7th edition Unix kernel.

There's little or no documentation, so this was harder than writing the BIOS/XIOS, where there is a clear specification of what you need to do. I started with the P112 UZI-180 port, which uses the Hi-Tech CP/M C compiler.

I ported the kernel to ANSI C and made it build with the modern SDCC compiler, added drivers for my MMU, UART, RAM disk, an SD card interface, and removed the Z180 instructions. I modified the context switching mechanism to make it much more efficient by eliminating all the memory copying. I also increased the amount of memory available to processes -- a native UZI process can use up to 0xF900 (62.25KB) and a CP/M process running under emulation has a 60KB TPA (larger than under real CP/M!)

The UZI kernel now works well on this hardware but I do not yet have a good way of building userspace applications for it. Suggestions warmly welcomed! At the moment I am using the P112 project's UZI-180 distribution root filesystem with relatively few changes.

Download

I've not really worked on this project for the last four months. I've decided to give it away in its current state rather than wait until I have both the time and motivation to make it perfect (which may never happen).

UZI for socz80 source code

Latest release (2014-04-30). Contents:

FPGA bitstream for Papilio Pro
Full VHDL source code for the hardware
Source code to the ROM monitor
CP/M 2.2 BIOS (including source code)
MP/M-II XIOS (including source code)
RAM disk images to run CP/M 2.2, MP/M-II and UZI
Various other bits and bobs
Hastily thrown together instructions (see README.txt)

Alan Cox has been working on an expanded and improved version of this hardware with additional features including an internal video terminal.

Project Ideas

This project is fun to use but it's much more fun to build.

Change the CPU for a different 8-bitter, like the 6502 or 6809. Open source VHDL cores are available for both.
This Z80 is fast. But it could go faster! A really simple trick would be to modify the T80 core to remove the memory refresh cycles. There's no DRAM connected directly to the Z80 bus and none of my software uses the R register, so these could be eliminated without any ill effect.
The Z80 core uses quite a few cycles for each instruction compared to a modern processor. You could try building a "Z80+" that executes the common instructions in a single cycle and/or employs a pipeline to execute each instruction in multiple steps but multiple instructions concurrently.
You can buy inexpensive ENC28J60 boards on eBay. These talk SPI on one side and ethernet on the other. I've already written an SPI master, so it would be very quick and easy to add ethernet hardware to this machine with one of these. The software side might be a satisfying challenge.
I've heard about an operating system called TurboDOS but never seen it run. Apparently it was designed to be used in a network, with co-operating TurboDOS machines sharing resources. You can download it from The TurboDOS Museum. This might be fun to port and run, perhaps in combination with the ENC28J60 and multiple FPGAs, or multiple systems on one FPGA. You could write a Linux server process that speaks the TurboDOS protocol to provide resources to the Z80 boxes.
I saved as much block RAM on the FPGA as I could for future peripheral hardware. An obvious extension would be a video display. Combined with a keyboard interface these could replace the UART as the system console.
I did some work on a dual-processor version of this system. My design basically just dual-ported the SRAM and cache memories so both CPUs could use them concurrently, added an arbiter for the CPUs to share the IO bus, and gave each CPU a separate MMU. When a CPU used the MMU IO registers it talked to its local MMU only; all other IO registers were shared. One IO register was special in that it read as 0 from CPU0 and 1 from CPU1. This was so the monitor could do the right thing with each CPU on boot, and I expect the interrupt handler would eventually need it too. I didn't get as far as doing the interrupt routing hardware. Realistically the only uses I could come up with for the second CPU were to run firmware for a terminal on future keyboard/video hardware, or SMP operation under UZI.
Build a more modern machine using a 32 or 64 bit processor. There are several CPU cores released under free and open licenses -- MIPS, OpenRISC, ARM, ZPU, x86, several of each architecture. These would be an easier target for porting modern software, and faster too.

Feedback

Get in touch and let me know if it works for you!