Monday, December 5, 2016

RC2014/LL Showing ROM/RAM switching, 64k!

So I thought I'd just make a quick post to show how I know that the ROM swap hardware in my RC2014/LL is working. I'll walk through the procedure here...

First, I booted up the RC2014 as usual, which drops me into BASIC.  From there I wrote this short program:
10 FOR A = 0 to 8191
20 B = PEEK( A ) 
30 POKE A, B
40 PRINT A
50 NEXT A
This goes through the lowest 8 kbytes of memory, reads (PEEK) each byte and then writes (POKE) it right back to the place it got it from.   For example:
  1. Read from memory address 0
  2. Write to memory address 0
  3. Switch to the next memory address, repeat 
This seems stupid, since it looks like it does nothing.  But if you remember from a previous post, when the ROM is enabled, it's only enabled for reads.  Writes still happen to the RAM in the same addresses.  So the above program copies the BASIC ROM into RAM.


Next, we switch off the ROM by doing this:
 OUT 0,1
Setting the lowest bit (the '1') turns off the ROM, and switches to the RAM for read operations.  To confirm we're working out of the RAM, we can poke a 0 in somewhere down there.

PRINT PEEK( 0 )
 243
POKE 0,0
PRINT PEEK( 0 )
 0
So we can see here that memory address 0 has been changed from 243 to 0.


I also went one step further and yanked the ROM board out of my RC2014.  This is not recommended. ;)  But it did continue to run without any problems... until I really tried to muck things up, for fun:

I wrote a program to clear RAM, and then ran it in BASIC.  Here is the listing:


And here's what happened when I typed 'run'.  It completely locked up after it printed the '8'. Starting at memory address 0x0008 is the text output routine, so it cleared out the boot code before that (0x0000-0x0007), erased the beginning of the print handler, then went to print out something to the screen, and crashed.  Well, I assume it crashed. It got an invalid opcode and who knows what it's actually doing. The code there used to be a "jump" which is 3 bytes.  Instead it got a NOP (0x00), which does nothing, and then two garbage bytes which map to something incorrect. Boom!



Monday, October 17, 2016

State of the RC2016/10


RC2016/10 is a little more than halfway over... so where am I?

Pretty much in the weeds.

I'm almost at the point where I wanted to be at the beginning of the challenge. I'm still working on finishing up the SD loader routines.  Well, I've finished the SD/Micro side of things (in the SSDD1 module - Serial SD Drive) both for the real physical drive as well as for the emulator, for loading only.  The emulation also supports writes now, and both support file/directory manipulations:  Directory listings, make directory, remove directory/file. (For the FAT filesystem anyway.)

I've got a bunch of things on my plate right now, between work, finishing up a contract or two, the animatronic bird project has reappeared a bit, plus time for the family, lack of sleep, and general lack of motivation for anything.

I haven't done any of the sector IO stuff yet, as I want to get regular files working.

In any event, here's a quick bullet summary of the current state of the project:

Done:

  • Hardware for SD interface
  • Hardware for ROM/RAM switcher
  • SSDD1 process design (see image above)
  • SD drive (SSDD1) firmware (preliminary)
  • SSDD1 emulation for file and directory support
  • SSDD1 firmware for directory support and file reading
  • CP/M research to figure out what needs to be done, prior work.

ToDo:

  • SSDD1 firmware for file writing
  • SSDD1 firmware and emulation for simulated sector IO
  • Z80 SSDD1 decoder (Hex string+checksum to proper formatter)
  • Z80 IHX decoder (stream from SSDD1 to RAM writer)
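
To give an idea of what the IHX decoder has to do, here's a rough Python sketch of decoding standard Intel HEX records and writing them into a 64k memory image. It is only an illustration, not the Z80 code: the function names (parse_ihx_record, load_ihx) are made up for this example, and it assumes plain ":"-prefixed Intel HEX lines with data records (type 00) and an end-of-file record (type 01).

```python
def parse_ihx_record(line):
    """Decode one Intel HEX record into (record_type, address, data)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    # All bytes, including the trailing checksum, must sum to 0 modulo 256.
    if sum(raw) % 256 != 0:
        raise ValueError("checksum mismatch")
    count = raw[0]
    address = (raw[1] << 8) | raw[2]
    record_type = raw[3]
    data = raw[4:-1]
    if len(data) != count:
        raise ValueError("length field does not match data")
    return record_type, address, data


def load_ihx(lines, memory):
    """Write data records (type 0x00) into memory until the EOF record (0x01)."""
    for line in lines:
        if not line.strip():
            continue
        record_type, address, data = parse_ihx_record(line)
        if record_type == 0x01:
            break
        if record_type == 0x00:
            for offset, value in enumerate(data):
                memory[(address + offset) & 0xFFFF] = value
    return memory


# Hypothetical two-line file: three bytes at $8000, then end-of-file.
ram = bytearray(0x10000)
load_ihx([":03800000C30001B9", ":00000001FF"], ram)
print(ram[0x8000:0x8003].hex())  # c30001
```

The checksum test is the useful part for a serial link: a record is only accepted if every byte on the line, checksum included, adds up to zero in the low 8 bits.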

Gameplan tasks (Roughly in order of probable completion):
  • Z80 IHX decoder
  • Load a ROM from the SD card into RAM, switch off the boot ROM, restart, run from RAM
  • Backport emulation into the firmware to get them equal
From here, I can go down one of two paths.  I can completely flesh out the rest of the SSDD1, and get the sector IO code implemented, which is probably the best course of action, just for completeness.  Or I could start working on porting/implementing the CP/M bios ROM, which might give me the "win" kick that I need to get the Sector interface implemented... although once I have the regular file IO stuff done (which it is) the sector IO stuff is the same plus a bit more wrapper implementation... after all it's just a bunch of 128 byte files in subdirectories... so...
  • Sector IO SSDD1 emulation
  • Backport Sector IO to SSDD1 firmware
  • CP/M Bios
  • CP/M Bootloader into LLoader ROM
  • Burn new LLoader ROM to 27C512 EPROM
  • Boot a RC2014 to CP/M!

Sunday, October 2, 2016

RC2014/LL and RC2016/10



The RetroChallenge is upon us again!

This time around, I plan on working on my Z80 homebuilt computer, the RC2014.  (Website for RC2014, Order a RC2014) For a few months now, I've had my RC2014 computer built, and modified to be an RC2014/LL computer. What this means is that I have some modified modules, and the modifications themselves use no additional external hardware.

The above picture shows my RC2014/LL system with its extra RAM module, and the C0 Serial expansion board to the left, with the SD card interface board (SSDD1) on it.

The basics of this design:

Unmodified RC2014 modules:
  • Z80 CPU module
  • Clock module (* see below)
  • Serial console interface
  • RAM module (for RAM in the range $8000 through $FFFF)
Modified RC2014 modules:
  • Second RAM module
  • ROM module
  • Digital IO module
Additional hardware:
  • Second ACIA Serial port at $C0
  • SSDD1 (Serial SD Drive)
(*) While the clock module is unmodified as far as the /LL changes go, it technically is modified. I have added a 10 uF (50V) cap between the reset line and ground to be a quick-and-dirty power-on reset circuit. It works perfectly.  Every time I power on the computer, it "presses the reset button" for me. ;)

The plans to mod these parts are available here.  This is currently fully functional and tested. The modifications use unused gates on the boards, so no additional hardware or boards are required to implement them. The basic theory behind the /LL modifications is as follows:



Bit 0 (0x01) of the Digital IO module is tied to one of the extra bus lines on the backplane.  Let's call this "Extra-A".  When you do an "out" to that port of "0x01", it will trigger the Extra-A line.




The ROM module "out of the box" is configured such that if there is a memory access, which is a read from address $0000 through $7FFF, it will enable the ROM, and it will put its data on the bus.  My modification adds in one extra condition to this: the ROM will be enabled only if the Extra-A line is also LOW.  This means that when Extra-A is 0, the ROM works. When Extra-A is 1, it's as though the ROM doesn't exist.



The RAM module "out of the box" sits at the high half of memory space ($8000-$FFFF), and is always enabled on memory READ or WRITE to those addresses.  The modification to this module is threefold.  First, it sits the RAM at the low half of memory space ($0000-$7FFF). Secondly, it is set up that WRITEs to this memory space will always work, regardless of Extra-A.  Thirdly, it is set up that READs to this memory space will ONLY work when Extra-A is HIGH.

The end result of this is that when Extra-A is low, reads in the low half of memory will come from the ROM.  When it is high, reads will come from RAM.  Writes ALWAYS go to RAM.
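
As a sanity check on that logic, here's a small Python model of the switching. It is a toy sketch only: the class name LLMemory, its methods, and the ROM bytes after the first one are invented for the example, and it ignores everything about real bus timing.

```python
class LLMemory:
    """Toy model of the RC2014/LL low-memory switching described above."""

    def __init__(self, rom_image):
        # Pad the (hypothetical) ROM image out to the 32k low half.
        self.rom = bytes(rom_image).ljust(0x8000, b"\xff")[:0x8000]
        self.ram = bytearray(0x10000)
        self.extra_a = 0  # 0 = ROM visible for low reads, 1 = RAM visible

    def out(self, value):
        # Bit 0 of the Digital IO port drives the Extra-A line.
        self.extra_a = value & 0x01

    def read(self, addr):
        addr &= 0xFFFF
        if addr < 0x8000 and self.extra_a == 0:
            return self.rom[addr]
        return self.ram[addr]

    def write(self, addr, value):
        # Writes always land in RAM, regardless of Extra-A.
        self.ram[addr & 0xFFFF] = value & 0xFF


mem = LLMemory(bytes([243, 195, 184, 0]))  # made-up ROM contents for the demo
for a in range(0x2000):                    # read-then-write copy of the low 8 KB
    mem.write(a, mem.read(a))
mem.out(1)                                 # switch low reads over to RAM
print(mem.read(0))                         # 243, now served from the RAM copy
mem.write(0, 0)
print(mem.read(0))                         # 0
```

Reading a byte and writing it straight back looks like a no-op, but in this model it is exactly what copies the ROM into the RAM hiding underneath it.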

This is quirky...

I admit this.  It means that you can (and I will) write a bootloader/monitor ROM that is enabled on power-on, will read from a mass-storage device and write into anywhere in RAM... It can load a 64 k byte memory image into RAM, and then switch off the ROM and it will all work.  The quirkiness is that you cannot verify the loaded-in memory in the low range of RAM, since reads will come out of the ROM.   Obviously, you also need to do this routine completely out of registers, as your stack variables will get overwritten if you're not careful.

Anyway....

If you want it to behave like a stock RC2014, remove the jumper from the IO board to Extra-A, and instead add a jumper from ground to Extra-A.  I have added a switch to mine to force this in the case where the IO board comes up in the wrong state.

Additionally, I have added a second serial port, which basically follows the circuit for the first port, but it sits at $C0, and does not have an interrupt line wired to it.  The RX/TX on that one comes out to a FTDI-like pinout header, which is where my SSDD1 module plugs in.  The plans for this serial port are available here, and can be seen as the brown perf-board on the left of the topmost image of this post.

The SSDD1 module...


The Serial SD Drive module is the mass storage module that I've created for my Z80 to interface with. I know that I can push the FAT filesystem support onto the Z80, but that would require substantial effort.  I instead decided to go with the model where I have a smart serial-based device that you tell it "i want this file" and it sends it out.  Like a local BBS. ;)

The use of a serial-driven drive is not unprecedented.  It's somewhat modeled after the Commodore 64/Vic 20's "IEC" serial interface for floppy drives.  It also mirrors it in that the drive has some smarts in it to deal with the drive architecture.

I also went this route for the ultimate form of this computer, which is to run CP/M.  CP/M expects drives with a Drive+128 byte Sector layout.  While other Z80-CP/M computers implement this by having direct interface to the sectors on the spinny disk/cf/sd card, I will do it by having files on the SD card, for the most amount of flexibility.  There will be a "drives" folder on the card containing folders named "A", "B", "C", and so on.  These letters are the drives.  In each of those will be directories for the tracks, named "0000", "0001", etc and in each of those, files named "0000" "0001" "0002" etc. These files are the simulated sectors on the disks.  It will be easy to build virtual drives for other use out of this.  It also means that I won't need some special interface on a modern computer to talk with this.  I won't have to do 'dd' style transfers to get at the data... It's all just sitting on a FAT filesystem.
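
The drive/track/sector layout boils down to building a path. Here's a minimal Python sketch of that mapping. The function name sector_path is hypothetical, and it assumes the four-character names are zero-padded decimal numbers (the examples above would look the same in hex, so that part is a guess).

```python
def sector_path(drive, track, sector, root="drives"):
    """Build the path of the file holding one simulated 128-byte sector."""
    if len(drive) != 1 or not drive.isalpha():
        raise ValueError("drive must be a single letter")
    if not (0 <= track <= 9999 and 0 <= sector <= 9999):
        raise ValueError("track and sector must fit in four digits")
    # Assumption: names are zero-padded decimal, e.g. track 12 -> "0012".
    return "{}/{}/{:04d}/{:04d}".format(root, drive.upper(), track, sector)


print(sector_path("a", 0, 2))  # drives/A/0000/0002
```

Because every sector is just a small file, a virtual disk can be inspected or built with ordinary file tools on any machine that reads FAT.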

The implementation of this module is based on an Arduino with a microSD breakout module.  The serial interface communicates directly with it, and it sends the content via serial back to the Z80 host.

The above picture shows the SSDD1 module off of the RC2014/LL expansion board, and instead wired to a breakout board where I have a second FTDI serial-USB interface so that I can debug the hardware more easily.

And that brings me to the current Retro Challenge...

My goals for this month are to do a few things here, to finish up this computer system...
  • Finish up the firmware for the SSDD1
    • Sector load/save support
    • File load/save of intel hex files
    • directory create, list, remove
  • Write the SSDD1 emulation for the emulator
  • Finish up the loader and burn it to a ROM
  • Write a CP/M bios that uses the SSDD1 interface
  • Build CP/M disk/sector files 
  • Play Zork
Stretch goals:
  • Extend the NASCOM BASIC to support the SSDD1 for loading and saving.


Friday, June 24, 2016

The RC2014 Computer: 1. Emulation


As you may know, I'm bigtime into Z80 computerey stuff.  For the past 20 years or so, I've been hacking Pac-Man ROMs, been maintaining the Ms Pac Disassembly, and have made my own Z80-Pac based programs over the years.

Fairly recently, I got a Kaypro II from a friend at interlock, and it worked perfectly, and looked brand new.  I felt like I couldn't hold on to it... "it belongs in a museum!" ...so I donated it to ICHEG/Strong Museum of Play.  But it helped whet my appetite for a CP/M computer.

Another project I've been wanting to do was to start with a Commodore 64 (I know it's not Z80, it's 6502ish), a floppy drive, some blank disks and a hardware manual and code up, from scratch, an OS.  Start out by making a text editor, assembler, GEOS-like GUI, etc.

These projects recently had an opportunity to overlap, and they all seem to converge on the RC2014 modular Z80 computer.

The RC2014 computer is a backplane-based modular computer created by Spencer Owen, based on the Z80SBC by Grant Searle.  There are modules for the CPU, RAM, ROM, Serial Terminal interface, and so on.  It is available as a kit from tindie.com.  Once assembled, you hook up a serial terminal to it, power it on, and you get a 1980s-esque BASIC prompt onto which you can write your 32kbytes of program.  This is based on Grant's simplified Z80 computer, so there is no off-line storage.

My general plan for the RC2014 is:
  1. Emulation:
    1. Create an emulator for the system to aid with rapid development
      1. also bring my "bleu-romtools" from Google Code to github
    2. Add a serial-based storage solution to the RC2014 emulation to confirm proof-of-concept
    3. Add ROM swap out to the emulation
    4. Add 32k of ram to give a full flat 64k of ram to the emulation
  2. Hardware:
    1. Get a RC2014 kit
    2. Build the RC2014 kit
    3. Make a test ROM to run on real hardware to verify my toolchain is working
    4. Create a new serial card that sits at port 0xC0 (second serial)
    5. Create the SD Drive firmware for the serial arduino
    6. Hack the ROM and Digital IO boards to allow for disabling the ROM
    7. Add 32k hacked RAM to the system
  3. Name: RC2014/LL
    1. At this point, the architecture is different enough and well defined enough that I think a new name for this configuration is in order. I call this configuration "RC2014/LL".
  4. Port CP/M
    1. Create the BIOS
    2. Create the sector-based emulation layer in the SD drive
    3. Boot CP/M
    4. Play Zork


I started out by making an emulator using the Z80 Pack emulation system.  Once I got this running, I realized the limitations of this emulator and looked around and found another emulator that suited my needs better. (I wanted a way to "swap" memory around, which Z80 Pack would not do in a way that wasn't a major hack.)

I created a layer that adds disableable memory regions, and added emulation of the 6850 ACIA serial chip, and threw the 32k RAM BASIC ROM at it, and it started right up, running BASIC!

I added a second 32k of RAM (easy to do when you're emulating it!), and started creating the SD interface, also using the 6850 ACIA for communications.  I then added a port, emulating the IO card, on which bit 0 (0x01), when set, will disable the ROM... So any reads to the low area of memory will then read from the RAM.  Whether this is set or not, all writes to that area of memory will actually happen to the RAM... they will just be hidden from reads until that bit is set.

I now have the basics of the RC2014/LL system emulated in software!  I created a boot/diagnostic ROM which can be used for all RC2014 systems which can probe memory to determine type (ROM, RAM, unpopulated), peek and poke memory, In and out IO, and other utility functions for the SD card interface.

Currently, I'm writing the SD card API which I will port to the Arduino Leonardo and SD breakout board which I have ordered.  Once those come this weekend, I'll shove them into my own serial board and burn a test ROM and see how it goes....

Sunday, February 21, 2016

Hacking my own Arduino Mega


At Interlock, I was handed the old controller board for a gutted 3D printer that was being rebuilt. "Do whatever you want with this." A close inspection of the board showed that it had a main microcontroller of the ATmega 1280, which is the chip used in older Arduino Megas.  The interface to USB however was an ATmega 8u2, which is the chip used in newer Arduino Megas, and you may also know it from older Arduino Unos... modern Uno R3s use a 16u2.

This board had custom firmware on it so that it didn't look like an Arduino, or any sort of serial connection to the host computer it's plugged into... so as-is, it was useless as a general-purpose Arduino that could take advantage of the GUI and clicky-clicky programmer interface.

So my thought was, it might be nice to have my own 'Mega for testing and such.  Could this board be set up in a way that might make this process and outcome easy?  Turns out it mostly was.


The original board got its power from the power terminals on the board, 24V.  It needed to power the stepper motors and such, so it needed to be beefy.  This was dropped down to 5 and 3.3 on the board itself.

There is a USB B jack for connecting this to a host computer, which did not have its 5V connected, so my thought was, what if I hooked up this 5V to the USB jack?  Would that be enough to power the chips?


I added this jumper, which connects the +5 on the USB jack to the 5v bus on the board, and plugged it in, and sure enough, it beeped and came to life without its host power supply.

Next up would be reprogramming the micros to have the arduino bootloader and code on them.


I hooked up my fairly cheesy Arduino D-15 (hacked stepper motor controller) ISP to the 6 pin header, which thankfully was already populated and labelled on the board!  I plugged it into the port labelled "1280 ISP", selected the Arduino Mega, with 1280 micro from the Arduino 1.6.6 menus, selected Arduino ISP for the programmer, then selected "load bootloader".  In about a minute, it seemed to have completed successfully.... if something didn't jive, it would spew out sync or device errors to the screen.  Seemed good so far!

Next, was hooking it up to the jack labelled 8u2 ISP.  This was a little trickier because I wasn't installing the bootloader (which the Arduino IDE makes REALLY easy to do), but rather the secondary micro's firmware, which basically was just a USB-Serial interface driver.

Long story short, I grabbed the 8u2 code from github, "MEGA-dfu_and_usbserial_combined.hex", and used the following command line (using a mixture of the code on that page, with the parameters that my system used via the arduino IDE on my Mac):

    ./avrdude -p at90usb82 -F -cstk500v1 -P/dev/cu.usbserial-A800czia -b19200 -U flash:w:8u2.hex  -U lfuse:w:0xFF:m -U hfuse:w:0xD9:m -U efuse:w:0xF4:m -U lock:w:0x0F:m -C/Users/me/Library/Arduino15/packages/arduino/tools/avrdude/6.0.1-arduino5/etc/avrdude.conf

In short, it sets the CPU to at90usb82, uses the stk500v1 communications protocol over the /dev/cu.usbserial driver, at 19200 baud.... it programs the file 8u2.hex, sets fuses and sets other avrdude configuration stuff.

After lots of text scrolling by from running that, I was able to drop a program I was working on, onto it via the Arduino IDE directly, without any problems at all! I set the port to the serial Mega, set the board to "Arduino Mega", cpu set at "Mega 1280", clicked 'upload' and bam, fully functional serial communications from the serial monitor down through to the '1280 on the board.


Whoo! Free Arduino Mega for me!

Edit: Here's the pinouts of stuff I beeped out.

 * 4 - Piezo +
 * 6 - heat
 * 7 - fan
 *
 * 24 - A Dir
 * 25 - A Step
 * 26 - A Enable
 * 27 - A Pot
 *
 * 28 - B Dir
 * 29 - B Step
 *
 * 36 - debug 2
 * 37 - debug 3
 * 38 - (nc)
 * 39 - B Enable
 * 40 - debug 4
 * 41 - PG0
 * 42 - TP33 / Z-MAX
 * 43 - TP32 / Z-MIN
 * 44 - Extra +/R85
 * 45 - bp heat
 * 46 - TP31 / Y-MAX
 * 47 - TP30 / Y-MIN
 * 48 - TP29 / X-MAX
 * 49 - TP28 / X-Min
 *
 * A0 - X Dir
 * A1 - X Step
 * A2 - X Enable
 * A3 - X Pot
 *
 * A4 - Y Dir
 * A5 - Y Step
 * A6 - Y Enable
 * A7 - Y Pot
 *
 * A8  - Z Dir
 * A9  - Z Step
 * A10 - Z Enable
 * A11 - Z Pot
 *
 * A12 - PK4 / JP7
 * A13 - PK5 / JP7
 * A14 - PK6 / JP6
 *
 * A15 - TP27 / HBP Therm

The molex switch connectors seem to have the pinout: (signal) (ground) (ground) (+5v)

Monday, February 1, 2016

A (mostly) Finished 6502 LlamaCalc(ulator) (RC2016/1 Post-Mortem)


February 1st sees the end of RetroChallenge RC2016/1.  My entry for this month was to create a calculator for the Commodore/MOS KIM-1, by way of 6502 and the KIM-Uno emulation project.  I wanted to have a working somewhat-calculator running on the system, but more importantly, I wanted to learn 6502 assembler.

So let's see what my goals were for this RetroChallenge, as I set them out at the beginning of the project:
Starting today, I'm going to attempt to better learn 6502 asm in my copious amounts of free time for the  RC2016/01 Retrocomputing Competition.  To prepare for this, over the past year I've gotten into working with Oscar Vermeulen's awesome KIM Uno kit, as well as pushing out my own updated firmware for it in the form of my Kim Uno Remix project on github. 
...
For the challenge, I want to use this system to make a simple integer programmer's calculator which I can run on the KIM Uno itself.  Press keys to shift in the nibbles, then switch it into a mode where i can affect the data.  Convert hex to decimal, do bitshifts, add, multiply, etc.
In short, even though I didn't accomplish everything I outlined here, I feel like I was completely successful in the project.  The calculator application is incomplete according to the above feature set, but that wasn't really the goal of this whole thing. I wanted to learn 6502 Asm, which I did. (I didn't finish the BCD to HEX conversions, nor did I implement multiply/divide math functions.)

What were my problems?

I think that one thing that held me back was getting my head around doing multibyte math with only a carry bit. For some reason I got it into my head that this wasn't enough, which of course it is.

Another thing that kept me from getting everything done was that I spent a lot of time to understand the BCD/Hex algorithms.  The code that I used was ultimately very similar to sample code online, but I decided that I really wanted to understand how it worked, so I didn't put it in until that was true.

And of course, just the general lack of time because of various other things including: my daytime job, playing with my kid, two contracts to work on at home, being sick, etc.

What did I achieve?

Over the course of the month, I learned a lot about how to work with such a limited set of registers.  I came from Z80 world where you have a bunch of 16 bit registers.  6502 has one 8 bit accumulator (A), two 8 bit indexing registers (X,Y) which each can only be used for certain operations.

Most everything, seemingly, is done by interacting with memory locations, specifically those in the "zero page". The 6502 has this idea where the 16 bit address's top byte is the "page" of memory.  The memory in the zero page would be bytes from $0000 through $00FF.  This is generally used for OS and general use variables, etc since there are small opcodes specifically for working with them.

I'm getting into too much detail. I'll instead outline all of my accomplishments for the month...

  • Learned 6502 ASM
  • Improved the "KIM Uno Remix" Desktop application (QT for portability)
    • Added a memory snooper
    • Better graphics palette
    • More speed support
  • Learned indexing (using X and Y registers)
  • Wrote the LlamaCalc input routine 3 times, learning 6502 opcodes better each time
  • Came up with a decent user interface for LlamaCalc that's somewhat learned-intuitive
  • LlamaCalc features implemented:
    • Display/UI states for LlamaCalc (Splash, Result, Menu, Error)
    • centralized interface for doing math functions, error handling, etc
    • 8 level stack of numbers to be used (changeable at build time)
    • Push/Pop stack functions
    • Hexadecimal to BCD conversion
    • bit shift left by one bit
    • bit shift right by one bit
    • 24 bit addition
    • 24 bit subtraction
  • Designed an RLE compression scheme for graphics
  • Added RLE decompressor to the source code projects
  • Played with optimizing screen display
  • Oh yeah, created a full repository for 6502 code, with libraries etc. on github
  • Every time I learned something new, I created another project in the Projects6502 repo

So yeah. I feel like i was successful...


I will soon have a walkthrough of using LlamaCalc using a KIM-Uno device.

Here's the source code for everything:

Monday, January 25, 2016

6502 - 24 Bit Math and a little BCD (RC2016/1)

I decided that another experiment/lesson to do on my way to making my calculator app was to learn how to do multibyte math and possibly experiment with BCD/Decimal vs Hexadecimal. (Or as Mark Watney calls them, "Hexidecimals".)

I kinda like breaking down this project into multiple "lessons" as it were.  It makes me feel like I'm following along lesson plans in a book.  Perhaps I should go the other way around and actually make the book I would be following if I were following a book to make this thing.

The code for this can be found in my github repository.

I broke down the application into a few main steps:

  1. display the last result
  2. add together the two previous results
  3. store that sum into a result variable
  4. repeat
When broken down further, we see that we also have to have some method for "kickstarting" it, as it were, since the first two numbers in the sequence do not follow the standard fibonacci rule. (Quick reminder: each value in the fibonacci sequence is the previous two values added together. Very simple.  For the first two values, there is no "previous two", so they are just hardcoded as "0, 1".)

Computation of the sequence can be described as :
  1. hardcoded "0"
  2. hardcoded "1"
  3. use algorithm to sum previous two values
  4. same as 3
  5. etc
For doing the math, I wanted to have variables that mimicked the 3 bytes we are able to display on the KIM, so I use 24 bits (3 bytes) to store them.  I broke down the math functions to be generic in that they can perform using two variables ("I" and "J") and store their result in a third variable "RESULT".  From there, additional functions were created to move the values around between them.  For example, we need to "roll" the values through if we want to make this repeatable. So the computation sequence can be described as:
  1. RESULT, I and J all set to '0'
  2. refresh display RESULT "0"
  3. RESULT gets "1"
  4. refresh display RESULT "1"
  5. shift the values through:
    1. J gets I's value
    2. I gets RESULT's value
  6. add:  RESULT gets the value from adding I and J
  7. repeat at step "4"
And this is basically the procedure as seen in the source.  
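
The same rolling procedure, as a rough Python sketch rather than 6502 code. The generator name fibonacci_24bit and the MASK24 constant are made up for the illustration, and it models hex (binary) mode only, with plain integers standing in for the three-byte variables.

```python
MASK24 = 0xFFFFFF  # three bytes, matching the six hex digits on the KIM display


def fibonacci_24bit():
    """Yield the values the display would show, stopping on 24-bit overflow."""
    result = i = j = 0
    yield result            # step 2: show "0"
    result = 1              # step 3
    while True:
        yield result        # step 4: show the current value
        j, i = i, result    # step 5: roll the values through
        total = i + j       # step 6: add I and J
        if total > MASK24:
            return          # the sum no longer fits in three bytes
        result = total


values = list(fibonacci_24bit())
print(values[:8])  # [0, 1, 1, 2, 3, 5, 8, 13]
```

On the KIM each `yield` would be a "refresh display and wait for a key" pause, and the early `return` is the point where the sum carries out of the third byte.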

The multibyte addition was actually a lot simpler than I thought it was going to be. My first thought was "how could this possibly work if i were to add like 100 to 100... you end up with "2" for the carry instead of "1"."  Obviously, you can see the error here, but for some reason this got stuck in my head and suddenly, all of the multibyte (16+ bit) math seemed near impossible to deal with.  I think it was the multiplication that seemed hard, but when you break it down as multistep additions instead of multiplications, it all makes sense.  I blame this on the cold and fuzzy head I have right now.  I'm just not thinking right... also extra time at work... sure... and um... an ARP storm.  all contributing factors to not thinking clearly. ;)

The basic procedure for doing multibyte math is to observe the carry bit.  The carry bit is set when math on two 8 bit values exceeds the 8 bit container.  If you think of it in decimal, when you add 1 to 9, you get "0" with a carry of "1" which ends up in the next digit space, resulting in a "10".  So if you were to add 99 and 04, you end up with 03 with a carry of 1, resulting in "103".  Math on the 6502 is no different, other than we're (probably) using hex where the value can go from 0-9,a-f rather than just 0-9 for each digit.  The math for addition is basically:
  1. for each byte (starting from the least significant on the right)
    1. add one byte to the other, with the carry bit from the previous byte
    2. store that result in the RESULT
Or, more precisely
  1. clear "Carry" (Carry = 0)
  2. register A gets I0 (A = I0)
  3. add J0 to A.  (A = A + J0 + Carry)
  4. store the result in RESULT0
  5. A = I1
  6. A = A + J1 + Carry
  7. RESULT1 = A
  8. A = I2
  9. A = A + J2 + Carry
  10. RESULT2 = A 
I think you can see that this can be carried out indefinitely for multiple bytes.
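
Here's that byte-by-byte addition as a small Python sketch, to show the carry rippling upward. The helper name add_multibyte is invented for the example, and it assumes the bytes are listed least significant first (I0, I1, I2).

```python
def add_multibyte(i_bytes, j_bytes):
    """Add two little-endian byte lists the way chained ADC instructions do."""
    carry = 0                        # CLC
    result = []
    for a, b in zip(i_bytes, j_bytes):
        total = a + b + carry        # ADC: A = A + operand + Carry
        result.append(total & 0xFF)  # STA: keep the low 8 bits
        carry = total >> 8           # 1 if the sum did not fit in 8 bits
    return result, carry


# 0x0001FF + 0x000001 = 0x000200, least significant byte first
print(add_multibyte([0xFF, 0x01, 0x00], [0x01, 0x00, 0x00]))  # ([0, 2, 0], 0)
```

The largest possible per-byte sum is 255 + 255 + 1 = 511, so the carry out of any byte can only ever be 0 or 1, which is why a single carry bit is enough.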

The "display the result" was pretty straightforward as well.  The "RESULT" bytes were stored into INH, POINTL and POINTH, and then the SCANDS function is called. This refreshes those three values out to the KIM's LED display.  Then a call to GETKEY stores the current key press value into the accumulator register.  If nothing is pressed, this fills A with $15, or KEY_NONE as I have it defined.  Then it just sits in a tight loop refreshing the display and waiting for any key to be pressed.
  1. refresh display
  2. check for key press
  3. no key press? repeat at 1
  4. return
So the end result is a program that advances to the next sequence number each time you press a key.

When it fills all the digits, that is, when we get a "carry" out of the third byte while doing the math, I display "EEEEEE" as a cheesy error display and wait for a press.  When something is pressed then, it resets and starts all over again.


As for BCD, I basically have run the code both in BCD (decimal) mode and hex mode, just to see how it works out.  Turns out I was worried for nothing.  It all 'just worked' fine in both modes.
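
For the curious, here's roughly what a decimal-mode add does to one byte, as a Python sketch. The function name bcd_adc is made up, and it only models the result and carry for valid packed-BCD inputs (two decimal digits per byte); it makes no claim about how the real chip sets its other flags.

```python
def bcd_adc(a, b, carry=0):
    """Add two packed-BCD bytes (two decimal digits each) plus a carry-in."""
    low = (a & 0x0F) + (b & 0x0F) + carry
    if low > 9:
        low += 6                     # skip the six unused codes A-F
    high = (a >> 4) + (b >> 4) + (1 if low > 0x0F else 0)
    if high > 9:
        high += 6
    result = ((high & 0x0F) << 4) | (low & 0x0F)
    return result, (1 if high > 0x0F else 0)


print("%02X carry=%d" % bcd_adc(0x99, 0x04))  # 03 carry=1
```

The carry out still means "this byte overflowed", just at 99 instead of $FF, so the same multibyte chain works unchanged in either mode.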

So yeah.  My throat is sore, and I'd love to just go to sleep right now.

Thursday, January 21, 2016

6502 - RLE Image Renderer (RC2016/1)


I finished up my RLE (Run-Length Encoded) image renderer last night.  It would have been much simpler but there were a few things that I wanted to deal with to have proper full support for sprite placement and large image rendering.

The basic concept of RLE is that instead of storing just a series of pixel colors, we also store the number of times each pixel is repeated.  As described in the previous post, we know that this hardware uses the lower nibble of each byte to store the color number.  We will use the upper nibble to indicate repetitions as well as other commands, which we'll get to later...

Using '0' for the number of repetitions makes no sense, so it will never be used when the image is encoded. (repeat "red" pixels 0 times? nope.) So we'll use '0' in the top nibble to indicate commands.  A few commands that we will need are:

$00 - End of image (stop rendering, return)
$0F - End of line (no more pixels on this line, start over vertically down one pixel from the start of this line)

Which leaves $01 through $0E, which we will use as a "skip".  Advance the screen position, but do not draw any pixels to the screen. We can use this to allow images to have "transparency".
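
To make the encoding concrete, here's a minimal Python sketch of a decoder for it. This is an illustration of the scheme as described, not a port of the 6502 routine: the function name render_rle, the list-of-lists "screen", and the tiny test image are all invented for the example.

```python
def render_rle(data, width=32, height=32, x0=0, y0=0, background=None):
    """Decode an RLE image into a 2D grid of color numbers."""
    screen = [[background] * width for _ in range(height)]
    x, y = x0, y0
    for byte in data:
        count, low = byte >> 4, byte & 0x0F
        if count == 0:
            if low == 0x00:          # $00: end of image
                break
            if low == 0x0F:          # $0F: end of line, back to the start column
                x, y = x0, y + 1
            else:                    # $01-$0E: skip pixels (transparency)
                x += low
            continue
        for _ in range(count):       # draw `count` pixels of color `low`
            if 0 <= x < width and 0 <= y < height:
                screen[y][x] = low
            x += 1
    return screen


# Two pixels of color 2, skip one, one pixel of color 1, new line,
# three pixels of color 6, end of image.
image = [0x22, 0x01, 0x11, 0x0F, 0x36, 0x00]
grid = render_rle(image, width=4, height=2)
print(grid)  # [[2, 2, None, 1], [6, 6, 6, None]]
```

Skipped positions keep whatever was already there (None in this toy grid), which is what gives sprites their transparency.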

One thing to deal with was that after 255 bytes (at most), the referencing will go into another bank.  If everything fits in one bank, that's fine, but the screen itself is 4 banks, so this was something that needed to be addressed.  (HA! Addressed! I'm hilarious!)  If this isn't dealt with, and we only are incrementing the lower byte of a two byte address, we'll just keep reading (or writing) forever inside of one bank. $41FE, $41FF, $4100, etc  rather than $41FE, $41FF, $4200, $4201 ...

So basically instead of just incrementing the screen pointer by one, indirectly using
    inc IMGPTR    ; will wrap around inside a bank. bad.
I instead had to add a '1' to it, then add the carry bit onto the high byte of the value.  I need to take a step back here.  The 6502 only really has grasp of 8 bit (one byte) values.  It can use 16 bit values for addresses, stored as two bytes, but all math functions happen on the one-byte scale.
    clc          ; clear the carry bit  (Carry = 0)
    lda IMGPTR   ; A = *IMGPTR
    adc #$01     ; A = A + 1 + C
    sta IMGPTR   ; *IMGPTR = A
      ; at this point, the carry bit is either set or not,
      ; so we will add 0 into the next byte with carry
    lda IMGPTR+1 ; A = *IMGPTR+1
    adc #$00     ; A = A + 0 + C
    sta IMGPTR+1 ; *IMGPTR+1 = A
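
If it helps to see the difference in numbers, here is a small Python model of the two approaches. It only mimics the arithmetic; the function names are made up for this example.

```python
def inc_low_only(ptr):
    """Model of `inc IMGPTR`: only the low byte changes, so it wraps in-bank."""
    low = (ptr + 1) & 0xFF
    return (ptr & 0xFF00) | low

def inc_with_carry(ptr):
    """Model of the clc/adc sequence: the carry out of the low byte is
    added into the high byte."""
    low = (ptr & 0xFF) + 1          # adc #$01 with carry clear
    carry = low >> 8                # 1 only when the low byte overflowed
    high = ((ptr >> 8) + carry) & 0xFF   # adc #$00 picks up the carry
    return (high << 8) | (low & 0xFF)

print(hex(inc_low_only(0x41FF)))    # 0x4100
print(hex(inc_with_carry(0x41FF)))  # 0x4200
```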

Why use RLE? A couple of reasons.  First of all, it will save ROM space.  The RLE-encoded (compressed) images should take a bit less space inside the ROM.  An alternative would be to store color data in both nibbles of each byte, then just shift them out to the screen.  We would lose the ability to do transparency, but that scheme would guarantee a 50% space savings over one byte per pixel.
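
For comparison, the nibble-packing alternative could look something like this in Python. This is just a sketch of the idea, not code from the project; putting the first pixel in the high nibble is an arbitrary choice made for this example.

```python
def pack_nibbles(pixels):
    """Pack 4-bit color numbers two per byte (first pixel in the high nibble)."""
    pixels = list(pixels)
    if len(pixels) % 2:
        pixels.append(0)            # pad odd-length input with color 0
    return bytes(((pixels[i] & 0x0F) << 4) | (pixels[i + 1] & 0x0F)
                 for i in range(0, len(pixels), 2))

def unpack_nibbles(packed):
    """Expand packed bytes back to one color number per pixel."""
    out = []
    for byte in packed:
        out.append(byte >> 4)
        out.append(byte & 0x0F)
    return out

row = [1, 2, 3, 4, 5, 6, 7, 8]
packed = pack_nibbles(row)          # 4 bytes instead of 8
```

Every color value from 0 to 15 means a real color here, so there is no spare value left over to mean "skip this pixel".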

The full source code for this project is over at github.

The image shown at the top of this post shows three sprites stored in the ROM.  They were hand-encoded from graph paper sketches from various sources.  The rainbow was just coded from scratch to test out everything.

The red ghost is obviously borrowed from Namco's "Pac-Man" arcade game.  The mouse is borrowed from Nintendo's "Goonies" arcade game. Both are used for educational/demonstrative purposes here.

Wednesday, January 20, 2016

6502 Learning (RC2016/1)... Video buffer sidetrack...

I got a little sidetracked while working on the KIM-Uno calculator, playing with video buffers. I had added a video buffer to the desktop Kim Uno Remix project. Ultimately, I want to make a compressed image decoder and viewer, to draw sprites or full-screen images to the screen.

I've written stuff like this before back on the Z80 for Pac-Man hardware, so I thought it would be a fun exercise to see how something like this would be implemented on 6502. It gives me a good chance to learn addressing modes and methods for this architecture... which is very different than Z80's.

You can attempt to play along by using this web-based assembler system. I based the video buffer in KIM Uno Remix on this system.  There are a few differences though...

6502asm.com:

  • 32x32 pixels
  • 16 colors
  • Commodore 64 palette
  • starts at $0200, continues horizontally then down, starting top left
  • one byte per pixel
  • bottom nibble indicates the color ($00..$0F)
  • top nibble is ignored
  • code starts at $0600
KIM Uno Remix:

  • 32x32 pixels
  • 16 colors
  • Modified Deluxe Paint palette
  • starts at $4000, continues horizontally then down, starting top left
  • one byte per pixel
  • bottom nibble indicates the color ($00..$0F)
  • top nibble is ignored
  • code starts at $0200
Here's the output from a small program (shown below) that shows off the palettes of the systems. The KIM Uno is on the left, and shows the very reasonable "rainbow" palette.  The one on the right shows the more convoluted "Commodore 64 palette" of the web tool.


The colored sections are 8 rows of 32 pixels across. Since there are only 16 colors, the color stripes appear twice across the width of the screen.

The code to run the above was essentially identical on both systems but there are some tweaks to accommodate the addresses and some minor differences between the CC65 tools that I use and the web-based tool.


Here's the source code listing used for CC65, which generated the image on the left above.  You can see that it writes to two of the four banks of memory space, at $4000 and $4200, while not doing anything with the $4100 and $4300 banks, which is why we see two segments of stripes, and two segments of black in the above image. It is a very simple program that simply increments "X" and writes it to videobuffer[x].
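
As a rough stand-in for what that program does, here is a Python model of the memory writes and of how the display turns them into colors. It is not the 6502 listing; the names mem, VIDEO_BASE and color_at are made up for this example, and it assumes the KIM Uno Remix layout described above (buffer at $4000, 32 pixels per row, low nibble is the color).

```python
VIDEO_BASE = 0x4000
mem = bytearray(0x10000)            # 64k of memory, all zero (black)

for x in range(256):
    mem[VIDEO_BASE + 0x000 + x] = x   # first bank,  $4000..$40FF
    mem[VIDEO_BASE + 0x200 + x] = x   # third bank,  $4200..$42FF

def color_at(col, row):
    """Color number shown at (col, row): low nibble of the pixel's byte."""
    return mem[VIDEO_BASE + row * 32 + col] & 0x0F

top_row = [color_at(col, 0) for col in range(32)]
```

Each bank covers 8 rows of 32 pixels, and the low nibble cycles through 0..15 twice per row, which is where the two sets of stripes come from. The untouched banks at $4100 and $4300 stay at color 0.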


And here's the source for the web-based tool.  I colorized it to match the above CC65/KIM Uno Remix listing.  Notice that the program is the same, although it uses $0200 and $0400 for the screen memory, skipping the $0300 and $0500 sections.  I also switched the "unnamed label" from the above code to be a label named "loop" for this one, since the web-based tool apparently doesn't support unnamed labels.

Saturday, January 9, 2016

KIM Calculator Update: Learning 6502 and reducing code size

One of the functions of this KIM-Uno project involves functionality similar to the stock KIM monitor.  You press buttons 0-9, A-F to enter a number (a nibble, a half byte), and it scrolls in from the right side of the display.  Usually you can only enter the address bytes (first 4 digits) or the data byte (rightmost 2 digits).  I wanted to use all 6 digits for the values entered so I needed to write my own handler for this. I decided to kinda glance at the KIM Monitor code, but I wanted to make it all on my own.

Version 1 sketch...

Version 2 sketch.

Both of the above (which didn't quite work) were basically the same as the working version (v3) which I did implement and had in the code for a week or so.  It was 46 bytes long, plus a sub function which got called 3 times that was 20 bytes long.  Not very small.  The basic procedure was this:


  1. Get the key from the user (0x00 - 0x0F), store it aside
  2. for each of the three display bytes (eg 0xMN)
    1. Store it in X (input byte)
    2. shift it to the left by 4 bits (one nibble, one display digit)
    3. store aside this value 0xN0
    4. restore the display byte (0xMN)
    5. shift it to the right by 4 bits (one nibble, one display digit)
    6. store aside this value now (0x0M) this is the "carry out"
    7. Add the user input key with the first stored value  0xNK and put in "X" (output byte)
    8. restore the "carry out" to "A"
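
Here is the same procedure as a short Python model, in case the steps are easier to follow that way. It is a sketch of the logic only, not the 46-byte routine itself. The function name is made up, and it assumes the three display bytes are passed left to right with the rightmost pair last.

```python
def shift_in_key_v3(display, key):
    """display: three bytes, leftmost digit pair first; key: 0x0..0xF."""
    out = list(display)
    carry = key & 0x0F                  # step 1: the key is the first "carry in"
    for i in (2, 1, 0):                 # rightmost display byte first
        byte = out[i]                   # 0xMN
        shifted = (byte << 4) & 0xF0    # 0xN0
        carry_out = byte >> 4           # 0x0M, the "carry out"
        out[i] = shifted | carry        # 0xNK
        carry = carry_out               # feeds the next byte to the left
    return out

result = shift_in_key_v3([0x12, 0x34, 0x56], 0xA)   # display 123456, press A
```

With the display showing 12 34 56 and the A key pressed, the model gives 23 45 6A: everything moves one digit to the left and the new key lands on the right.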

That's basically it. Nothing particularly wrong with it, it works fine, but once I understood more of the opcodes available in the system, I could drastically reduce this.

I was working on implementing one of the calculator functions, "Shift left one bit", when I learned/remembered about shifting with ROL (rotate left) and ROR (rotate right), which shift the bits around, using the carry flag to hold both the bit going out and the bit coming in.  The implementation of this function was super easy using this:

clc          ; clear carry bit (shift in a '0')
rol DIGIT3   ; shift data in memory location DIGIT3 one bit
rol DIGIT2   ; these take the bit shifted out and store it
rol DIGIT1   ; in the carry bit, then shift that bit in
jsr DISPLAY  ; and display it

Then it hit me, I could leverage off of this carry bit for the above process, since it basically is:

  • Shift all three bytes to the left by one bit four times (one nibble) 
  • Shove the key in to the lower nibble of DIGIT3
This change of shifting everything by one bit four times, rather than the above where I was shifting four bits three times, worked out perfectly with the available opcodes.  Here's the final code for this routine:  (INH is the third digit pair, POINTL and POINTH are the second and first digit pairs of the display respectively)

We set up a loop using "x" as the counter register; notice we set it to '4' first.  We are also doing something that I just learned about, which is unnamed labels. That's why you see ":" starting a line, then a "bne :-"  (branch, i.e. goto, back to the previous unnamed label if the result was not zero).

keyShiftIntoDisplay:
        ldx     #$04            ; 4 bits to shift
:       clc                     ; rol pulls from carry, so clear it
        rol     KIM_INH         ; shift this byte by 1 
        rol     KIM_POINTL      ; shift this one, carry from INH
        rol     KIM_POINTH      ; shift this one, carry from POINTL
        dex                     ; x = x - 1
        txa                     ; a = x 
        cmp     #$00            ; a == 0?
        bne     :-              ; not 0, repeat loop

        ; now shove the content in
        lda     KEYBAK          ; restore key 00 .. 0F to A
        ora     KIM_INH         ; A = A | INH
        sta     KIM_INH         ; INH = A
        jsr     SCANDS          ; and display it to the screen
        jmp     keyinput        ; next!
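
To check the idea without a KIM handy, the routine can be modeled in a few lines of Python. This only mimics what ROL does to a byte and the carry bit; it is not the asm, and the function names are made up for this example.

```python
def rol(byte, carry):
    """Model of 6502 ROL: shift left one bit, carry goes in at bit 0,
    old bit 7 comes out as the new carry. Returns (new_byte, new_carry)."""
    result = (byte << 1) | carry
    return result & 0xFF, result >> 8

def key_shift_into_display(pointh, pointl, inh, key):
    for _ in range(4):                      # ldx #$04 ... bne :-
        carry = 0                           # clc
        inh, carry = rol(inh, carry)        # rol KIM_INH
        pointl, carry = rol(pointl, carry)  # rol KIM_POINTL
        pointh, carry = rol(pointh, carry)  # rol KIM_POINTH
    inh |= key & 0x0F                       # ora: shove the key in
    return pointh, pointl, inh

digits = key_shift_into_display(0x12, 0x34, 0x56, 0xA)
```

Starting from 12 34 56 and pressing A, this gives 23 45 6A, the same result as the longer nibble-at-a-time procedure.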

Which works out substantially smaller.  It has one chunk of 13 bytes that gets run four times, for the entire code block of 27 bytes.  Quite a lot less!  I'm really enjoying learning 6502 asm for this!

The code for this project can be found on github in the LlamaCalc directory of my Projects5502 repository.  This requires the cc65 toolset to build.

Friday, January 1, 2016

Retrocomputing Challenge 2016-1: Learning 6502, KIM-Uno stuff

Starting today, I'm going to attempt to better learn 6502 asm in my copious amounts of free time for the RC2016/01 Retrocomputing Competition.  To prepare for this, over the past year I've gotten into working with Oscar Vermeulen's awesome KIM Uno kit, as well as pushing out my own updated firmware for it in the form of my Kim Uno Remix project on github.

Part of that project was to make it more portable and make it available on other platforms.  I have a preliminary iOS build of it, as well as a QT-based desktop build of it checked in which builds on Mac, Windows and Linux.  Source at github will build for all of these, if you have QT Creator installed (along with support compilers for your system of course.) Binaries will be available eventually for all platforms.



In the above screenshot you can see some 6502 asm in the center window. I have a makefile which uses the industry(homebrew) standard(?) cc65 compiler/assembler  to assemble it into a .lst "listing file".  This file contains the original ASM as well as the machine language bytes and the addresses they sit at.  This is a lot better, imo for distribution as it can be easily trimmed (ref: unix 'cut') to the original asm, or it provides the necessary information to hand-enter it into a KIM.

The KIM Uno emulator can be seen on the leftmost window. It looks like you'd expect a KIM on a desktop to look.  There are two other windows here though.  The Video Display shows a virtual framebuffer which sits in KIM memory at $4000 (0x4000).  It is 32x32 pixels, one byte per pixel.  Only the bottom nibble is currently used in the byte, to signify one of 16 colors ($00, $01... $0E, $0F).  At compile time you can use the Commodore 64 palette, or the Amiga palette, which is based on the default colors from Deluxe Paint.  This feature is heavily influenced (copied/borrowed) from the very awesome virtual machine/programming interface available at 6502asm.com.  Theirs sits at $0200, which collides with KIM stuff, so I moved it to $4000.  Otherwise it behaves the same.

There are a couple of windows not shown, including a serial terminal emulator that connects to an emulated UART in the KIM, so that you can run the chess application or what have you.  Also available is a memory browser that lets you look through the entire 64k memory space of the 6502. It allows you to have it update automatically so that you can see changes as they occur.  Very handy for debugging.

Now here's where things get neat...

The window in the bottom right is for a feature I call "Code Drop".  You can take one of the .lst files mentioned above (generate one by running "ca65 project.asm -l project.lst") and drag and drop it to that window. Or you can click "Browse..." and pick it from your filesystem.  Now, when you hit "Load to RAM", it will load in that .lst file, and drop the bytes in the appropriate place in RAM, while the emulation is still running.

The "Auto ADDR seek" feature will then auto type-in for you the first address specified in the LST.  The "Auto GO" feature will do the seek, then press "GO" for you as well.

The application is also sensitive to the SIGUSR1 signal, which does the same as pressing the "Load To RAM" button.

So here's what you (I) do...

The desktop application is set to the appropriate .lst file for the project I'm working on.  It is set for "auto GO" as seen above.  Now in the makefile for the project,  it will build the .lst file, then send the SIGUSR1 signal to the application.  When I type 'make', it assembles the file, builds the lst, then triggers the emulator to reload and restart the code, essentially integrating it into my build process.
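
The signal part of this is easy to try on its own. Below is a minimal Python stand-in for the mechanism, assuming a Unix-like system (SIGUSR1 does not exist on Windows). It is not the emulator's code: the handler just records that a reload was requested, and the process signals itself where the makefile would signal the running emulator.

```python
import os
import signal

reload_requests = []

def on_sigusr1(signum, frame):
    # Stand-in for the "Load to RAM" action: just note that it was asked for.
    reload_requests.append(signum)

# Register the handler, like an application listening for SIGUSR1 would.
signal.signal(signal.SIGUSR1, on_sigusr1)

# What a build step would do from outside, e.g. `kill -USR1 <pid>`.
# Here the process simply signals itself.
os.kill(os.getpid(), signal.SIGUSR1)
```

The nice thing about a signal is that the build step needs nothing more than the process ID of the running application.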

.oOo.

For the challenge, I want to use this system to make a simple integer programmer's calculator which I can run on the KIM Uno itself.  Press keys to shift in the nibbles, then switch it into a mode where I can affect the data.  Convert hex to decimal, do bitshifts, add, multiply, etc.