1.44M floppy on Color Computer (was Re: Ebay reaches new low)

From: Eric Smith <eric_at_brouhaha.com>
Date: Wed Nov 10 01:09:25 1999

"Sean 'Captain Napalm' Conner" <spc_at_armigeron.com> posted some
analysis of I/O transfer loops on a Color Computer (6809):

> Well, the smallest loop I can see is:
> POLL LDA <$STATPORT ; 4 [1]
> ANDA #TESTBIT ; 2
> Bcc POLL ; 3
> The poll loop is 9 cycles long, and the dataread portion (including

Or perhaps:

POLL:   BITB <$STATPORT         ; 4
        Bcc POLL                ; 3

For a poll loop of 7 cycles, assuming that B is preloaded with the
appropriate mask.

Still nowhere close to good enough. The poll loop has to go.

> If the hardware is set up such that the DRQ bit is tied to an interrupt
> (and for the loop, it's the only active source of interrupts), you can get a
> minimum of 17 cycles:
>
> POLL SYNC ; 2+
> LDA <$DATAPORT ; 4
> STA ,X+ ; 6
> DECB ; 2
> BNE POLL ; 3
>
> The only way I can see in speeding this up is to tie the reading of the

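Or you can partially unroll it and use D:

        SYNC
        LDA <$DATAPORT

POLL:   SYNC                    ; 2+
        LDB <$DATAPORT          ; 4
        STD ,X++                ; 8

        SYNC                    ; 2+
        LDA <$DATAPORT          ; 4
        CMPX #BUFEND-2          ; 4
        BNE POLL                ; 3

        SYNC
        LDB <$DATAPORT
        STD ,X++

which gets you down to minimum times of 14 and 13 cycles on alternate
bytes. B can't double as the loop counter here, since it is reloaded
with data on every pass, so the loop ends on a pointer compare instead;
BUFEND (a name I'm making up) is the address just past the end of the
sector buffer. Unfortunately 14 cycles is too close to the limit.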
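
To put a number on "too close", here is a rough sketch of the per-byte
budget. It assumes the stock CoCo CPU clock of 14.31818 MHz / 16 and one
byte arriving every 16 microseconds at nominal disk speed; the names are
made up for illustration.

```python
# Per-byte timing budget for the transfer loops discussed above.
# Assumptions: CPU clock = 14.31818 MHz / 16 (stock CoCo),
# one byte every 16 microseconds at nominal disk speed.
CPU_HZ = 14_318_180 / 16      # about 0.895 MHz
BYTE_US = 16.0                # microseconds between bytes


def cycles_to_us(cycles):
    """Convert a 6809 cycle count to microseconds at the assumed clock."""
    return cycles * 1e6 / CPU_HZ


def margin_percent(cycles):
    """Slack left in the byte window, as a percentage of the window."""
    return 100.0 * (BYTE_US - cycles_to_us(cycles)) / BYTE_US


# Cycles available between two consecutive bytes.
budget_cycles = BYTE_US * CPU_HZ / 1e6

print(f"budget: {budget_cycles:.2f} cycles per byte")
for n in (17, 14):
    print(f"{n} cycles = {cycles_to_us(n):.2f} us, "
          f"margin {margin_percent(n):+.1f}%")
```

Under those assumptions the budget is a little over 14.3 cycles, so the
17-cycle loop overruns and a 14-cycle path leaves only about 2% of slack.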

For a write operation, you might be able to do slightly better by using
"PULU A,B" in place of "LDD ,X++", saving one cycle.

> Now, given the same hardware (tying DRQ to the DATAPORT to cause the CPU
> to wait until ready) plus a signal that indicates the end of the transfer
> tied to the NMI, you can save an additional two cycles:
>
> POLL LDA <$DATAPORT ; 4
> STA ,X+ ; 6
> BRA POLL ; 3
>
> For 13 cycles, or on a Coco, 14.6uSecs (minimum). At the cost of some

Might be good enough. It would be nice to have more margin for speed
variation, and I don't think that the Coco can generate an NMI on
completion, although in this case an IRQ should do. Of course, the
hardware also would have to be designed to release the wait when the
interrupt occurs.

Partially unrolling this one but NOT using the completion interrupt
yields:

        LDA <$DATAPORT

POLL:   LDB <$DATAPORT          ; 4
        STD ,X++                ; 8

        LDA <$DATAPORT          ; 4
        CMPX #BUFEND-2          ; 4
        BNE POLL                ; 3

        LDB <$DATAPORT
        STD ,X++

For a minimum of 12 and 11 cycles on alternate bytes. The loop ends on a
pointer compare rather than a count in B, since B is reloaded with data
on every pass; BUFEND (a name I'm making up) is the address just past the
end of the sector buffer. I think this one allows sufficient margin that
it should work even if the disk is running 15% fast. Based on what Tony
said, it sounds like the Coco's FDC hardware can probably support it.
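
As a sanity check on the speed margin, a rough sketch of how far over
nominal speed the disk could run before the slowest path through a loop
misses a byte. It assumes the stock CoCo CPU clock of 14.31818 MHz / 16
and a nominal 16 microseconds per byte; the function name is made up.

```python
# Disk overspeed tolerance for a transfer loop, given its worst-case
# cycle count per byte.
# Assumptions: CPU clock = 14.31818 MHz / 16 (stock CoCo),
# one byte every 16 microseconds at nominal disk speed.
CPU_HZ = 14_318_180 / 16
NOMINAL_BYTE_US = 16.0


def max_overspeed_percent(worst_case_cycles):
    """Percent above nominal speed at which the byte period shrinks to
    exactly the time the slowest loop path takes."""
    loop_us = worst_case_cycles * 1e6 / CPU_HZ
    return 100.0 * (NOMINAL_BYTE_US / loop_us - 1.0)


for cycles in (14, 13, 12):
    print(f"{cycles}-cycle worst case: disk may run up to "
          f"{max_overspeed_percent(cycles):.1f}% fast")
```

With a 12-cycle worst case this comes out at a bit over 19%, which is
consistent with the 15% figure; a 14-cycle worst case tolerates only
about 2%.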

As I said several posts back, transferring a byte every 16 microseconds
on a sub-1 MHz processor is tricky.
Received on Wed Nov 10 1999 - 01:09:25 GMT
