!!! Ironman Atari - A Compilation of Advanced Atari 8-bit Programming Techniques

by Mark Schmelzenbach

[{TableOfContents }] \\\



!! Introduction

Welcome to "Ironman Atari", a collection of articles detailing new techniques in Atari 8-bit programming. Although the golden age for Atari has long passed, several new programming methods have been discovered or rediscovered over the past decade. These new methods push and twist classic Atari iron in ways that might have surprised even Jay Miner himself.

Today, the internet provides a forum that allows people to share information in a way that was never possible before. This is a compilation of articles gathered for the purpose of preserving this knowledge for future programmers. Who knows? In another 25 years, there may still be people hacking away on their trusty Atari computers. Many of the articles here have been gathered from a variety of sources, including USENET, user forums, and solicited articles. Wherever possible, credit has been given to the original authors. If you find an attribution that is missing or in error, please contact me so that the attribution can be corrected.

This book will use two primary reference machines, the Atari 800XL and the Atari 130XE. Both of these reference machines are unmodified NTSC machines. Although there are places where other machines may be required (for instance a PAL machine, the near mythical 320XE, a double POKEY machine, or other hardware exotics), these are exceptions and will be noted as such. In addition, most examples will work fine on a factory standard Atari 800 with a GTIA chip unless explicitly stated otherwise. It should also be noted that almost all of these techniques can also be used on the Atari 5200, since this machine has graphics hardware nearly identical to that of the Atari 8-bit computer line.

All examples will be written for ATasm, a MAC/65 compatible assembler, and will assemble on real hardware using MAC/65 unless specifically stated. 

It is assumed that the reader is already familiar with the Atari computer platform and assembly language. Additional reference material is provided below: 

__Atari Graphics and Advanced Arcade Game Design__, by Jeffrey Stanton and Dan Pinal. The full text of this book is available online at [http://www.atariarchives.org/agagd]

__Atari Hardware Manual__

__Atari Roots: A Guide to Atari Assembly Language__, by Mark Andrews. The full text is available at [http://www.atariarchives.org/roots]

__De Re Atari__, by Chris Crawford, Lane Winner, Jim Cox, Amy Chen, Jim Dunion, Kathleen Pitta, Bob Fraser, and Gus Makreas. This is THE seminal work on the Atari home computer and should be considered required reading. The full text can be found online at [http://www.atariarchives.org/dere]

__Dr. C. Wacko's Miracle Guide to Designing and Programming Your Own Atari Computer Arcade Games__, by Robert Kurcina, David L. Heller, and John F. Johnson. This book can still be found on [http://www.amazon.com].

__Mapping the Atari! Revised Edition__, by Ian Chadwick. The full text can be found online at [http://www.atariarchives.org/mapping]

Eventually, I would like to see this expand into a larger work with well-commented programs and games that demonstrate the advanced techniques described. It is not just technology that has advanced over the last 25 years, but also the sophistication of the gaming market. Several games that exist today, particularly on handheld devices such as phones and the GBA/DS/PSP or in the emerging Flash-based minigame market, are not out of reach for the Atari 8-bit. All it takes is the knowledge to program and the time to do so. Here is to the next 25 years!

--Mark Schmelzenbach, ed. 

Disclaimer: As always, errors inevitably creep into a work such as this. If you find an error or better yet, if you wish to submit an article for inclusion in a future release, please contact me.

!! Programming Environment

The Atari programming environment has changed dramatically over the last 25 years. Most development work can now be done cross-platform by mere mortals on a PC instead of on specialized mainframes or squinting at a flickering television on the actual Atari hardware. This opens up new realms previously unavailable. Numeric tables and graphics can be precalculated using today's big muscle CPUs to be used by the native Atari. Large projects can be assembled in less than a second instead of listening to the old floppy drive grind away for a quarter of an hour. The assemble/test/debug cycle is greatly reduced by using an emulator instead of waiting for the real iron to reset. This allows for quick iteration, testing and development of ideas.

The following is a list of my current programming environment. HTML links and short descriptions of each item are provided below. I work under Linux, so most of the items below are heavily UNIX based rather than Windows. Many of the UNIX tools either work on the Macintosh or can be ported with a bit of effort. There are some excellent Windows-only tools, but I do not use these. 

Additional disclaimer: I am the author of ATasm and envisionPC.

! Hardware 
* Pentium 4 machine running Linux 
* SIO2PC cable: These cables can be purchased through the AtariAge store, from Steven Tucker, on eBay, or you can build your own.  This cable in conjunction with the proper software turns your PC into a fileserver for the real Atari computer. 
* AtariSIO version 0.2.0:  AtariSIO is a Linux kernel module and user-level program by Matthias Reichl that allows the use of a SIO2PC cable under Linux.  AtariSIO is available at [http://www.horus.com/~hias/atari] 
* AtariMax flash cartridge (1 megabit):  This cartridge is also available in an 8 megabit size. AtariMax flash cartridges are reprogrammable flash-memory based cartridges.  This effectively provides extra memory that can be quickly paged in for things like unrolled code loops, look-up tables, etc.  More information can be found at [http://www.atarimax.com/flashcart/documentation] 
* Atari 800 (NTSC, GTIA, 64K expansion) 
* Atari 130XE (NTSC, stock)

! Assemblers 
* EMACS as primary editor (vi on occasion) 
* ATasm v #5:  ATasm is highly compatible with the original OSS Mac/65 native assembler.  ATasm is written in C, and compiles without modification on any platform that has a GCC compiler.  ATasm has been specifically designed for the development of programs for the Atari home computer.  It can produce Atari native binary load object files and raw cartridge images, and can optionally target the machine state files produced by many emulators.  Binary load files can also be written to disk images for easy loading in other emulators or onto real hardware via SIO2PC.  It is available at [http://atasm.sourceforge.net] 
* Another popular Atari-specific assembler is XASM by Fox of Taquart. This assembler is compatible with JBW's Quick Assembler, which was the primary assembler used in Poland and other eastern European countries during the 1990s. XASM includes several pseudo-instructions (like mva) and pseudo-indexing modes. In addition, it can generate Atari native binary object files and includes utilities to create boot disk images.  XASM can be found at [http://atariarea.krap.pl/x-asm/]  
* If you are looking for a Linux assembler that is Quick Assembler compatible, try a program called Zooey, found at [http://atari8.sourceforge.net/zooey.html] 
* Another assembler worth considering is the MADS assembler, found at [http://mads.atari8.info/].  The latest documentation is always provided in Polish; however, there is an English translation of version 1.9.5 provided here: [http://mads.atari8.info/mads_eng.html]

! Compilers 
* Some projects have been written using cc65, a version of C that targets the 6502.  In addition to providing a high-level language, the project also provides an assembler and a powerful linker, allowing basic library management.  cc65 can be found at [http://www.cc65.org/]. Please note that cc65 is no longer developed; support has slowed down, and the website now leads to other sites.

! Graphics editors 
* envisionPC is a font/map editing program similar to the original Envision program written for the Atari. It runs on an IBM PC (either Linux or DOS/Windows) and includes all source code.  It will load and save maps and character sets to disk images, MAC/65, and Action! formats.  It is available at [http://atari.miribilist.com/envision/index.html] 
* gEnvision is another Envision-like program for Linux, written by Larry Richardson.  gEnvision will allow you to edit Atari character sets in either single or multicolor modes.  It will let you create character based "maps" of up to 256x256 characters.  It will save character sets and maps as MAC/65 source code. gEnvision is available at [http://bellsouthpwp.net/r/i/rich5462] 
* Graph2Font is a conversion utility for full screen pictures that has evolved into a full-featured graphics editor and conversion tool.  The latest version can be found at [http://g2f.atari8.info] 
* GIMP and a number of homebrew tools for graphics conversion

! Sound editors 
* RMT: RASTER Music Tracker (RMT) is a cross-platform tool for making music on a Windows PC. RMT uses Atari music routines written by Radek Sterba. The latest version (1.28) is available at [http://raster.infos.cz/atari/rmt/rmt.htm]. Please note that RMT is no longer supported due to Radek passing away.
* sox: Sound eXchange : universal sound sample translator is a UNIX utility used to convert samples between formats. 
* Homebrew tools to convert .WAV to 4-bit samples playable on the Atari. 

! Emulators 
* Atari800 1.3.3 emulator:  This is a free and portable Atari 800/XL/XE/5200 emulator, originally written by David Firth and now developed by the Atari800 Development Team headed up by Petr Stehlik.  This is the primary emulator that I use.  It has a very nice monitor for debugging and has recently added cycle-exact emulation and greatly improved POKEY emulation. The Atari800 emulator is available at [http://atari800.sourceforge.net]. 
* The Atari++ emulator is also an excellent emulator by Thomas Richter.  It was the first cycle-exact emulator and provides many useful functions including a very nice monitor and emulation of Flash cartridges.  Atari++ is available at [http://www.math.tu-berlin.de/~thor/atari++] 
* Atari800Win Plus: This emulator is Windows only, but is considered one of the best emulators around.  It can be found at [http://www.a800win.atari-area.prv.pl]. 

Although cross-platform development has come a long way, always remember that it is vitally important to test your programs on the real machine.  Many of the techniques detailed within require cycle-exact timing and push the hardware in strange ways that emulators may or may not properly emulate. 



!! Graphics Modes 

! Introduction 

The first three techniques related in this section rely on "bugs" in the GTIA chip.  The 
original CTIA chip did not have GTIA modes 9, 10, and 11.  However, in early 1982 Atari 
began shipping machines with the GTIA chip, which provided 3 new graphics modes.  It appears 
that the GTIA was originally slated to ship with the original units, but manufacturing 
issues delayed production (see Atari History sec?).  The new GTIA modes can be selected 
by setting the display list to ANTIC mode f (graphics 8), and then setting the upper two 
bits of the GPRIOR register appropriately. 

* GTIA mode 9 allows 16 shades of a single color 
* GTIA mode 10 allows for 9 colors from the Atari palette, one per color register (704 to 712) 
* GTIA mode 11 allows 16 colors of a single shade 
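
As a rough illustration of "setting the GPRIOR register appropriately": the GTIA mode is chosen by the top two bits of GPRIOR (shadow register 623, hardware register PRIOR at $D01B). The sketch below is written in Python, as a cross-platform build script might be; the function name {{gprior_value}} and its {{priority_bits}} parameter are made up for this example.

```python
# Top two bits of PRIOR/GPRIOR select the GTIA mode; the low six bits
# keep their usual meaning (player/playfield priority and so on).
GTIA_MODE_BITS = {0: 0x00, 9: 0x40, 10: 0x80, 11: 0xC0}

def gprior_value(gtia_mode, priority_bits=0):
    """Return the byte to store in GPRIOR for the given GTIA mode.

    gtia_mode     -- 0 (no GTIA mode), 9, 10 or 11
    priority_bits -- low six bits to preserve (hypothetical parameter)
    """
    if gtia_mode not in GTIA_MODE_BITS:
        raise ValueError("GTIA mode must be 0, 9, 10 or 11")
    return GTIA_MODE_BITS[gtia_mode] | (priority_bits & 0x3F)

for mode in (9, 10, 11):
    print("GTIA mode %2d -> GPRIOR = $%02X" % (mode, gprior_value(mode)))
```

Keeping the low six bits separate makes it harder to clobber the player/missile priority settings by accident when switching modes.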

! Splitting modes vertically 

The "unity" demo by Our-5oft has a screen that displays 3 different graphics modes on a 
single scan-line: graphics mode 8 (ANTIC mode f), mode 9, and something similar to 
mode 15 (ANTIC mode e). This effect is accomplished by altering the PRIOR register (the 
hardware register behind the GPRIOR shadow) on-the-fly within a DLI: write a 0 into the 
register at the beginning of the scan-line, wait some time, switch to GTIA mode 9 by 
writing #$40, and finally write a 0 again in the last 1/3 of the scan-line. Rather than 
returning the display to mode f, the GTIA becomes confused and displays the remainder 
of the line as ANTIC mode e. 

This effect can be seen in several demos, including "unity", and is used in the game 
Admirandus by MK-SOFT.  Below is a kernel that demonstrates this technique. 

{{{
WSYNC   = $d40a
PRIOR   = $d01b

dli     pha             ; preserve registers 
        txa 
        pha 
        tya 
        pha 
        ldy #64         ; display is 64 scan-lines tall 
wloop   sta WSYNC       ; wait for the start of the next scan-line 
        lda #0          ; no GTIA mode: line starts as mode f 
        ldx #$41        ; for GTIA 9 
        sta PRIOR 
        nop             ; wait for ~24 cycles 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        nop 
        stx PRIOR       ; middle of the line: GTIA mode 9 
        nop 
        nop 
        sta PRIOR       ; back to 0: rest of the line shows as mode e 
        dey 
        bne wloop 
        pla             ; restore registers and exit 
        tay 
        pla 
        tax 
        pla 
        rti 
}}}

This technique can also be used to split between GTIA modes, so with careful counting it 
should be possible to have 4 or possibly 5 modes on a single line.  Determining a good 
reason to do this is up to the game author. 

Notice that this technique will work in text modes as well as in graphics modes, but cycle 
counting becomes problematic in text modes, as ANTIC steals extra cycles on the first 
scan-line of every character row to fetch the character names.

! HIP/RIP 

This section is based on a translation of the original article published in Polish magazine 
"Atarynka", no. 2/2002 and the original HIP FAQ written by Heaven of Taquart.
 
HIP (HARD-Interlace Picture) was discovered by members of HARD Software in 1996. 
This mode allows a nearly flicker-free display of 30 shades with a resolution of 160x200. 
The basic idea exploits a bug in the GTIA that appears when alternating 
between lines of GTIA 9 and GTIA 10.  Whenever a GTIA 10 line is displayed, the GTIA 
chip ends up shifting the line by half of a pixel.  Interlacing two alternating display lists 
allows the generation of an apparent 160 pixel horizontal resolution. 

It should be noted that HIP is not the first technique to mix GTIA modes.  However, 
earlier techniques such as APAC mixed modes 9 and 11, yielding a display capable of 
256 colors.  Mixing these modes does not cause the GTIA to shift, and so displays that 
did this were limited to a horizontal resolution of 80 pixels. 

A HIP display consists of two alternating display lists and associated pictures, as 
shown in the table below.  By appropriately selecting the GTIA mode 10 palette and 
carefully placing the mode 9 pixels, HIP achieves an average bit-depth of 3.5 bits 
(GTIA mode 9 has 4 bits, GTIA mode 10 has 3). 
 
|| DLIST ONE || DLIST TWO
| mode 9  | mode 10
| mode 10  | mode 9   
| mode 9  | mode 10   
| mode 10  | mode 9   
| ...  | ...  
 

Since GTIA mode 9 obtains its background color from register 712 and mode 10 obtains 
its background color from register 704, it is easiest to assign the palette as follows: 
{{{
704 $00 708 $08 712 $00 
705 $02 709 $0A 
706 $04 710 $0C 
707 $06 711 $0E
}}}
 
This method ends up "wasting" a color register, since registers 704 and 712 are both set to 
the background color.  However, it simplifies the DLI routine, as now only the GPRIOR 
register needs to be set instead of alternating the background registers on every line.  With 
the above colors, a palette of 30 shades becomes available in HIP by selecting 
the appropriate shade in each display list, as detailed below:

 
| 9     | #000  | #111  | #222  | #111  | #222  | #333  | #444  | #333  | #444  | 
| 10   | #000  | #000  | #000  | #222  | #222  | #222  | #222  | #444  | #444  | 
| HIP  | #0.0   | #0.5  | #1.0    | #1.5  | #2.0  | #2.5  | #3.0  | #3.5  | #4.0  | 
| 9     | #555  | #666  | #555  | #666  | #777  | #888  | #777  | #888  | #999  | 
| 10   | #444  | #444  | #666  | #666  | #666  | #666  | #888  | #888  | #888  | 
| HIP  | #4.5   | #5.0  | #5.5    | #6.0  | #6.5  | #7.0  | #7.5  | #8.0  | #8.5  | 
| 9     | #AAA | #999  | #AAA  | #BBB  | #CCC  | #BBB  | #CCC  | #DDD  | #EEE  | 
| 10   | #888  | #AAA  | #AAA | #AAA  | #AAA  | #CCC  | #CCC  | #CCC  | #CCC  | 
| HIP  | #9.0   | #9.5  | #A.0    | #A.5  | #B.0  | #B.5  | #C.0  | #C.5  | #D.0  | 
| 9     | #DDD | #EEE  | #FFF  |   |  |  |  |  |  | 
| 10   | #EEE  | #EEE  | #EEE  |   |  |  |  |  |  | 
| HIP  | #D.5  | #E.0  | #E.5  |   |  |  |  |  |  | 
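
The pairing in the table follows a regular pattern, which a converter can compute instead of storing. The sketch below is in Python (a host-side tool, not Atari code) and assumes the palette given above, where GTIA 10 offers only the even luminances 0 to 14; the function name {{hip_pair}} is invented for this example.

```python
def hip_pair(level):
    """Split a HIP shade (0..29, in half-shade steps) into two values.

    Returns (mode9, mode10): mode9 is a GTIA 9 shade (0..15) and mode10
    is an even GTIA 10 luminance (0..14). Their average is level / 2.
    """
    if not 0 <= level <= 29:
        raise ValueError("HIP level must be in 0..29")
    mode10 = 2 * ((level + 1) // 4)   # even luminance, chosen as in the table
    mode9 = level - mode10            # the remainder goes to the GTIA 9 pixel
    return mode9, mode10

for level in range(30):
    m9, m10 = hip_pair(level)
    print("HIP %4.1f = mode 9 shade %2d + mode 10 luminance %2d"
          % (level / 2, m9, m10))
```

Every pair produced this way keeps the two values within 2 shades of each other, which is what keeps the interlace flicker low.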
  

To display a HIP picture, a special display list set can be constructed that alternates 
between screens every frame.  In addition, two DLI routines need to be written: one to 
alternate between GTIA modes 9 and 10 and the other to alternate between modes 10 and 
9.  A simple display routine is included below. 

HIP pictures are difficult to create because the pixels cannot be independently set.  This is 
due to the fact that we are simulating a 160 pixel wide display by overlapping two offset 
80 pixel displays, as seen in the figure below.  So, in order to reduce flicker, the two 
overlapping pixels that make up a HIP pixel should not differ in luminance by more than 
2 shades.  This tends to favor pictures that are digitized or ray-traced, and makes 
drawing HIP pictures by hand extremely difficult.  As a result, the best way of generating 
HIP pictures is to convert them using a program on a cross-platform machine.  There are 
several converters available of varying degrees of sophistication.  Refer to the links 
section at the end of this chapter to find an appropriate converter. 

{{{
mode 10 .. 00 00 22 22 44 44 .. 
mode 9 .. 00 00 22 22 .. 
HIP .. 00 11 22 33 .. 
}}}

There is another drawback that results from interlacing pictures: a vertical line that is 
a single HIP pixel wide is impossible.  This is seen as wavy borders on the vertical edges 
of most HIP pictures.  Fortunately, this can be fixed with judicious use of Player/Missiles. 
Remember that register 704 is set to the background color.  Register 704 is also used as 
the color for player 0 and its missile.  By setting the shape of player 0 and missile 0 to a 
solid strip, they can be used as borders.  This gives the picture a solid frame. 

There is a simple extension to the HIP format that provides color instead of only allowing a 
monochrome palette.  This is known as the RIP (Rocky Interlace Picture) file format.  RIP 
allows modification of the color registers on each GTIA 10 line. This allows for 
non-monochrome pictures, but color selection for RIP pictures is not straightforward. 

! TIP
 
This section is based on a translation of the original article published in Polish magazine 
"Atarynka", no. 2/2002. 

TIP (Taquart interlace picture) is another method of adding color to HIP.  This method 
combines the original HIP format with ideas demonstrated by the original 256 color 
modes mentioned above. Colors are added by introducing a GTIA 11 line above each HIP 
GTIA 9/10 line as shown in the figure below. Unfortunately, adding this extra line halves 
the vertical resolution and introduces dark lines. On the other hand, pixels are now 
square, which helps conversion. 

 
|| DLIST ONE  || DLIST TWO  || 
| mode 11  | mode 11  | 
| mode 9  | mode 10  | 
| mode 11  | mode 11  | 
| mode 10  | mode 9  | 
| ... |  ...  | 


TIP pictures can be converted manually from a 24-bit picture in the following manner.  First, 
scale or crop the picture to an appropriate size (160x120).  Second, map the picture to the 
Atari color palette. Note that the 256-color Atari palette is available in Photoshop format 
in the Atari800 emulator package.  (This can also be read by GIMP.)  Finally, the color 
information and "HIP-data" need to be separated. The easiest way of obtaining the color 
data is to take the color of every second pixel. Conversion of shades to HIP can be 
problematic, but a reasonable method is to take the shade of every second pixel (starting 
from the first) for mode 9, and the shade of the other pixels for mode 10. Use all four 
bits for GTIA 9 shades and ignore the lowest bit for mode 10. 
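
The separation step can be sketched in a few lines of Python. This is only an illustration of the manual method just described, not the Taquart converter. It assumes that a row has already been mapped to 160 Atari color bytes (hue in the high nibble, luminance in the low nibble) and that the GTIA 10 registers hold the HIP palette from the previous section, so a luminance maps to a register index by dropping its lowest bit. The function names are invented for this example.

```python
def split_tip_row(row):
    """Split one row of 160 Atari color bytes into TIP components.

    Returns three lists of 80 values each:
      hues   -- hue nibbles for the GTIA 11 line
      mode9  -- 4-bit shades for the GTIA 9 pixels
      mode10 -- 3-bit register indexes for the GTIA 10 pixels
    """
    if len(row) != 160:
        raise ValueError("expected 160 pixels per row")
    hues = [(c >> 4) & 0x0F for c in row[0::2]]    # color of every second pixel
    mode9 = [c & 0x0F for c in row[0::2]]          # even pixels, all four bits
    mode10 = [(c & 0x0F) >> 1 for c in row[1::2]]  # odd pixels, lowest bit dropped
    return hues, mode9, mode10

def pack_nibbles(values):
    """Pack pairs of 4-bit pixels into bytes, left pixel in the high nibble."""
    return bytes((values[i] << 4) | values[i + 1]
                 for i in range(0, len(values), 2))

# Example: a row alternating between two shades of hue 3.
hues, mode9, mode10 = split_tip_row([0x3A, 0x3D] * 80)
print(pack_nibbles(hues)[:4].hex(), pack_nibbles(mode9)[:4].hex())
```

Each list packs down to 40 bytes, the width of one GTIA-mode scan-line in a normal-width display.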

Taquart has released the source to their incredible "Numen" demo (which demonstrates 
many of the techniques described here).  As part of the source release, they have a few 
conversion utilities including a HIP to TIP colorizer utility and a simple PCX to TIP 
converter. 


! GTIA 9++
 
This text is based on a translation of the original article by Piotr Fusik (Fox/Taquart) 
published in Polish magazine "Atarynka", no. 2/2002. The text-mode discussion is from 
Jaskier on the AtariAge programming forum. 

Before GTIA 9++ was developed, the graphics mode most often used by demo writers 
was a mode called "Konop's mode."  This mode was first seen in the Asskicker demo 
written by Konop/Shadow.  Konop's mode has square pixels, 16 shades, and a small 
enough memory footprint that the entire screen can be updated in a single vertical blank 
(depending, of course, on the complexity of the drawing routines).  When Konop's mode 
is used in GTIA mode 9, it is also referred to as GTIA 9+.  Konop's mode is generated by 
creating a specialized display list.  Each mode line is generated by the following: 
{{{
$0f line of mode 15 
$00 blank line 
$4f <screen> line of mode 15 + LMS 
$00 blank line
}}}
 
Fox/Taquart discovered a way of generating a new graphics mode that has many of the 
same strengths as Konop's mode, but without requiring the use of blank lines.  This mode 
has been dubbed GTIA 9++ (or GTIA 10++, depending on the base GTIA mode in use). 

In addition to the cleaner looking display, GTIA 9++ has a few other advantages over 
Konop's mode: 

# The display list is significantly shorter.
# The display list only uses one LMS instruction, so instead of using two display lists for double-buffering, the screen-address in the DL can be directly altered. 
# The height of each mode line can be easily modified (up to a height of 16 pixels).
# ANTIC takes fewer cycles to display a GTIA 9++ mode.  This comes as a trade-off for some extra work for the CPU.  Although not much is required of the CPU, the work must be carefully synchronized with the display.  This means that a simple implementation of GTIA 9++ will not show significant savings.  A less naïve method can be written with additional programming effort. 
# The method used can be used in other ANTIC modes to generate, for example, a hardware-supported 40x40 text-mode.  TMC 2.0 uses this technique to create a textmode screen of 40x39.  See the discussion at the end of this section for more details.
 
This new graphics mode was used to great effect in the Vector engine section of Taquart's "Numen" demo. 

The basic idea underlying GTIA 9++ is simple: force ANTIC to repeat the same line 
multiple times using its internal memory. ANTIC normally does this when generating 
native modes.  Consider ANTIC mode 8 (Graphics mode 3), which is a 40x24 display.  In 
this mode, ANTIC generates the first displayed scan-line by retrieving it from main 
memory but proceeds to generate the next 7 scan-lines from its own internal memory. 
ANTIC tracks the scan-line within each mode line with a four-bit counter called the 
Delta Counter (DCTR).  Normally the DCTR starts at 0 and the line is repeated until the 
DCTR reaches a count specific to each mode.  In ANTIC mode 8, this count is 7 (eight 
scan-lines, numbered 0 to 7); in ANTIC mode e, this count is 0. 

Unfortunately, the DCTR is not directly accessible by the CPU.  However, its behavior 
can be indirectly affected.  It turns out that the DCTR acts differently if vertical scrolling 
is enabled. On the very first vertically scrolled line, the DCTR will count from the value in 
VSCROL to the mode specific count.  On the very last vertically scrolled line, the DCTR 
counts from 0 to VSCROL. Everywhere else, the DCTR behaves normally.  At the beginning 
of each mode line, DCTR is loaded with the appropriate value (either VSCROL or 0).  Near 
the end of each scan-line, the DCTR is compared with the terminating value (either the mode 
specific count, or VSCROL).  If the DCTR and the terminating value are equal, the next 
display list instruction is fetched; otherwise the DCTR is incremented and the line is 
repeated. 

Consider the following display list, using ANTIC mode 8 and VSCROL set to 6: 
{{{
$28 line of mode 8, vertical scroll bit set 
$28 line of mode 8, vertical scroll bit set 
$08 line of mode 8, vertical scroll bit clear (last line of the scrolled region) 
$41 <dlist> jump to top of the display list 
}}}

The native behavior is for the first line of data to be visible for 2 scan-lines (DCTR values 
6, 7). A match is made and the next line of data is fetched from main memory.  This line 
is seen 8 times (DCTR values 0,1,2,3,4,5,6,7) before its count is reached.  The final line 
is seen for 7 scan-lines (DCTR values 0,1,2,3,4,5,6), because its terminating value is 
VSCROL rather than the mode specific count. 

So, in order to produce a mode 9++ screen, the values of VSCROL must be set 
appropriately to cause ANTIC to repeat itself. 
Consider the following very short display list: 

{{{
$2f line of mode 15, vertical scroll bit set 
$0f line of mode 15, vertical scroll bit clear 
$41 <dlist> jump to top of display list 
}}}

Suppose that VSCROL is set to 13.  In this case, the first line will be seen 4 times 
(DCTR values 13,14,15,0; the DCTR wraps from 15 to 0 because it is a 4-bit counter), 
while the second line will be seen 14 times (DCTR values from 0 to 13).  Similarly, if 
VSCROL is set to 3, the first line will be seen 14 times and the second will be seen 4 times. 
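
The counting rules above are easy to check with a small model. The following Python sketch is a simplified simulation of the DCTR behavior as described in this section, not a cycle-accurate ANTIC emulation; the function and parameter names are invented for the example, and VSCROL is assumed to stay constant for the whole list.

```python
def scanlines_per_mode_line(vs_bits, mode_count, vscrol):
    """Return how many scan-lines each mode line occupies.

    vs_bits    -- vertical scroll bit (0 or 1) of each display list entry
    mode_count -- terminating DCTR value for the mode (0 for mode 15,
                  where an unscrolled mode line is a single scan-line)
    vscrol     -- value held in VSCROL (only the low four bits matter)
    """
    vscrol &= 0x0F
    heights = []
    previous = 0                      # scroll bit of the previous mode line
    for bit in vs_bits:
        first = bit and not previous  # entering a scrolled region
        last = previous and not bit   # leaving a scrolled region
        start = vscrol if first else 0
        end = vscrol if last else mode_count
        heights.append(((end - start) & 0x0F) + 1)  # DCTR wraps from 15 to 0
        previous = bit
    return heights

print(scanlines_per_mode_line([1, 0], 0, 13))   # mode 15, VSCROL = 13
print(scanlines_per_mode_line([1, 0], 0, 3))    # mode 15, VSCROL = 3
```

The two calls reproduce the 4/14 and 14/4 splits worked out in the text. What the model cannot show is the trick that follows: changing VSCROL between the two comparisons so that both mode lines come out 4 scan-lines tall.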

So, in order to have each line repeated 4 times, the following display list should be 
created: 

{{{
$6f <screen> line of mode 15, LMS, vertical scroll bit set 
$0f line of mode 15, vertical scroll bit clear 
$2f line of mode 15, vertical scroll bit set 
$0f line of mode 15, vertical scroll bit clear 
... 
}}}

and the VSCROL register needs to be updated at the proper time.
{{{
DCTR 
 13 @------------------------- 
 14 -------------------------- 
 15 -------------------------- 
  0 -------------------------- 
  0 ========================== 
  1 ========================== 
  2 ========================== 
  3 ====!==================*## 
        \--visible screen--/ 
 
'-' - first mode line (bit 5 in DL is set) 
'=' - second mode line (bit 5 in DL is clear) 
'@' - VSCROL must contain 13 here 
'!' - Start of DLI 
'*' - VSCROL must contain 3 at this point 
'#' - here we put 13 into VSCROL 
}}}

This diagram shows two mode lines of GTIA 9++ (8 scan-lines).  Initially, VSCROL must be 
set to 13.  (Technically, because DCTR is a 4-bit register, the high nibble may be 
anything.) As soon as ANTIC loads DCTR with this value, the first mode line will be 
displayed correctly.  The second mode line will proceed to load properly.  It is only near 
the end of the last scan-line of the second mode line that VSCROL must contain 3. This 
allows plenty of time, so the value may be set on the DLI of the first line. The difficulty 
comes in preparing ANTIC to display the third mode line.  As demonstrated in the diagram, 
there is only a very narrow window, consisting of only a few cycles, in which to write 13 
into VSCROL. 

Manipulation of the VSCROL register can be done in three ways: 

# using a DLI 
# inside of unrolled effect code 
# using a POKEY timer IRQ 

The first implementation is clearly the easiest, but it is also the slowest. The second 
implementation is optimal, but correctly timing code loops realistically constrains this to 
simple effects.  The third option performs better than the first since POKEY timers can be 
set with single cycle accuracy, removing the time wasted in a WSYNC.  However, using 
a POKEY timer would cost the program a valuable sound channel.  As such, the first 
implementation will be the method described here. 

In order to reduce the amount of time wasted in the WSYNC, the number of DLIs should 
be as small as possible.  Using a DLI every second mode line (once every 8 scan-lines) 
is sufficient.  The DLI should be set for every line where the vertical scroll bit is clear. 
The easiest way to ensure proper synchronization is to use WSYNC, then write 13 to 
VSCROL, followed by 3.  The following code provides a simple kernel to do this: 
{{{
dli      pha 
         sta WSYNC  ;($d40a) 
         lda #13 
         sta VSCROL ;($d405) 
         lda #3
         sta VSCROL ;($d405) 
         pla 
         rti 
}}}
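
Because the display list is so regular, it is convenient to generate it on the PC rather than typing it in. The Python sketch below builds the bytes for a list of the shape shown earlier, with the DLI bit added to every line whose vertical scroll bit is clear. The three leading blank-line instructions, the example addresses, and the function name are assumptions made for this illustration; they are not taken from the original article.

```python
BLANK_8 = 0x70      # eight blank scan-lines
ANTIC_F = 0x0F      # the "$0f line of mode 15" used in the listings
VSCROLL_BIT = 0x20  # vertical scroll enable
LMS_BIT = 0x40      # load memory scan (followed by a 16-bit address)
DLI_BIT = 0x80      # display list interrupt
JVB = 0x41          # jump to address and wait for vertical blank

def build_dlist(screen_addr, dlist_addr, mode_lines):
    """Return the display list bytes for a GTIA 9++ style screen."""
    dl = [BLANK_8] * 3                      # assumed top border
    for line in range(mode_lines):
        if line % 2 == 0:
            op = ANTIC_F | VSCROLL_BIT      # $2f: scroll bit set
        else:
            op = ANTIC_F | DLI_BIT          # $8f: scroll bit clear, DLI
        if line == 0:
            # only the first line carries the LMS and the screen address
            dl += [op | LMS_BIT, screen_addr & 0xFF, screen_addr >> 8]
        else:
            dl.append(op)
    dl += [JVB, dlist_addr & 0xFF, dlist_addr >> 8]
    return bytes(dl)

dlist = build_dlist(0x4000, 0x3F00, 58)
print(len(dlist), "bytes:", dlist[:8].hex(), "...")
```

With a single LMS at the top, double-buffering only requires patching the two address bytes that follow the first instruction.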

So, how many cycles does GTIA 9++ save over Konop's mode?  Each mode line in 
Konop's mode costs 6 cycles for the display list plus an additional 64 or 80 cycles for the 
screen, depending on width.  GTIA 9++ takes only one cycle per line plus 32 or 40 cycles 
for screen data.  In addition, VSCROL must be updated which takes a further 6 cycles. 
For an entire screen 59 lines high the following costs are incurred: 

Narrow screen: 
Konop: 1+59*(6+64)+2+3=4136 
9++: 1+59*(1+32+6)+2+3=2307 

Normal screen: 
Konop: 1+59*(6+80)+2+3=5080 
9++: 1+59*(1+40+6)+2+3=2779 

This indicates that there is about a 6-7% gain in CPU time using GTIA 9++. 
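
The arithmetic can be reproduced with a short Python sketch. The fixed terms (1, 2 and 3 cycles) are taken as-is from the formulas above. The percentage is computed against a full NTSC frame of 262 scan-lines of 114 machine cycles each; that basis is an assumption, since the text does not state how its percentage was derived.

```python
NTSC_FRAME_CYCLES = 262 * 114   # machine cycles per NTSC frame (assumed basis)

def frame_cost(lines, dlist_cycles, data_cycles, vscrol_cycles=0):
    """Cycles per frame, following the formulas in the text."""
    per_line = dlist_cycles + data_cycles + vscrol_cycles
    return 1 + lines * per_line + 2 + 3

for name, konop_data, pp_data in (("narrow", 64, 32), ("normal", 80, 40)):
    konop = frame_cost(59, 6, konop_data)
    plusplus = frame_cost(59, 1, pp_data, vscrol_cycles=6)
    saved = konop - plusplus
    print("%s: Konop %d, 9++ %d, saved %d cycles (%.1f%% of a frame)"
          % (name, konop, plusplus, saved, 100.0 * saved / NTSC_FRAME_CYCLES))
```

Changing the first argument of {{frame_cost}} shows how the saving scales with the number of mode lines on screen.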

Unfortunately this is only true if GTIA 9++ is implemented using the optimal code 
unrolling method (implementation 2 in the earlier list).  If a DLI is used to update 
VSCROL then Konop's mode is faster.  One possible compromise is to use the time 
before the WSYNC for calculations.  In addition, traditional uses of DLIs (color changes 
or P/M multiplexing) can be had at nearly no additional cost. 

As a final caveat, while Konop's mode can be trivially extended to 240 scan lines by 
placing the JVB at the end of the display list, GTIA 9++ has problems displaying graphics 
in this last line.  Either remain satisfied with 59 lines or reduce the top or bottom line to a 
height of 3 scan lines. 

The native music tracker TMC 2.0 uses the same VSCROL technique described above to 
create a 40x39 character display.  To do this, first a custom character set must be 
designed using an 8x6 font.  Then, set up a display list of text lines in which the 
vertical scroll bit is alternately set and clear.  Note that as the text lines alternate down 
the screen, sometimes the first 2 lines of the font are cut and sometimes the bottom two 
lines are cut.  This requires the use of two character sets.  The DLI from TMC 2.0 looks 
like the following: 
{{{
dli      pha            ; preserve the accumulator 
         sta WSYNC      ;($d40a) 
         lda linewsk 
         sta VSCROL     ;($d405) 
         eor #7         ; toggle between 2/5 
         sta linewsk 
         lda fontwsk 
         sta CHBASE     ;($d409) 
         eor #4         ; toggle between two fonts 
         sta fontwsk 
         pla 
         rti 
}}}

This mode could be emulated in software by using a Graphics 8 screen.  However, the 
hardware supported mode takes less memory.  A Graphics 8 screen takes 7680 bytes for 
display, ~200 bytes for the display list, and 1024 bytes for the custom display font 
(although the font could be restricted to only characters used in the display).  This text 
mode takes 1560 bytes for the display, ~42 bytes for the display list, and 2048 bytes for 
the font.  (DMA cycles here too) 
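
For reference, the memory figures quoted above can be tallied as follows. This Python sketch simply repeats the arithmetic; the display list sizes are the approximate values from the text, and the 6 scan-lines per text row follow from the 8x6 font.

```python
COLUMNS = 40

def bitmap_screen_bytes(scanlines=192, dlist=200, font=1024):
    """Graphics 8 emulation: bitmap + display list + one font."""
    return COLUMNS * scanlines + dlist + font

def text_screen_bytes(rows=39, dlist=42, fonts=2 * 1024):
    """Hardware 40x39 text mode: character map + display list + two fonts."""
    return COLUMNS * rows + dlist + fonts

print("Graphics 8 emulation:", bitmap_screen_bytes(), "bytes")
print("40x39 text mode:     ", text_screen_bytes(), "bytes")
print("Text mode height:    ", 39 * 6, "scan-lines")
```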

! MCS
 
Something about MCS and/or graph2font conversions 

! Graphics Links and Resources 

* Original HIP FAQ 
* Original Mode9++ document 
* "Numen" link 
* "Asskicker" link 
* "unity" and other 
* MCS demo 
* HIP converters 



!! Scrolling techniques 

! MWP 

The usual method of scrolling on the Atari is to create the entire logical display in 
memory, and then use ANTIC to pan across this display.  This method is easy and very 
efficient, requiring very little CPU time.  The only problem occurs when it is infeasible to 
hold the entire logical display in memory at one time.  If this is the case, then it becomes 
necessary to either generate the display on demand or reduce the scope of the game. 

The traditional method of generating the display on-the-fly is the method that most other 
platforms are forced to use.  The most straightforward way of doing this involves 
creating a double-buffer.  Show one screen to the player, while creating the next frame in 
memory appropriately shifted.  When this second screen is complete, wait for the 
VBLANK, then flip it onto the display and start work on the first screen.  This method 
can be extended by using a technique called triple-buffering, which never waits for the 
VBLANK, but instead immediately starts work on the third screen.  In this manner, no 
CPU cycles are wasted waiting for the vertical blank, and the CPU is always working. 

Triple-buffering was used effectively in the IBM-PC game "Jazz Jackrabbit" by Epic 
MegaGames in the early 1990s.  It is also the method that should be used if a game is being 
presented in first-person perspective.  For instance, the Atari port of Space Harrier by 
Chris Hutt uses triple-buffering to help keep the frame-rate up. 

Analogue Multiplexer created a way to generate the screen on demand, but without 
requiring large memory copying.  This allows the program to forgo the double or triple 
buffer and dramatically reduces CPU usage.  This new scrolling technique has been 
named the minimum-wrapping-principle (MWP). 

MWP scrolling combines the strengths of traditional Atari scrolling and scrolling created 
by on-demand screen generation.  The primary advantages are: 

# The memory required for the displayed map is just over the requirement for one screen. Remember, however, that the logical source data still needs to be in memory. See the discussion of when to use MWP below for further information. 
# Large amounts of data copying are not necessary; only one row or column buffer needs to be written at one time. 
# Only 2 LMS commands are needed in the display list. 

Although MWP scrolling described here is for a character mode screen, the same 
principle can be applied to a bitmap screen.  In fact, it was during a discussion of 
scrolling bitmap screens that MWP was developed. 

The basic idea of MWP is to wrap the screen around onto itself, a bit like taping the ends 
of a piece of paper together.  To demonstrate this, consider a 3x3 display.  MWP requires 
that one row be duplicated, with one copy at the beginning of screen memory and one at the 
end of screen memory.  This means that in order for a 3x3 display to be appropriately created 
there must be 4 rows in all.  For this example, suppose that screen memory begins at 
address $6000.  It then follows: 
{{{
Row 0: $6000-$6002 
Row 1: $6003-$6005 
Row 2: $6006-$6008 
Row 3: $6009-$600B 
}}}

This display is shown in the following diagram.

[{Image src='mwp1.png' }]

Notice that rows 0 and 3 are exact duplicates of each other.  This is necessary in order to 
properly perform wrapping from row 3 back to row 0.  This can be seen in figure 1B. 
Initially, the screen will start in the state shown in figure 1A.  If vertical scrolling is 
performed, the display list moves from 1A > 2A > 3A > 1A and so on.  The lowest row is 
hidden off-screen and is used as a buffer.  If row 0 or 3 is written into, then a copy must 
be made to row 3 or 0 respectively. 

When scrolling horizontally, the entire sequence is cycled: 

1A > 1B > 1C > 2A > 2B > 2C > 3A > 3B > 3C > 1A and so on.  In this case, the rightmost 
column is off of the screen and should be written into as a buffer.  Again, if data in row 0 
or 3 is modified, that modification needs to be reflected in the other row. 

Note that although the starting state may differ, the flow will remain the same.  So, 
scrolling 

Wrapping from row 3 to row 0 is performed via an LMS instruction.  This means that 
MWP has two LMS instructions, one at the top of the screen, and a second LMS 
wherever row 0 has to be displayed.  Using the LMS to wrap the screen is what eliminates 
the need for a full screen copy. 

! Hardware-assisted Parallax Scrolling 

Scrolling while backtracking with characters either dynamically or via flipping through 4 
separate character sets (in ANTIC 4). 

Note C64's "Flimbo's Quest" 

! Scrolling by Half-steps
 
It is possible to scroll high resolution graphics (320 pixels wide) at half a color clock at a 
time.  This can be accomplished by using soft sprites in Antic F and redrawing the items 
each frame.  Alternatively, hardware assistance can be provided by using two Antic 2 
character sets shifted by a single pixel.  The scrolling routine would then look like: 
character set flip, fine scroll step, character set flip, fine scroll step, then eventually a 
coarse scroll.  

One thing to be careful about when using this technique is color artifacting.  Be sure to 
check on real hardware to ensure your objects are not color strobing. 

Similarly, it is possible to scroll carefully designed GTIA graphics a single color clock at 
a time.  Basically, it is possible to exploit the pixel shifting property observed in the 
HIP/TIP discussion above.  Graphics 9 and Graphics 10 pixels are 1 color-clock shifted. 
By designing an 8 shade item, and carefully mapping the Graphics 10 palette to match the 
shades of the Graphics 9 palette, you can flip screens (or character sets, if you are using 
GPRIOR modes in Antic 2 instead of Antic F) between fine scrolling.  The scrolling 
routine then looks like: flip image (or character set) and GPRIOR to mode 9, flip image 
(or character set) and GPRIOR to mode 10, fine scroll, flip image and return to GPRIOR 
mode 9, etc. 

Note:  Another, much more limited method is possible by using player-missiles as masks 
to fake the fine scroll.  By placing players set to the background color at the leading 
and trailing edge of a GTIA shape, the players can mask the item underneath.  The scrolling 
routine would then consist of moving the players, then shifting the shape.  The biggest 
problem with this method is that in order for the illusion to work, each scanline of the 
object can only have 1 color.  This is due to the fact that only the object silhouette is 
"moving" at the single color clock rate; the interior detail will remain static and the 
illusion will be lost. 

!! Using the mouse

This section is based on a news posting by Jaskier/Taquart (Marcin Lewandowski). 

Reading a mouse can be difficult because of the sample speed required.  Most drivers end 
up reading the mouse movement via IRQ (like timer 1) or via a specialized DLI routine. 
If the mouse is not sampled quickly enough, it will tend to drift or track backwards. 
Unfortunately, this can eat up a lot of processor time, leaving very little time for other 
purposes.  However, Marcin Lewandowski posted a new method of reading the mouse 
that is not as CPU intensive.  His article is included below. 

The main procedure below must be continuously called. Fortunately, the VBlank interrupt 
is fast enough to do this.  After a long break, the init procedure must be called to force 
synchronization. 

Main proc: 
{{{
          lda $d300 
          lsr a 
          lsr a 
          lsr a 
          lsr a 
          pha 
          and #10 ; (#3 in ST) 
          ldy #3  ; check left-right move 
l1       cmp htab,y 
          beq l2 
          dey 
          bne l1 
l2       tya 
          clc 
          adc #1 
          and #3 
          cmp xind 
          bne l3 
          sty xind 
          dec xcur 
l3       tya 
          sec 
          sbc #1 
          and #3 
          cmp xind 
          bne l4 
          sty xind 
          inc xcur 
l4       pla 
          and #5 ;(#12 in ST) 
}}}

The main routine above only handles the horizontal axis.  The vertical axis is identical, 
except for the following replacements: change xind to yind, xcur to ycur, and htab to 
vtab. 

The initialization routine is listed below: 
{{{
init 
          lda $d300
          lsr a 
          lsr a 
          lsr a 
          lsr a 
          pha 
          and #10 ; (#3 in ST) 
          ldy #3 
i1       cmp htab,y 
          beq i2 
          dey 
          bne i1 
i2       sty xind 
          pla 
          and #5 ; (#12 in ST) 
          ldy #3 
i3       cmp vtab,y 
          beq i4 
          dey 
          bne i3 
i4       sty yind
}}}
 
The tables referenced by the above code vary depending on the mouse being used. 
 
||  || Amiga: || Atari ST: || 
|htab:| 0,2,10,8 | 0,2,3,1| 
|vtab:| 0,1,5,4 |0,8,12,4 |
 

Oddly, no emulator currently supports the use of a mouse, so this will only work on real 
hardware.  

Note that the Atari can only read the left mouse button of an unmodified mouse. 

However, a simple hardware modification on the ST mouse allows its younger cousin to 
read the second mouse button by watching for POT1 to be pulled low.  To prepare the ST 
mouse for use, turn the mouse over and remove the screws.  Find where the cable is 
connected to the printed circuit board, then take a 4.7 kilohm or 5.6 kilohm resistor and 
stick one leg into the hole where the red wire is and the other one where the white one is. 
When you press the right mouse button, the value of POT1 should change from 228 to a 
value below 9. 

!! Pushing POKEY 

! Digital samples 

POKEY is capable of playing digital sound by setting the channel to volume only and 
then rapidly storing values into the volume register.  Volume on the Atari is only a 4-bit 
register, so digital samples are limited to 4-bit resolution.  Although it limits the sound 
quality possible, it does halve the memory requirement per sample.  The first digital 
sampling and playback routines I used came with an interesting device called the Parrot. 

This device plugged into the joystick port on the Atari and was read as a paddle.  By 
rapidly sampling POT0 and storing this into memory, the Atari was capable of recording 
and playing back digital samples.  Now, however, it is easier to record a sound on the PC 
(or rip from another source) and convert it into the appropriate format for the Atari. 

Sample conversion is easiest to do in steps.  The first is to convert the original sound into 
an unsigned 8-bit sound at an appropriate sample rate.  For the playback routine described 
below, this should be about 3.9 kHz. 

Initial sample conversion can be performed with a tool such as sox. 

{{{
sox infile.wav -t raw -r 3900 -u -b -c 1 outputfile.raw 
}}}

Depending on the sound source, it may be a good idea to add a low-pass filter when 
downsampling.  See the soxexam man page for more details.  

Once a sound is in unsigned raw 8-bit format at the appropriate sample rate, a second 
conversion pass needs to be made to convert the raw sound to 4-bit big-endian samples.  A 
program to do this, called rawtoatari, is included in the iron archives.  rawtoatari simply 
reads in each pair of bytes from the input file, strips off the lower nibble of each, and 
combines them into a single byte in big-endian format.  Here is a quick C snippet to do this: 

{{{
#include <stdio.h> 

/* usage: rawtoatari infile.raw outfile */
int main(int argc, char *argv[]) { 
  FILE *in, *out; 
  int hi, lo; 
  int c=0; 
  if (argc!=3) 
    return -1; 
  in=fopen(argv[1],"rb"); 
  if (!in) 
    return -1; 
  out=fopen(argv[2],"wb"); 
  if (!out) { 
    fclose(in); 
    return -1; 
  } 
  printf("Converting '%s'...\n",argv[1]); 
  /* Conversion routine: pack the top nibbles of two samples into one byte */ 
  while((hi=fgetc(in))!=EOF) { 
    lo=fgetc(in); 
    if (lo==EOF) 
      lo=0;                /* odd-length input: pad the last byte with 0 */ 
    fputc((hi&0xf0)|((lo&0xf0)>>4),out); 
    c++; 
  } 
  fclose(in); 
  fclose(out); 
  printf("Wrote %d bytes to file '%s'.\n",c,argv[2]); 
  return 0; 
} 

}}}

Once an Atari native sample has been created, a playback routine is required before it is 
useful.  The more CPU time that can be devoted to the task of playback, the clearer the 
sample will be, and the higher the sample rate that can be used.  Optimally, the playback 
routine should shut off DMA and all non-vital interrupts and devote all of its time to playback. 
However, realistically, a game or demo will want to display something on screen.  So, a 
compromise must be made.  One method used by Chris Hutt in his Space Harrier 
conversion is to use the VCOUNT register to synchronize playback, creating a solid 
playback frequency.  (Note: this routine has since been replaced in Space Harrier 
XE.  Now, the sample playback is performed via IRQs.)  Another possibility is to include 
the playback inside of a DLI routine. 

{{{
; play_sample, a routine to slave processor to play digital sample 
; taken from the Space Harrier conversion project 
; (c) Chris Hutt, 2000-2004 
play_sample 
         .local 
         lda #0 
         tay ; 1st byte of sample data 
         sta sample_nibble ; initialize counters 
         sta sample_index 
?0a 
         lda #80 
         cmp VCOUNT 
         bne ?0a 
         sta next_vcount 
?0                                     ; setup vcount value to wait for 
         lda next_vcount 
         clc 
         adc #2 
         cmp max_vcount          ; max value of vcount is 130 or 155 on PAL 
         bcc ?1 
         sbc max_vcount          ; wrap to 0 if 131/155 or 1 if 132/156 
?1 
         sta next_vcount 
         lda sample_nibble 
         eor #1                            ; toggle nibble count between 0 and 1 
         sta sample_nibble 
         beq ?2 

         ; handle high nibble of data 
         lda (sample),y 
         lsr a 
         lsr a 
         lsr a 
         lsr a ; shift high nibble into low nibble 
         ora #16 ; turn on volume only bit 
         tax 
         bne ?4 ; always branch 

         ; handle lo nibble of data 
?2 
         lda (sample),y 
         and #15 
         ora #16 
         tax 
         iny ; increment byte pointer for next byte 
         bne ?3 
         inc sample+1 
?3
         lda sample+1 
         cmp dest_A+1 
         bne ?4 
         cpy dest_A 
         beq ?6 
?4 
         lda next_vcount          ; wait for specified vcount. 3.9khz is every 2 
?5 
         cmp VCOUNT 
         bne ?5 
         stx AUDC3                   ; play sample 
         jmp ?0                         ; always loop 
?6 
         lda #0 
         sta AUDC3                   ; switch off volume only 
         rts 
}}}

! Digital Sample Links and Resources 
(Note: Update to Sheddy's new IRQ player) 




!! Trackers 

! RMT 

An RMT module file is a standard Atari binary file with a from-to header, because it 
contains many tables of pointers (with absolute memory addresses, not relative values). If 
you save the music with the function "Save as.../RMT file", the module will use a hard-coded 
default location, starting at address $4000. 

If you want to use RMT music in your program, you should use "Export as.../RMT stripped 
file". Then you can specify an arbitrary location for your RMT module data.  In addition, 
this will only save the features used by the module, without any redundant bytes, 
instrument names, unused songs, and so on. 

(Using RMT engine for sound effects in-game as well as for music) 

! TMC2 

The Theta Music Composer is a tracker that runs on native hardware.  
[http://jaskier.atari8.info/menu2/TMC2/TMC2.zip] 

!! Soft Sprites 

! Bitmap sprites 

Consider massively unrolled loops, hard-coded sprite data, and preinitialized zero page 
vectors (from NRV's softsprite thread on AtariAge)

! Character sprites 

(Turrican, BeyondEvil, etc) 

!! Advanced Player/Missile Use

! ORA overlapping for color 

Player/missile graphics and playfield graphics can be made to mix. Set the lower nibble 
of GPRIOR to 0, and playfields 0,1 and players 0,1 combine their colors by "OR'ing" the 
colors together.  Playfields 2,3 and players 2,3 are also "OR'd". Used in combination with 
the soft sprites described above, this can be used to create modern looking avatars. 

Note that setting bit 5 of GPRIOR causes the same behavior between players.  GTIA will 
perform a logical OR of the colors of players 0/1 and 2/3 when they overlap. If the overlap 
option is not set, the area of overlap for all players will be black. (Really?  Mapping the 
Atari claims this, but I do not remember this being the case) 

! GTIA Gr. 9 overlay 
Setting GPRIOR to $50 in GTIA mode 9 enables missile "OR" mode, in which overlapping 
missiles on GTIA mode 9 cause transparency effects. 

! Reusing players 
Changing position, color AND shape of a single PM on a single line. 

! Collision detection 

Hardware registers no longer mean anything, use masks, bboxes, etc. 

!! File Access

! xBIOS

xBIOS is almost like a programmer's version of DOS. With it you can easily access files from your programs without using Atari DOS. It is smaller than DOS and therefore saves memory in your programs. You can even run programs from as low as $0200. xBIOS can read from and write to existing files, but it cannot create new files or directories from your programs.

[Link to official xBIOS page (Polish)|http://xxl.atari.pl/]

[xBIOS]

!! Advanced Graphical Tools

! Rastaconverter

[Rastaconverter] is a Windows/Linux based utility which iteratively converts ordinary jpeg/gif/png files into executable files which display the pictures natively on the Atari 8-bit. Rastaconverter uses all the power of display list interrupts and player missile graphics to allow the maximum number of colours possible on the screen concurrently. [https://github.com/ilmenit/RastaConverter]

!! Hardware exotics 

! Accessing additional memory

Document lying around somewhere on this (see 8-bit news FAQ?). 

! Double POKEY 

No idea at all. 

! Double GTIA 

In development - no idea at all. 

! Video Board XE 

Electron (aka Tomasz Piórek) has been working on a project that adds a new video card 
to the Atari computer.  This works alongside the GTIA chip, but adds several new 
features, including a sprite blitter capable of displaying sprites in 256 colors ranging in 
size from 1x1 to 256x256.  For up-to-date information, video clips of current output 
and more, visit the homepage at [http://vbxe.atari8.info/].

!! Ironman Contributors 

* Analogue Multiplexer (Analmux): MWP 
* Fox/Taquart (Piotr Fusik): GTIA 9++ 
* Heaven/Taquart: HIP, TIP 
* Sheddy (Chris Hutt): Digital sample playback 
* Jaskier/Taquart (Marcin Lewandowski): Mouse driver 
* Mathy van Nisselroy: Hardware mouse modifications
* Snicklin (Steve Nicklin): xBIOS
* Mark Schmelzenbach: Editor