Wikifang:Telefang 2 Translation Patch/NatsumeGBA compression

From Wikifang, a definitive guide to Telefang, Dino Device and Bugsite

NatsumeGBA compression (formerly Malias2 compression) is the compression system used in numerous Natsume GBA games, including Telefang 2 and Medarot 2 CORE. Like its predecessor, NatsumeGB compression, it is a predictable variation on LZ77. Why they didn't just use the GBA BIOS LZ77 decompression functions is beyond me. (Seriously, why? Unless Nintendo doesn't document their own BIOS functions...)

NatsumeGBA format

(All reads are little-endian, the default endianness of ARM, unless otherwise specified)

Like NatsumeGB, the format consists of a header followed by bundles. The header is different: instead of a single "is compressed" byte, there is a 16-bit magic value, 0x4C 0x65 ("Le"), followed by a 32-bit size value. Only the lower 24 bits of this value are actually used. The size value is how many bytes of uncompressed data should be written. If the magic is invalid, something else happens (I'm not sure what yet).
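The header layout described above can be sketched in Python. This is a minimal illustration, not the game's own code: the function name `parse_header` is made up here, the size is assumed to follow the magic directly with no padding, and raising an error on a bad magic is this sketch's choice, since what the game actually does in that case is unknown.

```python
import struct

MAGIC = b"Le"  # the two magic bytes 0x4C 0x65


def parse_header(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (uncompressed_size, position_of_first_bundle).

    Hypothetical helper: assumes the 32-bit size sits immediately
    after the two magic bytes.
    """
    if data[pos:pos + 2] != MAGIC:
        # The real behaviour on an invalid magic is not known.
        raise ValueError("missing 'Le' magic")
    (raw_size,) = struct.unpack_from("<I", data, pos + 2)  # little-endian
    size = raw_size & 0xFFFFFF  # only the lower 24 bits are used
    return size, pos + 6


# The top byte (0xFF here) is ignored: the size is 0x000210 = 528.
print(parse_header(b"Le" + bytes([0x10, 0x02, 0x00, 0xFF])))
```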

NatsumeGBA commands are encoded as bundles of four commands. There are four types of commands, so each command needs two bits, and each bundle's command list fits snugly within a single byte. Command lists are read from the lower bits upwards, two bits at a time.
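As a small illustration of the bit order, the following Python sketch splits one command list byte into its four 2-bit modes, lowest bits first. The name `split_commands` is invented for this example.

```python
def split_commands(command_byte: int) -> list[int]:
    """Split a command list byte into four modes, lowest two bits first."""
    return [(command_byte >> shift) & 0b11 for shift in (0, 2, 4, 6)]


# 0xE4 is 0b11_10_01_00, so reading upwards from the low bits
# gives modes 0, 1, 2, 3 in that order.
print(split_commands(0xE4))  # [0, 1, 2, 3]
```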

Throughout this description we will refer to a "write head". This is the location in memory where the next byte will be written.

There are four commands:

Mode 0: "Far" LZ77 Copy

This is an LZ77 copy. The next 16 bits are read in, with the lower 12 bits being the copy offset and the upper 4 bits being the copy length. The source to copy from is stored as an offset from the write head minus five. The length is stored minus three. That many bytes will be copied, byte by byte, from the copy location to the write head.
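The field layout for this command can be sketched as follows. One reading is assumed here and should be treated as a guess: "minus five" is taken to mean that the stored offset is the backwards distance minus five, so that the copy source is the write head minus (offset field + 5). That reading fits neatly with Mode 1 covering distances 1 to 4, but it is not confirmed. The name `decode_far_copy` is made up for this example.

```python
def decode_far_copy(word: int) -> tuple[int, int]:
    """Unpack a Mode 0 operand into (distance_back, length).

    Assumption: the 12-bit field holds (distance - 5) and the
    4-bit field holds (length - 3).
    """
    distance = (word & 0x0FFF) + 5  # lower 12 bits: 5..4100 under this reading
    length = (word >> 12) + 3       # upper 4 bits: 3..18
    return distance, length


# The two operand bytes 0x03 0x20 form the little-endian word 0x2003.
word = int.from_bytes(bytes([0x03, 0x20]), "little")
print(decode_far_copy(word))  # (8, 5)
```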

Mode 1: "RLE" LZ77 Copy

This is another LZ77 copy. The next 8 bits are read in, with the lower 2 bits being the copy offset minus one and the upper 6 bits being the length minus two. It works the same way as above. Most notably, this command is optimized for run-length encoding, which is the reason the LZ77 copy is done byte by byte: when the offset is shorter than the length, the copy reads back bytes it has only just written, so a short pattern gets repeated.
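To show why the byte-by-byte copy matters, here is a short Python sketch of this command. The helper names `decode_rle_copy` and `lz_copy` are invented for the example, and the output buffer is modelled as a `bytearray` whose end is the write head.

```python
def decode_rle_copy(byte: int) -> tuple[int, int]:
    """Unpack a Mode 1 operand into (distance_back, length)."""
    distance = (byte & 0b11) + 1  # lower 2 bits: 1..4
    length = (byte >> 2) + 2      # upper 6 bits: 2..65
    return distance, length


def lz_copy(out: bytearray, distance: int, length: int) -> None:
    """Copy one byte at a time so the source may overlap the write head."""
    for _ in range(length):
        out.append(out[-distance])


# Operand 0x18: offset field 0 (distance 1), length field 6 (length 8).
out = bytearray(b"A")
lz_copy(out, *decode_rle_copy(0x18))
print(out)  # bytearray(b'AAAAAAAAA')
```

With a distance of 1 the command repeats a single byte; with a distance of 2 to 4 it repeats a short pattern such as `ABAB...`.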

Mode 2: One uncompressed byte

The next byte is copied from the compressed source data to the write head.

Mode 3: Three uncompressed bytes

The next three bytes are copied from the compressed source data to the write head.
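Putting the header and the four modes together, a decompressor might look roughly like the Python sketch below. Several details are assumptions rather than known facts: each bundle is taken to be one command list byte followed by the operands of its four commands in order, decoding is assumed to stop as soon as the header's size has been written (even partway through a bundle), the Mode 0 offset field is read as the backwards distance minus five, and an invalid magic simply raises an error. The function name `decompress` is hypothetical.

```python
import struct


def decompress(data: bytes) -> bytes:
    """Sketch of a NatsumeGBA decompressor under the assumptions above."""
    if data[:2] != b"Le":
        raise ValueError("missing 'Le' magic")  # real behaviour unknown
    size = struct.unpack_from("<I", data, 2)[0] & 0xFFFFFF
    pos = 6  # read position in the compressed data
    out = bytearray()  # the end of this buffer is the write head

    while len(out) < size:
        commands = data[pos]  # one command list byte per bundle
        pos += 1
        for shift in (0, 2, 4, 6):  # lowest two bits first
            if len(out) >= size:
                break  # assumed: stop once enough bytes are written
            mode = (commands >> shift) & 0b11
            if mode == 0:  # "far" copy, 16-bit little-endian operand
                word = data[pos] | (data[pos + 1] << 8)
                pos += 2
                distance, length = (word & 0x0FFF) + 5, (word >> 12) + 3
            elif mode == 1:  # "RLE" copy, 8-bit operand
                byte = data[pos]
                pos += 1
                distance, length = (byte & 0b11) + 1, (byte >> 2) + 2
            else:  # mode 2: one literal byte, mode 3: three literal bytes
                count = 1 if mode == 2 else 3
                out += data[pos:pos + count]
                pos += count
                continue
            for _ in range(length):  # byte by byte, so overlap is allowed
                out.append(out[-distance])
    return bytes(out[:size])


# Hand-built stream: command byte 0x27 holds modes 3, 1, 2, 0 in order.
example = (b"Le" + bytes([13, 0, 0, 0]) + bytes([0x27])
           + b"ABC" + bytes([0x12]) + b"D" + bytes([0x00, 0x00]))
print(decompress(example))  # b'ABCABCABCDCAB'
```

The example stream writes the literal `ABC`, repeats it twice with a Mode 1 copy (distance 3, length 6), writes the literal `D`, and finishes with a Mode 0 copy of three bytes from five bytes back.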

See Also