Wikifang:Telefang 2 Translation Patch/NatsumeGBA compression: Difference between revisions

From Wikifang, a definitive guide to Telefang, Dino Device and Bugsite
(rename to NatsumeGBA compression)
 
Line 1: Line 1:
"Malias2" compression is the compression system used in Telefang 2. Like its namesake [[Wikifang:Telefang_1_Translation_Patch/Malias_compression|Malias compression]], it is a predictable variation upon LZ77. Why they didn't just use the GBA BIOS LZ77 compression functions is beyond me. (Seriously, why? Unless Nintendo doesn't document their own BIOS functions...)
'''NatsumeGBA compression''' (formerly Malias2 compression) is the compression system used in numerous Natsume GBA games, including Telefang 2 and Medarot 2 CORE. Similarly to its predecessor, [[Wikifang:Telefang_1_Translation_Patch/NatsumeGB_compression|NatsumeGB compression]], it is a predictable variation upon LZ77. Why they didn't just use the GBA BIOS LZ77 compression functions is beyond me. (Seriously, why? Unless Nintendo doesn't document their own BIOS functions...)


== Malias2 format ==
== NatsumeGBA format ==
(All reads are little-endian, the default endianness of ARM, unless otherwise specified)
(All reads are little-endian, the default endianness of ARM, unless otherwise specified)


Like Malias, it has a header and bundles. The header is different - instead of a single "is compressed" byte, there is instead a 16-bit magic value 0x4C 0x65 ("Le") followed by a 32-bit size value. Only the lower 24 bits of this value are actually used. If the magic is invalid, something else happens (I'm not sure yet). The size value is how many bytes of uncompressed data should be written.
Like NatsumeGB, it has a header and bundles. The header is different - instead of a single "is compressed" byte, there is instead a 16-bit magic value 0x4C 0x65 ("Le") followed by a 32-bit size value. Only the lower 24 bits of this value are actually used. If the magic is invalid, something else happens (I'm not sure yet). The size value is how many bytes of uncompressed data should be written.


Malias2 commands are encoded as bundles of four commands. There are four types of commands. Thus, each bundle's command list fits snugly within a single byte. Command lists are read from the lower bits upwards, two bits at a time.
NatsumeGBA commands are encoded as bundles of four commands. There are four types of commands. Thus, each bundle's command list fits snugly within a single byte. Command lists are read from the lower bits upwards, two bits at a time.


During this discourse we will talk about a "write head". This is the location of the next byte to write to in memory.
During this discourse we will talk about a "write head". This is the location of the next byte to write to in memory.
Line 30: Line 30:
== See Also ==
== See Also ==
* [[User:Kmeisthax/Findings/2012/2/4/Malias2_Listing |A disassembly of the Telefang 2 decompression routine]]
* [[User:Kmeisthax/Findings/2012/2/4/Malias2_Listing |A disassembly of the Telefang 2 decompression routine]]
* Coming Soon - Codemodule support for Malias and Malias2
* Coming Soon - Codemodule support for NatsumeGB and NatsumeGBA
* [https://github.com/Sanky/romhacking/blob/master/telefang/puneedle.py puneedle.py] is a Python Malias2 decompressor implementation.
* [https://github.com/Sanky/romhacking/blob/master/telefang/puneedle.py puneedle.py] is a Python NatsumeGBA decompressor implementation.

Latest revision as of 09:16, 16 November 2020

NatsumeGBA compression (formerly Malias2 compression) is the compression system used in numerous Natsume GBA games, including Telefang 2 and Medarot 2 CORE. Like its predecessor, NatsumeGB compression, it is a predictable variation upon LZ77. Why they didn't just use the GBA BIOS LZ77 compression functions is beyond me. (Seriously, why? Unless Nintendo doesn't document their own BIOS functions...)

NatsumeGBA format

(All reads are little-endian, the default endianness of ARM, unless otherwise specified)

Like NatsumeGB, the format consists of a header followed by bundles. The header is different: instead of a single "is compressed" byte, it starts with a 16-bit magic value, the bytes 0x4C 0x65 ("Le"), followed by a 32-bit size value. Only the lower 24 bits of this value are actually used. The size value is the number of bytes of uncompressed data that should be written. If the magic is invalid, something else happens (I'm not sure what yet).
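As a rough illustration of the header layout described above, here is a minimal Python sketch. The function name `parse_header` is hypothetical, and two things are assumptions rather than documented behaviour: that the bundles start immediately after the 6-byte header, and that an invalid magic should raise an error (what the game actually does in that case is unknown).

```python
import struct

MAGIC = b"Le"  # the bytes 0x4C 0x65


def parse_header(data):
    """Return (uncompressed_size, payload_offset) for a NatsumeGBA stream.

    Hypothetical helper, not taken from the game or from puneedle.py.
    """
    if data[:2] != MAGIC:
        # Assumption: the real routine does "something else" here, which is
        # not documented, so this sketch simply refuses the input.
        raise ValueError("missing 'Le' magic")
    # 32-bit little-endian size value directly after the magic.
    (raw_size,) = struct.unpack_from("<I", data, 2)
    # Only the lower 24 bits are used as the uncompressed size.
    size = raw_size & 0xFFFFFF
    # Assumption: compressed bundles begin right after the 6 header bytes.
    return size, 6
```

For example, `parse_header(b"Le\x10\x00\x00\xab")` ignores the top byte 0xAB and reports a size of 0x10.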

NatsumeGBA commands are encoded as bundles of four commands. There are four types of commands, so each command needs two bits, and each bundle's command list of four commands fits snugly within a single byte. Command lists are read from the lower bits upwards, two bits at a time.
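To make the bit order concrete, here is a small Python sketch that splits a command byte into its four 2-bit commands, lowest bits first. The name `split_commands` is made up for this example.

```python
def split_commands(command_byte):
    """Split one bundle's command byte into four 2-bit modes.

    The first command sits in bits 0-1, the last in bits 6-7.
    """
    return [(command_byte >> shift) & 0b11 for shift in (0, 2, 4, 6)]
```

So the byte 0b11100100 (0xE4) decodes to the command sequence Mode 0, Mode 1, Mode 2, Mode 3.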

Throughout this description we will talk about a "write head". This is the location in memory of the next byte to be written.

There are four commands:

Mode 0: "Far" LZ77 Copy

This is an LZ77 copy. The next 16 bits are read in; the lower 12 bits are the copy offset and the upper 4 bits are the copy length. The offset is stored minus five, so the source to copy from is (stored offset + 5) bytes behind the write head. The length is stored minus three, so (stored length + 3) bytes are copied, byte by byte, from the copy location to the write head.
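A minimal Python sketch of this command, under the reading that the source sits (stored offset + 5) bytes behind the write head and the copy length is (stored length + 3). The function name `far_copy` and the use of a `bytearray` whose end acts as the write head are choices made for this example, not something taken from the game's code.

```python
def far_copy(out, word):
    """Apply a Mode 0 command to `out`, a bytearray of already-written data.

    `word` is the 16-bit little-endian value read from the compressed stream.
    """
    distance = (word & 0x0FFF) + 5   # lower 12 bits, stored minus five
    length = (word >> 12) + 3        # upper 4 bits, stored minus three
    start = len(out) - distance
    if start < 0:
        raise ValueError("copy source lies before the start of the output")
    # Byte-by-byte copy so that an overlapping source and destination work.
    for i in range(length):
        out.append(out[start + i])
```

Under this reading a far copy reaches back 5 to 4100 bytes and copies 3 to 18 bytes. For instance, with "ABCDEFGH" already written, the word 0x0000 copies "DEF".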

Mode 1: "RLE" LZ77 Copy

This is another LZ77 copy. The next 8 bits are read in; the lower 2 bits are the copy offset minus one and the upper 6 bits are the length minus two. Otherwise it works the same way as Mode 0. Most notably, this command is optimized for run-length encoding: when the length is greater than the offset, the copy reads bytes it has only just written, so a short pattern gets repeated. This is the reason why LZ77 copies are done byte by byte.
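The run-length behaviour is easiest to see in code. The sketch below is a hypothetical Python helper (`rle_copy` is not a name from the game or from puneedle.py) that treats the end of a `bytearray` as the write head.

```python
def rle_copy(out, value):
    """Apply a Mode 1 command to `out`, a bytearray of already-written data.

    `value` is the single byte read from the compressed stream.
    """
    distance = (value & 0b11) + 1    # lower 2 bits, stored minus one
    length = (value >> 2) + 2        # upper 6 bits, stored minus two
    start = len(out) - distance
    if start < 0:
        raise ValueError("copy source lies before the start of the output")
    # Each appended byte becomes available as a source for later iterations,
    # which is what turns a short overlapping copy into a repeated run.
    for i in range(length):
        out.append(out[start + i])
```

With a single "A" written, the byte 0x10 (offset field 0, length field 4) reaches back one byte and copies six, leaving seven "A"s in total. With "AB" written, the byte 0x09 repeats the two-byte pattern to give "ABABAB".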

Mode 2: One uncompressed byte

The next byte of the compressed data is copied as-is to the write head.

Mode 3: Three uncompressed bytes

The next three bytes of the compressed data are copied as-is to the write head.
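Putting the header, the bundles and the four commands together, a decompressor could look roughly like the Python sketch below. It is not the game's routine and not puneedle.py. It assumes that the bundles start right after the 6-byte header, that an invalid magic is an error, and that decoding stops as soon as the declared size has been written, even in the middle of a bundle. None of these are confirmed by the description above.

```python
def decompress(data):
    """Decompress a NatsumeGBA ("Le") stream. Illustrative sketch only."""
    if data[:2] != b"Le":
        raise ValueError("missing 'Le' magic")  # assumption: treat as error
    size = int.from_bytes(data[2:6], "little") & 0xFFFFFF
    out = bytearray()
    pos = 6  # assumption: bundles follow the header directly
    while len(out) < size:
        commands = data[pos]
        pos += 1
        for shift in (0, 2, 4, 6):       # lowest two bits first
            if len(out) >= size:         # assumption: stop mid-bundle
                break
            mode = (commands >> shift) & 0b11
            if mode == 0:                # "far" LZ77 copy
                word = data[pos] | (data[pos + 1] << 8)
                pos += 2
                distance = (word & 0x0FFF) + 5
                length = (word >> 12) + 3
            elif mode == 1:              # "RLE" LZ77 copy
                value = data[pos]
                pos += 1
                distance = (value & 0b11) + 1
                length = (value >> 2) + 2
            else:                        # one or three uncompressed bytes
                count = 1 if mode == 2 else 3
                out += data[pos:pos + count]
                pos += count
                continue
            start = len(out) - distance
            if start < 0:
                raise ValueError("copy source lies before the output start")
            for i in range(length):      # byte by byte, overlap allowed
                out.append(out[start + i])
    return bytes(out[:size])
```

A quick hand-built example: the stream `b"Le\x0a\x00\x00\x00\x67ABC\x06D\x04"` holds one bundle with the commands Mode 3, Mode 1, Mode 2, Mode 1 and expands to the ten bytes "ABCABCDDDD".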

See Also