Patch: allowing use of "reserved" palette colors for player characters


updated patch 8/17

Possibly complete / possibly final version

http://bitshare.com/files/dv2hs1en/expan....0.7z.html

The code in this original post is now deprecated. An IPS patch now exists for the latest version, so you don't have to paste anything manually.

See my replies further down in the thread for more details

===============

The screenshot I've included shows Locke with pink clothes, and Terra with orange hair and brown boots. These characters are using their original, unaltered palettes. However, their sprites use colors from those palettes that will not normally show up correctly in battle when used by PCs. I've written a patch that allows those colors to be used.

For those who don't know, there are 16 colors in a normal SNES palette, but only 12 of them are usable for player characters. That is because, when character sprites are loaded for battle, they share a palette with other sprites (the hand cursor, for instance). The last four palette entries are reserved for those other sprites. If a PC uses those colors, the sprite's colors will be messed up in battle, though they will appear correctly on the world map. The last four colors can only be used by NPCs who share palettes with PCs, but who never appear in battle.

This 12-color limitation, along with the fact that you only have 6 palettes to choose from, limits you in your choice of a color scheme for a PC.

So, I've written some new code that allows you to use the last four, reserved palette colors for PCs. This should be helpful to people who want to add or alter PC sprites, and want a larger choice of colors to work with.

Important: you are still limited to 12 colors for each PC. However, the colors do not have to be the first 12 colors in the palette. They can be your choice of any 12. In addition, PCs who share a palette do not necessarily have to use the same 12 colors from that palette.

Basically, this code dynamically alters the sprite and palette whenever they are loaded for battle. The effect is to shift colors from the last 4 "slots" into the first 12 "slots", making them appear without glitches.
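To give an idea of why this takes so much bit manipulation, here is a rough Python sketch (not part of the patch; the function names are mine). It assumes the standard SNES 4bpp tile layout, which matches the byte offsets 0, 1, 16 and 17 used in my assembler code further down: each 8x8 tile is 32 bytes, and the four bits of one pixel's color number are spread across four different bytes.

```python
def decode_tile_4bpp(tile):
    """Decode one 32-byte SNES 4bpp tile into 8 rows of 8 color numbers (0-15)."""
    if len(tile) != 32:
        raise ValueError("a 4bpp 8x8 tile is exactly 32 bytes")
    rows = []
    for y in range(8):
        # Bitplanes 0 and 1 are interleaved in the first 16 bytes,
        # bitplanes 2 and 3 in the second 16 bytes.
        planes = (tile[2 * y], tile[2 * y + 1], tile[16 + 2 * y], tile[17 + 2 * y])
        row = []
        for x in range(8):
            bit = 7 - x  # bit 7 of each byte is the leftmost pixel
            index = 0
            for p, plane in enumerate(planes):
                index |= ((plane >> bit) & 1) << p
            row.append(index)
        rows.append(row)
    return rows


def encode_tile_4bpp(rows):
    """Inverse of decode_tile_4bpp: pack 8x8 color numbers back into 32 bytes."""
    tile = bytearray(32)
    for y, row in enumerate(rows):
        offsets = (2 * y, 2 * y + 1, 16 + 2 * y, 17 + 2 * y)
        for x, index in enumerate(row):
            bit = 7 - x
            for p, offset in enumerate(offsets):
                if (index >> p) & 1:
                    tile[offset] |= 1 << bit
    return bytes(tile)
```

Changing a single pixel's color number therefore means touching up to four bytes, which is where the extra work at load time comes from.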

If any of the last four "reserved" colors are used in the sprite data, they are switched with one of the first 12, non-reserved colors, and the palette is also switched. This, of course, is only possible to do if there is an unused color in the first 12 "slots". That is why you can still only use 12 colors at maximum for each PC. But, again, they can be your choice of any 12 colors in the palette, and they do not need to be the same colors for each PC that shares the palette.
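As an illustration of the switch itself (a simplified Python sketch, not the actual patch code; it works on a flat list of already-decoded color numbers, and the names are made up for this example), exchanging a reserved slot with an unused slot in both the pixel data and the palette leaves the picture looking the same:

```python
def swap_color_slots(pixels, palette, reserved_slot, free_slot):
    """Exchange two palette slots and relabel the pixels to match.

    pixels  : list of color numbers (0-15), one per pixel
    palette : list of 16 color values
    Returns new lists; the rendered colors are unchanged.
    """
    new_palette = list(palette)
    new_palette[reserved_slot] = palette[free_slot]
    new_palette[free_slot] = palette[reserved_slot]
    remap = {reserved_slot: free_slot, free_slot: reserved_slot}
    new_pixels = [remap.get(p, p) for p in pixels]
    return new_pixels, new_palette


# Hypothetical example: the sprite uses reserved slot 13 and never uses slot 4.
palette = [f"color{i}" for i in range(16)]
pixels = [0, 1, 13, 13, 6]
new_pixels, new_palette = swap_color_slots(pixels, palette, 13, 4)
```

After the swap, every pixel that pointed at slot 13 points at slot 4, and slot 4 now holds the color that used to be in slot 13.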

All you have to do is alter the character sprite with any sprite editor.

The 12 colors must include colors 1, 2, 7 & 8, which are transparent, black (the outline color), and two skin colors, respectively. Those colors will always be considered "used" even if you don't use them for any pixel. The outline and skin colors are special because they are dynamically changed by status effects. That means we can't switch them around, or it will cause glitches.
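Here is a small Python sketch of how the replacement pairs get chosen (my own simplified restatement, not the patch code; slots are numbered from 0 here, so colors 1, 2, 7 & 8 become slots 0, 1, 6 & 7):

```python
PROTECTED = {0, 1, 6, 7}   # colors 1, 2, 7 & 8: always treated as used
NORMAL = range(0, 12)      # the 12 slots that display correctly in battle
RESERVED = range(12, 16)   # the last four, normally off-limits for PCs


def plan_swaps(used_slots):
    """Pair each used reserved slot with an unused normal slot.

    used_slots : set of color numbers that appear in the sprite's pixels
    Returns a list of (reserved_slot, free_slot) pairs, lowest slots first.
    If there are more used reserved slots than free normal slots, the
    extras are left unpaired (that is the 12-color limit).
    """
    used = set(used_slots) | PROTECTED
    free = [s for s in NORMAL if s not in used]
    wanted = [s for s in RESERVED if s in used]
    return list(zip(wanted, free))
```

For example, a sprite that uses slots 0-3, 6, 7, 12 and 15 has slots 4 and 5 free, so slot 12 is paired with 4 and slot 15 with 5. A sprite that already uses all of the first 12 slots gets no pairs at all.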

I consider this an alpha/beta version of the hack. The reason: because of all the dynamic bit manipulation that goes on whenever a character sprite is loaded in battle, there is a very noticeable "load time", almost like a PlayStation game. I consider this a very significant flaw. It happens at the start of every battle, and also whenever a character's sprite changes (like when Terra uses Morph, or someone is afflicted with Imp status).

I'm not sure if people will be satisfied with this slowdown in exchange for additional sprite colors. Feel free to give feedback: how annoying is it? Is it worth it?

I'd like to improve efficiency, and I think I can. I didn't try very hard for efficiency on this first pass -- I was just trying to make it work.

I think I can cut the CPU cycles down by half at the least, eliminating unnecessary passes through the data. Possibly significantly better than that, depending on the particular sprite.

Can I speed it up enough to not be annoying? I don't know. If anyone good with ASM looks at my code and has a suggestion to VASTLY improve efficiency, I'd be happy to hear it.

The best/only solution might be to pre-compute the color-shifted battle sprites and store them in the ROM. There would need to be two versions of the sprite data for each PC: one for the world map, and one for battle (with switched-around colors). Since there would no longer be any dynamic alteration of sprite graphics in memory, there would be no additional CPU overhead causing "load time". But, that would be a totally different patch.

I originally preferred not to do it that way, because it would no longer integrate automatically or transparently with an existing sprite editor. Currently, all you have to do is alter the sprite with an editor like ff3usME, and it will automatically appear correctly both in and out of battle.

If we stored the altered battle sprite data in the ROM, rather than altering it dynamically, you would have to recalculate the altered sprite every time you change the original sprite, and re-patch the ROM with that altered data. I could write an EXE to do that automatically, so it wouldn't be that hard. But it would be an extra complication for people who edit sprite resources. (Plus, it would take up space in the ROM.)

OK, so finally, here is the code. I don't have an IPS patch. You have to copy/paste the hex code manually. I hope anyone interested in testing this is comfortable doing that, especially since this is a "hacker's hack" (a hack that facilitates further hacking). I'll make an IPS patch once I think I have a finished version, and not just a test version.

You must have an expanded ROM. The new code goes in an expanded location. Insert the following hex values in the ROM location beginning at 3100FD (which corresponds to the CPU address F1/00FD).

This ROM address assumes your ROM has no header. Add $200 if it does. And again, the ROM must be expanded, or this location won't even exist. Sorry for the weird offset -- this just happens to be where I planted this code in my own ROM.
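If you want to double-check the offsets, here is a small Python sketch of the conversion. It assumes the simple mapping implied by the two address pairs in this post (headerless file offset + $C00000 = CPU address, and a $200-byte header when one is present); the function names are just for illustration.

```python
HEADER_SIZE = 0x200   # size of the optional copier header
ROM_BASE = 0xC00000   # CPU address of file offset 0 (assumed mapping)


def file_offset_to_cpu(offset, has_header=False):
    """Convert a ROM file offset to a CPU address."""
    if has_header:
        offset -= HEADER_SIZE
    return ROM_BASE + offset


def cpu_to_file_offset(address, has_header=False):
    """Convert a CPU address back to a ROM file offset."""
    offset = address - ROM_BASE
    return offset + HEADER_SIZE if has_header else offset


def format_cpu(address):
    """Format a CPU address in the bank/offset style used in this post."""
    return f"{address >> 16:02X}/{address & 0xFFFF:04X}"


print(format_cpu(file_offset_to_cpu(0x3100FD)))        # F1/00FD
print(hex(cpu_to_file_offset(0xF100FD, has_header=True)))  # 0x3102fd
```

So with a headered ROM, the paste location becomes 3102FD instead of 3100FD.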

Also, my ROM is version 1.0, but I don't know if that will make a difference.

EDIT: As mentioned above, this is now obsolete. Read my follow-up a few posts down in the thread, and use that code instead. There's also an IPS patch.

Code:
Paste to 3100FD:

C9FFD0
016B8514861CAAA5104848DAA61C861A
A5140A186514AAA9C285168BA97F48AB
BF45CEC28512C220BF43CEC28510A945
C78514A61AA90001851AA910008518A7
14C9FFFFD00C7B9D0000E8E8C618D0F7
800EA8B7109D0000E8E8C8C8C618D0F3
E614E614C61AD0D27BE220A61CA94085
12BDC0030A66100A66100A66100A6610
0A66100A66100A66100A6610A5109DC0
03BDC0100A66100A66100A66100A6610
0A66100A66100A66100A6610A5109DC0
10E8C612D0BBABFA680A0A0A0A0ADAAA
BDAE2EC90ED012BDC62EC901D00BADA0
1E2908F004FA7B803B08C23048DA5AA5
1C186900208510A97F008512A9000085
1AA510290F00D008A51038E910008510
C610C610A20000A90000851438A00000
B7103F0007F1F0022614A00100B7103F
0007F1F00426140614A01000B7103F00
07F1F0082614061406140614A01100B7
103F0007F1F010261406140614061406
140614061406142614A51A0514851AE8
E88AC91000D0A0A510C51CD084A51A09
C300851AA90000E22085138514851585
16C220A20000A51A3F1007F1F00BE8E8
8AC91800D0F04C1E039BA21800A51A3F
1007F1D00BE8E88AC92000D0F04C1E03
E230A513D0118A4A0A0A0A0A8513984A
051385134C0603A514D0118A4A0A0A0A
0A8514984A051485144C0603A515D011
8A4A0A0A0A0A8515984A051585154C06
03A516D0298A4A0A0A0A0A8516984A05
1685164C0603C230DABBBF1007F1451A
851AFABF1007F1451A851A4C8302C230
A51C186900208510C230A510290F00D0
08A51038E910008510C610C610A20000
E230A9008518A000B7103F0007F1F006
A51809018518A001B7103F0007F1F006
A51809028518A010B7103F0007F1F006
A51809048518A011B7103F0007F1F006
A51809088518E220A513F068851A4A4A
4A4AC518F02DA514F05A851A4A4A4A4A
C518F01FA515F04C851A4A4A4A4AC518
F011A516F03E851A4A4A4A4AC518F003
4CF403A000A901851BA51A251BF00C80
00BF0007F117109710800ABF0007F149
FF37109710C898C902D002A010061B98
C912D0D56418E8E88AC910F0034C4203
C230A510C51CF0034C28037AFA6828FA
BF2BCEC2C2200A0A0A0A0AAA7BE22068
0A0A0A0A0AA85AA9188510BF0063ED99
AD81E8C8C610D0F308C23048DA5AA510
48A51248A51448A51648A51848A51A48
A51C48A51E488A38E91800851A9838E9
1800851CA51329FF004A4A4A4A0A1865
1AAAA513290F000A18651CA8E220BF00
63ED99AD81E8C8BF0063ED99AD81C220
A51429FF004A4A4A4A0A18651AAAA514
290F000A18651CA8E220BF0063ED99AD
81E8C8BF0063ED99AD81C220A51529FF
004A4A4A4A0A18651AAAA515290F000A
18651CA8E220BF0063ED99AD81E8C8BF
0063ED99AD81C220A51629FF004A4A4A
4A0A18651AAAA516290F000A18651CA8
E220BF0063ED99AD81E8C8BF0063ED99
AD81C22068851E68851C68851A688518
6885166885146885126885107AFA6828
FAFEC4616B0000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
01000200040008001000200040008000
01000200040008001000200040008000
00010002000400080010002000400080

This function replaces the function at ROM location 013D43 (no header) or CPU location C1/3D43. To make the old function call the new function, you must also overwrite the bytes starting at 013D43 with:

Code:
Paste to 013D43:

22 FD 00 F1 60

This just makes the old function call the new function, and then return. If you want, you can overwrite the rest of the original function, since it's no longer used. It stretches through 013E4C (or C1/3E4C).
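For anyone who relocates the new code to a different address, the five hook bytes are just a JSL (opcode $22, followed by the 24-bit target address low byte first) and an RTS ($60). A minimal Python sketch of building and applying them follows; the helper names are hypothetical, and it assumes you have the headerless ROM loaded into a bytearray.

```python
def build_hook(target_cpu_address):
    """Return the bytes for: JSL target_cpu_address ; RTS"""
    low = target_cpu_address & 0xFF
    high = (target_cpu_address >> 8) & 0xFF
    bank = (target_cpu_address >> 16) & 0xFF
    return bytes([0x22, low, high, bank, 0x60])


def apply_hook(rom, file_offset, target_cpu_address):
    """Overwrite five bytes of the ROM image (a bytearray) with the hook."""
    hook = build_hook(target_cpu_address)
    if file_offset + len(hook) > len(rom):
        raise ValueError("offset is past the end of the ROM image")
    rom[file_offset:file_offset + len(hook)] = hook


print(build_hook(0xF100FD).hex(" ").upper())  # 22 FD 00 F1 60
```

The RTS is there because the original routine is called with JSR from within bank C1, while the new code in F1 ends with RTL to get back to the hook.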

If you want to see it work, you'll also have to alter the sprites to use the newly expanded color selection. I'm including the two altered sprite sheets that I used to test it. Locke and Terra both use their original palettes, but with different colors.

OK, and finally, I'm including my assembler code, in case you want to look at it or reassemble it yourself. It's based on the disassembly of C1 provided and commented by Imzogelmo, assassin, Lenophis, & Novalia Spirit, with a good deal of new code that I added. I hope I'm not too embarrassed by its sloppiness.

Code:
;====================================

ADDRESS_BEGIN_DATA .EQU $0700
ADDRESS_BEGIN_DATA_ABSOLUTE .EQU ADDRESS_BEGIN_DATA+$F10000
lbg_data_bitmask .EQU ADDRESS_BEGIN_DATA_ABSOLUTE+0
lbg_data_shifted_bit .EQU lbg_data_bitmask+16

;====================================

_push_varibles_10_through_1F .MACRO

    LDA $10
    PHA
    LDA $12
    PHA
    LDA $14
    PHA
    LDA $16
    PHA
    LDA $18
    PHA
    LDA $1A
    PHA
    LDA $1C
    PHA
    LDA $1E
    PHA

    .ENDM

_pull_varibles_10_through_1F .MACRO

    PLA
    STA $1E
    PLA
    STA $1C
    PLA
    STA $1A
    PLA
    STA $18
    PLA
    STA $16
    PLA
    STA $14
    PLA
    STA $12
    PLA
    STA $10

    .ENDM

;====================================

; The routine that loads character battle graphics and palettes
C1_3D43:    ;C9FF        CMP #$FF        (from C1_316F, C1_3B1A, C1_3B2A, C1_3B3A, C1_3B4A)
    CMP #$FF                ;Accumulator is expected to hold the character sprite ID.
                        ;If the character ID is FF, that means empty
C1_3D45:    ;D001        BNE $3D48
    BNE C1_3D48
C1_3D47:    ;60          RTS
    RTL                    ;rts replaced with rtl

C1_3D48:    ;8514        STA $14        (from only C1_3D45)

    STA $14
C1_3D4A:    ;861C        STX $1C
    STX $1C
C1_3D4C:    ;AA          TAX
    TAX
C1_3D4D:    ;A510        LDA $10
    LDA $10
C1_3D4F:    ;48          PHA
    PHA
C1_3D50:    ;48          PHA
    PHA
C1_3D51:    ;DA          PHX
    PHX
C1_3D52:    ;A61C        LDX $1C
    LDX $1C
C1_3D54:    ;861A        STX $1A
    STX $1A
C1_3D56:    ;A514        LDA $14
    LDA $14
C1_3D58:    ;0A          ASL A
    ASL A
C1_3D59:    ;18          CLC
    CLC
C1_3D5A:    ;6514        ADC $14
    ADC $14
C1_3D5C:    ;AA          TAX
    TAX                    ; X now holds 3 * the character ID passed in in the accumulator. This is to
                        ; offset a 3-byte long pointer.
C1_3D5D:    ;A9C2        LDA #$C2
    LDA #$C2                ; This will be the high byte of a 3-byte pointer (to data in C2)
C1_3D5F:    ;8516        STA $16
    STA $16
C1_3D61:    ;8B          PHB
    PHB
C1_3D62:    ;A97F        LDA #$7F
    LDA #$7F
C1_3D64:    ;48          PHA
    PHA
C1_3D65:    ;AB          PLB
    PLB                    ; We're now using $7f as our data bank
C1_3D66:    ;BF45CEC2    LDA $C2CE45,X    (High byte of pointer to start of character battle graphics)
    LDA >$C2CE45,X
C1_3D6A:    ;8512        STA $12
    STA $12
C1_3D6C:    ;C220        REP #$20
    REP #$20                ;set the accumulator to 16 bit
    .LONGA ON
C1_3D6E:    ;BF43CEC2    LDA $C2CE43,X     (Pointer to start of character battle graphics)
    LDA >$C2CE43,X
C1_3D72:    ;8510        STA $10
    STA $10
C1_3D74:    ;A945C7      LDA #$C745
    LDA #$C745                ; Note that we already stored #$C2 in $16
C1_3D77:    ;8514        STA $14
    STA $14                    ; Combined with the #$C2 bank byte stored earlier, $14-$16 now hold a
                        ; 24-bit address. This is a data section in C2. I believe it contains
                        ; the data on how to compose 8x8 tiles into sprite tiles.
C1_3D79:    ;A61A        LDX $1A
    LDX $1A                    ; This re-loads the original value which was stored in X when we entered the function.
                        ; This should be $0000, $2000, $4000, or $6000, depending on whether this is
                        ; battle-character 1, 2, 3 or 4.
C1_3D7B:    ;A90001      LDA #$0100
    LDA #$0100                ; $1A will be a counter for an outer loop, counting down from 256
C1_3D7E:    ;851A        STA $1A
    STA $1A
C1_3D80:    ;A91000      LDA #$0010
    LDA #$0010                ; This begins an inner loop...
C1_3D83:    ;8518        STA $18
    STA $18                    ; And the inner loop counter counts down from 16
C1_3D85:    ;A714        LDA [$14]
    LDA [$14]                ; Load from the pointer stored in DP addresses $14-16. Again, these point to data in
                        ; C2.
C1_3D87:    ;C9FFFF      CMP #$FFFF
    CMP #$FFFF                ; #$FFFF appears to represent a blank tile (or blank line?). Special code handles this
                        ; case.
C1_3D8A:    ;D00C        BNE $3D98
    BNE C1_3D98                ; Otherwise, skip ahead to the normal case.
C1_3D8C:    ;7B          TDC
    TDC                    ; (the =#$FFFF case)
                        ; TDC copies the direct page register into the accumulator. That register is
                        ; presumably 0 here, so this zeroes A in one byte, versus three for LDA #$0000.
C1_3D8D:    ;9D0000      STA $0000,X
    STA |$0000,X                ; Now we write #$0000 16 times (for a total of 32 bytes)
C1_3D90:    ;E8          INX
    INX
C1_3D91:    ;E8          INX
    INX
C1_3D92:    ;C618        DEC $18
    DEC $18
C1_3D94:    ;D0F7        BNE $3D8D
    BNE C1_3D8D                ; loop...
C1_3D96:    ;800E        BRA $3DA6
    BRA C1_3DA6                ; when loop ends, skip over the "normal" code and continue
C1_3D98:    ;A8          TAY
    TAY                    ; Here begins the normal != #$FFFF case
C1_3D99:    ;B710        LDA [$10],Y
    LDA [$10],Y                ; Load from the long address stored in $10-$12 (the pointer to the start of this
                        ; character's battle graphics), offset by Y, which holds the tile offset
                        ; we just read from the C2 table via [$14].
C1_3D9B:    ;9D0000      STA $0000,X
    STA |$0000,X                ; Store the loaded graphics in memory
C1_3D9E:    ;E8          INX
    INX
C1_3D9F:    ;E8          INX
    INX
C1_3DA0:    ;C8          INY
    INY
C1_3DA1:    ;C8          INY
    INY                    ; ... and increment the counters by 2 bytes each
C1_3DA2:    ;C618        DEC $18
    DEC $18                    ; loop 16 times
C1_3DA4:    ;D0F3        BNE $3D99
    BNE C1_3D99
C1_3DA6:    ;E614        INC $14
    INC $14
C1_3DA8:    ;E614        INC $14
    INC $14                    ; Increment this pointer into C2 (tile-related data)
C1_3DAA:    ;C61A        DEC $1A
    DEC $1A                    ; Decrement the larger, outer counter, which had the initial value of 256 (#$0100)
C1_3DAC:    ;D0D2        BNE $3D80
    BNE C1_3D80                ; This ends the outer loop. In total, with the inner and outer loop, we will load data
                        ; 256 * 16 times, * 2 bytes per loop, = 8192 bytes (#$2000).
C1_3DAE:    ;7B          TDC
    TDC                    ; This, again is meant to set accum to 0, I think
C1_3DAF:    ;E220        SEP #$20
    SEP #$20                ; Set the accumulator to 8-bit mode
    .LONGA OFF
C1_3DB1:    ;A61C        LDX $1C
    LDX $1C                    ; Restore X to its original value, viz. #$0000, #$2000, #$4000, or #$6000, depending
                        ; on which battle character this is.
C1_3DB3:    ;A940        LDA #$40
    LDA #$40                ; We're going to repeat this loop 64 times
C1_3DB5:    ;8512        STA $12
    STA $12                    ; Loop counter stored in $12
C1_3DB7:    ;BDC003      LDA $03C0,X
    LDA $03C0,X                ; Starting with byte $03C0 ...
C1_3DBA:    ;0A          ASL A
    ASL A                    ; ... we repeatedly ASL A ...
C1_3DBB:    ;6610        ROR $10
    ROR $10                    ; ... and ROR $10, with carry ...
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A                    ; ... repeated 8 times ...
    ROR $10                    ; ... this has the effect of reversing the order of the bits.
C1_3DD2:    ;A510        LDA $10
    LDA $10                    ; Then we take the reversed-bit result...
C1_3DD4:    ;9DC003      STA $03C0,X
    STA $03C0,X                ; ... and save it back to this memory location.
C1_3DD7:    ;BDC010      LDA $10C0,X
    LDA $10C0,X                ; Then we repeat the same thing for another memory location.
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10    
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10    
C1_3DF2:    ;A510        LDA $10
    LDA $10
C1_3DF4:    ;9DC010      STA $10C0,X
    STA $10C0,X                ; ...and save it
C1_3DF7:    ;E8          INX
    INX
C1_3DF8:    ;C612        DEC $12
    DEC $12
C1_3DFA:    ;D0BB        BNE $3DB7
    BNE C1_3DB7                ; Repeat this loop 64 times, for 64 bytes in each of two locations.
C1_3DFC:    ;AB          PLB
    PLB
C1_3DFD:    ;FA          PLX
    PLX
C1_3DFE:    ;68          PLA
    PLA                    ; Restoring a bunch of stuff from the stack
C1_3DFF:    ;0A          ASL A
    ASL A                    ; Shifting bits by 5. I think this has to do with multiplying the character ID by
                        ; 32, in order to get an offset.
C1_3E00:    ;0A          ASL A
    ASL A
C1_3E01:    ;0A          ASL A
    ASL A
C1_3E02:    ;0A          ASL A
    ASL A
C1_3E03:    ;0A          ASL A
    ASL A
C1_3E04:    ;DA          PHX
    PHX
C1_3E05:    ;AA          TAX
    TAX
C1_3E06:    ;BDAE2E      LDA $2EAE,X
    LDA $2EAE,X                ; This checks the character ID...
C1_3E09:    ;C90E        CMP #$0E
    CMP #$0E                ; I believe this section has something to do with handling the palette in the
                        ; special case where the character has imp status, and possibly other special status.
C1_3E0B:    ;D012        BNE $3E1F
    BNE C1_3E1F                ; In any case, we skip it in the general case...
C1_3E0D:    ;BDC62E      LDA $2EC6,X
    LDA $2EC6,X
C1_3E10:    ;C901        CMP #$01
    CMP #$01
C1_3E12:    ;D00B        BNE $3E1F
    BNE C1_3E1F
C1_3E14:    ;ADA01E      LDA $1EA0
    LDA $1EA0
C1_3E17:    ;2908        AND #$08
    AND #$08
C1_3E19:    ;F004        BEQ $3E1F
    BEQ C1_3E1F
C1_3E1B:    ;FA          PLX
    PLX
C1_3E1C:    ;7B          TDC
    TDC
C1_3E1D:    ;8005        BRA $3E24
    BRA C1_3E24
C1_3E1F:    ;FA          PLX

                        ; Jump here if we don't do the special (imp?) palette thing, or after doing it.
                        ; But before doing C1/3E1F, new code.


load_battle_graphics_new:            ; new code begins, in which we mess with the tile and palette values to make it
                        ; so we can use the last four palette colors for battle PCs.


    PHP                    ; push status register

    REP #$30                ; Make accum/index 16 bit
    .LONGA ON
    .LONGI ON

    PHA                    ; push A
    PHX                    ; and X
    PHY                    ; and Y

    LDA $1C                    ; Load into A the value originally stored in X, viz. $0000, $2000... &c
                        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000                ; Point to the end of the data we just loaded for this character

    STA $10                    ; Store into dp $10-$11, to use as index
    LDA #$7F                ; the third byte of the pointer
    STA $12

    LDA #0
    STA $1A                    ; $1A-$1B will be 16 bits, each representing a palette entry, telling us which
                        ; palette numbers have been used in any pixel of this sprite data.

; We're going to loop through every pixel we just stored in memory, checking the 4-bit pixel/palette value, to
; see which values are used by this sprite and which ones are not.

load_battle_graphics_outer_byte_loop:

    LDA $10
    AND #$000F                ; Check if this is a multiple of 16
    BNE load_battle_graphics_outer_byte_loop_2    ; if not, don't do anything special
    LDA $10
    SEC
    SBC #16                    ; If so, subtract an additional 16 (to account for the weird bitplane format)
    STA $10

load_battle_graphics_outer_byte_loop_2:
    DEC $10
    DEC $10
    LDX #0                    ; X is holding the bit offset (offset of data containing 8 bitmasks, 2 bytes each)

load_battle_graphics_bit_loop:

    LDA #0
    STA $14                    ; $14-$15 are holding temp data on which palette is used by this pixel
    SEC

lbg_byte_0:

    LDY #0                    ; Y is holding the byte offset (1-4)
    LDA [$10],Y                ; Load the start of this 4-byte block of gfx data
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_byte_1                ; If this is equal to 0, don't change the pixel palette bit
    ROL $14                    ; Otherwise, shift it by 1

lbg_byte_1:

    LDY #1                    ; Y is holding the byte offset (the four offsets used are 0, 1, 16, 17)
    LDA [$10],Y                ; Load the second of the four bitplane bytes for this row of gfx data
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_byte_2                ; If this is equal to 0, don't change the pixel palette bit
    ROL $14
    ASL $14                    ; Otherwise, shift it by 2

lbg_byte_2:

    LDY #16                    ; Y is holding the byte offset (the four offsets used are 0, 1, 16, 17)
    LDA [$10],Y                ; Load the third of the four bitplane bytes for this row of gfx data
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_byte_3                ; If this is equal to 0, don't change the pixel palette bit
    ROL $14
    ASL $14
    ASL $14
    ASL $14                    ; Otherwise, shift it by 4

lbg_byte_3:

    LDY #17                    ; Y is holding the byte offset (the four offsets used are 0, 1, 16, 17)
    LDA [$10],Y                ; Load the fourth of the four bitplane bytes for this row of gfx data
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_byte_done            ; If this is equal to 0, don't change the pixel palette bit
    ROL $14
    ASL $14
    ASL $14
    ASL $14
    ASL $14
    ASL $14
    ASL $14
    ASL $14                    ; Otherwise, shift it by 8

lbg_byte_done:

    ROL $14                    ; Shift by 1 more, so "no bits set" / "palette 1" corresponds to bit 1.

    LDA $1A                    ; Load the cumulative "palette entries used" data
    ORA $14                    ; Turn the correct bit on (if it isn't already)
    STA $1A                    ; And store it back

lbg_bit_done:

    INX
    INX                    ; advance the bitmask-pointer-index by one word
    TXA
    CMP #16                    ; Check if we've done all of them
    BNE load_battle_graphics_bit_loop    ; do the next bit if not

lbg_outer_byte_done:

    LDA $10                    ; Load the data pointer
    CMP $1C                    ; Have we reached the start of this character's gfx data?
    BNE load_battle_graphics_outer_byte_loop ; If not, then loop

lbg_palette_usage_loop_done:

    LDA $1A                    ; Palette entries 1, 2, 7 & 8 should always be considered used...
    ORA #%0000000011000011            ; (these are clear, black / outline, and two skin colors)
    STA $1A                    ; Since these are manipulated dynamically in battle, we shouldn't mess with them.

; $13, $14, $15 and $16 (single-byte values) will store data on which bitmap / palette values to switch. The low 4-bits should contain
; the original value (found in the first 12 entries, but unused), and the high 4-bits should contain the switched value (found in the
; last 4 entries, but used)

lbg_calculate_replacement_colors:

    LDA #0

    SEP #$20                ;8-bit accumulator
    .LONGA OFF

    STA $13                    ; initialize the four replacements to zero
    STA $14
    STA $15
    STA $16

    REP #$20                ;16-bit accumulator
    .LONGA ON

lbg_find_unused_color_pre:

    LDX #0                    ; X will be used to count 1-12 (word data) for the first 12 palette entries

lbg_find_unused_color:

    LDA $1A
    AND >lbg_data_shifted_bit,X        ; Check against the appropriate bitmask, to see if the color is used
    BEQ lbg_find_expanded_color_pre        ; If zero (unused), look for a replacement in palette numbers 13-16
    
    INX
    INX                    ; If not, move to the next palette entry
    TXA
    CMP #24                    ; Check if we've reached the end of the first 12 (non-expanded colors)
    BNE lbg_find_unused_color        ; If not, continue to check
    JMP lbg_done_replacement_colors        ; If so, there are no more unused colors, and we're done.

lbg_find_expanded_color_pre:
    
    TXY                    ; Store the original (1-12) color in Y temporarily
    LDX #24                    ; now X will be used to index the "expanded" colors

lbg_find_expanded_color:

    LDA $1A
    AND >lbg_data_shifted_bit,X        ; Check against the appropriate bitmask, to see if the color is used
    BNE lbg_set_replacement            ; If non-zero (used), store the replacement

    INX
    INX                    ; else, move to the next palette entry
    TXA
    CMP #32                    ; Check if we've reached the last palette entry
    BNE lbg_find_expanded_color        ; If not, continue to loop through the expanded colors
    JMP lbg_done_replacement_colors        ; If so, there are no more used expanded colors, and we are done

lbg_set_replacement:

    SEP #$30                ;8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF

lbg_replace_in_13:

    LDA $13
    BNE lbg_replace_in_14            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $13

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $13                    ; Transfer the high bits we already stored
    STA $13                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_14:

    LDA $14
    BNE lbg_replace_in_15            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $14

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $14                    ; Transfer the high bits we already stored
    STA $14                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_15:

    LDA $15
    BNE lbg_replace_in_16            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $15

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $15                    ; Transfer the high bits we already stored
    STA $15                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_16:

    LDA $16
    BNE lbg_done_replacement_colors        ; If it's != 0 (value already set), we can't do any more replacements

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $16

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $16                    ; Transfer the high bits we already stored
    STA $16                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_done:

    REP #$30                ;16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    PHX
    TYX

    LDA >lbg_data_shifted_bit,X        ; Get the bitmask for the original color
    EOR $1A                    ; Switch this value from unused to used, so we don't replace the same one
    STA $1A

    PLX

    LDA >lbg_data_shifted_bit,X        ; Get the bitmask for the replacement color
    EOR $1A                    ; Switch this value from used to unused, so we don't make the same replacement
    STA $1A

    JMP lbg_find_unused_color_pre        ; Return to the beginning of the loop

lbg_done_replacement_colors:


;======================
; This part of the code goes through each pixel of the sprite, checks if it is one that we need to change, and changes it
; if necessary.

lbg_pixel_changing_loop_pre:

    REP #$30                ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $1C                    ; Load into A the value originally stored in X, viz. $0000, $2000... &c
                        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000                ; Point to the end of the data we loaded for this character
    STA $10                    ; Store into dp $10-$11, to use as index

lbg_bitmap_outer_byte_loop:

    REP #$30                ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $10
    AND #$000F                ; Check if this is a multiple of 16
    BNE lbg_bitmap_outer_byte_loop_2    ; if not, don't do anything special
    LDA $10
    SEC
    SBC #16                    ; If so, subtract an additional 16 (to account for the weird bitplane format)
    STA $10

lbg_bitmap_outer_byte_loop_2:
    DEC $10
    DEC $10

    LDX #0                    ; X is holding the bit offset (offset of data containing 8 bitmasks, 2 bytes each)

    SEP #$30                ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF

lbg_bitmap_bit_loop:

    LDA #0
    STA $18                    ; $18 will hold the (original) palette number of the current pixel

lbg_bitmap_byte_0:

    LDY #0                    ; Y is holding the byte offset (0, 1, 16 or 17), one per bitplane
    LDA [$10],Y                ; Load the first bitplane byte for this row of pixels
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_bitmap_byte_1            ; If this is equal to 0, don't change the pixel palette bit

    LDA $18
    ORA #%1                    ; Otherwise, turn on the first bit
    STA $18

lbg_bitmap_byte_1:

    LDY #1                    ; Y is holding the byte offset (0, 1, 16 or 17), one per bitplane
    LDA [$10],Y                ; Load the second bitplane byte for this row of pixels
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_bitmap_byte_2            ; If this is equal to 0, don't change the pixel palette bit

    LDA $18
    ORA #%10                ; Otherwise, turn on the second bit
    STA $18

lbg_bitmap_byte_2:

    LDY #16                    ; Y is holding the byte offset (0, 1, 16 or 17), one per bitplane
    LDA [$10],Y                ; Load the third bitplane byte for this row of pixels
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_bitmap_byte_3            ; If this is equal to 0, don't change the pixel palette bit

    LDA $18
    ORA #%100                ; Otherwise, turn on the third bit
    STA $18

lbg_bitmap_byte_3:

    LDY #17                    ; Y is holding the byte offset (0, 1, 16 or 17), one per bitplane
    LDA [$10],Y                ; Load the fourth bitplane byte for this row of pixels
    AND >lbg_data_bitmask,X            ; Clear everything but the bit we're interested in

    BEQ lbg_bitmap_byte_done        ; If this is equal to 0, don't change the pixel palette bit

    LDA $18
    ORA #%1000                ; Otherwise, turn on the fourth bit
    STA $18

lbg_bitmap_byte_done:

    SEP #$20
    .LONGA OFF                ; 8-bit accumulator

lbg_bitmap_check_13:
    LDA $13                    ; check if this is one of the expanded palette numbers we want to change
    BEQ lbg_bitmap_bit_done            ; If both values are 0, this switch is not used.

    STA $1A                    ; store a copy of this in $1A, so we can access it if it's a match
    LSR
    LSR
    LSR
    LSR                    ; shift right 4, because the expanded palette is in the high bits
    CMP $18                    ; compare to the number for this pixel
    BEQ lbg_bitmap_make_switch_pre

lbg_bitmap_check_14:
    LDA $14                    ; check if this is one of the expanded palette numbers we want to change
    BEQ lbg_bitmap_bit_done            ; If both values are 0, this switch is not used.

    STA $1A                    ; store a copy of this in $1A, so we can access it if it's a match
    LSR
    LSR
    LSR
    LSR                    ; shift right 4, because the expanded palette is in the high bits
    CMP $18                    ; compare to the number for this pixel
    BEQ lbg_bitmap_make_switch_pre

lbg_bitmap_check_15:
    LDA $15                    ; check if this is one of the expanded palette numbers we want to change
    BEQ lbg_bitmap_bit_done            ; If both values are 0, this switch is not used.

    STA $1A                    ; store a copy of this in $1A, so we can access it if it's a match
    LSR
    LSR
    LSR
    LSR                    ; shift right 4, because the expanded palette is in the high bits
    CMP $18                    ; compare to the number for this pixel
    BEQ lbg_bitmap_make_switch_pre

lbg_bitmap_check_16:
    LDA $16                    ; check if this is one of the expanded palette numbers we want to change
    BEQ lbg_bitmap_bit_done            ; If both values are 0, this switch is not used.

    STA $1A                    ; store a copy of this in $1A, so we can access it if it's a match
    LSR
    LSR
    LSR
    LSR                    ; shift right 4, because the expanded palette is in the high bits
    CMP $18                    ; compare to the number for this pixel
    BEQ lbg_bitmap_make_switch_pre

    JMP lbg_bitmap_bit_done            ; If none are a match, we're done with this bit

lbg_bitmap_make_switch_pre:

    LDY #0                    ; Y stores the index into the bitplane byte (0, 1, 16 or 17)
    LDA #1
    STA $1B                    ; 1B is a bitmask

lbg_bitmap_make_switch_loop:
    LDA $1A                    ; load the palette-switch value

    AND $1B                    ; Check the appropriate bit
    BEQ lbg_bitmap_make_switch_reset    ; If zero, reset the bit
    BRA lbg_bitmap_make_switch_set        ; otherwise, set it

lbg_bitmap_make_switch_set:

    LDA >lbg_data_bitmask,X                ; set the bit
    ORA [$10],Y
    STA [$10],Y
    BRA lbg_bitmap_make_switch_increment

lbg_bitmap_make_switch_reset:

    LDA >lbg_data_bitmask,X                ; reset the bit
    EOR #%11111111
    AND [$10],Y
    STA [$10],Y

lbg_bitmap_make_switch_increment:

    INY
    TYA
    CMP #2                        ; We want to jump from offset 1 straight to offset 16
    BNE lbg_bitmap_make_switch_increment_2
    LDY #16
lbg_bitmap_make_switch_increment_2:
    ASL $1B
    TYA
    CMP #18
    BNE lbg_bitmap_make_switch_loop

lbg_bitmap_bit_done:

    STZ $18
    INX
    INX                    ; advance the bitmask-pointer-index by one word
    TXA
    CMP #16                    ; Check if we've done all of them
    BEQ lbg_bitmap_outer_byte_done        ; if so, this row of pixels is finished
    JMP lbg_bitmap_bit_loop            ; if not, do the next bit

lbg_bitmap_outer_byte_done:

    REP #$30                ; 16 bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $10                    ; Load the data pointer
    CMP $1C                    ; Have we reached the start of this character's gfx data?
    BEQ lbg_bitmap_loop_done
    JMP lbg_bitmap_outer_byte_loop         ; If not, then loop

lbg_bitmap_loop_done:

    PLY
    PLX
    PLA                    ; Pull everything we pushed onto the stack
    PLP


;======================
; Finally, first section of new code is done, and we pick up where we left off

;C1/3E1F:    ;FA          PLX
    PLX                    ; Jump here if we don't do the special (imp?) palette thing, or after doing it.
C1_3E20:    ;BF2BCEC2    LDA $C2CE2B,X
    LDA >$C2CE2B,X                ; $C2CE2B: Battle Character Palette Assignments (1 byte each)
C1_3E24:    ;C220        REP #$20
    REP #$20                ; Accumulator back to 16 bit
    .LONGA ON
C1_3E26:    ;0A          ASL A
    ASL A                    ; Again ASL 5 times. A is the character battle palette assignment. This means we will
C1_3E27:    ;0A          ASL A        ; multiply that number by 32 when using it as an offset.
    ASL A
C1_3E28:    ;0A          ASL A
    ASL A
C1_3E29:    ;0A          ASL A
    ASL A
C1_3E2A:    ;0A          ASL A
    ASL A
C1_3E2B:    ;AA          TAX         ; store the offset in  X
    TAX
C1_3E2C:    ;7B          TDC
    TDC
C1_3E2D:    ;E220        SEP #$20
    SEP #$20                ; Long A off again
    .LONGA OFF
C1_3E2F:    ;68          PLA
    PLA
C1_3E30:    ;0A          ASL A
    ASL A
C1_3E31:    ;0A          ASL A
    ASL A
C1_3E32:    ;0A          ASL A
    ASL A
C1_3E33:    ;0A          ASL A
    ASL A
C1_3E34:    ;0A          ASL A
    ASL A
C1_3E35:    ;A8          TAY
    TAY
C1_3E36:    ;5A          PHY
    PHY
C1_3E37:    ;A918        LDA #$18
    LDA #$18                ; Load the number #$18 -- 24 in decimal. Probably for 12 palette colors.
C1_3E39:    ;8510        STA $10
    STA $10                    ; $10 is being used as the loop index
C1_3E3B:    ;BF0063ED    LDA $ED6300,X
    LDA >$ED6300,X
C1_3E3F:    ;99AD81      STA $81AD,Y
    STA $81AD,Y
C1_3E42:    ;E8          INX
    INX
C1_3E43:    ;C8          INY
    INY
C1_3E44:    ;C610        DEC $10
    DEC $10
C1_3E46:    ;D0F3        BNE $3E3B
    BNE C1_3E3B

; new code for switching the palette assignments

lbg_new_switch_palette:

    PHP                    ; push status register

    REP #$30
    .LONGA ON
    .LONGI ON

    PHA                    ; push A
    PHX                    ; and X
    PHY                    ; and Y

    _push_varibles_10_through_1F

; X and Y have both been incremented #$18 (24) times, and we want to restore them to their original state
    TXA
    SEC
    SBC #$18
    STA $1A

    TYA
    SEC
    SBC #$18
    STA $1C

    LDA $13                    ; load the first switcheroo, located in $13
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A
    TAX

    LDA $13
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $14
    LDA $14                    ; now the one located in $14
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $14
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $15
    LDA $15                    ; now the one located in $15
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $15
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $16
    LDA $16                    ; now the one located in $16
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $16
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

    _pull_varibles_10_through_1F

    PLY
    PLX
    PLA
    PLP

; done with palette switching code

C1_3E48:    ;FA          PLX
    PLX
C1_3E49:    ;FEC461      INC $61C4,X
    INC $61C4,X
C1_3E4C:    ;60          RTS
    RTL                    ; changed to rtl from rts

    .ORG ADDRESS_BEGIN_DATA
;lbg_data_bitmask
    .WORD %0000000000000001, %0000000000000010, %0000000000000100, %0000000000001000, %0000000000010000, %0000000000100000, %0000000001000000, %0000000010000000
;lbg_data_shifted_bit:
    .WORD %1, %10, %100, %1000, %10000, %100000, %1000000, %10000000, %100000000, %1000000000, %10000000000, %100000000000, %1000000000000, %10000000000000, %100000000000000, %1000000000000000
[-] The following 4 users say Thank You to Eggers for this post:
  • Gi Nattak (08-09-2013), Murak Modder (01-26-2014), Royaken (03-12-2014), SSJ Rick (08-09-2013)

#2
Posts: 763
Threads: 83
Thanks Received: 55
Thanks Given: 7
Joined: Apr 2015
Reputation: 22
Status
Obliviscence
This is absolutely brilliant. When I get time (probably not till next week), I will definitely pick apart your code. For the longest time I wanted to make a patch that allowed all 16 colors, but I am a total n00b when it comes to the graphics engine. But THIS is right up my alley: pure logic in the loading. I'm sure I can get this down to nice compact code. This will help me a freaking ton in my hack. Thank you.


Edit: it looks like your JSL is wrong. It's jumping to F1/00FD instead of F1/01D9 where you told us to put the code. It eventually gets to the code, but just thought you should know.

Also, this doesn't seem to be working for me. ZSNESW hangs forever, and snes9x crashes after a moment. I am using my snes9x debugger to see where it breaks down. I see it's running F1/0207 to F1/0265 over and over, but I think that's intended... again, I haven't had a chance to look at the code yet.

Edit 2: To help me out a little, can you give me a quick rundown of how the game interprets the sprite data in relation to how it extracts the color in the palette that gets used (for the sprites themselves, not for your patch)? I see that there is 1 nibble for each pixel in a spritesheet, which I'm assuming means each nibble corresponds to a pixel. But looking at the data, I don't see how this maps to the actual spritesheet unless it goes in some weird order.

#3
Posts: 45
Threads: 4
Thanks Received: 7
Thanks Given: 1
Joined: Jul 2013
Reputation: 4
Status
None
Oops! Sorry about the mistake. I gave the wrong location to paste the code. It should be 3100FD. I gave the correct address when I pasted the long jump statement from C1, but the wrong one when I told you where to paste the new function. (The reason I got mixed up is that the address I originally gave is the start of MY new code, but there is also code duplicated from the original function in C1.)

About how the sprite data is stored. This actually took a long time to figure out. There are useful docs on http://wiki.superfamicom.org/snes/show/HomePage, but even after reading the docs there I was still a bit confused.

(edited this description for clarity, typos regarding bit/byte, &c)

- OK, so, each pixel needs four bits to store its color index. But, those four bits aren't in the same byte.

- Each byte stores basically a "row" of binary / two-color graphics data. 8 pixels long by 1 pixel high.

- A byte for one row is followed immediately by another byte for the same row, but containing the second "bitplane". The bits/pixels in the first byte, with the corresponding bits/pixels in the second byte, can each hold a two-bit color value (four possible colors).

- This repeats for 8 rows, meaning a single 8 x 8 FOUR-COLOR graphics tile takes up 16 bytes.

Here's a diagram, representing bytes as boxes, showing rows (R1 / R2 / &c...), as well as bitplanes "A" and "B".

Code:
$00      $01       $02        $03      $04       $05       $06       $07
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R1  |  | R1  |   | R2  |   | R2  |   | ... |   |     |   |     |   |     |
| A   |  | B   |   | A   |   | B   |   |     |   |     |   |     |   |     |
_______  _______   _______   _______   _______   _______   _______   _______

  $08      $09       $0A        $0B      $0C       $0D       $0E       $0F
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R5  |  | R5  |   | ... |   |     |   |     |   |     |   | R8  |   | R8  |
| A   |  | B   |   |     |   |     |   |     |   |     |   | A   |   | B   |
_______  _______   _______   _______   _______   _______   _______   _______

- OK, so, that's a four-color image. Now you might think a 16-color image would just have 4 bitplanes in a row, A B C D. But no. It turns out it has 16 bytes holding bitplanes A and B (laid out exactly like a four-color tile), then another 16 bytes holding bitplanes C and D of the same tile. It's as if it's two separate 8x8 pixel tiles in a row, each with 2 bits of color depth. But when they're interpreted as 4-bit art, they're displayed as one tile with 4 bits of color depth.

Code:
$00      $01       $02        $03      $04       $05       $06       $07
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R1  |  | R1  |   | R2  |   | R2  |   | ... |   |     |   |     |   |     |
| A   |  | B   |   | A   |   | B   |   |     |   |     |   |     |   |     |
_______  _______   _______   _______   _______   _______   _______   _______

  $08      $09       $0A        $0B      $0C       $0D       $0E       $0F
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R5  |  | R5  |   | ... |   |     |   |     |   |     |   | R8  |   | R8  |
| A   |  | B   |   |     |   |     |   |     |   |     |   | A   |   | B   |
_______  _______   _______   _______   _______   _______   _______   _______

  $10      $11       $12       $13       $14       $15       $16       $17
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R1  |  | R1  |   | R2  |   | R2  |   | ... |   |     |   |     |   |     |
| C   |  | D   |   | C   |   | D   |   |     |   |     |   |     |   |     |
_______  _______   _______   _______   _______   _______   _______   _______

  $18      $19       $1A       $1B       $1C       $1D       $1E       $1F
_______  _______   _______   _______   _______   _______   _______   _______
|     |  |     |   |     |   |     |   |     |   |     |   |     |   |     |
| R5  |  | R5  |   | ... |   |     |   |     |   |     |   | R8  |   | R8  |
| C   |  | D   |   |     |   |     |   |     |   |     |   | C   |   | D   |
_______  _______   _______   _______   _______   _______   _______   _______

And that's how a 16-color graphics tile is stored. Which makes the bit manipulation weird and cumbersome. This is why there are parts of my code where I'm reading offset 0, followed by offset 1, followed by offset 16, followed by offset 17 (to get the four bitplanes for the same pixel of the same row of the same tile).
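
To make the layout concrete, here is a rough Python sketch (not part of the patch) of reading one pixel's color index out of a 32-byte tile. The function name `pixel_color` is made up for illustration, and it assumes bit 7 of each byte is the leftmost pixel of the row.

```python
def pixel_color(tile: bytes, row: int, col: int) -> int:
    """Return the 4-bit palette index of one pixel in a 32-byte 16-color tile.

    Hypothetical helper for illustration; assumes bit 7 is the leftmost pixel.
    """
    base = row * 2            # each row has two bytes (planes A and B) in the first half
    mask = 0x80 >> col        # select the bit for this column
    color = 0
    # Offsets 0, 1, 16, 17 hold planes A, B, C, D for the same row.
    for bit, offset in enumerate((0, 1, 16, 17)):
        if tile[base + offset] & mask:
            color |= 1 << bit
    return color


# Build a tile whose top-left pixel has planes A, C and D set: index 1 + 4 + 8.
example_tile = bytearray(32)
example_tile[0] = 0x80    # plane A, row 1
example_tile[16] = 0x80   # plane C, row 1
example_tile[17] = 0x80   # plane D, row 1
example_tile[15] = 0x01   # plane B, row 8, rightmost pixel
top_left = pixel_color(example_tile, 0, 0)
bottom_right = pixel_color(example_tile, 7, 7)
```

The same 0 / 1 / 16 / 17 stepping is what the assembly does with the Y register.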

Hope this helps, and I hope you get the code working once you paste it in the right place. It's working for me with snes9x debug version.



So, I'm including some more notes on how my code works, since the "big picture" comments might be too scarce, and the labels might not be descriptive enough.

I'm also including my notes on how I think it can be sped up / how I plan to speed it up next revision.

There are basically four sections to it.

Section one, which starts with label load_battle_graphics_new:

This takes place after some original-game code (not written by me) loads the graphics data from the ROM into $2000 bytes of space in RAM.

It goes systematically through every pixel in every frame of the character sprite, calculates which palette entry that pixel uses (based on all four bitplanes), and "checks off" that palette color as used, if it has been used.

Section two, which starts with label lbg_calculate_replacement_colors:

Just stores, in a different format, up to four replacement pairs (one byte each, in $13-$16). Each pair contains an unused color (from the set of 12 which are usable in the unaltered rom) and a corresponding "expanded" color (from the last 4, normally unusable), where the second should replace the first.
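
As a small illustration of that format (a Python sketch with made-up function names, based on how the assembly shifts and ORs the values), each replacement byte keeps the expanded color in the high four bits and the original color in the low four bits. A byte of 0 is treated by the later code as "slot not used".

```python
def pack_replacement(expanded: int, original: int) -> int:
    """Pack one replacement pair into a byte: expanded color high, original color low."""
    return ((expanded & 0x0F) << 4) | (original & 0x0F)


def unpack_replacement(value: int) -> tuple:
    """Split a packed byte back into (expanded, original)."""
    return value >> 4, value & 0x0F


# Example: expanded color 13 takes over the slot of unused color 5.
packed = pack_replacement(13, 5)
pair = unpack_replacement(packed)
```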

Section three, which starts with label lbg_pixel_changing_loop_pre:

Goes through all the graphics tile data AGAIN, and does the replacement, in a similar bit-by-bit way to section 1.
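
The per-pixel rewrite can be sketched like this in Python (an illustration only, with a hypothetical function name; it assumes bit 7 is the leftmost pixel). For each of the four bitplane bytes, the pixel's bit is either set or cleared depending on the matching bit of the new color number.

```python
def set_pixel_color(tile: bytearray, row: int, col: int, color: int) -> None:
    """Rewrite one pixel of a 32-byte 16-color tile so it uses palette index `color`."""
    base = row * 2
    mask = 0x80 >> col
    # Walk planes A, B, C, D at offsets 0, 1, 16, 17 of this row.
    for bit, offset in enumerate((0, 1, 16, 17)):
        if color & (1 << bit):
            tile[base + offset] |= mask             # set the bit
        else:
            tile[base + offset] &= ~mask & 0xFF     # reset the bit


demo_tile = bytearray(32)
set_pixel_color(demo_tile, 0, 0, 13)    # pixel uses expanded color 13
after_first = bytes(demo_tile[i] for i in (0, 1, 16, 17))
set_pixel_color(demo_tile, 0, 0, 5)     # remap it to original color 5
after_second = bytes(demo_tile[i] for i in (0, 1, 16, 17))
```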

Section four, which starts with label lbg_new_switch_palette:

This takes place after some more code which was part of the original function. This just replaces the palette colors. Section three replaced the graphics data, this does the palette data.
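
Roughly, the palette step looks like the following Python sketch. The names and the color values are placeholders, and skipping zero entries is a simplification of this sketch (the assembly just copies color 0 onto itself in that case).

```python
def apply_palette_switches(full_palette, battle_palette, switches):
    """Copy each expanded color into the slot of the unused color it replaces.

    full_palette:   the 16 colors of the character's palette
    battle_palette: the 12 colors actually loaded for battle (modified in place)
    switches:       up to four packed bytes, expanded color high / original color low
    """
    for value in switches:
        if value == 0:          # unused slot (simplification of this sketch)
            continue
        expanded, original = value >> 4, value & 0x0F
        battle_palette[original] = full_palette[expanded]


# Placeholder color values, just to make the movement visible.
full = list(range(100, 116))
battle = full[:12]
apply_palette_switches(full, battle, [0xC3, 0xF9, 0, 0])
```

After this, any pixel that was rewritten from color 12 to color 3 still shows the color the artist intended.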

Now,

efficiency:
- Parts two and four probably take very little time, so it's one and three that we need to worry about.

Pertaining to part 1, where we read which colors are used:
- Unfortunately, there are no shortcuts we can take here in general. It has to go through every pixel to "prove" a particular color is unused.
- If all of the first 12 colors get filled, we could stop then, because then nothing can possibly get switched out. But that won't always happen even for the original sprites. And it's the only case where we can stop early.
- Actually, we could stop after ANY 12 colors are used at least once, and just assume the remaining ones are unused. This would result in unpredictable behavior in cases where the sprite actually uses more than 12 colors, but it should work if the sprite artist follows the "contract".
- The information we get here, once complete, only takes up 16 bits of space. If there is any significant free RAM, we could cache it, and then it only has to be done once per sprite each time the game is loaded. (The only free RAM I know of is in the location that gets stored to SRAM.)
- There might be a faster way to do the bit manipulation?
- Could it be sped up if I combine it with the code present in the original function, where the tile data is originally loaded?
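
For reference, the "which colors are used" pass, including the stop-after-12 idea, could be sketched in Python like this. It is only an illustration of the idea, not a translation of the assembly, and the function name and `stop_at` parameter are made up.

```python
def used_color_mask(gfx: bytes, stop_at: int = 12) -> int:
    """Return a 16-bit mask with bit N set if palette color N appears in the tile data.

    gfx is a sequence of 32-byte 16-color tiles. The scan gives up early once
    `stop_at` different colors have been seen (checked after each row).
    """
    mask = 0
    for tile_start in range(0, len(gfx), 32):
        for row in range(8):
            base = tile_start + row * 2
            a, b = gfx[base], gfx[base + 1]
            c, d = gfx[base + 16], gfx[base + 17]
            for bit in range(8):
                color = (((a >> bit) & 1)
                         | (((b >> bit) & 1) << 1)
                         | (((c >> bit) & 1) << 2)
                         | (((d >> bit) & 1) << 3))
                mask |= 1 << color
            if bin(mask).count("1") >= stop_at:
                return mask     # early exit: assume everything else is unused
    return mask


sample = bytearray(32)
sample[16] = 0x0F   # planes C and D set for four pixels of row 1: color 12
sample[17] = 0x0F
sample_mask = used_color_mask(sample)
```

Note that color 0 (transparent) counts as "used" here, so the threshold may need adjusting depending on how it is counted.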

Pertaining to part 3, where we dynamically change the sprite data:
- The change only potentially has to be done if the sprite uses one of the originally unusable, last four palette values / colors. This could be sped up (for some sprites more than others) if we do a quick check on each row of graphics data to see if any pixels in that row use those values. We could just LOGICAL-AND the last bitplane with the second-last bitplane, to check. Then pixel changing has to be done for that row if the result is non-zero (pixels that use the last four colors have both the last two bitplanes "on").
- Again, I'm not sure if my bit-manipulation code is inefficient here. I've only done a little bit of ASM coding before (in classes) so I'm not the perfect expert at doing this kind of bit prodding in the fewest CPU cycles.
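
The quick per-row check described above is tiny when written out. Here is a Python sketch (hypothetical function name) that ANDs the last two bitplanes of a row; a non-zero result marks exactly the pixels using colors 12-15.

```python
def expanded_pixels_in_row(tile: bytes, row: int) -> int:
    """Return a bitmask of the pixels in this row that use one of the last four colors."""
    base = row * 2
    plane_c = tile[base + 16]
    plane_d = tile[base + 17]
    return plane_c & plane_d    # both high bits on means color index 12..15


check_tile = bytearray(32)
check_tile[16] = 0xF0   # plane C, row 1
check_tile[17] = 0x3C   # plane D, row 1
row1_hits = expanded_pixels_in_row(check_tile, 0)
row2_hits = expanded_pixels_in_row(check_tile, 1)
```

Rows where the result is 0 can be skipped entirely, and for the others the result says which bits need the slow path.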

If I make the above changes, I think that will significantly speed things up, but I don't know if I can make it fast enough to avoid a noticeable delay.

#4
Posts: 45
Threads: 4
Thanks Received: 7
Thanks Given: 1
Joined: Jul 2013
Reputation: 4
Status
None
OK, edited the original post again.

I realized I only pasted the hex code starting at $01D9, when I wanted all the code starting at $00FD. Even after I updated the "paste-to" address to reflect that, I forgot to paste the actual hexes.

They're there now. NOW it should work.

Also, assembling the code I provided should work, and allow you to put the code anywhere, if you have an assembler handy. If you just copy the hexes, it has to be put at the right address, because it uses absolute jumps (JMP) for some of its conditional branches and loops.

#5
Posts: 45
Threads: 4
Thanks Received: 7
Thanks Given: 1
Joined: Jul 2013
Reputation: 4
Status
None
UPDATE

I got inspired, and did a new version of it, with vastly improved efficiency. The most CPU intensive parts of the code have been completely rewritten. I'd say it's now two or three times as fast (based on my subjective judgment -- I didn't do any clocking.)

The "load time" is now much more bearable. I'd say it usually isn't that noticeable. The main circumstance I notice it in is when Terra switches from human to esper form, or back (the animation makes it more obvious than usual). But, even then, though it's noticeable, it isn't bad.

I could probably clean things up just a little bit more, but I consider this close to a "release candidate".

Anyway, I'd like people to test this. I've tested it myself, and I don't notice any bugs, but it's possible there's some circumstance I haven't tried.

I'd also still be happy for any comments on how I can improve the algorithm, though it's already much improved.

I'll wait a while for feedback, and possibly make a few more small changes, before declaring it "done".

Speaking of which, does anyone have any suggestions on how I should "release" this? Should I submit it to romhacking.net or something like that? Considering that it's mainly a patch to help people who are doing their own patches (as opposed to a patch that improves the game in itself), would this site be interested in hosting it or linking to it?

So, here's the new hex, and source:

Code:
paste into 3100FD (F1/00FD)

C9FFD0
016B8514861CAAA5104848DAA61C861A
A5140A186514AAA9C285168BA97F48AB
BF45CEC28512C220BF43CEC28510A945
C78514A61AA90001851AA910008518A7
14C9FFFFD00C7B9D0000E8E8C618D0F7
800EA8B7109D0000E8E8C8C8C618D0F3
E614E614C61AD0D27BE220A61CA94085
12BDC0030A66100A66100A66100A6610
0A66100A66100A66100A6610A5109DC0
03BDC0100A66100A66100A66100A6610
0A66100A66100A66100A6610A5109DC0
10E8C612D0BBABFA680A0A0A0A0A08C2
3048DA5AA9C300851AA21E00E2206419
6418C220A51C186900208510A97F0085
12C220A510290F00D008A51038E91000
8510C610C610A00000E220A9028514A9
FF85158A2514F004A9008002A9FF5710
25158515F0110614C898C912F009C902
D0E1A0100080DCE220A515F016C220A5
1A1F1006F1851A8AC91800300EE220E6
198008C220A510C51CD096C2208AC918
00100CA51A3F1006F1D004E220E618C2
20CACA8AC90000F02CA51A3F1006F1D0
EE8AC918001019E220A518C519D011C2
20A51A1F1006F1851ACACA8A10F18005
C2204CD201A90000E220851385148515
8516C220A20000A51A3F1006F1F00BE8
E88AC91800D0F04C3F039BA21800A51A
3F1006F1D00BE8E88AC92000D0F04C3F
03E230A513D0118A4A0A0A0A0A851398
4A051385134C2703A514D0118A4A0A0A
0A0A8514984A051485144C2703A515D0
118A4A0A0A0A0A8515984A051585154C
2703A516D0298A4A0A0A0A0A8516984A
051685164C2703C230DABBBF1006F145
1A851AFABF1006F1451A851A4CA402E2
30A513D0034CE603C230A51C18690020
8510C230A510290F00D008A51038E910
008510C610C610E230A200A011A9FF85
18A9808519BF130000D0034CDB032519
F004A9008002A9FF57102518F0448518
98C900F00EC910D004A0018001884619
4C7503A000A9018519BF1300002519F0
08B710051897108008A9FF4518371097
10C898C912F00BC902D002A01006194C
A903E88AC904F0034C6B03C230A510C5
1CF0034C5203C2307AFA6828DAAABDAE
2EC90ED012BDC62EC901D00BADA01E29
08F004FA7B8005FABF2BCEC2C2200A0A
0A0A0AAA7BE220680A0A0A0A0AA85AA9
188510BF0063ED99AD81E8C8C610D0F3
08C23048DA5AA51048A51248A51448A5
1648A51848A51A48A51C48A51E488A38
E91800851A9838E91800851CA51329FF
004A4A4A4A0A18651AAAA513290F000A
18651CA8E220BF0063ED99AD81E8C8BF
0063ED99AD81C220A51429FF004A4A4A
4A0A18651AAAA514290F000A18651CA8
E220BF0063ED99AD81E8C8BF0063ED99
AD81C220A51529FF004A4A4A4A0A1865
1AAAA515290F000A18651CA8E220BF00
63ED99AD81E8C8BF0063ED99AD81C220
A51629FF004A4A4A4A0A18651AAAA516
290F000A18651CA8E220BF0063ED99AD
81E8C8BF0063ED99AD81C22068851E68
851C68851A6885186885166885146885
126885107AFA6828FAFEC4616B000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
01000200040008001000200040008000
01000200040008001000200040008000
00010002000400080010002000400080

As before, you also have to jump to the new code from the old function.

Code:
Paste to 013D43 (C1/3D43):

22 FD 00 F1 60

And the ASM.

(By the way: this ASM code includes both my new code, and the disassembly of the original function from C1. I added a significant number of my own comments to the original C1 code. So, if you are interested in understanding/editing this code yourself, those might be helpful to you.)

Code:
;====================================

ADDRESS_BEGIN_DATA .EQU $0600
ADDRESS_BEGIN_DATA_ABSOLUTE .EQU ADDRESS_BEGIN_DATA+$F10000
lbg_data_bitmask .EQU ADDRESS_BEGIN_DATA_ABSOLUTE+0
lbg_data_shifted_bit .EQU lbg_data_bitmask+16

;====================================

; The routine that loads character battle graphics and palettes
C1_3D43:    ;C9FF        CMP #$FF        (from C1_316F, C1_3B1A, C1_3B2A, C1_3B3A, C1_3B4A)
    CMP #$FF                ;Accumulator is expected to hold the character sprite ID.
                        ;if the character ID is FF, that means empty
C1_3D45:    ;D001        BNE $3D48
    BNE C1_3D48
C1_3D47:    ;60          RTS
    RTL                    ;rts replaced with rtl

C1_3D48:    ;8514        STA $14        (from only C1_3D45)

    STA $14
C1_3D4A:    ;861C        STX $1C
    STX $1C
C1_3D4C:    ;AA          TAX
    TAX
C1_3D4D:    ;A510        LDA $10
    LDA $10
C1_3D4F:    ;48          PHA
    PHA
C1_3D50:    ;48          PHA
    PHA
C1_3D51:    ;DA          PHX
    PHX
C1_3D52:    ;A61C        LDX $1C
    LDX $1C
C1_3D54:    ;861A        STX $1A
    STX $1A
C1_3D56:    ;A514        LDA $14
    LDA $14
C1_3D58:    ;0A          ASL A
    ASL A
C1_3D59:    ;18          CLC
    CLC
C1_3D5A:    ;6514        ADC $14
    ADC $14
C1_3D5C:    ;AA          TAX
    TAX                    ; X now holds 3 * the character ID passed in in the accumulator. This is to
                        ; offset a 3-byte long pointer.
C1_3D5D:    ;A9C2        LDA #$C2
    LDA #$C2                ; This will be the high byte of a 3-byte pointer (to data in C2)
C1_3D5F:    ;8516        STA $16
    STA $16
C1_3D61:    ;8B          PHB
    PHB
C1_3D62:    ;A97F        LDA #$7F
    LDA #$7F
C1_3D64:    ;48          PHA
    PHA
C1_3D65:    ;AB          PLB
    PLB                    ; We're now using $7f as our data bank
;    .DBREG $7F
C1_3D66:    ;BF45CEC2    LDA $C2CE45,X    (High byte of pointer to start of character battle graphics)
    LDA >$C2CE45,X
C1_3D6A:    ;8512        STA $12
    STA $12
C1_3D6C:    ;C220        REP #$20
    REP #$20                ;set the accumulator to 16 bit
    .LONGA ON
C1_3D6E:    ;BF43CEC2    LDA $C2CE43,X     (Pointer to start of character battle graphics)
    LDA >$C2CE43,X
C1_3D72:    ;8510        STA $10
    STA $10
C1_3D74:    ;A945C7      LDA #$C745
    LDA #$C745                ; Note that we already stored #$C2 in $16
C1_3D77:    ;8514        STA $14
    STA $14                    ; Combined with the #$C2 already in $16, $14-$16 now hold a
                        ; 24-bit address ($C2C745). This is a data section in C2. I believe it contains
                        ; the data on how to compose 8x8 tiles into sprite tiles.
C1_3D79:    ;A61A        LDX $1A
    LDX $1A                    ; This re-loads the original value which was stored in X when we entered the function.
                        ; This should be $0000, $2000, $4000, or $6000, depending on whether this is
                        ; battle-character 1, 2, 3 or 4.
C1_3D7B:    ;A90001      LDA #$0100
    LDA #$0100                ; $1A will be a counter for an outer loop, counting down from 256
C1_3D7E:    ;851A        STA $1A
    STA $1A
C1_3D80:    ;A91000      LDA #$0010
    LDA #$0010                ; This begins an inner loop...
C1_3D83:    ;8518        STA $18
    STA $18                    ; And the inner loop counter counts down from 16
C1_3D85:    ;A714        LDA [$14]
    LDA [$14]                ; Load from the pointer stored in DP addresses $14-16. Again, these point to data in
                        ; C2.
C1_3D87:    ;C9FFFF      CMP #$FFFF
    CMP #$FFFF                ; #$FFFF appears to represent a blank tile (or blank line?). Special code handles this
                        ; case.
C1_3D8A:    ;D00C        BNE $3D98
    BNE C1_3D98                ; Otherwise, skip ahead to the normal case.
C1_3D8C:    ;7B          TDC
    TDC                    ; (the =#$FFFF case)
                        ; TDC copies the direct page register into the accumulator. Assuming the direct
                        ; page is $0000 here, this zeroes A in one byte instead of the three LDA #$0000 takes.
C1_3D8D:    ;9D0000      STA $0000,X
    STA |$0000,X                ; Now we write #$0000 16 times (for a total of 32 bytes)
C1_3D90:    ;E8          INX
    INX
C1_3D91:    ;E8          INX
    INX
C1_3D92:    ;C618        DEC $18
    DEC $18
C1_3D94:    ;D0F7        BNE $3D8D
    BNE C1_3D8D                ; loop...
C1_3D96:    ;800E        BRA $3DA6
    BRA C1_3DA6                ; when loop ends, skip over the "normal" code and continue
C1_3D98:    ;A8          TAY
    TAY                    ; Here begins the normal != #$FFFF case
C1_3D99:    ;B710        LDA [$10],Y
    LDA [$10],Y                ; Load a word of graphics data through the long pointer in $10-$12 (the start of
                        ; this character's battle graphics), offset by Y. Y holds the tile offset we just
                        ; read from the C2 table via [$14], not the character sprite ID.
C1_3D9B:    ;9D0000      STA $0000,X
    STA |$0000,X                ; Store the loaded graphics in memory
C1_3D9E:    ;E8          INX
    INX
C1_3D9F:    ;E8          INX
    INX
C1_3DA0:    ;C8          INY
    INY
C1_3DA1:    ;C8          INY
    INY                    ; ... and increment the counters by 2 bytes each
C1_3DA2:    ;C618        DEC $18
    DEC $18                    ; loop 16 times
C1_3DA4:    ;D0F3        BNE $3D99
    BNE C1_3D99
C1_3DA6:    ;E614        INC $14
    INC $14
C1_3DA8:    ;E614        INC $14
    INC $14                    ; Increment this pointer into C2 (tile-related data)
C1_3DAA:    ;C61A        DEC $1A
    DEC $1A                    ; Decrement the larger, outer counter, which had the initial value of 256 (#$0100)
C1_3DAC:    ;D0D2        BNE $3D80
    BNE C1_3D80                ; This ends the outer loop. In total, with the inner and outer loop, we will load data
                        ; 256 * 16 times, * 2 bytes per loop, = 8192 bytes (#$2000).
C1_3DAE:    ;7B          TDC
    TDC                    ; This, again is meant to set accum to 0, I think
C1_3DAF:    ;E220        SEP #$20
    SEP #$20                ; Set the accumulator to 8-bit mode
    .LONGA OFF
C1_3DB1:    ;A61C        LDX $1C
    LDX $1C                    ; Restore X to its original value, viz. #$0000, #$2000, #$4000, or #$6000, depending
                        ; on which battle character this is.
C1_3DB3:    ;A940        LDA #$40
    LDA #$40                ; We're going to repeat this loop 64 times
C1_3DB5:    ;8512        STA $12
    STA $12                    ; Loop counter stored in $12
C1_3DB7:    ;BDC003      LDA $03C0,X
    LDA $03C0,X                ; Starting with byte $03C0 ...
C1_3DBA:    ;0A          ASL A
    ASL A                    ; ... we repeatedly ASL A ...
C1_3DBB:    ;6610        ROR $10
    ROR $10                    ; ... and ROR $10, with carry ...
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A                    ; ... repeated 8 times ...
    ROR $10                    ; ... this has the effect of reversing the order of the bits.
C1_3DD2:    ;A510        LDA $10
    LDA $10                    ; Then we take the reversed-bit result...
C1_3DD4:    ;9DC003      STA $03C0,X
    STA $03C0,X                ; ... and save it back to this memory location.
C1_3DD7:    ;BDC010      LDA $10C0,X
    LDA $10C0,X                ; Then we repeat the same thing for another memory location.
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10    
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10
    ASL A
    ROR $10    
C1_3DF2:    ;A510        LDA $10
    LDA $10
C1_3DF4:    ;9DC010      STA $10C0,X
    STA $10C0,X                ; ...and save it
C1_3DF7:    ;E8          INX
    INX
C1_3DF8:    ;C612        DEC $12
    DEC $12
C1_3DFA:    ;D0BB        BNE $3DB7
    BNE C1_3DB7                ; Repeat this loop 64 times, for 64 bytes in each of two locations.
C1_3DFC:    ;AB          PLB
    PLB
C1_3DFD:    ;FA          PLX
    PLX
C1_3DFE:    ;68          PLA
    PLA                    ; Restoring a bunch of stuff from the stack
C1_3DFF:    ;0A          ASL A
    ASL A                    ; Shifting bits by 5. I think this has to do with multiplying the character ID by
                        ; 32, in order to get an offset.
C1_3E00:    ;0A          ASL A
    ASL A
C1_3E01:    ;0A          ASL A
    ASL A
C1_3E02:    ;0A          ASL A
    ASL A
C1_3E03:    ;0A          ASL A
    ASL A


load_battle_graphics_new:            ; new code begins, in which we mess with the tile and palette values to make it
                        ; so we can use the last four palette colors for battle PCs.


    PHP                    ; push status register

    REP #$30                ; Make accum/index 16 bit
    .LONGA ON
    .LONGI ON

    PHA                    ; push A
    PHX                    ; and X
    PHY                    ; and Y


; We're going to loop through every byte of graphics data we just stored in memory, checking for each palette value whether that value
; is used. When we're done, we'll have information on which palette values are used anywhere in the sprite data.

; We do this by looping through palette values 0 thru 15, and looping through the four bitplanes associated with a single set
; of pixels. In the end, we'll know whether a palette value was used anywhere in that "row" of 8 pixels.


                        ; $1A-$1B will be 16 bits, each representing a palette entry, telling us which
                        ; palette numbers have been used in any pixel of this sprite data.

                        ; Palette entries 1, 2, 7 & 8 (counting from 1; bits 0, 1, 6 & 7 of the mask) should
                        ; always be considered used...
    LDA #%0000000011000011            ; (these are clear, black / outline, and two skin colors)
    STA $1A                    ; Since these are manipulated dynamically in battle, we shouldn't mess with them.

    LDX #30                    ; X holds the palette number we are checking for, times two.
                        ; Indexed starting with zero, so #15 (x2 = #30) is the last. We check the last first.

    SEP #$20                ; 8 bit accumulator
    .LONGA OFF

    STZ $19                    ; $19 will represent the number of used colors in the expanded palette which we
                        ; have found, and need to switch in to the main palette.

    STZ $18                    ; $18 will represent the number of definitely unused colors in the main palette,
                        ; which can safely be replaced with expanded colors.

lbg_check_memory_for_palette_loop:
    
    REP #$20                                ; 16 bit accumulator
        .LONGA ON

    LDA $1C                    ; Load into A the value originally stored in X, viz. $0000, $2000... &c
                        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000                ; Point to the end of the data we just loaded for this character

    STA $10                    ; Store into dp $10-$11, to use as index
    LDA #$7F                ; the third byte of the pointer
    STA $12

load_battle_graphics_outer_byte_loop:

    REP #$20                                ; 16 bit accumulator
        .LONGA ON

    LDA $10
    AND #$000F                ; Check if this is a multiple of 16
    BNE load_battle_graphics_outer_byte_loop_2    ; if not, don't do anything special
    LDA $10
    SEC
    SBC #16                    ; If so, subtract an additional 16 to skip the upper half of the previous tile
                        ; (bytes 16-31 hold bitplanes 2 and 3, which we reach through Y = 16 and 17)
    STA $10

load_battle_graphics_outer_byte_loop_2:
    DEC $10
    DEC $10
    LDY #0                    ; Y is the byte offset for the bitplane (0, 1, 16, or 17)

    SEP #$20                ; 8 bit accumulator
    .LONGA OFF

    LDA #%10
    STA $14                    ; $14 is a temporary bitmask, which gives us one bit, and shifts left with each
                        ; iteration of Y. Its purpose is to give us the Nth bit stored in X. It starts at
                        ; %10 rather than %01 because X holds the palette number times two.

    LDA #%11111111                ; $15 is a temporary value, composed of the logical AND of all bitplanes, telling us
    STA $15                    ; Whether each one of them has the "correct" value for the palette value we're
                        ; currently checking. In other words, it tells us if a given palette value is used
                        ; in the current set of pixels.

lbg_palette_check_loop:

    TXA
    AND $14                    ; Find out if the Nth bit of X is a zero (N=1..4, depending on bitplane)
    BEQ lbg_palette_check_not_set

lbg_palette_check_set:                ; If the bit is a 1...
    LDA #%00000000                ; Load all 0's into 8-bit A, so EOR will do nothing
    BRA lbg_palette_check_loop_2

lbg_palette_check_not_set:            ; If the bit is a 0...
    LDA #%11111111                ; Load all 1's into 8-bit A, so EOR will flip all bits

lbg_palette_check_loop_2:
    EOR [$10],Y                ; EOR with the gfx byte, either getting its value or the logical-NOT of its value,
                        ; depending on whether we "want" these bits to be on, or off.

    AND $15                    ; AND with $15. The resulting byte will have 1's for only those bits/pixels which
                        ; have the "right" value across all bitplanes so far tested.
    STA $15                    ; And store back.

    BEQ lbg_palette_check_bitplane_done    ; If all bits are already zero, we don't have to check the rest of the bitplanes.

    ASL $14                    ; shift the bitmask left, so we check the next bit next iteration
    INY                    ; move on to the next bitplane
    TYA
    CMP #18                    ; check if we're done with all four bitplanes
    BEQ lbg_palette_check_bitplane_done

    CMP #2                    ; If Y increments past 1, we want to jump straight to 16, to account for the weird
    BNE lbg_palette_check_loop        ; offsets of the four bitplanes
    LDY #16
    BRA lbg_palette_check_loop

lbg_palette_check_bitplane_done:

    SEP #$20
    .LONGA OFF

    LDA $15                    ; Load the data for this row of pixels
    BEQ lbg_palette_check_bitplane_done_2    ; If all are 0, the palette color was not found in any of these 8 pixels, so do not set

                        ; Otherwise, it was found, and we should set it as "used"

    REP #$20                ; 16 bit accumulator
    .LONGA ON
    
    LDA $1A                    ; Load the cumulative "palette used" data, 16 bits
    ORA >lbg_data_shifted_bit,X        ; Use the appropriate bitmask to set the bit
    STA $1A                    ; Store the new value, with the bit turned on

    TXA
    CMP #24                    ; Check if this is one of the expanded palette numbers
    BMI all_gfx_bytes_checked_done        ; If not, go ahead, we're finished here

    SEP #$20
    .LONGA OFF

    INC $19                    ; If so, increment the counter for used expanded colors before continuing

    BRA all_gfx_bytes_checked_done        ; Since we just found this palette #, no need to keep searching

lbg_palette_check_bitplane_done_2:

    REP #$20                ; 16 bit accumulator
    .LONGA ON

    LDA $10                    ; Load the data pointer
    CMP $1C                    ; Have we reached the start of this character's gfx data?
    BNE load_battle_graphics_outer_byte_loop ; If not, then loop

all_gfx_bytes_checked_done:
                        ; After we've checked all bytes for a particular palette value...

    REP #$20                ; 16 bit accumulator
    .LONGA ON

    TXA
    CMP #24                    ; Check if this was an expanded palette color
    BPL all_gfx_bytes_checked_done_2    ; If so, continue....

    LDA $1A                    ; If a main palette color, check if it was set
    AND >lbg_data_shifted_bit,X        ; If so, continue...
    BNE all_gfx_bytes_checked_done_2

    SEP #$20
    .LONGA OFF

    INC $18                    ; If an unused main palette color, increment the number of unused palette numbers

all_gfx_bytes_checked_done_2:
    REP #$20                ; 16 bit accumulator
    .LONGA ON

    DEX                    ; Decrement X twice, so we check the previous palette #
    DEX
    TXA
    CMP #0
    BEQ lbg_palette_check_completely_done    ; If X is 0 (which we don't need to check) we are done.

    LDA $1A                    ; Check if the new X represents a palette color that is already "used"
    AND >lbg_data_shifted_bit,X        ; (This will happen for colors we set to 1 at the beginning, because they are
                        ;  manipulated by the gfx engine and we don't want to mess with them)
    BNE all_gfx_bytes_checked_done_2    ; If so, decrement again

    TXA
    CMP #24                    ; Check if this was an expanded palette color
    BPL all_gfx_bytes_checked_continue_loop    ; If so, just continue....

    SEP #$20
    .LONGA OFF

    LDA $18                    ; If this is a normal palette number...
    CMP $19                    ; Check the used expanded against the unused normal...
    BNE all_gfx_bytes_checked_continue_loop ; If not equal, we can't stop yet.

lbg_palette_check_fill_with_ones:

    REP #$20                ; 16 bit accumulator
    .LONGA ON

    LDA $1A                    ; But if equal, we can stop.
    ORA >lbg_data_shifted_bit,X        ; Mark all remaining colors as "used", since we don't need them.
    STA $1A
    
    DEX
    DEX
    TXA
    BPL lbg_palette_check_fill_with_ones    ; Loop until X goes negative, so entry 0 is marked as well
    BRA lbg_palette_check_completely_done

all_gfx_bytes_checked_continue_loop:
    REP #$20                ; 16 bit accumulator
    .LONGA ON

    JMP lbg_check_memory_for_palette_loop

lbg_palette_check_completely_done:
                        ; Now we're completely done with all bytes of graphics data

; $13, $14, $15 and $16 (single-byte values) will store data on which bitmap / palette values to switch. The low 4-bits should contain
; the original value (found in the first 12 entries, but unused), and the high 4-bits should contain the switched value (found in the
; last 4 entries, but used)

lbg_calculate_replacement_colors:

    LDA #0

    SEP #$20                ;8-bit accumulator
    .LONGA OFF

    STA $13                    ; initialize the four replacements to zero
    STA $14
    STA $15
    STA $16

    REP #$20                ;16-bit accumulator
    .LONGA ON

lbg_find_unused_color_pre:

    LDX #0                    ; X will be used to count 1-12 (word data) for the first 12 palette entries

lbg_find_unused_color:

    LDA $1A
    AND >lbg_data_shifted_bit,X        ; Check against the appropriate bitmask, to see if the color is used
    BEQ lbg_find_expanded_color_pre        ; If zero (unused), look for a replacement in palette numbers 13-16
    
    INX
    INX                    ; If not, move to the next palette entry
    TXA
    CMP #24                    ; Check if we've reached the end of the first 12 (non-expanded colors)
    BNE lbg_find_unused_color        ; If not, continue to check
    JMP lbg_done_replacement_colors        ; If so, there are no more unused colors, and we're done.

lbg_find_expanded_color_pre:
    
    TXY                    ; Store the original (1-12) color in Y temporarily
    LDX #24                    ; now X will be used to index the "expanded" colors

lbg_find_expanded_color:

    LDA $1A
    AND >lbg_data_shifted_bit,X        ; Check against the appropriate bitmask, to see if the color is used
    BNE lbg_set_replacement            ; If non-zero (used), store the replacement

    INX
    INX                    ; else, move to the next palette entry
    TXA
    CMP #32                    ; Check if we've reached the last palette entry
    BNE lbg_find_expanded_color        ; If not, continue to loop through the expanded colors
    JMP lbg_done_replacement_colors        ; If so, there are no more used expanded colors, and we are done

lbg_set_replacement:

    SEP #$30                ;8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF

lbg_replace_in_13:

    LDA $13
    BNE lbg_replace_in_14            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $13

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $13                    ; Transfer the high bits we already stored
    STA $13                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_14:

    LDA $14
    BNE lbg_replace_in_15            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $14

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $14                    ; Transfer the high bits we already stored
    STA $14                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_15:

    LDA $15
    BNE lbg_replace_in_16            ; If it's != 0 (value already set), check the next one

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $15

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $15                    ; Transfer the high bits we already stored
    STA $15                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_in_16:

    LDA $16
    BNE lbg_done_replacement_colors        ; If it's != 0 (value already set), we can't do any more replacements

    TXA
    LSR                    ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL                    ; Shift left by 4, because we want to store this in the higher four bits
    STA $16

    TYA
    LSR                    ; Divide Y (the original palette index) by 2, to get the palette number
    ORA $16                    ; Transfer the high bits we already stored
    STA $16                    ; Store the total number
    JMP lbg_replace_done

lbg_replace_done:

    REP #$30                ;16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    PHX
    TYX

    LDA >lbg_data_shifted_bit,X        ; Get the bitmask for the original color
    EOR $1A                    ; Switch this value from unused to used, so we don't replace the same one
    STA $1A

    PLX

    LDA >lbg_data_shifted_bit,X        ; Get the bitmask for the replacement color
    EOR $1A                    ; Switch this value from used to unused, so we don't make the same replacement
    STA $1A

    JMP lbg_find_unused_color_pre        ; Return to the beginning of the loop

lbg_done_replacement_colors:


;======================
; This part of the code goes through each pixel of the sprite, checks if it is one that we need to change, and changes it
; if necessary.

lbg_pixel_changing_loop_pre:


    SEP #$30                ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF

    LDA $13                    ; First, check if we found ANY colors we need to switch
    BNE lbg_pixel_changing_loop_pre_2    ; If we did, continue
    JMP lbg_bitmap_loop_done        ; Otherwise, skip this whole thing

lbg_pixel_changing_loop_pre_2:

    REP #$30                ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $1C                    ; Load into A the value originally stored in X, viz. $0000, $2000... &c
                        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000                ; Point to the end of the data we loaded for this character
    STA $10                    ; Store into dp $10-$11, to use as index

lbg_bitmap_outer_byte_loop:

    REP #$30                ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $10
    AND #$000F                ; Check if this is a multiple of 16
    BNE lbg_bitmap_outer_byte_loop_2    ; if not, don't do anything special
    LDA $10
    SEC
    SBC #16                    ; If so, subtract an additional 16 to skip the upper half of the previous tile
                        ; (bytes 16-31 hold bitplanes 2 and 3, which we reach through Y = 16 and 17)
    STA $10

lbg_bitmap_outer_byte_loop_2:
    DEC $10
    DEC $10

    SEP #$30                ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF
    
    LDX #0                    ; X holds an offset for the "color to switch" information. This information is stored
                        ; in direct page memory $13-$16, and X indexes into these values.

lbg_bitmap_bitplane_loop_start:
    LDY #17                    ; Y holds the bitplane (0, 1, 16, or 17). We start at 17 and decrement, for efficiency
                        ; reasons.

    LDA #%11111111                ; $18 holds a cumulative logical-AND of all the bitplanes we've checked so far,
    STA $18                    ; telling us which bits match the palette number we're looking for

    LDA #%10000000                ; $19 holds a bitmask telling us which bit to check for equality in the "color to
    STA $19                    ; switch" data. Starts at #%10000000 because the "extended" palette number is in the
                        ; high bits. Shifts right when Y decreases.

lbg_bitmap_check_replacement:

    LDA >$13,X                ; Load dp byte 13+X, which tells us colors to switch
    BNE lbg_bitmap_check_replacement_cont    ; Only continue if there is a value set here
    JMP lbg_bitmap_outer_byte_done        ; If it's all zero, there are no more replacements to check

lbg_bitmap_check_replacement_cont:
    AND $19                    ; Use the bitmask to check the value of the Nth bit, which corresponds to the Yth
                        ; bitplane
    BEQ lbg_bitmap_check_replacement_not_set    ; If it's zero, we are looking for bits NOT set

lbg_bitmap_check_replacement_set:
    LDA #%00000000                ; Load all 0's, so an EOR will do nothing
    BRA lbg_bitmap_check_replacement_2

lbg_bitmap_check_replacement_not_set:
    LDA #%11111111                ; Load all 1's, so an EOR will flip all bits

lbg_bitmap_check_replacement_2:

    EOR [$10],Y                ; EOR with the gfx byte, current bitplane. We either get the byte, or the negation
                        ; of the byte. This tells us whether each bit's value is "right".
    AND $18                    ; AND with the cumulative check from previous bitplanes.
    BEQ lbg_bitmap_bitplane_loop_done    ; If all bits are 0, there are no matches in this row of gfx data.
    STA $18                    ; Store this value back

    TYA
    CMP #0
    BEQ lbg_bitmap_match_found        ; If we've reached the lowest bitplane, and there are still 1's in $18, we have
                        ; matching pixels.

lbg_bitmap_check_replacement_3:            ; If we haven't reached the lowest bitplane, continue...
    CMP #16
    BNE lbg_bitmap_check_replacement_4
    LDY #1                    ; If y is currently 16, we want to skip down to 1, due to the format of gfx data
    BRA lbg_bitmap_check_replacement_5

lbg_bitmap_check_replacement_4:
    DEY                    ; Otherwise, just decrement

lbg_bitmap_check_replacement_5:
    LSR $19                    ; Finally, shift the bitmask in $19 right
    JMP lbg_bitmap_check_replacement    ; And reiterate loop

lbg_bitmap_match_found:                ; If some of the bits in $18 are still on after checking all bitplanes, we have
                        ; a match. Now switch the matching bits in all bitplanes.

    LDY #0                    ; Y holds the bitplane offset
    LDA #1
    STA $19                    ; $19 holds a bitmask

lbg_bitmap_match_found_loop:
    LDA >$13,X                ; load the "color to replace" data
    AND $19                    ; check the appropriate bit with the bitmask
    BEQ lbg_bitmap_match_found_loop_not_set

lbg_bitmap_match_found_loop_set:        ; If the bit is supposed to be 1
    LDA [$10],Y                ; Load the data for this bitplane
    ORA $18                    ; Turn all the "matched" bits on
    STA [$10],Y                ; Store it back
    BRA lbg_bitmap_match_found_loop_2    ; And continue

lbg_bitmap_match_found_loop_not_set:        ; If the bit is supposed to be 0
    LDA #%11111111
    EOR $18                    ; Load the logical negation of the "matched" bits
    AND [$10],Y                ; AND this with the data, setting all the matched bits to 0
    STA [$10],Y                ; store it back
                        ; and continue
lbg_bitmap_match_found_loop_2:
    INY
    TYA
    CMP #18
    BEQ lbg_bitmap_bitplane_loop_done    ; If we've surpassed 17, we're done

    CMP #2
    BNE lbg_bitmap_match_found_loop_3    ; If we've just surpassed offset 1, skip to offset 16
    LDY #16

lbg_bitmap_match_found_loop_3:
    ASL $19                    ; Shift the bitmask
    JMP lbg_bitmap_match_found_loop        ; continue to loop

lbg_bitmap_bitplane_loop_done:
    INX                    ; Increment the offset of the "color to switch" data
    TXA
    CMP #4                    ; Check if we've done all four bytes of data
    BEQ lbg_bitmap_outer_byte_done        ; If so, we're done with this loop
    JMP lbg_bitmap_bitplane_loop_start    ; Otherwise, continue

lbg_bitmap_outer_byte_done:

    REP #$30                ; 16 bit accumulator/index
    .LONGA ON
    .LONGI ON

    LDA $10                    ; Load the data pointer
    CMP $1C                    ; Have we reached the start of this character's gfx data?
    BEQ lbg_bitmap_loop_done
    JMP lbg_bitmap_outer_byte_loop         ; If not, then loop

lbg_bitmap_loop_done:

    REP #$30                ; 16 bit accumulator/index
    .LONGA ON
    .LONGI ON

    PLY
    PLX
    PLA                    ; Pull everything we pushed onto the stack
    PLP

;======================
; Finally, first section of new code is done, and we pick up where we left off

    .LONGA OFF


C1_3E04:    ;DA          PHX
    PHX
C1_3E05:    ;AA          TAX
    TAX
C1_3E06:    ;BDAE2E      LDA $2EAE,X
    LDA $2EAE,X                ; This checks the character ID...
C1_3E09:    ;C90E        CMP #$0E
    CMP #$0E                ; I believe this section has something to do with handling the palette in the
                        ; special case where the character has imp status, and possibly other special status.
C1_3E0B:    ;D012        BNE $3E1F
    BNE C1_3E1F                ; In any case, we skip it in the general case...
C1_3E0D:    ;BDC62E      LDA $2EC6,X
    LDA $2EC6,X
C1_3E10:    ;C901        CMP #$01
    CMP #$01
C1_3E12:    ;D00B        BNE $3E1F
    BNE C1_3E1F
C1_3E14:    ;ADA01E      LDA $1EA0
    LDA $1EA0
C1_3E17:    ;2908        AND #$08
    AND #$08
C1_3E19:    ;F004        BEQ $3E1F
    BEQ C1_3E1F
C1_3E1B:    ;FA          PLX
    PLX
C1_3E1C:    ;7B          TDC
    TDC
C1_3E1D:    ;8005        BRA $3E24
    BRA C1_3E24
C1_3E1F:    ;FA          PLX
    PLX
C1_3E20:    ;BF2BCEC2    LDA $C2CE2B,X
    LDA >$C2CE2B,X                ; $C2CE2B: Battle Character Palette Assignments (1 byte each)
C1_3E24:    ;C220        REP #$20
    REP #$20                ; Accumulator back to 16 bit
    .LONGA ON
C1_3E26:    ;0A          ASL A
    ASL A                    ; Again ASL 5 times. A is the character battle palette assignment. This means we will
C1_3E27:    ;0A          ASL A        ; multiply that number by 32 when using it as an offset.
    ASL A
C1_3E28:    ;0A          ASL A
    ASL A
C1_3E29:    ;0A          ASL A
    ASL A
C1_3E2A:    ;0A          ASL A
    ASL A
C1_3E2B:    ;AA          TAX         ; store the offset in  X
    TAX
C1_3E2C:    ;7B          TDC
    TDC
C1_3E2D:    ;E220        SEP #$20
    SEP #$20                ; Long A off again
    .LONGA OFF
C1_3E2F:    ;68          PLA
    PLA
C1_3E30:    ;0A          ASL A
    ASL A
C1_3E31:    ;0A          ASL A
    ASL A
C1_3E32:    ;0A          ASL A
    ASL A
C1_3E33:    ;0A          ASL A
    ASL A
C1_3E34:    ;0A          ASL A
    ASL A
C1_3E35:    ;A8          TAY
    TAY
C1_3E36:    ;5A          PHY
    PHY
C1_3E37:    ;A918        LDA #$18
    LDA #$18                ; Load the number #$18 -- 24 in decimal. That is 24 bytes: 12 palette colors at 2 bytes each.
C1_3E39:    ;8510        STA $10
    STA $10                    ; $10 is being used as the loop index
C1_3E3B:    ;BF0063ED    LDA $ED6300,X
    LDA >$ED6300,X
C1_3E3F:    ;99AD81      STA $81AD,Y
    STA $81AD,Y
C1_3E42:    ;E8          INX
    INX
C1_3E43:    ;C8          INY
    INY
C1_3E44:    ;C610        DEC $10
    DEC $10
C1_3E46:    ;D0F3        BNE $3E3B
    BNE C1_3E3B

; new code for switching the palette assignments

lbg_new_switch_palette:

    PHP                    ; push status register

    REP #$30
    .LONGA ON
    .LONGI ON

    PHA                    ; push A
    PHX                    ; and X
    PHY                    ; and Y

; X and Y have both been incremented #$18 (24) times, and we want to restore them to their original state
    TXA
    SEC
    SBC #$18
    STA $1A

    TYA
    SEC
    SBC #$18
    STA $1C

    LDA $13                    ; load the first switcheroo, located in $13
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A
    TAX

    LDA $13
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $14
    LDA $14                    ; now the one located in $14
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $14
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $15
    LDA $15                    ; now the one located in $15
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $15
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

; Repeat for $16
    LDA $16                    ; now the one located in $16
    AND #%0000000011111111            ; we want to use only the low byte
    LSR
    LSR
    LSR
    LSR                    ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
    ASL                    ; shift left, since it's a word per color
    CLC
    ADC $1A                    ; Add to this the "starting" value of X, stored earlier
    TAX

    LDA $16
    AND #%00001111                ; use only the low bits, which have the "original palette" information
    ASL                    ; again, multiply by 2
    CLC
    ADC $1C                    ; Add the "starting" value of Y
    TAY

    SEP #$20                ; Long A off
    .LONGA OFF
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    INX
    INY
    LDA >$ED6300,X                ; load the expanded color...
    STA $81AD,Y                ; ...and replace the color at the correct location
    REP #$20                ; Long A on again
    .LONGA ON

    PLY
    PLX
    PLA
    PLP

; done with palette switching code

C1_3E48:    ;FA          PLX
    PLX
C1_3E49:    ;FEC461      INC $61C4,X
    INC $61C4,X
C1_3E4C:    ;60          RTS
    RTL                    ; changed to rtl from rts

    .ORG ADDRESS_BEGIN_DATA
;lbg_data_bitmask:
    .WORD %0000000000000001, %0000000000000010, %0000000000000100, %0000000000001000, %0000000000010000, %0000000000100000, %0000000001000000, %0000000010000000
;lbg_data_shifted_bit:
    .WORD %1, %10, %100, %1000, %10000, %100000, %1000000, %10000000, %100000000, %1000000000, %10000000000, %100000000000, %1000000000000, %10000000000000, %100000000000000, %1000000000000000
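
For anyone who finds the assembly hard to follow, here is a rough Python model of what one "switcheroo" byte appears to do in the listing above: the high nibble picks a color in the 16-color ROM palette, the low nibble picks the slot in the 12-color RAM copy, and one 2-byte color is copied across. The function name and the sample palette are made up for illustration, and the sketch assumes the low nibble is below 12.

```python
def apply_switch(rom_palette: bytes, ram_palette: bytearray, switch: int) -> None:
    """Copy one 2-byte color from rom_palette into ram_palette.

    switch packs two 4-bit color indices: high nibble = source ("expanded")
    index in the ROM palette, low nibble = destination index in the RAM copy.
    """
    src = ((switch >> 4) & 0x0F) * 2   # LSR x4, then ASL: a word per color
    dst = (switch & 0x0F) * 2          # AND #%00001111, then ASL
    ram_palette[dst:dst + 2] = rom_palette[src:src + 2]


# Hypothetical 16-color palette: color n is stored as the bytes (n, 0x80 + n).
rom = bytes(b for n in range(16) for b in (n, 0x80 + n))
ram = bytearray(rom[:24])              # the original loop copies 24 bytes = 12 colors

apply_switch(rom, ram, 0xC3)           # put ROM color 12 into RAM slot 3
print(ram[6:8].hex())                  # prints "0c8c"
```

The real routine does this four times, once for each of the bytes at $13 through $16.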

#6
Still isn't running for me. Anyone else having any luck? It just hangs forever, unless it's supposed to take longer than 2 minutes.

Just to verify: I am pasting to F1/00FD (F1/02FD on my headered ROM) and ending at F1/072C (F1/092C). The last three bytes being 40 00 80. My Jump is correct.
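
As a sanity check on those offsets: a headered ROM just has a 512-byte ($200) copier header in front, so every file offset moves up by $200. A minimal sketch, assuming the usual HiROM layout where banks C0-FF map straight onto the file (the function name is made up):

```python
HEADER_SIZE = 0x200  # size of the optional copier (SMC) header


def hirom_to_file_offset(address: int, headered: bool = False) -> int:
    """Map a HiROM address in banks C0-FF to an offset in the ROM file."""
    offset = address & 0x3FFFFF
    return offset + HEADER_SIZE if headered else offset


start, end = 0xF100FD, 0xF1072C
print(hex(hirom_to_file_offset(start)))                 # 0x3100fd
print(hex(hirom_to_file_offset(start, headered=True)))  # 0x3102fd
print(hex(end - start + 1))                             # 0x630 bytes pasted
```

If the pasted block is not $630 bytes long, the start or end offset is off somewhere.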

Making a patch might be an easier way to go; it only takes a few seconds with Lunar IPS.

#7
Your ending offset seems to be wrong for some reason. But never mind: I've made a patch with Lunar IPS, as you suggested. Sorry for the difficulty.

http://bitshare.com/files/v1di8y87/expan...es.7z.html

(I'm not sure what the problem was. I guess it could be a number of things. I was using a version of the rom in which I'd inserted other code and made other changes. I didn't think it would interfere, but maybe it did? Or maybe I pasted the wrong hex values somehow? In any case, never mind.)

OK. There are two patches: one for a headered ROM and one for an unheadered ROM. Both are meant for a clean 1.0 ROM. You don't need to expand the ROM or do anything else before applying the patch; the patch does that for you.

In addition to the new function for loading / editing battle graphic data, the patches also edit sprite data, so you can see the effect without editing any sprites yourself. The following characters are now using "expanded" colors of unaltered palettes: Terra, Locke, Edgar, Sabin, & Imperial Soldier (Biggs/Wedge).

If you want to make your own changes to sprites (including re-importing the original sprites), it should be no problem to do with ff3usME, or anything else.

I've tested both patches with snes9x debug version, with its auto-IPS feature, and though I only tested briefly, both seem to work fine.

The new code is now pasted to F10000, if you're looking for it.

So, I just noticed a "bug" that isn't actually a bug in my code.

Edgar and Sabin appear in different colors in and out of battle, with the altered sprites I gave them. However, this is not because my code is loading the palettes wrong. It's because some of the in-battle palettes are actually different from the corresponding out-of-battle palettes in their last four colors. I guess the developers didn't bother to keep them in sync, since those colors are normally unusable.

Anyway, if you want to fix this discrepancy, alter the in-battle palette to match the out-of-battle one. You can do this in ff3usME, or whatever.
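
If you want to check which entries differ before editing, here is a minimal sketch. It assumes you have already dumped both 32-byte palettes (16 colors at 2 bytes each) to byte strings; the function name and the sample data are made up.

```python
def differing_colors(field_palette: bytes, battle_palette: bytes) -> list:
    """Return the indices (0-15) of colors that differ between two 16-color palettes."""
    assert len(field_palette) == len(battle_palette) == 32
    return [
        i for i in range(16)
        if field_palette[2 * i:2 * i + 2] != battle_palette[2 * i:2 * i + 2]
    ]


# Hypothetical dumps: identical except for colors 13 and 15.
field = bytes(range(32))
battle = bytearray(field)
battle[26:28] = b"\x00\x00"
battle[30:32] = b"\xff\x7f"

print(differing_colors(field, bytes(battle)))  # [13, 15]
```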

#8
Ahh. I see what happened... you have C9FFD0 in the first line, and my brain saw that and immediately passed it off as an address. It was practically invisible to me. If I had looked at it for even 2 seconds I would have realized that wasn't the address I was using, and might have considered that it could be data. That's how I know I've been doing too much hacking recently. Sigh...

Locke is pretty in pink.

The delay is a hair too long to be comfortable. I'm running through ideas for how this could be written (without having looked at your code yet).

Eggers Wrote: Actually, we could stop after ANY 12 colors are used at least once, and just assume the remaining ones are unused. This would result in unpredictable behavior in cases where the sprite actually uses more than 12 colors, but it should work if the sprite artist follows the "contract".

This was my original thought. I would imagine that if the spriter didn't follow the rules, it would act much like the game currently does when it loads a sprite with out-of-bounds colors: they end up blanked, or showing whatever colors have been loaded into those slots.

That being said, the people applying these patches are expected to follow the rules, just like they are already expected to follow the rules of only using the first 12. I don't see how you are gaining anything by running through the whole sprite sheet. What does your code do in the event it DOES find more than 12 colors? In most cases you are going to find all 12 colors in the first 6-12 8x8 tiles, which would cut down on your loading time by almost half right there.
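
To make the early-exit idea concrete, here is a rough sketch that scans a sheet tile by tile and stops once 12 distinct color indices have turned up. It assumes the sheet is stored as standard SNES 4bpp 8x8 tiles (32 bytes each, bitplanes 0/1 interleaved in the first 16 bytes and 2/3 in the last 16). The function names and the test sheet are made up, and this is not how the patch itself is written.

```python
def tile_pixels(tile: bytes):
    """Yield the 64 color indices of one 32-byte SNES 4bpp 8x8 tile."""
    for row in range(8):
        p0, p1 = tile[2 * row], tile[2 * row + 1]
        p2, p3 = tile[16 + 2 * row], tile[17 + 2 * row]
        for bit in range(7, -1, -1):   # bit 7 is the leftmost pixel
            yield (((p0 >> bit) & 1)
                   | (((p1 >> bit) & 1) << 1)
                   | (((p2 >> bit) & 1) << 2)
                   | (((p3 >> bit) & 1) << 3))


def used_colors(sheet: bytes, limit: int = 12):
    """Scan tiles in order; stop once `limit` distinct color indices are seen.

    Returns the set of indices found and the number of tiles that were read.
    """
    seen = set()
    for start in range(0, len(sheet), 32):
        seen.update(tile_pixels(sheet[start:start + 32]))
        if len(seen) >= limit:
            return seen, start // 32 + 1
    return seen, len(sheet) // 32


def solid_tile(color: int) -> bytes:
    """Build a test tile filled with one color index (illustration only)."""
    planes = [0xFF if (color >> p) & 1 else 0x00 for p in range(4)]
    return bytes(planes[:2] * 8 + planes[2:] * 8)


# Hypothetical 32-tile sheet: one solid tile per color, repeated twice.
sheet = b"".join(solid_tile(c) for c in range(16)) * 2
colors, tiles_read = used_colors(sheet)
print(sorted(colors), tiles_read)      # colors 0..11 after reading 12 tiles
```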

Correct me if I'm wrong, but the real kicker is essentially rewriting the sprite sheet to use the colors properly shifted. I have two, maybe three, ideas for this, depending on feasibility.

1. Could the palettes possibly be left in place and the unused colors get changed (exactly where they are) to the miscellaneous sprite colors normally at the end of each palette? So instead of rewriting the spritesheets, rewrite the pointer finger, damage/healing numerals, etc... to look for the new color placements instead? This could be tricky with things that change color (like damage and healing) since the colors are on different palettes, which brings me to number 2.

2. Maybe instead of rewriting the spritesheets, make a mask for the colors as they are interpreted, so that where it would normally call color 01, it now calls 04 instead, and apply the mask every time the sprite is refreshed so you don't have to change the whole sheet at once, only the sprites in use. This might not work considering how the bits are placed in the RAM; I don't know how this actually gets applied in the end.

3. Maybe a combination of one and two (assuming both are doable to an extent) to minimize how much is being changed at any given point and to overcome the multiple color problem involved with numerals.

And I will throw in my idea from way back...
4. Load all 16 colors and move the miscellaneous sprite palettes elsewhere entirely. I know hardly anything about the usage of the VRAM so there's likely not 32 bytes floating around that are still free, but in the off chance there is, we could get all the colors available... We could potentially make VRAM space by requiring 2 possible monsters to share 1 palette somehow. It could check the palette ID and if they were all different it would force the last two mobs to use the same one or something... I don't know, I'm just brainstorming. I'm way out of my element here.
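
For idea 2, the "mask" could be a 16-entry lookup table applied to one tile at a time. The sketch below only shows the data transformation on a single SNES 4bpp tile (32 bytes, planes 0/1 in the first 16 bytes, planes 2/3 in the last 16). Where and when the game would apply it is exactly the open question above. The function name and the table are made up.

```python
def remap_tile(tile: bytes, table) -> bytes:
    """Return a copy of a 32-byte 4bpp tile with each color index c replaced by table[c]."""
    out = bytearray(32)
    for row in range(8):
        # Byte offsets of bitplanes 0..3 for this row of the tile.
        offsets = (2 * row, 2 * row + 1, 16 + 2 * row, 17 + 2 * row)
        for bit in range(8):
            color = sum(((tile[o] >> bit) & 1) << p for p, o in enumerate(offsets))
            new = table[color]
            for p, o in enumerate(offsets):
                out[o] |= ((new >> p) & 1) << bit
    return bytes(out)


# Hypothetical table: leave everything alone except color 13, which becomes 3.
table = list(range(16))
table[13] = 3

solid_13 = bytes([0xFF, 0x00] * 8 + [0xFF, 0xFF] * 8)   # every pixel is color 13
solid_3 = bytes([0xFF, 0xFF] * 8 + [0x00, 0x00] * 8)    # every pixel is color 3
print(remap_tile(solid_13, table) == solid_3)           # True
```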

Anyway. I finally have it working and will try to set aside some time tomorrow to actually look at the disassembly. Thanks again for all your hard work.
[-] The following 1 user says Thank You to B-Run for this post:
  • Eggers (08-12-2013)

#9
Quote: That being said, the people applying these patches are expected to follow the rules, just like they are already expected to follow the rules of only using the first 12. I don't see how you are gaining anything by running through the whole sprite sheet. What does your code do in the event it DOES find more than 12 colors? In most cases you are going to find all 12 colors in the first 6-12 8x8 tiles, which would cut down on your loading time by almost half right there.

The latest code has already done most of the efficiency things I talked about, and more. (You should have seen the load time before. It was atrocious.)

I think I've gotten close to optimal on my newest revision (posted in a reply). It's heavily commented, so you can read it. But basically, I transposed the original loop. Meaning: there used to be an outer loop through bytes of graphics data, and an inner loop through colors. Now the outer loop is through colors, and it stops as soon as it finds an instance of a color, and looks for the next color from the start of graphics data.

It checks the expanded colors first (the last four, "unusable" ones). I judged this to be probably more efficient on average. Many original sprites do use fewer than 12 colors, so even in the case of an unaltered sprite, you sometimes end up looping through all the data once or more. For sprites that do use the expanded colors, it's usually more efficient to check those first.

It also keeps track of how many expanded colors are found (which need to be switched in), and it only looks for that number of normal colors to replace. It stops after it's found enough unused colors that can be replaced.
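
In high-level terms, the search described above looks roughly like the sketch below. It works on an already-decoded list of pixel color indices rather than raw tile bytes, and the order in which the normal colors are tried is an assumption; all names are made up for illustration. The last line packs each pair the way the posted listing reads it (expanded index in the high nibble, normal index in the low nibble).

```python
EXPANDED = (12, 13, 14, 15)   # the four reserved palette entries
NORMAL = range(12)            # entries a PC sprite may normally use


def is_used(pixels, color):
    """Scan from the start and stop at the first pixel with this color."""
    return any(p == color for p in pixels)


def plan_switches(pixels):
    """Pair each used expanded color with an unused normal color."""
    needed = [c for c in EXPANDED if is_used(pixels, c)]   # expanded colors first
    free = []
    for c in NORMAL:
        if len(free) == len(needed):
            break              # found enough replacement slots, stop looking
        if not is_used(pixels, c):
            free.append(c)
    # If the sprite breaks the 12-color "contract", zip drops the extras.
    return list(zip(needed, free))


pixels = [0, 1, 2, 13, 13, 5, 15]          # hypothetical decoded sprite data
pairs = plan_switches(pixels)
print(pairs)                               # [(13, 3), (15, 4)]
print([hex((e << 4) | n) for e, n in pairs])   # ['0xd3', '0xf4']
```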

I don't think it's doing too much superfluous reading, but it still takes time, because there are so many bytes.

Quote: Correct me if I'm wrong, but the real kicker is essentially rewriting the sprite sheet to use the colors properly shifted. I have two, maybe three, ideas for this, depending on feasibility.

Writing the sprite sheet is probably less than half the total execution time. Though I haven't rigorously clocked it, I expect reading takes longer.

If you want an intuitive benchmark of this, load a save game where you have all characters in your party. Then use a party of only "expanded palette" characters (characters that use the last 4 colors), and compare with a party of only characters with unchanged sprites. The expanded palette characters will take longer to load, but the unaltered sprites (using only the first 12 palette entries) will take pretty long too. That's because it still has to read enough sprite data to determine that those sprites don't use the expanded palette. It doesn't need to write anything, and it skips that part of the code entirely. But, it still takes time.

Writing RAM is pretty fast, and the write process only has to loop through the graphics data AT MOST four times. At maximum, it has to replace four unused "normal" colors with four "expanded" colors from palette entries 13-16. Often fewer than that.

The reading code, on the other hand, always has to loop through the whole $2000 bytes AT LEAST four times, to determine that the last four "expanded" palette entries are unused (if they are). That is the best case. If any of those entries are used, it then has to find a corresponding number of unused entries from the earlier palette numbers, which means looping through the whole thing again, at least once for each replacement color, plus the partial traversals to find that used colors are used.

Quote: 1. Could the palettes possibly be left in place and the unused colors get changed (exactly where they are) to the miscellaneous sprite colors normally at the end of each palette? So instead of rewriting the spritesheets, rewrite the pointer finger, damage/healing numerals, etc... to look for the new color placements instead? This could be tricky with things that change color (like damage and healing) since the colors are on different palettes, which brings me to number 2.

This is such a good idea. I wish I'd thought of it. It might be a little finicky to get working with the different palettes and so on, but I don't think it should be impossible.

I know that most of these graphics (the numerals and so on) are compressed. I remember reading somewhere where they are, but I can't seem to find the information. Anyone have it?

But anyway, I can't think of any fundamental reason this idea won't work. And it should cut down on the writing time drastically. The only downside I can think of is that I'll need to find a few bytes of unused RAM to store which palette indices are unused. The number of bytes that need replacing, though, should be drastically smaller.

The only reason this isn't a total solution is that, like I said, reading probably takes longer than writing.

I earlier proposed that we could cache the data on palette entries that are used/unused. If I implement that cache, and also cut down on writing time as per your suggestion, it should make the algorithm almost instantaneous, except for the first time a sprite is loaded.
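
The caching idea is plain memoization: do the expensive scan once per sprite and reuse the result on later loads. A minimal sketch, with made-up names and a Python dict standing in for whatever free RAM the cache would actually occupy in the game:

```python
plan_cache = {}


def cached_plan(sprite_id, pixels, compute_plan):
    """Return the stored plan for sprite_id, computing it only on first use."""
    if sprite_id not in plan_cache:
        plan_cache[sprite_id] = compute_plan(pixels)
    return plan_cache[sprite_id]


calls = []


def slow_scan(pixels):
    """Stand-in for the expensive read pass; records each call."""
    calls.append(1)
    return sorted(set(pixels))


first = cached_plan("terra", [0, 1, 13, 13], slow_scan)
second = cached_plan("terra", [0, 1, 13, 13], slow_scan)
print(first, len(calls))   # [0, 1, 13] 1
```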

The other idea would be to store the altered graphics in the ROM, like I suggested before. This would of course fix it entirely, at the cost of having to alter the ROM.

#10
I never thought I'd see the day that this kind of patch would be made.

I personally want to thank you for this.

