asm hacking: "faking" short jumps from your code (in a new bank) to old code


07-27-2013, 05:45 PM

Eggers


Here's a technique for assembly hacking. Most of this post is explanation of what I'm doing; the actual code is very short. You can just scroll down and copy it if you don't want to understand it, but I wouldn't recommend that (problems might crop up).

So: I've been lurking these forums for the last while. After reading a bunch of documents, I've finally started toying around with actual hacking. I have fairly ambitious plans, but I don't want to make any promises or go into detail until I have some degree of success.

Anyway: I've noticed people complaining about the difficulty of inserting new code or making significant changes to code in existing banks. In particular, C1 (where a lot of complex graphic routines seem to reside) seems to have only a few bytes of free space, which means you can't really add any new subroutines, nor make existing ones significantly longer. You can't just use another bank, because interfacing between banks is difficult unless the code is specifically designed for it. The destination routine has to "know" the call comes from another bank and end with RTL instead of a regular RTS: a long call (JSL) pushes a 3-byte return address, and RTS only pulls 2 of those bytes and never restores the bank.

In the existing code, there are subroutines that have their own "long access" helper subroutines. These tiny subroutines just accept long jumps, then make a short jump to the subroutine they provide access to, and then RTL. This means the code can be accessed from another bank, and also with a regular JSR from the same bank.

Now, you could just write your own new "long access" subroutines and insert them into unused bytes in the destination bank. But, for every subroutine you want to jump to from a separate bank, you have to write it its own unique access subroutine. C1 in particular has so few free bytes, and so many separate tiny subroutines, this is a problem. (It is also a problem in other banks.)
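To put a number on that cost, here is a minimal sketch. It assumes each access helper is nothing more than a `JSR target` followed by an `RTL` (opcodes $20 and $6B), which is my reading of the description above. The function name and the $1234 target are made up for illustration.

```python
def long_access_stub(target_addr16):
    """Bytes of a hypothetical 'JSR target / RTL' access helper."""
    lo = target_addr16 & 0xFF
    hi = (target_addr16 >> 8) & 0xFF
    return bytes([0x20, lo, hi, 0x6B])  # JSR abs (little-endian operand), then RTL

stub = long_access_stub(0x1234)   # $1234 is a placeholder, not a real routine
print(stub.hex(" "))              # 20 34 12 6b
print(len(stub) * 10)             # 40 free bytes to expose 10 routines this way
```

At 4 bytes per exposed routine, a bank with only a few free bytes runs out almost immediately.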

Here's my solution. This allows us to write our own code in a brand new bank in an expanded rom, and have it call any subroutine in C1 or any existing bank. Only two bytes of new code have to be inserted into the destination bank.

The two bytes I added to C1:

Code:
01FFE5: 60 6B

This is right after the last byte of existing code. It is an RTS ($60) followed by an RTL ($6B). Most of the work is done in the new bank, where we presumably have a lot of space. Our new code will jump to these two bytes in the destination bank, after pushing the right values onto the stack.
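If you patch the ROM file with a script instead of a hex editor, the address conversion can be sketched like this. It assumes a HiROM image without a copier header, where banks $C0-$FF map straight onto the file (so $C1FFE5 is file offset 01FFE5, as in the code box above). Both function names are hypothetical, and the zero-filled `bytearray` stands in for a real ROM.

```python
def hirom_to_file_offset(cpu_addr, header_size=0):
    """Map a $C0-$FF bank CPU address to an offset in the ROM file."""
    bank = cpu_addr >> 16
    if not 0xC0 <= bank <= 0xFF:
        raise ValueError("this sketch only handles banks $C0-$FF")
    return (cpu_addr & 0x3FFFFF) + header_size

def patch_stub(rom, cpu_addr, header_size=0):
    """Write RTS ($60) then RTL ($6B) at cpu_addr in a bytearray ROM image."""
    off = hirom_to_file_offset(cpu_addr, header_size)
    rom[off:off + 2] = bytes([0x60, 0x6B])
    return off

rom = bytearray(0x20000)            # stand-in for a real ROM image
off = patch_stub(rom, 0xC1FFE5)
print(f"{off:06X}: {rom[off]:02X} {rom[off + 1]:02X}")  # 01FFE5: 60 6B
```

For a ROM that still carries a 512-byte copier header, pass `header_size=0x200`.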

Now, how to fake a short jump.

Write your new ASM code, and use this macro wherever you would do a JSR if you were in the same bank.

Code:
_long_short_jump_c1 .MACRO c1addr
phk
per #9
pea #$FFE5
pea c1addr-1
jmp >$c1ffe5
.ENDM

(Look up these opcodes if you want to edit hexes by hand.)
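For anyone editing hex by hand, here is a rough sketch of what the macro assembles to. The opcode values (PHK $4B, PER $62, PEA $F4, JMP long $5C) come from the standard 65816 opcode table. The function name is made up, and $1234 is a placeholder target, not a real routine.

```python
def fake_short_jump(dest_bank, stub_addr16, target_addr16):
    """Assemble PHK / PER +9 / PEA stub / PEA target-1 / JMP long stub into bytes."""
    def word(v):
        return [v & 0xFF, (v >> 8) & 0xFF]               # operands are little-endian
    code = [0x4B]                                        # PHK
    code += [0x62] + word(9)                             # PER +9
    code += [0xF4] + word(stub_addr16)                   # PEA address of the RTS byte
    code += [0xF4] + word((target_addr16 - 1) & 0xFFFF)  # PEA target - 1
    code += [0x5C] + word(stub_addr16) + [dest_bank]     # JMP long to the RTS byte
    return bytes(code)

code = fake_short_jump(0xC1, 0xFFE5, 0x1234)
print(code.hex(" "))
# 4b 62 09 00 f4 e5 ff f4 33 12 5c e5 ff c1
print(len(code))   # 14
```

So every call site costs 14 bytes in the new bank, compared with 3 bytes for a plain JSR.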

Anyway, to go through the logic step by step:

phk: push the current program bank onto the stack. Then...
per #9: push a 16-bit address onto the stack: the address of the instruction right after the PER, plus a constant value of 9.

With the above, you push onto the stack exactly what would be pushed onto the stack if you did a long JSR (aka JSL) from this part of the code. These values will be pulled when an RTL executes, making the cpu return here.

The +9 is important. It makes the pushed address point at the last byte of the last instruction in this macro (the JMP). When an RTL pulls these values from the stack, the program counter is incremented by 1 before the next instruction is read. This means we will resume reading code right after this macro.
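The arithmetic behind the 9 can be checked with a short sketch. The instruction sizes are the standard ones (PER and PEA are 3 bytes, JMP long is 4); the $8001 address for the PER is made up.

```python
PER_SIZE, PEA_SIZE, JML_SIZE = 3, 3, 4

def per_push_value(per_addr16, displacement):
    """PER pushes (address of the next instruction + displacement), 16-bit wrap."""
    return (per_addr16 + PER_SIZE + displacement) & 0xFFFF

per_addr = 0x8001                                # made-up address of the PER
jml_addr = per_addr + PER_SIZE + 2 * PEA_SIZE    # the two PEAs sit in between
last_byte_of_jml = jml_addr + JML_SIZE - 1

pushed = per_push_value(per_addr, 9)
print(hex(pushed), hex(last_byte_of_jml))        # 0x800d 0x800d
print(hex((pushed + 1) & 0xFFFF))                # 0x800e, first byte after the macro
```

The 9 is simply the two PEAs (6 bytes) plus the JMP (4 bytes) minus 1. If you add or remove instructions between the PER and the JMP, the constant has to change with them.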

pea #$FFE5: push the absolute 16-bit address in C1 where we wrote those two new bytes. (Obviously, use a different value if you wrote them to a different local address.)

This pushes the same thing that a 16-bit JSR would push if its last byte sat at $C1FFE5: a return address that makes RTS resume at $FFE6. So, we're faking that we've made a short jump from our two-byte access function, to somewhere else.

pea c1addr-1: The c1addr-1 is a variable in the macro, not a literal value. What you literally want to push to the stack is the 16-bit address of your destination subroutine, minus 1. The minus 1 is important, unless you want to start on the second byte of the destination subroutine (you don't).
jmp >$c1ffe5: Finally we do a long jump to the entry point in C1. Note this is a regular JMP, and not a JSR. That means it does not in itself push anything to the stack. It is not designed to be paired with a return statement.

OK, so we jump to $c1ffe5. The first thing we'll read is a RTS. This is a (short-address) return from subroutine. But, counter to its name, we're using it to jump to the destination subroutine. Got it?

The top values on the stack are the local address of the destination subroutine. We pushed those there ourselves. So, we "return" to our destination code.

It doesn't matter that we never actually "left" from that location, nor that there is not actually a JSR statement there. Nothing stops us from using a "return" statement in the opposite way intended.

So then, the subroutine in C1 will execute as normal. At the end it will execute its own RTS. This pulls $FFE5, which is now on the top of the stack (we put it there manually), and adds 1, so we land on the second opcode we wrote, at $FFE6: the RTL. The top of the stack is now the appropriate 24-bit address in our new block of code (again, we wrote it ourselves in our macro). So, we'll return to our new code, and continue executing as normal.
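The whole round trip can be traced with a toy stack model. This is only a sketch: it models nothing but the pushes and the three returns, the class and function names are invented, and $EE8000 and $1234 are placeholder addresses.

```python
class Stack:
    def __init__(self):
        self.data = []                  # last element = top of stack
    def push8(self, v):
        self.data.append(v & 0xFF)
    def push16(self, v):                # high byte first, so the low byte is pulled first
        self.push8(v >> 8)
        self.push8(v)
    def pull8(self):
        return self.data.pop()
    def pull16(self):
        lo = self.pull8()
        return lo | (self.pull8() << 8)

def rts(stack, bank):
    """RTS: pull 16 bits, add 1, stay in the current bank."""
    return (bank << 16) | ((stack.pull16() + 1) & 0xFFFF)

def rtl(stack):
    """RTL: pull 16 bits and add 1, then pull the bank."""
    addr = (stack.pull16() + 1) & 0xFFFF
    return (stack.pull8() << 16) | addr

def run_fake_jump(macro_addr, stub_addr, target_addr16):
    """Return the addresses the CPU lands on after each of the three returns."""
    s = Stack()
    bank, pc = macro_addr >> 16, macro_addr & 0xFFFF
    s.push8(bank)                             # PHK (1 byte, so the PER is at pc + 1)
    s.push16((pc + 1 + 3 + 9) & 0xFFFF)       # PER +9: next-instruction address + 9
    s.push16(stub_addr & 0xFFFF)              # PEA address of the RTS byte
    s.push16((target_addr16 - 1) & 0xFFFF)    # PEA target - 1
    dest_bank = stub_addr >> 16               # the long JMP pushes nothing
    hops = [rts(s, dest_bank)]                # our RTS "returns" into the target
    hops.append(rts(s, dest_bank))            # the target's own RTS lands on our RTL
    hops.append(rtl(s))                       # our RTL goes back to the new bank
    assert not s.data                         # everything we pushed has been pulled
    return hops

hops = run_fake_jump(0xEE8000, 0xC1FFE5, 0x1234)
print([f"{h:06X}" for h in hops])
# ['C11234', 'C1FFE6', 'EE800E']
```

The last address is the macro's start plus its 14 bytes, so execution continues with whatever follows the macro.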

I've tested this by copying the NMI routine used to update graphics in battle (originally found at C1/0BAA) to a new bank. This subroutine happens to call a ton of other C1 subroutines, so this technique is very necessary. I made the necessary replacements to the JSR statements, and changed nothing else. Then I redirected the NMI to my new code (the address is set in C2).

It works like a charm. The game behaves exactly as normal, which it should, since the code is essentially identical, but in a new location.

Now that I have a proof of concept, I can change the code as I see fit without any significant limitations on space, and still interface with the existing code.

Theoretically, these extra pushes and jumps entail some overhead CPU cycles. I don't know if this will be a problem. There is no significant slowdown in my snes9x emulator. Since the NMI function is called every frame, and it uses a lot of in-bank jumps, this would be one of the "worst" functions to move to a new bank. But I don't notice any issues. Will it be a problem in another emulator, or on a slower computer, or with real hardware? I don't think so, but I can't say.
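For a rough idea of the overhead, the nominal cycle counts can be added up. This is a back-of-the-envelope sketch using the base counts from the standard 65816 opcode tables; it ignores memory access speed and anything else the SNES adds on top.

```python
# Nominal 65816 cycle counts (native mode); memory speed is not modelled.
CYCLES = {"PHK": 3, "PER": 6, "PEA": 5, "JML": 4, "JSR": 6, "RTS": 6, "RTL": 6}

original = ["JSR", "RTS"]                     # in-bank call plus the routine's own return
faked = ["PHK", "PER", "PEA", "PEA", "JML",   # the macro
         "RTS",                               # our RTS that enters the routine
         "RTS",                               # the routine's own return
         "RTL"]                               # our RTL back to the new bank

def total(seq):
    return sum(CYCLES[op] for op in seq)

extra = total(faked) - total(original)
print(total(original), total(faked), extra)   # 12 41 29
```

That is about 29 extra CPU cycles per faked call. Whether that matters depends on how many such calls a routine makes per frame.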

So anyway, this technique is now known, if it wasn't before. I hope it helps some people with their asm hacks.


04-16-2016, 09:39 AM

Tenkarider

Sorry for the necropost, but I tried to do this trick following this guide, and it looks like I'm failing to understand something in the process ;[
So I'm going to ask whether someone here has tested this trick successfully and can help me understand the whole thing better.
In particular, I need to see a working example of code (hex would be best) with all the offsets.


04-16-2016, 07:18 PM (This post was last modified: 04-16-2016, 07:45 PM by madsiur.)

madsiur


The following code allowed me to call a $C2 routine from bank $EE. It's not the best example, but it proves the trick works. Just enter a battle and set a breakpoint at $C20CB0. I'm including the debugger console log. I never touched "step over" or "step out", except for one "step over" at $C20D42.

xkas code:

Code:
hirom

;what is needed for fake jump
org $C26469
RTS
RTL

;hook up
org $C20CB0
JSL newCode
BRA continue

;ending point where code continue normally
org $C20CBA
continue:

org $EEAF01
;$C24B5A function
newCode:
PHX
INC $BE
LDX $BE
LDA $C0FD00,X
PLX

;$C20CB3 code
ORA #$E0
STA $E8

;Fake Jump replacing JSR $0D3D
PHK
PER $0009
PEA $6469        ; return after execution of $C20D3D
PEA $0D3C        ; 0D3Dh - 1
JMP $C26469

;Your custom code could continue here :D
RTL

Log:

Code:
c20cb0 jsl $eeaf01   [eeaf01] A:00ff X:000c Y:0004 S:15d6 D:0000 DB:7e NvMXdIzc V:158 H: 268 F:27
eeaf01 phx                    A:00ff X:000c Y:0004 S:15d3 D:0000 DB:7e NvMXdIzc V:158 H: 322 F:27
eeaf02 inc $be       [0000be] A:00ff X:000c Y:0004 S:15d2 D:0000 DB:7e NvMXdIzc V:158 H: 342 F:27
eeaf04 ldx $be       [0000be] A:00ff X:000c Y:0004 S:15d2 D:0000 DB:7e nvMXdIzc V:158 H: 376 F:27
eeaf06 lda $c0fd00,x [c0fd47] A:00ff X:0047 Y:0004 S:15d2 D:0000 DB:7e nvMXdIzc V:158 H: 396 F:27
eeaf0a plx                    A:0078 X:0047 Y:0004 S:15d2 D:0000 DB:7e nvMXdIzc V:158 H: 426 F:27
eeaf0b ora #$e0               A:0078 X:000c Y:0004 S:15d3 D:0000 DB:7e nvMXdIzc V:158 H: 452 F:27
eeaf0d sta $e8       [0000e8] A:00f8 X:000c Y:0004 S:15d3 D:0000 DB:7e NvMXdIzc V:158 H: 464 F:27
eeaf0f phk                    A:00f8 X:000c Y:0004 S:15d3 D:0000 DB:7e NvMXdIzc V:158 H: 484 F:27
eeaf10 per $0009     [7e0009] A:00f8 X:000c Y:0004 S:15d2 D:0000 DB:7e NvMXdIzc V:158 H: 504 F:27
eeaf13 pea $6469     [7e6469] A:00f8 X:000c Y:0004 S:15d0 D:0000 DB:7e NvMXdIzc V:158 H: 584 F:27
eeaf16 pea $0d3c     [7e0d3c] A:00f8 X:000c Y:0004 S:15ce D:0000 DB:7e NvMXdIzc V:158 H: 618 F:27
eeaf19 jml $c26469   [c26469] A:00f8 X:000c Y:0004 S:15cc D:0000 DB:7e NvMXdIzc V:158 H: 652 F:27
c26469 rts                    A:00f8 X:000c Y:0004 S:15cc D:0000 DB:7e NvMXdIzc V:158 H: 676 F:27
c20d3d php                    A:00f8 X:000c Y:0004 S:15ce D:0000 DB:7e NvMXdIzc V:158 H: 716 F:27
c20d3e rep #$20               A:00f8 X:000c Y:0004 S:15cd D:0000 DB:7e NvMXdIzc V:158 H: 736 F:27
c20d40 lda $f0       [0000f0] A:00f8 X:000c Y:0004 S:15cd D:0000 DB:7e NvmXdIzc V:158 H: 754 F:27
c20d42 jsr $47b7     [c247b7] A:001e X:000c Y:0004 S:15cd D:0000 DB:7e nvmXdIzc V:158 H: 782 F:27
c20d45 inc                    A:001d X:000c Y:0004 S:15cd D:0000 DB:7e nvmXdIzc V:159 H: 622 F:27
c20d46 sta $f0       [0000f0] A:001e X:000c Y:0004 S:15cd D:0000 DB:7e nvmXdIzc V:159 H: 634 F:27
c20d48 plp                    A:001e X:000c Y:0004 S:15cd D:0000 DB:7e nvmXdIzc V:159 H: 662 F:27
c20d49 rts                    A:001e X:000c Y:0004 S:15ce D:0000 DB:7e NvMXdIzc V:159 H: 688 F:27
c2646a rtl                    A:001e X:000c Y:0004 S:15d0 D:0000 DB:7e NvMXdIzc V:159 H: 728 F:27
eeaf1d rtl                    A:001e X:000c Y:0004 S:15d3 D:0000 DB:7e NvMXdIzc V:159 H: 770 F:27
c20cb4 bra $0cba     [c20cba] A:001e X:000c Y:0004 S:15d6 D:0000 DB:7e NvMXdIzc V:159 H: 812 F:27
c20cba clc                    A:001e X:000c Y:0004 S:15d6 D:0000 DB:7e NvMXdIzc V:159 H: 830 F:27
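Since hex with offsets was asked for, here is a sketch that works out the bytes of the fake jump at the addresses seen in the log. The opcode values are the standard 65816 ones. The function name is made up, and the addresses printed are CPU addresses, not file offsets.

```python
def fake_jump_listing(start, stub, target16):
    """Return ((address, bytes) rows, resume address) for the fake jump at `start`."""
    def word(v):
        return [v & 0xFF, (v >> 8) & 0xFF]            # little-endian operand
    rows_src = [
        [0x4B],                                       # PHK
        [0x62] + word(9),                             # PER $0009
        [0xF4] + word(stub & 0xFFFF),                 # PEA address of the RTS byte
        [0xF4] + word((target16 - 1) & 0xFFFF),       # PEA target - 1
        [0x5C] + word(stub & 0xFFFF) + [stub >> 16],  # JML to the RTS byte
    ]
    rows, addr = [], start
    for b in rows_src:
        rows.append((addr, bytes(b)))
        addr += len(b)
    return rows, addr                                 # addr = first byte after the JML

rows, resume = fake_jump_listing(0xEEAF0F, 0xC26469, 0x0D3D)
for addr, b in rows:
    print(f"{addr:06X}: {b.hex(' ').upper()}")
print(f"resume at {resume:06X}")
# EEAF0F: 4B
# EEAF10: 62 09 00
# EEAF13: F4 69 64
# EEAF16: F4 3C 0D
# EEAF19: 5C 69 64 C2
# resume at EEAF1D
```

The row addresses line up with the phk, per, pea, pea and jml lines in the log, and the resume address matches the `eeaf1d rtl` line that runs after the `c2646a rtl`.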
