I've been working on a similar project (LEGO Island decompilation). We've developed an extensive set of annotations and corresponding tools that facilitate matching the assembly/binary:
https://github.com/isledecomp/isle/tree/master/tools
We've been considering creating a separate project/repository for the tools since they might be interesting for other projects such as yours as well.
Cool to see krystalgamer here on HN. I found them a few years ago through the (then super tiny) SM2000 modding scene on YouTube. It was my first 3D game, and I still have dreams about the graphics. Their work on the resource unpacker utility allowed me to make custom skins of my own characters for the game, which I screenshotted and used for a comic. It's weird: I often think and dream in the visual language of that game.
Anywayyyyy, will be following this development more closely now. Would be really cool to eventually see some type of level editor for this game, or maybe a multiplayer server (akin to this Bomb Rush Cyberfunk mod: https://thunderstore.io/c/bomb-rush-cyberfunk/p/NotNet/SlopC... )
I can’t imagine a single person decompiling a project that was likely written by a team of at least a dozen developers over the course of 2-3 years.
Looking at the commit history, I see only contributions by the author “krystalgamer”. Wild.
Rare to find a person with such resolve and enjoyment. Wish this person good luck!
I've been working on my own reverse-engineering/decompilation project (Tenchu: Stealth Assassins) and I've created a Ghidra extension that can export a program selection as a working, relocatable object file [1].
I've had some really good results on x86 since writing an analyzer for an architecture where relocation spots target 4-byte immediate fields inside of instructions is fairly easy. Unfortunately, the PlayStation uses a MIPS processor and writing an analyzer for split HI16/LO16 relocations is proving to be a devilishly tricky problem. I got it to a point where it works well enough on MIPS most of the time, but there's always a new weird edge case hidden inside a function thousands of instructions long where it breaks down...
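To illustrate why the split relocations are harder than a plain 4-byte immediate, here is a minimal Python sketch of the HI16/LO16 arithmetic, assuming the common `lui` + `addiu` pairing where the low half is sign-extended by the CPU. The function names are hypothetical and are not taken from the Ghidra extension; the genuinely hard part described above (finding which `lui` pairs with which low-half instruction in a long function) is not attempted here.

```python
from typing import Tuple

def sign_extend16(value: int) -> int:
    """Interpret a 16-bit field as a signed integer, as addiu/lw/sw do."""
    return value - 0x10000 if value & 0x8000 else value

def split_hi_lo(addr: int) -> Tuple[int, int]:
    """Split a 32-bit address into HI16/LO16 immediates.

    The low half is sign-extended at run time, so the high half must be
    rounded up whenever bit 15 of the address is set.
    """
    lo = addr & 0xFFFF
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo

def join_hi_lo(hi: int, lo: int) -> int:
    """Recover the 32-bit address that a lui/addiu pair computes."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF

def patch_pair(hi_word: int, lo_word: int, new_addr: int) -> Tuple[int, int]:
    """Rewrite the 16-bit immediates of an already-identified instruction pair."""
    hi, lo = split_hi_lo(new_addr)
    return (hi_word & 0xFFFF0000) | hi, (lo_word & 0xFFFF0000) | lo

# lui $a0, 0 ; addiu $a0, $a0, 0 retargeted to 0x80018000
LUI_A0 = 0x3C040000
ADDIU_A0_A0 = 0x24840000
patched = patch_pair(LUI_A0, ADDIU_A0_A0, 0x80018000)
```

Note that the high half becomes 0x8002 rather than 0x8001 for this address: neither instruction alone carries the target, which is why an analyzer has to track the pair together.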
“As described in this post, decompilation project is tedious and laborious. Therefore it's easy to lose motivation and leave it in backburner for a long-time. The solution I have found is consistency, everyday try to work on it a bit as all effort will compound.”
I’m currently self-studying calculus, and this is exactly the situation I keep falling into. I now spend an extra hour and a half or so at work in the lunch room (my workplace is 24x7) going through the calculus book. I’ve been doing this for 6 weeks or so, and though my progress may seem slow to many (I just got through limits), I understand the material properly.
(As for the 6 weeks: I did the prerequisite chapters first, which took some time, then I had a holiday, and then I spent almost a week and a half on the epsilon-delta definition of a limit. I also do all the problems in the book, and there are a lot.)
Awesome project. The PS1 version of this game is one of my favorites from growing up and also the reason my home server's hostname was "eelnats" for many years.
I've been researching decompilation on and off for the past few years, and it's definitely challenging to decide how closely you want to match the original. If you can get the exact compiler and exact compilation settings, matching decompilation is totally feasible, and if you can make the process incremental, so that you work up to 100% matching over time, it seems like a really good approach. It does require a lot of groundwork and an understanding of how the compiler and linker really work, though. While matching functions from a binary compiled with Visual Studio 2003, I realized that very subtle differences can cause, e.g., different register allocation, even in an old compiler with dramatically less sophisticated optimization passes.
Anyway, I guess this tangent is really unrelated, but I think more people should be embarking on decompilation projects. It's very fun, and it's uniquely rewarding if you manage to get some non-trivial decompilation of code to work properly.
I had one odd use case for decompiling that was actually, as far as I know, completely licit: WebView2Loader. Microsoft distributed the WebView2 SDK under the 3-clause BSD license so that you could integrate it into your applications without worrying about licensing, but the glue logic that actually interacts with the WebView2 installation and instantiates the COM objects is closed source. But... since it is closed-source 3-clause BSD, without a EULA... we can reverse engineer it. It being a relatively small shim, I did just that[1]. This was an easy exercise armed with an interactive disassembler, and since it was relatively simple and very small I didn't need to bother with matching anything: I just roughly replicated the behavior instead.
The use case for this was allowing people to make WebView2 bindings that didn't have any external dependencies; the OpenWebView2Loader code was ported to Pascal and Go by others, making it possible to have pure bindings that don't require any C code or external DLLs and can talk directly to the WebView2 installation. There's now a static copy of the WebView2Loader with the SDK, which obviates some of the use of this, but this is still a nice approach for Go, where you can entirely avoid CGo or messing with weird object format conversion.
(It's way better than my original approach for WebView2 in Go, which was to emulate the Windows linker to link and execute an entirely in-memory copy of the WebView2Loader DLL using a lot of unsafe code. That also works, but it is much more bug-prone and frankly horrifying.)
Which decompiler?
Man, makes me sad just how much time and effort is wasted on decompiling those old games.
All of this effort could’ve been spent on implementing features and fixing bugs by super dedicated fans of the games. Instead, there’s a constant fight with code and bloodsucking lawyers of greedy companies who don’t give a single shit about those games.
Most of those aren’t even sold anymore. Destined to rot, because of petty reasons.
Nice project. Spider-Man (2000) was a great game for the time. It had a solid action/platforming engine, it had voice acting with a number of actors pulled from some of the animated shows that were running at the time, and it came out before the 2002 Sam Raimi film that forced most subsequent games to follow the movie timelines. I played the N64 version at the time, and found it fairly polished and much better than most of the Spider-Man games that came out before it (Arcade's Revenge, Maximum Carnage, Separation Anxiety, to name a few).
I would argue it would hold the crown for best Spider-Man game for some time until the flawed masterpiece Spider-Man 2 (2004) gave us truly amazing web-slinging physics in a sandbox environment, or the less-free-but-tighter-overall Ultimate Spider-Man in 2005.