C# on the PSP

I put some custom firmware on my PSP last week. Then I put a homebrew development environment on my PC. Then I wrote a Common Language Runtime for the PSP, and now I can run the prototypical C# Hello World program from my PSP. For the gory details, read on.

Getting the Tools

Homebrew console development is typically mired in obscure (and frequently undocumented) complexity from line zero; it’s the grim reality of working with something that is essentially a giant, constantly evolving hack. When I set out to get the PSP development toolchain onto my machine, I was prepared for all sorts of mucking about with Cygwin and various other Unix tools. I was knee-deep in getting everything sorted out and configured when I stumbled across this blog entry by Paulo Lopes and this related forum thread. Paulo’s SDK distribution is extremely pleasant to install, and with just one simple change (adding -lstdc++ to the LIBS in the makefile) I was able to build C++ applications from within Visual Studio without any of the mess I was expecting. I haven’t had any trouble with it yet, so I highly recommend it to anybody looking to get into PSP development instead of PSP screwing-around-with-makefiles-for-hours.

The Depths of the CLR

Like C#, the Common Language Runtime is described in an ECMA standard. Anybody looking to build themselves a CLR, or understand an existing CLR implementation, needs to have that document handy.

Code for the CLR is stored inside “assemblies,” which also include resources and other modules. Assemblies are packaged up inside files in the Portable Executable format (the very same format used by Windows .exe and .dll files). The PE format is well-documented and discussed at length elsewhere, so I won’t go into it too much. It’s sufficient to say that the first task for my project was to implement a PE file loader, which was relatively straightforward.

Within the PE file is the CLR metadata, which contains pretty much all the information needed to load and run the application. There’s a header, with the kind of information you’d expect to find in a header, and there are “heaps” of string, GUID, and binary data. The meat of the metadata, however, is the tables. Each table consists of rows of data, and every row in a given table has the same set of columns. The data in each column is generally a constant value or an index into another table or into the string/GUID/binary blob heaps. Each table is identified by an eight-bit ordinal (for example, the table containing method definitions is 0x06). Conceptually, there’s nothing too difficult here. In practice, however, there are a few snags you can easily run afoul of.

Table Format Woes

For starters, the table columns that contain indices into other tables are not a fixed size. In the interests of saving space, an index is two bytes in size if the table it refers to has fewer than 2^16 (65,536) rows; otherwise it’s four bytes. The header for the metadata tables gives you only a bitfield indicating which tables exist and a series of integers specifying how many rows each of those tables has. The tables are all stored contiguously in the metadata, but since the size of each table as a whole isn’t provided, you can’t easily jump straight to an arbitrary table. You need to process every table definition, checking every index column against the reported row count of the table it points to, and build up a database of table row sizes at runtime.
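The bookkeeping described above can be sketched roughly as follows. This is an illustration rather than the project’s code: the names RowCounts, Column, expand_row_counts, index_size, and row_size are all hypothetical, and it only handles fixed-width columns and plain table indices, leaving out heap indices and any other column kinds.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kMaxTables = 64;  // one bit per table in the bitfield
using RowCounts = std::array<std::uint32_t, kMaxTables>;

// Expand the "which tables exist" bitfield plus the packed list of row
// counts into one row count per table ordinal (absent tables get zero).
RowCounts expand_row_counts(std::uint64_t valid,
                            const std::vector<std::uint32_t>& packed) {
    RowCounts rows{};
    std::size_t next = 0;
    for (unsigned table = 0; table < kMaxTables && next < packed.size(); ++table) {
        if ((valid >> table) & 1u) rows[table] = packed[next++];
    }
    return rows;
}

// Illustrative column description: either a fixed-width value or an index
// into another table.
struct Column {
    enum Kind { Fixed, TableIndex };
    Kind kind;
    unsigned value;  // byte width for Fixed, target table ordinal for TableIndex
};

// An index is two bytes wide unless the target table has 2^16 or more rows.
unsigned index_size(const RowCounts& rows, unsigned table) {
    return rows[table] < 0x10000u ? 2u : 4u;
}

// Sum the column widths to get the size in bytes of one row.
unsigned row_size(const RowCounts& rows, const std::vector<Column>& columns) {
    unsigned size = 0;
    for (const Column& c : columns) {
        size += (c.kind == Column::Fixed) ? c.value : index_size(rows, c.value);
    }
    return size;
}
```

Multiplying each table’s row size by its row count and accumulating the results in table order then gives the starting offset of every table.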

The variable nature of the table sizes also means accessing them in a clean fashion is tricky, and best served by lightweight wrapper classes for each type of table row. The processing of the table definitions and implementation of the table row wrappers is rather mechanical, so it lends itself to automation by a build tool.
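As a rough idea of what one of those wrappers might look like, here is a sketch for an imaginary table whose rows hold two index columns. The class name TwoIndexRow and the helper read_le are invented for this example; a generated wrapper would have one accessor per real column, with the widths taken from the row-size database.

```cpp
#include <cstdint>

// Read a little-endian value of `width` bytes (2 or 4) one byte at a time,
// which sidesteps any alignment requirement on the source address.
inline std::uint32_t read_le(const std::uint8_t* p, unsigned width) {
    std::uint32_t value = 0;
    for (unsigned i = 0; i < width; ++i) {
        value |= std::uint32_t(p[i]) << (8 * i);
    }
    return value;
}

// Hypothetical wrapper over one row of a table with two index columns.
// The column widths are decided at load time, so they are passed in
// rather than hard-coded.
class TwoIndexRow {
public:
    TwoIndexRow(const std::uint8_t* row, unsigned firstWidth, unsigned secondWidth)
        : row_(row), firstWidth_(firstWidth), secondWidth_(secondWidth) {}

    std::uint32_t first() const { return read_le(row_, firstWidth_); }
    std::uint32_t second() const { return read_le(row_ + firstWidth_, secondWidth_); }

private:
    const std::uint8_t* row_;  // start of this row inside the metadata
    unsigned firstWidth_;      // 2 or 4 bytes
    unsigned secondWidth_;     // 2 or 4 bytes
};
```

The wrapper holds only a pointer and a couple of widths, so creating one per access is cheap and the rest of the runtime never has to care how wide a column happens to be in a particular assembly.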

Data Alignment

On the PSP in particular, misaligned data access is a problem. The PSP’s main CPU is a MIPS chip, which has memory access alignment requirements: two-byte accesses must occur at an address divisible by two, four-byte accesses must occur at an address divisible by four, et cetera. The MIPS processor isn’t one I’m terribly familiar with; I know it has instructions supporting misaligned load/store operations, and I was able to see them generated by the compiler, but I’d still get hard crashes on the PSP when I tried to access misaligned data, so some further investigation is warranted.

For now, I simply use a set of wrapper routines to perform any memory I/O. These routines check whether the address in question meets the alignment requirements, and if it doesn’t, they fetch data from the nearest aligned addresses and shift and mask the results to yield the value I’m interested in. Misaligned access occurs quite frequently when implementing the CLR. For example, most CIL opcodes are a single byte, but they are occasionally followed by a four-byte argument token, which will usually be misaligned.
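Here is a minimal sketch of that kind of wrapper for a 32-bit read, assuming a little-endian target. The names read_u32 and load_aligned_u32 are hypothetical, not the project’s actual routines. The sketch uses memcpy for the aligned loads to stay within the C++ aliasing rules; on the real hardware you would want to confirm that this compiles down to an ordinary aligned word load.

```cpp
#include <cstdint>
#include <cstring>

// Load a 32-bit word from an address the caller guarantees is four-byte
// aligned.
static std::uint32_t load_aligned_u32(const std::uint8_t* aligned) {
    std::uint32_t word;
    std::memcpy(&word, aligned, sizeof word);
    return word;
}

// Read a little-endian 32-bit value from any address. If the address is
// misaligned, fetch the two aligned words that straddle it and combine the
// relevant bytes by shifting.
std::uint32_t read_u32(const std::uint8_t* p) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const unsigned offset = static_cast<unsigned>(addr & 3u);
    if (offset == 0) return load_aligned_u32(p);

    const std::uint8_t* base = p - offset;  // nearest aligned address below p
    const unsigned shift = offset * 8;      // 8, 16, or 24 bits
    const std::uint32_t lo = load_aligned_u32(base);
    const std::uint32_t hi = load_aligned_u32(base + 4);

    // Low bytes of the result come from the top of `lo`, high bytes from the
    // bottom of `hi`; the shifts discard the bytes that are not wanted.
    return (lo >> shift) | (hi << (32 - shift));
}
```

Note that the second aligned load can touch up to three bytes beyond the value being read. They always sit in the same aligned word as the value’s last byte, but it is something to keep in mind near the end of a buffer.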

What About Mono?

The reason I embarked on this project, and in this particular fashion (rather than, say, porting Mono), was to learn more about the gritty details of the CLR. Mono’s very mature relative to my little upstart implementation, but by porting it I wouldn’t be learning as much about the CLR as I would be about Mono’s code structure and internals. I felt rather lost digging around in Mono’s code, as well, because not only did I not know their codebase, I didn’t know the domain within which their codebase was operating, and, as I said above, learning about that domain by reading their source code wouldn’t have taught me as much.

On the other hand, while I’ve learned a lot about the CLR’s guts so far, the actual amount of PSP-specific code I’ve had to write is quite small, and other than the memory alignment crashes I haven’t run into many problems on the PSP that didn’t also manifest themselves in my Windows build. This will likely change as I start being able to implement more and more of the standard library.

Downloading the Project

I have a Google Code page for the PSP CLR, if you’d like to look around. I should warn you (if the low SVN revision numbers don’t tip you off) that this is an extremely early attempt. There are a lot of places I’m taking shortcuts, failing to check for errors, and making assumptions, and even if all of that works out, I only have support for a handful of opcodes and a single standard library function (System.Console.Write(string)). In other words, it doesn’t actually do much yet, so don’t expect too much. Nonetheless, it’s there if you want to take a look.