So, in the days of yore, there was a game called “X-Wing vs. TIE Fighter,” in which the player flew various missions in an X-Wing (or another ship). When I was a child I enjoyed this game immensely, along with the slew of aerial combat simulators that came out in the mid-to-late 90s.
X-Wing uses a custom mission file format, .XWI, which has already been decoded (at http://www.quantumg.net/xwing_format.php), but I can’t seem to find information on the .BRF format that’s paired with it. The .XWI format specifies the mission, but the .BRF format specifies the briefing that comes before the mission. Without the briefing, the mission is confusing. Obviously, without the mission, the briefing is pointless, but that isn’t the problem here. So I’m going to reverse-engineer the .BRF format, and hopefully learn how to reverse-engineer formats along the way.
Fortunately, I have access to a program called “XMB Editor” which can edit XWI and BRF files (it’s so old that wine runs it using DOSbox, which is funny since wine claims it’s not an emulator).
I don’t have access to the source code, like I do for AssaultCube (another story), but it will still be a tremendous help since it will let us see what the metaphors are.
First let’s take a look at a pseudo-blank briefing as a starting point. XXD gives us this: tmp
At the beginning is a bunch of binary data that I won’t interpret yet, but is almost certainly a header. Around 0x2d0 we see the “>Destroy Imperial Freighters” string, which is the name of the mission.
Immediately before that string is the byte 0x00, but before that is 0x1c = 28, which is the number of characters in the text. The next string (“They are expected…”) is 100 characters long, and has 0x64 = 100 two bytes before it. So this pair of bytes is the length of the string, apparently stored as a little-endian 16-bit value.
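As a rough sketch of what reading one of these strings might look like in Python: this assumes the length really is a little-endian 16-bit value followed directly by the text. The function name `read_lp_string` and the latin-1 decoding are assumptions made for illustration, not something the file tells us.

```python
import struct

def read_lp_string(data: bytes, offset: int) -> tuple[str, int]:
    """Read a 16-bit little-endian length, then that many bytes of text.

    Returns the decoded string and the offset just past it.
    """
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    # The text encoding is an assumption; the briefings seen so far are ASCII.
    text = data[start:start + length].decode("latin-1")
    return text, start + length

# Bytes modeled on the mission name seen around 0x2d0: 1c 00, then 28 characters.
sample = b"\x1c\x00" + b">Destroy Imperial Freighters"
name, next_offset = read_lp_string(sample, 0)
print(name, next_offset)
```

Returning the next offset makes it easy to walk a run of strings that are bunched together, as they seem to be at the end of the file.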
The strings appear to all be bunched together at the end, so the header probably contains the number of strings. We can track it down by adding the string “This is a test” to the briefing.
Result: a
We see the string at the end of the document. For some reason it’s been padded out to 44 bytes.
Oddly, while the text appeared at the end, none of the header changed. Probably we have to add it to the animation page, which is what controls how the map animates.
The XMB editor considers the briefing to consist of multiple animation pages, each with some scripted animation. So let’s do the simplest thing, and add a new “full text” page with our new text.
Result: b
Radically different. The text “Congratulations…” has moved 24 bytes later in the file. The first line to change is the line starting at 0x60. The 4th-to-last byte changed from a 3 to a 4 when we added the 4th page, so let’s say that this is the page count.
The rest of the header is the same until line 0x1c0, where “Congratulations…” starts. The first change is halfway through the line, a 0x01 changes to a 0x08. Specifically, the following was added:
0x0800 0x0000 0x0100 0x0000 0x0b00 0x0000 0x0000 0x0c00 0x0900 0x0f27 0x2900 0x0000
Let’s guess that the 0x0800 is a section header of some sort. Under that assumption, and the knowledge that there is another “full text” page immediately before this one, we note that the previous “full text” page has this string:
0x0800 0x0000 0x0100 0x0000 0x0b00 0x0100 0x0000 0x0c00 0x0800 0x0f27 0x2900 0x0000
The same length, and differing in only two places: the sixth word (0x0000 in the new page, 0x0100 in the old one) and the ninth word (0x0900 versus 0x0800).
In xmb-edit, each animation page has a list of animation commands. Each animation command has three editable fields: Clock Ticks, Cmd (an enum, e.g. “Show Main Text”), and Parameter. In both of the pages above, there are two animation commands “Show Title Text” (value 11=0x0b) and “Show Main Text” (value 12=0x0c). Each has a Clock Ticks parameter of 0.
Thus, we can say that the format for an animation command is:
- Clock Ticks (2 byte short)
- Command (2 byte enum, possible values below)
- Parameter(s) (depends on command, see below)
In the companion .XWI format, there are an awful lot of 2 byte shorts, so I’m making a lot of guesses here that everything is 2 byte shorts. Later testing will help verify this.
Possible values for command, as well as the GUI’s parameters (xmb-edit gives me the number in the GUI next to the name):
- 1 - Wait for Click - No parameters
- 10 - Clear Text - No parameters
- 11 - Show Title Text - 1 parameter, text ID.
- 12 - Show Main Text - 1 parameter, text ID.
- 15 - Center Map - 2 parameters, X and Y.
- 16 - Zoom Map - 2 parameters, X and Y.
- 21 - Clear Boxes - No parameters
- 22 - “Box 1” - 1 parameter
- 23 - “Box 2” - 1 parameter
- 24 - “Box 3” - 1 parameter
- 25 - “Box 4” - 1 parameter
- 26 - “Clear Tags” - No parameters
- 27 - “Tag 1” - 3 parameters: 1 integer “parameter”, X, and Y.
- 28 - “Tag 2” - 3 parameters: 1 integer “parameter”, X, and Y.
- 29 - “Tag 3” - 3 parameters: 1 integer “parameter”, X, and Y.
- 30 - “Tag 4” - 3 parameters: 1 integer “parameter”, X, and Y.
Note that these all have different parameter sets. Let’s add an animation command to “Center Map” at X=1.00,Y=2.00, and to “Tag 1” with parameter 1 at X=2.00,Y=0.01.
The result: c
The complete animation frame is now (starting from where we think the previous frame ended to where we think the “Congratulations…” string started, and split according to the animation commands):
1100 0000 0100 // Header
0000 0b00 0000 // Show Title Text
0000 0c00 0900 // Show Main Text
0000 0f00 6400 38ff // 0x0000 clock ticks, Center Map (0x0f), X=1.0 (0x6400), Y=2.0 (0x38ff).
0000 1b00 0000 c800 ffff // 0x0000 clock ticks, Tag 1 (0x1b), parameter=1, X=2.0 (0xc800), Y=0.01 (0xffff)
0f27 2900 0000 // Footer
The 0x0800 turned into an 0x1100 when 9 2-byte words were added, so that’s apparently the number of words after the count (not counting the footer, but counting the second two bytes in the header).
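To check that reading of the count, here is a small Python sketch that splits the page above into commands. The parameter counts come from the GUI list earlier; only commands 11, 12, 15 and 27 have actually been seen in a hex dump at this point, so the rest of the table is an assumption, and `PARAM_WORDS` and `parse_commands` are names made up for this example.

```python
import struct

# 16-bit parameter words per command ID, from the GUI list above.
# Only 11, 12, 15 and 27 are confirmed by the dump; the others are guesses.
PARAM_WORDS = {
    1: 0, 10: 0, 11: 1, 12: 1, 15: 2, 16: 2,
    21: 0, 22: 1, 23: 1, 24: 1, 25: 1,
    26: 0, 27: 3, 28: 3, 29: 3, 30: 3,
}

def parse_commands(words):
    """Split a flat list of 16-bit words into (ticks, command, params) tuples."""
    commands = []
    i = 0
    while i < len(words):
        ticks, cmd = words[i], words[i + 1]
        n = PARAM_WORDS[cmd]  # raises KeyError on a command we don't know
        commands.append((ticks, cmd, words[i + 2:i + 2 + n]))
        i += 2 + n
    return commands

# The page from the dump above, without the 3-word footer.
page = bytes.fromhex(
    "1100 0000 0100"
    "0000 0b00 0000"
    "0000 0c00 0900"
    "0000 0f00 6400 38ff"
    "0000 1b00 0000 c800 ffff"
)
words = list(struct.unpack("<%dH" % (len(page) // 2), page))
size, body = words[0], words[1:]
commands = parse_commands(body[2:])  # skip the two remaining header words
print(size, len(body), commands)
```

Here `size` and `len(body)` agree, which matches the idea that the count covers everything after itself up to (but not including) the footer.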
The number encoding is weird. If nothing else, then it’s weird because Y=2.0 is 0x38ff but X=2.0 is 0xc800. I suspect off the cuff that it’s some sort of floating point notation, so let’s encode a bunch of different numbers (by just modifying the Tag 1 command):
X=2.0,Y=2.0 encodes to c800 38ff
X=-2.0,Y=-2.0 is 38ff c800 (WAT. This probably means that they use the same encoding, but for some reason one of them is negated. Y is often negated in graphics, I’ll assume that one for now)
X=2.5,Y=-2.25 is fa00 e100
Here’s something to note: 0x00e1 = 225 in decimal. 0x00fa = 250 in decimal. 0xff38 = 65536 - 200 in decimal. We can conclude that the number is a little-endian-encoded fixed point number, in hundredths. Sign is encoded as the two’s complement.
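In Python, that decoding might be sketched like this. The helper name `decode_hundredths` is invented for the example, and the sketch returns the stored value as-is: it does not undo the apparent negation of Y.

```python
import struct

def decode_hundredths(raw: bytes) -> float:
    """Decode a little-endian signed 16-bit value stored in hundredths."""
    (value,) = struct.unpack("<h", raw)
    return value / 100

# The words observed in the Tag 1 experiments above.
for hex_word in ("c800", "38ff", "fa00", "e100"):
    print(hex_word, decode_hundredths(bytes.fromhex(hex_word)))
```

Using the signed `<h` format lets `struct` handle the two’s complement step, so 38ff comes out as -2.0 directly.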
Now that we roughly understand how data is encoded into animation frames (note: we don’t yet know the semantics of the data yet, but we can figure that out later), let’s move on to “Ships & Objects.” Like above, we’ll start by adding a ship. We’ll add 1 wave of 5 TIE Advanced (ID 17), with designation “ALPHA”, cargo “BETA”, alternate cargo “GAMMA”, alternate vessel 3, at X=4,Y=5,Z=6, and the personal vessel will be vessel 2.
The result: d
The first word stayed the same, but the second word changed from 0 to 1, so this is probably the count of ships & objects. The third word stayed at 2. After that, something was inserted.
The “Congratulations…” string moved from 0x1f8 to 0x244, so presumably 76 bytes were added. At 0x52 we see the “0x0200 0x0000” string continuing, so that’s presumably where the addition stops. The bytes in the middle are:
00000000:                9001 0cfe 5802 0000 0000  ..........X.....
00000010: 0000 1100 0000 0500 0100 414c 5048 4100  ..........ALPHA.
00000020: 0000 0000 0000 0000 0000 4245 5441 0000  ..........BETA..
00000030: 0000 0000 0000 0000 0000 4741 4d4d 4100  ..........GAMMA.
00000040: 0000 0000 0000 0000 0000 0100 0000 4000  ..............@.
00000050: 0000
17, the decimal ID for the TIE Advanced, equals 0x11 in hex, so the “0x1100” is probably the object type. This is followed by a zero, which is followed by a 0x0500, which is the quantity, and a 0x0100, which is the number of waves we set. Then we have three strings: designation, cargo, and alternate cargo, each field 16 bytes long and null terminated (fixed-length strings are used in the XWI format as well).
The X, Y, and Z coordinates (4.0, 5.0, and 6.0, respectively) map to 0x0190, 0x01f4, and 0x0258. The X and Z coordinates both appear verbatim in the first and third words in the snippet, respectively, but the Y coordinate doesn’t… However, taking our knowledge from above, 0xfe0c is the two’s complement encoding of -500, so we find that the Y coordinate was negated (and -5.00 is what is stored).
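Going the other way, a position as typed into the GUI could be packed like this. This is only a sketch under the assumptions so far (hundredths, little-endian, Y negated), and `encode_position` is a hypothetical helper name.

```python
import struct

def encode_position(x: float, y: float, z: float) -> bytes:
    """Pack a GUI position as three little-endian signed 16-bit hundredths.

    Y is negated before storing, as observed in the dump above.
    """
    return struct.pack("<3h", round(x * 100), round(-y * 100), round(z * 100))

encoded = encode_position(4.0, 5.0, 6.0)
print(encoded.hex(" ", 2))
```

The `round` calls guard against floating point noise such as 0.3 * 100 coming out slightly above 30.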
After the last string we see 0x0100 0x0000 0x4000 0x0000. If we change the IFF from “default” to “alliance”, the first word changes to 0x0000. For “imperial” it changes to 0xffff (-1), and for “neutral” it changes to 0xfeff (-2). This is a strange encoding, to be sure, but it appears to be an enum between the four values.
To pick on another field to manipulate, I changed the “alternate vessel” from 0 to 1. This brought down a world of hurt: In the GUI, the locations changed, and the alternate vessel was reset from 1 to 0. The IFF field was changed to 0xfeff. So something’s afoot, let’s poke at some other fields to see whether any of them are messed up.
Changing the “personal” vessel from 2 to 3 changed the IFF field to 0xfcff (-4) (the IFF field was also reset to “default”). According to the GUI, these three fields are inextricably linked in some way.
Perhaps it’s a bitfield of some sort. If we change the personal vessel to 0, and the IFF to “alliance”, that word changes to 0xfeff. However, the “0x1100 0x0000 0x0500” changed to “0x1100 0x0100 0x0500”. If we change the IFF from alliance to imperial, that field changes to a 0x0200 (however, the later field changed to a 0xfdff).
For clarity, these are the values of those bytes, with the two fields in bold (the 0200 right after the 1100 on the second row, and the fdff on the fifth row):
00000000:                9001 0cfe 5802 0000 0000  ..........X.....
00000010: 0000 1100 0200 0500 0100 414c 5048 4100  ..........ALPHA.
00000020: 0000 0000 0000 0000 0000 4245 5441 0000  ..........BETA..
00000030: 0000 0000 0000 0000 0000 4741 4d4d 4100  ..........GAMMA.
00000040: 0000 0000 0000 0000 0000 fdff 0000 4000  ..............@.
00000050: 0000
Apparently bold doesn’t work very well in preformatted tags.
Changing the IFF to “neutral” changes the first bolded field to 0x0300, and the second bold field to 0xfeff. So we can say that that first bold field corresponds to the IFF.
Changing the personal vessel to 2 changes the second field to 0xfdff. Changing it to 3 changes it to 0xfeff. Interestingly, if I save the file over and over, the second field keeps decrementing. However, the GUI is saving the personal vessel number, because I can open the file and it remembers the correct number. So I’m going to guess that that field is some unknown field, and that the personal vessel is saved somewhere else. Here’s the full hex dump of the file so far (with the ship changed to X-Wing, as well): e
If you compare the files d and e using diff, you will see that byte 0x348 also changed from a 2 to a 3 when the personal vessel changed from a 2 to a 3. I don’t know what this block of data is used for; it appears to be a sea of zeros.
For completeness, here are all possible values of object type:
- 0 - Unassigned
- 1 - X-Wing
- 2 - Y-Wing
- 3 - A-Wing
- 4 - TIE Fighter
- 5 - TIE Interceptor
- 6 - TIE Bomber
- 7 - Assault Gunboat
- 8 - Transport
- 9 - Shuttle
- 10 - Tug
- 11 - Container
- 12 - Freighter
- 13 - Calamari Cruiser
- 14 - Nebulan B Frigate
- 15 - Corvette
- 16 - Star Destroyer
- 17 - TIE Advanced
- 18 - Mine, Type 1
- 19 - Mine, Type 2
- 20 - Mine, Type 3
- 21 - Mine, Type 4
- 22 - Comm Sat
- 23 - Nav Buoy
- 24 - Probe
- 25 - B-Wing
- 26 - Asteroid, Size 1
- 27 - Asteroid, Size 2
- 28 - Asteroid, Size 3
- 29 - Asteroid, Size 4
- 30 - Asteroid, Size 5
- 31 - Asteroid, Size 6
- 32 - Asteroid, Size 7
- 33 - Asteroid, Size 8
- 34 - Rock World
- 35 - Gray Ring World
- 36 - Gray World
- 37 - Brown Habitable
- 38 - Gray Habitable
- 39 - World w/ Moon
- 40 - Gray Crescent
- 41 - Orange Crescent 1
- 42 - Orange Crescent 2
- 43 - Orange Crescent 3
- 44 - Orange Crescent 4
- 45 - Orange Crescent 5
- 46 - Orange Crescent 6
- 47 - Orange Crescent 7
- 48 - Orange Crescent 8
- 49 - Death Star
The unknown field appears near the phrase “T/B’s” in the text, which is what we use for tags. So let’s investigate the map tags next. As usual, we insert a tag “Test Tag”, and then save the file to see what’s changed: f
Examining the diff of e and f, we find that the first difference is not in the header but rather at byte 0x367. Based on how far the “>Destroy…” string moved, we notice that 8 bytes were added. These bytes were added immediately after it:
0x08 0x0054 0x6573 0x7420 0x5461 0x67
Note that the first byte is listed alone. This is just how I scraped it out of xxd, and has no semantic meaning, except that 8 is the length of the string, and 0800 is the word that appeared. Thus the tag appears to be prefixed with a little-endian 16-bit tag length.
This is 10 bytes, though, not 8. If we look at the 0x20 that appears at 0x3a5 in e, we find that in e it is 69 bytes from the 0x05 at 0x360 (the start of the “T/B’s” tag). In f, it is 77 bytes away. This is what we expected. So it would appear that there is a fixed-size chunk of data, and every time a string is added the data grows by the size of the string but not by the word indicating the length of the string. Counting the zero words left before the 0x20 at 0x3ad in f, we arrive at 30 zero words, which plus the two tags gives us room for exactly 32 tags. Powers of two are not often coincidences, so my guess is that there is a limit of 32 tags per map.
If we add 30 more tags “a”, then in fact that chunk does fill up exactly: g. Notice how the last “a” tag ends on the 0x20 that ostensibly serves as the marker for the next block.
But what if we add a 33rd tag? I added “ab”, and this is the result: h. The “ab” tag and its length was simply appended, like nothing had happened.
So apparently the map tag block consists of a sequence of strings, each preceded by a string length. It’s padded to contain at least 32 strings, albeit up to 31 strings may be zero-length. The block is terminated by “0x20 0x00 0x00 0x00”, although this may just be the header for the next block.
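A sketch of reading that block in Python, covering only the ordinary case of exactly 32 slots. How a reader should notice a 33rd tag is not clear from the experiments above (a 0x20 length word looks just like the terminator), so this example does not try. `MAX_TAG_SLOTS` and `read_tag_block` are names invented for illustration.

```python
import struct

MAX_TAG_SLOTS = 32  # assumption: the block always holds at least 32 entries

def read_tag_block(data: bytes, offset: int = 0):
    """Read 32 length-prefixed tags, keeping empty slots as "".

    Returns (tags, offset just past the last slot).
    """
    tags = []
    for _ in range(MAX_TAG_SLOTS):
        (length,) = struct.unpack_from("<H", data, offset)
        offset += 2
        tags.append(data[offset:offset + length].decode("latin-1"))
        offset += length
    return tags, offset

# Modeled on file f: two real tags, 30 empty slots, then the 20 00 00 00 marker.
block = (b"\x05\x00T/B's" + b"\x08\x00Test Tag" + b"\x00\x00" * 30
         + b"\x20\x00\x00\x00")
tags, end = read_tag_block(block)
print(tags[:2], end)
```

With these inputs the reader stops 77 bytes in, right at the 0x20 marker, which is the same distance measured in f.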
There’s just one more set of text strings to investigate, the completion messages, so let’s knock that out. There are always exactly three lines of them, and each line is apparently exactly 64 bytes long, zero terminated. There are no length counters here.
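Reading those three fields is simple enough to sketch directly. The function name `read_completion_messages` is hypothetical, and the sample text in the usage is made up rather than taken from a real briefing.

```python
def read_completion_messages(data: bytes, offset: int = 0):
    """Read three fixed 64-byte fields, cutting each at its first NUL byte.

    Returns (messages, offset just past the third field).
    """
    messages = []
    for i in range(3):
        field = data[offset + 64 * i:offset + 64 * (i + 1)]
        messages.append(field.split(b"\x00", 1)[0].decode("latin-1"))
    return messages, offset + 3 * 64

# Made-up sample data: two lines in use, the third left blank.
sample = (b"First line".ljust(64, b"\x00")
          + b"Second line".ljust(64, b"\x00")
          + b"\x00" * 64)
messages, end = read_completion_messages(sample)
print(messages, end)
```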
There’s a few fields that we need to mop up. Interestingly, the “Coordinate Set” and “Space/Surface” fields in the GUI don’t appear to map to anything in the file, so these are probably just GUI options. However, under “Map Page Setup,” one can configure the number of lines of text at the bottom of the map. Changing this from 3 to 4 resulted in: i
Apart from the “save counter” that we discovered above, two bytes both changed from 0x67 to 0x5c: The byte at 0x5e and the byte at 0x80.
Both of these occur before the map page section starts. From the first change we made, adding a full text page (going from a to b), we can find that the number of pages occurs at 0x6c in a and b, and based on the “0400 d401 9000” substring we can locate this to be at 0xb8 in i.
We still have to deal with this mystery data, though:
00000050:      0200 0000 0000 0c00 d400 0100 5c00  ..............\.
00000060: 0000 8a00 d400 0100 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 0000 0c00 0000  ................
00000080: 5c00 d400 0100 0000 0000 0c00 d400 0100  \...............
00000090: 0c00 0000 8a00 d400 0100 0000 0000 0000  ................
000000a0: 0000 0000 0000 0000 0000 0000 0000 0c00  ................
000000b0: 0000 1800 d400 0000
This is what appears between the end of the ships & objects and the beginning of the animation pages. In order to write a parser for .BRF, we need to at least know how to cross this no-man’s land.
Perhaps the ships & objects end with a pointer to the data. Let’s add another ship, a Y-Wing designated “YELLOW” at X=0.1,Y=0.2,Z=0.3 with everything else left at the GUI’s default. The result: j
As predicted, the ship counter incremented from 1 to 2. What was not predicted, though, was that the first block of ship data changed. The position of the new ship (0.1 maps to 0a00, -0.2 to ecff, and 0.3 to 1e00) was inserted immediately after the position of the first ship group.
A quick side-by-side comparison. Here’s the original:
00000000: 0000 0000 0000 9001 0cfe 5802 0000 0000  ..........X.....
00000010: 0000 0100 0300 0500 0100 414c 5048 4100  ..........ALPHA.
00000020: 0000 0000 0000 0000 0000 4245 5441 0000  ..........BETA..
00000030: 0000 0000 0000 0000 0000 4741 4d4d 4100  ..........GAMMA.
00000040: 0000 0000 0000 0000 0000 f2ff 0000 4000  ..............@.
00000050: 0000 0200
And here’s the new one:
00000000:                9001 0cfe 5802 0a00 ecff  ..........X.....
00000010: 1e00 0000 0000 0000 0000 0000 0000 0100  ................
00000020: 0300 0500 0100 414c 5048 4100 0000 0000  ......ALPHA.....
00000030: 0000 0000 0000 4245 5441 0000 0000 0000  ......BETA......
00000040: 0000 0000 0000 4741 4d4d 4100 0000 0000  ......GAMMA.....
00000050: 0000 0000 0000 f0ff 0000 4000 0000 0200  ..........@.....
00000060: 0000 0100 0000 5945 4c4c 4f57 0000 0000  ......YELLOW....
00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 feff 0000 4000 0000 0200  ..........@.....
This looks like the three coordinates of the X-Wing (9001 0cfe 5802) followed by the coordinates of the Y-Wing (0a00 ecff 1e00), followed by three 0 words for each ship, followed by the X-Wing block. The X-Wing block is the type (0100), then the IFF (0300), then the wave size (0500), then the number of waves (0100). Then there are the three strings (ALPHA, BETA, GAMMA), the save count (f0ff), and an unknown (0000 4000 0000). Then we start on the Y-Wing block, which has a type (0200), an IFF (0000), wave size (0100) and wave count (0000), followed by the strings (YELLOW, blank, blank), and then the save count (feff) and the unknown (0000 4000 0000).
Then we start in on the next block, starting with (0200).
In both i and j, this block is (going from the end of the ship data to the start of the map page data) 102 bytes long, and both have the same data.
In the X-Wing OPT formats (used to represent the graphical data, specs), they use a block of pointers at the start of the file to point to the different blocks around the file. Perhaps a similar system is in use here, and this mysterious block of data is a block of offsets.
So what choice do we have, but to add bunches of numbers together and hope to find a pattern. Let’s go back to comparing h and i. In the OPT format, the jump table starts with a few zeroes, then a count of headers, then an offset to the offsets, and then the offsets (wherever they may be). If we follow the same scheme here (presumably they were made by the same people), we drop the first two zeroes, and the 0c00 (12 in decimal) is the number of headers. Then we have an offset to offsets, d400. Nothing special is at 0xd4, so maybe that’s a dead end.
But worth noting is that 0x5c + 0x5e (the location of 0x5c) = 0xba, which is where the first map page is (in i). This doesn’t hold up under scrutiny in h, though, where 0x67 + 0x5e lands us in the middle of the map page block. So another dead end.
Here’s another thought. If we include the zeroes, then there are exactly 50 2-byte words, which is divisible by 5. If we divide the chunk into 10 5-word blocks, we get:
54: 0000 0000 0c00 d400 0100 (1)
5e: 5c00 0000 8a00 d400 0100 (2)
68: 0000 0000 0000 0000 0000
72: 0000 0000 0000 0000 0000
7c: 0c00 0000 5c00 d400 0100 (3)
86: 0000 0000 0c00 d400 0100 (4)
90: 0c00 0000 8a00 d400 0100 (5)
9a: 0000 0000 0000 0000 0000
a4: 0000 0000 0000 0000 0000
ae: 0c00 0000 1800 d400 0000 (6)
There’s certainly a pattern here. Four of the chunks are all zeroes, let’s ignore those for now. The other chunks have been numbered 1 through 6. In these chunks, the fourth word is always d400, and the fifth word is always 0100 except in the last chunk. No two of the non-zero chunks are entirely the same.
Also note that in each chunk, the third word is correlated with the first word in the previous non-zero chunk. Chunks 3 and 4 each have the third word equalling the prior’s first word. Chunk 2 and 5 both have the third word = 8a00 and the prior first word = 0000. Chunk 1 is the first chunk, and chunk 6 has the third word = 1800 but the prior first word = 0c00.
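The same bookkeeping can be done mechanically, which helps when comparing several files. This Python sketch splits the 50 words from i into 5-word chunks and pairs each non-zero chunk's third word with the previous non-zero chunk's first word. It only restates the observation; it does not explain it, and the variable names are arbitrary.

```python
import struct

# The 50 words from file i, starting at offset 0x54.
block = bytes.fromhex(
    "0000 0000 0c00 d400 0100"
    "5c00 0000 8a00 d400 0100"
    "0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 0000"
    "0c00 0000 5c00 d400 0100"
    "0000 0000 0c00 d400 0100"
    "0c00 0000 8a00 d400 0100"
    "0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 0000"
    "0c00 0000 1800 d400 0000"
)
words = struct.unpack("<50H", block)
chunks = [words[i:i + 5] for i in range(0, 50, 5)]
nonzero = [c for c in chunks if any(c)]

# (previous chunk's first word, this chunk's third word) for chunks 2..6.
pairs = [(prev[0], cur[2]) for prev, cur in zip(nonzero, nonzero[1:])]
for number, (prev_first, third) in enumerate(pairs, start=2):
    print(f"chunk {number}: prior first word {prev_first:#06x}, third word {third:#06x}")
```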
I’m not seeing any patterns. The two files are just too similar for a comparison to be useful, so let’s look at that chunk from a completely different (and also official) briefing file. If you have the game, this is the block from attack3.brf, broken into 5-word chunks:
3e4: 0000 0000 0c00 d400 0100 (1)
3ee: 7200 0000 8a00 d400 0100 (2)
3f8: 0000 0000 0000 0000 0000
402: 0000 0000 0000 0000 0000
40c: 0c00 0000 7200 d400 0100 (3)
416: 0000 0000 0c00 d400 0100 (4)
420: 0c00 0000 8a00 d400 0100 (5)
42a: 0000 0000 0000 0000 0000
434: 0000 0000 0000 0000 0000
43e: 0c00 0000 7500 d400 0000 (6)
The same sort of setup, to a very large degree. This file has the “lines of text” option set to 2 (‘h’ had it set to 3, ‘i’ had it set to 4). As “lines of text” increases by 1, the number in question (the first word of chunk 2, which is also the third word of chunk 3) decreases by 0xb every time: 0x72, then 0x67, then 0x5c.
The exact semantic of that number is eluding me at the moment, so I’ll move on to some of the minutiae of the map screen. Unfortunately, somewhere along the way the mission file I’ve been editing got corrupted, and xmb-edit won’t open it anymore. So I’ll start using the DESUPLY2.BRF file that ships with xmb-edit. This is the hex dump: k. This is the file on which I based the TEST.BRF that I’ve been using.
Map animation pages have three page-wide configuration options: Clock Period, Coordinate Set, and Page Type. We already know where the Clock Period goes, and we already theorized where Page Type goes. The only thing left is Coordinate Set.
If we change the coordinate set from 2 to 1, then the word after the “9000” (at 0x44e) changes from 0 to 1. Thus 0 means coordinate set 2, and 1 means coordinate set 1.
The BRF files contain multiple (typically 2) coordinate sets for specifying where ships are. Each ship has a location in each coordinate set. For instance, in this file, the first ship (an X-Wing) has location 1.63,-12.73,0.16 (a300 f904 1000) in set 1 but location 1.63,-9.00,0.00 in set 2 (a300 8403 0000). Looking at the beginning of k, we see the first coordinate set where we expect (immediately after the header 0200 0d00 0200), but we also see the second coordinate set at 0x54, after all of the ships have specified their coordinates. This is where the mysterious zeroes were in the previous files we were using. Based on this information, I also posit that the second 2 in the header is the number of coordinate sets in the file.
The fact that there is no easy way to get to the strings, combined with the mysterious 50 words, leads me to believe that the 50 words are somehow a table of pointers to the text strings.
The middle word of the last chunk in the mystery block seems to be meaningful in some way. In l, the last value is 0x18 (at 0x442), and the text starts at 0xb00 (map tag text, not including length counter), and the three words before the completion messages start at 0x5a4. In tm07mx.brf, the last value is 0x72 (at 0x5be) and the text starts at 0xe88, and the completion header starts at 0x76a. In attack3.brf, the last value is 0x75 (at 0x442) and the text starts at 0xade and the completion header at 0x582.
At a loss for better ideas, I lengthened one of the briefing texts to get this: m
Worth noting is that the number of zeroes (or trash data) following the mission briefing string always equals the length of the string itself. Strange but true, even if you look at LucasArts-supplied missions.
Now that we know how to go from one string to the next, we’re just left with the question of how to get to the first string.
If we add a single character to one of the map tag texts, then we get: n, and we note that nothing in the header changed (except for the save counters), even though the texts all moved back by a character. So whatever’s giving directions to the text gives directions to the start of the map tags.
If we add an animation command to a frame, we get: o, which shows that when the completion messages shifted, the mission briefing texts shifted by the same amount, even though nothing in the header changed. Therefore whatever’s giving directions is based on the distance between the end of the completion messages and the beginning of the map tag text (it could be distance between the beginning of the completion messages just as well, since that’s a fixed 192 bytes). It also tells us that the distance doesn’t (directly) depend on the length of the animation commands.
Now we just have to find that dependence. I spent a few hours playing with numbers, before I got the good sense to plug it all into a spreadsheet and graph it:
So it looks like if you take the number of words in the header, multiply by 15 and add 151, you get the number of bytes between the end of the completion messages and the start of the first text. This is true for four different LucasArts-supplied files as well as the xmb-edit file I’ve been playing with, even though this seems like a very unusual formula.
With this, I think we can parse (if not semantically understand) the entire file, except for the 50 words. Let’s summarize. Everything is a little-endian 16-bit short, unless specified otherwise.
- 02 00 - file version (or other identifier)
- Number of ships/objects.
- Number of coordinate systems.
- For each coordinate system i:
  - For each ship j:
    - X, Y, and Z coordinates of ship j in coordinate system i, each a 16-bit signed integer in hundredths (Y appears to be stored negated).
- For each ship i:
  - Ship type (16-bit enum, specified above)
  - IFF of the ship (16-bit enum: 0 = default, 1 = alliance, 2 = imperial, 3 = neutral)
  - Wave size
  - Number of waves
  - Designation, cargo, alternate cargo (each is a 16-byte fixed-length string).
  - Unknown short
  - 00 00 40 00 00 00 (otherwise unknown)
- 02 00 (unknown).
- 50 words, i.e. 100 bytes (unknown).
- Number of animation pages.
- For each animation page i:
  - Clock period
  - Page Size (in words)
  - Coordinate system
  - Unknown (note: potentially this is what denotes the page type)
  - The animation commands, in the format described above, filling the rest of the page size
  - The conditions to transition to the next page (0f27 2900 is default, 9001 0100 is wait-for-click).
- 6 bytes (unknown, generally the last 4 bytes are all zero)
- 3 64-byte strings (these are the completion messages, and are shown on mission completion)
- 15*x+151 filler bytes to ignore, according to the formula above.
- 20 00
- The map tag texts: Each is preceded by a 16-bit integer denoting the length of the string, then the string (not null terminated).
- Some more zeros. I believe this pads out to make there be 32 map tags, some of which are zero-length.
- Another 20 00, followed by a 00 00.
- All of the briefing text strings. Each is preceded by the string length as a 16-bit short. The mission name starts with a “>”.
With this, we should be able to write a short parser for this file format.
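As a starting point, here is a Python sketch of just the front of the file: the three header words, the coordinate sets, and the ship blocks. It follows the summary above and nothing more. The `Ship` class, its field names, and `parse_front` are all invented for this example and are not taken from any existing parser, and the unknown bytes are skipped rather than interpreted.

```python
import struct
from dataclasses import dataclass

@dataclass
class Ship:
    type_id: int
    iff: int
    wave_size: int
    waves: int
    designation: str
    cargo: str
    alt_cargo: str
    unknown: int  # the short that seems to change on every save

def _fixed_str(raw: bytes) -> str:
    """Cut a fixed-length field at its first NUL byte."""
    return raw.split(b"\x00", 1)[0].decode("latin-1")

def parse_front(data: bytes):
    """Parse the header, coordinate sets and ship blocks of a .BRF file.

    Returns (version, coord_sets, ships, offset of the next section).
    """
    version, n_ships, n_sets = struct.unpack_from("<3H", data, 0)
    offset = 6
    coord_sets = []
    for _ in range(n_sets):
        positions = []
        for _ in range(n_ships):
            x, y, z = struct.unpack_from("<3h", data, offset)
            # Values are hundredths; the stored Y appears to be negated.
            positions.append((x / 100, -y / 100, z / 100))
            offset += 6
        coord_sets.append(positions)
    ships = []
    for _ in range(n_ships):
        type_id, iff, wave_size, waves = struct.unpack_from("<4H", data, offset)
        names = [
            _fixed_str(data[offset + 8 + 16 * k:offset + 24 + 16 * k])
            for k in range(3)
        ]
        (unknown,) = struct.unpack_from("<h", data, offset + 56)
        ships.append(Ship(type_id, iff, wave_size, waves, *names, unknown))
        # 8 bytes of shorts + 48 of strings + 2 unknown + 6 (00 00 40 00 00 00)
        offset += 64
    return version, coord_sets, ships, offset
```

Called on the bytes of a briefing (for example `parse_front(open("TEST.BRF", "rb").read())`, with whatever filename you have), the returned offset should land on the 02 00 that precedes the mystery block.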
Writing a parser and then attempting to parse all of the missions should reveal any mistakes or corner cases that the above analysis missed.
For example, the analysis missed that the “wait for click” command has unusual semantics. The above format doesn’t quite work for missions which have the “click to see the strategy manual.” These missions have a slightly different page block (on the strategy page), which looks like:
000006b0:           0100 0800 0000 0100 0000 0b00  .').............
000006c0: 0100 0000 0c00 0800 9001 0100 bd02 1200  ................
000006d0: 0000 0100 0000 0b00 0900 0000 0c00 0a00  ................
000006e0: 9001 0100 9001 0a00 9001 0b00 0900 9001  ................
000006f0: 0c00 0b00 0f27 2900
(this is from t5h1wb)
This block starts out with a normal page, but then at 9001 everything becomes different.
Judging from the xmb-edit gui, the above snippet corresponds to two pages: a normal page with two animation commands (“show title text”, param=1 and “show main text”, param=8) followed by a special page with six animation commands (“show title text”, param=9, “show main text”, param=10, “wait for click”, “clear text”, “show title text”, param=9, and “show main text”, param=11)
We can split the above data into the commands:
0100 0800 0000 0100
0000 0b00 0100
0000 0c00 0800
9001 0100 bd02 1200 0000 0100 // Start of special frame. bd02 is the clock period = 701.
0000 0b00 0900 // First show title text
0000 0c00 0a00 // First show main text
9001 0100 // Wait for click (the 9001 is the tick delay, 400 = 0x190)
9001 0a00 // Clear text
9001 0b00 0900 // Show title text
9001 0c00 0b00 // Show main text
0f27 2900 // Footer
We can hack our way around it in the parser, and start a new page when we see that.
However, not all click pages are set up like that.
Another corner case is that there is a briefing (t4m01am) where the mysterious block starts with a 3, not a 2, and is 75 words long rather than 50. So the first number in that section is apparently a count of something, at 25 words per item.
There are also a few missions for which not even xmb-edit knows everything about how to parse them. wotan3bw (which isn’t used in the game) contains an animation command that isn’t recognized (animation command 41).
The other big thing that comes up is that the above formula for calculating the text jump size is wrong when the number of coordinate sets is not 2. Empirically, it seems that instead of being based on the number of words in the header, the jump size actually depends on the number of ships, like so:
jumpSize = 15*(nShips*6+3) + 151 - 192 - 4
Which simplifies to:
jumpSize = 45*(nShips*2+1) - 45 = 45*(nShips*2) = 90*nShips
I don’t know why they chose 90, but this formula seems much less arbitrary.
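The simplification is easy to double-check by brute force. In this Python sketch, `jump_size_long` is the formula as first written and `jump_size` is the reduced form; both names are made up for the example.

```python
def jump_size_long(n_ships: int) -> int:
    """The empirical formula as first written."""
    return 15 * (n_ships * 6 + 3) + 151 - 192 - 4

def jump_size(n_ships: int) -> int:
    """The simplified form."""
    return 90 * n_ships

# The two forms agree for every ship count we are likely to meet.
for n in range(64):
    assert jump_size_long(n) == jump_size(n)

print(jump_size(13))  # 13 ships, as in the DESUPLY2.BRF header (0200 0d00 0200)
```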
After much software engineering, I did eventually get a parser working for all but 5 briefings, all of which don’t appear to be used in game (according to the missions list provided with XMB-edit).
The script is here:
https://gist.github.com/lkolbly/de99b551d55038ea50bacadc66ad92f5
If you run it with a filename following it, it will parse the provided briefing file and spit out the fields in a JSON document (the field names correspond roughly to the names used above).
Of course, a mission briefing file format is useless without the mission itself, so here’s a parser for the .XWI file format that the briefing files are paired with:
https://gist.github.com/lkolbly/115ae348c3f148eedde19c15a0951d60
(based on the specs given at http://www.quantumg.net/xwing_format.php)
The usage is the same - provide a filename, and it will spit out a JSON document representing that mission file.