The author later wrote many now-famous articles, including A Trip Through the Graphics Pipeline.
By chance I once sat near him and the rest of Farbrausch at a demoparty, but I was too shy to say hi.
Once upon a time I found one that did the entire round trip, I neglected to bookmark it. So when I eventually forgot how it worked, or wanted to show others, I was stuck. Never did find it.
1. Put the BWT string in the right-most empty column
2. Sort the rows of the matrix such that the strings read along the columns of the matrix are in lexicographical order starting from the top-row????
3. Repeat step 1 and 2 until matrix is full
4. Extract the row of the matrix that has the end-delimiter in the final column
It's the "sort matrix" step that seems under-explained to me.
1. We sorted the first column to get the BWT column. Thereby we created the structure where the BWT column comes before the sorted column.
2. Therefore if we insert the BWT column before a sorted column, the order of row elements is preserved
3. If we now sort again, the order of characters across individual rows is again preserved
4. Going to step 2 again preserves row order
5. Once all columns are populated, therefore all rows are in the correct original order. And thanks to the end marker we can get the original string from any row.
Now I'm wondering: in the era of software 2.0, everything is figured out by AI. What are the chances AI would discover this algorithm at all?
This doesn't follow the original paper very closely, although I think the device of putting an end-of-string marker into the data instead of including the start index as a separate datum is an improvement over the original algorithm. My own "explorable explanation" of the algorithm, at http://canonical.org/~kragen/sw/bwt, is visually uglier, and is very similar to this page in many ways, but it may still be of some interest: it follows the original paper more closely, which is likely to be useful if you're trying to understand the paper. Also, I used an inverse-transform algorithm that was efficient enough that you could imagine actually using in practice, while still being easy to explain (from the original paper)—but, like the page linked, I didn't use an efficient suffix-array-construction algorithm to do the forward transform. I did link to the three that have been discovered.
I forget when I did this but apparently sometime around 02010: https://web.archive.org/web/20100616075622/http://canonical....
I think my JS code in that page is a lot more readable than the minified app.js served up with this page.