Recursion Using Trampoline Functions

Trampoline function is a general term used to describe an additional level of indirection between the calling code and the called code. For example, a trampoline can be used to intercept calls to a function and supply additional functionality that was not present on the original function (see this Microsoft research project [2] to understand what is possible).

To illustrate the idea, here is a simple example of trampoline function that can be used effectivelly to eliminate recusive calls. It is important to do this in languages like C because the compiler is usually unable to optimize tail calls, unlike languages such as Scheme.

The usual recursive way of calculating a factorial would be something like this:

int fact(int x, int total) {
   if (x <= 1) return total;
   return fact(x-1, total*x);
}
int main() {
   printf("result is %d\n", fact(10, 1));
   return 0;
}

The problem with this version is that you are using the function stack to maintain the intermediate results. While this is not really a problem here (because the size of the integer is more of a limitation than the size of the stack), it can be a real concern in other recursive implementations.

Let us see what we can do using a trampoline function. That is, instead of calling the function itself, we call a pointer to a function that is supposed to do what it needs to do, and then return a pointer to a place where we need to go next:

#include <stdlib.h>
#include <stdio.h>

typedef void * (*Func)(int *x, int *total);

void *fact0(int *x, int *total) { return NULL; }

void *fact(int *x, int *total) {
   if (*x <= 1) return fact0;
   *total *= *x;
   *x = *x-1;
   printf("total %d, x: %d\n", *total, *x);
   return fact;
}

int main() {
   int res=1, n = 10;
   Func f = fact;
   while (f != NULL) {
      f = (Func)f(&n, &res);
   }
   printf("result is %d\n", res);
   return 0;
}

This also works, and now there is no problem with the stack size: at each time, the stack holds only the context for the function that is being called. It is always possible to do something like this in a tail recursive function, because we don’t need to store return information for the last call.

In fact, as mentioned by Knuth [1], the method described above can be used to implement any algorithm. This is exactly the direction taken when implementing some interpreters to scheme. The basic idea is to use continuations to store control structure information, so that one never needs to return from a function. However, as stack size is limited in languages such as C, the tactic explained above is used, so one can work with trampoline functions instead of continuations. Tricky but useful when implementing functional languages.

For a similar example in Python, see [3].

[1] Structured Programming With Goto Statements, D.E. Knuth, A.C.M. Computing Surveys, Vol. 6, pp. 261-301.

[2] http://research.microsoft.com/apps/pubs/default.aspx?id=68568

[3] http://mail.python.org/pipermail/tutor/2004-August/031553.html

Using Git Efficiently

I started using Git a few years ago, and to be sure, I cannot say that I know everything about it. However, I know enough of it to say that I am much more efficient using Git than other source control method.

In this post, I will summarize some of the commands everyone need to know in order to start using Git.


Creating a Repository

Creating a repository is just a matter of initialize a directory with git metadata, and add the files to the repository. For example, suppose you want to create a repository of directory /home/john/myproject To do this, you need to use the commands

cd /home/john/myproject
git init
git add .
git commit -m 'first version'

The first git command just initializes the .git directory, where all git metadata is stored. The second command adds everything under the current directory to the repository. Finally, the git commit command creates the initial revision of the project into the repository.


Making Changes

The next common operation in a repository is to make changes to the existing files. After you make the needed changes using your text editor or IDE, the following commands will do the trick:

git add ChangedFile1 ChangedFile2
git commit -m 'description of change'

The git add is necessary even if the file exists, because without it git doesn’t know if the files needs to be updated or not. In git parlance, git add is used to create the “staging area”, which will be written by the commit.

This is a difference from SVN, for example, where anytime we do a commit the system accepts all changes by default. With git you need to be specific about what is being commited.

Optionally, if you prefer to use a GUI to view the changes, you can use the command

git gui

This will invoke the Tk interface, which works both on UNIX and windows (and probably Mac OS X).


Deleting and Recovering

Another common operation is to remove files. This can be done with

git rm FileName

notice that this will work if the files hasn’t been modified. If you need to delete a file that has been changed, use

git rm -f FileName

To recover a file that has been deleted or changed, you can use the following

git checkout -- FileName

Creating Branches

With the commands above you can do pretty much anything you need in terms of normal usage of a VCS. However, creating branches is where the fun is. Git favors the creation of branches, because creating them is very fast and cheap (as measured in memory usage).

To create and use a new branch, type

git branch BranchName
git checkout BranchName

The first command creates the branch, while the second changes to the new branch. You can always go back using

git checkout master

where master is the default branch name.


Merging Files

Git makes it very easy to create branches, but also to merge the results of two or more branches. To merge a branch called “BranchName”, you can use the command

git merge BranchName

Git does a great work of solving conflicts automatically. However, if it can’t solve the conflict, you will be able to edit the files manually and commit them as normal.

You can visualise the results of merges and of other branches using a graphical tool. Just type

gitk --all

and you will be able to see all branches in the repository, along with the history. For a short log of changes, you can also type

git log

Advanced Tricks

There are thousands of advanced tricks you can learn with git, since each part of the system may work independently of the others. I will give you just a flavor of what you can do.

For example, if you want to change the order in which commit appear in a branch, you can use

git rebase -i CommitID

The commit id is the hexadecimal number listed on each commit when you use git log. The command above will call an editor and allow you to edit the commits that have been made since CommitID. You can change the order in which they are applied, merge two or more commits, or even remove some of them. In this way, you can rewrite the history of your commits, and make changes that you wouldn’t be able to do in something as SVN.

Good People Work Hard

Recently, I was looking at some documentation for LaTeX, specifically how the DVI format file was created. I found out that DVI was designed by a student of Knuth called David Fuchs around the end of the 70s, when the first laser printers were being designed.

The whole story of DVI design can be read on an interview by David Fuchs to tug.org. It is a fascinating description of the state of printing 30 years ago. However, the most interesting part of it is a story that shows how really good people work hard to achieve their goals:

I, of course, wrote the DVI-to-CRS software, which also required some tricky font caching (the algorithm for which is the basis of my only published paper, which was actually written by my co-author, guess who?). Those were some intense days of working together, getting it all to work. The CRS was in a cramped basement room, and I had to load and unload each sheet of photographic paper in the darkness, and feed it into the triple-bath developer. Any bug could be potentially in my code or Knuth’s firmware, and as I mentioned, debugging wasn’t easy. After a number of very long days, I came in one morning, and came into Knuth’s office just as he was arriving and handing a stack of legal paper to his secretary. “Here’s a paper I wrote, please type it in,” he told her. I was floored. We’d been working night and day; did he write the paper in his sleep?

It is not uncommon that, in the middle of an important project, you need to work extra hard. But Knuth, despite working all day to make his code work, still spent an unknown amount of time writing papers (maybe during sleep…), so he wouldn’t compromise his scientific output. It just reminds me that to be really good, it is not enough to have good ideas. You also need to work hard.

Next time you feel overwhelmed by your projects, think about this.