Generators and the “Yield” Statement

Python generators are tragically underutilized.

I first heard the term when I was nineteen from a senior dev, and forgot about it all through the rest of my schooling. I recently encountered them again, and the uses they command are formidable. They pack an interesting “memory” capability, one that would require classes or global variables or elaborate argument-passing mechanisms to otherwise emulate.

A generator function is any function that contains the “yield” keyword. The behaviour of generator functions is a bit odd, and is best exemplified through example.

Let’s look at a very simple case. Every time a new row is inserted into a standard SQL database, a new unique row identifier is generated for the row. The row number that is to be added needs to be tracked somehow.

One way that you might do this is with a class that contains the current row number and a function to retrieve it:

class RowNumberHolder: 
    def __init__(self): 
        self.row = 1 
    def get_row(self): 
        self.row += 1 
        return self.row - 1 

So long as the class remains in the proper scope, it can be accessed and a new row number can be retrieved.

An alternative way to achieve this result is via a generator. Here’s what the code might look like:

def row_generator(): 
    row = 1 
    while True: 
        yield row 
        row += 1 

Calls to instances of the generator are done with the “next()” function (or within a for-loop):

>>> test = row_generator() 
>>> next(test) 
>>> next(test) 
>>> next(test) 

This will take quite a bit of explaining, especially that “yield” keyword. What’s up with that, anyways? Shouldn’t this function jus loop forever on execution?

The “yield” keyword can be thought of like a return statement. When execution reaches the “yield” statement, the generator function will cede control back to the calling function, returning whatever value along with the yield statement. In this case, the value that is “yielded” is the row number.

However, when the “next()” function is called again on this generator, the generator function will resume operation not at the beginning of the function, but rather at the statement after the yield that was just run. In this way, generator functions can maintain a state of memory, just like classes.

As mentioned, generators can also be used in for-loops:

>>> test = row_generator() 
>>> for element in row_generator(): 
>>> … print(element) 
>>> …  

In this use, the generator is a stand-in for a list of infinite size. Normally, if you wanted to iterate over an infinite list, you’d need infinite memory. A generator solves that problem by generating the list on the fly, per iteration, with a single piece of memory devoted to where in the list operation currently resides.



AVL Trees and Splay Trees

I’ve got a habit for doing in-depth analyses of very basic data structures or computer science concepts. Such posts are solely for my own benefit because, since having left school, I’ve let this knowledge that I worked so diligently to acquire rot.

Rather than discuss trees – the topic of this post – at a basic level, I thought I’d tackle two more advanced topics: AVL trees and splay trees. To learn about this, I’m assuming foreknowledge of all of the following:

  • Basic definitions from graph theory – nodes, edges, paths, and so forth
  • Tree elements and definitions – root, leaves, children, siblings, parents, depth, and height (I’m writing this comment solely for my own benefit: depth of a node is the length of the path from the node to the root; the height of a node is the length of the longestpath from the node to a leaf)
  • Tree traversal – pre-order, in-order, and post-order traversal
  • Binary search trees – how to search through a tree, and an understanding of the runtime analysis of tree-search algorithms

The motivation between the use of AVL trees and splay trees is due to several use cases for regular binary search trees that result in horribly unbalanced trees. A sorted input to a tree construction method, for example, will result in a linked list, which turns what should be O(log N) retrieval into O(N).

It’s not only bad input that will result in unbalanced trees; straightforward, intuitive implementations of insertion and deletion operations into binary search trees will result in unbalanced trees, also. The adjustment of insertion and deletion operations is the primary motivator behind AVL and splay trees. AVL and splay trees – and all self-balancing trees – sacrifice time during insertions and deletions in order to ensure future operations remain fast. As we’ll see, AVL trees and splay trees do this in very different ways. Let’s start with AVL trees.

AVL Trees

The motivation behind an AVL tree is something called a balance condition. A balance condition is some condition that a tree must maintain, with the intention that the tree will remain as balanced as possible on the left and right sides.

Let’s begin with a simple, illustrative, and very incorrect case: requiring that the left and right subtrees of the root node have the same depth:

Figure 1

Oh, dear. The balance condition is too simple; a tree whose root’s subtrees have the same depth possibly results in a tree that is not shallow at all.

Going to the other end of the spectrum on the rigidity of the balance condition: let’s say that all of the nodes of the tree must have subtrees of the same height. It shouldn’t be difficult to convince yourself that this balance condition is ridiculous: only trees with 2k – 1 nodes would possibly satisfy such a condition. The balance condition is too rigid; a balance condition that enforces some kind of depth restriction, but does not restrict the number of nodes that can possibly be inserted into the tree, needs to be devised.

An AVL tree is a binary search tree with a specific balance condition: for every node in the tree, the height of its left and right subtrees can differ by, at most, 1. (Relevant note: the height of a leaf has a height of 0, as might be expected; the height of an empty subtree has a height of -1, by definition.)

What does this balance condition do? Well, let’s take a look at a sample of an AVL tree:

Figure 2

This is an example of an AVL tree – in fact, it is the AVL tree that has the fewest possible nodes and does not violate its balance condition with height 5. (In this case, the letters mean nothing.)

The first thing that I said when I saw this diagram was, “Hey! What’s this about? The tree’s not balanced!” And, if you were to say the same thing, you’d be right: the tree is not perfectly balanced. But, as we stated earlier, maintaining perfect balance in a tree is not only largely unfeasible, it’s unreasonable.

Of note are two things:

  1. First, though the shortest branch (the far left branch) and the longest branch (the far right branch) of the tree have a difference in height of 2 (which will only get bigger as more items get inserted into the tree), the balance condition – that every node’s left and right subtrees heights’ differ by just one – is never violated. Go ahead. Try to find two subtrees of a node that have differences in height of more than one. Go on. I dare you.
  2. Second, though the tree’s height is not, in fact, minimized, the tree is certainly quite shallow. In fact, all operations – with the possible exception of insertion, which we’ll get to in the next paragraphs – can be performed in O(log N) time.

So, let’s say I want to update the tree. Let’s say I insert a node that, in the diagram above, puts the node below “T”. Then what? Inserting a node will violate the AVL balance condition. What happens then? Do we just curl up in a ball and cry?


As mentioned before, insertion is the step that possibly takes longer than O(log N) time, because the tree potentially requires rebalancing after every insertion. These rebalances are accomplished through a mechanism called rotation, which is a very apt name for the mechanism.

Let’s think of a hypothetical node X and its subtrees. Remember that, in order to violate the AVL property, a node’s subtrees must differ in height by two (strictly speaking, two or more). There are four such cases where rebalancing may be necessary:

  1. Inserting into the left subtree of X’s left child
  2. Inserting into the right subtree of X’s left child
  3. Inserting into the left subtree of X’s right child
  4. Inserting into the right subtree of X’s right child

This is trivial. What is slightly less trivial is that the two “outside” cases – i.e. a left-left insertion and a right-right insertion – can be treated identically, and the two “inside” cases – i.e. a left-right insertion and a right-left insertion – can be treated identically. The “outside” cases are handled by a single rotation, and the “inside” cases are handled by a double rotation.

Let’s look at how a single rotation works first. Say there’s a BST tree with six nodes, labeled A through F (Figure 3). And, say that a new node, G, is added to the tree.

Figure 3

In the graph to the left, the tree no longer satisfies the AVL property: the right subtree of node E has height 1, but the left subtree of node E has height -1. This is rectified by “rotating” the nodes.

The nodes that have to be rotated will only ever be along the insertion path, since those are the only branches that will have their balances affected. This means that, at the most, a subtree of the root will have to be rotated to the root. Realizing this makes it easy to visualize how the rotations work. For the moment, erase everything above node E, stick your finger onto node F, and then allow gravity to pull every node downwards. Then, reattach node F – now fully balanced – to the rest of the tree.

While there will be code to come, see if the following rebalances make intuitive sense.

Figure 4

The single rotation works for case 1 and case 4 (left-left and right-right addition). The two examples are a stark reminder that the left-left and right-right cases apply only to the node that becomes unbalanced; it does not mean that nodes are inserted into the far left or far right of the tree. In the first example, B is the node that becomes unbalanced, because a node is inserted into the right subtree of B’s right child (right-right). In the second example, H is the node that becomes unbalanced, because a node is inserted into the left subtree of H’s left child (left-left).

So, what about cases two and three? What happens during right-left or left-right insertion? Let’s take a look at what happens when single rotations are applied to left-right and right-left insertion cases:

Figure 5

In the first example, B is imbalanced because a node is inserted into the left subtree of B’s right child, so a single rotation is performed. This does nothing, leaving the tree just as imbalanced as before. In the second example, H is imbalanced because a node is inserted into the right subtree of H’s left child, so a single rotation is performed. This does nothing, leaving the tree just as imbalanced as before. For both examples, both trees still violate the AVL property following the single rotation operation. The solution: the double rotation.

If we think of our hypothetical AVL-violating node X again: cases 1 and 4 rotate a child node of X. In double rotation, we first rotate a grandchild node of X, and then rotate that node – now a child of X – once more. Two rotations. Hence, “double rotation”.

Let’s re-examine our trees once again, this time with a double rotation:

Figure 6

After the double rotation, both trees satisfy the AVL property.

Here’s something interesting. Compare the middle-stage trees of Figure 6 to the beginning-stage trees of Figure 4. Notice anything strange?

They’re the same.

The first rotation in double rotation swaps around one of the subtrees so that, in effect, the added node contributes to a left-left or right-right insertion, after which a single rotation is applied to rebalance the tree.

Here’s a simple implementation of an AVL tree data structure using recursion.

class AvlNode:
    def __init__(self, element, left=None, right=None, height=0):
        self.element = element
        self.left = left
        self.right = right
        self.height = height
    def __str__(self):
        return str(self.element)

def height(AvlNode):
    if AvlNode is None:
        return -1
        return AvlNode.height

def insert(AvlTreeRoot, element):
    if AvlTreeRoot is None:
        AvlTreeRoot = AvlNode(element)
    elif element < AvlTreeRoot.element:
        AvlTreeRoot.left = insert(AvlTreeRoot.left, element)
        if height(AvlTreeRoot.left) - height(AvlTreeRoot.right) == 2:
            if element < AvlTreeRoot.left.element:
                AvlTreeRoot = rotateWithLeftChild(AvlTreeRoot)
                AvlTreeRoot = doubleWithLeftChild(AvlTreeRoot)
    elif AvlTreeRoot.element < element:
        AvlTreeRoot.right = insert(AvlTreeRoot.right, element)
        if height(AvlTreeRoot.right) - height(AvlTreeRoot.left) == 2:
            if element < AvlTreeRoot.right.element:
                AvlTreeRoot = rotateWithRightChild(AvlTreeRoot)
                AvlTreeRoot = doubleWithRightChild(AvlTreeRoot)
    AvlTreeRoot.height = max( height(AvlTreeRoot.left), height(AvlTreeRoot.right) ) + 1
    return AvlTreeRoot

def rotateWithLeftChild(AvlNode):
    newAvlSubtreeRoot = AvlNode.left
    AvlNode.left = newAvlSubtreeRoot.right
    newAvlSubtreeRoot.right = AvlNode
    AvlNode.height = max ( height(AvlNode.left), height(AvlNode.right) ) + 1
    newAvlSubtreeRoot.height = max ( height(newAvlSubtreeRoot.left), AvlNode.height ) + 1
    return newAvlSubtreeRoot

def rotateWithRightChild(AvlNode):
    newAvlSubtreeRoot = AvlNode.right
    AvlNode.right = newAvlSubtreeRoot.left
    newAvlSubtreeRoot.left = AvlNode
    AvlNode.height = max ( height(AvlNode.left), height(AvlNode.right) ) + 1
    newAvlSubtreeRoot.height = max ( height(newAvlSubtreeRoot.right), AvlNode.height ) + 1
    return newAvlSubtreeRoot

def doubleWithLeftChild(AvlNode):
    AvlNode.left = rotateWithRightChild(AvlNode.left)
    AvlNode = rotateWithLeftChild(AvlNode)
    return AvlNode

def doubleWithRightChild(AvlNode):
    AvlNode.right = rotateWithRightChild(AvlNode.right)
    AvlNode = rotateWithLeftChild(AvlNode)
    return AvlNode

A few things to note about this implementation:

  1. The insertion is accomplished through recursion. A carefully-coded non-recursive version will be faster, but there are non-trivialities that have to be worked out. Coding a recursive version is easier.
  2. Height information is stored explicitly. Since we really only need to determine whether or not heights differ by 2 or more, we could get away with using a 2-bit variable to store this information, though it would sacrifice clarity.
  3. Deletion is tricky, because deletions can also result in imbalances. This will be worked out in another post.

That’s enough about AVL property and rotations. Let’s move on.

Splay Trees

As we’ve seen, the average-case runtime for binary search trees is O(log N). Given a sufficiently imbalanced tree, the worst-case runtime is O(N). The motivation behind splay trees is that a series of M retrievals on a deep node on an imbalanced tree will result in an O(MN) runtime. (It turns out that, in the average case, repeated retrieval on a deep node in an imbalanced tree is very common.)

Splay trees guarantee that M retrievals will complete in O(M log N) time, meaning that the average runtime for M operations is O(log N). Some will take more time; some will take less. On average, however, an operation will complete in O(log N) time.

The mechanism behind this is quite simple: whenever a node is accessed, a series of tree rotations are performed that rotate the accessed node into the root. That’s it. That’s all that is special about splay trees. Splay trees do not have balancing conditions (unlike AVL trees).

Because splay trees have no balancing condition, a poorly implemented splay tree will also result in O(N) average access time. No good. The tree rotation strategy, called the “splaying” strategy, must be implemented in such a way that trees become more balanced after every retrieval, rather than less.

Let’s took at a naive implementation of splaying trees. After node B is accessed, it is rotated to the root of the tree by a series of rotations:

Figure 7

While node B is cheap to access, node C has been pushed to a depth nearing that of B’s former depth. In fact, with this basic rotation strategy, it’s quite obvious that M accesses on the deepest nodes in the tree will result in a running time of Ω(MN). So, nothing has changed. Another strategy for rotating the accessed nodes to the root of the tree needs to be employed.

The splaying strategy on a non-root node X, which correctly implements the splay tree data structure and ensures O(log N) amortized runtime for access, is as follows:

  1. If node X’s parent is the root, do a single rotation on X and the root and stop. Otherwise, X has a parent, P, and a grandparent, G.
  2. The zig-zag case: if X is a right child of P and P is a left child of G, OR X is a left child of P and P is a right child of G, perform a double rotation on X.
  3. The zig-zig case: if X is a right child of P and P is a right child of G, OR X is a left child of P and P is a left child of G, first make P a child of X, and then make G a child of P.

Let’s take a look at how the zig-zag case works, for an access on node D:

Figure 8

And now, let’s look at how the zig-zig case works, for an access on node B:

Figure 9

Some notes:

  • Zig-zig is not two single rotations. Two single rotations would make G the child of X, and P the child of P. The zig-zig step is different.
  • Both zig-zig and zig-zag make trees shallower. It may be difficult to see, but both the zig-zig and zig-zag steps significantly decrease the depth of the nodes along the access path. Let’s take a look at a tree that has had nodes inserted A, B, C, …, J, with an access on A afterwards:

Figure 10

Deletion in splay trees is trivial. First, the node that needs to be deleted is rotated to the root. Then, the node is deleted, leaving a left tree and a right tree. The largest element from the left tree, L, is rotate to the root of the left tree (which will have no right child, since it is the largest node in the left tree). Then, the right tree is added as a child of L.

Lastly, splay trees are easier to program. They have less balance information to maintain and fewer cases to deal with.

Let’s take a look at a sample implementation of splay trees:

class SplayNode:
    def __init__(self, element, left=None, right=None, parent=None): 
        self.element = element
        self.right = right
        self.left = left
        self.parent = parent
    def __str__(self):
        return str(self.element)

def insert(element, root):
    if root is None:
        root = SplayNode(element)
        return root
    if contains(element, root) is None:
        parentToBe = None
        parentToBeChild = root
        while parentToBeChild is not None:
            if parentToBeChild.element > element:
                parentToBe = parentToBeChild
                parentToBeChild = parentToBeChild.left
                parentToBe = parentToBeChild
                parentToBeChild = parentToBeChild.right
        if parentToBe.element > element:
            parentToBe.left = SplayNode(element)
            parentToBe.left.parent = parentToBe
            parentToBe.right = SplayNode(element)
            parentToBe.right.parent = parentToBe
        root = splay(element, root)
    return root

def delete(element, root):
    node = retrieve(element, root)
    if node is not None:
        largestInLeft = findLargest(node.left)
        node.left = splay(largestInLeft.element, node.left)
        node.left.right = node.right
        node.left.right.parent = node.left
        node.left.parent = None
    return node.left

def contains(element, root):
    currentNode = root
    while currentNode != None and currentNode.element != element:
        if currentNode.element > element:
            currentNode = currentNode.left
            currentNode = currentNode.right
    return currentNode

def retrieve(element, root):
    node = contains(element, root)
    if node is not None:
        splay(element, root)
        return node
        return None

def splay(element, root):
    currentNode = contains(element, root)
    if currentNode is not None:
        # Should we zig, zig-zig, or zig-zag?
        if currentNode.parent is not None:
            parent = currentNode.parent
            grandParent = parent.parent
            while grandParent is not None:
                # Zig-zig or zig-zag until the grandParent is None, at which point
                # a zig is executed if necessary.
                if grandParent.element > parent.element and parent.element > element:
                    currentNode = zigZigRight(currentNode)
                elif grandParent.element < parent.element and parent.element < element:
                    currentNode = zigZigLeft(currentNode)
                elif grandParent.element < parent.element and parent.element > element:
                    currentNode = zigZagRight(currentNode)
                elif grandParent.element > parent.element and parent.element < element:
                    currentNode = zigZagLeft(currentNode)
                parent = currentNode.parent
                if parent is not None and parent.parent is not None:
                    grandParent = parent.parent
                    grandParent = None
            # Two cases: either currentNode is at the root, or not.
            if currentNode.parent is not None:
                currentNode = zig(currentNode)
            # Done.
    return currentNode

def zigZigLeft(currentNode):
    parent = currentNode.parent
    grandParent = parent.parent
    currentNode.parent = grandParent.parent
    grandParent.right = parent.left
    parent.right = currentNode.left
    parent.left = grandParent
    grandParent.parent = parent
    currentNode.left = parent
    parent.parent = currentNode
    if grandParent.right is not None:
        grandParent.right.parent = grandParent
    if parent.right is not None:
        parent.right.parent = parent
    return currentNode

def zigZigRight(currentNode):
    parent = currentNode.parent
    grandParent = parent.parent
    currentNode.parent = grandParent.parent
    grandParent.left = parent.right
    parent.left = currentNode.right
    parent.right = grandParent
    grandParent.parent = parent
    currentNode.right = parent
    parent.parent = currentNode
    if grandParent.left is not None:
        grandParent.left.parent = grandParent
    if parent.left is not None:
        parent.left.parent = parent
    return currentNode

def zigZagLeft(currentNode):
    parent = currentNode.parent
    grandParent = parent.parent
    currentNode.parent = grandParent.parent
    parent.right = currentNode.left
    grandParent.left = currentNode.right
    currentNode.left = parent
    parent.parent = currentNode
    currentNode.right = grandParent
    grandParent.parent = currentNode
    if parent.right is not None:
        parent.right.parent = parent
    if grandParent.left is not None:
        grandParent.left.parent = grandParent
    return currentNode

def zigZagRight(currentNode):
    parent = currentNode.parent
    grandParent = parent.parent
    currentNode.parent = grandParent.parent
    parent.left = currentNode.right
    grandParent.right = currentNode.left
    currentNode.right = parent
    parent.parent = currentNode
    currentNode.left = grandParent
    grandParent.parent = currentNode
    if parent.left is not None:
        parent.left.parent = parent
    if grandParent.right is not None:
        grandParent.right.parent = grandParent
    return currentNode

def zig(currentNode):
    parent = currentNode.parent
    # Are we going from the left, or the right?
    if parent.element > currentNode.element:
        currentNode.parent = parent.parent
        parent.left = currentNode.right
        currentNode.right = parent
        if parent.left is not None:
            parent.left.parent = parent
        if currentNode.right is not None:
            currentNode.right.parent = currentNode
        currentNode.parent = parent.parent
        parent.right = currentNode.left
        currentNode.left = parent
        if parent.right is not None:
            parent.right.parent = parent
        if currentNode.left is not None:
            currentNode.left.parent = currentNode
    return currentNode

def findLargest(currentNode):
    while currentNode.right is not None:
        currentNode = currentNode.right
    return currentNode

This particular implementation of splay trees is non-recursive. It requires overhead in the form of nodes pointing to their own parents. Another implementation could be done using a stack, which would record the access path whenever insertions or retrievals are performed. Both are limited in that they require additional space. An alternative implementation is a “top-down” splay tree implementation, which requires O(1) additional space, rather than the O(N) additional space required in the given implementation or O(log N) additional space required for a stack-based implementation. Additional research on these is left as an exercise for the reader (or, perhaps, me, if I choose to revisit this topic).

Deadlift: Form Criticism on Heavy Lifts

A common issue with the deadlift is lower-back rounding. This is something that I really struggle with.

Of course, for lighter deadlifts, it’s easy to maintain form. Once the weight starts reaching twice or more one’s own bodyweight, back-rounding becomes an issue. Evidently, there is quite a bit of variation with how much the back can and should round during the deadlift. As a trainer who doesn’t use a weightlifting belt, I worry about injuries.

Here’s a video of a 335lb deadlift I performed. Pay close attention to how rounded my back becomes:

As soon as I begin pulling the weight up, my back rounds. I didn’t injure myself, but I felt an enormous strain in my lower back, and none in my glutes. I had to take some time off weightlifting after this was filmed in order to recover.

So, what’s the problem? Why did my lower back round? Is that even a big problem?

There are a lot of opinions about lower back rounding, whether it’s good or bad, why it happens, how much it should or shouldn’t happen, and so forth. What became clear to me after a lot of practice and a lot of research are two things.

1. The hips must be located as far backwards as possible.

A lot of elite powerlifters round their backs on their deadlift. Period. They don’t get injured and they set records. Back rounding is not horrible. However, back rounding is often a sign that the lifter is relying more on the lower back than the gluteus maximus in order to complete the lift. Since the lower back will be involved in the lift, no matter what, and the gluteus maximus muscle is bigger than the lower back muscles, correcting form to involve the gluteus is necessary to increase one’s lift numbers.

In order to promote the involvement of the gluteus maximus in the lift, it is necessary to move the hips as far back as possible. If the back is maintained in a static position, nearly all of the work will be performed by the straightening of the hips, rather than the extension of the back. Take a look at a poor starting position, with hips quite far forward:


No good. The hips are very forward. A tell is the fact that the calves are not perpendicular to the ground, but are in fact forward. This position will cause a very quick contraction of the hamstrings, sending the hips into the air while the weight is still low, and forcing the lower back to do the extension work.

Here’s a much better starting position, with the hips backward:


The hips are back, and the calves are perpendicular to the ground. The body’s centre of gravity is so far backward that if the lifter is not in contact with the bar, he or she will fall backwards. If the straightness of the back is maintained, the hip extension will drive the strength of the lift, rather than any work done by the back.

2. During the lift, the weight must be pulled backwards, not upwards.

The big reason why people get injured when doing the deadlift is because the bar moves away from their bodies. The farther the bar is from your body, the more your lower back will become involved in stabilizing your body. If the bar is closer to your body, your lower back won’t have to do as much work to stabilize your body. However, when the bar lifts off the ground, the lifter’s centre of gravity naturally shifts forward (since the weight is heavier than the lifter, usually), causing the bar to swing away from the lifter’s body. To correct this, the lifter should pull the weight backwards, rather than lift the weight upwards.

This is more of a mental aid than an actual physical action, but it’s very relevant. With the hips pulled as far backwards as possible, the lifter’s centre of gravity is balanced when the weight is on the floor. However, as soon as the weight lifts off the floor, even a little bit, the lifter and the weight’s combined centre of gravity goes forward immediately. In order to minimize the action of the lower back and engage the gluteus maximus in the lift, the lifter has to shift his or her centre of gravity backwards again. A good mental aid for this is to imagine that one is pulling the weight backwards, not upwards. This will keep the weight close to the body and the centre of gravity back, on the lifter’s heels.

I incorporated these two ideas into my lift, and tried again, this time with 295. My back remained quite a bit straighter.

Even in this video, you can see my weight shift forward as soon as the weight comes off the ground, and me trying to combat that by pulling backwards. Also, 295lb isn’t too heavy, so my form wasn’t seeing a huge sacrifice due to the heavy weight. Gotta keep practicing.

So, to sum up: pull the hips back in the starting position, and pull the weight backwards, not upwards.

Java’s Garbage Collector

I first heard of Java’s garbage collector when I was 20. I was on a job interview, and the interviewer asked me a very simple question (how to reverse a string, I think). I didn’t know much about Java, so I gave him a solution in C, using pointers. I ended my example code by freeing up the memory I’d allocated. He laughed, saying “Delete? In Java, we have the garbage collector, so we don’t have to worry about freeing allocated memory.”

Immediately after the interview, I researched this topic. Not worrying about memory management? What a concept. My naive understanding of programming couldn’t fathom not worrying about deleting allocated memory spaces.

And yet, that’s exactly what Java’s garbage collector does. Like a wonderful little housemaid, it chases after all of the memory you’ve left unreferenced, cleaning it up so you don’t run out of memory. How nice.

I think that many programmers would be satisfied with the knowledge that they didn’t have to deallocate their objects and leave it at that. For all practical, everyday purposes, this is enough. But there’s more to the story. That’s what this post is about. We’re also going to find out more about what kinds of flags we can pass to the JVM that govern the behaviours of the garbage collector. For example, if you have Eclipse on your computer, take a peek inside eclipse.ini, and you’ll see the following lines (interspersed among others):


By the end of this post, you’ll know what all of those are for.


Let’s start with what the garbage collector actually does.

Java’s garbage collector is a component of Java’s mechanism of automated memory management, allocating and deallocating the necessary memory from the application heap space. I phrased this very specifically for a reason: the name “garbage collector” implies that it takes care of recycling objects as its principle operation. It is easy to arrive at this conclusion, because a big – perhaps the biggest – benefit of the garbage collector is the elimination of the use of deallocation operations (like free or delete) and the reduction of memory leaks (in theory).

But the garbage collector process is just the opposite: the garbage collector tracks objects that are still alive, not the dead objects, and assumes everything else is dead. After all, the flip side of Java’s memory management system is true as well: not only can we let go of any worries about deletions, but we can also let go of any worries about allocation. A lot of programmers who wax poetic about the wonders of the garbage collector only seem to remember that the frees from C are gone. What they leave out is that the allocations are gone as well:

struct FunnyStruct *thingy =
    (struct FunnyStruct *) malloc(sizeof(struct FunnyStruct));

The clear benefit of the garbage collector is that developers can forget about allocating and deallocating memory and leave memory management up to the virtual machine.

So, we know the function of the garbage collector, and we know how it impacts programmers. But how does garbage collection actually work?

How to Trigger the Garbage Collection Process

You can’t.

Boo hoo.

The garbage collector is part of Java’s so-called automated memory management system for a reason: it’s because the application developer is not allowed to control it. For all intents and purposes, garbage collection in Java is non-deterministic. We could get into a whole big debate about whether the garbage collector certain operating system JVMs operates in a lexically deterministic way and so on, but the raw fact is that you can’t trigger the garbage collector (System.gc() doesn’t trigger the garbage collector immediately, or even at all, so get that out of my face).

Glad that’s out of the way. Let’s go on.

Mark & Sweep

“Mark & sweep” is the name given to the algorithm that cleans up all of the garbage in the heap. As you might imagine, it’s a two-step process:

  1. Mark: objects that are still alive are marked as being alive.
  2. Sweep: objects not marked as alive are deleted (swept into a waste-bin, or some other such metaphor).

Both steps are surprisingly involved and will be examined in detail.


The “mark” step has the task of finding dead objects and recycling them (more specifically, of finding objects that are alive and not deleting them). Objects that are being referenced by a pointer are considered alive. The default behaviour of the sweep step (to follow) is to scrub the entire heap, freeing it all for new allocation. The mark step needs to find all referenced objects and designate them as being alive. At first, this seems fairly straightforward: when the garbage collection process begins, all it should do is find all of the variables currently being referenced in live stack frames and delete everything else.

This misses other threads. Darn. The variables of live threads have to be tracked, too. And, while we’re at it, static variables have to be tracked, as well. And so do JNI references, since JNI references are Java objects that have been created by JNI calls, and the JVM can’t tell if the generated Java bytecode is still being referenced by the native code. So, that’s got to be tracked, too.

If we think about it, these four things – local variables, active Java threads, static variables and JNI references – are part of a special set we can call “garbage collection roots”. In Java, every object must be referenced by some other object. All objects are either garbage collection roots (or GC roots), or they are not. GC roots are always reachable at any time, so we can use these GC roots as the roots of a tree traversal, hitting object references on the way. Any object that is not part of a tree beginning at a GC root is not being referenced and can be garbage collected.

Why are the four members of the GC roots set so special?

  • Local variables: local variables are always available for reference, either on the local stack frame or when we unroll the stack. If the examined stack is alive, the variables are alive, and we’ll always be able to reach their referenced objects.
  • Active Java threads: this is essentially a cover for the odd case of thread-local variables, since these are missed whenever we unroll the stack, which only covers the currently-running thread. It ensures that we’re not missing things due to parallelism.
  • Static variables: since static variables do not need instances of their objects to be referenced, they can be referenced at any time.
  • JNI references: this one is really a stretch, but makes perfect sense. The native code that made the JNI call could refer to the generated Java bytecode at any time, and because the native code is not in the JVM, the JVM can’t figure out if future references to the JNI bytecode will occur. The JVM therefore doesn’t clean it up, so it can always be referenced.

If we start an object traversal from any one of these roots, we will wind up hitting every object that still has a reference to it:


When the garbage collector is triggered, the non-reachable objects will not be marked by the “mark” step, and once the sweep begins, well, the fat lady will be singing for those poor, unfortunate non-reachable objects.

This works perfectly well for the sorts of memory leaks that we’re used to in manually memory-managed languages. However, in Java, we are subject to a different, much worse kind of memory leak: the memory leaks caused by forgotten references.


This is exactly what it sounds like: sometimes, programmers forget about references, and the references stay alive. Forever. Memory leak. In fact, this should be called a memory tsunami. This isn’t just a hole in the side of the ship, because there is no programmatic way to detect these forgotten references. Some analysis softwares can point out suspicious objects, but cannot state with certainty that an object is being referenced when it shouldn’t be.

There’s no way to escape programmer error except rigorous code review, so programmer fallibility will always result in memory leaks of some kind.

Let’s move on.


Recall that memory in the heap space is allocated sequentially:


After the mark step of the mark & sweep algorithm, certain parts of the heap will be marked:


Remember: the mark step marks memory that is referenced, and leaves unreferenced heap space unmarked. The sweep step will free all unmarked memory.

The sweeping mechanism currently in usage is a generational garbage collector. I’ll get into what that is in just a second, but let’s take a look at how a simple deletion-and-compacting mechanism would work.

First, the memory not marked would be designated as free, i.e. the memory manager would allow new objects to be allocated in those spaces:


Then, the allocated memory would be defragmented, to make future allocations more efficient:


Naturally, having to compact the entire heap space is very inefficient and very slow. And, in fact, there is one caveat of which we can take advantage that will significantly speed up the sweeping step: most objects exist for very short periods of time. This makes intuitive sense, especially in an object-oriented language, where we’re constantly going into and out of objects to execute code. It has also been verified empirically, to the point where the nature of the garbage collector has been altered to take advantage of this fact.

The generational garbage collection model enforces a structure on the heap: the “young generation” space, the “old generation” space, and the “permanent generation” space:


A very quick overview before a more in-depth analysis: all new objects are created in the “young generation” space. If they are still alive and after a certain time period has elapsed, objects are moved into the old generation space. Garbage collections are kicked off whenever the young or the old generation spaces are maxed out.

The young generation space is actually divided into three parts: “Eden” and two “survivor” spaces. All objects that have just been allocated go into the Eden space:


When the Eden space becomes full, a minor garbage collection is triggered. A minor garbage collection only targets objects in the young generation space. The objects in the old generation space are left alone. A minor garbage collection goes through the mark and sweep step described above, including compacting the memory. However, the memory isn’t compacted in the Eden space; everything leftover moves into the first survivor space. In this example, two objects from the Eden space are still being referenced, so they get moved to S0:


The surviving objects are also given a number, which is incremented every time a minor garbage collection occurs. The next time the Eden space fills up and a minor garbage collection occurs, the objects in the Eden space and the first survivor space are subject to mark and sweep. The surviving objects then get moved into the second survivor space. In this example, one object from the Eden space and one object from S0 are still be referenced, so they get moved to S1:


Note that the object moved from the Eden space has an age of 1, and the object from S0 has an age of 2.

This same process of ageing continues until a surviving object reaches a requisite age. Once the object or objects reach that age, they are moved to the “old generation” space.

The management of the old generation space is much simpler. Once the old generation becomes full, a major garbage collection is triggered. A major garbage collection causes a mark & sweep operation on the entire heap (not just the old generation space), which causes the young generation to go through an ageing cycle as well as frees and compacts the old generation space. The slowdown caused by this over the old generation space is tolerable, because major garbage collection events occur so infrequently, but major garbage collection events are very, very slow. Comparatively speaking, minor collection events are unnoticeable.

So, that’s it for the young and old generation spaces. What about the “permanent generation” space? The JVM loads the permanent generation space with metadata. The JVM can unload the permanent generation space if necessary, and the permanent generation space is subject to sweeping during a major garbage collection event, so it’s not really all that permanent, but it’s about as permanent as we’ll get.

One last note: for any kind of garbage collection – minor or major – all threads in the entire application are completely paused until the garbage collection completes. This isn’t so bad for minor garbage collections, but is a significant event for major garbage collections.

Configuring the Garbage Collector and Heap Space

For your average prototype, the following settings make no difference in the performance of an application:


Setting these correctly in a production environment, however, can prevent showstopping slowdowns. Here’s what they each mean:

  • -Xms: The initial heap size when the JVM starts.
  • -Xmx: The maximum heap size.
  • -Xmn: The size of the young generation space.
  • -XX.PermSize: The initial heap size of the permanent generation.
  • -XX.MaxPermSize: The maximum size of the permanent generation.

That is all there is about garbage collectors. So, go forth, and collect.