The Importance of Being Python: April 2015

Saturday 4 April 2015

Throwback Sunday: Looking back at "Adapting in_between() for Binary Search Trees"

Alex makes a good point about art and programming in her SLOG. I couldn't agree more with her. Programming can be a tricky mistress sometimes, yet always creative and fun. This last semester I have had some experience helping my wife with her Art & Code projects and, though Java is not my area of expertise, I did learn a lot about what programming can do for the arts. Moreover, being creative can lead you to new ways to solve old problems. Hence, we will look back at one of the problems seen previously, Adapting in_between() for Binary Search Trees, and see how new ideas on recursiveness could shape a new answer to this question, namely Easy Recursion.

Since adapting some code by adding new lines is hardly groundbreaking, let's come back with a new solution involving our triad of One Function, One Helper and One Global Variable. The question was how to look for values inside a BST that satisfies a group of parameters (ie range of numbers, multiple of 3). This can be done rather easily with the triad. Take a look at the code below:

_gather_values = []

def in_between(node, start, end):

    _gather_values.clear()
    
    _get_values(node, start, end)
    
    return _gather_values


def _get_values(node, start, end):
    
    if node.data > start and node.data < end and node.data % 3 == 0:
        _gather_values.append(node.data)
        
    if len(node.children) != 0:
        for x in node.left:
            _get_values(x, start, end)
        for y in node.right:
            _get_values(y, start, end)

Now, it does have a major flaw: it does not take advantage, nor tries to take advantage of, the properties of BSTs. The code at the end of the helper function could be changed to cover the properties of BSTs (left branch goes down, right branch goes up in value). Regardless, this could be improved quickly. The point was to show that this code, which solves the problem, could be constructed easily.

Looking back at that old post, I can see how many different ways you can solve a problem. Creatively, or not, it is always exciting. Whilst I aim for writing elegant code, this does not negate the power of quick, easy solutions. Let us keep coming with creative solutions, and improve upon our code writing.

Holy Mutations! Things getting dicey from now on...

We have done a lot work traversing trees and linking nodes. While it has been all fun, we now need to step things up a notch: we need to mutate the contents inside these trees and linking nodes. Let us remember that the data in each node is mutatable, and thus can be changed at any spot in time. Let us observe at one of the examples of mutation, in this case, of binary search trees.

Delete in Binary Search Trees
Algorithm
(Note: Remember that in a BST, the left child is smaller than the node, and the right child is bigger than the node)

If the node is not a node, then there is nothing to do
If data is less than the data in the node, go down the left (which has smaller values)
If data is more than the data in the node, go down the right (which has bigger values)
If node has fewer than two children, then return the existing one (which essentially brings down the only child if the deleted node has only got one child)
Otherwise, if both children exist, replace the data with the data in the largest child in the left subtree and recursively work on the data from the left node, and finally return that node

On this final step, it is important to notice that, when deleting an element from a BST, you need to pull the item from the left child since the right child is always bigger than the node. To make this clear, we could follow this example:

8
6
5
4
3
2
1

Let's delete the five (5). Then we need to to grab a value to replace the five, and it only makes sense to take the largest child from the left. In this case it would be the four (4). The four is still smaller than the rest of the right children, and bigger than the left children. Then we would get the following:

8
6
4
3
2
1

Code

This code was taken from class, so credit to whoever wrote it! It is shown here to give more context to what was done above.

def delete(node, data):
    ''' (BTNode, data) -> BTNode

    Delete, if it exists, node with data and return resulting tree.

    >>> b = BTNode(8)
    >>> b = insert(b, 4)
    >>> b = insert(b, 2)
    >>> b = insert(b, 6)
    >>> b = insert(b, 12)
    >>> b = insert(b, 14)
    >>> b = insert(b, 10)
    >>> b = delete(b, 12)
    >>> print(b)
            14
        10
    8
            6
        4
            2
    <BLANKLINE>
    '''
    return_node = node
    if not node: # Step 1 from the algorithm
        pass
    elif data < node.data: # Step 2 from the algorithm
        node.left = delete(node.left, data)
    elif data > node.data: # Step 3 from the algorithm
        node.right = delete(node.right, data)
    elif not node.left: # Step 4 from the algorithm
        return_node = node.right
    elif not node.right: # Step 4, again, from the algorithm
        return_node = node.left
    else: # Step 5 from the algorithm
        node.data = _find_max(node.left).data
        node.left = delete(node.left, node.data)
    return return_node

There is a helper function that traverses the right children for the highest value. It is used on the function above so that the highest value from the left child branch is brought and used as the new data of the node, and everything is brought down the left children. After all, the highest value from the left child will be found on the left child's right branch.

def _find_max(node):
    ''' (BSTNode) -> BSTNode
    
    Return maximal node in the BST rooted at node.
    Assume node is not None.
    
    >>> b = BTNode(8)
    >>> b = insert(b, 4)
    >>> b = insert(b, 2)
    >>> b = insert(b, 6)
    >>> b = insert(b, 12)
    >>> b = insert(b, 14)
    >>> b = insert(b, 10)
    >>> _find_max(b).data
    14
    >>> _find_max(b.left).data
    6
    '''
    return _find_max(node.right) if node.right else node

The Challenge
I wish I could say this could is easy to understand. Once you analyze it, it makes sense how it works and what it does. Regardless, there is a difference between knowing what to do and knowing how it needs to be done. The algorithm is not that intuitive, and requires close attention to notice exactly why things happen, and then how.

Friday 3 April 2015

Easy Recursion: One Function, One Helper, One Global Variable

Just like Andrew said in his SLOG, when working on recursive code, it is often the case that simple instructions will not solve a problem. But what if I were to tell you that there is a simple way to achieve recursion? You would probably tell me I am lying, or otherwise say that I have probably broken Python (just like Dima told me when I mentioned I found a way). So what is the solution I found? It can be broken down into these elements:

One Global Variable: Holds the information gathered by recursive code
One Helper Function: Uses recursion to gather information into the global variable
One Function: Clears the global variable, executes the helper function, and returns the global variable

Of course I would be penalized for efficiency, but bear in mind that this is an easy solution to what can seem a murky task to many. It uses recursion, yes, and it solves a problem, yes. Later on when you have more experience programming you can ditch this, but as a temporary solution, it should work great.

Let's put things into perspective. This is what we have seen in class:

def list_all(t):
    ''' (Tree) -> list

    Return list of values in t.

    >>> t = Tree(0)
    >>> list_all(t)
    [0]
    >>> t = descendents_from_list(Tree(0), [1, 2, 3, 4, 5, 6, 7], 3)
    >>> L = list_all(t)
    >>> L.sort()
    >>> L
    [0, 1, 2, 3, 4, 5, 6, 7]
    '''
    # implicit base case when len(t.children) == 0
    return [t.value] + gather_lists([list_all(c) for c in t.children])

(Credit to whoever did this code from Lab 05 solution)

This code is elegant, short, beautiful. It uses a helper function gather_lists(), which concatenates all sublists, and that is about it. The return line is only one line long. Now this will look like a travesty: I will do exactly the same, but using the method described before.

# Global variable here will hold all the info gathered
_all_elements = []

def list_all(t):
    ''' (Tree) -> list

    Return list of values in t.

    >>> t = Tree(0)
    >>> list_all(t)
    [0]
    >>> t = descendents_from_list(Tree(0), [1, 2, 3, 4, 5, 6, 7], 3)
    >>> L = list_all(t)
    >>> L.sort()
    >>> L
    [0, 1, 2, 3, 4, 5, 6, 7]
    '''

    # Clear the global variable first
    _all_elements.clear()

    # Call helper function
    _get_elements(t)
    
    # Return global variable
    return(_all_elements)
    
def _get_elements(t):
    ''' (Tree) -> None
    
    Gather all objects from the tree rooted at t in the global
    variable _get_elements
    '''
    
    # Append value to global variable
    _all_elements.append(t.value)
    
    # If there are children...
    if len(t.children) != 0:
        # Recursively work with the children
        for x in t.children:
            _get_elements(x)

So it does the same, but it is easier to understand, and easier to construct. It is also longer, less elegant, and probably not very efficient. It might be vulnerable to outside interference since it relies on using a global variables as a temporary vehicle.

Challenges

Recursion can be very challenging, but having a way to split the steps to visualize it better, while achieving results, can be encouraging. I encourage everyone to apply this method when unsure about what to do.

The challenge, hence, is how to get from this method to the more elegant, efficient one. And also the question on whether it consumes much more computation power than the elegant method. Perhaps it is not way better, but only slightly better.