Creating simple JSON using recursion

Fabian (TreeFam's fearless leader) had another email request: how do I generate simple JSON by traversing a tree object? The end result needed to be a nested pseudo-object structure where each object has a 'name' field and, optionally, a 'children' field that holds a list of similar pseudo-objects. This is the input format for a rather attractive tree viewer widget that is explained here and that might be adopted for future releases of TreeFam.

A solution for this is given in the code below. The neat thing here is that this is a very simple way of showing recursion being useful. Once you "get" it you'll see very many applications for it but it takes a while to just trust the magic and not overanalyze how it is actually possible.

(As a side note: I first learned how to use recursions on trees when Wayne Maddison tried to teach me how to make myself useful with the Mesquite source code, which uses recursion a lot. He convinced me to just have faith and not think too much about it. Thanks Wayne!)

#!/usr/bin/perl
use strict;
use warnings;
use Bio::Phylo::IO 'parse_tree';
use JSON;

# this just reads the newick string in the __DATA__ section below
my $tree = parse_tree(
    '-handle' => \*DATA,
    '-format' => 'newick',
);

# here we produce a nested data structure for which a simple mapping
# to JSON is provided by the standard CPAN module. 
my $result = traverse( $tree->get_root );

# this produces the output in pretty printed format
print JSON->new->pretty->encode($result);

# this is the important bit: this subroutine is first called with
# the root as argument (on line 15), and subsequently on the children 
# and the children's children, and so on
sub traverse {
    my $node = shift;
    
    # initializes a data structure equivalent to the JSON syntax 
    # for the focal node
    my $result = { 'name' => $node->get_name };
    
    # the 'children' property is optional, so only create one for
    # internal nodes
    if ( my @children = @{ $node->get_children } ) {
    
        # here the 'children' property is magically filled in
        # recursively
        $result->{'children'} = [ map { traverse($_) } @children ];
    }
    return $result;
}

__DATA__
(((CaenA,CaenB)Caenorabditis,(DrosA,DrosB)Drosophila)Deuterostomia,
(Mus_musculus,(Homo_sapiens,Pan_troglodytes)Homininae)Protostomia)Metazoa;

Which produces the following output:

{
   "name" : "Metazoa",
   "children" : [
      {
         "name" : "Deuterostomia",
         "children" : [
            {
               "name" : "Caenorabditis",
               "children" : [
                  {
                     "name" : "CaenA"
                  },
                  {
                     "name" : "CaenB"
                  }
               ]
            },
            {
               "name" : "Drosophila",
               "children" : [
                  {
                     "name" : "DrosA"
                  },
                  {
                     "name" : "DrosB"
                  }
               ]
            }
         ]
      },
      {
         "name" : "Protostomia",
         "children" : [
            {
               "name" : "Mus_musculus"
            },
            {
               "name" : "Homininae",
               "children" : [
                  {
                     "name" : "Homo_sapiens"
                  },
                  {
                     "name" : "Pan_troglodytes"
                  }
               ]
            }
         ]
      }
   ]
}

No comments:

Post a Comment