Archive for December, 2013


Perl: Processing a CSV or Tab-Delimited File into a Hash of Hashes

Saturday, December 28th, 2013

Perl is a great language when it comes to processing data. The main reason is that its powerful regular expression support makes it easy to find the data you’re looking for.

When you have a file that is in CSV (comma-delimited), tab-delimited, or some other delimited format, Perl is the perfect option for reading in the file and doing something with its data. In this example we will store it into a hash of hashes.

Let’s jump right into the code.

First, we initialize an empty hash where we’ll store the data, then open the file for reading:

my %file_data = ();
unless (open (IN, "<", "tab_file.txt"))
{
    print STDERR "ERROR: Could not open input file tab_file.txt: $!\n";
    exit 1;
}

Now we will read in each line in the file:

while (my $line = <IN>)
{

Get rid of the ending newline:

    chomp($line);

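Here’s the key to getting the data on the line. Keep in mind that a plain split does not understand quoted fields, so CSV data with commas inside quoted values needs a real CSV parser. For simple delimited data, we split the line (here on the tab character, but it can be any character or pattern) into parts and store those parts in an array. The third argument, -1, tells split to keep trailing empty fields, which it would otherwise drop:

    my @line_info = split(/\t/, $line, -1);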

Let’s say the key field to the line’s information is in the first field, so we store that temporarily:

    my $key_fld = $line_info[0];

Now we’re going to iterate through the tab-delimited fields from the line and store them in a hash of hashes for easy retrieval later on. The first line here creates a reference to the array. The second line loops over the array’s indexes through that reference ($#$line_info is the index of the last element). The line inside the loop stores each field in the hash of hashes: the first key is the first field on the line, which we saved above, and the second key is just the number of the field. (This might not be the most practical way to store the data, but you get the idea.)

    my $line_info = \@line_info;
    for (my $i = 0; $i <= $#$line_info; $i++)
    {
        $file_data{$key_fld}{$i} = $line_info->[$i];
    }
}
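
As noted above, keying each field by its number is not always the most practical layout. Here is a minimal sketch of one alternative, assuming the first line of the file is a header row that names the columns (the example above has no such header; that part is an assumption). The sample data and column names are made up, and the data is read from an in-memory string so the sketch runs on its own:

```perl
use strict;
use warnings;

# Hypothetical sample data; the first line is assumed to be a header row.
my $sample = "id\tname\tcity\n"
           . "101\tAlice\tDenver\n"
           . "102\tBob\t\n";            # note the trailing empty field

# Open the string as a file so the loop reads just like a real file would.
open(my $in, '<', \$sample) or die "Could not open in-memory data: $!\n";

# Read the header row and split it into column names.
my $header = <$in>;
chomp($header);
my @columns = split(/\t/, $header, -1);

my %file_data;
while (my $line = <$in>) {
    chomp($line);
    # A limit of -1 keeps trailing empty fields, which split drops by default.
    my @line_info = split(/\t/, $line, -1);
    my $key_fld   = $line_info[0];
    for my $i (0 .. $#line_info) {
        # Second-level key is the column name instead of the field number.
        $file_data{$key_fld}{ $columns[$i] } = $line_info[$i];
    }
}
close($in);

print "$file_data{101}{name} lives in $file_data{101}{city}\n";
```

With this layout a lookup reads as $file_data{101}{name} rather than $file_data{101}{1}. The sketch assumes every line has no more fields than the header does.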

We close our input file, since we’re done reading it:

close(IN);

Now if we want to iterate through the hash of hashes we created with the data, we can do something like this. Hash keys come back in no particular order, so we sort them to get predictable output:

foreach my $key (sort keys %file_data)
{
    print "Key field: $key\n";
    foreach my $key2 (sort { $a <=> $b } keys %{$file_data{$key}})
    {
        print "- $key2 = $file_data{$key}{$key2}\n";
    }
}
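
The point of the hash of hashes is that a single record can also be fetched directly, without looping. A small sketch, using a hand-built %file_data with made-up values in the same shape the loop above produces:

```perl
use strict;
use warnings;

# Hypothetical data in the same layout: key field => { field number => value }.
my %file_data = (
    101 => { 0 => '101', 1 => 'Alice', 2 => 'Denver' },
    102 => { 0 => '102', 1 => 'Bob' },
);

my $wanted = 101;

# Check that the key exists first, so a missing key is not created by accident.
if (exists $file_data{$wanted}) {
    print "Field 1 for $wanted: $file_data{$wanted}{1}\n";
}

# Count how many fields were stored for this record.
my $field_count = scalar keys %{ $file_data{$wanted} };
print "Fields stored for $wanted: $field_count\n";
```

One thing to keep in mind with this layout: if the same key field appears on more than one line of the file, the later line overwrites the fields stored for the earlier one.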

And that’s it. We’re done!

Welcome to PHP!

Saturday, December 28th, 2013

This category will contain information about PHP.

Welcome to Perl!

Saturday, December 28th, 2013

This category will contain information about Perl.

Welcome to JavaScript!

Saturday, December 28th, 2013

This category will contain information about JavaScript.

Welcome to HTML/CSS!

Saturday, December 28th, 2013

This category will contain information about HTML/CSS.

Welcome to AJAX!

Saturday, December 28th, 2013

This category will contain information about AJAX.