Tag: Perl

More on Bio-lab Automation – Software for Controlling FIAlab Devices for Microfluidics

Posted by – October 9, 2009

The initial version of my Perl software to control a lab syringe pump and valve device, for biology automation, was finished today. It works great.  Next, I need to add the network code so that it can be controlled remotely and in synchronization with other laboratory devices, including the bio-robot.  This software will be used in the microfluidics project.  The software is also part of the larger Perl Robotics project, and a new release will be posted to CPAN next week.
[Images: FIAlab MicroSIA Valve and Syringe System; FIAlab MicroSIA Experimental Setup]

More details on the software follow:


Perl Bio-Robotics module, Robotics.pm and Robotics::Tecan

Posted by – July 30, 2009

FYI for Bioperl developers:

I am developing a module for communication with biology robotics, as discussed recently on #bioperl, and I invite your comments. Currently this module talks to a Tecan Genesis workstation robot. Other vendors include Beckman Biomek, Agilent, etc. As far as I have found, no such modules exist anywhere on the ‘net, with the exception of some Visual Basic and LabVIEW scripts. There are some computational biologists who program robots via high-level software, but these scripts are not distributed as OSS.

With Tecan, there is a datapipe interface for hardware communication, as an added $$ option from the vendor. I haven’t checked other vendors to see if they likewise have an open communication path for third-party software. Once third-party communication is allowed, the natural next step is to create a socket client-server, especially as the robot vendor only supports MS Windows, and using the local machine has typical Microsoft issues (like losing real-time communication with the hardware due to GUI animation, poor operating system stability, no Unix except Cygwin, etc.).

On Namespace:

I have chosen Robotics and Robotics::Tecan. (After discussion regarding the potential name of Bio::Robotics.)  There are many s/w modules already called ‘robots’ (web spider robots, chat bots, www automate, etc) so I chose the longer name “robotics” to differentiate this module as manipulating real hardware. Robotics is the abstraction for generic robotics and Robotics::(vendor) is the manufacturer-specific implementation. Robot control is made more complex due to the very configurable nature of the work table (placement of equipment, type of equipment, type of attached arm, etc). The abstraction has to be careful not to generalize or assume too much. In some cases, the Robotics modules may expand to arbitrary equipment such as thermocyclers, tray holders, imagers, etc – that could be a future roadmap plan.

Here is some theoretical example usage below, subject to change. At this time I am deciding how much state to keep within the Perl module. By keeping state, some robot programming might be simplified (avoiding deadlock or tracking tips). In general I am aiming for a more “protocol friendly” method implementation.

To use this software with locally-connected robotics hardware:

    use Robotics;
    use Robotics::Tecan;

    my %hardware = Robotics::query();
    if ($hardware{"Tecan-Genesis"} eq "ok") {
        print "Found locally-connected Tecan Genesis robotics!\n";
    }
    elsif ($hardware{"Tecan-Genesis"} eq "busy") {
        print "Found locally-connected Tecan Genesis robotics but it is busy moving!\n";
        exit -2;
    }
    else {
        print "No robotics hardware connected\n";
        exit -3;
    }
    my $tecan = Robotics->new("Tecan") || die;
    $tecan->attach() || die;    # initiate communications
    $tecan->home("roma0");      # move robotics arm
    $tecan->move("roma0", "platestack", "e");    # move robotics arm to vector's end
    # TBD $tecan->fetch_tips($tip, $tip_rack);   # move liquid handling arm to get tips
    # TBD $tecan->liquid_move($aspiratevol, $dispensevol, $from, $to);

To use this software with remote robotics hardware over the network:

  # On the local machine, run:
    use Robotics;
    use Robotics::Tecan;

    my @connected_hardware = Robotics->query();
    my $tecan = Robotics->new("Tecan") || die "no tecan found in @connected_hardware\n";
    $tecan->attach() || die;
    # TBD $tecan->configure("my work table configuration file") || die;
    # Run the server and process commands
    while (1) {
      my $error = $tecan->server(passwordplaintext => "0xd290"); # start the server
      # Internally runs communications between client->server->robotics
      if ($tecan->lastClientCommand() =~ /^shutdown/) {
        last;   # client asked for shutdown: leave the server loop
      }
    }
    $tecan->detach();   # stop server, end robotics communications

  # On the remote machine (the client), run:
    use Robotics;
    use Robotics::Tecan;

    my $server = "heavybio.dyndns.org:8080";
    my $password = "0xd290";
    my $tecan = Robotics->new("Tecan");
    $tecan->connect($server, $password) || die;

    # ... same as first example, with communication automatically routed over the network ...
    $tecan->detach();   # end communications
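
As a rough illustration of the client-server idea (and not the Robotics module's actual implementation), a line-based command server can be sketched with the core IO::Socket::INET module. The port number, the reply strings, the `home` command syntax, and the `handle_line` and `run_server` helpers are all assumptions made up for this sketch.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical dispatcher: maps one line of client text to a reply.
# Returns a two-element list: (reply text, keep-running flag).
sub handle_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;                                   # strip trailing newline/CR
    return ("bye", 0)   if $line =~ /^shutdown\b/;
    return ("ok $1", 1) if $line =~ /^home\s+(\w+)$/;     # e.g. "home roma0"
    return ("error unknown command", 1);
}

# Hypothetical server loop; a real one would forward commands to the robot
# instead of only echoing a status string.
sub run_server {
    my ($port) = @_;
    my $listener = IO::Socket::INET->new(
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "cannot listen on port $port: $!";

    my $running = 1;
    while ($running) {
        my $client = $listener->accept or last;
        while (defined(my $line = <$client>)) {
            (my $reply, $running) = handle_line($line);
            print {$client} "$reply\n";
            last unless $running;
        }
        close $client;
    }
    close $listener;
}

# run_server(8080);   # uncomment to listen; 8080 is only an example port
```

Keeping the command parsing in its own subroutine means it can be exercised without any socket or robot attached.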

Some notes for those who may also want to create Perl modules for general or BioPerl use:

  • Use search.cpan.org to get Module-Starter
  • Run Module-Starter to create new module from module template
  • Read Module::Build::Authoring
  • Read Bioperl guide for authoring new modules
  • Copy/write perl code into the new module
  • Add POD, perl documentation
  • Add unit tests into the new module
  • Register for CPAN account (see CPAN wiki), register namespace
  • Verify all files are in standard CPAN directory structure
  • Commit & Release
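
To make the steps above a little more concrete, here is a minimal sketch of what a new module file might contain once code, POD and a version number are in place. The package name `My::LabDevice` and its methods are hypothetical placeholders, not part of Robotics or BioPerl.

```perl
package My::LabDevice;    # hypothetical namespace, for illustration only

use strict;
use warnings;

our $VERSION = '0.01';    # the module's version number lives here

=head1 NAME

My::LabDevice - placeholder example of a documented module

=head1 SYNOPSIS

    my $dev = My::LabDevice->new(name => 'pump');
    print $dev->name, "\n";

=cut

# Constructor: stores an optional device name.
sub new {
    my ($class, %args) = @_;
    my $self = { name => defined $args{name} ? $args{name} : 'unnamed' };
    return bless $self, $class;
}

# Accessor for the device name.
sub name {
    my ($self) = @_;
    return $self->{name};
}

1;    # a module file must end with a true value
```

A matching unit test under t/ would typically load the module with Test::More and check `new` and `name`.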

Playing with the $100K Robots for Biology Automation

Posted by – June 26, 2009

The Tecan Genesis Workstation 200: It’s an industrial benchtop robot for liquid handling with multiple arms for tray handling and pipetting.

The robot’s operations are complex, so an integrated development environment is used to program it (though biologists wouldn’t call it an integrated development environment; maybe they’d call it a scripting application?), with a custom GUI-based graphical scripting language and script verification/compilation. Luckily, though, the application allows third-party software access and can control the robotics hardware using a minimal command set. So what to do? Hack it, of course; in this case, with Perl. This is only a headache due to Microsoft Windows incompatibilities and limitations (rarely is anything on Windows as straightforward as on Unix), so as usual with Microsoft Windows software, it took about three times longer than normal to figure out Microsoft’s quirks. Give me OS X (a real Unix) any day. Now, on to the source code!


Don’t Train the Biology Robot: Have the Machine Read the Protocol and Automate Itself

Posted by – June 3, 2009

Imagine reading these kinds of instructions and performing such a task for a few hours: “Resuspend pelleted bacterial cells in 250 µl Buffer P1 and transfer to a micro-centrifuge tube. Ensure that RNase A has been added to Buffer P1. No cell clumps should be visible after resuspension of the pellet. If LyseBlue reagent has been added to Buffer P1, vigorously shake the buffer bottle to ensure LyseBlue particles are completely dissolved. The bacteria should be resuspended completely by vortexing or pipetting up and down until no cell clumps remain. Add 250 µl Buffer P2 and mix thoroughly by inverting the tube 4–6 times. Mix gently by inverting the tube. Do not vortex, as this will result in…” (The protocol examples used here are from Qiagen’s Miniprep kit, QIAPrep.)

Wait a minute!  Isn’t that what robots are for?  Unfortunately, programming a bioscience robot to do a task might take half a day or a full day (or more, if it hasn’t been calibrated recently, or needs some equipment moved around).   If this task has to be performed 100 or 10,000 times then it is a good idea to use a robot.  If it only has to be done twice or 10 times, it may be more trouble than it’s worth.  Is there a middle ground here?

If regular English-language biology protocols could be fed directly into a machine, and the machine could learn what to do on its own, wouldn’t that be great?  What if these biology protocols could be downloaded from the web, from a site like protocol-online.org?   It’s possible! (Within the limited range of tasks that are required in a biology lab, and the limited range of language expected in a biology protocol.)

Biology Protocol Lexical Analyzer converts biology protocols to machine code for a robot or microfluidic system to carry out


The point of this prototype project is this: there are thousands of biology protocols in existence, and biologists won’t quickly transition to learning enough engineering to write automation code themselves (and using an “easy-to-use GUI” to train a robot also takes more effort than should be necessary). The computer itself should be used to bridge the language gap. Microfluidics automation platforms (Lab on Chip) may be able to carry out the bulk of busy work without excessive “training” required.
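
As a toy illustration of that idea (not the prototype's actual analyzer), a single regex can already pull a structured step out of some protocol sentences. Everything in this sketch is an assumption: the `parse_step` name, the two recognized verbs, the "Buffer XX" reagent pattern, and the output fields.

```perl
use strict;
use warnings;

# Hypothetical parser: turn one protocol sentence into a structured step.
# Only two verbs and one reagent naming pattern ("Buffer XX") are recognized.
sub parse_step {
    my ($sentence) = @_;
    # verb ... volume, unit, reagent; \x{b5} is the micro sign used in the unit
    if ($sentence =~ /^\s*(add|resuspend)\b.*?(\d+(?:\.\d+)?)\s*((?:\x{b5}|u|m)l)\s+(Buffer\s+\w+)/i) {
        my ($verb, $volume, $unit, $reagent) = (lc $1, $2, lc $3, $4);
        $volume *= 1000 if $unit eq 'ml';      # normalize to microliters
        return { action => $verb, volume_ul => $volume + 0, reagent => $reagent };
    }
    return;    # sentence not understood by this sketch
}

my $step = parse_step("Add 250 \x{b5}l Buffer P2 and mix thoroughly by inverting the tube.");
printf "%s %g ul of %s\n", @{$step}{qw(action volume_ul reagent)} if $step;
```

Sentences such as “Do not vortex” fall through and return nothing, which is where a real lexical analyzer would need a much richer grammar.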


In-Depth Review, Part 3 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – November 3, 2008

In my previous write-ups of Part 1 and Part 2, I traced the Perl code and examples in the first half of the book, Beginning Perl for Bioinformatics, by James Tisdall, highlighting different approaches to bioinformatics in Perl.  As I mentioned before, Perl provides many different (and often stylistic) approaches to solving a software problem.  These approaches usually differ in execution speed, code size, code scalability, readability / maintainability, simplicity, and use of advanced Perl semantics.  Since this is a beginning text, the advanced Perl isn’t covered. That means templates, which could be useful for parsing bioinformatics data, are one of the topics not included here.

Often, the fastest code is the smallest code, and contains subtle code tricks for optimization. This is a perfect setup, because, in Chapter 8, Tisdall starts parsing FASTA files.  With Perl’s parsing engine, the subtlety of the tricks leaves a lot of room for optimizing software.

FASTA & Sequence Translation

Tisdall offers a software problem based on the FASTA data, so time to solve it:

Tisdall: When you try to print the “raw” sequence data, it can be a problem if the data is much longer than the width of the page. For most practical purposes, 80 characters is about the maximum length you should try to fit across a page. Let’s write a print_sequence subroutine that takes as its arguments some sequence and a line length and prints out the sequence, breaking it up into lines of that length.

Compare his solution to mine:

# Solution by Tisdall
# print_sequence
# A subroutine to format and print sequence data

sub print_sequence {

    my($sequence, $length) = @_;

    use strict;
    use warnings;

    # Print sequence in lines of $length
    for ( my $pos = 0 ; $pos < length($sequence) ; $pos += $length ) {
        print substr($sequence, $pos, $length), "\n";
    }
}

The above is a straightforward, strings-based approach. I chose a regex approach, which took a couple of minutes to work out, though it should be faster at run time:

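sub dna_print {
  my $str = $_[0];
  # Assumes $str contains only word characters (e.g. A, C, G, T);
  # otherwise the substitution cannot consume the string.
  do {
    $str =~ s/^([\w]{0,25})//;   # strip up to 25 characters off the front
    print "$1\n";                # print the stripped chunk as one line
  } until (!length($str));
}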
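
A variation on the regex idea, offered only as a sketch and not as either author's code: a global match in list context can split the sequence in one pass and return the lines instead of printing them. The name `wrap_sequence` and the default width of 25 (borrowed from the `{0,25}` quantifier) are assumptions for illustration.

```perl
use strict;
use warnings;

# Split $sequence into chunks of at most $width characters.
sub wrap_sequence {
    my ($sequence, $width) = @_;
    $width = 25 unless defined $width;
    die "width must be a positive integer\n" unless $width =~ /^[1-9]\d*$/;
    # /g in list context returns every captured chunk; /s lets . match newlines
    my @lines = $sequence =~ /(.{1,$width})/gs;
    return @lines;
}

print "$_\n" for wrap_sequence("ACGTACGTACGT" x 5, 25);
```

Returning a list keeps the formatting step separate from output, so the same routine can feed a file, a test, or a web page.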

The above relies on the following method:


In-Depth Review, Part 2 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – September 8, 2008

My Part 1 of 5 review of the book, Beginning Perl for Bioinformatics, by James Tisdall, left off at Chapter 8, just before Tisdall explains associative arrays, gene expression, FASTA files, genomic databases, and restriction sites.

Tisdall: “For simplicity, let’s say you have the names for all the genes in the organism and a number for the expressed genes indicating the level of the expression in your experiment; the unexpressed genes have the number 0. Now let’s suppose you want to know if the genes were expressed, but not the expression levels, and you want to solve this programming problem using arrays. After all, you are somewhat familiar with arrays by this point. How do you proceed?”

Perl’s associative arrays (hashes) are one of the most powerful aspects of the language.  This is a good problem to examine using hashes.  Solutions to this kind of problem in other languages (C or MATLAB) might create an N-dimensional array (or even NxM) as a matrix representation of the problem.  In C, it might be solved using a lookup table, possibly built on a linked list, and the code to drive that needs to be written from scratch or borrowed from an external library.  Perl has a built-in data type for these kinds of problems.

The solution is to use a hash:

$gene_name = "triA";
$level = 10;
$expression_levels{$gene_name} = $level;  # save 'level' on per-gene basis
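
Building on that snippet, the question Tisdall poses (was a gene expressed at all?) becomes a hash lookup. This is a small sketch; apart from `triA`, the gene names and levels are made-up sample data, and `is_expressed` is a hypothetical helper.

```perl
use strict;
use warnings;

# Sample data: gene name => expression level (0 means not expressed).
my %expression_levels = (triA => 10, geneB => 0, geneC => 3);

# Returns 1 if the gene is known and its level is above zero, else 0.
sub is_expressed {
    my ($levels, $gene) = @_;
    return (exists $levels->{$gene} && $levels->{$gene} > 0) ? 1 : 0;
}

for my $gene (sort keys %expression_levels) {
    my $status = is_expressed(\%expression_levels, $gene) ? "expressed" : "not expressed";
    printf "%-6s %s\n", $gene, $status;
}
```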

This leads Tisdall to review biological transcription and translation, including code for DNA->RNA and RNA->protein data conversion.  The code is given in long form and then optimized in further examples for speed using associative arrays.  Recall the central dogma of biology:
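
That flow, DNA to RNA to protein, maps naturally onto a hash keyed by codon. The sketch below is not Tisdall's code: it uses a deliberately partial codon table (a full one has 64 entries), and `translate_rna` is a hypothetical name.

```perl
use strict;
use warnings;

# Partial codon table for illustration; '*' marks a stop codon.
my %codon_to_aa = (
    AUG => 'M', UUU => 'F', UUC => 'F', AAA => 'K', AAG => 'K',
    GGU => 'G', GGC => 'G', GGA => 'G', GGG => 'G',
    UAA => '*', UAG => '*', UGA => '*',
);

# Translate an RNA string codon by codon, stopping at the first stop codon.
# Codons missing from the partial table are reported as 'X'.
sub translate_rna {
    my ($rna) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($rna); $i += 3) {
        my $codon = uc substr($rna, $i, 3);
        my $aa = exists $codon_to_aa{$codon} ? $codon_to_aa{$codon} : 'X';
        last if $aa eq '*';
        $protein .= $aa;
    }
    return $protein;
}

(my $rna = "ATGTTTGGCAAATAA") =~ s/T/U/g;   # DNA -> RNA by replacing T with U
print translate_rna($rna), "\n";
```

The hash replaces a long chain of if/elsif comparisons with a single lookup per codon.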


In-Depth Review, Part 1 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – September 6, 2008

As a specialized field, Bioinformatics is rather young.  It can be difficult to find universities which teach bioinformatics.  Bioinformatics can refer to many different types of tasks — from using programs and data without any computer science knowledge, to implementing database or web software, to writing data conversion programs which modify file formats between database storage methods, to writing algorithms for modeling and visualizing research problems.  Most of the work is described best as “computational biology”.

In the context of Perl (the famous computer language which runs behind many web sites), Bioinformatics means processing text data retrieved from biological databases.

The book, Beginning Perl for Bioinformatics, by James Tisdall, is for learning introductory software techniques in Perl, with a very brief biology review.  For biologists who have rarely programmed and need a starting language or need to learn Perl, this is a good place to start.  For technologists, note the copyright date on the book, to see how dated the information may be; since bioinformatics is still a young field, standards and technology are evolving rapidly.

Tisdall: “A large part of what you, the Perl bioinformatics programmer, will spend your time doing amounts to variations on the same theme as Examples 4-1 and 4-2. You’ll get some data, be it DNA, proteins, GenBank entries, or what have you; you’ll manipulate the data; and you’ll print out some results.”   (Chapter 4)

For software engineers or computer programmers, the biology field is also a completely new realm which is tough to get a handle on, and has its own language: Biology as a field (at least to me) has not yet differentiated itself between “soft, life science” and an engineering science.  For example, as a software engineer, the most basic software question is, “I need to write a look-up table for these elements, what are all the possible strings for the field values?”  Yet this simple question can be very difficult to answer by consulting a biology textbook.  It is important to keep in mind that data manipulation for biology can involve massive amounts of information, also known as very, very large strings; the strings represent DNA sequences which may range in practical usage from 10k to 100k characters.

Perl Bioinformatics Introductory Examples

The author states,

Tisdall: “How do you start from scratch and come up with a program that counts the regulatory elements in some DNA? Read on.”

In Chapter 4 come the first simple Perl examples, such as converting a DNA sequence to the corresponding RNA sequence.  In biology, DNA uses A, T, G, C (representing the chemical names, of course), whereas RNA uses U instead of T.  Simple string manipulation provides the answer:  s/T/U/g;
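
Wrapped in a subroutine, that one-liner might look like the sketch below. `dna_to_rna` is a hypothetical name, and handling lowercase input is an added assumption, not something the substitution above does.

```perl
use strict;
use warnings;

# Return the RNA version of a DNA string by replacing T with U.
sub dna_to_rna {
    my ($dna) = @_;
    # tr/// swaps characters one for one: same effect as s/T/U/g,
    # plus lowercase t -> u; the caller's string is left unchanged.
    (my $rna = $dna) =~ tr/Tt/Uu/;
    return $rna;
}

print dna_to_rna("ACGGGAGGACGGGAAAATTACTACGGCATTAGC"), "\n";
```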