Tag: Software

More on Bio-lab Automation – Software for Controlling FIAlab Devices for Microfluidics

Posted by – October 9, 2009

The initial version of the Perl software to control the lab syringe pump and valve device, for biology automation, was finished today. It works great.  Next, I need to add the network code so it can be controlled remotely and in synchronization with other laboratory devices, including the bio-robot.  This software will be used in the microfluidics project.  It is also part of the larger Perl Robotics project, and a new release will be posted to CPAN next week.
FIAlab MicroSIA Valve and Syringe System FIAlab MicroSIA Experimental Setup

More details on the software follow:

More

Don’t Always Trust Open Source Software. Why Trust Open Source Biology?

Posted by – August 7, 2009

The software you are happily using may be.. unnecessarily brittle. Recently I’ve been developing a little bit of high-level software using open source libraries.  Sometimes it amazes me that open source software works at all.  Here’s an excerpt from the internals I found in the open source library when I looked at why it might not be working properly:

        if (Pipe){
                while(iFlag){
                        vpData = Pipe->Read(&dLen);
                        iFlag = 0;
                                //      If we have more data to read then for God's sake, do it!

                                //      I don't know if this will work ... it would return an
                                //      array. This may not be good. Hmmmm.
                        if(!vpData && GetLastError() == ERROR_MORE_DATA){
                                iFlag = 1;
                        }

                        if(dLen){
                                XPUSHs(sv_2mortal(newSVpv((char *)vpData, dLen)));
                        }else{
                                sv_setsv(ST(0), (SV*) &PL_sv_undef);
                        }
                }

The standard responses from the “open source rah-rah crowd” are something like the following:

  • “Yeah, that’s a crazy comment, but at least you can see it!  In proprietary software, there’s the same problems, it’s just hidden!”
  • “At least you’re given the source code so you can fix it!  In proprietary software, you’re never given access to the source code so you couldn’t fix it if you wanted to!”

These responses miss the big point that commercial software is often much more fully tested for its specific environment, and undergoes a much more rigorous design process.  (Beyond the designed-for environment, things might break.  However, the environment is usually described.)

Having something that works — even if it isn’t “great” software — is better than not having anything at all; so on the whole, we can’t complain too much.  Open source is expected to evolve, over the long term (meaning, decades), into a better system: it’s assumed that eventually, most of the oddities will be ironed out.  The Linux kernel itself contains similar comments (I’ve seen them in debugging the UDP/IP stack) which is astounding considering that non-professionals consider Linux to be “stable”.  Kernel hackers know the truth — it “mostly” works (with “mostly” being better than “nothing”)..   Next time someone offers you “free software” take a moment to think:  how much do I have to trust that software to work in a situation which may be different than the author’s original working environment?  How much of the code’s architecture might have comments such as “Hmm.. This isn’t supposed to work or might not work..”?  How much is it going to cost ($$$) to find the oddities and dig into the internals to fix them?

The connection to Biology here is that these crazy design comments, like “Hmm.. It really isn’t proper design to build it this way.. but it seems to work”, will be too small to ever read in synthetic life.  (They would be in the RNA or DNA.)   At least with open source software, there’s a big anti-warranty statement: don’t use the software if there is liability involved.  As I posted last year, the “Open Biology License” hasn’t touched on liability issues at all, only patent issues.  How much can Open Biology be trusted, and how much might it cost ($$$) to dig in, find the strange biological behavior, and attempt to fix it?   Debugging biology is much, much harder than debugging software.

Perl Bio-Robotics module, Robotics.pm and Robotics::Tecan

Posted by – July 30, 2009

FYI for Bioperl developers:

I am developing a module for communication with biology robotics, as discussed recently on #bioperl, and I invite your comments. Currently this module talks to a Tecan Genesis workstation robot. Other vendors are Beckman Biomek, Agilent, etc. No such modules exist anywhere on the ‘net, with the exception of some Visual Basic and LabVIEW scripts which I have found. There are some computational biologists who program for robots via high level s/w, but these scripts are not distributed as OSS.

With Tecan, there is a datapipe interface for hardware communication, as an added $$ option from the vendor. I haven’t checked other vendors to see if they likewise have an open communication path for third party software. Once third-party communication is allowed, the natural next step is to create a socket client-server, especially as the robot vendor only supports MS Win, and using the local machine has typical Microsoft issues (like losing real time communication with the hardware due to GUI animation, bad operating system stability, no unix except cygwin, etc).

On Namespace:

I have chosen Robotics and Robotics::Tecan. (After discussion regarding the potential name of Bio::Robotics.)  There are many s/w modules already called ‘robots’ (web spider robots, chat bots, www automate, etc) so I chose the longer name “robotics” to differentiate this module as manipulating real hardware. Robotics is the abstraction for generic robotics and Robotics::(vendor) is the manufacturer-specific implementation. Robot control is made more complex due to the very configurable nature of the work table (placement of equipment, type of equipment, type of attached arm, etc). The abstraction has to be careful not to generalize or assume too much. In some cases, the Robotics modules may expand to arbitrary equipment such as thermocyclers, tray holders, imagers, etc – that could be a future roadmap plan.
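The split between a generic layer and a vendor-specific layer can be sketched in a few lines. This is only an illustration of the idea: the `Sketch::` package names, the vendor table, and the string returned by `home` are assumptions made up for the example, not the actual Robotics.pm interface.

```perl
use strict;
use warnings;

# Hypothetical namespaces for illustration only; not the Robotics.pm API.

package Sketch::Robotics::Tecan;

# Vendor-specific implementation: knows how to talk to one kind of robot.
sub new  { my ($class) = @_; return bless { vendor => 'Tecan' }, $class }
sub home { my ($self, $arm) = @_; return "Tecan: homing $arm" }

package Sketch::Robotics;

# Generic layer: maps a vendor name to the class that implements it.
my %VENDOR_CLASS = ( Tecan => 'Sketch::Robotics::Tecan' );

sub new {
    my ($class, $vendor) = @_;
    my $impl = $VENDOR_CLASS{$vendor} or return;   # unknown vendor: no object
    return $impl->new();
}

package main;

my $robot = Sketch::Robotics->new('Tecan') or die "unknown vendor\n";
print $robot->home('roma0'), "\n";
```

Supporting another vendor in this sketch would mean adding one package and one entry in the vendor table, which keeps the generic layer from assuming too much about any one work table.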

Here is some theoretical example usage below, subject to change. At this time I am deciding how much state to keep within the Perl module. By keeping state, some robot programming might be simplified (avoiding deadlock or tracking tips). In general I am aiming for a more “protocol friendly” method implementation.

To use this software with locally-connected robotics hardware:

    use Robotics;
    use Robotics::Tecan;

    my %hardware = Robotics::query();
    if ($hardware{"Tecan-Genesis"} eq "ok") {
    	print "Found locally-connected Tecan Genesis robotics!\n";
    }
    elsif ($hardware{"Tecan-Genesis"} eq "busy") {
    	print "Found locally-connected Tecan Genesis robotics but it is busy moving!\n";
    	exit -2;
    }
    else {
    	print "No robotics hardware connected\n";
    	exit -3;
    }
    my $tecan = Robotics->new("Tecan") || die;
    $tecan->attach() || die;    # initiate communications
    $tecan->home("roma0");      # move robotics arm
    $tecan->move("roma0", "platestack", "e");    # move robotics arm to vector's end
    # TBD $tecan->fetch_tips($tip, $tip_rack);   # move liquid handling arm to get tips
    # TBD $tecan->liquid_move($aspiratevol, $dispensevol, $from, $to);
    ...

To use this software with remote robotics hardware over the network:

  # On the local machine, run:
    use Robotics;
    use Robotics::Tecan;

    my @connected_hardware = Robotics->query();
    my $tecan = Robotics->new("Tecan") || die "no tecan found in @connected_hardware\n";
    $tecan->attach() || die;
    # TBD $tecan->configure("my work table configuration file") || die;
    # Run the server and process commands
    while (1) {
      my $error = $tecan->server(passwordplaintext => "0xd290"); # start the server
      # Internally runs communications between client->server->robotics
      if ($tecan->lastClientCommand() =~ /^shutdown/) {
        last;
      }
    }
    $tecan->detach();   # stop server, end robotics communications
    exit(0);

  # On the remote machine (the client), run:
    use Robotics;
    use Robotics::Tecan;

    my $server = "heavybio.dyndns.org:8080";
    my $password = "0xd290";
    my $tecan = Robotics->new("Tecan");
    $tecan->connect($server, $password) || die;
    $tecan->home();

    # ... same as first example with communication automatically routing over network ...
    $tecan->detach();   # end communications
    exit(0);
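The client/server idea above can be tried out without any robot attached. The following is a minimal sketch, assuming a plain line-based text protocol over TCP using the core IO::Socket::INET module; the command strings, the "ok" reply format, and the use of a forked child as the "remote" client are all assumptions for the demonstration, and there is no password check here.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen on an ephemeral localhost port (port 0 lets the OS pick one).
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
    Proto     => 'tcp',
) or die "listen failed: $!\n";
my $port = $listener->sockport;

my $pid = fork();
die "fork failed: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child process: plays the role of the remote client.
    my $sock = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "connect failed: $!\n";
    for my $cmd ('home roma0', 'shutdown') {
        print {$sock} "$cmd\n";      # send one command per line
        my $reply = <$sock>;         # wait for the server's reply
    }
    close $sock;
    exit 0;
}

# Parent process: plays the role of the server next to the robot.
my @log;
my $client = $listener->accept or die "accept failed: $!\n";
while (my $line = <$client>) {
    chomp $line;
    push @log, $line;                # a real server would drive hardware here
    print {$client} "ok $line\n";
    last if $line =~ /^shutdown/;
}
close $client;
waitpid($pid, 0);
print "server saw: @log\n";
```

The server loop stops on a "shutdown" command, mirroring the loop in the example above; a real server would hand each command to the robot instead of only logging it.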

Some notes for those who may also want to create Perl modules for general or BioPerl use:

  • Use search.cpan.org to get Module-Starter
  • Run Module-Starter to create new module from module template
  • Read Module::Build::Authoring
  • Read Bioperl guide for authoring new modules
  • Copy/write perl code into the new module
  • Add POD, perl documentation
  • Add unit tests into the new module
  • Register for CPAN account (see CPAN wiki), register namespace
  • Verify all files are in standard CPAN directory structure
  • Commit & Release
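The POD and unit-test steps in the list above could look roughly like the following single-file sketch. The module name `My::Pump`, its `volume_ul` accessor, and the file locations mentioned in the comments are hypothetical; in a real distribution the module and the test would be separate files, and Test::More is the only library assumed.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical module, inlined so the sketch runs as one file.
# In a real distribution this would live in lib/My/Pump.pm.
package My::Pump;

=head1 NAME

My::Pump - hypothetical example module with POD

=head1 SYNOPSIS

    my $pump = My::Pump->new(volume_ul => 250);

=cut

sub new {
    my ($class, %args) = @_;
    return bless { volume_ul => $args{volume_ul} // 0 }, $class;
}

sub volume_ul { return $_[0]{volume_ul} }

package main;

# These checks would normally live in a file such as t/00-basic.t.
my $pump = My::Pump->new(volume_ul => 250);
isa_ok($pump, 'My::Pump');
is($pump->volume_ul, 250, 'volume is stored');
```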

Software for Biohackers

Posted by – July 30, 2009

Some open source software collections of biology interest are noted here. I’ll update this list as time goes on. If you would like to have your project listed too, leave a comment with all the fields of the table and I’ll add your project. If any of these links do not work, let me know too.

Name | Status | Field | Language | Description
Eclipse | Stable | Programming, editing, building, debugging | Java, C, C++, Perl, .. | Eclipse is the most widely adopted software development environment in terms of language support, corporate support, and user plugin support. It is open source. It’s the “Office” suite for programming.
BioPerl | Stable | Bioinformatics | Perl, C | BioPerl has many modules for genomic sequence analysis/matching, genomic searches to databases, file format conversion, etc.
BioPython | Stable | Bioinformatics | Python, C | BioPython has many modules for computational biology.
BioJava | Stable | Bioinformatics | Java | BioJava has many modules for computational biology.
BioLib | Stable | Bioinformatics | C, C++ | BioLib has many modules for file format conversion, integration to other Bio* language projects, genomic sequence matching, etc.
Bio-Linux | Stable | Operating System with Bundled Bioinformatics Applications | Many | “A dedicated bioinformatics workstation – install it or run it live”
DNA Linux | Stable | Operating System with Bundled Bioinformatics Applications | Many | “DNALinux is a Virtual Machine with bioinformatic software preinstalled.”
Several Synthetic Biology editors, simulators, or suites, listed at OpenWetWare Computational Tools, such as: Synthetic Biology Software Suite (SynBioSS), BioJADE, GenoCAD, BioStudio, BioCad, TinkerCell, Clotho | Work In Progress | Synthetic Biology | Mostly Java, some Web based, some Microsoft .NET | Pathway modeling & simulation for synthetic biology genetic engineering, editing, parts databases, etc
APE (A Plasmid Editor) | Stable | Genetic engineering | Java | DNA sequence and translation editor

Low Cost Microcontroller-based Digital Microfluidics using “Processing”

Posted by – July 1, 2009

I’ve now tested the digital microfluidics board via microcontroller. The digital microfluidics board moves a liquid droplet via Electrowetting-on-Dielectric (EWOD).  The microcontroller switches the high voltage via a switching board (pictured below, using Panasonic PhotoMOS chips), which controls the +930VDC output by the HVPS (posted earlier), and runs over USB using no cost Processing.org software.  This is alpha stage testing.. cleaner version to be built.  The goal of course is to scale the hardware to allow automation of microbiology protocols.

LabVIEW is quite expensive, and industrial-grade high voltage switching boards are also quite expensive.  So I built my own hardware, and the Processing.org language is an easy way to test it.  The Processing.org language is a free, open source graphics/media/IO layer on top of Java (as posted previously here).

What follows is the super simple test software written in Processing.org & Java.

More

Playing with the $100K Robots for Biology Automation

Posted by – June 26, 2009

The Tecan Genesis Workstation 200: It’s an industrial benchtop robot for liquid handling with multiple arms for tray handling and pipetting.

The robot’s operations are complex, so an integrated development environment is used to program it (though biologists wouldn’t call it an integrated development environment; maybe they’d call it a scripting application?), with a custom graphical (GUI-based) scripting language and script verification/compilation. Luckily though, the application allows third party software access and has the ability to control the robotics hardware using a minimal command set. So what to do? Hack it, of course; in this case, with Perl. This is only a headache due to Microsoft Windows incompatibilities & limitations (rarely is anything on Windows as straightforward as Unix), so as usual with Microsoft Windows software, it took about three times longer than normal to figure out Microsoft’s quirks. Give me OS/X (a real Unix) any day. Now, on to the source code!

More

Analog Data Acquisition from USB Microcontroller using the “Processing” Language

Posted by – March 25, 2009

Building on the previous two mini-projects, I have a mini-graphical data acquisition project now running under the Processing language, getting real-world signals from the USB microcontroller (which is a Microchip PIC on a UBW Board from Sparkfun).  Source code below the screenshot.

USB microcontroller sends data to Processing application, which graphs the data


More

Blinky LED ‘Hello World’ using USB Microcontroller in ‘Processing’

Posted by – March 24, 2009

Every good embedded systems hardware project begins with a blinking LED (or a toggling level as seen on the oscilloscope).  In the Processing.org language, there’s the opportunity for both, since the built-in graphics allow for data display as well as the USB microcontroller interface.  (There are several Processing projects for Arduino, BTW.)   Source code is below.

USB Microcontroller blinks happily under Processing.org program


More

Using the Processing.org Language with Microcontrollers

Posted by – March 22, 2009

Media-technology engineers at MIT have created a computer language and easy-to-use runtime environment called Processing, hosted at processing.org.  I wrote a small code snippet for accessing the PIC microcontroller from a USB port, using Processing; it’s pasted below.

This PIC microcontroller connects to USB on a PC, Mac, or Linux machine


More

Apple iPhone 3.0 as next generation Biomedical device

Posted by – March 17, 2009

Apple’s developer preview today, of iPhone 3.0 software, included the interesting news of support for external accessories, either connected through the physical docking connector or through Bluetooth wireless.


A spokesman from Johnson & Johnson announced an iPhone-blood-pressure-monitor accessory, which provides health biometrics and allows the biometrics to be sent over the iPhone’s network connection as an emergency alert.  Their goal is to make diabetes monitoring easier.

The details of the new iPhone interface are in a thin draft document, External Accessory Framework Reference. This doesn’t include the hardware details necessary to connect arbitrary devices, though once it does, I’ll be hooking lots of different devices to the “iPhone-smart-phone-turned-general-purpose-minicomputer”.

I’m sure the game companies already have external joysticks in the works. A recent interview with the owner of Pangea Software revealed earnings of $1.5 million from a single iPhone game (Enigmo), with over 800,000 downloads. His biggest complaint: “no D-pad game controller.” Rest assured, that will be solved soon.

Games aside, the iPhone (or iTouch) offers a solid software environment which includes graphical presentation, ease of data entry, network support, wireless roaming, audio support, and now external device data accessories. This is exactly the kind of tool that medicine and bioscience need to help with a deluge of patients.

Stanford University: Programmable Microfluidics (2007) – Video

Posted by – March 2, 2009

October 3, 2007 lecture by Bill Thies for the Stanford University Computer Systems Colloquium (EE 380). Bill Thies provides an overview of microfluidic technologies from a computer science perspective, highlighting areas in which computer science researchers can contribute to this field; he also describes recent work in developing new architectures, programming languages, and CAD tools for the microfluidic domain.



EE 380 | Computer Systems Colloquium:
http://www.stanford.edu/class/ee380/

Play Fold.it, the “Tetris-On-Steroids” game that solves protein folding

Posted by – January 29, 2009

“Protein folding” is what again?

It’s this: Foldit (curiously, at the web address: “fold.it”).  And it’s fun to play.  Addictive, really.  Check out the picture:

After I had been playing a while, my 8-year old niece came over to my laptop to see what the cute sound-effects were all about.  After a minute of watching, she said:  “Tell me the web site, I want to play too!”   Yeah, no kidding.

More

In-Depth Review, Part 3 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – November 3, 2008

In my previous write-ups of Part 1 and Part 2, I traced the Perl code and examples in the first half of the book, Beginning Perl for Bioinformatics, by James Tisdall, highlighting different approaches to bioinformatics in Perl.  As I mentioned before, Perl provides many different (and often stylistic) approaches to solving a software problem.  The different approaches usually differ in execution speed, code size, code scalability, readability / maintainability, simplicity, and use of advanced Perl semantics.  Since this is a beginning text, the advanced Perl isn’t covered.. that means templates, which could be useful for parsing bioinformatics data, are one of the topics not included here.

Often, the fastest code is the smallest code, and contains subtle code tricks for optimization. This is a perfect setup, because, in Chapter 8, Tisdall starts parsing FASTA files.  With Perl’s parsing engine, the subtlety of the tricks leaves a lot of room for optimizing software.

FASTA & Sequence Translation

Tisdall offers a software problem based on the FASTA data, so time to solve it:

Tisdall: When you try to print the “raw” sequence data, it can be a problem if the data is much longer than the width of the page. For most practical purposes, 80 characters is about the maximum length you should try to fit across a page. Let’s write a print_sequence subroutine that takes as its arguments some sequence and a line length and prints out the sequence, breaking it up into lines of that length.

Compare his solution to mine:

# Solution by Tisdall
# print_sequence
#
# A subroutine to format and print sequence data

sub print_sequence {

    my($sequence, $length) = @_;

    use strict;
    use warnings;

    # Print sequence in lines of $length
    for ( my $pos = 0 ; $pos < length($sequence) ; $pos += $length ) {
        print substr($sequence, $pos, $length), "\n";
    }
}

The above is a straightforward, strings-based approach. I chose a regex approach, which took a couple of minutes to work out but should be faster at run time:

sub dna_print {
  my $str = $_[0];
  do {
    $str =~ s/^([\w]{0,25})//;
    print "$1\n";
  } until (!length($str));
}

The above relies on the following method:

More

“SynBioSS: The Synthetic Biology Modeling Suite”

Posted by – October 20, 2008

SynBioSS (Synthetic Biology Software Suite) is a suite of software for the modeling and simulation of synthetic genetic constructs. SynBioSS utilizes the registry of standard biological parts, a database of kinetic parameters, and both graphical and command-line interfaces to multiscale simulation algorithms. SynBioSS is available under the GNU General Public License. Anthony D. Hill, Jonathan R. Tomshine, Emma M. B. Weeding, Vassilios Sotiropoulos, and Yiannis N. Kaznessis, Bioinformatics 2008 24(21):2551-2553; doi:10.1093/bioinformatics/btn468

Sounds neat, let’s try it. Interestingly, the iGEM participants and biologists, in discussions of modeling, have thrown their hands in the air & stated that it is difficult or impossible to model biology. Maybe SynBioSS can do the impossible?  Except: There is no specific installer available for OS/X (as of this writing) and it seems there are many assorted packages required.

Here are my install summary/notes/fixes for getting SynBioSS (version 1.0.1) running on OS/X (Leopard 10.5.5):
More

In-Depth Review, Part 2 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – September 8, 2008

My Part 1 of 5 review of the book, Beginning Perl for Bioinformatics, by James Tisdall, left off at Chapter 8, just before Tisdall explains associative arrays, gene expression, FASTA files, genomic databases, and restriction sites.

Tisdall: “For simplicity, let’s say you have the names for all the genes in the organism and a number for the expressed genes indicating the level of the expression in your experiment; the unexpressed genes have the number 0. Now let’s suppose you want to know if the genes were expressed, but not the expression levels, and you want to solve this programming problem using arrays. After all, you are somewhat familiar with arrays by this point. How do you proceed?”

Perl’s associative arrays are one of the most powerful aspects of the language.  This is a good problem to examine using hashes.  Solutions to this kind of problem in other languages (C or matlab) might create an N-dimensional array (or even NxM) as a matrix representation of the problem.  In C, it might be solved using a lookup table possibly using a linked list, and the code to drive that needs to be written from scratch or borrowed from an external library.  Perl has a built-in method to solve these kinds of problems.

The solution is to use a hash:

$gene_name = "triA";
$level = 10;
$expression_levels{$gene_name} = $level;  # save 'level' on per-gene basis
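To connect the hash back to Tisdall's question (was a gene expressed, regardless of level), here is a minimal sketch. The gene names other than `triA`, the levels, and the helper name `is_expressed` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: gene name => expression level (0 means unexpressed).
my %expression_levels = (
    triA => 10,
    xylB => 0,
    recA => 3,
);

# Return 1 if the gene is known and its level is above zero, else 0.
sub is_expressed {
    my ($levels, $gene) = @_;
    my $ok = exists $levels->{$gene} && $levels->{$gene} > 0;
    return $ok ? 1 : 0;
}

# Collect the expressed genes in alphabetical order.
my @expressed = grep { is_expressed(\%expression_levels, $_) }
                sort keys %expression_levels;
print "Expressed genes: @expressed\n";
```

The lookup is by name, so no searching through an array is needed, and a gene that was never measured (absent from the hash) is reported as not expressed rather than causing an error.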

This leads Tisdall to review biological transcription and translation, including code for DNA->RNA and RNA->protein data conversion.  The code is given in long form and then optimized in further examples for speed using associative arrays.  Recall the central dogma of biology:

More

In-Depth Review, Part 1 of 5: “Beginning Perl for Bioinformatics” by James Tisdall

Posted by – September 6, 2008

As a specialized field, Bioinformatics is rather young.  It can be difficult to find universities which teach bioinformatics.  Bioinformatics can refer to many different types of tasks — from using programs and data without any computer science knowledge, to implementing database or web software, to writing data conversion programs which modify file formats between database storage methods, to writing algorithms for modeling and visualizing research problems.  Most of the work is described best as “computational biology”.

In the context of Perl (the famous computer language which runs underneath most web pages), Bioinformatics means computing text data retrieved from biological databases.

The book, Beginning Perl for Bioinformatics, by James Tisdall, is for learning introductory software techniques in Perl, with a very brief biology review.  For biologists who have rarely programmed and need a starting language or need to learn Perl, this is a good place to start.  For technologists, note the copyright date on the book, to see how dated the information may be; since bioinformatics is still a young field, standards and technology are evolving rapidly.

Tisdall: “A large part of what you, the Perl bioinformatics programmer, will spend your time doing amounts to variations on the same theme as Examples 4-1 and 4-2. You’ll get some data, be it DNA, proteins, GenBank entries, or what have you; you’ll manipulate the data; and you’ll print out some results.”   (Chapter 4)

For software engineers or computer programmers, the biology field is also a completely new realm which is tough to get a handle on, and has its own language: Biology as a field (at least to me) has not yet differentiated itself between “soft, life science” and an engineering science.  For example, as a software engineer, the most basic software question is, “I need to write a look-up table for these elements, what are all the possible strings for the field values?”  Yet this simple question can be very difficult to answer by consulting a biology textbook.  It is important to keep in mind that data manipulation for biology can involve massive amounts of information: also known as, very, very large strings; the strings represent DNA sequences which may range in practical usage from 10k to 100k.

Perl Bioinformatics Introductory Examples

The author states,

Tisdall: “How do you start from scratch and come up with a program that counts the regulatory elements in some DNA? Read on.”

In chapter 4, there are the first simple Perl examples:  convert the DNA sequence to the corresponding RNA sequence.  In biology, the DNA uses A, T, G, C (representing the chemical names, of course); whereas RNA uses U instead of T.  Simple string manipulation provides the answer:  s/T/U/g;
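As a small runnable sketch of that substitution (the subroutine name `dna_to_rna` and the sample sequence are made up for illustration, and the input is assumed to be uppercase):

```perl
use strict;
use warnings;

# Transcribe a DNA string to RNA by replacing every T with U.
# Works on a copy so the caller's string is left unchanged.
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = $dna) =~ s/T/U/g;
    return $rna;
}

print dna_to_rna("GATTACA"), "\n";   # prints GAUUACA
```

For single-character replacement like this, `tr/T/U/` would do the same job; the substitution form is kept here to match the text.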

More