This class will provide a hands-on overview of tools and resources for
executing a typical bioinformatics project. The intention is to introduce
and give a flavor of the practical programming skills and design
considerations involved. To achieve this, we will cover PERL, BioPERL, R,
MySQL, and XML, along with the EMBOSS application suite, BLAST, and machine
learning tools (WEKA).
- PERL has several features that make it the language of choice for
bioinformatics applications. Because PERL is an interpreted language,
writing and modifying programs is quick and easy, which lends itself well
to developing small "quick fix" programs for analysis. It also makes PERL
a natural "glue" for tying together and automating applications written
in different languages (such as C, C++, and FORTRAN).
- To avoid reinventing the wheel, we will use BioPERL modules to read and
write sequence data and to parse output from analysis programs.
- Sequence data and associated information are usually stored in
relational databases. We will cover the creation and use of databases
with MySQL, a powerful open source database system, and learn ways to
retrieve data from public repositories such as NCBI/GenBank and store it
in MySQL using PERL and BioPERL. Microsoft Access is a popular desktop
database; we will use Access as a front end to generate reports and data
entry forms, with MySQL as the back end.
- Analysis, aggregation, and visualization of data are an integral part
of any sequence analysis pipeline. We will use R to carry out some
rudimentary analysis and create plots.
- XML is fast becoming the default format for data exchange. We will cover
the use of SOAP and XML-RPC to obtain and parse data from public
repositories.
- With the vast amount of data being generated, the ability to mine data
using machine learning techniques is crucial. We will explore the use of
WEKA, a popular machine learning environment.
The project will utilize the skills and knowledge acquired using the
above-mentioned tools.
Students are encouraged to bring real case scenarios to class for possible use.
For students/users not enrolled in the class:
- You are free to
browse through the slides and use them in your class.
- If you are looking to learn basic PERL, there are many other excellent
resources on the net that you can use instead.
- To follow my sessions, all you need is access to PERL, BioPERL, MySQL,
and R, all of which can be downloaded and installed on a local machine
(MS Windows, OS X, Linux). Your mileage may vary, but I recommend a Linux,
BSD, or OS X based system.
- I cannot give you an account on our machines.
- If you need the sample data sets or examples I use in the class, please
e-mail me at nirav at arl arizona edu and I will send you the files.
Programming Perl
Larry Wall, Tom Christiansen, Jon Orwant
(The UA library has both books available through netlibrary.com for online
reading.)
Perl in a Nutshell: A Desktop Quick Reference
Ellen Siever, Stephen Spainhour, Nate Patwardhan
If you do want to buy a book, check out Perl CD Bookshelf 2.0. The CD-ROM
includes 5 books for ~US $60.
A book is not necessary for this class. We will be using documentation from
various sources.