In addition to being the home of Dolly, the most famous sheep in the world, Roslin Institute also has an extensive and proud reputation for scientific achievement in a wide range of disciplines. In particular, farm animal genomics and genetics have provided a strong focus for the work of the Institute over the past few years. As genomic technologies develop, the sheer volume and complexity of the data involved present considerable management challenges. The recent initiation of a project to establish a UK-wide resource facility for micro-array studies on farm animal species (ARK-Genomics) provided the impetus to install a Laboratory Information Management System (LIMS) as part of a strategy to meet these challenges. Nautilus LIMS from Thermo LabSystems has been selected.
Located 11km south of Scotland's capital city of Edinburgh, Roslin employs over 340 staff, students and visiting scientists who work in a wide range of disciplines including molecular and cell biology, quantitative genetics, endocrinology, developmental biology, animal behavior and nutrition. It is one of eight research institutes in the UK sponsored by the Biotechnology and Biological Sciences Research Council (BBSRC).
The research of Roslin Institute has concentrated historically on developing new methods to improve livestock and identifying the genes that control traits such as meat quality, disease resistance and reproductive fitness. This has focused on the genetics of each of the major farm animal species (cattle, pigs, chickens and sheep). Work such as this remains a significant aspect of the Institute's role but Roslin's expertise in many areas of science has relevance in other fields, particularly in biomedical applications.
The genome projects of farmed animal species have tended to lag behind the Human Genome Project in sequencing terms, primarily because of the differences in budgets allocated to each by governments and funding agencies. However, this has resulted in different approaches being developed to address problems and a particular emphasis on the way that traits are influenced by an animal's genetics. Efforts to map the genomes of these farm animal species have concentrated on the development of linkage or radiation hybrid maps which rely on the positioning of "landmarks" across the genome, rather than the identification of the entire genomic sequence. The calculation of these maps relies on the known pedigree of a well-defined set of animals and the careful characterization of "markers" across all of those animals. If traits are also recorded for those animals, then it is possible to determine, at least crudely, the location within the genome of genes responsible for controlling some aspect of that trait. Examples of traits studied include growth rates and efficiencies, fat deposition, disease resistance, egg production and other commercially important traits.
As more genes have been mapped in farm animal species and as the sequence of the human genome has been elucidated, it has emerged that the chromosomes of man and animals show a remarkable level of conservation. Consequently, it is potentially possible to use data from the human genome to "fill in the gaps" in the animal maps. Conversely, it is equally possible to utilize animal models to identify genes that control susceptibility to disease states of interest to human health. For example, studies of the genetics of osteoporosis in chickens may provide insights into the same condition in humans.
Genomics is increasingly an information-driven science and the more information that can be obtained on a given disease or trait, the better the estimations of the effects of the genes that might be involved in controlling it. Despite the relative power of animal crosses in identifying regions of the genome containing important genes, the process is still expensive and time consuming and does not identify specific genes. One of the newer technologies that Roslin intends to use extensively in the coming years to examine the role of specific genes in more detail is that of micro-arrays. Micro-array technology allows the expression of hundreds or thousands of genes to be studied at one time. However, the equipment and resources required to run micro-array experiments are non-trivial. Therefore, the UK’s Biotechnology and Biological Sciences Research Council (BBSRC), a Government Science Funding agency, sought to establish several centers of excellence to carry out such work. Consequently, as the result of a successful bid for funding to the BBSRC, the UK Center for Functional Genomics in Farm Animals (ARK-Genomics) was established in September 2000 at Roslin. The remit of the center is to develop micro-array technology and support scientists throughout the UK with an interest in farm animal genomics and gene expression. As a resource used by collaborators outside the Institute, the ARK-Genomics Project is operated as a distinct facility at Roslin.
Driven by a local management committee, ARK-Genomics facilities are available to any group of researchers investigating farm animal genomics and who submit a research proposal approved by the ARK-Genomics Steering Committee, an independent overseeing body appointed by the BBSRC.
Requirement for a LIMS
Although the Institute has wide-ranging experience of data handling, the service-like element of the new project together with the volume and complexity of the data involved meant that Quality Control and Assurance were key issues. The initial grant application provided for two informatics posts, one to provide local data tracking and capture support and the other, based at the University of Manchester, to provide expertise in analysis of the micro-array data generated. As is often the case with such projects, the need for the informatics support pre-empts the actual generation of the data – it would be of little use to deliver a LIMS system at the end of the project lifetime when two or three years of laboratory work had already been completed. Consequently the decision was taken to specify and source a commercial LIMS system and devote the informatics post to customizing it to fit the specific needs of the facility.
LIMS Search & Selection
Roslin solicited quotations from vendors and received demonstrations of competing LIMS during 2000. Advice was also received from contacts at the Human Genome Mapping Project Research Center (HGMPRC) and the Sanger Center, both near Cambridge, UK. These centers have been heavily involved in the development and distribution of resources for genome mapping and in the sequencing of the human genome respectively. Several systems were rejected prior to the requests for quotations on the grounds that they were heavily orientated to chemical testing, where systems were well defined ahead of time and the range of information required for a sample and the relationships between samples were limited. Genomics is a fast-moving field where today’s cutting-edge technology is tomorrow's antique and flexibility is therefore a key requirement. In addition, biology is a complex science with many inter-relationships between samples. The ability to model those inter-relationships and then to use them to track through the system accumulating information is also key to the success of the overall project. Nautilus LIMS from Thermo LabSystems was considered the most advanced solution that closely matched Roslin's requirements, both from the functionality and implementation perspectives.
The complexity of the task is best illustrated by an example. Micro-array experiments rely on densely packed spots of DNA on glass slides. Each spot represents the product of a distinct gene. These spots are derived from individual clones extracted from a cDNA library, a collection of molecules representative of the genes being expressed in a particular tissue. If a sample of labeled messenger RNA (mRNA) is layered over the immobilized spots, the molecules in the mRNA will hybridize to the spot that corresponds to their particular gene product.
The intensity of the label localizing to each spot therefore is an indicator of the level at which the gene is being expressed in the tissue from which the mRNA was derived. By layering two different mRNA samples from two different tissues or states, each labeled with a different dye, an indication of which genes are expressed at different levels in the two tissues, and hence which genes may be responsible for the differences between the two tissues, can be derived. However, it is important to know what gene each differentially expressed spot corresponds to. The clone removed from the specified well on the chosen plate and immobilized on the slide may not have been directly characterized itself. However, a different aliquot from a parent, sister or daughter plate may have been sequenced. That sequence might overlap with a second sequence from a different clone in a different well on a separate plate. That sequence in turn might show similarity to a sequence in a different species. The identity of the actual clone may therefore need to be derived from information several levels removed from the actual sample used on the slide.
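The chain of inference described above can be pictured as a search through a graph of related wells. The following is a minimal sketch only, not a description of how Nautilus stores or resolves these relationships: the plate names, relation labels, annotation text and the functions `build_graph` and `resolve_identity` are all hypothetical, chosen to mirror the example of a spotted clone whose identity comes from a sister plate, an overlapping sequence and a cross-species similarity.

```python
from collections import defaultdict, deque

# Hypothetical relationships between wells, written as (well_a, well_b, kind).
# A well is identified by (plate, position). All values are illustrative.
RELATIONS = [
    (("P3", "A1"), ("P1", "A1"), "plate lineage"),      # spotted plate and its parent
    (("P2", "A1"), ("P1", "A1"), "plate lineage"),      # sequenced sister plate
    (("P2", "A1"), ("P9", "H12"), "sequence overlap"),  # different clone, different plate
]

# Hypothetical annotation held against only one well.
ANNOTATIONS = {("P9", "H12"): "similar to a characterized gene in another species"}


def build_graph(relations):
    """Turn the relation list into an undirected adjacency map."""
    graph = defaultdict(list)
    for well_a, well_b, kind in relations:
        graph[well_a].append((kind, well_b))
        graph[well_b].append((kind, well_a))
    return dict(graph)


def resolve_identity(start, graph, annotations):
    """Breadth-first search outward from the spotted well.

    Returns (annotation, steps) for the nearest annotated well, where steps
    is the list of (relation kind, well) hops taken, or None if no related
    well carries an annotation.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        well, steps = queue.popleft()
        if well in annotations:
            return annotations[well], steps
        for kind, neighbour in graph.get(well, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + [(kind, neighbour)]))
    return None


if __name__ == "__main__":
    found = resolve_identity(("P3", "A1"), build_graph(RELATIONS), ANNOTATIONS)
    if found is not None:
        label, steps = found
        print(label)
        for kind, well in steps:
            print("  via", kind, "->", well)
```

Returning the hops alongside the annotation keeps the evidence trail visible, which matters when the identity assigned to a spot is several levels removed from the material actually on the slide.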
LIMS Flexibility
A classic LIMS that rigidly enforced the “Sample in: Perform task: Sample finished” mode of operation would be of little use in this environment. In contrast, Nautilus’ flexible workflow functionality allows much of this sample tracking to be built into the core of the system. The workflow feature enables the user to graphically map the actual laboratory workflow and sample life cycle onto the LIMS, without software coding or the need for IS (Information Services) specialists. This process concludes in the automated assignment or creation of all relevant dilutions, aliquots, analyses, results and associated reports. A second requirement was that each step in the process be fully auditable. When providing a collaborative service function to external laboratories, it is vital that the facility is able to guarantee the history of each and every data point and again, Nautilus’ built-in audit functions were a positive point.
Also key to Roslin’s decision was that Nautilus could be effectively configured and managed by the research staff rather than an IS resource. This means that ownership of the system lies with the researchers, without the need for those research personnel to understand technology such as ORACLE Forms.
Instrument Integration
At the cutting edge of genomics technology, instrumentation changes rapidly. With much of its laboratory equipment, Roslin is a number of generations further on from the versions it had originally envisaged using. A number of key manufacturers are prepared to supply Roslin the latest models of their technology on a rolling basis as they develop. In addition Thermo LabSystems is continuing to develop its integration plans with specialist instrument vendors. Therefore, bespoke integration solutions between instruments and the Nautilus LIMS are considered impractical and costly to develop and maintain while instrument technology continues to mature. Certainly in the short-term, Roslin remains committed to using standard comma-separated or XML-formatted files as a transport mechanism between Nautilus, as the central spine of the system, and each of the pieces of equipment within the facility. For example, the control systems that drive the robotic equipment used to “cherry-pick” selected clones can be supplied with comma-separated values to control the process and in turn generate comma-separated files to confirm the actions. This option, Roslin considers, is preferable to writing routines to communicate directly with equipment using low-level RS232 interfaces.
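As an illustration of this file-based approach, the sketch below writes a comma-separated pick list for a clone-picking robot and then reads back a confirmation file to find any picks that were not reported as completed. It is a sketch under stated assumptions: the column names, the `status` field and its `OK` value, and the functions `write_pick_list` and `unconfirmed_picks` are hypothetical and do not describe the file layout of any particular robot or of Nautilus.

```python
import csv
import io

# Hypothetical column layout shared by the pick list and the confirmation file.
PICK_FIELDS = ["source_plate", "source_well", "dest_plate", "dest_well"]


def write_pick_list(picks, handle):
    """Write the requested picks as comma-separated values with a header row."""
    writer = csv.DictWriter(handle, fieldnames=PICK_FIELDS)
    writer.writeheader()
    writer.writerows(picks)


def unconfirmed_picks(picks, confirmation_handle):
    """Return the requested picks that the confirmation file does not mark OK.

    The confirmation file is assumed to repeat the pick columns and add a
    'status' column (an assumption made for this example).
    """
    done = set()
    for row in csv.DictReader(confirmation_handle):
        if row["status"] == "OK":
            done.add(tuple(row[name] for name in PICK_FIELDS))
    return [p for p in picks if tuple(p[name] for name in PICK_FIELDS) not in done]


if __name__ == "__main__":
    picks = [
        {"source_plate": "LIB01", "source_well": "A1", "dest_plate": "ARR01", "dest_well": "A1"},
        {"source_plate": "LIB01", "source_well": "C7", "dest_plate": "ARR01", "dest_well": "A2"},
    ]
    outgoing = io.StringIO()
    write_pick_list(picks, outgoing)
    print(outgoing.getvalue())

    # Simulated confirmation returned by the robot's control software.
    confirmation = io.StringIO(
        "source_plate,source_well,dest_plate,dest_well,status\n"
        "LIB01,A1,ARR01,A1,OK\n"
        "LIB01,C7,ARR01,A2,FAILED\n"
    )
    print(unconfirmed_picks(picks, confirmation))
```

Because both directions are plain text files, a change of instrument only requires a change to the column mapping rather than a new low-level driver.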
Systems Integration
Similarly, processes driven and monitored by well-defined file exchange can readily be implemented to interact with computational analysis steps, extending the LIMS into an Analysis Information Management System (AIMS). Many of the data generated as part of the characterization of the clones in the cDNA libraries (e.g. the sequence similarities described above) are derived from computational analyses, each of which results in an often cryptic text file. An added bonus therefore was the ability of Nautilus to easily import and export data and analysis files. With other LIMS that Roslin assessed, importing files seemed a difficult process which required re-coding each time a new format was introduced. Electronic data capture within Nautilus has eased this process and allowed the integration of external information, from a wide variety of sources, with the data on specific clones and wells held in Nautilus. By simply providing an example file to the Nautilus Parsing Editor and effectively stepping the system through the parsing of that file, a script is generated that can automatically parse files of that format and assign the relevant data into the database. This adds an extra element of flexibility since upgrades to analysis packages frequently alter the output file format. The ease with which parsers can be generated means that new analysis versions and algorithms can rapidly be integrated into Nautilus. Export routines will also be written to integrate the system with other analysis packages. The primary export will be to a micro-array analysis package, initially the MaxD system from the University of Manchester. Thermo LabSystems is currently working with the commercial rights holders of MaxD, Sagitus Solutions, to integrate the MaxD system directly with the Nautilus LIMS using XML. This should simplify the integration further.
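The benefit of describing a file format once and generating the parser from that description can be shown with a small sketch. This is not the Nautilus Parsing Editor or the script it generates; it is an assumed, simplified equivalent in which a format is described as data and a single generic routine applies it. The report layout, the field names and the function `parse_report` are hypothetical.

```python
import csv
import io

# Hypothetical description of one version of a tab-delimited similarity report:
# which column feeds which field, and how each value is converted.
SIMILARITY_REPORT_V1 = {
    "delimiter": "\t",
    "comment_prefix": "#",
    "columns": {
        0: ("clone_id", str),
        1: ("hit_accession", str),
        2: ("e_value", float),
    },
}


def parse_report(handle, spec):
    """Yield one dict per data line, following the format description in spec."""
    data_lines = (
        line for line in handle
        if line.strip() and not line.startswith(spec["comment_prefix"])
    )
    for fields in csv.reader(data_lines, delimiter=spec["delimiter"]):
        yield {
            name: convert(fields[index])
            for index, (name, convert) in spec["columns"].items()
        }


if __name__ == "__main__":
    report = io.StringIO(
        "# clone\thit\te-value\n"
        "P2:A1\tACC0001\t1e-30\n"
        "P9:H12\tACC0002\t0.002\n"
    )
    for record in parse_report(report, SIMILARITY_REPORT_V1):
        print(record)
```

If an upgraded analysis package reorders or adds columns, only a new format description is needed; the parsing routine and the code that loads the records stay the same.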
Wider Application of Nautilus at Roslin
The flexibility of the Nautilus system may also provide benefit to Roslin in areas beyond the ARK-Genomics facility. The Institute has many data systems that track things like animal pedigrees on the experimental farms, laboratory data, genotypings etc. Often data from a single animal is held in different systems and must be integrated through specifically written applications. Nautilus has no built-in concept of sample tube and task to be performed. Consequently, the “thing” being tracked may be an animal rather than a DNA clone in a plate, and the task may be a mating with the outcome being offspring rather than a chemical assay giving a defined result. The ARK-Genomics implementation is therefore a pilot for a potentially wider deployment at Roslin as Nautilus may eventually replace many of these legacy systems. This should help reduce training and support costs and encourage the sharing of data and knowledge between departments and projects.
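The idea of a tracked "thing" and a task that are not tied to tubes and assays can be sketched as a small generic data model. This is an illustration under assumptions, not the Nautilus schema: the classes `Entity` and `Event`, the function `record_event` and all identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Anything being tracked: a clone in a well, an aliquot, an animal."""
    entity_id: str
    kind: str
    parents: list = field(default_factory=list)  # ids of the entities it came from


@dataclass
class Event:
    """A task applied to some entities, with the entities it produced."""
    task: str
    inputs: list
    outputs: list


def record_event(task, inputs, output_ids, output_kind):
    """Create the output entities, linking each one back to every input."""
    parent_ids = [entity.entity_id for entity in inputs]
    outputs = [Entity(oid, output_kind, parents=list(parent_ids)) for oid in output_ids]
    return Event(task, list(inputs), outputs)


if __name__ == "__main__":
    # The same structure records a laboratory step...
    clone = Entity("LIB01:A1", "clone")
    aliquoting = record_event("aliquot", [clone], ["ARR01:A1"], "aliquot")

    # ...and a farm record, where the task is a mating and the outcome is offspring.
    sire, dam = Entity("ANIMAL-001", "animal"), Entity("ANIMAL-002", "animal")
    mating = record_event("mating", [sire, dam], ["ANIMAL-101", "ANIMAL-102"], "animal")

    print(aliquoting.outputs[0])
    print([offspring.parents for offspring in mating.outputs])
```

Keeping the parent links on every output is what allows pedigree-style and plate-lineage questions to be answered from the same records.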
In Summary
Ultimately the measure of success will be the ability of the facility to deliver the data from experiments in full and in a timely fashion. The Institute is a non-profit organization and terms like Return on Investment are not directly relevant. Certainly the purchase of the Nautilus system has allowed the Roslin team to concentrate on fitting the system to their needs, rather than spend time developing an in-house system whilst simultaneously worrying about data being generated and the issues of long-term maintenance and support should the development team move on. The intention is to have the first projects under way before September 2001, i.e. within a year of project initiation. The Institute has long been a leader in Farm Animal genomics and intends to maintain that position. Scientific reputations are on the line.
Author Biographies
Dr Andrew Law received his BSc degree in Animal Science from the Faculty of Agricultural Sciences at Nottingham University in 1986 and a PhD in cattle reproductive physiology from Edinburgh University in 1991. His post-doctorate career has included studies of the molecular biology of chicken growth factors, chicken QTL mapping and the building of generic databases to support QTL mapping. He was appointed Head of Bioinformatics at Roslin Institute in January 2000. This includes overall responsibility for the development and maintenance of several key Farm Animal Genomics databases (resSpecies, ARKdb) used by scientists worldwide as well as providing support within the Institute for sequence analysis, the internal farm databases, and overseeing the development of the Micro-Array facility informatics capability. Andy is also responsible for driving the Institute’s bioinformatics strategy to meet new challenges as they arise.
Mr John Bowman received his BSc in Physics from Liverpool University in 1979, and an MSc in Computer Science with Applications from Aston University in 1984. He has 21 years' experience in computer application development and support in the Telecommunications, Computer Manufacturing, Construction, and Pharmaceutical Industries. His present role is to define and deploy the LIMS within the ARK-Genomics project at Roslin.
Mr Dave Hunter received a BSc degree in Electronics and Electrical Engineering from Glasgow University in 1979. Following 8 years working for Ferranti PLC he returned to full time education, gaining a BSc in Genetics from Aberdeen University in 1992. He then spent 6 years working on the genetics of breast cancer. He was appointed as the Technical Manager of the UK Center for Farm Animal Genomics at the Roslin Institute in Sept 2000. He is responsible for equipment specification and the day-to-day running of the ARK-Genomics Facility.
Dr. David Burt received his BSc in Molecular Biology at Edinburgh University in 1977 and a PhD in molecular genetics from Leicester University in 1980. His post-doctoral training included research on molecular genetics of a wide range of species (phage, bacteria, mouse, rat and human) and research areas (transcription circuits, cDNA cloning of growth factors, renin-angiotensin system and hypertension, QTL mapping and chicken genomics). He has worked in the ICI-Joint Lab at Leicester University, Harvard Medical School (USA), Clinical Research Center (London) and Roslin Institute. He was appointed Head of Avian Molecular Biology in 1988 and ARK-Genomics in 2000. Responsibilities include management of the Institute's poultry genome program and the UK center for functional genomics in farm animals.
Alan Archibald is Head of Genomics and Bioinformatics at Roslin Institute. His research interests are in comparative and functional genomics with the emphasis on pigs. He trained in biochemistry and completed his PhD in biochemical genetics in the Animal Breeding Research Organization. He acquired molecular biology skills at the European Molecular Biology Laboratory. On returning to Roslin he worked in the group that developed transgenic sheep for the production of pharmaceutical proteins in milk before making the switch to farm animal genomics.
*An edited version of this article was published in the August 2001 edition of American Genomics/Proteomics Technology.