The word “customization” has negative connotations in software marketing. Many software customers equate the term with additional (expensive) consulting services or other programming assistance that software vendors may sell in addition to out-of-the-box software. Practically, however, software engineers understand the power of customization. When application programming interfaces (APIs) are built using standard, familiar software tools and architectural styles, programmers have the ability to develop tailored, unique applications. Such capability is essential in modern, next-generation genomics labs, where the pace of technological change demands that programmers and bioinformaticians think outside “out-of-the-box.”
Labs must consider three elements when evaluating a genomics laboratory information management system (LIMS) to support the ever-changing workflow characteristics of next-generation sequencing. First, they should expect and demand systems that mesh with the way they do science and codify best practices in next-generation genomics. Second, they should understand the difference between configurability and customization and how each plays a critical role in transforming out-of-the-box software into adaptable, custom solutions. Third, they should empower the entire lab, from scientists and technicians to bioinformaticians and scientific programmers, to define and implement the parts of the system that they know best. In this way, software becomes an enabler: yet another example of the innovative output that sets cutting-edge labs apart.
Figure 1 – An API provides control and flexibility, while enabling scientific programmers to focus on high-value customization projects.
1. Know What You Need—Then Demand It
Even in labs performing cutting-edge research such as next-generation sequencing (NGS), certain laboratory tasks and workflows are universal. All labs need ways to manage samples, monitor experiments, and collect, analyze, and report data. Since the 1980s, labs have turned to LIMS to handle these types of tasks; LIMS is now a mature class of scientific software employed across diverse fields.
Not surprisingly, many vendors are now introducing LIMS targeted to NGS. The challenge for these vendors, though, is the adaptability and extensibility that an NGS LIMS must possess. Many commercial LIMS designed specifically for NGS research can be rigid and prescriptive about how work proceeds—and changes to the out-of-the-box configuration are discouraged and often impossible. Labs also have the option to work with broad enterprise LIMS vendors, who will build tailored systems using a combination of custom components coded for a requesting lab and components developed by the vendor for other customers. Each customer receives a system built to its specifications, but at a cost—initial implementations using this approach are costly and slow, and when a lab’s needs change (and in next-generation sequencing, change is guaranteed), the vendor will need to update the system.
Faced with these choices, some labs opt to build their own LIMS, tempted by the promise of being able to design and implement exactly the system they want. But most labs soon discover the drawback of going it alone: when workflows and needs inevitably change, the system will need to change just as quickly, requiring a critical investment of time, money, and personnel. A leading-edge NGS lab has no interest in becoming an expert in software design and development.
There’s an easy answer to whether to build or buy: Both. Effectively implementing this hybrid approach requires that labs first select a LIMS that matches their specific science and workflows. Then labs build the parts they are best suited to build, but no more. With targeted NGS options available from a variety of vendors, labs should expect preconfigured, out-of-the-box workflows that codify and support NGS best practices. LIMS should be able to accommodate a lab’s preferred instrumentation. Each type of instrumentation comes with vendor-specified kits and protocols to optimize use and performance, which changes the way LIMS integration can occur. Some systems, for instance, use sample import sheets to run instrumentation, while others work as a “black box” that can pull information directly from the LIMS to conduct experimental runs. A flexible LIMS enables labs to select from out-of-the-box methods to configure their system—no special expertise or programming required.
2. Learn How Configuration and Customization Empower Users
For software to be truly adaptable, it must be both configurable and customizable. Software marketing often conflates these terms, but engineers understand they are distinct—and the distinction is an essential aspect of implementing software to streamline NGS research.
Configuration refers to changes in existing software that can be made via the user interface by any user. As mentioned above, many systems offer preconfigured, out-of-the-box setups that scientists can use to add new lab methods, collect records from a new instrument, or specify a particular sample preparation procedure. Easy configuration empowers scientists, who best understand the requirements and how the system needs to work, to make the changes they need.
More importantly, configuration frees developers to focus on higher-value projects, which typically require customization. Engineers and programmers understand that customization is, quite simply, changing the software logic via scripts or code to enable it to do something new or different. Practically, anyone armed with the appropriate programming expertise, software tools, and APIs can make these types of modifications. Bioinformaticians and scientific programmers work best when they have the power and control afforded by systems that let them use familiar tools to adapt software to the unique needs of their labs. Software that supports both configuration by scientists and lab technicians and customization by bioinformaticians and scientific programmers enables labs to efficiently implement systems that better match their current and future informatics requirements.
3. Consider Open, Flexible APIs for Genomics
Many vendors claim that their systems support easy customization, but ultimately lock down code and APIs in proprietary formats that can be difficult to modify without experience or training. True adaptability leverages tools already familiar to scientific programmers, including modern architectural styles and open source and commercial scripting languages such as Python, Perl, or Groovy.
One API architectural style that is well-suited to next-generation genomics is representational state transfer (REST). RESTful services establish communication between clients and servers by defining resources and applying a small set of methods that retrieve or change the state of those resources. RESTful services differ from Simple Object Access Protocol (SOAP)-based services in several ways:
- REST, as typically implemented over HTTP, relies on only four methods (verbs): GET, POST, PUT, and DELETE, but can accommodate a wide range of resources (nouns)
- REST organizes resources hierarchically into hyperlinks that represent items in the physical world such as samples, lab processes, and derivatives of samples from lab processing
- REST is simple; hyperlinks not only describe the data collected, but provide access to it
Either SOAP or REST services can be used to create interfaces to genomics data. SOAP interfaces, however, might become excessively complicated with operations (verbs) specialized to the types of samples and experiments run in a lab. Most genomics labs organize data hierarchically: They perform work for multiple institutions or partners, which will be organized into different discrete projects, which each include associated samples that are run in specific experiments. RESTful APIs are therefore a natural choice for designing interfaces to genomics data; they are configured to handle the nouns that are so important to genomics research by organizing projects, samples, experiments—whatever the data type—into human-friendly, readily understood hyperlinks.
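The hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base address, the collection names (`institutions`, `projects`, `samples`), and the identifiers are hypothetical and do not describe any particular vendor's API. The sketch builds hyperlinks only and sends nothing over the network.

```python
from urllib.parse import quote

# Hypothetical base address; a real LIMS documents its own.
BASE_URL = "https://lims.example.org/api"

# The four verbs commonly used by RESTful APIs and what each does to a resource.
VERBS = {
    "GET": "read",
    "POST": "create",
    "PUT": "update",
    "DELETE": "remove",
}


def resource_url(*segments):
    """Build a hierarchical resource URL from alternating collection/ID segments."""
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"{BASE_URL}/{path}"


def parent_url(url):
    """Return the URL one level up the hierarchy (drop the last path segment)."""
    return url.rsplit("/", 1)[0]


def describe(verb, url):
    """Describe a request in plain words without sending anything over the network."""
    return f"{verb} {url} -> {VERBS[verb]}"


if __name__ == "__main__":
    # Institution -> project -> sample, mirroring how the lab organizes its work.
    sample = resource_url(
        "institutions", "partner-a",
        "projects", "exome-01",
        "samples", "S-1001",
    )
    print(describe("GET", sample))               # read one sample
    print(describe("POST", parent_url(sample)))  # add a sample to the project
```

The same four verbs apply unchanged whether the noun is an institution, a project, or a sample, so new data types only add new hyperlinks, not new operations.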
A LIMS API that can accommodate the scripting languages familiar to bioinformaticians, such as Python, Perl, or Groovy, gives labs the freedom to build and modify systems almost on the fly. Scientific programmers can build scripts to handle a variety of tasks, such as automating sample tracking or quality control procedures, initiating computational processing, or interfacing with instrumentation or custom analytics. They can also deploy handy plug-ins that trigger scripts to run directly in the lab technician’s user interface, providing control and flexibility without disrupting the scientific workflow.
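As a rough illustration of the kind of quality control script a scientific programmer might write, the Python sketch below flags samples that fall short of simple thresholds. Everything specific in it is an assumption made for the example: the `Sample` fields, the threshold values, and the status strings are hypothetical, and a real script would read sample records from the LIMS API rather than from an in-memory list.

```python
from dataclasses import dataclass

# Hypothetical thresholds; each lab would set its own per protocol.
MIN_CONCENTRATION_NG_UL = 10.0
MIN_TOTAL_MASS_NG = 500.0


@dataclass
class Sample:
    """Minimal stand-in for a sample record (field names are assumptions)."""
    sample_id: str
    concentration_ng_ul: float
    volume_ul: float


def qc_status(sample):
    """Return a QC verdict based on concentration and total mass."""
    total_mass_ng = sample.concentration_ng_ul * sample.volume_ul
    if sample.concentration_ng_ul < MIN_CONCENTRATION_NG_UL:
        return "FAIL: low concentration"
    if total_mass_ng < MIN_TOTAL_MASS_NG:
        return "FAIL: insufficient mass"
    return "PASS"


def run_qc(samples):
    """Map each sample ID to its QC verdict."""
    return {sample.sample_id: qc_status(sample) for sample in samples}


if __name__ == "__main__":
    batch = [
        Sample("S-1001", concentration_ng_ul=25.0, volume_ul=40.0),
        Sample("S-1002", concentration_ng_ul=5.0, volume_ul=100.0),
        Sample("S-1003", concentration_ng_ul=12.0, volume_ul=20.0),
    ]
    for sample_id, verdict in run_qc(batch).items():
        print(sample_id, verdict)
```

Keeping the thresholds as named constants at the top means a protocol change becomes a one-line edit, which is the kind of quick adjustment a scriptable LIMS is meant to allow.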
The benefits of empowering programmers through customization are apparent at two high-profile genomics centers that have extended a commercial LIMS to handle unique lab workflows. The Northwest Genomics Center at the University of Washington configured its LIMS to support existing sample quality control, exome capture, and sequencing workflows in just four months and has made multiple workflow changes in its quantitative polymerase chain reaction (qPCR) methods by developing custom scripts. The USC Epigenome Center at the University of Southern California selected an adaptable commercial LIMS to accommodate its heavy investments in internally developed systems for analyzing and processing epigenomic data, which are written in a variety of languages such as Perl, Java, and R. The integrations also enable scientists working in the Epigenome Center’s workflow execution program, Pegasus, to obtain specific information from the LIMS about problematic sequencing cycles or flow cell tiles, sample details, or library construction and lab processing steps.
In a fast-paced, next-generation lab, scientists succeed by pushing the boundaries of innovation—and they cannot afford to be constrained by the software they implement to manage data and laboratory workflows. A next-generation LIMS, built on standard tools and architectural styles, enables rapid implementation of today’s technologies and empowers scientists and scientific programmers to modify the system to quickly embrace the technologies favored for doing tomorrow’s genomics research.
Mike Sanders is product manager, LIMS platform, for the GenoLogics LIMS, a lab information management system developed to meet the specific needs of the next-generation genomics lab.