Taking on the Data Deluge to Improve R&D Productivity
Data is at the crux of today’s biological research. PubMed now contains more than 25 million references, and over 1 million new articles are published every year, overwhelming researchers trying to find relevant information hidden within the scientific literature. In addition, effective literature search is a complex and lengthy process that can take many months. Overall success rests on the researcher’s ability to harness the data and information available.
Data’s potential to improve R&D outcomes
The potential of big data in scientific research is huge. It can improve our understanding of a disease by determining the complex molecular relationships that underlie it, and it enables scientists to more accurately predict a patient’s response to a drug. Big data is driving precision medicine, using information on patients’ genetic make-up and co-existing conditions to create more effective, tailored treatments. The great promise of precision medicine, however, only comes to fruition when scientists are able to filter through the ‘noise’.
To address the challenges of spiraling costs and increasingly competitive markets, the tools used in scientific research are becoming particularly adept at screening, aggregating and integrating large data sets to uncover unique insights. Additionally, the ease of cloud computing and the processing power of supercomputers have reduced the cost and time involved in crunching terabytes of data. The sheer volume and density of published scientific research continues to increase, and the life sciences space is learning how to manage increasingly complex and diverse data.
To ensure successful and productive research and development, pharmaceutical and life science companies must invest more in solutions that support the early preclinical phases of the drug discovery process to help researchers better understand the biology of diseases. Only through this approach, and with a focus on early research, will pharmaceutical companies be able to increase their productivity and their R&D return on investment.
Mitigating risk by analyzing data from varied sources
Bringing a new drug to market takes 10 to 15 years and costs up to $5 billion – the stakes are high and success is not guaranteed. Mitigating risk in R&D is therefore crucial, and, in life sciences, the ‘fail early and fail cheap’ principle is followed closely. To help mitigate risk throughout the R&D process, researchers need to access many forms of data from a multitude of sources, as early as possible. Published research results and data from varied fields, such as chemistry and biology, provide researchers with the necessary foundation to make confident, better-informed decisions about a drug’s development.
Less obvious data sources can also be extremely valuable – for example, the data that pharmacovigilance (PV) and drug safety teams hold can be effective in drug development. While pharmacovigilance and drug safety teams are typically seen as the bearers of bad news, this perception is at odds with the actual benefits that PV data can deliver to drug researchers; for example, helping scientists to avoid any mistakes made in previous studies, or identifying potential adverse reactions before a drug goes to clinical trial.
In short, researchers must access and use data from multiple sources to cover all bases, make the most of the wealth of data available to them, and maximize the success and efficiency of their studies.
The combination of science and technology
Handling immense data sets from various sources requires a combination of scientific and technological skills; structure, quality, actionability and standardization are all issues with big data. Over 85 percent of medical data is unstructured, yet still clinically relevant. Organizations have had to seek out solutions that allow data to be investigated in a standard, repeatable and structured manner. With the pervasiveness of social media and the increased use of devices connected to the Internet of Things (IoT), the amount of unstructured data being generated is increasing significantly.
In science and research, text-mining tools and the use of relevant vocabularies and taxonomies are essential. Without taxonomies, the only way to find comparable data points is to compute the distance from each point to every other point in the space – a huge number of computations. Taxonomies combined with semantic technology and text-mining tools are an efficient way to discover relevant content and extract key data from multiple, disparate data sources. The resulting nuggets of data can help users make associations where none existed before. Ultimately, it’s the right mix of scientific knowledge and technological know-how that helps researchers be more productive and successful.
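The computational point above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (the document names, taxonomy terms and feature vectors are invented for the example): without a taxonomy, finding the document most similar to a query means computing its distance to every other document; with a taxonomy, candidates can first be restricted to those sharing the query’s category.

```python
import math
from collections import defaultdict

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical corpus: (id, taxonomy term, feature vector)
documents = [
    ("doc1", "oncology",   (0.9, 0.1)),
    ("doc2", "oncology",   (0.8, 0.2)),
    ("doc3", "cardiology", (0.1, 0.9)),
    ("doc4", "cardiology", (0.2, 0.8)),
]

def nearest_brute_force(query_vec, docs):
    """Without a taxonomy: compare the query against every document."""
    return min(docs, key=lambda d: distance(query_vec, d[2]))[0]

def nearest_with_taxonomy(query_vec, query_term, docs):
    """With a taxonomy: only compare documents sharing the query's term."""
    by_term = defaultdict(list)
    for doc in docs:
        by_term[doc[1]].append(doc)
    candidates = by_term[query_term]  # far fewer distance computations
    return min(candidates, key=lambda d: distance(query_vec, d[2]))[0]

query = (0.82, 0.18)
print(nearest_brute_force(query, documents))                # doc2
print(nearest_with_taxonomy(query, "oncology", documents))  # doc2
```

Both searches return the same answer here, but the taxonomy-first version only computes distances within one category, which is what makes the approach scale to millions of documents.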
Embracing the power of data
The power underlying big data is its ability to unlock the value of data-driven decision-making, a principle that applies across industries. In pharmaceutical research, this translates into the ability to mine and analyze different literature and data sources to make relevant associations between genes and proteins in the search for new drugs and treatments. With the data available, answers to important, long-sought questions could finally be reached in a fraction of the time, limiting the number of late-stage failures. Researchers should not be daunted by the deluge of data at their fingertips, but rather embrace the benefits it brings.