Wikimedia: Janice Haney Carr

Today, we are living through the onset of a microbial revolution. With the threat of pandemics on the horizon, keeping ahead in the race against evolving diseases is paramount.

Humans have recruited big data as their latest weapon in this war. Using protein structure modelling, scientists can predict the three-dimensional structure of a protein from its amino acid sequence. From this, they can design novel drugs that fit into the protein’s unique pockets and inhibit its activity.

Super-fast sequencing has revolutionised this field. The National Center for Biotechnology Information, founded in 1988, remains a major repository of sequence information. Incoming DNA sequences can be translated into protein sequences, which offer clues about the survival and pathogenicity of dangerous organisms.
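To make the translation step concrete, here is a minimal sketch in Python. It assumes the standard genetic code, a single forward reading frame starting at the first base, and a hypothetical helper name `translate`. Real pipelines also handle all six reading frames, alternative genetic codes and ambiguous bases.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string into one-letter amino acid codes.

    Hypothetical helper: reads codons from position 0 and stops at
    the first stop codon. Unrecognised codons become "X".
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))
```

The example sequence is made up for illustration and is not taken from any real organism.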

A big data approach has turned the discipline into the analysis of micro-level patterns at the macro level, a massive operation made possible by constantly improving computational power. The Basic Local Alignment Search Tool (BLAST) looks for sequence similarities between known proteins and an unknown protein of interest, which helps assign proteins to families based on those similarities. Once the protein's structure has been elucidated, drugs can be designed to fit into its pockets and exert the desired effects.
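The idea of scoring local similarity between two sequences can be sketched with the classic Smith-Waterman dynamic programme. This is not BLAST itself, which uses faster heuristics and amino acid substitution matrices. The function names and the simple match, mismatch and gap scores below are assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between a and b.

    Illustrative scoring only: a flat match/mismatch score and a
    linear gap penalty stand in for a real substitution matrix.
    """
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a fresh local alignment
                prev[j - 1] + step,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def rank_by_similarity(query: str, known: dict) -> list:
    """Hypothetical helper: sort known protein names by score against query."""
    scores = {name: local_alignment_score(query, seq)
              for name, seq in known.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up sequences, for illustration only.
known_proteins = {"family_A": "MAIVMGRWKG", "family_B": "QQQPLLSTTE"}
print(rank_by_similarity("AIVMGR", known_proteins))
```

A higher score suggests the unknown protein shares a stretch of sequence with a known one, which is the kind of evidence used to place it in a family.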

Data science can only go so far towards generating antimicrobial drugs. Life is far more complex than any statistical model, so whilst big data approaches can channel research, they must be coupled with laboratory studies to validate any claims. Even so, these recent advances herald a new era of pharmaceutical innovation.