NextBio and Intel Announce Collaboration to Optimize Use of Hadoop Stack And Move Forward With Big Data Technologies in Genomics
Posted Jul 12 2012 11:37am
Big data clearly belongs in genomic research, and here we have an example of Intel and NextBio putting the two together. If you have not heard of Hadoop, it is an open-source framework for storing and processing very large data sets across clusters of machines, and arguably the next great data solution since SQL Server. There are also several platforms and distributions built on top of or around it, such as HBase and Cloudera's distribution. Below is a post from a few months ago about the Cleveland Clinic developing its Hadoop endeavors.
A few months ago Intel announced the opening of its new Science and Technology Center for Big Data, which is headquartered at MIT. We can expect big things from the center, with new designs in both hardware and software. Sophisticated algorithms will be working in the cloud, and of course the software on the processing chip, with plenty of cores behind it, is needed to keep things operating at the required levels.
To see all that NextBio offers in analytics and information for clinical and academic use, see the FAQ pages, which outline each area pretty thoroughly. One short paragraph from the site about the Genome Browser example is below:
“Genome Browser is an easy-to-use, interactive application ("app") that you can use to view the physical relationships across biosets and different types of genomic elements. Some of these elements include genes, miRNA targets, CNVs, CpG islands, SNPs, GWAS associations, and LD blocks.”
SANTA CLARA, Calif.--NextBio and Intel announced today a collaboration aimed at optimizing and stabilizing the Hadoop stack and advancing the use of Big Data technologies in genomics. As a part of this collaboration, the NextBio and Intel engineering teams will apply experience they have gained from NextBio's use of Big Data technologies to the improvement of HDFS, Hadoop, and HBase. Any enhancements that NextBio engineers make to the Hadoop stack will be contributed to the open-source community. Intel will also showcase NextBio's use of Big Data.
"NextBio is positioned at the intersection of Genomics and Big Data. Every day we deal with the three V's (volume, variety, and velocity) associated with Big Data – We, our collaborators, and our users are adding large volumes of a variety of molecular data to NextBio at an increasing velocity," said Dr. Satnam Alag, chief technology officer and vice president of engineering at NextBio. "Without the implementation of our algorithms in the MapReduce framework, operational expertise in HDFS, Hadoop, and HBase, and investments in building our secure cloud-based infrastructure, it would have been impossible for us to scale cost-effectively to handle this large-scale data."
Today, NextBio is used by researchers and clinicians in over 40 top commercial and academic institutions including the University of Southern California, Sanford-Burnham Medical Research Institute, Celgene, Eli Lilly, Genzyme, Johnson & Johnson, Merck, Regeneron, Scripps Research Institute, Stanford University, University of California at Berkeley, Takeda, and many others.