168大数据

Title: NIH Invests Nearly $32 Million in Biomedical Big Data Processing

Author: 乔帮主    Time: 2014-10-14 09:30
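In December 2013, NIH launched the Big Data to Knowledge (BD2K) initiative, which aims to advance new technologies and methods for processing and integrating increasingly complex biomedical big data. Under the initiative, NIH will invest nearly $32 million in 2014 in big data processing, with the goal of supporting new data processing techniques, software, and tools so that researchers can better handle and use biomedical big data and, in turn, better serve research on human health. NIH will use the funds to establish 12 data processing centers, each taking on a specific data processing task.
According to NIH's statement, BD2K funding will go mainly to four areas:
1. Optimizing big data computing approaches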
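NIH will develop new big data processing methods, software, and tools. With a focus on specific research questions, the work will integrate, use, and analyze genomic data and administrative data from electronic health records. The University of Pittsburgh and the University of Wisconsin are both taking part in this research.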
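2. BD2K-LINCS data integration
BD2K will support research and development projects related to the Common Fund's Library of Integrated Network-based Cellular Signatures (LINCS) program, an integrated database of cellular signatures.
3. BD2K data index integration consortium
This project aims to mine, analyze, index, and classify biomedical data as a whole.
4. Training of specialists
This funding will train the technical specialists needed for big data processing and analysis.
NIH Director Dr. Francis S. Collins said in a statement that new data are now growing exponentially every day and that biomedical research data sets are becoming ever larger. The grants will help researchers analyze and process these massive data sets, whose potential value for human life and health is immeasurable.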
NIH said that funding for the BD2K initiative is expected to total about $656 million by 2020.
English original:
NIH makes $32 million in awards to mine big data
Hoping to tame the torrent of data churning out of biology labs, the National Institutes of Health (NIH) today announced $32 million in awards in 2014 to help researchers develop ways to analyze and use large biological data sets.
The awards come out of NIH’s Big Data to Knowledge (BD2K) initiative, announced last year after NIH concluded it needed to invest more in efforts to use the growing number of data sets—from genomics, proteins, and imaging to patient records—that biomedical researchers are amassing. For example, in one such “dry biology” project, researchers mixed public data on gene expression in cells and patients with diseases to predict new uses for existing drugs.
The BD2K awards “will help us overcome the obstacles to maximizing the utility of the mammoth data sets that are emerging at an accelerated pace,” said NIH Director Francis Collins in a call today with reporters. The grants, he said, will support computational tools, software, standards, and methods for sharing and using large data sets.
Eleven centers of excellence will receive $2 million to $3 million a year over 4 years to develop tools and methods for everything from modeling cell signaling in cancer to integrating data from mobile sensors worn by volunteers in health studies. Another center award will support a global brain data-collection effort called ENIGMA, which aims to unearth the genetic roots of psychiatric disorders.
One of ENIGMA’s aims is to allow neuroscientists and geneticists to pool hundreds of thousands of DNA samples in the hope of finding genetic variants underlying diseases such as major depressive disorder. Smaller studies have failed to turn up anything of statistical significance for this disorder, perhaps because many different genes contribute minute effects to depression risk that have previously been too small to detect, says Paul Thompson, a neuroscientist at the University of Southern California in Los Angeles. He will lead the ENIGMA Center for Worldwide Medicine, Imaging and Genomics.
Neuroimaging studies have also long struggled with insufficient data, says Hugh Garavan, a cognitive neuroscientist at the University of Vermont in Burlington who recently joined ENIGMA. Roughly “95% of all imaging studies have maybe 20 participants per group,” largely because of the cost of brain scans, which can run roughly $500 to $600 per person, he says. Garavan’s group plans to use the pooled data to explore the genetics and neurobiology of addiction, he says.
ENIGMA may also help scientists study differences in the thickness of the human cortex, the wrinkly layer of tissue that lies on the brain’s surface and performs most of our higher-level thinking. Normally, it takes at least 24 hours to extract information about the cortex’s thickness from an MRI scan: one must digitally strip off the skull, separate the white matter from the gray matter, and delete the cerebrospinal fluid. Access to supercomputer clusters will allow neuroscientists to process that type of data set much more quickly for hundreds of thousands of patients, he says.
Although bigger data sets do raise a risk of getting more false positive results and missing rare variants, overall the data-pooling strategy “makes perfect sense,” says psychiatrist Jack McClellan of Seattle Children’s Hospital in Washington, who is not involved in the ENIGMA project.
The BD2K program will also fund a “data discovery” coordinating center at the University of California, San Diego, that will work with projects at eight other institutions to find ways to make it easier for researchers to find and use data sets. Right now, “you can’t Google scientific data very successfully,” says Philip Bourne, who this past January became the first NIH associate director for data science.
Finally, a set of training awards will support courses and the work of young scientists working on big data projects.
NIH expects to commit a total of $656 million by 2020 to the BD2K initiative.






