About the Center for Big Data in Health Sciences (CBD-HS)
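Revolutionizing public health through Big Data.
The Center for Big Data in Health Sciences is a consortium of faculty from across the Texas Medical Center, including the School of Public Health, the School of Biomedical Informatics, MD Anderson Cancer Center, McGovern Medical School, and others, who are working together to solve public health problems with one of science's most untapped resources: Big Data.
Goals
- Build a national/international Big Data research program for biomedical and health sciences by developing/promoting state-of-the-art Big Data analytics methods and technologies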
- Build a data-driven research platform to bridge the gap between the computational/quantitative scientists and biomedical/health investigators
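- Support development of data science education programs to train the next generation of health data scientists
- Improve the diagnosis, treatment, and prevention of disease and injury through Big Data, and build partnerships with industry to promote individual health and community well-being
Membership
We are looking for CBD-HS members with expertise in the following areas:
- Statistical methods for Big Data analytics
- Bioinformatics data analysis and modeling: omics data analysis and integration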
- Biomathematical modeling and computational biology
- Big Data analytics software development
- Data mining and machine learning
- Expertise and experience in novel data types: text documents, audio, video, EMR, EHR, mHealth, imaging data, EEG, sensor-based data, wearable device data, GPS data, location-based data, social media data, network data, etc.
- High-performance computing: parallel computing, cloud computing, high-performance computing algorithms, numerical optimization algorithms
- Any clinical, biomedical, or health sciences investigator interested in using Big Data for research and practice
Current Research Initiatives
If you are interested in learning more about any of the research initiatives we are working on, or would like to get involved, please contact Kevin Banks.
GEO Big Data Project
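Goals:
- Develop a scalable Big Data analysis pipeline to analyze large numbers of time-course gene expression datasets from the GEO data repository
- Develop a web-based collaboration platform to share the large volume of analysis results with genetics and biomedical collaborators, in order to extract scientific insights and disseminate findings from the scalable analysis pipeline through publications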
EHR Collaboration Working Group
Promoting collaborations between statisticians/data scientists and biomedical/clinical/epidemiological investigators to use EHR/EMR and medical insurance claim data to develop predictive models for disease risks and evaluate effects of clinical treatments to provide treatment recommendations based on the real-world evidence
EHR Methodology Research Working Group
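Developing novel statistical methods and predictive models for EHR and medical insurance claim data to address clinical and public health problems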
UK Biobank Research Working Group
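Developing novel predictive models and statistical methods to integrate the heterogeneous and diverse types of data from the UK Biobank study to address epidemiological and public health problems
How can we help you?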
Contributions to UTHealth community: Collaboration/consulting service and support
We provide collaborative support and consulting services to biomedical and health sciences investigators:
- Design research projects and tools/strategies for Big Data collection
- Develop databases or data warehouses for Big Data management
- Big Data harmonization and integration
- Big Data visualization
- Big Data analytics
- Big Data modeling and prediction
- A Big Data research platform for Big Data identification, management, integration, visualization, analytics, modeling and prediction will be developed to support the Big Data research at UTHealth.
Industry Engagement
We will actively develop collaborations and partnerships with related industries, including local companies and national and international corporations that own Big Data and need analytic support. These partnerships will benefit our Center's faculty in their research and give our students more opportunities for summer internships and jobs.
CBD-HS available resources
Data Resources, Cerner Health Facts
The Cerner Health Facts database covers health care records for 85 systems with 750 facilities in the United States from 2000 to 2018. The patient-level data in Cerner includes longitudinal encounters with detailed records of diagnoses, medications, clinical events, procedures and lab procedures. It represents a total of 69 million unique patients across the United States. Of the 69 million patients, 52% are female and 42% are male (6% are gender-unidentified). The racial makeup of the 69 million patients is 49.5% Caucasian, 11.8% African American, 2.9% Hispanic, 1.8% Asian and Native American, less than 1% Pacific Islander, Middle Eastern Indian, and 16.4% racial status unidentified. Patient marital status is 33% married, 22.6% single, 3.3% divorced, and 3% widowed; the remainder have unidentified marital status. The mean patient age is 46.8 years, with a range of 0-90 years. In total, the database includes 487 million unique encounters with 939 million diagnoses, coded in International Classification of Diseases (ICD-9) codes. The database has 674 million medication records, 118 million procedure records, 5.3 billion clinical event records and 4.2 billion lab procedure records.
Hardware/Software Resources
Infrastructure
The Department of Biostatistics and Data Science owns several state-of-the-art high-performance computing systems. Two recently acquired HPE servers each have 36 cores/72 threads, 768 GB of memory, and 2 x NVIDIA V100 GPUs (16 GB). The two servers are connected by 2 x 10 Gbps fiber to an HPE 3PAR storage node with 192 TB of capacity and are clustered into a Hadoop/HBase/Spark system for Big Data analysis. The Department also has 3 other servers, shown in the picture below. [Photo]
Technical Staff
A team of highly skilled technical staff provides support for computing, data management, and networking. The Department has a programmer analyst/system administrator to install, maintain, and manage all hardware, software, and networks, and a database manager to assist with database creation, manipulation, and retrieval. The School of Public Health also provides additional assistance through the IT Department with computer services, network services, telecom, administrative support, and help desk.
Texas Advanced Computing Center (TACC)
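The Texas Advanced Computing Center (TACC) is a service available to UT researchers that helps them make use of powerful advanced computing technologies. TACC designs and deploys some of the world's most powerful advanced computing technologies and innovative software solutions. TACC's environment includes a comprehensive cyberinfrastructure ecosystem of high-performance computing, visualization, data analysis, storage, archive, cloud, data-driven computing, connectivity, tools, APIs, algorithms, consulting, and software. TACC provides systems and software support for researchers and has supported more than 3,000 projects by over 1,000 researchers at more than 350 institutions nationally and worldwide, addressing scientific questions to improve quality of life. TACC has a number of HPC clusters, including "Stampede", with 6,400 compute nodes, 102,656 cores, 205 TB of memory, and a peak performance of 10 petaflops (PF), ranked 10th in the world on the November 2015 Top 500 list of supercomputers, and "Lonestar", to which UT System institution investigators have exclusive access, with 1,901 compute nodes, 22,256 cores, and 302 TF of theoretical peak performance. "Corral" is a collection of storage and data management resources located primarily at TACC, with 5 petabytes of storage installed in data centers at TACC and in Arlington, plus an additional petabyte of non-replicated storage for low-latency applications.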
Additional databases and centers
Center for Biostatistics Collaboration and Data Services
Contact Us
Kevin Banks
Research Coordinator
kevin.j.banks@uth.tmc.edu
(713)500-9584