Data Management Specialist / Data Engineer

Full Time
Cambridge, MA 02142
Job description
Background:
The Stanley Center for Psychiatric Research at the Broad Institute, in partnership with the Department of Epidemiology at Harvard T.H. Chan School of Public Health, has launched a major global psychiatric genetics initiative in Africa entitled Neuropsychiatric Genetics in African Populations. The goal of this partnership is to advance the genetic analysis of serious mental illness and contribute to global mental health equity by expanding the infrastructure and research findings from large-scale psychiatric genetic epidemiology to Africa. One project within this initiative is NeuroDev, a study of neurodevelopmental disorders in Sub-Saharan Africa, led by Dr. Elise Robinson in collaboration with local African institutions.

NeuroDev aims to 1) expand knowledge of the genetic architecture of neurodevelopmental disorders in Africa through large-scale sample collection, analysis, and follow-up, and 2) increase understanding of the genetics of African populations. Currently, this sample collection effort spans two countries, South Africa and Kenya, with total collection goals of 1,880 child cases, 1,800 child controls, and 580 trios (mother-father-child) through 2023. Current funding for this project comes from the National Institute of Mental Health, the National Institute of Child Health and Human Development, and the Simons Foundation.

Job Summary
The candidate will work with the project team to gather requirements, extract data from multiple sources, and load it into target data repositories, both for monitoring statistics on ongoing data collection and for use in reporting and analysis. The candidate will also be responsible for building reports for the project team, including reports on system health, and for managing timelines for deliverables. The candidate will assist in writing and updating data management plans.

In addition, the candidate may assist in data curation activities, working with teams to organize and manage data storage (on-premises and cloud), manage shared data sets used in research, and transfer data sets to and from public data repositories (UK Biobank, dbGaP, NIMH, etc.).

The candidate will be collaborative, organized, able to multitask, and able to operate in a team-oriented environment. This role offers the opportunity to gain experience in data management in a highly collaborative environment.

Responsibilities:
  • Work with project teams in gathering and tracking all project/reporting requirements for assigned projects, including change requests that occur throughout the project
  • Maintain data pipeline and database architecture, including quality control / logic checks on data ingested and stored in the central database
  • Work with stakeholders (project managers, clinical site leads, researchers) to identify data availability requirements and assist in building delivery solutions
  • Assemble data sets that meet functional / non-functional business requirements
  • Identify and design internal process improvements, e.g., automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
  • Work with stakeholders (project managers, clinical site leads, researchers) to assist with data-related issues
  • Ensure data access restrictions are enforced across different data centers
  • Assist with (or lead) documenting the data management plan for the assigned project
  • Assist in exporting/moving study data into secure repositories (e.g., the NIMH Data Archive or similar) in line with project data sharing goals and requirements
  • Assist in the management of various repositories (e.g., source code, file shares, web projects)
  • Attend project meetings to stay up to date with project activities and report on project deliverables

Requirements:
  • Previous experience in a data-centric role
  • Experience working with databases
  • Hands-on experience in a programming language (Python, bash, etc.)
  • Hands-on experience with SQL
  • Experience building reports
  • Experience with office document software: Google Office, MS Word, MS Excel, or equivalent
  • Must be able to effectively work on multiple tasks and prioritize appropriately
  • Strong communication and team skills are essential

Experience preferred:
  • Work with data repositories in multiple locations
  • Work with data collection for research
  • GitHub
  • Cloud infrastructure, Google Cloud preferred
  • Electronic Data Capture tools such as REDCap
  • Interacting with APIs
All Broad employees, regardless of work location, must be fully vaccinated for COVID-19 by Tuesday, October 12, 2021. Requests for exemption for medical reasons or sincerely held religious beliefs will be considered.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
Equal Opportunity Employer Minorities/Women/Protected Veterans/Disabled
