Research Support: Data Management

 

Protect your data

The Health Sciences Library offers researchers on this campus support in the areas of data management, sharing, and curation.

Explore the sections below to see what the HSL can do for you.


 

By Mushonz (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

Data management can occur at any point in the research data lifecycle, from data collection through the "Data Afterlife". Let the Health Sciences Library help you navigate this process so that your data is in the best possible shape for analysis and reuse.

Data Management means:

-Storage

-Backup

-Organization

-Documentation

-Sharing

-Preservation

 


The first step to managing your research data is coming up with a plan: a Data Management Plan (DMP). 

University of Colorado Anschutz Medical Campus is a partner institution of a service called DMPTool. The tool is free to use and links our users to local data management resources at the Health Sciences Library. It also contains custom templates and requirements from a variety of funding agencies to get you started on your data management plan. Going through this process is beneficial even if your funding agency does not require it, because it prompts you to consider what will happen to your data in the long term. Having this perspective while you're generating data makes the process more efficient by integrating data curation and preservation into data generation.

  • The value of Data Management Plans (video)
  • Log in to DMPTool (make sure to select University of Colorado Anschutz Medical Campus)
  • NIH Guidance on data sharing

The most straightforward way to share your data is to submit it to a repository. This strategy is often preferable to creating your own web interface because it requires no maintenance on the researcher's part, which can be cost- and time-prohibitive. Below are examples of different types of data repositories recommended by the Health Sciences Library.


Digital Collections of Colorado: Our Institutional Repository

The Health Sciences Library provides repository services to researchers on our campus. This repository is an excellent option for researchers who want to share their data freely for an indefinite period of time.

  • Provides 1 TB of storage per user
    • 2 GB per individual file
  • Provides a unique identifier (Handle, DOI) for your dataset
    • Allows people to cite your data
  • Can be used to share descriptive information (metadata) about your data and link to external data sources
  • Permanent archiving
  • Contact Heidi Zuniga to discuss depositing material into the digital repository
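
Before contacting the library, it can help to check your dataset against the storage limits listed above. The following is a minimal sketch, not a tool provided by the repository. It assumes decimal units (1 GB = 10^9 bytes), which this page does not specify, and the function names `check_deposit` and `scan_folder` are hypothetical.

```python
from pathlib import Path

# Limits quoted on this page. Decimal units are an assumption;
# confirm the exact figures with the library before depositing.
MAX_FILE_BYTES = 2 * 10**9    # 2 GB per individual file
MAX_TOTAL_BYTES = 10**12      # 1 TB of storage per user


def check_deposit(sizes):
    """Check a mapping of file name -> size in bytes against the limits.

    Returns (total_bytes, oversized_file_names, fits_in_total_quota).
    """
    oversized = sorted(name for name, size in sizes.items()
                       if size > MAX_FILE_BYTES)
    total = sum(sizes.values())
    return total, oversized, total <= MAX_TOTAL_BYTES


def scan_folder(folder):
    """Collect the size of every file under a folder (hypothetical helper)."""
    root = Path(folder)
    return {str(p.relative_to(root)): p.stat().st_size
            for p in root.rglob("*") if p.is_file()}
```

For example, `check_deposit(scan_folder("my_dataset"))` reports the total size of a local folder and names any files that would need to be split or compressed before deposit.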

Have clinical data that you can't openly share? Store it in REDCap.

REDCap (Research Electronic Data Capture) is a secure, HIPAA-compliant, web-based application designed for data collection in research studies. REDCap provides data entry with validation, an import feature, automated export to popular statistics packages, and more.


Molecular data: NCBI databases

The repositories that people are probably most familiar with are the databases at the National Center for Biotechnology Information (NCBI). Many publishers already require authors to submit sequence data for publication. The databases are separated by the type of data they contain (genomic sequences, expression datasets, clinical data, etc.), and each requests appropriate metadata along with the submission. After submission, NCBI makes the data searchable and links them to related data sources in other databases.


All-purpose data repositories

Many data repositories will house essentially any type of data and provide long-term preservation, unique identifiers for data citation, and guidance on metadata. They typically charge a fee to archive your data and make it searchable. This fee varies with the size of the dataset and the repository. They can be more flexible than the NCBI databases because they allow you to associate multiple types of data and analysis tools in one place. Here are a few examples.

     


Discipline-specific Repositories

Other repositories limit themselves to data from a particular (sub)discipline and/or data type. This is advantageous because the metadata requirements are more straightforward, and your data will reach a target audience of others who are interested in the same types of data that you are. A comprehensive list of discipline-specific databases can be found on our Bioinformatics Research Support page.

Data is not very informative without context. Thus, applying appropriate contextual information about your data (a.k.a. metadata) is key to making your research data reusable, for yourself and for the public. 

But how do you know what information to include? Luckily, there are many existing metadata standards. Some are generic, like Dublin Core. Others are specific to a field or data type, like the Minimum Information About a Microarray Experiment (MIAME). These standards often draw on controlled vocabularies and ontologies, such as Medical Subject Headings (MeSH) for medical terminology or the Gene Ontology (GO). The development of metadata standards in many biomedical disciplines is still in its infancy, but existing standards can be modified to fit your data. A good place to start is to look at the repository where you'd like to share your data and find out what its standards are.
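
To make the idea of a generic standard concrete, here is a minimal sketch of a Dublin Core record built with Python's standard library. It is an illustration only, not a format required by any repository named on this page. The wrapper element `metadata`, the function name `dublin_core_record`, and all field values are placeholder assumptions. The element names are the fifteen elements of the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements defined by the Dublin Core element set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core fields to an XML string.

    Each value may be a single string or a list of strings, since
    every Dublin Core element is repeatable.
    """
    root = ET.Element("metadata")  # wrapper name is an arbitrary choice
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for a hypothetical dataset.
record = dublin_core_record({
    "title": "Example expression dataset (hypothetical)",
    "creator": "Doe, Jane",
    "subject": ["Gene Expression Profiling", "Microarray Analysis"],
    "format": "text/csv",
})
print(record)
```

Rejecting unknown element names is a design choice in this sketch: it catches misspelled fields early, which is the kind of consistency a metadata standard is meant to provide.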

Still not sure where to start? Contact Wladimir Labeikovsky to set up a free data curation consultation. 
