University of Minnesota

Kathryn A. Martin Library

Research Data Management (RDM)

  • How will you protect privacy, security, confidentiality, and intellectual property?
  • How does an audience access the data?
  • Who controls the data?

Conditions for sharing YOUR data

Numerous factors affect your ability to share "your data," including:

  • Contracts
  • Grants
  • Institutional Affiliations
    • Intellectual Property Policies
    • IRB agreements
    • Data Retention
  • Collaborators and Colleagues
  • Advisors and Supervisors
  • Competitors
  • Cultural Practices and Expectations of field/lab

We recommend speaking with your faculty advisor and lab directors about the ultimate status and ownership of data created in your work.

Data Ownership

Be Aware - The University is sole owner of all intellectual property created on University property during working hours:

  • Created by University employees in the course of their employment;
  • Created by students or postdoctoral or other fellows in the course of their academic duties or appointments; or
  • Created by individuals, including employees, students, or postdoctoral or other fellows, using substantial University resources. 

From the University Office for Technology Commercialization
University Policy: Research Data Management: Archiving, Ownership, Retention, Security, Storage, and Transfer

Data Privacy

When should research data be kept private?
  • When data is protected by law (HIPAA, FERPA, FISMA)
  • When IRB restrictions obligate you to secrecy/confidentiality to protect the privacy of subjects
  • If the data is covered by binding agreements or de-identification requirements
  • To protect IP rights
  • When data has potential commercial value
  • To protect digital or physical security
  • To keep it from competitors and malicious attackers (e.g., student records), and to prevent accidental release
  • Before publishing a paper/release of results (embargo)

De-Identification of Data

If you want to share research data that contains human subject data, you will have to de-identify that information in spreadsheets, transcripts, and media sources. Below are a definition and a short list of content that must be de-identified before the data is deposited.
 
IRB Definition from Washington State University (http://www.irb.wsu.edu/definitions.asp) (Retrieved on 8/6/2015)
DIRECTLY OR INDIRECTLY IDENTIFIABLE:

Identities of individual subjects are kept by the investigator. If subjects' identities are inseparable from data, then data is directly identifiable. If subjects' identities are kept separate from data, with information connecting them maintained by codes and a master list, then data is indirectly identifiable. In either case, the investigator must assure that confidentiality will be maintained, and must explain how subjects' identities will be protected.

  • Direct identifiers: Direct identifiers in research data or records include names; postal address information (other than town or city, state and zip code); telephone numbers, fax numbers, e-mail addresses; social security numbers; medical record numbers; health plan beneficiary numbers; account numbers; certificate/license numbers; vehicle identifiers and serial numbers, including license plate numbers; device identifiers and serial numbers; web universal resource locators (URLs); internet protocol (IP) address numbers; biometric identifiers, including finger and voice prints; and full face photographic images and any comparable images.
  • Identifiable data or records: Data or records that contain information that reveals, or can likely be associated with, the identity of the person or persons to whom the data or records pertain. Research data or records with direct identifiers removed, but which retain indirect identifiers, are still considered identifiable.
  • Indirect identifiers: Indirect identifiers in research data or records include all geographic identifiers smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent postal codes, except for the initial three digits of a ZIP code; all elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such age and elements may be aggregated into a single category of age 90 or older.
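
As one way to picture the indirect-identifier rules above, here is a minimal Python sketch that keeps only the initial three digits of a ZIP code, reduces dates to the year, and groups ages over 89 into a single category. The record layout and the field names (`zip`, `age`, `birth_date`, `admission_date`) are hypothetical and chosen only for illustration. Running code like this does not by itself make a dataset de-identified; each dataset still needs its own review.

```python
from datetime import date


def generalize_zip(zip_code: str) -> str:
    """Keep only the initial three digits of a ZIP code."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    return digits[:3]


def generalize_age(age: int) -> str:
    """Aggregate all ages over 89 into a single '90 or older' category."""
    return "90 or older" if age > 89 else str(age)


def generalize_record(record: dict) -> dict:
    """Return a copy of a (hypothetical) record with indirect identifiers generalized."""
    over_89 = record["age"] > 89
    return {
        "zip3": generalize_zip(record["zip"]),
        "age": generalize_age(record["age"]),
        # A birth year would reveal an age over 89, so it is dropped in that case.
        "birth_year": None if over_89 else record["birth_date"].year,
        # Other dates keep the year only.
        "admission_year": record["admission_date"].year,
    }


example = {
    "zip": "55812-3010",
    "age": 93,
    "birth_date": date(1922, 5, 1),
    "admission_date": date(2015, 8, 6),
}
print(generalize_record(example))
```

Direct identifiers (names, phone numbers, e-mail addresses, and so on) are not generalized in this sketch; they would be removed or replaced outright.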

Steps in the De-Identification Process

General Guidelines:

  • Don't “identify” in the data creation process:
    • When recording audio, do not use names, places of employment, etc.
    • Agree with participants on pseudonyms or generic references: instead of “my uncle John,” use “an uncle of mine.”
    • Audio/Video: Bleep out names or blur faces.
  • Mark replacements of text clearly, either by using [brackets] or tags:
    • <anon> …  </anon>
  • Keep a secure copy of the non-anonymized data.
  • Create a log of all the replacements, aggregations, or removals made in each data file. Store this log file separately from the de-identified data.
  • Get consent to share from participants.

Remember: Document the steps taken to remove identifiers.
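
The tagging and logging guidelines above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `anonymize`, the sample transcript, and the replacement table are all hypothetical, and the identifiers to replace are assumed to be known in advance (the code does not detect names on its own).

```python
import re


def anonymize(text: str, replacements: dict) -> tuple:
    """Replace known identifiers with tagged stand-ins and log each replacement.

    `replacements` maps an identifying phrase to its stand-in text.
    Returns the tagged text and a list of log entries.
    """
    if not replacements:
        return text, []

    # Try longer phrases first so "my uncle John" is preferred over "John".
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b" + re.escape(p) + r"\b" for p in phrases))

    counts = {}

    def substitute(match):
        original = match.group(0)
        counts[original] = counts.get(original, 0) + 1
        # Mark the replacement clearly with <anon> tags.
        return "<anon>" + replacements[original] + "</anon>"

    tagged = pattern.sub(substitute, text)
    log = [
        {"original": phrase, "replacement": replacements[phrase], "count": n}
        for phrase, n in counts.items()
    ]
    return tagged, log


transcript = "I asked my uncle John, and John agreed to meet at Acme Corp."
tagged, log = anonymize(
    transcript,
    {
        "my uncle John": "an uncle of mine",
        "John": "my uncle",
        "Acme Corp": "my workplace",
    },
)
print(tagged)
for entry in log:
    print(entry)
```

Because the log contains the original identifiers, it would be saved somewhere separate from the de-identified file, alongside the secure copy of the non-anonymized data.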