Dataverse

Harvard University Institute for Quantitative Social Science (IQSS)
United States

About

Launched: 2007
Record Updated: Apr 25, 2024
Repository software

The Dataverse Project is an open-source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others and allows you to replicate others' work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility.

A Dataverse repository is the software installation, which then hosts multiple virtual archives called Dataverse collections. Each Dataverse collection contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data). As an organizing method, Dataverse collections may also contain other Dataverse collections.
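The containment hierarchy described above can be sketched as a small data model. This is only an illustration of the nesting (collections hold datasets and other collections; datasets hold metadata and files). The class names, field names, and sample values are hypothetical and are not taken from the Dataverse codebase or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataFile:
    # A data file, or documentation/code that accompanies the data.
    name: str


@dataclass
class Dataset:
    # A dataset carries descriptive metadata and its data files.
    title: str
    metadata: dict[str, str] = field(default_factory=dict)
    files: list[DataFile] = field(default_factory=list)


@dataclass
class Collection:
    # A Dataverse collection holds datasets and, optionally, other collections.
    name: str
    datasets: list[Dataset] = field(default_factory=list)
    subcollections: list[Collection] = field(default_factory=list)

    def count_datasets(self) -> int:
        # Count datasets in this collection and in every nested collection.
        nested = sum(c.count_datasets() for c in self.subcollections)
        return len(self.datasets) + nested


# Hypothetical example: a root collection containing one nested collection.
root = Collection(name="Example University")
lab = Collection(name="Example Lab")
lab.datasets.append(
    Dataset(
        title="Survey 2020",
        metadata={"author": "A. Researcher"},
        files=[DataFile("survey.csv"), DataFile("codebook.pdf")],
    )
)
root.subcollections.append(lab)
root.datasets.append(Dataset(title="Campus Census"))
```

Modeling sub-collections as a recursive field mirrors the statement that collections may contain other collections; a real installation stores this hierarchy in its own database schema, which this sketch does not attempt to reproduce.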

Mission

The mission of the Dataverse Project is to revolutionize the way data is managed and shared by automating tasks traditionally carried out by professional archivists. Our goal is to empower data creators by providing services that allow them to receive proper credit for their data while ensuring long-term preservation. We aim to eliminate the dilemma researchers faced in the past, where they had to choose between control and credit on the one hand and preservation on the other. With the Dataverse Project, we break this dichotomy by creating a Dataverse collection on your website that maintains your branding and URL, offers academic citation for the data, and provides full credit and visibility. The Dataverse software also has full support for quality control, for example in the form of curation workflows. At the same time, our Dataverse repositories, backed by institutions, guarantee long-term preservation.

Key Achievements

  • The Dataverse UI is being redesigned as a single-page application (SPA) in which internal and external Dataverse services will be delivered solely through improved APIs. This change will improve UI responsiveness and empower the Dataverse community to develop their own UIs and applications using extended Dataverse API endpoints. Read more about our roadmap for the Dataverse infrastructure redesign in: Restructuring the Dataverse UI as a Single-Page Application: https://docs.google.com/document/d/19pbENuYyHErEmblbFGQ47_uJpTfqVKbn9O0QftVqeeU/edit#heading=h.9b7lzr4a7odc
  • Generalist Repository Ecosystem Initiative (GREI). Multi-year grant starting in 2022 to develop collaborative approaches for data management and sharing through inclusion and enrichment of generalist repositories in the NIH data ecosystem, including the Harvard Dataverse Repository. The announcement has details of areas being explored: https://datascience.nih.gov/data-ecosystem/generalist-repository-ecosystem-initiative
  • CAFE Grant: The BUSPH-HSPH Climate Change and Health Research Coordinating Center (CAFÉ) is a three-year cooperative agreement between the Boston University School of Public Health, the Harvard T.H. Chan School of Public Health, and the National Institutes of Health. CAFÉ aims to Convene, Accelerate, Foster, and Expand the climate and health community of practice, both in the US and globally. This collection serves the climate and health community of practice as a repository for datasets of any kind that enable broad, interdisciplinary research in the area of climate and health: https://github.com/Climate-CAFE/

    Technical Attributes

    Open Code Repository

    Implemented

    Maintenance Status

    Actively maintained

    Technical Documentation

    Implemented

    Open Product Roadmap

    Implemented

    Open API

    Implemented

    Open Data Statement

    Implemented

    Content Licensing

    By default, all datasets added to a Dataverse repository are granted the CC0 Public Domain Dedication. The Dataverse software applies the CC0 waiver by default to all datasets (version 4.0 and later) because CC0 is widely recognized in the scientific community, is a familiar option for data (to which copyright generally does not apply), and is used by repositories as well as by scientific journals that require the deposit of open data. For more information on the CC0 waiver, please visit the Creative Commons website (https://creativecommons.org/share-your-work/public-domain/cc0). Data depositors can opt out of using the CC0 waiver for their datasets if needed.

    Standards Employed

    OAI-PMH, OAuth, OIDC, SWORD, Signposting, OStatus, Dublin Core, Data Documentation Initiative Codebook, DataCite, OpenAIRE, Schema.org, Open Archives Initiative Object Reuse and Exchange (OAI-ORE)

    Hosting Options & Service Providers

    Hosting Strategy

    Hosting through third party only

    Service Providers

    What other tools and projects does your project interact with?

    RServe, Binder, Whole Tale, OSF, RSpace, GitHub, GitLab, Renku, OJS, Archivematica, RedCap, iRODS, Duracloud, Dropbox

    Community Engagement

    Code of Conduct

    Implemented

    Community Engagement

    Implemented

    Contribution Guidelines or Fora

    Implemented

    Organizational Commitment to Community Engagement

    Dataverse engages extensively with its user community through a variety of media.

  • Monthly Community Calls are held on Zoom to discuss upcoming releases, development contributions from the community, and other topics relevant to our community: https://dataverse.org/community-calls.
  • Dataverse Community and Developer Mailing Lists provide open fora for community members.
  • Annual Community Meetings provide an opportunity to engage around themes such as sustainability of services and infrastructures and Indigenous Data Sovereignty: https://dataverse.org/events.
  • The Global Dataverse Community Consortium (GDCC) is dedicated to providing international organization to existing Dataverse community efforts and will provide a collaborative venue for institutions to leverage economies of scale in support of Dataverse repositories around the world: https://dataversecommunity.global/.
  • DataverseTV highlights fantastic video content from the Dataverse community: https://dataverse.org/dataversetv.
  • Dataverse Google Group: https://groups.google.com/g/dataverse-community

    Engagement with Values Frameworks

    User Contribution Pathways

    • Contribute to code
    • Contribute to documentation
    • Contribute to education or training
    • Contribute to working groups or interest groups

    Policies & Governance

    Governance Summary

    Dataverse is developed at Harvard's Institute for Quantitative Social Science (IQSS), along with many collaborators and contributors worldwide. Harvard has two governing boards, the Board of Overseers and the Harvard Corporation. There is also a Global Dataverse Community Consortium.

    Policies

    Commitment to Equity & Inclusion

    Implemented

    Privacy Policy

    Implemented

    Web Accessibility Statement

    In Progress

    Open Data Statement

    Implemented

    Governance Structure & Processes

    Implemented

    Additional Information

    Organizational History

    The Dataverse Project is being developed at Harvard's Institute for Quantitative Social Science (IQSS), along with many collaborators and contributors worldwide. The Dataverse Project was built on our experience with our earlier Virtual Data Center (VDC) project, which spanned 1997-2006 as a collaboration between the Harvard-MIT Data Center (now part of IQSS) and the Harvard University Library. Precursors to the VDC date to 1987 and include pre-web software that automatically transferred cataloging information by FTP to other sites across campus at designated times and, before that, a stand-alone software guide to local data.

    Organizational Structure

    Business or Ownership Model

    Fiscal sponsorship (academic institution)

    Full-time Staff

    18

    Funding

    Primary Funding Source

    Other