Datasets

Files

Uploading and managing files in a dataset.

Dataset files are stored in Azure Data Lake Storage. Draft datasets use a separate storage container from published datasets.

For eye imaging datasets, contributors are encouraged to organize data in a CDS-aligned layout and provide DICOM-formatted files when possible.

Uploading Files

Go to /app/datasets/[datasetId]/upload to upload files to your dataset.
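
The dataset pages on this site share one route pattern, /app/datasets/[datasetId]/<tab>. As a minimal sketch, assuming you want to build these paths in a script or test, the helper below fills in the dataset ID. The function name `dataset_tab_path` and the `DATASET_TABS` tuple are illustrative and are not part of the portal.

```python
from urllib.parse import quote

# Tab names taken from the routes listed on this page.
DATASET_TABS = ("upload", "files", "participants", "processing")


def dataset_tab_path(dataset_id: str, tab: str) -> str:
    """Return the app path for one dataset tab (hypothetical helper)."""
    if not dataset_id:
        raise ValueError("dataset_id must not be empty")
    if tab not in DATASET_TABS:
        raise ValueError(f"unknown dataset tab: {tab!r}")
    # Percent-encode the ID so it always stays a single path segment.
    return f"/app/datasets/{quote(dataset_id, safe='')}/{tab}"


if __name__ == "__main__":
    print(dataset_tab_path("abc123", "upload"))
```

Validating the tab name up front keeps a typo from silently producing a path that does not exist.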

Files uploaded while the dataset is in draft state are written to draft storage. When a dataset is published, files are moved to published storage as part of the finalization step.
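
The draft-to-published flow can be pictured with a small in-memory model. This is only a sketch of the behavior described above, not the portal's implementation: the container names, the `DatasetStorage` class, and the rule that uploads are rejected after publishing are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical container names; the real names are not given on this page.
DRAFT_CONTAINER = "draft"
PUBLISHED_CONTAINER = "published"


def _empty_containers() -> dict:
    return {DRAFT_CONTAINER: {}, PUBLISHED_CONTAINER: {}}


@dataclass
class DatasetStorage:
    """In-memory stand-in for the two storage containers of one dataset."""

    status: str = "draft"
    containers: dict = field(default_factory=_empty_containers)

    def upload(self, path: str, data: bytes) -> None:
        # Assumption: uploads are only accepted while the dataset is a draft.
        if self.status != "draft":
            raise RuntimeError("dataset is not in draft state")
        self.containers[DRAFT_CONTAINER][path] = data

    def publish(self) -> None:
        # Finalization step: move every draft file to published storage.
        draft = self.containers[DRAFT_CONTAINER]
        self.containers[PUBLISHED_CONTAINER].update(draft)
        draft.clear()
        self.status = "published"


if __name__ == "__main__":
    storage = DatasetStorage()
    storage.upload("oct/scan_01.dcm", b"...")
    storage.publish()
    print(sorted(storage.containers[PUBLISHED_CONTAINER]))
```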

Viewing Files

The Files tab at /app/datasets/[datasetId]/files lists all files currently associated with the dataset, including their names, sizes, and paths.
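
To make the listing concrete, the sketch below models a file entry with the three attributes the Files tab shows (name, size, path). The `FileEntry` type and both helper functions are hypothetical; the portal's actual data model and size formatting are not documented here.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class FileEntry:
    """One file in a dataset (illustrative shape, not the portal's schema)."""

    path: str
    size_bytes: int

    @property
    def name(self) -> str:
        # The file name is the last segment of the storage path.
        return PurePosixPath(self.path).name


def format_size(size_bytes: int) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ("B", "KiB", "MiB", "GiB", "TiB")
    size = float(size_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size_bytes} B" if index == 0 else f"{size:.1f} {units[index]}"


def list_files(entries):
    """Return (name, size, path) rows sorted by path."""
    ordered = sorted(entries, key=lambda entry: entry.path)
    return [(e.name, format_size(e.size_bytes), e.path) for e in ordered]


if __name__ == "__main__":
    for row in list_files([FileEntry("oct/scan_01.dcm", 2048)]):
        print(row)
```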

Participants

For datasets that include participant-level data, the Participants tab at /app/datasets/[datasetId]/participants provides a view of participant records associated with the dataset.

Processing

The Processing tab at /app/datasets/[datasetId]/processing shows the status of any data processing jobs running against the dataset files.

Notes

  • Uploading files requires the editor role or higher on the dataset.
  • Large uploads can take a long time to complete. Keep the browser tab open until the upload finishes.
  • There is no file versioning within a single dataset version. To update files, create a new dataset version.
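
The role requirement in the notes above amounts to a comparison against an ordered list of roles. The sketch below shows one way to express it. Only the "editor" role is named on this page, so the other role names and their order are assumptions, as is the function name `can_upload`.

```python
# Hypothetical role ladder, lowest to highest. Only "editor" is named on
# this page; the other roles are placeholders for illustration.
ROLE_ORDER = ("viewer", "editor", "admin", "owner")
MIN_UPLOAD_ROLE = "editor"


def can_upload(role: str) -> bool:
    """Return True if the role is editor or higher on the assumed ladder."""
    if role not in ROLE_ORDER:
        # Unknown roles are denied rather than guessed at.
        return False
    return ROLE_ORDER.index(role) >= ROLE_ORDER.index(MIN_UPLOAD_ROLE)


if __name__ == "__main__":
    for role in ROLE_ORDER:
        print(role, can_upload(role))
```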

De-identification and PHI Handling

Contributors are primarily responsible for de-identifying data before upload, including image files, filenames, and embedded metadata.

Envision Portal workflows may include additional checks for obvious PHI indicators in text-based fields and metadata. These checks are a safety layer, not a replacement for contributor-side de-identification and institutional compliance review.

Before upload, confirm:

  • Participant identifiers are removed or transformed per your approved policy
  • DICOM tags containing direct identifiers are removed or appropriately anonymized
  • Filenames and folder paths do not expose patient identifiers
  • Consent and data use constraints are reflected in metadata access settings
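
Part of the checklist above, the filename and folder path item, can be partly automated before upload. The following is a minimal heuristic sketch, not the portal's own check: the patterns and the function name `flag_path` are assumptions, and they will produce both false positives and false negatives. It does not inspect DICOM tags or file contents and does not replace a compliance review.

```python
import re

# Illustrative patterns only; adapt them to your institution's identifiers.
SUSPICIOUS_PATTERNS = {
    # Seven or more consecutive digits, e.g. a medical record number.
    "long_digit_run": re.compile(r"\d{7,}"),
    # YYYY-MM-DD style tokens (separators optional), e.g. a birth date.
    "date_like": re.compile(
        r"(?<!\d)(?:19|20)\d{2}[-_.]?(?:0[1-9]|1[0-2])[-_.]?"
        r"(?:0[1-9]|[12]\d|3[01])(?!\d)"
    ),
    # Common identifier keywords, matched case-insensitively anywhere.
    "identifier_keyword": re.compile(r"(?i)(?:mrn|ssn|dob|patient[-_ ]?name)"),
}


def flag_path(path: str) -> list:
    """Return the labels of all patterns found in a file or folder path."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(path)
    ]


if __name__ == "__main__":
    for candidate in ("sub-001/oct/scan_01.dcm", "MRN12345678/scan.dcm"):
        print(candidate, flag_path(candidate))
```

An empty result means only that none of these patterns matched. Paths still need a manual review.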