Datasets on Blackfynn Discover are publicly accessible through any of the available AWS tools. In addition, datasets smaller than 5 GB can be downloaded at no cost through the browser.
Downloading datasets using the browser
If a dataset is smaller than 5 GB, clicking the Get Dataset button in Blackfynn Discover provides an option to start downloading the data. This is an easy way to access relatively small datasets. The option is not available for larger datasets, which are accessible only by interfacing directly with the AWS ecosystem.
Downloading datasets from AWS
All datasets can be accessed by interacting directly with AWS using your own AWS account. All data for a dataset is stored in a publicly accessible Amazon S3 bucket. You must provide your own AWS credentials to access the data because downloading data can incur costs, which are charged to your account.
Configuring your computer for downloading a dataset takes two steps: setting up an AWS account, and configuring the AWS CLI on your computer to use that account.
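The configuration can be checked from a terminal before attempting a download. The following is a minimal sketch, assuming the AWS CLI is already installed and that credentials were stored beforehand by running aws configure. The function name check_aws_setup is hypothetical and is not part of any Blackfynn or AWS tooling.

```shell
# Hypothetical helper: confirm that the AWS CLI is installed and can
# authenticate with the credentials stored by `aws configure`.
check_aws_setup() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "AWS CLI not found; install it before continuing" >&2
    return 1
  fi
  # Prints the account and identity that the stored credentials resolve to.
  aws sts get-caller-identity
}
```

Run aws configure once to enter your access key, secret key, and default region, then call check_aws_setup. If it prints your account details, the CLI should be ready to use.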
Download a dataset to a local machine
After setting up an AWS account and configuring the AWS CLI on your computer to use it, you can use the following command to download a dataset to a local folder.
aws s3 cp s3://[discover-dataset-bucket] [local-path] --request-payer requester --recursive
This will download the dataset to the [local-path] on your computer.
Example: the following command will download the dataset shown in the image above to the current folder.
aws s3 cp s3://blackfynn-discover-use1/9/3/ . --request-payer requester --recursive
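Before starting a large transfer, it can help to preview what would be copied. The sketch below wraps the same command with the AWS CLI's --dryrun option, which displays the operations without performing them. The function name preview_download is hypothetical, and the sketch assumes the AWS CLI is installed and configured. The preview still sends list requests to the bucket, so it may not be entirely free of charge.

```shell
# Hypothetical helper: show which files the download command would copy,
# without transferring any data.
# Usage: preview_download <bucket-path> <local-path>
preview_download() {
  bucket_path="$1"
  local_path="$2"
  aws s3 cp "s3://${bucket_path}" "${local_path}" \
    --request-payer requester --recursive --dryrun
}
```

For the example dataset, calling preview_download blackfynn-discover-use1/9/3/ . lists each object that the real command would download to the current folder.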
Note: By including the --request-payer requester option, you acknowledge that any costs associated with downloading the data will be charged to your AWS account. For transfer pricing information, visit the AWS S3 Pricing documentation. The relevant section is Data Transfer OUT From Amazon S3 To Internet.
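As a rough illustration of the arithmetic, the transfer cost is approximately the dataset size multiplied by the per-GB rate. The sketch below is an estimate under stated assumptions: the function name estimate_transfer_cost is hypothetical, the rate must be looked up on the AWS pricing page, and tiered pricing, free allowances, and request charges are ignored.

```shell
# Hypothetical helper: rough transfer cost = size in GB x price per GB.
# Both values are supplied by the caller; no price is built in.
# Usage: estimate_transfer_cost <size-gb> <usd-per-gb>
estimate_transfer_cost() {
  awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", size * rate }'
}
```

With a placeholder rate of 0.10 USD per GB (not a quoted AWS price), estimate_transfer_cost 50 0.10 prints 5.00 for a 50 GB dataset.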