Deploy a Database Cluster
precisionFDA provides AWS Aurora RDS database clusters that are accessible from Apps and Workstations. To use this capability, you must first request DB Cluster access for your precisionFDA username.
Create the Database
Select the Databases tab in My Home and click the Create Database button.

Enter the name “Workstations and Databases Tutorial” and a password (“password” in this example), choose PostgreSQL 11.16 on the smallest available database instance type, and click the Submit button.

Refresh the database status using the refresh button until the database is available.

Click on the Workstations and Databases Tutorial database to open the detail page and copy the host endpoint URL.

Connect to the cluster DB from pgAdmin
In the pgAdmin web service, add a new server for the Workstations and Databases Tutorial DB cluster using the host endpoint, the user root, and the password specified when the database was created.

Note that we now have connections to both the local database on the data analysis workstation and the cluster database.

Create a new database and tables
Connect to the cluster database from psql in the data analysis workstation shell.
Using psql, create a new database.
Connect to the new database and create two tables.
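The three steps above might look like the following from the workstation shell. This is a minimal sketch rather than the tutorial's original commands: the column definitions are hypothetical, `your-host-endpoint` is a placeholder for the endpoint copied from the database detail page, and the first connection assumes the default `postgres` maintenance database exists on the cluster. The database and table names are the ones used later in this tutorial.

```shell
# Placeholder: substitute the host endpoint copied from the database detail page.
DB_HOST="${DB_HOST:-your-host-endpoint}"
NEW_DB="workstations_and_databases_tutorial_db"

# Hypothetical schema for the two tables; adjust the columns to match your data files.
cat > create_tables.sql <<'SQL'
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    full_name  TEXT NOT NULL,
    birth_date DATE
);

CREATE TABLE observations (
    observation_id INTEGER PRIMARY KEY,
    patient_id     INTEGER NOT NULL REFERENCES patients (patient_id),
    observed_on    DATE,
    code           TEXT,
    result_value   NUMERIC
);
SQL

# Create the database, then run the DDL inside it.
# psql prompts for the root password unless PGPASSWORD is set.
create_tutorial_db() {
    psql -h "$DB_HOST" -U root -d postgres -c "CREATE DATABASE ${NEW_DB};" &&
        psql -h "$DB_HOST" -U root -d "$NEW_DB" -f create_tables.sql
}
```

Calling `create_tutorial_db` runs both steps in order. In an interactive psql session, the equivalent is `CREATE DATABASE ...;`, then `\c workstations_and_databases_tutorial_db`, then the two `CREATE TABLE` statements.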
Load the cluster database from delimited text files
This workflow may seem over-engineered for loading two data files, but the same techniques have been used to reliably and efficiently transfer tens of thousands of files and more than 15 TB of data to precisionFDA.
In the data analysis workstation shell, create a datafiles directory
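A one-line sketch of this step, assuming the directory is created relative to the current working directory of the shell (the exact location is not specified in this tutorial):

```shell
# Create the staging directory for downloaded data files (no error if it already exists).
mkdir -p datafiles
```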
Create and upload delimited data files
On your local client (e.g., a laptop), create a file named patients.txt with the following content:
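The original file content is not reproduced here, so the rows below are purely illustrative: a hypothetical pipe-delimited layout with a header row and three made-up patients. On a client with a POSIX shell, the file could be created as follows; any text editor works equally well.

```shell
# Hypothetical sample data: pipe-delimited, header row first.
cat > patients.txt <<'EOF'
patient_id|full_name|birth_date
1|Alice Example|1980-01-15
2|Bob Example|1975-06-30
3|Carol Example|1990-11-02
EOF
```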
Create a file named observations.txt with the following content:
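As with the patients file, the original content is not reproduced here. The sketch below uses a hypothetical pipe-delimited layout with a header row; the `patient_id` column is assumed to refer to patient rows in patients.txt, and all values are made up for illustration.

```shell
# Hypothetical sample data: pipe-delimited, header row first.
cat > observations.txt <<'EOF'
observation_id|patient_id|observed_on|code|result_value
101|1|2023-03-01|heart_rate|72
102|1|2023-03-01|body_temp_c|36.8
103|2|2023-03-02|heart_rate|65
104|3|2023-03-05|heart_rate|80
EOF
```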
In My Home / Files use the Add Files button to upload the two files to your private area.


Create and upload a manifest of data file IDs
Open the details pages for patients.txt and observations.txt and copy their file IDs into a file named manifest.txt on your local client, one ID per line.
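A manifest in this style is just a plain-text list of file IDs. The sketch below writes one with placeholder IDs; the `file-...-1` shape is an assumption about how IDs appear on the details page, and each line must be replaced with the actual ID you copied.

```shell
# Placeholder IDs: replace each line with the file ID copied from that file's details page.
cat > manifest.txt <<'EOF'
file-XXXXXXXXXXXXXXXXXXXXXXXX-1
file-YYYYYYYYYYYYYYYYYYYYYYYY-1
EOF
```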


Use the Add Files button to upload the manifest.txt file to your private area. Click into the details for the uploaded file and copy the file ID.

Download the files in the manifest to the Data Analysis Workstation
Using the pfda CLI in the data analysis workstation shell, download the manifest.txt file to the workstation filesystem.
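A hedged sketch of the download step follows. It assumes the pfda CLI's `download` command accepts `-key` and `-file-id` flags (confirm with `pfda --help` on your workstation), that the file lands in the current directory, and that `PFDA_KEY` and `MANIFEST_FILE_ID` hold your own CLI authorization key and the manifest's file ID. The values shown are placeholders.

```shell
# Placeholders: your CLI authorization key and the file ID copied from the manifest's details page.
PFDA_KEY="${PFDA_KEY:-your-cli-key}"
MANIFEST_FILE_ID="${MANIFEST_FILE_ID:-file-XXXXXXXXXXXXXXXXXXXXXXXX-1}"

# Download the manifest into the datafiles directory.
download_manifest() {
    mkdir -p datafiles &&
        # Assumed flags: -key and -file-id.
        ( cd datafiles && pfda download -key "$PFDA_KEY" -file-id "$MANIFEST_FILE_ID" )
}
```

Running `download_manifest` should leave manifest.txt in the datafiles directory.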
Iterate through manifest and download data files
In the data analysis workstation shell, install dos2unix and run it on the manifest.txt file to ensure there are no cross-OS end-of-line issues.
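The normalization and download loop could be sketched as two small shell functions. The assumptions here are that the workstation image is Debian/Ubuntu-based (so `apt-get` is available), that `pfda download` accepts `-key` and `-file-id` flags, and that `PFDA_KEY` holds your CLI authorization key. The function names are hypothetical.

```shell
PFDA_KEY="${PFDA_KEY:-your-cli-key}"   # placeholder authorization key

# Convert CRLF line endings to LF in place, installing dos2unix first if needed.
# Assumption: apt-get is available on the workstation image.
normalize_manifest() {
    command -v dos2unix >/dev/null 2>&1 || sudo apt-get install -y dos2unix
    dos2unix "$1"
}

# Download every file ID listed in the manifest ($1) into a directory ($2).
download_from_manifest() {
    mkdir -p "$2" || return 1
    # The "|| [ -n ... ]" clause also handles a final line with no trailing newline.
    while IFS= read -r file_id || [ -n "$file_id" ]; do
        # Skip blank lines.
        if [ -n "$file_id" ]; then
            # stdin is redirected so the CLI cannot consume the rest of the manifest.
            ( cd "$2" && pfda download -key "$PFDA_KEY" -file-id "$file_id" < /dev/null ) || return 1
        fi
    done < "$1"
}
```

A typical invocation would be `normalize_manifest datafiles/manifest.txt && download_from_manifest datafiles/manifest.txt datafiles`. Stopping at the first failed download keeps a partial transfer from going unnoticed.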
Copy the data into the cluster DB tables
Connect to the workstations_and_databases_tutorial_db cluster database
Using the database host endpoint, connect to the workstations_and_databases_tutorial_db cluster database using psql on the data analysis workstation:
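A minimal sketch of the connection command, assuming the cluster listens on the PostgreSQL default port 5432 and that `your-host-endpoint` is replaced with the endpoint copied from the database detail page:

```shell
# Placeholder: substitute the host endpoint copied from the database detail page.
DB_HOST="${DB_HOST:-your-host-endpoint}"

# Open an interactive psql session on the tutorial database.
# psql prompts for the root password unless PGPASSWORD is set.
connect_tutorial_db() {
    psql -h "$DB_HOST" -p 5432 -U root -d workstations_and_databases_tutorial_db
}
```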

Copy the patients and observations data into the cluster DB
In psql:
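Inside an interactive psql session, the load is a `\copy <table> FROM '<file>' WITH (...)` meta-command per table. The sketch below issues the same meta-commands non-interactively from the workstation shell. It assumes pipe-delimited files with a header row located in the datafiles directory; the delimiter, header, and paths are assumptions to adjust to your actual files, and the function names are hypothetical.

```shell
# Placeholder: substitute the host endpoint copied from the database detail page.
DB_HOST="${DB_HOST:-your-host-endpoint}"

# Load one delimited file ($2) into one table ($1) using psql's client-side \copy.
# \copy reads the file on the workstation, so no server-side file access is needed.
load_table() {
    psql -h "$DB_HOST" -U root -d workstations_and_databases_tutorial_db \
        -c "\\copy $1 FROM '$2' WITH (FORMAT csv, DELIMITER '|', HEADER true)"
}

# Load patients first, in case observations rows reference patient rows.
load_all() {
    load_table patients datafiles/patients.txt &&
        load_table observations datafiles/observations.txt
}
```

After `load_all` completes, `SELECT count(*) FROM patients;` in psql is a quick way to confirm that the rows arrived.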
Observe the new tables and data in the pgAdmin Workstations and Databases Tutorial server connection.

Connect RStudio to the cluster DB
In the RStudio console: