With OmicSoft Server and the Server on the Cloud add-on, you can execute parallel NGS analyses in your AWS Cloud environment. Data stored in your S3 buckets are available for analysis, just like any other data on your server.


This add-on feature requires only a one-time configuration by your OmicSoft Server administrator, described in OmicSoft Server Configuration with Cloud. The feature works with OmicSoft Server installations on physical machines or Cloud-based virtual machines, as long as the machines can reach AWS services.



OmicSoft Server on the Cloud basics


The basic logic for OmicSoft Server on the Cloud to trigger cloud-based NGS analysis is:
  • Input is S3-based data within a mapped S3 folder
  • Output is a mapped S3 folder
  • OmicSoft Server creates temporary cloud compute “EC2” instances in your AWS environment, transfers input data from S3 to EC2, performs analysis, and transfers output data to S3
    • Summarizations and analysis tables will be automatically downloaded to your OmicSoft Server project
  • Each sample will be analyzed on a separate EC2 instance; there is rarely a reason to limit the number of parallel jobs
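The logic above can be pictured as a simple fan-out: one worker per sample, each performing the same three steps. The following is only an illustrative sketch, not OmicSoft's implementation. The function names, the bucket name, and the use of local threads as stand-ins for EC2 instances are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sample(sample, input_prefix, output_prefix):
    """Stand-in for one temporary compute instance handling one sample."""
    steps = [
        f"copy {input_prefix}/{sample} from S3 to the instance",
        f"analyze {sample}",
        f"copy results for {sample} to {output_prefix}",
    ]
    return sample, steps


def run_all(samples, input_prefix, output_prefix):
    """Fan out one worker per sample, mirroring one instance per sample."""
    if not samples:
        return {}
    # Threads are only a local stand-in for separate cloud instances.
    with ThreadPoolExecutor(max_workers=len(samples)) as pool:
        results = pool.map(
            lambda s: run_sample(s, input_prefix, output_prefix), samples
        )
        return dict(results)


# Hypothetical bucket and sample names, for illustration only.
plan = run_all(
    ["sampleA", "sampleB"],
    "s3://example-bucket/input",
    "s3://example-bucket/output",
)
for sample, steps in plan.items():
    print(sample, "->", "; ".join(steps))
```

Because no sample depends on another, the number of workers can simply match the number of samples.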

Example Server on the Cloud workflow

Create or Open a Server-based project

  1. Connect to your OmicSoft Server
  2. Open or create a new Server-based project (in the Analysis tab)
  3. Upload or locate your data on your mapped S3 bucket (in the Server tab, with Server Files)
 

Transfer Files to Server cloud

Before running server jobs on the cloud, upload your data files to a Cloud folder, or locate data that have already been transferred. Open the Server File | Browse Files window.
It is best practice for OmicSoft administrators to include “Cloud” in the names of mapped S3 folders, so you can tell Cloud S3 folders apart from network-mapped folders. In this example, the folder is named “SGECloudFolder”, and the user Vivian has created a subfolder to store input data.

Uploading files from your computer
 If data are not already in the Cloud S3 folder, click Upload to transfer data from your local computer. 
image41_png
 
 
 
 
 

Downloading files from NCBI SRA

If you want to analyze data submitted to NCBI SRA, ERA, etc., you can quickly download them to your Cloud bucket with Download FASTQ files from NCBI SRA.

Direct transfer to S3 bucket (Advanced Users)
 Mapped cloud folders are your own S3 buckets; experienced AWS users can directly transfer data into the S3 location, and data will be immediately available for analysis in OmicSoft Server on the Cloud.
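For readers who script their transfers, a direct upload might look like the sketch below. It assumes a boto3-style S3 client (anything with an `upload_file(Filename, Bucket, Key)` method) is passed in. The bucket name, key prefix, and the `*.fastq.gz` file pattern are hypothetical, and how your administrator has mapped the bucket to a Cloud folder is site-specific.

```python
from pathlib import Path


def upload_fastqs(s3_client, local_dir, bucket, prefix):
    """Upload every *.fastq.gz file in local_dir to s3://bucket/prefix/.

    s3_client is assumed to expose upload_file(Filename, Bucket, Key),
    as the boto3 S3 client does.
    """
    uploaded = []
    for path in sorted(Path(local_dir).glob("*.fastq.gz")):
        # Build the object key under the (hypothetical) mapped prefix.
        key = f"{prefix.strip('/')}/{path.name}"
        s3_client.upload_file(str(path), bucket, key)
        uploaded.append(f"s3://{bucket}/{key}")
    return uploaded
```

With boto3 installed and AWS credentials that can write to the bucket, a call could be `upload_fastqs(boto3.client("s3"), "fastq", "example-bucket", "input")`.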
  

Run your Cloud analysis

To run a cloud-based analysis, choose your NGS analysis module (such as Download NCBI SRA FASTQ, OmicSoft RNAseq pipeline function, Report Gene/Transcript Counts, etc.), and be sure to specify S3 locations for both Input and Output.

OmicSoft Server will automatically transfer your S3 input data to compute instances, perform the analysis, and return output data to your S3 output location.

Summary files will be automatically loaded into your OmicSoft Suite project.

image42_png
After you send the job to the queue, you can monitor its progress the same way as for any server project:
image43_png 

Run Multiple Jobs on Cloud

With Cloud-based analyses, you can specify as many parallel jobs as there are samples, and each sample will be analyzed on a separate instance. Because AWS charges by the minute, not by the instance, this will not cost any more than running a single parallel job, and the analyses finish much faster.
image44_png
image45_png
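The cost argument for running one job per sample can be checked with a little arithmetic. The sketch below is a simplified model under stated assumptions: the per-minute price and the runtimes are made-up numbers, and instance start-up time, data transfer, and any minimum billing period are ignored. Under those assumptions, the total billed instance-minutes are the same however the samples are distributed, while the wall-clock time shrinks as jobs are added.

```python
def cost_and_wall_time(minutes_per_sample, parallel_jobs, price_per_minute):
    """Return (total cost, wall-clock minutes) for a simple greedy schedule."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    slots = [0.0] * parallel_jobs
    for minutes in minutes_per_sample:
        # Give the next sample to the slot that frees up first.
        earliest = slots.index(min(slots))
        slots[earliest] += minutes
    # Billing depends only on total instance-minutes, not on the job count.
    total_cost = sum(minutes_per_sample) * price_per_minute
    return total_cost, max(slots)


# Hypothetical runtimes (minutes) and price, for illustration only.
runtimes = [60, 60, 60, 60]
print(cost_and_wall_time(runtimes, 1, 0.01))  # one job at a time
print(cost_and_wall_time(runtimes, 4, 0.01))  # one job per sample
```

In this model both calls report the same cost, but the second finishes in a quarter of the wall-clock time.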


image46_png
To see details, right-click the job and select View Full Log:
image47_png


The Log window shows the jobs being submitted to cloud NGS instances. In this example, two cloud instances are started because there are two samples to align:
image48_png

You can monitor the Server Jobs tab for progress of your cloud job.
  

Continue your analysis


After your job is complete, your OmicSoft Suite project will say “Update project”. Click the Update project button to synchronize your local OmicSoft Studio with the latest output data.


image49_png

Congratulations! You can now run server projects on the cloud.

Example Cloud Workflow: GSE91061 re-analysis

 Step 1: Create a Server project in OmicSoft Server



Step 2: Download FASTQ files from NCBI SRA

SRP094781 is the SRA project containing all raw FASTQ files for GSE91061. SRP094781 contains 109 samples (208 files, paired-end data) with RNA-seq data, and can be downloaded as 109 parallel EC2 jobs.



Step 3: OmicSoft RNA-Seq analysis




Find the downloaded data in your Cloud-mapped folder. You can specify 109 parallel jobs, which will launch 109 parallel alignments, then quantify the data and return all tabular results to your project for visualization and analysis.
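When choosing the number of parallel jobs, it helps to count samples rather than files, since paired-end data has two files per sample. The sketch below groups FASTQ file names by sample. It assumes mates are named with `_1` and `_2` suffixes before `.fastq` or `.fastq.gz`; that convention and the accession-like names are assumptions for this example, so check how your downloaded files are actually named.

```python
import re
from collections import defaultdict

# Assumed naming convention: <sample>_1.fastq(.gz) and <sample>_2.fastq(.gz).
MATE_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq(\.gz)?$")


def group_by_sample(file_names):
    """Group FASTQ file names by sample so each sample maps to one job."""
    groups = defaultdict(list)
    for name in file_names:
        match = MATE_PATTERN.match(name)
        if match:
            sample = match.group("sample")
        else:
            # Treat files without a mate suffix as single-end samples.
            sample = re.sub(r"\.fastq(\.gz)?$", "", name)
        groups[sample].append(name)
    return {sample: sorted(files) for sample, files in sorted(groups.items())}


# Hypothetical file names, for illustration only.
files = ["SRR0000001_1.fastq.gz", "SRR0000001_2.fastq.gz", "SRR0000002.fastq.gz"]
samples = group_by_sample(files)
print(len(samples), "samples, so", len(samples), "parallel jobs")
```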




The OmicSoft RNA-seq pipeline will align your data to the genome of your choice, quantify expression against your gene model, and identify exons, exon junctions, fusions, and mutations.