Evaluation of serverless computing for scalable execution of a joint variant calling workflow

Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.
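As a quick illustration of the linear scaling claimed above, a back-of-the-envelope model can be fitted to the two reported endpoints (about $2 for 2 samples and $70 for 62 samples). This is an illustrative interpolation from the reported summary numbers, not the paper's actual cost model.

```python
# Back-of-the-envelope linear cost model from the two endpoints reported
# in the abstract: ~$2 for 2 samples and ~$70 for 62 samples. Purely a
# fit to the reported summary numbers, not the paper's pricing model.

def linear_cost(n_samples: float) -> float:
    n1, c1 = 2, 2.0    # (samples, USD)
    n2, c2 = 62, 70.0
    slope = (c2 - c1) / (n2 - n1)   # ~ $1.13 per additional sample
    return c1 + slope * (n_samples - n1)

print(round(linear_cost(32), 1))  # midpoint cohort -> 36.0 USD
```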

3. Both in the paper and in the GitHub repository, the authors state that the modeled workflow was executed on AWS and Azure. However, in the Discussion section (page 12, line 222), the authors mention that due to technical limitations it was not possible to execute some tasks hosted on AWS. It is not quite clear which tasks were executed on which provider and how this influenced the modeled workflow(s). It would be helpful if the authors emphasized which actions are required from users to model, deploy, and execute a SWEEP-based workflow, and how AWS/Azure come into play here. If some tasks from the modeled workflow were executed ONLY on Azure, then the actual workflow is not a single-cloud workflow and, among other issues, requires considering the costs of inter-cloud data transfers too.
Workflow tasks were executed on either AWS or Azure; for the sake of simplicity, static routing was wired into the workflow CaaS task definitions. In the GVCF workflow, the SWEEP backend configuration pointed CaaS tasks to Azure and FaaS tasks to AWS. SWEEP offers users either task-level overrides to a specific CP, or system-level routing in which the optimal destination of a task (regardless of type) is chosen by SWEEP. The paragraph at lines 237-239 has been amended to elaborate on the routing mechanisms, and the product documentation (docs.sweep.run) has been augmented to emphasize the required user actions.
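The two routing options described above (a per-task override versus a system-level policy) can be sketched as follows. The names (`Task`, `choose_provider`, `system_route`) are illustrative assumptions for this sketch, not SWEEP's actual API.

```python
# Illustrative sketch of SWEEP-style task routing (hypothetical names,
# not SWEEP's real internals). A task may carry an explicit provider
# override; otherwise a system-level policy picks the destination.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    name: str
    kind: str                                # "faas" or "caas"
    provider_override: Optional[str] = None  # task-level override, if any

def system_route(task: Task) -> str:
    """System-level policy mirroring the GVCF setup: CaaS -> Azure, FaaS -> AWS."""
    return "azure" if task.kind == "caas" else "aws"

def choose_provider(task: Task) -> str:
    """A task-level override wins; otherwise fall back to the system policy."""
    return task.provider_override or system_route(task)

print(choose_provider(Task("align", "caas")))                          # azure
print(choose_provider(Task("call_variants", "faas")))                  # aws
print(choose_provider(Task("qc", "faas", provider_override="azure")))  # azure
```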
4. The implementation requirements related to function/container code are not clear. Since AWS and Azure impose different requirements on source code, integration with service offerings, and packaging formats, is it the case that SWEEP workflows can only be enacted on multiple providers if the code does not use provider-specific features at all? In addition, the invocation of FaaS functions can happen differently on AWS and Azure, e.g., HTTP-based via API Gateways, events, or direct calls from the source code. The question is which requirements SWEEP workflows impose on the function code. If API Gateways are used to invoke functions, the costs of API Gateway offerings have to be included in the picture as well, similar to data transfer costs if cross-cloud interactions were needed for enacting the workflow.
SWEEP provides a way to deploy the functions to various CPs and has knowledge of the resource namespace. SWEEP has built-in adapters to invoke FaaS or CaaS artifacts across different CPs; the adapters are aware of the individual CPs' API specifications. Lines 82-84 address the clarification.
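The adapter mechanism described in the response above can be sketched as a simple strategy pattern: one invocation interface, with a provider-specific adapter behind it. All class and method names here are assumptions for illustration, not SWEEP's real internals, and the provider calls are stubbed out.

```python
# Hedged sketch of the adapter idea: a common invocation interface with
# provider-specific adapters behind it. Names are illustrative, and the
# actual cloud API calls are replaced by stubs.

from abc import ABC, abstractmethod

class InvocationAdapter(ABC):
    @abstractmethod
    def invoke(self, artifact: str, payload: dict) -> dict: ...

class AwsLambdaAdapter(InvocationAdapter):
    def invoke(self, artifact, payload):
        # Real code would call the AWS SDK here (e.g., a Lambda invoke).
        return {"provider": "aws", "artifact": artifact, "status": "ok"}

class AzureFunctionsAdapter(InvocationAdapter):
    def invoke(self, artifact, payload):
        # Real code would POST to the function's HTTP trigger URL.
        return {"provider": "azure", "artifact": artifact, "status": "ok"}

ADAPTERS = {"aws": AwsLambdaAdapter(), "azure": AzureFunctionsAdapter()}

def run_task(provider: str, artifact: str, payload: dict) -> dict:
    """Dispatch a task to the chosen cloud via its adapter."""
    return ADAPTERS[provider].invoke(artifact, payload)
```

The point of the pattern is that the workflow engine only sees `run_task`; everything provider-specific (auth, packaging, trigger type) stays inside the adapter.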

5. It is stated that SWEEP workflows require the actual functions/containers to be already deployed on the target provider, although its API has endpoints for uploading functions. As stated before, it would be helpful if the authors clarified the actions required from users to model and deploy (both the workflow and the functions/containers), and to execute a SWEEP workflow.
A task is considered registered when it has been deployed via SWEEP to a particular CP. SWEEP offers APIs to deploy tasks; however, these are currently in a limited beta stage. Lines 79-80 address the clarification.
6. The authors need to explain why using SWEEP is more beneficial than using provider-specific orchestrators such as AWS Step Functions (SF) and Azure Durable Functions (DF). Both SF and DF are mature orchestrators supporting FaaS/CaaS offerings from AWS and Azure, and provide out-of-the-box service integration, error handling, etc. From the description of the process it is not clear why SWEEP is more beneficial. Firstly, constructs for implementing composite integration patterns such as scatter-gather are present in SF and DF, together with means to automate the deployment of code/workflows, whereas this work has to be done manually in SWEEP. Moreover, since it is not quite clear which communication type is used to trigger functions/containers, the costs of using SWEEP might include not only per-request costs but also additional service offerings such as AWS API Gateway, making the overall costs higher than using SF, for example. In the best case, the authors might provide a comparison between SWEEP and solutions from AWS/Azure to highlight the pros and cons of using SWEEP workflows, e.g., cross-cloud workflow orchestration becomes possible, bioinformatics pipeline execution is more straightforward, etc.
The paragraph containing the motivation for using SWEEP has been extended to stress that SWEEP supports multiple clouds, and that the main downside of cloud-provider-specific orchestrators is the associated vendor lock-in. Lines 27-29 and 36-37 address this clarification.

Phrasing:
1. Page 3, line 35: "As SWEEP is fully built on the serverless framework, ..." This is confusing; the Serverless Framework is a deployment automation tool, whereas SWEEP is presented as a serverless orchestrator / workflow management system.
The phrase "serverless framework" here did not refer to the particular deployment automation tool of that name, but to the general execution model. The phrase has been changed to "execution model", and the clarification was added in lines 30-42.

2. (Fig. 1) Only 5 out of 18 workflow steps are cloud functions, whereas much of the paper makes it sound as if most of the workflow was. What were the decisions to go for either technology in each step? This knowledge would greatly contribute to the value of the manuscript, as it would help readers facing similar challenges.

SWEEP uses functions and containers to execute workflow tasks. It is not clear whether CaaS refers to traditional long-running container services (e.g., pure Docker or Kubernetes) or to short-lived executions (e.g., Fargate, Google Cloud Run, Knative), and whether statelessness is hence an issue for both. Fargate and ACI are mentioned later on, yet it is not clear why all tasks per runtime get the same memory allocation, and why the memory allocated to functions is much smaller (e.g., 3 GB instances would be possible with Lambda and would speed up execution). That should all be clarified in the section on the definition of workflows. More technical details on the "as opposed to Docker Hub base images" would also be useful: whether it merely concerns the installation of additional packages, or also the interface to the containers and how information is passed to them and results are retrieved.
We used the same runtime configuration for all tasks for the sake of simplicity. The feedback is well taken, and we hope to address optimal per-task runtime configuration in future releases of SWEEP. Lines 104-107 clarify the division of task types in the GVCF workflow and the motivation behind it.

3. Fig. 2 positions a cost of zero at a runtime of around 3 hours. First, the graph starts at around 15000 s, or around 4 hours; furthermore, it would imply zero or negative cost for shorter runtimes, whereas the pricing model of FaaS/CaaS is linear except for free tiers. It is not clear whether the graph was created and interpreted correctly. Fig. 2 is also not referenced from the text. Some attention should be given to it, because it represents the value proposition of the paper, namely that using SWEEP would be cost-effective.
We were not implying that a cost was incurred at zero; however, we see the source of confusion, as the number of samples starts at 2. We have clarified this in the Figure 2 caption. Lines 198 and 209 refer to Fig. 2.

4. The manuscript as a whole could be made more readable. The acronym GVCF is not explained; presumably the G stands for Gathering, but the rest is unclear. What does it do, and why did you choose it as a representative workflow that would allow generalising the findings?
GVCF is an enhanced version of the VCF file format: a "genomic" VCF. The "GVCF workflow" is an eponymously named workflow that uses the GVCF file format as an intermediate to parallelize joint variant calling as much as possible. It is commonly used in the field of genomics to call variants on entire cohorts, and it contains both scalable and less scalable components. This clarification was added in lines 58-61 of the manuscript.
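The parallelization via per-sample GVCFs described above has a scatter-gather shape: one variant-calling step per sample (scatter), then a single cohort-wide joint-genotyping step (gather). The sketch below shows only that shape; the placeholder functions stand in for the real GATK tools (HaplotypeCaller and GenotypeGVCFs) and do no actual genomics work.

```python
# Minimal sketch of the scatter-gather shape of joint variant calling:
# per-sample calling scatters (one GVCF per sample), then a single
# gather step joint-genotypes the cohort. Placeholders stand in for the
# real GATK tools; no actual variant calling happens here.

from concurrent.futures import ThreadPoolExecutor

def call_sample(sample: str) -> str:
    # Placeholder for per-sample HaplotypeCaller producing a GVCF
    return f"{sample}.g.vcf"

def joint_genotype(gvcfs: list) -> str:
    # Placeholder for the cohort-wide GenotypeGVCFs gather step
    return f"cohort_{len(gvcfs)}_samples.vcf"

samples = ["HG00096", "HG00097", "HG00099"]  # example 1000 Genomes IDs
with ThreadPoolExecutor() as pool:           # scatter: parallel per sample
    gvcfs = list(pool.map(call_sample, samples))
result = joint_genotype(gvcfs)               # gather: one joint call
print(result)  # cohort_3_samples.vcf
```

The scatter stage scales with cohort size, while the gather stage is a single serial task, which is consistent with the observation in the abstract that runtime was driven primarily by one task for larger problem sizes.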

5. A typical JSON excerpt of GVCF in your own SWEEP language would further help in understanding the nature of the workflow and how it would map to cloud functions or containers.
We have uploaded the SWEEP workflow definition to our GitHub repository: https://github.com/SWEEP-Inc/GVCF

6. Moreover, most providers, including AWS, offer their own languages for serverless workflows; a short statement of differentiation would also be helpful (one that might focus on portability and the ability to also include containerised endpoints).
The paragraph containing the motivation for using SWEEP has been extended to stress that SWEEP supports multiple clouds, and that the main downside of cloud-provider-specific orchestrators is the associated vendor lock-in. The clarification was added in lines 32-42.

7. Fig. 1 has dashed edges, but the caption talks of dotted edges.
This has been addressed in the main text and legend for Figure 1.

8. For transparency reasons, it would be helpful to point out in the review materials that SWEEP is linked to a startup. KM is listed on the SWEEP website as lead software engineer, but the competing interests only mention KA and AJ.
KM was briefly employed by DotMote Labs at the time of developing the joint variant calling workflows; we have amended the verbiage in the competing interests section of our cover letter. The website has also been updated, as KM is no longer employed by SWEEP.