R can not be pushed in Production - deprecated!

This content is more than 4 years old and the cloud moves fast so some information may be slightly out of date.



Running Shiny on Fargate

Some guys still thinking R cannot be used at scale or only in a limited way. I still do not understand the reason why people are like this. Since my last post about AWS Batch, which is a Docker-based service within AWS, which enables you to work with R, I spend a lot of time with Fargate, another Docker-based service on AWS. This time my boss wanted a BI app. He talked about Tableau and PowerBI, while I was thinking about Shiny web apps. Why? Because you can create them fast, they are good looking and powerful. A problem about those apps comes up if multiple users are using them at the same time. The R runtime will function in a queued manner. Therefore, I decided to use Fargate. Spinning up multiple Shiny apps according to the CPU or memory usage seemed to be a good idea. And this enables you to run R with Docker at scale in a cheap manner.

Blogging about Shiny? With Docker? Seriously? Yes!

Starting my project, I found a bunch full of tutorials how to put Shiny into Docker. But I felt like being the first who wanted to put those apps on Fargate. So actually this Blog is about this part.

Our sample Shiny Web App

So just for simplicity, open Rstudioand choose the little plus symbol. Next, choose Shiny Web App

A Shiny Web App titled “Eggshell the R Foundation for Data Distribution v6.0 (C)” in the R Studio interface.

Give your Shiny App a name and choose - again for simplicity - the Single File(app.R) solution. Click create and your done.

DemoApp

Within RStudio click on Run App to have a look. After launching the you will be prompt to your to browser. Otherwise, you will see this output on the RStudio Console.

A console screenshot showing various terminal commands and responses related to a Shiny web application running on Docker.

Dockerize the app.R

To put the app.R I created this Dockerfile with touch Dockerfile. I chose rocker/shiny-verse:3.5as base image. Addtionally, because I wanted to use the Identity and Access Management (IAM) service on AWS, I installed an additional package to make the aws cli work within my Docker container.


FROM rocker/shiny-verse:3.5  

# Logging path
RUN mkdir -p /var/lib/shiny-server/bookmarks/shiny

# Install presequities
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    file \
    git \
    libapparmor1 \
    libcurl4-openssl-dev \
    libedit2 \
    libssl-dev \
    lsb-release \
    psmisc \
    procps \
    python-setuptools \
    sudo \
    wget \
    libclang-dev \
    libclang-3.8-dev \
    libobjc-6-dev \
    libclang1-3.8 \
    libclang-common-3.8-dev \
    libllvm3.8 \
    libobjc4 \
    libgc1c2 \
    python-pip \
    dialog \
    nano \
    && pip install virtualenv \
    && pip install awscli

Running Shiny with additional AWS services

If you want to use your Shiny Web App together with other AWS Services like S3 or Athena you need to add the following commands within your Dockerfile:


# Install Java to run queries with Athena
RUN apt-get install openjdk-8-jdk -y
RUN R CMD javareconf
RUN apt-get install r-cran-rjava -y
RUN chown shiny:shiny /tmp/
RUN chown shiny:shiny /usr/local/lib/R/site-library/AWR.Athena/java

We need Java to connect our app with Amazon Athena which can be done via a JDBC/ODBC driver. To make use of this service, I worked sucessfully with the R package AWR.Athena, which requires rjava and RJDBC. For S3 usage, I used the cloudyr packages aws.ec2metadata, aws.signature and aws.s3. We are also going to install all the other needed R packages, like ggplot2.


# Download and install R packages
RUN R -e "install.packages(c('ggplot2', 'RJDBC', 'dplyr','aws.ec2metadata','aws.signature','AWR.Athena','aws.s3', 'DT'), repos='http://cran.rstudio.com/')"

A crucial part: Adaption of shiny-server.sh

This part took a while to figure out. But finally, it’s pretty easy to test your Dockerzied Shiny App on your local machine in the same way as on AWS. We need to configure the shiny-server.sh. By default the script looks like this:


# Make sure the directory for individual app logs exists
mkdir -p /var/log/shiny-server
chown shiny.shiny /var/log/shiny-server

if [ "$APPLICATION_LOGS_TO_STDOUT" = "false" ];
then
    exec shiny-server 2>&1
else
    # start shiny server in detached mode
    exec shiny-server 2>&1 &

    # push the "real" application logs to stdout with xtail
    exec xtail /var/log/shiny-server/
fi

Within Fargate or on our local machine we cannot use curl http://169.254.169.254/meta-data like on EC2. Instead, we need to use the respective URI of each Task. And this needs to be done again and again, as long as your Task will run (because of the temporary credentials provided by the Security Token Service (STS)). So we just need to add the following line to the shiny-server.sh:


# Whether on Fargate or not, write credentials to .Renviron
env | grep AWS_ >> /home/shiny/.Renviron
chown shiny.shiny /home/shiny/.Renviron

By doing so, we can avoid any usage of hard coded credentials within your app.R. The credentials will be passed to the AWS services.

Finishing and Testing the Dockerfile

We need to finish our Dockerfile.


# Copy custom shiny-server.sh for environment variables
COPY shiny-server.sh /usr/bin/shiny-server.sh
RUN chmod 755 /usr/bin/shiny-server.sh

# Copy the app to the image
COPY DemoApp /srv/shiny-server/

# Make all app files readable
RUN chmod -R 755 /srv/shiny-server/

# Avoid using root, so use the pre-built user shiny
USER shiny

# Port for our Shiny WebApp
EXPOSE 3838

# Execute the Container
CMD ["/usr/bin/shiny-server.sh"]

To make testing fast, I used the following script. To make this work, you need the aws cli on your local machine, a cli profile and a IAM-role within your AWS account your are allowed to assume. This role can be used later on for Shiny Web App on Fargate.


#!/bin/bash

# Here you need to set up the the permissions accordingly
tokens=$(aws sts assume-role --role-arn "arn:aws:iam::ACCOUNT_ID:role/Role_ShinyWebApp" --role-session-name "ShinyWebApp" --profile YOUR_CLI_PROFILE)

AWS_ACCESS_KEY_ID=$(echo $tokens | jq -r .Credentials.AccessKeyId)
AWS_SECRET_ACCESS_KEY=$(echo $tokens | jq -r .Credentials.SecretAccessKey) 
AWS_SESSION_TOKEN=$(echo $tokens | jq -r .Credentials.SessionToken)

# Build the Docker image
docker build -t ShinyWebApp:$1 /path/to/your/directory

# Run the Docker image as Docker Container
# With env | grep AWS_ the shiny-servers.sh get's passed the credentials
docker run -it --rm \
    -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
    -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
    -e AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN \
    -e AWS_DEFAULT_REGION=eu-central-1 \
    --name ShinyWebApp \
    -p 80:3838 ShinyWebApp:$1 # use http (80) or https (443)

With ./dockerrun.sh YOUR_TAG you can build your Docker image and run it afterwards. You can access the Shiny App with your browser.

How to grant Access to your ShinyApp?

This is another pain point. If you want to run such an application on the Internet, you need to secure it. Thats why I decided to use this architecture on AWS.

Diagram depicting architecture with Docker Image, AWS Fargate, Application Load Balancer, and Amazon Cognito.

I built a Docker Cluster with the Elastic Container Servce (ECS) and created a Task with Fargate as launchtype. With this configuration, AWS takes care about patching and cluster orchestration. We can freely scale the amount of tasks. For each Task, the Docker image comes from the Elastic Container Registry (ECR).

To provide access to the Shiny Web App, I used an Application Loadbalancer (ALB). This service is able to scan your browser headers for JSON web tokens. A user without those tokens will be redirected to Amazon Cognito and needs to sign in or to register. In my case I send invitations to my colleagues, so no one from outside can sign up by himself.

After signing in the ALB you will get redirected by the ALB to my Fargate Docker Cluster. Finally, and just to make things beautiful, you can use Route53 for assigning an URL and the Amazon Certificate Manager to provide https access to your Shiny Web App.

Creating the Fargate Service and a Task Definitionen

Let’s create a service.

  • Login in into your AWS Management Console.
  • Search for ECS.
  • Click on Get Started

AWS Fargate console showing a custom deployment configuration.

Afterwards choose custom and configure it to use your own Docker image.

Getting Started with Amazon Elastic Container Service (Amazon ECS) using Fargate

On the next step, you need to choose the right name of your Shiny Web App and the correct ECR image uri.

YOUR_ACCOUNT.dkr.ecr.eu-central-1.amazonaws.com/ShinyWebbApp:YOUR_TAG

My experience shows that 512 (MiB) as memory limit is sufficient. But you need to monitor this for your application! On the Port mappings section you choose Port 3838 the same port, we specified within the Dockerfile. Under Advanced container configurations we do not need to make any further changes. If you specified further commands via docker run on your local machine, you can specify them under this section. Closing this dialog, you will see a page like this.

Diagram of ECS objects and how they relate

Next, we will define our Service. So far we defined our Container and the Task. If you want to use Cognito in the same way like I do, you need to choose Application Load Balancer for Load balancer type. Now, click Next.

Diagram of ECS objects and how they relate

On the last page, you will need to name your cluster.

Diagram of ECS objects and how they relate

Click Nextand Create

The creation process will tale up to 15 minutes. After finishing the creation, you can check whether you app works by pasting in the DNS name of your ALB. If everything works, you will see your Shiny Web App.

Creating the Cognito User Authentication

Go to the Cognito console, click on Manage User Poolsand Create a user pool. Name your User pool. And choose Review defaults and click Create pool.

Create a user pool

Second, you need to create an App client. This App client will be integrated later on in your ALB. Just create a new one and leave all defaults untouched.

Image of the Cognito Console showing the “Which app clients will have access to this user pool?” settings.

Under App integration, choose App client settings. Select the following options accordingly. This will create our JSON Web Tokens with OpenID. If you don’t have an URL just type in the ALB’s DNS name.

Identity providers and OAuth 2.0 settings for your app clients

Under Domain name, you will create your Log-In page. Don’t worry! You don’t need to code anything and you don’t need any URL here. AWS will provide you a ready to use template and will give you a authentication domain.

shiny app auth

That’s it. Go now to the EC2 console.

Define the routing paths for the Application Load Balancer

Finally, go to your ALB. Go to Listeners Panel. We want to create two Listeners Rule sets. First, we want to redirect all request on Port 80 (http) to Port 443 (https).

ALB-WQB-App | HTTP : 80

Second, all request on Port 443 should be evaluated and if authenticated redirected to your Shiny Web App. Use your User pool id and App client id here

Listeners belonging to Application Load Balancers check for connection requests using the protocol and port you specify. Once you have created your listener, you can create and manage additional routing rules as needed.

Going further

You should treat Shiny Web Apps as stateless, meaning that all data should be loaded from an external storage system like S3. Additionally, you should think about scaling your app up and down. Running one Task per day will lead to a service fee of 3 Dollars. That’s why I start the app at 8 am in the morning will stop it at 6 pm. This is pretty easy done with an Amazon Cloudwatch Rule and AWS Lambda. You just need to set the desired task count from 1 to 0 and the other way round. As a result the service fee decreased to 1 Dollar per day. And my boss was pretty happy about not spending a lot of money for software licensing.

And guys: Please stop telling fairytales about R. Cheers!

Similar Posts You Might Enjoy

Sound of Silence - Lift your heavy Workloads to AWS Batch with Docker

Statistical Computing on Your Local Workstation Recently, a costumer told me about his problems to fullfil stastistical computing workloads on his local workstations. At first, you need to know that statistcal computing language like R and Python by default only use one core of your multi core machine. To parallize them, you have a lot of opportunities with regard to CPU-usage, IOPS etc.. Second, even if you parallize your workloads, you got an annoying loud machine in your office. - by Malte Walkowiak

CDK Infrastructure Testing - Part 2b - Unit, Integration and Application Test for Serverless Lambda Functions

After describing the context of the test pyramid for Infrastructure as Code in part 1, and the Web Application in Part 2a - let`s apply that to some Lambda function. - by Gernot Glawe

CDK Infrastructure Testing - Part 2a - Implement Unit, Integration and Application Test for CDK Infrastructure and an EC2 Web Server Application

With CDK you create Infrastructure as Code - IaC. You can automate the test for the IaC code. The three test tastes -Unit, Integration and Application- should work closely together. Here I show you how. It is like the three steps of coffee tasting: 1 smell, 2 Taste, 3 Feel. - by Gernot Glawe