a fantastic Streamlit app, and now it’s time to let the world see and use it.
What options do you have?
The easiest method is to use the Streamlit Community Cloud service. That approach lets anyone online access your Streamlit app, provided they have the required URL. It’s a relatively simple process, but it’s a publicly available endpoint and, due to potential security issues and scalability concerns, it isn’t an option for many organisations.
Since Streamlit was acquired by Snowflake, deploying to that platform is now a viable option as well.
The third option is to deploy to one of the many cloud services, such as Heroku, Google Cloud, or Azure.
As an AWS user, I wanted to see how easy it would be to deploy a Streamlit app to AWS, and that is what this article is about. If you refer to the official Streamlit documentation online (link at the end of the article), you’ll find that there is no information or guidance on how to do this. So this is the “missing manual”.
The deployment process itself is relatively straightforward. The tricky part is ensuring that the AWS networking configuration is set up correctly. By that, I mean your VPC, security groups, subnets, route tables, subnet associations, NAT gateways, Elastic IPs, and so on.
Because every organisation’s networking setup is different, I’ll assume that you or someone in your organisation can sort out this aspect. However, I include some troubleshooting tips at the end of the article covering the most common causes of deployment issues. If you follow my steps to the letter, you should have a working, deployed app by the end of it.
In my sample deployment, I’ll be using a VPC with a public subnet and an internet gateway. In real-life scenarios, by contrast, you’ll probably want to use some combination of elastic load balancers, private subnets, NAT gateways and Cognito for user authentication and enhanced security. Later on, I’ll discuss some options for securing your app.
The app we will deploy is the dashboard I wrote using Streamlit. TDS published that article a while back, and you will find a link to it at the end of this one. In that article, I retrieved my dashboard data from a PostgreSQL database running locally. However, to avoid the cost and hassle of setting up an RDS Postgres database on AWS, I’ll convert my dashboard code to retrieve its data from a CSV file on S3, Amazon’s mass storage service.
Once that’s done, it’s simply a matter of copying the CSV over to AWS S3 storage, and the dashboard should work just as it did when running locally against Postgres.
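Copying the file up is a one-liner with the AWS CLI. The bucket and file names below are placeholders, so substitute your own:

# Upload the dashboard's source data to S3 (bucket and key names are examples)
aws s3 cp sales_data.csv s3://your-s3-bucket-name/sales_data.csv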
I assume you’ve got an AWS account with entry to the AWS console. Moreover, in case you are choosing the S3 route as your information supply, you’ll have to arrange AWS credentials. Upon getting them, both create an .aws/credentials file in your HOME listing (as I’ve accomplished), or you may move your credential key info instantly within the code.
Assuming all these prerequisites are met, we can look at the deployment using AWS’s Elastic Beanstalk service.
What is AWS Elastic Beanstalk (EB)?
AWS Elastic Beanstalk (EB) is a fully managed service that simplifies the deployment, scaling, and management of applications in the AWS Cloud. It allows you to upload your application code in popular languages like Python, Java, .NET, Node.js, and more. It automatically handles the provisioning of the underlying infrastructure, such as servers, load balancers, and networking. With Elastic Beanstalk, you can focus on writing and maintaining your application rather than configuring servers or managing capacity, because the service seamlessly scales resources as your application’s traffic fluctuates.
In addition to provisioning your EC2 servers and so on, EB will install any required external libraries on your behalf, depending on the deployment type. It can also be configured to run OS commands on server startup.
The code
Before deploying, let’s review the changes I made to my original code to accommodate the change in data source from Postgres to S3. It boils down to replacing the calls that read a Postgres table with calls that read an S3 object to feed data into the dashboard. I also put the main graphical component creation and display inside a main() function, which I call at the end of the code. Here is the full listing.
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import boto3
from io import StringIO

#########################################
# 1. Load Data from S3
#########################################
@st.cache_data
def load_data_from_s3(bucket_name, object_key):
    """
    Reads a CSV file from S3 into a Pandas DataFrame.
    Make sure your AWS credentials are correctly configured.
    """
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=bucket_name, Key=object_key)
    df = pd.read_csv(obj['Body'])

    # Convert order_date to datetime if needed
    df['order_date'] = pd.to_datetime(df['order_date'], format='%d/%m/%Y')
    return df
#########################################
# 2. Helper Functions (Pandas-based)
#########################################
def get_date_range(df):
    """Return min and max dates in the dataset."""
    min_date = df['order_date'].min()
    max_date = df['order_date'].max()
    return min_date, max_date

def get_unique_categories(df):
    """
    Return a sorted list of unique categories (capitalized).
    """
    categories = df['categories'].dropna().unique()
    categories = sorted([cat.capitalize() for cat in categories])
    return categories

def filter_dataframe(df, start_date, end_date, category):
    """
    Filter the dataframe by date range and optionally by a single category.
    """
    # Ensure start/end_date are converted to datetime just in case
    start_date = pd.to_datetime(start_date)
    end_date = pd.to_datetime(end_date)

    mask = (df['order_date'] >= start_date) & (df['order_date'] <= end_date)
    filtered = df.loc[mask].copy()

    # If not "All Categories," filter further by category
    if category != "All Categories":
        # Categories in the CSV might be lowercase, uppercase, etc.
        # Adjust as needed to match your data
        filtered = filtered[filtered['categories'].str.lower() == category.lower()]
    return filtered
def get_dashboard_stats(df, start_date, end_date, category):
    """
    Calculate total revenue, total orders, average order value, and top category.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return 0, 0, 0, "N/A"

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    total_revenue = filtered_df['revenue'].sum()
    total_orders = filtered_df['order_id'].nunique()
    avg_order_value = total_revenue / total_orders if total_orders > 0 else 0

    # Determine top category by total revenue
    cat_revenue = filtered_df.groupby('categories')['revenue'].sum().sort_values(ascending=False)
    top_cat = cat_revenue.index[0].capitalize() if not cat_revenue.empty else "N/A"

    return total_revenue, total_orders, avg_order_value, top_cat

def get_plot_data(df, start_date, end_date, category):
    """
    For 'Revenue Over Time', group by date and sum revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['date', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    plot_df = (
        filtered_df.groupby(filtered_df['order_date'].dt.date)['revenue']
        .sum()
        .reset_index()
        .rename(columns={'order_date': 'date'})
        .sort_values('date')
    )
    return plot_df

def get_revenue_by_category(df, start_date, end_date, category):
    """
    For 'Revenue by Category', group by category and sum revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['categories', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    rev_cat_df = (
        filtered_df.groupby('categories')['revenue']
        .sum()
        .reset_index()
        .sort_values('revenue', ascending=False)
    )
    rev_cat_df['categories'] = rev_cat_df['categories'].str.capitalize()
    return rev_cat_df
def get_top_products(df, start_date, end_date, category, top_n=10):
    """
    For 'Top Products', return the top N products by revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['product_names', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    top_products_df = (
        filtered_df.groupby('product_names')['revenue']
        .sum()
        .reset_index()
        .sort_values('revenue', ascending=False)
        .head(top_n)
    )
    return top_products_df

def get_raw_data(df, start_date, end_date, category):
    """
    Return the raw (filtered) data with a revenue column.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame()

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    filtered_df = filtered_df.sort_values(by=['order_date', 'order_id'])
    return filtered_df

def plot_data(data, x_col, y_col, title, xlabel, ylabel, orientation='v'):
    fig, ax = plt.subplots(figsize=(10, 6))
    if not data.empty:
        if orientation == 'v':
            ax.bar(data[x_col], data[y_col])
            plt.xticks(rotation=45)
        else:
            ax.barh(data[x_col], data[y_col])
        ax.set_title(title)
        ax.set_xlabel(xlabel)
        ax.set_ylabel(ylabel)
    else:
        ax.text(0.5, 0.5, "No data available", ha='center', va='center')
    return fig
#########################################
# 3. Streamlit Application
#########################################
def main():
    # Title
    st.title("Sales Performance Dashboard")

    # Load your data from S3
    # Replace these with your actual bucket name and object key
    bucket_name = "your_s3_bucket_name"
    object_key = "your_object_name"
    df = load_data_from_s3(bucket_name, object_key)

    # Get min and max date for the default range
    min_date, max_date = get_date_range(df)

    # Create UI for date and category filters
    with st.container():
        col1, col2, col3 = st.columns([1, 1, 2])
        start_date = col1.date_input("Start Date", min_date)
        end_date = col2.date_input("End Date", max_date)
        categories = get_unique_categories(df)
        category = col3.selectbox("Category", ["All Categories"] + categories)

    # Custom CSS for metrics
    st.markdown("""
        <style>
        .metric-row {
            display: flex;
            justify-content: space-between;
            margin-bottom: 20px;
        }
        .metric-container {
            flex: 1;
            padding: 10px;
            text-align: center;
            background-color: #f0f2f6;
            border-radius: 5px;
            margin: 0 5px;
        }
        .metric-label {
            font-size: 14px;
            color: #555;
            margin-bottom: 5px;
        }
        .metric-value {
            font-size: 18px;
            font-weight: bold;
            color: #0e1117;
        }
        </style>
    """, unsafe_allow_html=True)

    # Fetch stats
    total_revenue, total_orders, avg_order_value, top_category = get_dashboard_stats(df, start_date, end_date, category)

    # Display key metrics
    metrics_html = f"""
    <div class="metric-row">
        <div class="metric-container">
            <div class="metric-label">Total Revenue</div>
            <div class="metric-value">${total_revenue:,.2f}</div>
        </div>
        <div class="metric-container">
            <div class="metric-label">Total Orders</div>
            <div class="metric-value">{total_orders:,}</div>
        </div>
        <div class="metric-container">
            <div class="metric-label">Average Order Value</div>
            <div class="metric-value">${avg_order_value:,.2f}</div>
        </div>
        <div class="metric-container">
            <div class="metric-label">Top Category</div>
            <div class="metric-value">{top_category}</div>
        </div>
    </div>
    """
    st.markdown(metrics_html, unsafe_allow_html=True)

    # Visualization Tabs
    st.header("Visualizations")
    tabs = st.tabs(["Revenue Over Time", "Revenue by Category", "Top Products"])

    # Revenue Over Time Tab
    with tabs[0]:
        st.subheader("Revenue Over Time")
        revenue_data = get_plot_data(df, start_date, end_date, category)
        st.pyplot(plot_data(revenue_data, 'date', 'revenue', "Revenue Over Time", "Date", "Revenue"))

    # Revenue by Category Tab
    with tabs[1]:
        st.subheader("Revenue by Category")
        category_data = get_revenue_by_category(df, start_date, end_date, category)
        st.pyplot(plot_data(category_data, 'categories', 'revenue', "Revenue by Category", "Category", "Revenue"))

    # Top Products Tab
    with tabs[2]:
        st.subheader("Top Products")
        top_products_data = get_top_products(df, start_date, end_date, category)
        st.pyplot(plot_data(top_products_data, 'product_names', 'revenue', "Top Products", "Revenue", "Product Name", orientation='h'))

    # Raw Data
    st.header("Raw Data")
    raw_data = get_raw_data(df, start_date, end_date, category)
    raw_data = raw_data.reset_index(drop=True)
    st.dataframe(raw_data, hide_index=True)

if __name__ == '__main__':
    main()
Though it’s a fairly chunky piece of code, I gained’t clarify precisely what it does, as I’ve already coated that in some element in my beforehand referenced TDS article. I’ve included a hyperlink to the article on the finish of this one for individuals who wish to be taught extra.
So, assuming you’ve got a working Streamlit app that runs domestically with out points, listed below are the steps it’s worthwhile to take to deploy it to AWS.
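A quick sanity check of that local baseline, with your libraries installed and AWS credentials in place, is just the usual Streamlit command:

# Run the dashboard locally against S3 before attempting any deployment
streamlit run app.py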
Preparing our code for deployment
1/ Create a new folder on your local system to hold your code.
2/ In that folder, you’ll need three files and a sub-folder containing two more files.
- File 1 is app.py, your main Streamlit code file.
- File 2 is requirements.txt, which lists all the external libraries your code needs to function. Depending on what your code does, it will have at least one entry referencing the Streamlit library. For my code, the file contained this,
streamlit
boto3
matplotlib
pandas
- File 3 is called Procfile, which tells EB how to run your code. Its contents should look like this
web: streamlit run app.py --server.port 8000 --server.enableCORS false
- .ebextensions is a subfolder which holds additional configuration files (see below)
3/ The .ebextensions subfolder holds two configuration files. Between them, they should have this content:
option_settings:
  aws:elasticbeanstalk:environment:proxy:
    ProxyServer: nginx

option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: app:main
Note that, although I didn’t need it for what I was doing, for completeness you can optionally add one or more packages.config files under the .ebextensions subfolder containing operating system commands that are run when the EC2 server starts up. For example,
#
# 01_packages.config
#
packages:
  yum:
    amazon-linux-extras: []

commands:
  01_postgres_activate:
    command: sudo amazon-linux-extras enable postgresql10
  02_postgres_install:
    command: sudo yum install -y pip3
  03_postgres_install:
    command: sudo pip3 install psycopg2
Once you have all the required files, the next step is to zip them into an archive, preserving the folder and subfolder structure. You can use any tool you like; I use 7-Zip.
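If you prefer the command line to a GUI tool, running something like this from inside your project folder produces a correctly structured archive; the archive name is arbitrary:

# Zip the deployment bundle, keeping the .ebextensions subfolder intact
zip -r streamlit-app.zip app.py requirements.txt Procfile .ebextensions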
Deploying our code
Deployment is a multi-stage process. First, log in to the AWS console, search for “Elastic Beanstalk” in the services search bar, and click on the link. From there, click the big orange “Create Application” button. You’ll see the first of around six screens whose details you need to fill in. In the following sections, I describe the fields you need to enter. Leave everything else as it is.
1/ Creating the application
- This is easy: fill in the name of your application and, optionally, its description.
2/ Configure Environment
- The environment tier should be set to Web Server.
- Fill in the application name.
- For Platform type, choose Managed; for Platform, choose Python, then decide which version of Python you want to use. I used Python version 3.11.
- In the Application Code section, click the Upload your code option and follow the instructions. Type in a version label, then click ‘Local File’ or ‘S3 Upload’, depending on where your source files are located. You want to upload the single zip file we created earlier.
- Choose your instance type in the Presets section. I went for the Single instance (free tier eligible). Then hit the Next button.
3/ Configure Service Access
- For the Service role, you can use an existing one if you have it, or AWS will create one for you.
- For the instance profile role, you’ll probably need to create this. It just needs to have the AWSElasticBeanstalkWebTier and AmazonS3ReadOnlyAccess policies attached (see the sketch after this list). Hit the Next button.
- I would also advise setting up an EC2 key pair at this stage, as you’ll need it to log in to the EC2 server that EB creates on your behalf. This can be invaluable for investigating potential server issues.
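For reference, the same instance profile can be put together with the AWS CLI; the role and profile names below are placeholders, and the console will happily create the equivalent for you:

# Create an EC2 role for Elastic Beanstalk instances and attach the two required policies
aws iam create-role --role-name my-eb-ec2-role \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name my-eb-ec2-role \
  --policy-arn arn:aws:iam::aws:policy/AWSElasticBeanstalkWebTier
aws iam attach-role-policy --role-name my-eb-ec2-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
# Wrap the role in an instance profile that Elastic Beanstalk can use
aws iam create-instance-profile --instance-profile-name my-eb-ec2-profile
aws iam add-role-to-instance-profile --instance-profile-name my-eb-ec2-profile --role-name my-eb-ec2-role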
4/ Set up networking, database and tags
- Choose your VPC. I had just the one default VPC set up. You also have the option to create one here if you don’t already have one. Make sure your VPC has at least one public subnet.
- In Instance Settings, I checked the Public IP Address option and chose to use my public subnets. Click the Next button.
5/ Configure the instance and scaling
- Under the EC2 Security Groups section, I chose my default security group. Under Instance Type, I opted for the t3.micro. Hit the Next button.
6/ Monitoring
- Select basic system health monitoring
- Uncheck the Managed Updates checkbox
- Click Next
7/ Review
- Click Create if all is OK
After this, you should see a screen like this,

Keep an eye on the Events tab, as this will notify you if any issues arise. If you encounter problems, you can use the Logs tab to retrieve either a full set of logs or the last 100 lines of the deployment log, which can help you debug any issues.
After a few minutes, if all has gone well, the Health label will switch from grey to green, and your screen will look something like this:

Now, you should be able to click on the Domain URL (circled in red above), and your dashboard should appear.

Troubleshooting
The first thing to check if you encounter problems when running your dashboard is that your source data is in the correct location and is referenced correctly in your Streamlit app source code. If you rule that out as an issue, then you’ll more than likely have hit a networking setup problem, and you’ll probably see a screen like this.

If that’s the case, here are a few things you can look at. You may need to log in to your EC2 instance and review the logs. In my case, I encountered an issue with my pip install command, which ran out of space to install all the required packages. To resolve that, I had to add extra Elastic Block Store storage to my instance.
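If you hit the same disk-space problem, one way to request a larger root volume is through an additional .ebextensions file along the lines of the sketch below; the file name and the 16 GB figure are just examples, so check the option namespace against the Elastic Beanstalk documentation for your platform:

# .ebextensions/02_storage.config - request a larger root EBS volume (example value)
option_settings:
  aws:autoscaling:launchconfiguration:
    RootVolumeSize: "16"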
The more likely cause, though, will be a networking issue. In that case, try some or all of the suggestions below.
VPC Configuration
- Ensure your Elastic Beanstalk environment is deployed in a VPC with at least one public subnet.
- Verify that the VPC has an Internet Gateway attached.
Subnet Configuration
- Confirm that the subnet used by your Elastic Beanstalk environment is public.
- Check that the “Auto-assign public IPv4 address” setting is enabled for this subnet.
Route Table
- Verify that the route table associated with your public subnet has a route to the Internet Gateway (0.0.0.0/0 -> igw-xxxxxxxx).
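If the route is missing, it can be added in the console or with the AWS CLI; the route table and gateway IDs below are placeholders:

# Add a default route from the public subnet's route table to the internet gateway
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123456789abcdef0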
Security Group
- Review the inbound rules of the security group attached to your Elastic Beanstalk instances.
- Ensure it allows incoming traffic on port 80 (HTTP) and/or 443 (HTTPS) from the appropriate sources.
- Check that outbound rules allow the necessary outgoing traffic.
Network Access Control Lists (NACLs)
- Review the Network ACLs associated with your subnet.
- Ensure they allow both inbound and outbound traffic on the required ports.
Elastic Beanstalk Environment Configuration
- Verify that your environment is using the correct VPC and public subnet in the Elastic Beanstalk console.
EC2 Instance Configuration
- Verify that the EC2 instances launched by Elastic Beanstalk have public IP addresses assigned.
Load Balancer Configuration (if applicable)
- If you use a load balancer, ensure it’s configured correctly in the public subnet.
- Check that the load balancer’s security group allows incoming traffic and can communicate with the EC2 instances.
Securing your app
As it stands, your deployed app is visible to anyone on the internet who knows your EB domain name. That is probably not what you want. So, what are your options for securing your app on AWS infrastructure?
1/ Lock the security group down to trusted CIDRs
In the console, find the security group associated with your EB deployment and click on it. It should look like this,

Make sure you’re on the Inbound Rules tab, choose Edit Inbound Rules, and change the source IP ranges to your corporate IP ranges or another trusted set of IP addresses.
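The same change can be scripted if you prefer; in the example below, the security group ID and CIDR range are placeholders. It removes the open HTTP rule and allows traffic only from a trusted range:

# Replace the open HTTP rule with one restricted to a trusted CIDR (IDs and ranges are placeholders)
aws ec2 revoke-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 80 --cidr 203.0.113.0/24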
2/ Use private subnets, internal load balancers and NAT Gateways
This is a more challenging option to implement and will likely require the expertise of your AWS network administrator or deployment specialist.
3/ Use AWS Cognito and an Application Load Balancer
Again, this is a more complex setup that you’ll probably need assistance with if you’re not an AWS networking guru, but it’s perhaps the most robust of them all. The flow is this (a configuration sketch follows the list):
A user navigates to your public Streamlit URL.
The ALB intercepts the request and sees that the user is either not already logged in or not authenticated.
The ALB automatically redirects the user to Cognito to sign in or create an account. Upon successful login, Cognito redirects the user back to your application URL. The ALB now recognises a valid session and allows the request through to your Streamlit app.
Your Streamlit app only ever receives traffic from authenticated users.
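As a rough illustration of the moving parts rather than a full walkthrough, the authentication step is an authenticate-cognito action on the ALB’s HTTPS listener, followed by a forward to the Streamlit target group. All ARNs, IDs and the domain prefix below are placeholders:

# Sketch: make the ALB authenticate users via Cognito before forwarding to the Streamlit target group
aws elbv2 modify-listener \
  --listener-arn arn:aws:elasticloadbalancing:eu-west-1:111111111111:listener/app/my-alb/abc123/def456 \
  --default-actions '[
    {"Type": "authenticate-cognito", "Order": 1,
     "AuthenticateCognitoConfig": {
       "UserPoolArn": "arn:aws:cognito-idp:eu-west-1:111111111111:userpool/eu-west-1_EXAMPLE",
       "UserPoolClientId": "EXAMPLECLIENTID",
       "UserPoolDomain": "my-app-auth"}},
    {"Type": "forward", "Order": 2,
     "TargetGroupArn": "arn:aws:elasticloadbalancing:eu-west-1:111111111111:targetgroup/streamlit-tg/ghi789"}]'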
Summary
In this article, I discussed deploying a Streamlit dashboard application I had previously written to AWS. The original app used PostgreSQL as its data source, and I demonstrated how to switch to AWS S3 in preparation for deploying the app to AWS.
I discussed deploying the app to AWS using the Elastic Beanstalk service. I described and explained all the additional files required before deployment, along with the need for them to be contained in a zip archive.
I then briefly explained the Elastic Beanstalk service and described the detailed steps required to use it to deploy our Streamlit app to AWS infrastructure. I described the various input screens that need to be navigated and showed what inputs to use at each stage.
I highlighted some troubleshooting techniques to use if the app deployment doesn’t go as expected.
Finally, I offered some suggestions on how to protect your app from unauthorised access.
For more information on Streamlit, check out their online documentation using the link below.
To find out more about developing with Streamlit, I show how to develop a modern data dashboard with it in the article linked below.