Data Visualization Explained (Part 5): Visualizing Time-Series Data in Python (Matplotlib, Plotly, and Altair)

This article is Part 5 of my data visualization series.

It’s time to start building your own data visualizations. In this article, I’ll walk through the process of visualizing time-series data in Python in detail. If you have not read the previous articles in my data visualization series, I strongly recommend reading at least the previous article for a review of Python.

Over the course of coding visualizations in Python, I’ll focus on three Python packages: Matplotlib, Plotly, and Altair. One approach to learning these might involve writing 1-2 articles per package, each one delving into the chosen package in detail. While this is a valid approach, the focus of my series is not on any particular library; it’s about the data visualization process itself. These packages are simply tools, a means to an end.

As a result, I’ll structure this article and the ones that follow around a particular type of data visualization, and I’ll discuss how to implement that visualization in each of the listed packages so that you have a breadth of approaches available to you.

First up: a definition of time-series data.

What Is Time-Series Data?

Formally, time-series data involves a variable that is a function of time. In simple terms, this just means some data that changes over time.

For example, a public company’s stock price over the last ten years is time-series data. If you’d prefer a more scientific example, consider the weather. A graph depicting the daily temperature of your favorite city over the course of the year is a graph that depicts time-series data.
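To make the definition concrete, here is a minimal sketch of what time-series data can look like in pandas. The temperature values and the name temp_c are made up for illustration; the point is only that every observation is tied to a timestamp.

```python
import pandas as pd

# Hypothetical daily temperatures (degrees Celsius), indexed by date.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
temps = pd.Series([3.1, 4.0, 2.5, 5.2, 6.0], index=dates, name="temp_c")

# Because each value is attached to a timestamp, time-based selection works directly.
first_three_days = temps.loc["2024-01-01":"2024-01-03"]
print(first_three_days)
```

Selecting by date range like this is one of the things that separates a time series from an ordinary list of numbers.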

Time-series data is an excellent starting point for data visualization for a few reasons:

It’s an extremely common and useful type of data. There’s quite a bit of information that’s dependent on time, and understanding it provides meaningful insight into the subject of interest going forward.
There are tried and true methods to visualize time-series data effectively, as you’ll see below. Master these, and you’ll be in good shape.
Compared with some other types of data, time-series visualizations are fairly intuitive to humans and align with our perception of time. This makes it easier to focus on the basic elements of visualization design when starting out, instead of getting bogged down in trying to make sense of very complex data.

Let’s start by taking a look at different visualization methods on a conceptual level.

How Is Time-Series Data Visualized?

The standard for time-series visualization is the famed line chart:

Image by Wikimedia Commons

This chart generally puts time on the x-axis, and the variable that changes with time on the y-axis. This provides a view that appears to be “moving forward,” in line with humans’ linear perception of time.

Though the line chart is the standard, there are other, related possibilities.

Multiple Line Chart

This approach is a direct extension of a single line chart and displays several related time series on the same plot, allowing comparison between groups or categories (e.g., sales by region):

Area Chart

Functionally, an area chart is almost exactly the same as a line chart, but the area under the line is filled in. It emphasizes the magnitude of change:

Stacked Area Chart

Technically, the stacked area chart is the analogue to the multiple line chart, but it is a bit trickier to read. In particular, the total is cumulative, with the baseline for each stacked line starting at the one below it. For instance, at 2023 in the chart below, “Ages 25-64” represents about 4 billion people, since we start counting where “Ages 15-24” ends.
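If the cumulative baseline is hard to picture, the arithmetic can be sketched in a few lines. The group names and numbers here are hypothetical and are not taken from the population chart; they only show how each band’s bottom and top edges are derived.

```python
from itertools import accumulate

# Hypothetical values for three stacked groups at a single point in time.
layers = {"Group A": 2.0, "Group B": 1.5, "Group C": 4.0}

# The top edge of each layer is the running total of everything up to and including it.
tops = list(accumulate(layers.values()))
# The bottom edge of each layer is the top edge of the layer below (0 for the first).
bottoms = [0.0] + tops[:-1]

for (name, value), bottom, top in zip(layers.items(), bottoms, tops):
    # The band's thickness (top - bottom) recovers the group's own value.
    print(f"{name}: drawn from {bottom} to {top}, own value = {top - bottom}")
```

The value to read off a stacked chart is the thickness of a band, not the height of its upper edge.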

Bar Chart (Vertical or Horizontal)

Finally, in some cases, a bar chart is also appropriate for time-series visualization. This approach is useful if you wish to show discrete time intervals, such as a monthly sum or yearly average of some metric, rather than continuous data. That said, I will not be coding bar charts in this article.
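Although bar charts are out of scope here, the aggregation step that usually precedes them is worth a quick sketch. The example below assumes a hypothetical table of daily sales (a flat 10 units per day, purely for illustration) and collapses it into one total per calendar month.

```python
import pandas as pd

# Hypothetical daily sales for two months (not the article's dataset).
daily = pd.DataFrame({
    "Date": pd.date_range("2024-01-01", "2024-02-29", freq="D"),
    "Sales": 10,
})

# Group each row by its calendar month and add up the sales within it.
monthly = daily.groupby(daily["Date"].dt.to_period("M"))["Sales"].sum()
print(monthly)
```

Each entry in monthly is one discrete interval, which is the shape of data a bar chart expects. Swapping sum() for mean() would give a monthly average instead.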

Now, let’s get to actually building these visualizations. In each of the examples below, I’ll walk through the code in a specific visualization library for constructing line charts and area charts. I’ve linked the data here and encourage you to follow along. To internalize these techniques, you must practice using them yourself.

Coding Time-Series Visualizations in Matplotlib
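If you don’t have the linked dataset handy, a stand-in file can be generated. This is only a sketch under assumptions: the column names match the ones used in the code below, but the values are random and are not the real data, and the trend and noise parameters are arbitrary choices of mine.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the values are random and purely illustrative.
rng = np.random.default_rng(seed=0)
dates = pd.date_range("2023-01-01", periods=24, freq="MS")  # 24 month starts

sample = pd.DataFrame({"Date": dates})
for product, base in [("Product A", 100), ("Product B", 80), ("Product C", 60)]:
    # A gentle upward trend plus noise, clipped so sales never go negative.
    trend = base + 2 * np.arange(len(dates))
    noise = rng.normal(0, 5, size=len(dates))
    sample[f"{product} Sales"] = np.clip(trend + noise, 0, None).round(2)

# Note: this overwrites any existing sales_data.csv in the working directory.
sample.to_csv("sales_data.csv", index=False)
```

With this file in place, the examples that follow should run as written, though the resulting charts will naturally look different from the ones shown in the article.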

import pandas as pd
import matplotlib.pyplot as plt

# Load data
df = pd.read_csv('sales_data.csv')
df['Date'] = pd.to_datetime(df['Date'])

# Example 1: Simple Line Chart
fig1, ax1 = plt.subplots(figsize=(10, 6))
ax1.plot(df['Date'], df['Product A Sales'], linewidth=2)
ax1.set_xlabel('Date')
ax1.set_ylabel('Sales')
ax1.set_title('Product A Sales Over Time')
ax1.grid(True, alpha=0.3)
plt.tight_layout()
# Show with: fig1

# Example 2: Multiple Line Chart
fig2, ax2 = plt.subplots(figsize=(10, 6))
ax2.plot(df['Date'], df['Product A Sales'], label='Product A', linewidth=2)
ax2.plot(df['Date'], df['Product B Sales'], label='Product B', linewidth=2)
ax2.plot(df['Date'], df['Product C Sales'], label='Product C', linewidth=2)
ax2.set_xlabel('Date')
ax2.set_ylabel('Sales')
ax2.set_title('Sales Comparison - All Products')
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
# Show with: fig2

# Example 3: Area Chart
fig3, ax3 = plt.subplots(figsize=(10, 6))
ax3.fill_between(df['Date'], df['Product A Sales'], alpha=0.4)
ax3.plot(df['Date'], df['Product A Sales'], linewidth=2)
ax3.set_xlabel('Date')
ax3.set_ylabel('Sales')
ax3.set_title('Product A Sales - Area Chart')
ax3.grid(True, alpha=0.3)
plt.tight_layout()
# Show with: fig3

# Example 4: Stacked Area Chart
fig4, ax4 = plt.subplots(figsize=(10, 6))
ax4.stackplot(df['Date'], df['Product A Sales'], df['Product B Sales'], df['Product C Sales'],
              labels=['Product A', 'Product B', 'Product C'], alpha=0.7)
ax4.set_xlabel('Date')
ax4.set_ylabel('Sales')
ax4.set_title('Total Sales - Stacked Area Chart')
ax4.legend(loc='upper left')
ax4.grid(True, alpha=0.3)
plt.tight_layout()
# Show with: fig4

Running this code produces the following four visualizations:

Let’s break the code down step by step to make sure you understand what is happening:

First, we load the data from a CSV file into pandas and make sure the date is properly represented as a datetime object.
Matplotlib structures charts within the Figure object, which represents the entire canvas. This can be accessed directly using plt.figure, but having multiple variables using plt.subplots is more intuitive for multiple visualizations. Every call to plt.subplots defines a new, separate Figure (canvas).
The line fig1, ax1 = plt.subplots(figsize=(10, 6)) defines the first subplot; fig1 represents the canvas, while ax1 represents the actual plotting area within it and is the variable where you’ll make most changes.
Matplotlib has different functions for different charts. The plot function plots 2-D points and then connects them to construct a line chart. This is what we specify in the line ax1.plot(df['Date'], df['Product A Sales'], linewidth=2).
The remaining lines are primarily aesthetic functions that do exactly what their names suggest: labeling axes, adding gridlines, and specifying layout.
For the multiple line chart, the code is exactly the same, except we call plot three times: once for each set of x-y points that we want to graph to show all the products.
The area chart is almost identical to the line chart, except for the addition of ax3.fill_between(df['Date'], df['Product A Sales'], alpha=0.4), which tells Matplotlib to shade the area below the line.
The stacked area chart, by contrast, requires us to use the stackplot function, which takes in all three data arrays we want to plot at once. The remaining aesthetic code, however, is the same.
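The relationship between a Figure and its Axes can be sketched with a small helper. Everything here is illustrative: line_chart is a hypothetical function of my own, df_demo is made-up data that merely borrows the column names used above, and the Agg backend is only selected so the sketch runs without a display (you would normally omit that line in a Jupyter notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Small hypothetical dataset with the same column names as the article's CSV.
df_demo = pd.DataFrame({
    "Date": pd.date_range("2024-01-01", periods=4, freq="D"),
    "Product A Sales": [10, 12, 11, 15],
    "Product B Sales": [7, 8, 9, 9],
})

def line_chart(ax, data, columns):
    """Draw one line per column on the given Axes."""
    for col in columns:
        ax.plot(data["Date"], data[col], label=col, linewidth=2)
    ax.set_xlabel("Date")
    ax.set_ylabel("Sales")
    ax.legend()

# Each call to plt.subplots creates a new Figure (canvas) and an Axes inside it.
fig_a, ax_a = plt.subplots(figsize=(6, 4))
fig_b, ax_b = plt.subplots(figsize=(6, 4))

line_chart(ax_a, df_demo, ["Product A Sales"])                     # single line
line_chart(ax_b, df_demo, ["Product A Sales", "Product B Sales"])  # multiple lines
```

Passing the Axes into the helper makes it explicit that the plotting calls belong to ax, while fig is just the container you display or save.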

Try programming these yourself in your favorite IDE or in a Jupyter notebook. What patterns do you see? Which chart do you like the most?

Also, remember that you do not need to memorize this syntax, especially if you are new to programming data visualizations or new to Python in general. Focus on trying to understand what is happening on a conceptual level; you can always look up the particular syntax and plug your data in as needed.

This will hold true for the remaining two examples as well.

Coding Time-Series Visualizations in Plotly

Here is the code to generate the same visualizations as above, this time in Plotly’s style:

import pandas as pd
import plotly.graph_objects as go

# Load data
df = pd.read_csv('sales_data.csv')
df['Date'] = pd.to_datetime(df['Date'])

# Example 1: Simple Line Chart
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=df['Date'], y=df['Product A Sales'], mode='lines', name='Product A'))
fig1.update_layout(
    title='Product A Sales Over Time',
    xaxis_title='Date',
    yaxis_title='Sales',
    template='plotly_white'
)
# Show with: fig1

# Example 2: Multiple Line Chart
fig2 = go.Figure()
fig2.add_trace(go.Scatter(x=df['Date'], y=df['Product A Sales'], mode='lines', name='Product A'))
fig2.add_trace(go.Scatter(x=df['Date'], y=df['Product B Sales'], mode='lines', name='Product B'))
fig2.add_trace(go.Scatter(x=df['Date'], y=df['Product C Sales'], mode='lines', name='Product C'))
fig2.update_layout(
    title='Sales Comparison - All Products',
    xaxis_title='Date',
    yaxis_title='Sales',
    template='plotly_white'
)
# Show with: fig2

# Example 3: Area Chart
fig3 = go.Figure()
fig3.add_trace(go.Scatter(
    x=df['Date'], y=df['Product A Sales'],
    mode='lines',
    name='Product A',
    fill='tozeroy'
))
fig3.update_layout(
    title='Product A Sales - Area Chart',
    xaxis_title='Date',
    yaxis_title='Sales',
    template='plotly_white'
)
# Show with: fig3

# Example 4: Stacked Area Chart
fig4 = go.Figure()
fig4.add_trace(go.Scatter(
    x=df['Date'], y=df['Product A Sales'],
    mode='lines',
    name='Product A',
    stackgroup='one'
))
fig4.add_trace(go.Scatter(
    x=df['Date'], y=df['Product B Sales'],
    mode='lines',
    name='Product B',
    stackgroup='one'
))
fig4.add_trace(go.Scatter(
    x=df['Date'], y=df['Product C Sales'],
    mode='lines',
    name='Product C',
    stackgroup='one'
))
fig4.update_layout(
    title='Total Sales - Stacked Area Chart',
    xaxis_title='Date',
    yaxis_title='Sales',
    template='plotly_white'
)
# Show with: fig4

We obtain the following four visualizations:

Here is a breakdown of the code:

Plotly is completely independent of Matplotlib. It uses similarly named Figure objects, but does not have any ax objects.
The Scatter function with mode “lines” is used to build a line chart with the specified x- and y-axis data. You can think of the add_trace function as adding a new component to an existing Figure. Thus, for the multiple line chart, we simply call add_trace with the appropriate Scatter inputs three times.
For labeling and aesthetics in Plotly, use the update_layout function.
The area chart is built identically to the line chart, with the addition of the optional argument fill='tozeroy'.
- At first glance, this may look like some obscure color, but it’s actually saying “TO ZERO Y,” specifying to Plotly the area that should be filled in.
- If you’re having trouble visualizing this, try changing the argument to “tozerox” and see what happens.
For the stacked area chart, we need a different optional parameter: stackgroup='one'. Adding this to each of the Scatter calls tells Plotly that they are all to be built as part of the same stack.

An advantage of Plotly is that by default, all Plotly charts are interactive and come with the ability to zoom, hover for tooltips, and toggle the legend. (Note that the images above are saved as PNGs, so you will need to generate the plots yourself in order to see this.)

Coding Time-Series Visualizations in Altair

Let’s finish off by generating these four visualizations in Altair. Here is the code:

import pandas as pd
import altair as alt

# Load data
df = pd.read_csv('sales_data.csv')
df['Date'] = pd.to_datetime(df['Date'])

# Example 1: Simple Line Chart
chart1 = alt.Chart(df).mark_line().encode(
    x='Date:T',
    y='Product A Sales:Q'
).properties(
    title='Product A Sales Over Time',
    width=700,
    height=400
)
# Show with: chart1

# Example 2: Multiple Line Chart
# Reshape data for Altair
df_melted = df.melt(id_vars='Date', var_name='product', value_name='sales')

chart2 = alt.Chart(df_melted).mark_line().encode(
    x='Date:T',
    y='sales:Q',
    color='product:N'
).properties(
    title='Sales Comparison - All Products',
    width=700,
    height=400
)
# Show with: chart2

# Example 3: Area Chart
chart3 = alt.Chart(df).mark_area(opacity=0.7).encode(
    x='Date:T',
    y='Product A Sales:Q'
).properties(
    title='Product A Sales - Area Chart',
    width=700,
    height=400
)
# Show with: chart3

# Example 4: Stacked Area Chart
chart4 = alt.Chart(df_melted).mark_area(opacity=0.7).encode(
    x='Date:T',
    y='sales:Q',
    color='product:N'
).properties(
    title='Total Sales - Stacked Area Chart',
    width=700,
    height=400
)
# Show with: chart4

We obtain the following charts:

Let’s break down the code:

Altair has a slightly different structure from Matplotlib and Plotly. It takes some practice to grasp, but once you understand it, its intuitiveness makes building new visualizations straightforward.
Everything in Altair revolves around the Chart object, into which you pass your data. Then, you use a mark_ function to specify what kind of chart you want to build, and the encode function to specify which variables correspond to which visual elements on the chart (e.g., x-axis, y-axis, color, size, etc.).
For the line chart, we use the mark_line function, and then specify that we want the date on the x-axis and the sales on the y-axis.
The melt function does not change the data itself, just its structure. It puts the products all into a single column, a “long format” which is more amenable to Altair’s visualization model. For more details, check out this helpful article.
Once we transform the data as above, we can build our multiple line chart simply by adding a “color” encoding, as shown in the code. This is possible because all the product types are now available in a single column, and we can tell Altair to distinguish them by color.
The code for generating area charts showcases the beauty of Altair’s structure. Everything stays the same; all you need to do is change the function being used to mark_area!
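The reshaping step is easier to follow on a tiny table. Here is a minimal sketch with made-up numbers, using the same melt arguments as the code above; the variable names wide and long are mine.

```python
import pandas as pd

# Hypothetical wide table: one column per product.
wide = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-01", "2024-02-01"]),
    "Product A Sales": [10, 12],
    "Product B Sales": [7, 8],
})

# Long format: one row per (date, product) pair.
long = wide.melt(id_vars="Date", var_name="product", value_name="sales")
print(long)
```

The two product columns become four rows, with the former column names now stored as values in the product column. No sales figure is changed or lost along the way.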

As you explore other types of visualizations on your own (and in future articles!), Altair’s model for building visualizations will become easier to implement (and hopefully appreciate).

What’s Next?

In future articles, I’ll cover how to use these libraries to build additional types of visualizations. As you continue learning, remember that the purpose of these articles is not to master any one tool. This is about learning data visualization holistically, and my hope is that you have walked away from this article with a better understanding of how time-series data is visualized.

As for the code, that comfort comes with time and practice. For now, you should feel free to take the examples above and adjust them for your own data as needed.

Until next time.
