    Automated Testing: A Software Engineering Concept Data Scientists Must Know To Succeed
    Why you should read this article

    Many data scientists whip up a Jupyter Notebook, play around in a few cells, and then end up maintaining entire data processing and model training pipelines in that same notebook.

    The code is tested once, when the notebook was first written, and then neglected for some undetermined amount of time – days, weeks, months, even years – until:

    • The notebook needs to be rerun to regenerate outputs that were lost.
    • The notebook needs to be rerun with different parameters to retrain a model.
    • Something changed upstream, and the notebook needs to be rerun to refresh downstream datasets.

    Many of you will have felt shivers down your spine reading this…

    Why?

    Because you instinctively know that this notebook is never going to run.

    You know in your bones that the code in that notebook will need to be debugged for hours at best, and rewritten from scratch at worst.

    In both cases, it will take you a long time to get what you need.

    Why does this happen?

    Is there any way of avoiding it?

    Is there a better way of writing and maintaining code?

    Those are the questions we will be answering in this article.

    The Solution: Automated Testing

    What is it?

    As the name suggests, automated testing is the process of running a predefined set of tests on your code to ensure it is working as expected.

    These tests verify that your code behaves as expected — especially after changes or additions — and alert you when something breaks. They remove the need for a human to manually test your code, and there is no need to run them on the actual production data.

    Convenient, isn't it?

    Types of Automated Testing

    There are many different kinds of testing, and covering them all is beyond the scope of this article.

    Let's focus on the two main types most relevant to a data scientist:

    • Unit Tests
    • Integration Tests

    Unit Tests

    Image by author. Illustration of the concept of a unit test.

    Unit tests exercise the smallest pieces of code in isolation (e.g., a single function).

    The function should do one thing only, to make it easy to test. Give it a known input, and check that the output is as expected.
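
    For instance, here is a minimal sketch of what that looks like with pytest, using a hypothetical add_vat function (both the function and the test are illustrative, not part of the pipeline we look at later):

    import pytest

    def add_vat(price: float, rate: float = 0.2) -> float:
        """Return the price with VAT added (hypothetical example)."""
        return price * (1 + rate)

    def test_add_vat():
        # Known input -> check the output is as expected
        assert add_vat(100.0) == pytest.approx(120.0)
        assert add_vat(100.0, rate=0.0) == pytest.approx(100.0)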

    Integration Tests

    Image by author. Illustration of the concept of an integration test.

    Integration tests check how multiple components work together.

    For us data scientists, this means checking whether the data loading, merging, and preprocessing steps produce the expected final dataset, given a known input dataset.

    A practical example

    Enough with the theory, let's see how it works in practice.

    We'll go through a simple example where a data scientist has written some code in a Jupyter notebook (or script), of a kind many data scientists will have seen in their jobs.

    We'll pick apart why the code is bad. Then, we'll try to make it better.

    By better, we mean:

    • Easy to test
    • Easy to read

    which ultimately means easy to maintain, because in the long run, good code is code that works, keeps working, and is easy to maintain.

    We'll then design some unit tests for our improved code, highlighting why the changes make testing easier. To keep this article from becoming too long, I'll defer examples of integration testing to a future article.

    Then, we'll go through some rules of thumb for deciding what code to test.

    Finally, we'll cover how to run tests and how to structure your projects.


    Example Pipeline

    We'll use the following pipeline as an example:

    # bad_pipeline.py

    import pandas as pd

    # Load data
    df1 = pd.read_csv("data/users.csv")
    df2 = pd.read_parquet("data/transactions.parquet")
    df3 = pd.read_parquet("data/products.parquet")

    # Preprocessing
    # Merge user and transaction data
    df = df2.merge(df1, how='left', on='user_id')

    # Merge with product data
    df = df.merge(df3, how='left', on='product_id')

    # Filter for recent transactions
    df = df[df['transaction_date'] > '2023-01-01']

    # Calculate total price
    df['total_price'] = df['quantity'] * df['price']

    # Create customer segment
    df['segment'] = df['total_price'].apply(lambda x: 'high' if x > 100 else 'low')

    # Drop unnecessary columns
    df = df.drop(['user_email', 'product_description', 'price'], axis=1)

    # Group by user and segment to get total amount spent
    df = df.groupby(['user_id', 'segment']).agg({'total_price': 'sum'}).reset_index()

    # Save output
    df.to_parquet("data/final_output.parquet")

    In real life, we'd see hundreds of lines of code crammed into a single notebook. But this script exemplifies all of the problems that need fixing in typical data science notebooks.

    This code does the following:

    1. Loads user, transaction, and product data.
    2. Merges them into a unified dataset.
    3. Filters recent transactions.
    4. Adds calculated fields (total_price, segment).
    5. Drops irrelevant columns.
    6. Aggregates total spending per user and segment.
    7. Saves the result as a Parquet file.

    Why is this pipeline bad?

    Oh, there are so many reasons coding in this manner is bad, depending on which lens you look at it through. It's not the content that's the problem, but how it is structured.

    While there are many angles from which we could discuss the disadvantages of writing code this way, for this article we'll focus on testability.

    1. Tightly coupled logic (in other words, no modularity)

    All operations are crammed into a single script and run at once. It's unclear what each part does unless you read every line. Even for a script this simple, that is difficult to do. In real-life scripts, where code can reach hundreds of lines, it only gets worse.

    This makes it impossible to test.

    The only way to do so would be to run the entire thing from start to finish, most likely on the actual data you're going to use.

    If your dataset is small, then perhaps you can get away with this. But usually, data scientists are working with a truckload of data, so it's infeasible to run any kind of test or sanity check quickly.

    We need to be able to break the code up into manageable chunks that do one thing only, and do it well. Then, we can control what goes in, and check that what we expect comes out.

    2. No Parameterization

    Hardcoded file paths and values like 2023-01-01 make the code brittle and inflexible. Again, this makes it hard to test with anything but the live/production data.

    There's no flexibility in how we can run the code; everything is fixed.

    What's worse, as soon as you change something, you have no assurance that nothing has broken further down the script.

    For example, how many times have you made a change that you thought was benign, only to run the code and find a completely unexpected part of it breaking?

    How can we improve it?

    Now, let's see step by step how we can improve this code.

    Please note, we'll assume we're using the pytest module for our tests going forward.

    1. A clear, configurable entry point

    def run_pipeline(
        user_path: str,
        transaction_path: str,
        product_path: str,
        output_path: str,
        cutoff_date: str = '2023-01-01'
    ):
        # Load data
        ...
    
        # Process data
        ...
    
        # Save result
        ...

    We start off by creating a single function that we can run from anywhere, with clear arguments that can be changed.

    What does this achieve?

    It allows us to run the pipeline under specific test scenarios.

    # GIVEN SOME TEST DATA
    test_args = dict(
        user_path = "<test-dir>/fake_users.csv",
        transaction_path = "<test-dir>/fake_transaction.parquet",
        product_path = "<test-dir>/fake_products.parquet",
        output_path = "<test-dir>/fake_output.parquet",
        cutoff_date = "<yyyy-MM-dd>",
    )

    # RUN THE PIPELINE THAT'S TO BE TESTED
    run_pipeline(**test_args)

    # TEST THE OUTPUT IS AS EXPECTED
    output = <load-the-data>
    expected_output = <define-the-expected-data>
    assert output == expected_output

    Immediately, you can start passing in different inputs and different parameters, depending on the edge case you want to test for.

    It gives you the flexibility to run the code in different settings by making it easier to control the inputs and outputs of your code.

    Writing your pipeline in this way paves the way for integration testing it. More on this in a later article.
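
    Even without going into detail, here is roughly what such a test could look like using pytest's built-in tmp_path fixture. This is a sketch under a few assumptions: pandas can write Parquet (e.g. via pyarrow), and run_pipeline behaves like the version we build below, so we only assert a couple of key properties of the output rather than the full dataframe.

    import pandas as pd

    def test_run_pipeline_end_to_end(tmp_path):
        # GIVEN: tiny, known input files written to a pytest temp directory
        users = pd.DataFrame({"user_id": [1], "name": ["John"]})
        transactions = pd.DataFrame({
            "user_id": [1], "product_id": [1],
            "transaction_date": ["2024-06-01"], "quantity": [2], "price": [10.0],
        })
        products = pd.DataFrame({"product_id": [1], "product_name": ["apple"]})
        users.to_csv(tmp_path / "fake_users.csv", index=False)
        transactions.to_parquet(tmp_path / "fake_transactions.parquet")
        products.to_parquet(tmp_path / "fake_products.parquet")

        # RUN: the pipeline under test, end to end
        # (assumes run_pipeline is importable from your pipeline module)
        run_pipeline(
            user_path=str(tmp_path / "fake_users.csv"),
            transaction_path=str(tmp_path / "fake_transactions.parquet"),
            product_path=str(tmp_path / "fake_products.parquet"),
            output_path=str(tmp_path / "output.parquet"),
            cutoff_date="2023-01-01",
        )

        # TEST: the output has the rows and columns we expect
        output = pd.read_parquet(tmp_path / "output.parquet")
        assert len(output) == 1
        assert output["product_name"].tolist() == ["apple"]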

    2. Group code into meaningful chunks that do one thing, and do it well

    Now, this is where a bit of art comes in – different people will organise code differently depending on which parts they find important.

    There is no right or wrong answer, but the common-sense rule is to make sure a function does one thing and does it well. Do that, and it becomes easy to test.

    One way we could group our code is shown below:

    def load_data(user_path: str, transaction_path: str, product_path: str):
        """Load data from the specified paths."""
        df1 = pd.read_csv(user_path)
        df2 = pd.read_parquet(transaction_path)
        df3 = pd.read_parquet(product_path)
        return df1, df2, df3

    def create_user_product_transaction_dataset(
        user_df: pd.DataFrame,
        transaction_df: pd.DataFrame,
        product_df: pd.DataFrame
    ):
        """Merge user, transaction, and product data into a single dataset.

        The dataset identifies which user bought which product, at what time and price.

        Args:
            user_df (pd.DataFrame):
                A dataframe containing user information. Must have a column
                'user_id' that uniquely identifies each user.

            transaction_df (pd.DataFrame):
                A dataframe containing transaction information. Must have
                columns 'user_id' and 'product_id' that are foreign keys
                to the user and product dataframes, respectively.

            product_df (pd.DataFrame):
                A dataframe containing product information. Must have a
                column 'product_id' that uniquely identifies each product.

        Returns:
            A dataframe that merges the user, transaction, and product data
            into a single dataset.
        """
        df = transaction_df.merge(user_df, how='left', on='user_id')
        df = df.merge(product_df, how='left', on='product_id')
        return df

    def drop_unnecessary_date_period(df: pd.DataFrame, cutoff_date: str):
        """Drop transactions that occurred before the cutoff date.

        Note:
            Anything before the cutoff date can be dropped because
            of <reasons>.

        Args:
            df (pd.DataFrame): A dataframe with a column `transaction_date`
            cutoff_date (str): A date in the format 'yyyy-MM-dd'

        Returns:
            A dataframe containing only the transactions that occurred after the cutoff date.
        """
        df = df[df['transaction_date'] > cutoff_date]
        return df

    def compute_secondary_features(df: pd.DataFrame) -> pd.DataFrame:
        """Compute secondary features.

        Args:
            df (pd.DataFrame): A dataframe with columns `quantity` and `price`

        Returns:
            A dataframe with columns `total_price` and `segment`
            added to it.
        """
        df['total_price'] = df['quantity'] * df['price']
        df['segment'] = df['total_price'].apply(lambda x: 'high' if x > 100 else 'low')
        return df
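
    For completeness, the remaining steps from the original script (dropping columns that are no longer needed and aggregating spend per user and segment) could be wrapped in the same style. The sketch below is one possible grouping, not the only one; note that the explicit column drop becomes unnecessary here, because the aggregation only keeps the grouped and aggregated columns.

    def aggregate_spend_per_user_segment(df: pd.DataFrame) -> pd.DataFrame:
        """Sum total spend per user and segment.

        Args:
            df (pd.DataFrame): A dataframe with columns `user_id`, `segment`
                and `total_price`.

        Returns:
            A dataframe with one row per (user_id, segment) pair and the
            summed `total_price`.
        """
        df = df.groupby(['user_id', 'segment']).agg({'total_price': 'sum'}).reset_index()
        return df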

    What does the grouping achieve?

    Better documentation

    Well, first of all, you end up with some natural real estate in your code for adding docstrings. Why is this important? Well, have you ever tried reading your own code a month after writing it?

    People forget details very quickly, and even code *you've* written can become indecipherable within just a few days.

    You have to document what the code is doing, what it expects as input, and what it returns, at the very least.

    Including docstrings in your code provides context and sets expectations for how a function should behave, making it easier to understand and debug failing tests in the future.

    Better Readability

    By 'encapsulating' the complexity of your code into smaller functions, you make it easier to read and understand the overall flow of a pipeline without having to read every single line of code.

    def run_pipeline(
        user_path: str,
        transaction_path: str,
        product_path: str,
        output_path: str,
        cutoff_date: str
    ):
        user_df, transaction_df, product_df = load_data(
            user_path,
            transaction_path,
            product_path
        )
        df = create_user_product_transaction_dataset(
            user_df,
            transaction_df,
            product_df
        )
        df = drop_unnecessary_date_period(df, cutoff_date)
        df = compute_secondary_features(df)
        df.to_parquet(output_path)

    You've provided the reader with a hierarchy of information, and it gives them a step-by-step breakdown of what's happening in the run_pipeline function through meaningful function names.

    The reader then has the choice of looking into each function definition and the complexity inside, depending on their needs.

    The act of combining code into 'meaningful' chunks like this demonstrates the concepts known as 'encapsulation' and 'abstraction'.

    For more details on encapsulation, you can read my article on this here.

    Smaller units of code to test

    Next, we have a very specific, well-defined set of functions that each do one thing. This makes them easier to test and debug, since we only have one thing to worry about at a time.

    See below for how we construct a test.

    Constructing a Unit Test

    1. Follow the AAA Pattern

    def test_create_user_product_transaction_dataset():
        # GIVEN
    
        # RUN
    
        # TEST
        ...

    First, we define a test function, appropriately named test_<function_name>.

    Then, we divide it into three sections:

    • GIVEN: the inputs to the function, and the expected output. Set up everything required to run the function we want to test.
    • RUN: run the function with the given inputs.
    • TEST: compare the output of the function to the expected output.

    This is a generic pattern that unit tests should follow. The standard name for this design pattern is the 'AAA pattern', which stands for Arrange, Act, Assert.

    I don't find this naming intuitive, which is why I use GIVEN, RUN, TEST.

    2. Arrange: set up the test

    # GIVEN
    user_df = pd.DataFrame({
    'user_id': [1, 2, 3], 'name': ["John", "Jane", "Bob"]
    })
    transaction_df = pd.DataFrame({
        'user_id': [1, 2, 3],
        'product_id': [1, 1, 2],
        'extra-column1-str': ['1', '2', '3'],
        'extra-column2-int': [4, 5, 6],
        'extra-column3-float': [1.1, 2.2, 3.3],
    })
    product_df = pd.DataFrame({
        'product_id': [1, 2], 'product_name': ["apple", "banana"]
    })
    expected_df = pd.DataFrame({
        'user_id': [1, 2, 3],
        'product_id': [1, 1, 2],
        'extra-column1-str': ['1', '2', '3'],
        'extra-column2-int': [4, 5, 6],
        'extra-column3-float': [1.1, 2.2, 3.3],
    'name': ["John", "Jane", "Bob"],
        'product_name': ["apple", "apple", "banana"],
    })

    Second, we define the inputs to the function, and the expected output. This is where we bake in our expectations about what the inputs will look like, and what the output should look like.

    As you can see, we don't need to define every single column we expect to be present in production, only the ones that matter for the test.

    For example, transaction_df defines the user_id and product_id columns properly, whilst also adding three columns of different types (str, int, float) to simulate the fact that there will be other columns.

    The same goes for product_df and user_df, though these are expected to be dimension tables, so simply defining the name and product_name columns will suffice.

    3. Act: run the function to test

    # RUN
    output_df = create_user_product_transaction_dataset(
        user_df, transaction_df, product_df
    )

    Third, we run the function with the inputs we defined, and collect the output.

    4. Assert: test that the outcome is as expected

    # TEST
    pd.testing.assert_frame_equal(
        output_df,
        expected_df
    )

    And finally, we check whether the output matches the expected output.

    Note, we use the pandas testing module since we're comparing pandas dataframes. For non-pandas objects, you can use a plain assert statement instead.
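
    For example, for a hypothetical helper that returns a plain dictionary, an ordinary assert is all you need:

    from collections import Counter

    def summarise_counts(labels: list) -> dict:
        """Hypothetical helper that counts label occurrences."""
        return dict(Counter(labels))

    def test_summarise_counts():
        # GIVEN / RUN
        output = summarise_counts(["a", "b", "a"])
        # TEST: a plain assert works for ordinary Python objects
        assert output == {"a": 2, "b": 1}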

    The full testing code will look like this:

    import pandas as pd
    
    def test_create_user_product_transaction_dataset():
        # GIVEN
        user_df = pd.DataFrame({
            'user_id': [1, 2, 3], 'name': ["John", "Jane", "Bob"]
        })
        transaction_df = pd.DataFrame({
            'user_id': [1, 2, 3],
            'product_id': [1, 1, 2],
            'transaction_date': ["2021-01-01", "2021-01-01", "2021-01-01"],
            'extra-column1': [1, 2, 3],
            'extra-column2': [4, 5, 6],
        })
        product_df = pd.DataFrame({
            'product_id': [1, 2], 'product_name': ["apple", "banana"]
        })
        expected_df = pd.DataFrame({
            'user_id': [1, 2, 3],
            'product_id': [1, 1, 2],
            'transaction_date': ["2021-01-01", "2021-01-01", "2021-01-01"],
            'extra-column1': [1, 2, 3],
            'extra-column2': [4, 5, 6],
            'name': ["John", "Jane", "Bob"],
            'product_name': ["apple", "apple", "banana"],
        })
        
        # RUN
        output_df = create_user_product_transaction_dataset(
            user_df, transaction_df, product_df
        )
    
        # TEST
        pd.testing.assert_frame_equal(
            output_df,
            expected_df
        )

    To organise your tests better and keep them cleaner, you can start using a combination of classes, fixtures, and parametrisation.

    It's beyond the scope of this article to delve into each of these concepts in detail, so for those who are interested, I point you to the pytest How-To guide as a reference for these concepts.
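
    As a quick taste, here is a minimal sketch of a fixture and of parametrisation applied to the segment logic. The threshold of 100 comes from our pipeline; the module path in the import and everything else here is illustrative.

    import pandas as pd
    import pytest

    # Hypothetical module path; adjust to wherever your pipeline code lives
    from src.pipelines.data_processing import compute_secondary_features

    @pytest.fixture
    def transactions_df() -> pd.DataFrame:
        """Reusable test data shared across tests."""
        return pd.DataFrame({"quantity": [1, 5], "price": [10.0, 50.0]})

    def test_compute_secondary_features_adds_columns(transactions_df):
        output = compute_secondary_features(transactions_df)
        assert {"total_price", "segment"} <= set(output.columns)

    @pytest.mark.parametrize(
        "price, expected_segment",
        [(10.0, "low"), (100.0, "low"), (250.0, "high")],
    )
    def test_segment_threshold(price, expected_segment):
        df = pd.DataFrame({"quantity": [1], "price": [price]})
        output = compute_secondary_features(df)
        assert output.loc[0, "segment"] == expected_segment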


    What to Test?

    Now that we've created a unit test for one function, we turn our attention to the remaining functions that we have. Astute readers will now be thinking:

    "Wow, do I have to write a test for everything? That's a lot of work!"

    Yes, it's true. It's extra code that you have to write and maintain.

    But the good news is, it's not necessary to test absolutely everything; you just need to know what's important in the context of what your work is doing.

    Below, I'll give you a few rules of thumb and the considerations I make when deciding what to test, and why.

    1. Is the code critical to the outcome of the project?

    There are critical junctures in a data science project that are pivotal to its success, many of which come at the data-preparation and model evaluation/explanation stages.

    The test we saw above for the create_user_product_transaction_dataset function is a good example.

    This dataset will form the basis of all downstream modelling activity.

    If the user -> product join is incorrect in any way, it will impact everything we do downstream.

    Thus, it's worth taking the time to ensure this code works correctly.

    At a bare minimum, the test we've established makes sure the function behaves in exactly the same way as it used to after every code change.

    Example

    Suppose the join needs to be rewritten to improve memory efficiency.

    After making the change, the unit test ensures the output stays the same.

    If something was inadvertently altered such that the output started to look different (missing rows or columns, different datatypes), the test would immediately flag the issue.
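
    For illustration, suppose the rewritten version accidentally switched the join type while optimising (a hypothetical slip, not code from this article):

    def create_user_product_transaction_dataset(user_df, transaction_df, product_df):
        # Hypothetical "optimised" rewrite: inner joins keep the intermediate
        # dataframes smaller, but silently drop transactions that have no
        # matching user or product.
        df = transaction_df.merge(user_df, how='inner', on='user_id')
        df = df.merge(product_df, how='inner', on='product_id')
        return df

    If the test data includes a transaction whose user_id has no match in user_df (a case worth adding), the original left-join version keeps that row with NaNs while this version drops it, so pd.testing.assert_frame_equal fails straight away.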

    2. Is the code primarily using third-party libraries?

    Take the load_data function, for example:

    def load_data(user_path: str, transaction_path: str, product_path: str):
        """Load information from specified paths"""
        df1 = pd.read_csv(user_path)
        df2 = pd.read_parquet(transaction_path)
        df3 = pd.read_parquet(product_path)
        return df1, df2, df3

    This function encapsulates the process of reading data from different files. Under the hood, all it does is call three pandas load functions.

    The main value of this code is the encapsulation.

    Meanwhile, it doesn't contain any business logic, and in my opinion, the function scope is so specific that you wouldn't expect any logic to be added in the future.

    If it were, then the function name should be changed, since it would be doing more than just loading data.

    Therefore, this function does not require a unit test.

    A unit test for this function would just be testing that pandas works properly, and we should be able to trust that pandas has tested their own code.

    3. Is the code likely to change over time?

    This point has already been implied in 1 & 2. For maintainability, this is perhaps the most important consideration.

    You should be asking:

    • How complex is the code? Are there many ways to achieve the same output?
    • What might cause someone to modify this code? Is the data source likely to change in the future?
    • Is the code clear? Are there behaviours that could easily be overlooked during a refactor?

    Take create_user_product_transaction_dataset, for example.

    • The input data may have changes to their schema in the future.
    • Perhaps the dataset becomes larger, and we need to split the merge into multiple steps for performance reasons.
    • Perhaps a dirty hack needs to go in temporarily to handle nulls due to an issue with the data source.

    In each case, a change to the underlying code may be necessary, and each time we need to make sure the output doesn't change.

    In contrast, load_data does nothing but load data from files.

    I don't see this changing much in the future, apart from perhaps a change in file format. So I'd defer writing a test for this until a significant change to the upstream data source occurs (something like that would most likely require changing a lot of the pipeline anyway).

    Where to Put Tests and How to Run Them

    So far, we've covered how to write testable code and how to create the tests themselves.

    Now, let's look at how to structure your project to include tests — and how to run them effectively.

    Project Structure

    Generally, a data science project can follow the structure below:

    <project-name>
    |-- data                # where data is stored
    |-- conf                # where config files for your pipelines are stored
    |-- src                 # all the code needed to reproduce your project is stored here
    |-- notebooks           # all the code for one-off experiments, explorations, etc. is stored here
    |-- tests               # all the tests are stored here
    |-- pyproject.toml
    |-- README.md
    |-- requirements.txt

    The src folder should contain all of the project code that is critical to the delivery of your project.

    General rule of thumb

    If it's code you expect to run multiple times (with different inputs or parameters), it should go in the src folder.

    Examples include:

    • data processing
    • feature engineering
    • model training
    • model evaluation

    Meanwhile, anything that is a one-off piece of analysis can live in Jupyter notebooks, stored in the notebooks folder.

    This mainly includes:

    • EDA
    • ad-hoc model experimentation
    • analysis of local model explanations

    Why?

    Because Jupyter notebooks are notoriously flaky, difficult to manage, and hard to test. We don't want to be rerunning critical code via notebooks.

    The Test Folder Structure

    Let's say your src folder looks like this:

    src
    |-- pipelines
        |-- data_processing.py
        |-- feature_engineering.py
        |-- model_training.py
        |-- __init__.py

    Each file contains functions and pipelines, similar to the example we saw above.

    The tests folder should then look like this:

    tests
    |-- pipelines
        |-- test_data_processing.py
        |-- test_feature_engineering.py
        |-- test_model_training.py

    where the tests directory mirrors the structure of the src directory and each file begins with the test_ prefix.

    The reason for this is simple:

    • It's easy to find the tests for a given file, since the tests folder structure mirrors the src folder.
    • It keeps test code neatly separated from source code.
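
    For example, a test file then imports the code it exercises directly from the source package. The sketch below assumes the project is installed in editable mode (e.g. pip install -e .) or that the src directory is otherwise importable; adjust the module path to your own layout.

    # tests/pipelines/test_data_processing.py
    from src.pipelines.data_processing import create_user_product_transaction_dataset

    def test_create_user_product_transaction_dataset():
        ...  # GIVEN / RUN / TEST, as shown earlier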

    Running Tests

    Once you have your tests set up as above, you can run them in a variety of ways:

    1. Via the terminal

    pytest -v

    2. Via a code editor

    I use this for all my projects.

    Visual Studio Code is my editor of choice; it auto-discovers the tests for me, and it's super easy to debug them.

    After having a read of the docs, I don't think there's any point in me reiterating their contents, since they're quite self-explanatory, so here's the link.

    Similarly, most code editors will have similar capabilities, so there's no excuse for not writing tests.

    It really is easy; read the docs and get started.

    3. Via a CI pipeline (e.g. GitHub Actions, GitLab, etc.)

    It's easy to set up tests to run automatically on pull requests via GitHub.

    The idea is that whenever you open a PR, it will automatically discover and run the tests for you.

    This means that even if you forget to run the tests locally via methods 1 or 2, they will always be run for you whenever you want to merge your changes.

    Again, there's no point in me reiterating the docs; here's the link.

    The End Goal We Want To Achieve

    Following on from the instructions above, I think it's a better use of both of our time to highlight some important points about what we want to achieve through automated tests, rather than regurgitating instructions you can find in the links above.

    First and foremost, automated tests are written to establish trust in your code, and to minimise human error.

    This is for the benefit of:

    • Yourself
    • Your team
    • and the business as a whole.

    Therefore, to really get the most out of the tests you've written, you need to get round to setting up a CI pipeline.

    It makes a world of difference being able to forget to run the tests locally, and still have the assurance that they will be run when you create a PR or push some changes.

    You don't want to be the person responsible for a bug that causes a production incident because you forgot to run the tests, or to be the one who missed a bug during a PR review.

    So please, if you write some tests, invest some time into setting up a CI pipeline. Read the GitHub docs, I implore you. It's trivial to set up, and it will do you wonders.

    Closing Remarks

    After reading this article, I hope it has impressed upon you:

    1. The importance of writing tests, especially within the context of data science
    2. How easy it is to write and run them

    But there's one last reason why you should know how to write automated tests.

    That reason is that

    Data Science is changing.

    Data science used to be largely proof-of-concept work: building models in Jupyter notebooks and sending them to engineers for deployment. In the meantime, data scientists built up a notoriety for writing terrible code.

    But now, the industry has matured.

    It's becoming easier to quickly build and deploy models as MLOps and ML engineering mature.

    Thus,

    • model building
    • deployment
    • retraining
    • maintenance

    are becoming the responsibility of machine learning engineers.

    At the same time, the data wrangling that we used to do has become so complex that it is now handed over to dedicated, specialised data engineering teams.

    As a result, data science sits in a very narrow space between these two disciplines, and quite soon the lines between data scientist and data analyst will blur.

    The trajectory is that data scientists will no longer be building cutting-edge models, but will become more business and product focused, producing insights and MI reports instead.

    If you want to stay closer to the model building, it no longer suffices to just be able to code.

    You need to learn how to code properly, and how to maintain your code well. Machine learning is no longer a novelty, no longer just PoCs; it is becoming software engineering.

    If You Want To Learn More

    If you want to learn more about software engineering skills applied to Data Science, here are some related articles:

    You can also become a member on Patreon here!

    We have dedicated discussion threads for all articles; ask me questions about automated testing, discuss the topic in more detail, and share experiences with other data scientists. The learning doesn't have to stop here.

    You can find the dedicated discussion thread for this article here.


