[tutorial_ml:03] Testing and Experimenting with your package

Section 003

3 min readDec 8, 2023

Photo by Pixabay from Pexels: https://www.pexels.com/photo/man-standing-on-rock-against-clear-blue-sky-256894/

Manual testing

Once your Python package is installed — whether in development mode (-e) or using the built distribution—you can easily perform initial manual tests. These tests help verify the basic functionality of your package. Here's a simple process to get started:

Open a Python Interpreter: Launch a Python interpreter in your terminal with the python command.python

Import and Test Your Package: In the Python interpreter, try importing your package and running some basic commands:

import tutorial
# This should print the package location (site-packages or local directory).
print(tutorial.__file__)

from tutorial import wrangler
datum = wrangler.generate_ad_data(50)
print(len(datum))  # Verify the generated data length.

Validate Package Installation: The path printed by tutorial.__file__ will indicate whether the package is installed in your site-packages (for a regular installation) or in your local directory (for an editable installation).

Integrating Pytest

Beyond manual testing, it’s beneficial to incorporate automated testing into your development workflow. pytest is an excellent tool for this purpose, offering a range of features for comprehensive testing. Here’s how you can leverage pytest in your package:

Basic function tests: Verify that individual functions behave as expected.
Integration tests: Ensure that different parts of your package work together correctly.
Regression tests: Check that new changes don’t break existing functionality.

In this tutorial, we have added a few simple testing functionalities to exemplify the usage of pytest on the project.

Testing imports:

def test_imports():
    from tutorial import wrangler
    assert hasattr(wrangler, 'generate_ad_data')

Testing the generate_ad_data:

def test_generate_ad_data():
    from tutorial.wrangler import generate_ad_data
    data = generate_ad_data(10)
    assert len(data) == 10  # Validate the data count.

Testing Data Type of AdUnit:

def test_ad_unit_type():
    from tutorial.wrangler import generate_ad_data
    from tutorial.schemas.ad_unit import AdUnit
    data = generate_ad_data(1)
    assert isinstance(data[0], AdUnit)  # Confirm that data is of AdUnit type.

Testing the Entire Pipeline:

def test_pipeline():
    from tutorial.wrangler import generate_ad_data, clean_data, preprocess_data, analyze_ad_performance
    data = generate_ad_data(10)
    cleaned_data = clean_data(data)
    preprocessed_data = preprocess_data(cleaned_data)
    model, predictions = analyze_ad_performance(preprocessed_data)
    assert len(predictions) == 2  # Or any other relevant assertion.

Integrating pytest enhances the reliability and maintainability of your package by automating the testing process and enabling rapid identification of issues.