CLI Interactive Demos

These CLI demos showcase practical data quality workflows that you can use!

🎬 Workflow-Based Demonstrations

Essential validations for everyday data quality checks
Data exploration tools that require no Python knowledge
CI/CD integration patterns for automated data quality
Complete pipelines from exploration to production validation

Prerequisites

To follow along with these demonstrations:

pip install pointblank
pb --help  # Verify installation

Getting Started with the CLI

Learn the basics of Pointblank’s CLI and run your first validation:

Getting Started CLI overview and your first data quality validation

Essential Data Quality Validations

See the most commonly used validation checks that catch critical data issues:

Essential Validations Duplicate detection, null checks, and data extract debugging

Data Exploration Tools

Discover how to profile and explore data using CLI tools that are quick and easy to use:

Data Exploration Preview data, find missing values, and generate column summaries

CI/CD Integration & Automation

Learn how to integrate data quality checks into automated pipelines:

CI/CD Integration Exit codes, pipeline integration, and automated quality gates

Complete Data Quality Workflow

Follow an end-to-end data quality pipeline combining exploration, validation, and profiling:

Complete Workflow Full pipeline: explore → validate → automate

Getting Started

Ready to implement data quality workflows? Here’s how to get started:

1. Install and Verify

pip install pointblank
pb --help

2. Explore Various Data Sources

# Try previewing a built-in dataset
pb preview small_table

# Access local files (even use patterns to combine multiple Parquet files)
pb preview sales_data.csv
pb scan "data/*.parquet"

# Inspect datasets in GitHub repositories (no need to download the data!)
pb preview "https://github.com/user/repo/blob/main/data.csv"
pb missing "https://raw.githubusercontent.com/user/repo/main/sales.parquet"

# Work with DB tables through connection strings
pb info "duckdb:///warehouse/analytics.ddb::customers"

3. Run Essential Validations

# Check for duplicate rows
pb validate small_table --check rows-distinct

# Validate data from multiple sources
pb validate "data/*.parquet" --check col-vals-not-null --column customer_id
pb validate "https://github.com/user/repo/blob/main/sales.csv" --check rows-distinct

# Extract failing data for debugging
pb validate small_table --check col-vals-gt --column a --value 5 --show-extract

4. Integrate with CI/CD

# Use exit codes for automation (0 = pass, 1 = fail)
pb validate small_table --check rows-distinct --exit-code