How to enrich an uploaded dataset end to end

This guide shows you how to take a raw CSV of company or contact records, upload it to Landbase, run a sequence of dataset workflows to match and enrich the records, and download the finished output. This is the right approach when you have your own data (a CRM export, a spreadsheet, a scrape) and want Landbase to match records to its database and add enrichment fields.

Prerequisites

  • landbase-cli installed and authenticated
  • A CSV or Excel file with at least one identifier column (e.g. company name, domain, or LinkedIn URL)
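Both prerequisites can be checked from the shell before you upload anything. The sketch below is illustrative only: `have_cmd` and `has_identifier_column` are hypothetical helper names, not part of landbase-cli, and the header names in the pattern are assumptions rather than a list of headers Landbase is known to accept. It inspects only the first row of a CSV and does not handle Excel files.

```shell
# Hypothetical preflight helpers; not part of landbase-cli.

# Succeeds if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the CSV header row contains at least one column that looks
# like an identifier. The header names in the pattern are assumptions;
# adjust them to match your own export.
has_identifier_column() {
  head -n 1 "$1" | tr -d '\r' | tr '[:upper:]' '[:lower:]' \
    | grep -Eq '(^|,)"?(name|company_name|domain|linkedin_url)"?(,|$)'
}

# Usage:
#   have_cmd landbase-cli || echo "landbase-cli not found on PATH" >&2
#   have_cmd jq           || echo "jq not found on PATH" >&2
#   has_identifier_column ./crm-export.csv || echo "no identifier column" >&2
```

jq is included in the check because the commands in this guide pipe CLI output through it.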

Step 1: Upload your file

DATASET_ID=$(landbase-cli upload ./crm-export.csv --name="CRM Export Jun 2026" | jq -r .id)
echo "Dataset ID: $DATASET_ID"
The upload command returns a JSON object with an id field. Capturing it in $DATASET_ID avoids having to copy-paste it for every subsequent command.
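One caveat with this capture pattern: `jq -r .id` prints the literal string `null` when the JSON has no id field, and prints nothing when the upload produced no output, so `$DATASET_ID` can end up unusable without any visible error. A minimal guard might look like the following. `require_id` is a hypothetical helper name, and the sketch relies only on standard jq behavior, not on anything specific to landbase-cli.

```shell
# Hypothetical helper: fail fast if a captured ID is empty or "null".
# $1 is a label used in the error message, $2 is the captured value.
require_id() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "error: no $1 returned" >&2
    return 1
  fi
}

# Usage, right after the upload command:
#   require_id "dataset ID" "$DATASET_ID" || exit 1
```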

Step 2: Run the onboard workflow

The onboard workflow normalizes your data, maps columns to Landbase’s schema, and prepares it for downstream workflows.
landbase-cli workflow onboard "$DATASET_ID" --wait
--wait blocks until the workflow completes. For large datasets this can take a few minutes.

Step 3: Run the match workflow

The match workflow compares each row against the Landbase database and adds a confidence score and a Landbase record ID to matched rows.
landbase-cli workflow match "$DATASET_ID" --wait

Step 4: Run the enrich workflow

The enrich workflow pulls additional fields (industry, employee count, HQ location, LinkedIn, etc.) for matched records.
landbase-cli workflow enrich "$DATASET_ID" \
  --company-fields=industry,size_range,hq_city,hq_country \
  --wait
To see what fields are available for enrichment, run:
landbase-cli enrich --help

Step 5: Publish and find the output dataset

The publish workflow generates a downloadable file from the dataset.
landbase-cli workflow publish "$DATASET_ID" --format=csv --wait
After publishing, the output is a child dataset: a new dataset linked to your original. Find it with:
CHILD_ID=$(landbase-cli datasets lineage "$DATASET_ID" --direction=children \
  | jq -r '[.[] | select(.workflow_type == "publish")] | .[0].id')
echo "Output dataset: $CHILD_ID"
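If no child was created by a publish workflow, the filter above evaluates to null and `$CHILD_ID` becomes the string `null`. One way to surface that as a failure is to wrap the filter in a function and run jq with `-e`, which makes jq exit non-zero when it produces no usable result. This is a sketch, not the CLI's own mechanism: `first_publish_child` is a hypothetical name, and it assumes the output shape implied by the filter above (a JSON array of objects with id and workflow_type fields).

```shell
# Reads lineage JSON on stdin and prints the id of the first child whose
# workflow_type is "publish". Prints nothing and exits non-zero if there
# is no such child.
# Assumes a JSON array of objects with "id" and "workflow_type" fields.
first_publish_child() {
  jq -er '[.[] | select(.workflow_type == "publish")] | .[0].id // empty'
}

# Usage:
#   CHILD_ID=$(landbase-cli datasets lineage "$DATASET_ID" --direction=children \
#     | first_publish_child) || { echo "no publish child found" >&2; exit 1; }
```

The `// empty` alternative suppresses the `null` output, so the variable is left empty rather than holding a misleading value.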

Step 6: Download the output

landbase-cli datasets download "$CHILD_ID" ./enriched-crm.csv
Your enriched file is now saved locally.

Full pipeline as a script

#!/bin/bash
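# Exit on errors, unset variables, and failures anywhere in a pipeline.
# Plain "set -e" would not catch a failed landbase-cli call piped into jq.
set -euo pipefail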

FILE=${1:-crm-export.csv}
NAME=${2:-"CRM Export"}

echo "Uploading $FILE..."
DATASET_ID=$(landbase-cli upload "$FILE" --name="$NAME" | jq -r .id)
echo "Dataset: $DATASET_ID"

echo "Onboarding..."
landbase-cli workflow onboard "$DATASET_ID" --wait

echo "Matching..."
landbase-cli workflow match "$DATASET_ID" --wait

echo "Enriching..."
landbase-cli workflow enrich "$DATASET_ID" \
  --company-fields=industry,size_range,hq_city,hq_country \
  --wait

echo "Publishing..."
landbase-cli workflow publish "$DATASET_ID" --format=csv --wait

echo "Finding output..."
CHILD_ID=$(landbase-cli datasets lineage "$DATASET_ID" --direction=children \
  | jq -r '[.[] | select(.workflow_type == "publish")] | .[0].id')

echo "Downloading..."
landbase-cli datasets download "$CHILD_ID" ./enriched-output.csv

echo "Done. Output: enriched-output.csv"
Save this as enrich-pipeline.sh, make it executable with chmod +x enrich-pipeline.sh, and run it with:
./enrich-pipeline.sh my-contacts.csv "My Contacts Jun 2026"
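When a step fails partway through, the script stops, but the only hint about where it stopped is the last progress message. A small wrapper can make the failing step explicit. This is a sketch under the assumption that each step is a single command; `run_step` is a hypothetical helper, not something landbase-cli provides.

```shell
# Hypothetical wrapper: print a progress label, run the command, and on
# failure report which step failed while preserving its exit status.
run_step() {
  label=$1
  shift
  echo "$label..."
  "$@" || {
    status=$?
    echo "error: step '$label' failed (exit $status)" >&2
    return "$status"
  }
}

# Usage inside the pipeline script:
#   run_step "Onboarding" landbase-cli workflow onboard "$DATASET_ID" --wait
#   run_step "Matching"   landbase-cli workflow match "$DATASET_ID" --wait
```

Because the wrapper returns the original exit status, a script that exits on errors still stops at the first failed step.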

Troubleshooting

  • Workflow fails at onboard: Check that your CSV has recognizable column headers (name, domain, email, first_name, last_name, etc.). The onboard step maps your columns, so ambiguous headers can cause it to fail.
  • No child dataset found after publish: Run landbase-cli datasets lineage $DATASET_ID --direction=children without the jq filter to see all child datasets and their workflow types.
  • Timeout: Add --timeout=600 to any --wait command to allow up to 10 minutes. Large datasets take longer.