Deductive Pipeline API

The Deductive Pipeline API runs a set of data validation, cleansing, and anonymization rules against an incoming dataset in a data pipeline.

It can run basic data type validation, session validation, duplication checks, and verification against external reference lists. When data does not meet the validation rules, the API can attempt to automatically fix the data or quarantine it for later investigation.
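
The validate-then-fix-or-quarantine flow described above can be sketched in plain Python. This is an illustrative example of the pattern, not the actual Deductive Pipeline API; the field names and the whitespace-trimming fix are hypothetical.

```python
from datetime import datetime

def validate_row(row):
    """Return a list of rule violations for one record (illustrative rules)."""
    errors = []
    if not isinstance(row.get("user_id"), str) or not row["user_id"]:
        errors.append("user_id: expected non-empty string")
    try:
        datetime.strptime(row.get("date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("date: expected YYYY-MM-DD")
    return errors

def run_pipeline(rows):
    """Validate each row; attempt one automatic fix, else quarantine it."""
    clean, quarantined = [], []
    for row in rows:
        errors = validate_row(row)
        if not errors:
            clean.append(row)
            continue
        # Attempt an automatic fix: trim whitespace from string fields and retry.
        fixed = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if not validate_row(fixed):
            clean.append(fixed)
        else:
            quarantined.append({"row": row, "errors": errors})
    return clean, quarantined
```

A record that fails only because of stray whitespace is repaired and kept; a record that still fails after the fix is quarantined along with its list of violations, which is the raw material for the report object described below.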

The API returns a report object that details the cleansing performed on the data. Entries in this report can be logged locally, used to create an email report, or forwarded to a centralized logging stack such as the Elastic Stack.

Key features:

  • Data type validation: Validating incoming datasets with string, number, date, and session type formatting and range checks using the Deductive Data Pipeline API

  • Anonymizing data: The Deductive Pipeline API supports tokenization, hashing, and encryption of incoming datasets for anonymization and pseudonymization

  • Referential integrity: Using the Deductive Pipeline API to validate data against other known-good datasets to ensure referential integrity
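
The anonymization techniques listed above differ in reversibility, which the standard library can illustrate. The sketch below is generic Python, not the product's API: the secret key and vault are hypothetical stand-ins for managed key storage.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key; in practice this would come from a key store.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(value: str) -> str:
    """Deterministic keyed hash: the same input always maps to the same
    token, so joins across datasets still work (pseudonymization)."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

# Token vault mapping original values to random tokens (tokenization).
_token_vault = {}

def tokenize(value: str) -> str:
    """Random token stored in a vault, so the mapping can later be reversed
    by whoever controls the vault."""
    token = _token_vault.get(value)
    if token is None:
        token = secrets.token_hex(8)
        _token_vault[value] = token
    return token
```

Keyed hashing preserves joinability without a lookup table; tokenization keeps a vault so the original values can be recovered under controlled access.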

The API can be run either locally, using a Python client library, or through the AWS Marketplace.
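
A referential-integrity check of the kind listed among the key features amounts to testing each record's foreign-key field against a known-good reference set. The following is a minimal generic sketch, with an assumed country-code field and reference list, not the product's actual interface.

```python
# Illustrative reference list of known-good country codes.
VALID_COUNTRIES = {"US", "GB", "DE", "FR"}

def check_referential_integrity(rows, field, reference):
    """Split rows by whether `field` appears in the reference set."""
    valid = [r for r in rows if r.get(field) in reference]
    invalid = [r for r in rows if r.get(field) not in reference]
    return valid, invalid
```

Records whose field value is missing from the reference set would then be routed through the same fix-or-quarantine handling as any other validation failure.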

Learn more

The following articles cover the full functionality of the Pipeline API:

  • Deductive Pipeline API on AWS - The Deductive Pipeline API is available through the AWS Marketplace
  • Deductive Pipeline Python Client - Deductive Tools includes a client for the Pipeline API
  • Deductive Pipeline API: Sample Data - Sample files to demonstrate usage of the Deductive Pipeline API
  • Deductive Pipeline API: Validating basic data types - Validating incoming datasets with basic string, number, and date type formatting and range checks using the Deductive Data Pipeline API
  • Deductive Pipeline API: Anonymizing data - The Deductive Pipeline API supports tokenization, hashing, and encryption of incoming datasets for anonymization and pseudonymization
  • Deductive Pipeline API: Referential Integrity - Using the Deductive Pipeline API to validate data against other known-good datasets to ensure referential integrity
  • Deductive Pipeline API: Handling invalid data - Invalid data can be quarantined or automatically fixed by the Deductive Data Pipeline API
  • Deductive Pipeline API: Working with session data - The Deductive Pipeline API can check for gaps and overlaps in session data and automatically fix them
  • Deductive Pipeline API: Reporting and monitoring data quality - The Deductive Pipeline API logs data that does not meet the defined rules and quarantines bad data
  • Deductive Pipeline API: Full API reference - A field-by-field breakdown of the full functionality of the Deductive Data Pipeline API
