Platform Architecture

How data flows from raw sources through Bronze and Silver to analytics-ready Gold

SOURCES
  • Files (CSV, JSON, Parquet)
  • REST APIs
  • JDBC / Databases
  • Kafka Streams
  • CDC Events

BRONZE: RAW INGESTION
  • Schema-on-read landing
  • SCD2 change tracking
  • Dead letter quarantine
  • Quality enforcement
  • Ingestion audit log
  • Multi-format support
  • Portal: configure & deploy jobs

SILVER: CANONICAL ENTITIES
  • 3NF business modelling
  • Multi-source SCD2 merge
  • Domain entity design
  • Attribute-level priority
  • Quality contracts
  • OpenLineage tracking
  • Portal: model & transform

GOLD: ANALYTICS READY (coming soon)
  • Star schema output
  • Fact tables
  • Dimension tables
  • BI & Reporting
  • Aggregations
  • Data products

Data Platform Portal: self-service pipeline management across all layers

What the Portal does at each layer

Bronze Layer

Raw ingestion

  • Configure file, JDBC, API and streaming sources
  • SCD2 change tracking with surrogate keys
  • Dead letter quarantine for bad records
  • Quality thresholds and quarantine rules
  • Ingestion audit log — every run tracked
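
To make the Bronze behaviour concrete, here is a minimal in-memory sketch of SCD2 change tracking with surrogate keys and a dead letter quarantine. It is not the platform's implementation: the class `Scd2Table`, the `Version` fields and the "record must have a non-empty id" quality rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count
from typing import Optional


@dataclass
class Version:
    surrogate_key: int                    # unique per version, never reused
    business_key: str                     # natural key from the source system
    attributes: dict
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None marks the current version


class Scd2Table:
    """Hypothetical in-memory stand-in for a Bronze table with SCD2 history."""

    def __init__(self):
        self.versions = []
        self.dead_letters = []
        self._next_key = count(1)

    def current(self, business_key):
        """Return the open (current) version for a business key, if any."""
        for version in self.versions:
            if version.business_key == business_key and version.valid_to is None:
                return version
        return None

    def ingest(self, record, now=None):
        now = now or datetime.now(timezone.utc)
        # Assumed quality rule: every record needs a non-empty "id".
        if not record.get("id"):
            self.dead_letters.append({"record": record, "reason": "missing id"})
            return None
        key = str(record["id"])
        attrs = {k: v for k, v in record.items() if k != "id"}
        existing = self.current(key)
        if existing is not None:
            if existing.attributes == attrs:
                return existing           # unchanged: no new version
            existing.valid_to = now       # close the old version
        new_version = Version(next(self._next_key), key, attrs, now)
        self.versions.append(new_version)
        return new_version


table = Scd2Table()
table.ingest({"id": "c1", "city": "Oslo"})     # opens version 1
table.ingest({"id": "c1", "city": "Bergen"})   # closes version 1, opens version 2
table.ingest({"city": "Nowhere"})              # no id: sent to dead letters
```

Each change produces a new row with its own surrogate key while the business key stays stable, so history can be queried by validity interval. Rejected records are kept with a reason rather than dropped.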

Silver Layer

Canonical entities

  • Design canonical domain entities (3NF)
  • Multi-source SCD2 merge with attribute priority
  • AI-assisted entity modelling advisor
  • Entity-relationship diagram generation
  • Quality contracts and validation rules
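
Attribute-level priority in a multi-source merge can be sketched as follows. This covers only the step of choosing a winning value per attribute. The source names (`crm`, `billing`), the `ATTRIBUTE_PRIORITY` map and the function `merge_entity` are hypothetical, and the real merge would also version the result with SCD2.

```python
from typing import Any

# Hypothetical priority map: for each attribute, sources in order of trust.
ATTRIBUTE_PRIORITY = {
    "email": ["crm", "billing"],
    "address": ["billing", "crm"],
}


def merge_entity(records_by_source: dict[str, dict[str, Any]],
                 priority: dict[str, list[str]]) -> dict[str, Any]:
    """Take each attribute from the highest-priority source with a non-null value."""
    merged = {}
    for attribute, sources in priority.items():
        for source in sources:
            value = records_by_source.get(source, {}).get(attribute)
            if value is not None:
                merged[attribute] = value
                break
        else:
            merged[attribute] = None   # no source supplied this attribute
    return merged


customer = merge_entity(
    {
        "crm": {"email": "ada@example.com", "address": None},
        "billing": {"email": "old@example.com", "address": "1 Main St"},
    },
    ATTRIBUTE_PRIORITY,
)
```

Here the email comes from `crm` and the address from `billing`, because `crm` has no address and `billing` ranks first for that attribute anyway. Declaring priority per attribute rather than per source lets one canonical entity combine the most trusted fields from each system.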

Gold Layer

Coming soon

  • Star schema design — facts and dimensions
  • Aggregation and rollup pipelines
  • BI tool connectivity (Power BI, Tableau)
  • Data product publication
  • SLA monitoring and freshness checks
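
Because the Gold layer is not yet available, the following is purely illustrative. It shows one way a freshness check could be expressed: a dataset is fresh if its last successful load falls within an allowed age. The function name `is_fresh` and its parameters are assumptions, not part of the Portal.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at: datetime, max_age: timedelta, now: datetime = None) -> bool:
    """Return True if the last load is no older than max_age."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


# Example: a table loaded 2 hours ago checked against a 6-hour freshness SLA.
checked_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
loaded_at = checked_at - timedelta(hours=2)
within_sla = is_fresh(loaded_at, timedelta(hours=6), now=checked_at)
```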