PAM Derivatives Legacy Message Platform (Azure Synapse)

Highlights

Built a configuration- and Excel-template-driven message engine supporting nested and repeating financial structures
Generates SMF and transaction messages dynamically from Spark SQL tables with full audit and diagnostics output
Integrated with Azure Synapse pipelines for scheduled, production-grade execution
Handles missing data, partial availability, and schema variability by generating partial messages with full diagnostics

Impact

Generates SMF and transaction messages for ~800 CUSIPs spanning options, futures, and swaps from a single unified platform
Designed for asset-class extensibility: FI, Cash, and legacy assets onboard without major platform changes
Validated in a proof of concept for cross-cloud deployment, demonstrating portability across cloud providers
Replaced brittle, hardcoded derivatives logic with a reusable metadata-driven architecture

Context

Derivatives processing required complex, hierarchical legacy messages built from many Spark tables, with frequent structure changes and partial data availability. Hardcoding this logic was brittle, slow to change, and risky.

What I Built

A metadata-driven legacy message generation platform that:

Uses Excel as the message structure control plane
Uses config sets to control runs, scopes, and environments
Scans Spark tables dynamically and builds messages at runtime
Supports:
- Direct fields
- Single-field submessages
- Nested structures
- Repeating message groups
Produces:
- Final business-ready Excel outputs
- Full summary, diagnostics, and audit reports
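
To illustrate the template-driven idea, here is a minimal sketch in plain Python rather than the production code. It assumes the Excel template has already been loaded into a list of rule rows; the column names (`path`, `kind`, `source`, `fields`), the sample record, and the helper names `set_path` and `build_message` are all hypothetical. It shows how direct fields, nested structures, and repeating groups could be assembled from one source record at runtime.

```python
from typing import Any

# Hypothetical template rows, standing in for rows read from the Excel template.
# "direct" maps one source column to a (possibly nested) message path.
# "repeat" maps a list of child records to a repeating message group.
TEMPLATE = [
    {"path": "Header.Cusip", "kind": "direct", "source": "cusip"},
    {"path": "Header.AssetClass", "kind": "direct", "source": "asset_class"},
    {"path": "Terms.Strike.Value", "kind": "direct", "source": "strike"},
    {"path": "Legs", "kind": "repeat", "source": "legs",
     "fields": {"LegId": "leg_id", "Notional": "notional"}},
]


def set_path(message: dict, path: str, value: Any) -> None:
    """Write value at a dotted path, creating nested dicts as needed."""
    *parents, leaf = path.split(".")
    node = message
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def build_message(record: dict, template: list) -> dict:
    """Build one nested message from a source record and the template rules."""
    message: dict = {}
    for rule in template:
        raw = record.get(rule["source"])
        if rule["kind"] == "direct":
            set_path(message, rule["path"], raw)
        elif rule["kind"] == "repeat":
            # One output entry per child record; a missing list yields an empty group.
            group = [
                {target: item.get(column) for target, column in rule["fields"].items()}
                for item in (raw or [])
            ]
            set_path(message, rule["path"], group)
        else:
            raise ValueError(f"Unknown rule kind: {rule['kind']}")
    return message


# Illustrative record only; the values are made up.
sample = {
    "cusip": "EXAMPLE01",
    "asset_class": "SWAP",
    "strike": 101.5,
    "legs": [{"leg_id": 1, "notional": 1000000}, {"leg_id": 2, "notional": 2500000}],
}
message = build_message(sample, TEMPLATE)
```

Because the structure lives in the rule rows, adding a field or a new repeating group in this sketch means adding a row rather than changing `build_message`.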

The system runs inside Azure Synapse and is orchestrated by pipelines on a daily schedule.

Reliability & Scale

Missing tables and optional fields are handled gracefully
Partial messages still generate with full diagnostics
Spark caching and batching are used for performance
The system scales with cluster size and data volume
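
The graceful-handling behavior can be sketched as follows. This is an illustrative outline under stated assumptions, not the platform's actual code: a plain dict stands in for the Spark table lookup, and the `Diagnostics` class, the `collect_fields` function, and the table and column names are hypothetical. The point is that a missing table or field is recorded and skipped, so a partial message is still produced.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnostics:
    """Collects what was unavailable while a message was being built."""
    missing_tables: list = field(default_factory=list)
    missing_fields: list = field(default_factory=list)


def collect_fields(required: list, tables: dict, diagnostics: Diagnostics) -> dict:
    """Gather values for (table, column, optional) requirements.

    `tables` maps a table name to a single row dict; in this sketch it is a
    stand-in for looking the table up in Spark.
    """
    values = {}
    for table, column, optional in required:
        row = tables.get(table)
        if row is None:
            # Record each missing table once and keep going.
            if table not in diagnostics.missing_tables:
                diagnostics.missing_tables.append(table)
            continue
        if row.get(column) is None:
            # Optional fields may be absent silently; required ones are reported.
            if not optional:
                diagnostics.missing_fields.append(f"{table}.{column}")
            continue
        values[column] = row[column]
    return values


# Illustrative inputs only: "pricing" is absent and "expiry" has no value.
tables = {"trade": {"cusip": "EXAMPLE01", "strike": None}}
required = [
    ("trade", "cusip", False),
    ("trade", "strike", True),
    ("trade", "expiry", False),
    ("pricing", "price", False),
]
diagnostics = Diagnostics()
partial = collect_fields(required, tables, diagnostics)
```

In this example `partial` holds only the fields that were available, while `diagnostics` lists the missing table and the missing required field for the audit report.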

Outcomes

Standardized derivatives legacy message generation across the platform
Greatly reduced change risk when message formats evolve
Improved operational transparency and audit readiness
Established a reusable, template-driven generation pattern for future feeds

Why This Matters

This project demonstrates data-platform engineering beyond building pipelines: metadata-driven systems, dynamic schema handling, Spark-native execution, and production-grade orchestration.