Submission Guide: Business Case Split Strategy

Purpose

Submission Strategy Overview

flowchart LR
    Start[Submission] --> Part1["Part 1: Business Case 📄 Design & Architecture"]
    Start --> Part2["Part 2: Code Appendix 💻 Full Implementation"]
    Part1 -->|Initial Submission| Company[Company Review]
    Company -->|Request Code| Part2
    Company -->|Interview| Interview[Code Discussion]
    Part2 --> Interview
    style Start fill:#1976d2,color:#fff
    style Part1 fill:#1976d2,color:#fff
    style Part2 fill:#5c6bc0,color:#fff
    style Company fill:#ffa000,color:#111
    style Interview fill:#2e7d32,color:#fff

Submission Strategy

Part 1: Business Case (Initial Submission)

What: Strategic documentation, architecture design, and high-level implementation approach

When: First submission to the company

Purpose (2)

Part 2: Code Appendix (Upon Request)

What: Full implementation code, infrastructure as code, and test artifacts

When: Upon company request or during interview presentation

Purpose (3)


Part 1: Business Case Contents

📄 Core Documents

  • ✅ EXECUTIVE_SUMMARY.md - High-level overview and assumptions
  • ✅ HANDOUT.md - Interview presentation handout

🏗️ Architecture & Design (Task 2)

  • ✅ tasks/02_data_lake_architecture_design/architecture.md - Complete architecture documentation
  • ✅ tasks/02_data_lake_architecture_design/diagram.mmd - Architecture diagram source

📊 SQL Solution (Task 3)

  • ✅ tasks/03_sql/balance_history_2024_q1.sql - Production-ready SQL query
  • ✅ tasks/03_sql/SQL_PSEUDOCODE.md - SQL design rationale
  • ✅ tasks/03_sql/SQL_DIAGRAM.md - Query flow diagrams

🔄 ETL Design (Task 1 - Design Only)

  • ✅ tasks/01_data_ingestion_transformation/ASSUMPTIONS_AND_EDGE_CASES.md - Design assumptions and edge case handling
  • ✅ tasks/01_data_ingestion_transformation/ETL_PSEUDOCODE.md - ETL logic pseudocode
  • ✅ tasks/01_data_ingestion_transformation/ETL_DIAGRAM.md - ETL flow diagrams
  • ⚠️ Code snippets only (key functions, not full implementation)
  • ❌ NOT included: Full ingest_transactions.py source code

🚀 CI/CD Design (Task 4 - Design Only)

  • ✅ tasks/04_devops_cicd/cicd_workflow.md - Complete CI/CD workflow design
  • ⚠️ Workflow descriptions and diagrams only
  • ❌ NOT included: Full .github/workflows/ci.yml YAML
  • ❌ NOT included: Full infra/terraform/main.tf Terraform code

📧 Communication (Task 5)

Original Scope Deliverables:

  • ✅ tasks/05_communication_documentation/stakeholder_email.md - Email for non-technical stakeholders (original scope)
  • ✅ tasks/05_communication_documentation/one_pager_tech.md - Technical one-page document (original scope)

Extended Communications (Appendix - Not in original scope):

  • ⚠️ tasks/05_communication_documentation/appendix/ - Extended stakeholder communication templates and tools (reference material only)

📚 Supporting Documentation

  • ✅ docs/technical/TESTING.md - Testing approach and strategy
  • ✅ docs/technical/AWS_SERVICES_ANALYSIS.md - AWS services analysis (updated, references AWS_SPARK_SIMPLIFICATION_ANALYSIS.md)

Part 2: Code Appendix Contents

💻 Full Implementation Code

ETL Code (Task 1)

  • ✅ tasks/01_data_ingestion_transformation/src/etl/ - Complete ETL implementation
      • ingest_transactions.py - Pandas-based ETL (original)
      • ingest_transactions_spark.py - PySpark-optimized ETL (recommended)
      • All supporting modules (validation, metadata, s3_operations, etc.)
  • ✅ tasks/01_data_ingestion_transformation/requirements.txt - Python dependencies
  • ✅ tasks/01_data_ingestion_transformation/requirements-spark.txt - PySpark dependencies
  • ✅ tasks/01_data_ingestion_transformation/config.yaml - Configuration template
  • ⚠️ Note: Test files are excluded from submission (see SUBMISSION_EXCLUSIONS.md)

Infrastructure as Code (Task 4)

  • ✅ tasks/04_devops_cicd/.github/workflows/ci.yml - Complete GitHub Actions workflow
  • ✅ tasks/04_devops_cicd/infra/terraform/main.tf - Complete Terraform infrastructure
  • ✅ tasks/04_devops_cicd/infra/terraform/ - Additional Terraform files (if any)

SQL Testing (Task 3)

  • ⚠️ Note: SQL test files are excluded from submission (see SUBMISSION_EXCLUSIONS.md)
  • ✅ tasks/03_sql/balance_history_2024_q1.sql - Production SQL query (included)
  • ✅ tasks/03_sql/schema.sql - Schema definition (if it exists)

Supporting Files

  • ⚠️ Note: Development files are excluded (see SUBMISSION_EXCLUSIONS.md):
      • Test files and test infrastructure
      • Docker files and containerization setup
      • Development dependencies and linting configurations
      • Sample data files

File Inclusion Matrix

| File/Directory | Business Case | Code Appendix | Notes |
|---|---|---|---|
| EXECUTIVE_SUMMARY.md | ✅ | ✅ | Core document |
| HANDOUT.md | ✅ | ✅ | Interview prep |
| tasks/02_data_lake_architecture_design/ | ✅ | ✅ | Design docs |
| tasks/03_sql/balance_history_2024_q1.sql | ✅ | ✅ | SQL is fine to include |
| tasks/03_sql/SQL_*.md | ✅ | ✅ | Design docs |
| tasks/03_sql/tests/ | ❌ | ✅ | Test code |
| tasks/01_data_ingestion_transformation/ASSUMPTIONS_*.md | ✅ | ✅ | Design docs |
| tasks/01_data_ingestion_transformation/ETL_*.md | ✅ | ✅ | Design docs |
| tasks/01_data_ingestion_transformation/src/etl/*.py | ❌ | ✅ | Full code excluded |
| tasks/01_data_ingestion_transformation/tests/ | ❌ | ✅ | Test code |
| tasks/01_data_ingestion_transformation/config.yaml | ⚠️ | ✅ | Template only in business case |
| tasks/04_devops_cicd/cicd_workflow.md | ✅ | ✅ | Design doc |
| tasks/04_devops_cicd/.github/workflows/ci.yml | ❌ | ✅ | Full YAML excluded |
| tasks/04_devops_cicd/infra/terraform/main.tf | ❌ | ✅ | Full Terraform excluded |
| tasks/05_communication_documentation/stakeholder_email.md | ✅ | ✅ | Original scope: Email for non-technical stakeholders |
| tasks/05_communication_documentation/one_pager_tech.md | ✅ | ✅ | Original scope: Technical one-page document |
| tasks/05_communication_documentation/appendix/ | ❌ | ⚠️ | Extended communications (not in original scope, reference only) |
| docs/technical/TESTING.md | ✅ | ✅ | Testing approach |
| docs/technical/AWS_SERVICES_ANALYSIS.md | ✅ | ✅ | Analysis doc (updated) |
| tasks/01_data_ingestion_transformation/AWS_SPARK_SIMPLIFICATION_ANALYSIS.md | ✅ | ✅ | AWS/Spark decision doc |
| docs/archive/ | ❌ | ❌ | Archived historical docs (not included) |

Legend

  • ✅ Included
  • ❌ Excluded
  • ⚠️ Partial (template/example only)
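The matrix can also be kept as data so that a packaging script looks up each path instead of relying on memory. The following is a minimal sketch and not part of the original repository: the names `MATRIX` and `lookup`, the labels `"yes"`, `"no"`, and `"partial"`, and the choice to treat unlisted paths as excluded are all assumptions made for illustration. Directory rows are written with a trailing `*` so they match every file underneath.

```python
from fnmatch import fnmatchcase

T1 = "tasks/01_data_ingestion_transformation"
T5 = "tasks/05_communication_documentation"

# (pattern, business_case, code_appendix), transcribed from the matrix.
# "yes" = included, "no" = excluded, "partial" = template/example only.
MATRIX = [
    ("EXECUTIVE_SUMMARY.md", "yes", "yes"),
    ("HANDOUT.md", "yes", "yes"),
    ("tasks/02_data_lake_architecture_design/*", "yes", "yes"),
    ("tasks/03_sql/balance_history_2024_q1.sql", "yes", "yes"),
    ("tasks/03_sql/SQL_*.md", "yes", "yes"),
    ("tasks/03_sql/tests/*", "no", "yes"),
    (T1 + "/ASSUMPTIONS_*.md", "yes", "yes"),
    (T1 + "/ETL_*.md", "yes", "yes"),
    (T1 + "/src/etl/*.py", "no", "yes"),
    (T1 + "/tests/*", "no", "yes"),
    (T1 + "/config.yaml", "partial", "yes"),
    ("tasks/04_devops_cicd/cicd_workflow.md", "yes", "yes"),
    ("tasks/04_devops_cicd/.github/workflows/ci.yml", "no", "yes"),
    ("tasks/04_devops_cicd/infra/terraform/main.tf", "no", "yes"),
    (T5 + "/stakeholder_email.md", "yes", "yes"),
    (T5 + "/one_pager_tech.md", "yes", "yes"),
    (T5 + "/appendix/*", "no", "partial"),
    ("docs/technical/TESTING.md", "yes", "yes"),
    ("docs/technical/AWS_SERVICES_ANALYSIS.md", "yes", "yes"),
    (T1 + "/AWS_SPARK_SIMPLIFICATION_ANALYSIS.md", "yes", "yes"),
    ("docs/archive/*", "no", "no"),
]


def lookup(path: str, package: str) -> str:
    """Return 'yes', 'no', or 'partial' for a repo-relative POSIX path.

    `package` is 'business_case' or 'code_appendix'. The first matching
    row wins; paths not listed in the matrix default to 'no' (assumption).
    """
    column = {"business_case": 1, "code_appendix": 2}[package]
    for row in MATRIX:
        if fnmatchcase(path, row[0]):
            return row[column]
    return "no"
```

For example, `lookup("tasks/03_sql/SQL_DIAGRAM.md", "business_case")` falls under the `SQL_*.md` row, while the ETL source files resolve to excluded for the business case and included for the code appendix.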

Packaging Instructions

Business Case Package Structure

ohpen-case-study-business-case/
├── EXECUTIVE_SUMMARY.md
├── HANDOUT.md
├── SUBMISSION_GUIDE.md (this file)
├── TESTING.md
├── AWS_SERVICES_ANALYSIS.md (if applicable)
└── tasks/
    ├── 01_data_ingestion_transformation/
    │   ├── ASSUMPTIONS_AND_EDGE_CASES.md
    │   ├── ETL_PSEUDOCODE.md
    │   ├── ETL_DIAGRAM.md
    │   └── config.yaml (template only)
    ├── 02_data_lake_architecture_design/
    │   ├── architecture.md
    │   └── diagram.mmd
    ├── 03_sql/
    │   ├── balance_history_2024_q1.sql
    │   ├── SQL_PSEUDOCODE.md
    │   └── SQL_DIAGRAM.md
    ├── 04_devops_cicd/
    │   └── cicd_workflow.md
    └── 05_communication_documentation/
        └── [all communication files]
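One way to assemble a package like this is to copy a fixed list of files into a staging directory and zip the result. The sketch below is illustrative only: `BUSINESS_CASE_FILES` and `stage_package` are hypothetical names, the mapping lists just a few of the files from the tree, and the assumption that `TESTING.md` is copied from `docs/technical/` to the package root is inferred from the paths in this guide rather than confirmed by it.

```python
import shutil
from pathlib import Path

# Source path in the repository -> destination path inside the package.
# Only a handful of entries are shown; extend the mapping as needed.
BUSINESS_CASE_FILES = {
    "EXECUTIVE_SUMMARY.md": "EXECUTIVE_SUMMARY.md",
    "HANDOUT.md": "HANDOUT.md",
    "docs/technical/TESTING.md": "TESTING.md",
    "tasks/03_sql/balance_history_2024_q1.sql": "tasks/03_sql/balance_history_2024_q1.sql",
    "tasks/04_devops_cicd/cicd_workflow.md": "tasks/04_devops_cicd/cicd_workflow.md",
}


def stage_package(repo_root: Path, package_dir: Path, file_map: dict[str, str]) -> Path:
    """Copy each mapped file into package_dir, zip it, and return the archive path."""
    # Fail before copying anything if a listed file is missing from the repo.
    missing = [src for src in file_map if not (repo_root / src).is_file()]
    if missing:
        raise FileNotFoundError(f"missing from repository: {missing}")

    for src, dst in file_map.items():
        target = package_dir / dst
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(repo_root / src, target)

    # Writes <package_dir>.zip next to the staging directory.
    archive = shutil.make_archive(str(package_dir), "zip", root_dir=str(package_dir))
    return Path(archive)
```

Calling `stage_package(Path("."), Path("dist/ohpen-case-study-business-case"), BUSINESS_CASE_FILES)` from the repository root would stage the files under `dist/` and produce a zip beside the staging directory; the `dist/` location is likewise an assumption.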

Code Appendix Package Structure

ohpen-case-study-code-appendix/
├── README.md (explains this is the code appendix)
└── tasks/
    ├── 01_data_ingestion_transformation/
    │   ├── src/
    │   │   └── etl/
    │   │       └── ingest_transactions.py
    │   ├── tests/
    │   │   ├── test_etl.py
    │   │   ├── test_integration.py
    │   │   └── conftest.py
    │   ├── sample_data/
    │   ├── requirements.txt
    │   ├── requirements-dev.txt
    │   ├── config.yaml
    │   └── ruff.toml
    ├── 03_sql/
    │   └── tests/
    │       ├── test_balance_query.py
    │       ├── test_data.sql
    │       └── expected_output.csv
    └── 04_devops_cicd/
        ├── .github/
        │   └── workflows/
        │       └── ci.yml
        └── infra/
            └── terraform/
                └── main.tf

Communication Strategy

Initial Email (Business Case)

Subject (4)

Body Template (5)


Dear [Hiring Manager/Team],

I'm pleased to submit my response to the Ohpen Data Engineer case study. This submission focuses on the business case, architecture design, and strategic approach.

## What's Included

- Executive summary and design assumptions
- Complete data lake architecture documentation
- SQL solution for month-end balance history
- ETL design (pseudocode, diagrams, edge case handling)
- CI/CD workflow design
- Stakeholder communication templates
- Comprehensive checklist review

## What's Not Included (Available Upon Request)

- Full implementation code (ETL, Terraform, CI/CD YAML)
- Complete test suites
- Infrastructure configuration files

I've structured this submission to demonstrate architectural thinking and design decisions first. The full implementation code is available as a separate appendix that I can provide upon request or during our interview discussion.

I look forward to discussing the design decisions and approach with you.

Best regards,
[Your Name]

Follow-up Email (Code Appendix - if requested)

Subject (6)

Body Template (1)


Dear [Hiring Manager/Team],

As requested, please find attached the complete code appendix for the Ohpen case study submission.

## What's Included

- Full ETL implementation (Python)
- Complete Terraform infrastructure code
- GitHub Actions CI/CD workflow
- Comprehensive test suites
- Configuration files and dependencies

This appendix complements the business case submission and demonstrates the technical implementation depth.

I'm happy to walk through the code during our interview discussion.

Best regards,
[Your Name]

Rationale for Split

Why This Split Works

  1. Demonstrates Strategic Thinking First
      • Shows you can design before coding
      • Highlights architectural decision-making
      • Proves you understand business context

  2. Protects Intellectual Property
      • Code is a valuable work product
      • Allows you to present code in person
      • Gives you control over when to share the full implementation

  3. Shows Professional Judgment
      • Demonstrates understanding of what matters to different audiences
      • Shows you can communicate at appropriate levels
      • Proves you understand the difference between design and implementation

  4. Interview Advantage
      • Code discussion becomes an interview topic
      • Allows you to explain decisions in person
      • Creates natural conversation flow

Quick Reference Checklist

Before Sending Business Case

  • All design documents included
  • No full source code files (.py, .tf, .yml)
  • SQL query included (it's a query, not infrastructure code)
  • All communication documents included
  • Architecture diagrams included
  • Pseudocode and design rationale included
  • Config templates only (not full configs)
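The "no full source code files" item lends itself to an automated check on the staged package. This is a minimal sketch under stated assumptions: `FORBIDDEN_SUFFIXES` and `find_source_files` are hypothetical names, and the suffix set is taken literally from the checklist (`.py`, `.tf`, `.yml`), so the `config.yaml` template and the `.sql` query are not flagged.

```python
from pathlib import Path

# Suffixes the checklist says must not appear in the business case package.
FORBIDDEN_SUFFIXES = {".py", ".tf", ".yml"}


def find_source_files(package_dir: Path) -> list[str]:
    """Return sorted package-relative paths of files with a forbidden suffix."""
    return sorted(
        p.relative_to(package_dir).as_posix()
        for p in package_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in FORBIDDEN_SUFFIXES
    )
```

An empty list means the item passes; any returned path is a file to remove before sending.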

Before Sending Code Appendix

  • All source code files included
  • Test files included
  • Infrastructure as code included
  • Dependency files included
  • README explaining appendix structure
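A complementary check for the code appendix is to confirm that the expected files are present. The sketch below is an illustration, not the author's tooling: `REQUIRED_IN_APPENDIX` and `missing_paths` are hypothetical names, and the list contains only a subset of paths taken from the Code Appendix Package Structure tree.

```python
from pathlib import Path

# Hypothetical subset of files the code appendix is expected to contain.
REQUIRED_IN_APPENDIX = [
    "README.md",
    "tasks/01_data_ingestion_transformation/src/etl/ingest_transactions.py",
    "tasks/01_data_ingestion_transformation/requirements.txt",
    "tasks/01_data_ingestion_transformation/config.yaml",
    "tasks/04_devops_cicd/.github/workflows/ci.yml",
    "tasks/04_devops_cicd/infra/terraform/main.tf",
]


def missing_paths(package_dir: Path, required: list[str]) -> list[str]:
    """Return the required paths that are not regular files under package_dir."""
    return [rel for rel in required if not (package_dir / rel).is_file()]
```

The result preserves the order of `required`, so an empty list indicates that every listed file was found.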

Recent Updates (2026-01-22)

  • ✅ Updated to reflect current project structure
  • ✅ Added PySpark implementation files
  • ✅ Clarified exclusions (tests, Docker, dev files)
  • ✅ Updated file paths to match current structure
  • ✅ Added note about archived documentation

Last Updated: 2026-01-22