Submission Guide: Business Case Split Strategy
Purpose
Submission Strategy Overview
flowchart LR
    Start[Submission] --> Part1["Part 1: Business Case - Design & Architecture"]
    Start --> Part2["Part 2: Code Appendix - 💻 Full Implementation"]
    Part1 -->|Initial Submission| Company[Company Review]
    Company -->|Request Code| Part2
    Company -->|Interview| Interview[Code Discussion]
    Part2 --> Interview
    style Start fill:#1976d2,color:#fff
    style Part1 fill:#1976d2,color:#fff
    style Part2 fill:#5c6bc0,color:#fff
    style Company fill:#ffa000,color:#111
    style Interview fill:#2e7d32,color:#fff
Submission Strategy
Part 1: Business Case (Initial Submission)
What: Strategic documentation, architecture design, and high-level implementation approach
When: First submission to the company
Purpose
Part 2: Code Appendix (Upon Request)
What: Full implementation code, infrastructure as code, and test artifacts
When: Upon company request or during interview presentation
Purpose
Part 1: Business Case Contents
Core Documents
- ✅ EXECUTIVE_SUMMARY.md - High-level overview and assumptions
- ✅ HANDOUT.md - Interview presentation handout
Architecture & Design (Task 2)
- ✅ tasks/02_data_lake_architecture_design/architecture.md - Complete architecture documentation
- ✅ tasks/02_data_lake_architecture_design/diagram.mmd - Architecture diagram source
SQL Solution (Task 3)
- ✅ tasks/03_sql/balance_history_2024_q1.sql - Production-ready SQL query
- ✅ tasks/03_sql/SQL_PSEUDOCODE.md - SQL design rationale
- ✅ tasks/03_sql/SQL_DIAGRAM.md - Query flow diagrams
ETL Design (Task 1 - Design Only)
- ✅ tasks/01_data_ingestion_transformation/ASSUMPTIONS_AND_EDGE_CASES.md - Design assumptions and edge case handling
- ✅ tasks/01_data_ingestion_transformation/ETL_PSEUDOCODE.md - ETL logic pseudocode
- ✅ tasks/01_data_ingestion_transformation/ETL_DIAGRAM.md - ETL flow diagrams
- ⚠️ Code snippets only (key functions, not full implementation)
- ❌ NOT included: Full ingest_transactions.py source code
CI/CD Design (Task 4 - Design Only)
- ✅ tasks/04_devops_cicd/cicd_workflow.md - Complete CI/CD workflow design
- ⚠️ Workflow descriptions and diagrams only
- ❌ NOT included: Full .github/workflows/ci.yml YAML
- ❌ NOT included: Full infra/terraform/main.tf Terraform code
Communication (Task 5)
Original Scope Deliverables:
- ✅ tasks/05_communication_documentation/stakeholder_email.md - Email for non-technical stakeholders (original scope)
- ✅ tasks/05_communication_documentation/one_pager_tech.md - Technical one-page document (original scope)
Extended Communications (Appendix - Not in original scope):
- ⚠️ tasks/05_communication_documentation/appendix/ - Extended stakeholder communication templates and tools (reference material only)
Supporting Documentation
- ✅ docs/technical/TESTING.md - Testing approach and strategy
- ✅ docs/technical/AWS_SERVICES_ANALYSIS.md - AWS services analysis (updated, references AWS_SPARK_SIMPLIFICATION_ANALYSIS.md)
Part 2: Code Appendix Contents
💻 Full Implementation Code
ETL Code (Task 1)
- ✅ tasks/01_data_ingestion_transformation/src/etl/ - Complete ETL implementation
  - ingest_transactions.py - Pandas-based ETL (original)
  - ingest_transactions_spark.py - PySpark-optimized ETL (recommended)
  - All supporting modules (validation, metadata, s3_operations, etc.)
- ✅ tasks/01_data_ingestion_transformation/requirements.txt - Python dependencies
- ✅ tasks/01_data_ingestion_transformation/requirements-spark.txt - PySpark dependencies
- ✅ tasks/01_data_ingestion_transformation/config.yaml - Configuration template
- ⚠️ Note: Test files are excluded from submission (see SUBMISSION_EXCLUSIONS.md)
Infrastructure as Code (Task 4)
- ✅ tasks/04_devops_cicd/.github/workflows/ci.yml - Complete GitHub Actions workflow
- ✅ tasks/04_devops_cicd/infra/terraform/main.tf - Complete Terraform infrastructure
- ✅ tasks/04_devops_cicd/infra/terraform/ - Additional Terraform files (if any)
SQL Testing (Task 3)
- ⚠️ Note: SQL test files are excluded from submission (see SUBMISSION_EXCLUSIONS.md)
- ✅ tasks/03_sql/balance_history_2024_q1.sql - Production SQL query (included)
- ✅ tasks/03_sql/schema.sql - Schema definition (if exists)
Supporting Files
- ⚠️ Note: Development files are excluded (see SUBMISSION_EXCLUSIONS.md):
  - Test files and test infrastructure
  - Docker files and containerization setup
  - Development dependencies and linting configurations
  - Sample data files
File Inclusion Matrix
| File/Directory | Business Case | Code Appendix | Notes |
|---|---|---|---|
| EXECUTIVE_SUMMARY.md | ✅ | ✅ | Core document |
| HANDOUT.md | ✅ | ✅ | Interview prep |
| tasks/02_data_lake_architecture_design/ | ✅ | ✅ | Design docs |
| tasks/03_sql/balance_history_2024_q1.sql | ✅ | ✅ | SQL is fine to include |
| tasks/03_sql/SQL_*.md | ✅ | ✅ | Design docs |
| tasks/03_sql/tests/ | ❌ | ✅ | Test code |
| tasks/01_data_ingestion_transformation/ASSUMPTIONS_*.md | ✅ | ✅ | Design docs |
| tasks/01_data_ingestion_transformation/ETL_*.md | ✅ | ✅ | Design docs |
| tasks/01_data_ingestion_transformation/src/etl/*.py | ❌ | ✅ | Full code excluded |
| tasks/01_data_ingestion_transformation/tests/ | ❌ | ✅ | Test code |
| tasks/01_data_ingestion_transformation/config.yaml | ⚠️ | ✅ | Template only in business case |
| tasks/04_devops_cicd/cicd_workflow.md | ✅ | ✅ | Design doc |
| tasks/04_devops_cicd/.github/workflows/ci.yml | ❌ | ✅ | Full YAML excluded |
| tasks/04_devops_cicd/infra/terraform/main.tf | ❌ | ✅ | Full Terraform excluded |
| tasks/05_communication_documentation/stakeholder_email.md | ✅ | ✅ | Original scope: Email for non-technical stakeholders |
| tasks/05_communication_documentation/one_pager_tech.md | ✅ | ✅ | Original scope: Technical one-page document |
| tasks/05_communication_documentation/appendix/ | ❌ | ⚠️ | Extended communications (not in original scope, reference only) |
| docs/technical/TESTING.md | ✅ | ✅ | Testing approach |
| docs/technical/AWS_SERVICES_ANALYSIS.md | ✅ | ✅ | Analysis doc (updated) |
| tasks/01_data_ingestion_transformation/AWS_SPARK_SIMPLIFICATION_ANALYSIS.md | ✅ | ✅ | AWS/Spark decision doc |
| docs/archive/ | ❌ | ❌ | Archived historical docs (not included) |
Legend
- ✅ Included
- ❌ Excluded
- ⚠️ Partial (template/example only)
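The matrix can also be kept as data so that a script, rather than a manual pass, decides which package a file belongs to. The following is a minimal sketch covering only a subset of the rows; the names `Status`, `MATRIX`, and `status_for`, and the choice to treat unlisted paths as excluded, are illustrative assumptions rather than part of the project.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Status(Enum):
    INCLUDED = "included"
    EXCLUDED = "excluded"
    PARTIAL = "partial"  # template/example only


# (glob pattern, business case status, code appendix status).
# Hypothetical encoding of a few rows from the File Inclusion Matrix.
MATRIX = [
    ("EXECUTIVE_SUMMARY.md", Status.INCLUDED, Status.INCLUDED),
    ("tasks/03_sql/balance_history_2024_q1.sql", Status.INCLUDED, Status.INCLUDED),
    ("tasks/03_sql/tests/*", Status.EXCLUDED, Status.INCLUDED),
    ("tasks/01_data_ingestion_transformation/src/etl/*.py", Status.EXCLUDED, Status.INCLUDED),
    ("tasks/01_data_ingestion_transformation/config.yaml", Status.PARTIAL, Status.INCLUDED),
    ("tasks/04_devops_cicd/.github/workflows/ci.yml", Status.EXCLUDED, Status.INCLUDED),
    ("docs/archive/*", Status.EXCLUDED, Status.EXCLUDED),
]


def status_for(path: str, package: str) -> Status:
    """Return the status of a repo-relative path for 'business_case' or 'code_appendix'."""
    if package not in ("business_case", "code_appendix"):
        raise ValueError(f"unknown package: {package}")
    column = 1 if package == "business_case" else 2
    for row in MATRIX:
        if fnmatchcase(path, row[0]):
            return row[column]
    # Assumption: anything not listed in the matrix is left out of both packages.
    return Status.EXCLUDED
```

Defaulting unlisted paths to excluded is a conservative choice: a new file has to be added to the matrix deliberately before it can ship in either package.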
Packaging Instructions
Business Case Package Structure
ohpen-case-study-business-case/
├── EXECUTIVE_SUMMARY.md
├── HANDOUT.md
├── SUBMISSION_GUIDE.md (this file)
├── TESTING.md
├── AWS_SERVICES_ANALYSIS.md (if applicable)
└── tasks/
    ├── 01_data_ingestion_transformation/
    │   ├── ASSUMPTIONS_AND_EDGE_CASES.md
    │   ├── ETL_PSEUDOCODE.md
    │   ├── ETL_DIAGRAM.md
    │   └── config.yaml (template only)
    ├── 02_data_lake_architecture_design/
    │   ├── architecture.md
    │   └── diagram.mmd
    ├── 03_sql/
    │   ├── balance_history_2024_q1.sql
    │   ├── SQL_PSEUDOCODE.md
    │   └── SQL_DIAGRAM.md
    ├── 04_devops_cicd/
    │   └── cicd_workflow.md
    └── 05_communication_documentation/
        └── [all communication files]
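One way to assemble the layout above is to copy files from the repository according to an explicit source-to-destination mapping, since some documents (for example docs/technical/TESTING.md) sit at the package root rather than at their repository path. This is a sketch under stated assumptions: `build_package` and `BUSINESS_CASE_MANIFEST` are hypothetical names, the manifest lists only a few entries, and the repository locations of the root-level files are assumed.

```python
import shutil
from pathlib import Path

# Hypothetical, partial manifest: repo-relative source -> package-relative destination.
BUSINESS_CASE_MANIFEST = {
    "EXECUTIVE_SUMMARY.md": "EXECUTIVE_SUMMARY.md",  # assumed to live at the repo root
    "HANDOUT.md": "HANDOUT.md",  # assumed to live at the repo root
    "docs/technical/TESTING.md": "TESTING.md",
    "docs/technical/AWS_SERVICES_ANALYSIS.md": "AWS_SERVICES_ANALYSIS.md",
    "tasks/03_sql/balance_history_2024_q1.sql": "tasks/03_sql/balance_history_2024_q1.sql",
    "tasks/04_devops_cicd/cicd_workflow.md": "tasks/04_devops_cicd/cicd_workflow.md",
}


def build_package(repo_root, dest, manifest):
    """Copy each manifest entry into dest and return the sources that were missing."""
    repo_root, dest = Path(repo_root), Path(dest)
    missing = []
    for src_rel, dst_rel in manifest.items():
        src = repo_root / src_rel
        if not src.is_file():
            # Report missing files instead of failing midway through the copy.
            missing.append(src_rel)
            continue
        target = dest / dst_rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves file timestamps
    return missing
```

Calling `build_package(".", "ohpen-case-study-business-case", BUSINESS_CASE_MANIFEST)` from the repository root would populate the package directory; a non-empty return value signals files that still need to be located before sending.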
Code Appendix Package Structure
ohpen-case-study-code-appendix/
├── README.md (explains this is the code appendix)
└── tasks/
    ├── 01_data_ingestion_transformation/
    │   ├── src/
    │   │   └── etl/
    │   │       └── ingest_transactions.py
    │   ├── tests/
    │   │   ├── test_etl.py
    │   │   ├── test_integration.py
    │   │   └── conftest.py
    │   ├── sample_data/
    │   ├── requirements.txt
    │   ├── requirements-dev.txt
    │   ├── config.yaml
    │   └── ruff.toml
    ├── 03_sql/
    │   └── tests/
    │       ├── test_balance_query.py
    │       ├── test_data.sql
    │       └── expected_output.csv
    └── 04_devops_cicd/
        ├── .github/
        │   └── workflows/
        │       └── ci.yml
        └── infra/
            └── terraform/
                └── main.tf
Communication Strategy
Initial Email (Business Case)
Subject
Body Template
Dear [Hiring Manager/Team],
I'm pleased to submit my response to the Ohpen Data Engineer case study. This submission focuses on the business case, architecture design, and strategic approach.
## What's Included
- Executive summary and design assumptions
- Complete data lake architecture documentation
- SQL solution for month-end balance history
- ETL design (pseudocode, diagrams, edge case handling)
- CI/CD workflow design
- Stakeholder communication templates
- Comprehensive checklist review
## What's Not Included (Available Upon Request)
- Full implementation code (ETL, Terraform, CI/CD YAML)
- Complete test suites
- Infrastructure configuration files
I've structured this submission to demonstrate architectural thinking and design decisions first. The full implementation code is available as a separate appendix that I can provide upon request or during our interview discussion.
I look forward to discussing the design decisions and approach with you.
Best regards,
[Your Name]
Follow-up Email (Code Appendix - if requested)
Subject
Body Template
Dear [Hiring Manager/Team],
As requested, please find attached the complete code appendix for the Ohpen case study submission.
## What's Included
- Full ETL implementation (Python)
- Complete Terraform infrastructure code
- GitHub Actions CI/CD workflow
- Comprehensive test suites
- Configuration files and dependencies
This appendix complements the business case submission and demonstrates the technical implementation depth.
I'm happy to walk through the code during our interview discussion.
Best regards,
[Your Name]
Rationale for Split
Why This Split Works
- Demonstrates Strategic Thinking First
  - Shows you can design before coding
  - Highlights architectural decision-making
  - Proves you understand business context
- Protects Intellectual Property
  - Code is valuable work product
  - Allows you to present code in person
  - Gives you control over when to share full implementation
- Shows Professional Judgment
  - Demonstrates understanding of what matters to different audiences
  - Shows you can communicate at appropriate levels
  - Proves you understand the difference between design and implementation
- Interview Advantage
  - Code discussion becomes interview topic
  - Allows you to explain decisions in person
  - Creates natural conversation flow
Quick Reference Checklist
Before Sending Business Case
- All design documents included
- No full source code files (.py, .tf, .yml)
- SQL query included (it's a query, not infrastructure code)
- All communication documents included
- Architecture diagrams included
- Pseudocode and design rationale included
- Config templates only (not full configs)
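The "no full source code files" item lends itself to an automated check before sending. The sketch below scans a package directory for the three extensions named in the checklist; `find_forbidden_files` and `FORBIDDEN_SUFFIXES` are hypothetical names, and the check is deliberately limited to file extensions, so it says nothing about code pasted into Markdown files.

```python
from pathlib import Path

# Extensions the checklist rules out of the business case package.
FORBIDDEN_SUFFIXES = {".py", ".tf", ".yml"}


def find_forbidden_files(package_dir):
    """Return package-relative paths of files whose extension is forbidden, sorted."""
    root = Path(package_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in FORBIDDEN_SUFFIXES
    )
```

An empty list means the package passes this item. Note that config.yaml uses the .yaml extension and the SQL query uses .sql, so neither is flagged, which matches the checklist's intent of allowing the config template and the query.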
Before Sending Code Appendix
- All source code files included
- Test files included
- Infrastructure as code included
- Dependencies files included
- README explaining appendix structure
Recent Updates (2026-01-22)
- ✅ Updated to reflect current project structure
- ✅ Added PySpark implementation files
- ✅ Clarified exclusions (tests, Docker, dev files)
- ✅ Updated file paths to match current structure
- ✅ Added note about archived documentation