Coursera

Modern Data Architecture & Lakehouse Engineering Specialization

Design and Build Modern Data Platforms.

Learn to architect, secure, and optimize cloud-based lakehouse systems for enterprise analytics.

Instructor: Hurix Digital

Included with Coursera Plus

Get in-depth knowledge of a subject
Intermediate level
4 weeks to complete at 10 hours a week
Flexible schedule: learn at your own pace

What you'll learn

  • Architect and provision secure, resilient cloud data infrastructure using Infrastructure as Code and disaster recovery best practices.

  • Build lakehouse platforms with transactional integrity, automated pipelines, and seamless integration of diverse data sources.

  • Optimize data system performance through strategic partitioning, query tuning, security controls, and systematic benchmarking.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English
Recently updated!

February 2026

Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Coursera

Specialization - 6 course series

What you'll learn

  • Infrastructure as Code automates data platform deployments, replacing manual processes with version-controlled, repeatable systems.

  • Cost optimization uses performance benchmarking and data analysis to identify efficient compute and storage configurations for specific workloads.

  • Business continuity requires proactive disaster recovery with automated failover and continuous replication for strict recovery goals.

  • Successful cloud data engineering balances performance, cost, and reliability through strategic design and continuous monitoring.

Skills you'll gain

Disaster Recovery
Business Continuity
Infrastructure as Code (IaC)
Cloud Deployment
Performance Analysis
Terraform
Data Warehousing
Capacity Management
Automation
Data Architecture
Cloud Computing Architecture
AWS CloudFormation
Benchmarking
Business Continuity Planning
IT Infrastructure
Data Infrastructure
Cost Management

What you'll learn

  • Batch data transformation converts raw semi-structured data into analysis-ready formats that support enterprise decisions.

  • Workload analysis guides database design by linking access patterns and query frequency to performance and cost gains.

  • Migration choices must rely on performance testing and quantitative analysis to ensure ROI-driven transformations.

  • System performance depends on storage, queries, and hardware, requiring holistic technical and business evaluation.

Skills you'll gain

Amazon Redshift
Data Wrangling
Database Design
Database Management
Apache Hive
Operational Databases
Apache Cassandra
Azure Synapse Analytics
Data Transformation
Data Architecture

What you'll learn

  • Data protection requires layered security controls that balance privacy with operational utility.

  • Proactive monitoring and anomaly detection are essential for identifying security threats before they escalate into breaches.

  • Compliance frameworks provide structured approaches to evaluating and strengthening organizational security postures.

  • Effective data governance integrates technical controls with policy frameworks to create comprehensive protection strategies.

Skills you'll gain

Security Management

What you'll learn

  • Security by design applies layered defenses across storage, identity, and networks from the start of infrastructure setup.

  • Infrastructure as Code ensures consistent, auditable security settings that reduce errors and support compliance needs.

  • The principle of least privilege must be embedded into every access control decision, granting only necessary permissions to specific resources.

  • Secure networks rely on segmentation with private subnets and controls to protect systems from public exposure.
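
As a rough illustration of the least-privilege point above, the sketch below models access control as an explicit allow-list that denies everything else by default. The `Grant` type, the `is_allowed` helper, and the principal and resource names are all hypothetical, chosen only for this example; they are not taken from the course or from any cloud provider's IAM API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One explicit permission: who may do what to which resource (hypothetical model)."""
    principal: str
    action: str
    resource: str


def is_allowed(grants, principal, action, resource):
    """Deny by default: allow only when an exact matching grant exists."""
    return Grant(principal, action, resource) in grants


# Hypothetical grants: each principal gets only what its task needs.
GRANTS = frozenset({
    Grant("etl-job", "read", "raw/orders"),
    Grant("etl-job", "write", "curated/orders"),
    Grant("analyst", "read", "curated/orders"),
})

print(is_allowed(GRANTS, "analyst", "read", "curated/orders"))
print(is_allowed(GRANTS, "analyst", "write", "curated/orders"))
```

Because there are no wildcards in this toy model, widening access always requires adding a specific grant, which is the behavior the principle asks for.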

Skills you'll gain

Data Security
Encryption
Infrastructure as Code (IaC)
Cloud Security
Identity and Access Management
Network Security
Cloud Storage
Data Integrity
Data Infrastructure
Infrastructure Security
Data Management
Cloud Infrastructure
Private Cloud
Security Controls

What you'll learn

  • Transactional storage layers ensure data lake reliability, supporting concurrent operations and maintaining integrity.

  • Version control in data lakes enables auditing, compliance, time-travel queries, and error recovery for production systems.

  • Schema evolution strategies help data systems adapt to business changes while maintaining backward compatibility.

  • Converting raw files to transactional formats is a key pattern supporting both analytics and operational reliability.
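
To make the versioning and time-travel idea above more concrete, here is a deliberately simplified sketch: an append-only commit log in which every commit produces a new version, and a read can target any earlier version. The `VersionedTable` class is a hypothetical toy written for this example and is not modeled on the internals of any specific table format.

```python
class VersionedTable:
    """Toy append-only table: each commit adds rows and creates a new version."""

    def __init__(self):
        self._log = []  # one entry per commit: the tuple of rows it added

    def commit(self, rows):
        """Record a batch of rows and return the resulting version number."""
        self._log.append(tuple(rows))
        return len(self._log)

    def read(self, version=None):
        """Return all rows visible at `version` (latest when omitted)."""
        if version is None:
            version = len(self._log)
        if not 0 <= version <= len(self._log):
            raise ValueError(f"unknown version: {version}")
        return [row for batch in self._log[:version] for row in batch]


table = VersionedTable()
v1 = table.commit([{"id": 1, "status": "new"}])
v2 = table.commit([{"id": 2, "status": "new"}])

print(table.read(v1))  # the table as it looked after the first commit
print(table.read())    # the latest state
```

Since earlier commits are never rewritten, recovering from a bad load in this model amounts to reading (or restoring) the last known-good version.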

Skills you'll gain

Data Pipelines
SQL
Data Lakes

What you'll learn

  • Performance optimization is a systematic process requiring analysis of data access patterns, not random configuration changes.

  • Strategic partitioning minimizes expensive network shuffles and is the foundation of scalable Spark applications.

  • Intelligent caching of reusable intermediate datasets can dramatically reduce computation costs and improve job reliability.

  • The Spark UI provides actionable insights that guide optimization decisions and enable data-driven performance improvements.
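
The partitioning point above can be sketched without Spark at all. In the plain-Python example below, two datasets are hash-partitioned on the same key with the same partition count, so matching keys always land in the same partition index and the join can run partition by partition with no cross-partition data movement. The function names and sample data are hypothetical and this is not the Spark API; it only illustrates the idea behind co-partitioning.

```python
import zlib


def partition_of(key, num_partitions):
    """Deterministically map a key to a partition index."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def hash_partition(rows, num_partitions):
    """Split (key, value) rows into partitions by key hash."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[partition_of(key, num_partitions)].append((key, value))
    return parts


def local_join(left_parts, right_parts):
    """Join co-partitioned data one partition pair at a time (no data movement)."""
    joined = []
    for left, right in zip(left_parts, right_parts):
        index = {}
        for key, value in right:
            index.setdefault(key, []).append(value)
        for key, value in left:
            for match in index.get(key, []):
                joined.append((key, value, match))
    return joined


# Hypothetical sample data.
orders = [("c1", "order-1"), ("c2", "order-2"), ("c1", "order-3")]
customers = [("c1", "Ada"), ("c2", "Grace")]

joined = local_join(hash_partition(orders, 4), hash_partition(customers, 4))
print(sorted(joined))
```

If the two sides used different keys or different partition counts, matching rows could sit in different partitions, and a distributed engine would have to move data across the network before joining.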

Skills you'll gain

Apache Spark
Performance Tuning
Systems Analysis
Data Processing
Data Pipelines
PySpark

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Hurix Digital
Coursera
290 courses, 23,121 learners

Offered by

Coursera
