Module 1 Free 4 min

Databases, Data Warehouses & Data Lakes

What each of these places really is, and why your company keeps data in all three.

What you'll learn

  • Tell a database, a data warehouse and a data lake apart
  • Match each store to the job it does best
  • Speak about company data with more confidence

Every dashboard you have ever opened, every customer record a colleague looks up, and every monthly report that lands in your inbox is sitting on top of a quiet layer most people never see. We call it the hidden plumbing behind dashboards. Data has to live somewhere, and companies do not keep it all in one big pile. They use a few different kinds of storage, each suited to a different job. Once you can tell them apart, a lot of confusing conversations suddenly make sense.

  • Database: live, day-to-day; one app at a time; e.g. orders system
  • Data Warehouse: tidy, combined history; built for reporting; e.g. sales by region
  • Data Lake: raw, everything; any shape, cheap; e.g. logs, photos

Three stores, three jobs — most companies use all of them together.

The database: where work happens right now

A database is the workhorse behind the apps you use every day. When a customer places an order, when HR updates your address, or when a colleague logs a support ticket, those records are written into a database in real time. Its whole personality is built around speed and accuracy for small, frequent changes — reading one record, updating another, adding a new one — all happening this very second.

Think of a database as the front desk of a busy hotel. It is brilliant at handling one guest at a time, checking them in and out instantly. What it is not designed for is someone wandering in and asking, “Can you total every booking made in the last five years and break it down by country?” Ask a live database a question that heavy and you risk slowing the system down for everyone trying to use the app. That is where the next store comes in.
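If you are curious what the two kinds of question look like in practice, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns and the sample rows are all made up for illustration; a real ordering system would be far larger, which is exactly why the second query gets expensive.

```python
import sqlite3

# Hypothetical "orders" table standing in for a live application database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (country, amount, status) VALUES (?, ?, ?)",
    [("UK", 120.0, "new"), ("FR", 80.0, "new"), ("UK", 45.5, "new")],
)

# Front-desk work: touch one record at a time, looked up by its key.
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = ?", (2,))
one_order = conn.execute(
    "SELECT id, country, amount, status FROM orders WHERE id = ?", (2,)
).fetchone()

# The heavy question: read every row and total it by country.
totals_by_country = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()

print(one_order)
print(totals_by_country)
```

The first two statements touch a single row and stay fast however big the table grows. The last one has to read every order, which is harmless with three rows and a real burden with five years of bookings.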

The data warehouse: tidy answers for the business

A data warehouse is a separate, carefully organised store built specifically for asking questions across lots of data at once. Information is copied in from many databases — sales, finance, web, HR — then cleaned, lined up so the columns mean the same thing everywhere, and arranged for fast reporting. When your sales dashboard shows revenue by region over three years, it is almost certainly reading from a warehouse, not from the live ordering system.
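The "lined up so the columns mean the same thing" step can be pictured with a small Python sketch. The two source systems, their field names and the figures are invented for illustration, and the sketch assumes both systems report amounts in the same currency.

```python
# Hypothetical extracts from two source systems that name the same things differently.
sales_rows = [
    {"cust_id": 1, "region": "north", "revenue_gbp": 1200},
    {"cust_id": 2, "region": "South", "revenue_gbp": 800},
]
web_rows = [
    {"customerId": 3, "Region": "NORTH ", "order_value": 150},
]


def to_warehouse_row(row, id_key, region_key, amount_key, source):
    """Map one source row onto the shared warehouse columns."""
    return {
        "customer_id": row[id_key],
        "region": row[region_key].strip().title(),  # one spelling per region
        "revenue": float(row[amount_key]),
        "source": source,
    }


warehouse = (
    [to_warehouse_row(r, "cust_id", "region", "revenue_gbp", "sales") for r in sales_rows]
    + [to_warehouse_row(r, "customerId", "Region", "order_value", "web") for r in web_rows]
)

# Reporting question: revenue by region across both systems.
revenue_by_region = {}
for row in warehouse:
    region = row["region"]
    revenue_by_region[region] = revenue_by_region.get(region, 0.0) + row["revenue"]

print(revenue_by_region)
```

Without the clean-up step, "north" and "NORTH " would show up as two different regions in the report. Once every row shares the same column names and spellings, the question "revenue by region" has one clear answer.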

The trade-off is that a warehouse is usually a little behind real time, refreshed on a schedule rather than instantly. That is a feature, not a flaw: you want stable, consistent numbers for a board report, not figures that twitch every second.

A database answers “what is true right now?” A warehouse answers “what has been happening, across the whole business?”

The data lake: keep everything, decide later

A data lake is the most relaxed of the three. It is a large, low-cost store that holds data in its raw form — neat tables, yes, but also website clickstream logs, scanned documents, sensor readings, images, even chat transcripts. You pour it all in without forcing it into tidy columns first, because storage is cheap and you may not yet know which bits you will need.

The catch is that raw is messy. A lake on its own is not something a finance manager queries directly; it is the gravel pit that engineers later sort, clean and shape — often feeding a warehouse. The modern middle ground, a lakehouse, tries to give you a lake’s cheap flexibility with a warehouse’s tidy, query-friendly structure in one place.
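"Keep everything, decide later" can be sketched in a few lines of Python. Here a plain dictionary stands in for cheap file storage; the file paths and contents are hypothetical, and the image bytes are a placeholder rather than a real picture.

```python
import json

# A dict standing in for cheap object storage; keys are hypothetical file paths.
lake = {}


def land(path, raw_bytes):
    """Store the bytes exactly as received, with no structure imposed."""
    lake[path] = raw_bytes


land(
    "clickstream/2024-01-01.jsonl",
    b'{"page": "/home", "ms": 120}\n{"page": "/pricing", "ms": 340}\n',
)
land("scans/invoice-001.png", b"\x89PNG...")  # placeholder bytes, not a real image
land("chat/transcript-17.txt", "Hi, where is my order?".encode("utf-8"))

# Later, an engineer decides the clickstream is useful and gives it structure on read.
clicks = [
    json.loads(line)
    for path, blob in lake.items()
    if path.startswith("clickstream/")
    for line in blob.decode("utf-8").splitlines()
    if line.strip()
]
pages = [click["page"] for click in clicks]

print(pages)
```

Nothing was cleaned or checked on the way in, which is what keeps the lake cheap and flexible. The sorting work only happens later, and only for the files someone actually needs.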

How they fit together

In practice these are not rivals; they form a chain. Live apps write to databases. That data, along with raw material such as logs, documents and images, lands in a data lake. Engineers refine the useful parts into a data warehouse, and the polished numbers flow out to the dashboards leaders actually read. Knowing which store a number came from tells you how fresh it is and how trustworthy it is, and it explains why “the report and the live system disagree” is often perfectly normal: they are looking at different stores at different moments.
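A toy Python model of the chain shows why a dashboard and a live system can disagree without either being wrong. The order amounts and the nightly_refresh function are invented for illustration; real pipelines involve many more steps.

```python
# Toy model of the chain: live database -> lake -> warehouse -> dashboard.
live_db = {"orders": [100, 250]}  # order amounts in the live system


def nightly_refresh(db):
    """Copy a snapshot to the lake, then refine it into warehouse totals."""
    lake_snapshot = list(db["orders"])  # raw copy, as of refresh time
    return {
        "total_revenue": sum(lake_snapshot),
        "order_count": len(lake_snapshot),
    }


warehouse = nightly_refresh(live_db)

# A new order arrives after the refresh; only the live database sees it.
live_db["orders"].append(75)

live_total = sum(live_db["orders"])
dashboard_total = warehouse["total_revenue"]

print(live_total, dashboard_total)
```

The live system now says 425 while the dashboard still says 350. Both are correct for the moment they describe, and the gap closes at the next refresh.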

How to use it

You do not need to build any of this, but naming it well makes you sharper in meetings. Try phrases like: “Is that figure from the live database or last night’s warehouse refresh?” or “Can we keep the raw files in the data lake in case we need them later?” or “That report looks behind — when does the warehouse refresh?” Asking which store a number lives in is one of the fastest ways to sound like you genuinely understand the plumbing — because now you do.

Quick check

1. Which store is built mainly for fast, live updates from an app?

2. A three-year sales dashboard most likely reads from a…

3. A data lake is best described as…