Module 2 Free 4 min

Storage Accounts, Blob Storage & Data Lakes

Where Azure keeps your files — and what people mean by blobs, containers and data lakes.

What you'll learn

  • Explain what a storage account and a container are
  • Describe blob storage in plain terms
  • Understand what makes a data lake (ADLS Gen2) different

Before any clever analysis can happen, your data has to live somewhere. In Azure, that somewhere is usually a storage account — a giant, secure cupboard in the cloud for files of every kind. Almost every data project starts here, because pipelines, dashboards and reports all need raw material to draw from. Get comfortable with three words — storage account, blob and data lake — and most storage conversations will suddenly make sense.

The storage account: your cloud cupboard

A storage account is the top-level space Microsoft gives you to hold files in the cloud. Think of it as renting a self-storage unit: one big, lockable space with your name on it, sitting in a chosen region and billed to a chosen subscription. Inside that unit you can keep almost anything (documents, images, exports, backups, enormous data files), and Azure looks after the locks, the durability and the behind-the-scenes copying that keeps your files safe. You don’t manage hard drives; you just put things in and take things out.

[Diagram: a storage account (the cupboard) holding two containers (shelves), each with two blobs (files), next to a data lake: the same cupboard, with real folders, built for analytics.]

A storage account holds containers (shelves); each container holds blobs (files). Switch on the lake features and you get true folders for analytics.

Blobs and containers: shelves and the things on them

Inside a storage account, files are organised into containers. A container is simply a top-level shelf you use to group related files — one for raw exports, one for finished reports, one for images. Each file you drop onto a shelf is called a blob. The word sounds odd, but it just means “a file in the cloud” — the term stands for Binary Large OBject, which is engineer-speak for “any lump of data, whatever its shape”. A photo is a blob; a CSV export is a blob; a backup is a blob. So Blob Storage is nothing more exotic than “the part of Azure that holds your files”, organised as containers full of blobs.

This setup is cheap, endlessly expandable and very reliable, which is why it’s the default home for almost everything. When a colleague says “the file’s in blob storage” or “drop it in the container”, they simply mean it’s sitting on a shelf in the cloud cupboard, ready to be picked up.

A storage account is the cupboard, a container is a shelf, and a blob is a single file on it.
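
For the curious, the cupboard, shelf and file nesting shows up directly in a blob's web address. The sketch below builds that address using Azure's standard blob endpoint pattern. The account, container and file names are made up for illustration, and the helper `blob_url` is our own, not part of any Azure library.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build a blob's web address: account (cupboard) -> container (shelf) -> blob (file)."""
    # quote() escapes spaces and other unsafe characters but leaves "/" alone
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


# Hypothetical names, used only for illustration
print(blob_url("contosodata", "raw-exports", "sales 2024.csv"))
```

Read the address from left to right and you walk from the cupboard, to the shelf, to the single file on it.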

Data lakes: the same cupboard, dressed for analytics

A data lake takes that same storage account and switches on extra abilities aimed at large-scale analysis. In Azure this is called ADLS Gen2 — Azure Data Lake Storage Generation 2 — and it’s not a separate product you buy. It’s a storage account with a setting turned on (the hierarchical namespace) that gives you proper, nested folders instead of just flat shelves. That sounds minor, but it changes how the data behaves.

Why does it matter? Analytics tools work with enormous numbers of files, and real folders let them find, sort and process those files far faster and more cheaply. Renaming or moving a folder, for example, becomes a single step instead of a copy of every file inside it. A data lake is the place you pour all your data (raw, messy, structured or not) and keep it until something needs it. The name is apt: a lake is large, holds water of every kind, and you draw from it as required. The tools you’ll meet later (Data Factory, Databricks, Synapse and Fabric) almost always read from and write to a data lake.
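
To see why real folders help, here is a toy model in plain Python. It is only a sketch of the idea, not how Azure is built internally: the two dictionaries and both helper functions are invented for illustration. On flat shelves a "folder" is just the first part of a file's name, so renaming it means touching every file. With real folders, the rename is one step.

```python
# Flat blob storage: every file is one entry, and "folders" are just
# the part of the file name that comes before the last "/".
flat_store = {
    "raw/2024/jan.csv": b"...",
    "raw/2024/feb.csv": b"...",
    "reports/summary.pdf": b"...",
}


def rename_folder_flat(store: dict, old: str, new: str) -> int:
    """Rename a pretend folder by renaming every file under it; returns files touched."""
    touched = 0
    for name in list(store):  # copy the keys so we can change the dict safely
        if name.startswith(old + "/"):
            store[new + name[len(old):]] = store.pop(name)
            touched += 1
    return touched


# Hierarchical storage: folders are real objects that hold their files.
lake_store = {
    "raw": {"2024": {"jan.csv": b"...", "feb.csv": b"..."}},
    "reports": {"summary.pdf": b"..."},
}


def rename_folder_lake(store: dict, old: str, new: str) -> int:
    """Rename a real top-level folder in one step, however many files it holds."""
    store[new] = store.pop(old)
    return 1


print(rename_folder_flat(flat_store, "raw", "landing"))  # 2 files touched
print(rename_folder_lake(lake_store, "raw", "landing"))  # 1 step
```

With two files the difference is trivial. With millions of files, the gap between "touch every file" and "one step" is what makes a lake quicker and cheaper for analytics tools.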

Lake versus ordinary blob storage

The short version: ordinary blob storage is perfect for everyday files and backups, while a data lake is blob storage tuned for heavy analytics, with folders and faster access for the big-data tools. Same cupboard underneath — different settings on top. You don’t have to choose between them as concepts; many projects use both.
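
One way to see "same cupboard, different settings on top" is to compare the two addresses a tool might use for the same file. The sketch below assumes a storage account with the hierarchical namespace switched on. The account, container and path are hypothetical, and both helper functions are our own illustration rather than part of an Azure library.

```python
def blob_address(account: str, container: str, path: str) -> str:
    """Address of a file through the ordinary blob endpoint."""
    return f"https://{account}.blob.core.windows.net/{container}/{path}"


def lake_address(account: str, container: str, path: str) -> str:
    """Address of the same file through the data lake (ADLS Gen2) endpoint."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"


# Hypothetical names, used only for illustration
account, container, path = "contosodata", "raw-exports", "2024/jan.csv"
print(blob_address(account, container, path))
print(lake_address(account, container, path))
```

Both addresses name the same account and the same container. Only the endpoint changes, which is why analytics tools are usually given the second form when someone says "point the pipeline at the lake".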

How to use it

You’ll mostly talk about storage rather than configure it, so a few phrases go a long way. “Where’s the raw data landing?” usually has the answer “in the data lake”. “Which container is it in?” asks which shelf to look on. If someone says “it’s just sitting in blob storage”, you know it’s a plain file in the cloud, not yet processed. And if you hear “point the pipeline at the lake”, you now understand that means “have the tool read from our big analytics storage”. Knowing the difference between a casual blob and a purpose-built data lake is enough to follow — and sensibly question — almost any Azure storage discussion.

Quick check

1. A "blob" in Azure is…

2. A container is best thought of as…

3. A data lake (ADLS Gen2) is mainly…