Same shape as Trust. Not a layer you add at the end. A discipline that runs through every build layer. Every Lambda you wrote without thinking about IAM is a Lambda you have to refactor. Every payload you ingested without a PII handling decision is data you have to retroactively classify. Designed in at L1. Present at L2, L3, L4, L5. By L4 it's too late to retrofit.
Three things to get right at every layer: least-privilege service accounts, row-level access, and a PII column registry. The cardinal rule: IAM, not application code. The boundary lives in the platform, never in the dashboard or the Python.
PII — "personally identifiable information" — is any column that points at a human: email, phone, address, government ID, birthday, geolocation. Not every column in a customer table is PII, but some are, and the first job is tagging which ones.
Row-level security is like a guest list at a door. The table is the same for everyone, but the bouncer (the database) only lets each person see the rows they're allowed to. Sales rep for the West region queries dim_customer and gets back West customers. Analytics team queries the same table and gets everyone. Same SQL. Different results.
Every connector authenticates as a service account — a dedicated machine login, separate from any human. Give each one the smallest possible set of permissions for its job, and nothing else.
In practice:
raw.ads_* tables.This is the principle of least privilege: every login gets exactly the access it needs to do its job, not an ounce more. It takes ten extra minutes at setup. It's the difference between "a key leaked; we rotated it" and "a key leaked; someone now has read-write on our prod database."
Policy on dim_account: account.owner_id = current_user(). Inherits to every downstream mart and tool.
Sees every account, every deal, every order — but emails and phones return as stable hashes. Aggregates still work; re-identification doesn't.
Every MCP call inherits the user's row scope. PII never leaves the tool boundary — unless a verb like open_record explicitly surfaces it for a specific ID.
-- glue/lake_formation/policies.sql
-- Mark PII columns once at the catalog level, not in every query.
ALTER TABLE marts.customers
SET TBLPROPERTIES ('pii_columns' = 'email,phone,address_line1,address_line2');
GRANT SELECT (customer_id, signup_date, lifetime_value, segment)
ON marts.customers TO ROLE analyst;
GRANT SELECT (customer_id, email, phone, signup_date, lifetime_value)
ON marts.customers TO ROLE finance;
-- Row-level: analysts only see customers in their region.
CREATE ROW ACCESS POLICY analyst_region_filter
ON marts.customers
AS (region = current_user_region());
Load-bearingPII is defined at the column level once, not in every query. Analysts never see emails. Finance does. The model itself doesn’t change — the policy does. Compliance review becomes "show me policies.sql" instead of auditing 40 dashboards.