Cross-cutting discipline D2 (2 of 2) · designed in at L1

Identity & Compliance — the lake is safe. Auth, IAM, PII, audit.

Same shape as Trust. Not a layer you add at the end. A discipline that runs through every build layer. Every Lambda you wrote without thinking about IAM is a Lambda you have to refactor. Every payload you ingested without a PII handling decision is data you have to retroactively classify. Designed in at L1. Present at L2, L3, L4, L5. By L4 it's too late to retrofit.

Our take

Three things to get right at every layer: least-privilege service accounts, row-level access, and a PII column registry. The cardinal rule: IAM, not application code. The boundary lives in the platform, never in the dashboard or the Python.

First, the words: PII & row-level security

PII is the stuff that, in the wrong hands, causes a bad headline.

PII — "personally identifiable information" — is any column that points at a human: email, phone, address, government ID, birthday, geolocation. Not every column in a customer table is PII, but some are, and the first job is tagging which ones.

Row-level security is like a guest list at a door. The table is the same for everyone, but the bouncer (the database) only lets each person see the rows they're allowed to. A sales rep for the West region queries dim_customer and gets back West customers. The analytics team queries the same table and gets everyone. Same SQL. Different results.
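To make the guest-list idea concrete, here is a toy model in Python. It only illustrates the behaviour: in a real deployment the filter is a policy enforced by the database, not by Python. The sample rows, the role names, and the `ROW_POLICIES` mapping are all hypothetical.

```python
# Toy model of row-level security: one table, one query, per-role results.
# Illustration only; real enforcement belongs to the database, not app code.

DIM_CUSTOMER = [
    {"customer_id": 1, "region": "West"},
    {"customer_id": 2, "region": "East"},
    {"customer_id": 3, "region": "West"},
]

# Hypothetical policies: each role maps to a predicate over a row.
ROW_POLICIES = {
    "sales_west": lambda row: row["region"] == "West",
    "analytics": lambda row: True,  # sees every row
}


def select_all(table, role):
    """Stand-in for `SELECT * FROM dim_customer` executed under `role`."""
    # A role with no policy gets no rows (deny by default).
    policy = ROW_POLICIES.get(role, lambda row: False)
    return [row for row in table if policy(row)]


west_ids = [r["customer_id"] for r in select_all(DIM_CUSTOMER, "sales_west")]
all_ids = [r["customer_id"] for r in select_all(DIM_CUSTOMER, "analytics")]
print(west_ids)  # [1, 3]
print(all_ids)   # [1, 2, 3]
```

The caller never adds a WHERE clause. The role decides which rows come back, and an unknown role gets nothing.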

Why early
Tagging PII and setting up roles on day one takes an afternoon. Retrofitting it during a security review because a prospect's CISO asked for it takes weeks and blocks deals. The difference between "cleared in a week" and "blocked for a quarter."
Rule 04

Give each connector its own login. Read-only. Narrow.

Every connector authenticates as a service account — a dedicated machine login, separate from any human. Give each one the smallest possible set of permissions for its job, and nothing else.

In practice:

  • The CRM connector only needs to read accounts and contacts. It doesn't need write access. Don't give it write access.
  • The ad-platform connector only needs to pull campaign stats. It doesn't need to modify or delete anything in the source, and it doesn't need write access to the warehouse either — it writes only to its own raw.ads_* tables.
  • The warehouse login your analysts use should be able to read marts, but not read raw PII columns — that's what roles are for.

This is the principle of least privilege: every login gets exactly the access it needs to do its job, not an ounce more. It takes ten extra minutes at setup. It's the difference between "a key leaked; we rotated it" and "a key leaked; someone now has read-write on our prod database."
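As a minimal sketch of "read-only, narrow" on the lake side, the Python below builds an IAM policy document that lets the ad-platform connector write objects under one prefix and nothing else. The bucket name `example-lake` and the `raw/ads_` prefix layout are assumptions for illustration; the source only says the connector writes to its own raw.ads_* tables. The `has_wildcard_action` helper is likewise hypothetical, a small check you might run before deploying a policy.

```python
import json


def connector_write_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy that only allows writing objects under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteOwnRawPrefixOnly",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            }
        ],
    }


def has_wildcard_action(policy: dict) -> bool:
    """Flag any statement whose actions include '*' or 'service:*'."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):  # IAM allows a single string here
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
    return False


# Assumed bucket and prefix; substitute your own layout.
ads_policy = connector_write_policy("example-lake", "raw/ads_")
print(json.dumps(ads_policy, indent=2))
```

Anything not explicitly allowed is denied by IAM, so this connector cannot read, list, or delete, and cannot touch another connector's prefix.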

| dim_customer column | Tag           | Sales rep        | Analytics | AI / MCP |
|---------------------|---------------|------------------|-----------|----------|
| customer_id         | id            | allow            | allow     | allow    |
| email               | pii · contact | allow            | hash      | deny     |
| phone               | pii · contact | allow            | hash      | deny     |
| credit_card_last4   | pci           | deny             | deny      | deny     |
| lifetime_revenue    | metric        | allow (own acct) | allow     | allow    |
| account_id          | fk            | scoped           | allow     | scoped   |
| lifecycle_stage     | derived       | allow            | allow     | allow    |

Legend: allow = raw value returned · hash = stable hash only · scoped = row-level filter on calling user · deny = never readable from this role
Row-level access lives here
Role · sales rep
Their accounts only.

Policy on dim_account: account.owner_id = current_user(). Inherits to every downstream mart and tool.

+ own rows · × other reps
Role · analytics
All rows. PII hashed.

Sees every account, every deal, every order — but emails and phones return as stable hashes. Aggregates still work; re-identification doesn't.

+ all rows · ~ hashed PII
Role · AI / MCP
Scoped to the calling human.

Every MCP call inherits the user's row scope. PII never leaves the tool boundary — unless a verb like open_record explicitly surfaces it for a specific ID.

+ scoped rows · × free PII
Rule
The LLM never sees raw email, phone, or payment info unless a human with appropriate access explicitly asked for that specific record. Everything else gets a stable hash or a bucketed label.
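One way to read "stable hash" is as a keyed hash: the same input always maps to the same token, so joins and counts still work, but the raw value is never returned. The sketch below uses HMAC-SHA256 from Python's standard library. The key handling, the 16-character truncation, and the age-band cut-offs are assumptions for illustration, not something the source specifies. The key matters because a plain unkeyed hash of an email address can be reversed by hashing guessed addresses.

```python
import hashlib
import hmac


def stable_hash(value: str, key: bytes) -> str:
    """Keyed, deterministic token for a PII value (normalised first)."""
    normalised = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()[:16]


def age_band(age: int) -> str:
    """Replace an exact age with a coarse label (hypothetical cut-offs)."""
    if age < 25:
        return "under_25"
    if age < 45:
        return "25_to_44"
    return "45_plus"


# Placeholder key for the demo; load a real one from a secrets manager.
KEY = b"demo-key-do-not-hard-code"

token_a = stable_hash("Ada@example.com", KEY)
token_b = stable_hash(" ada@example.com ", KEY)
print(token_a == token_b)  # True: same person, same token
print(age_band(31))        # 25_to_44
```

Normalising before hashing keeps the token stable across casing and stray whitespace, which is what lets aggregates line up across tables.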
Artifact · glue/lake_formation/policies.sql · ~18 lines · column + row policies
-- glue/lake_formation/policies.sql
-- Mark PII columns once at the catalog level, not in every query.

ALTER TABLE marts.customers
  SET TBLPROPERTIES ('pii_columns' = 'email,phone,address_line1,address_line2');

GRANT SELECT (customer_id, signup_date, lifetime_value, segment)
  ON marts.customers TO ROLE analyst;

GRANT SELECT (customer_id, email, phone, signup_date, lifetime_value)
  ON marts.customers TO ROLE finance;

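-- Row-level: analysts only see customers in their region.
-- Pseudo-DDL: row-policy syntax varies by engine, and current_user_region()
-- is assumed to be a lookup you define yourself.
CREATE ROW ACCESS POLICY analyst_region_filter
  ON marts.customers
  AS (region = current_user_region());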

Load-bearing: PII is defined at the column level once, not in every query. Analysts never see emails. Finance does. The model itself doesn't change; the policy does. Compliance review becomes "show me policies.sql" instead of auditing 40 dashboards.
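Because the PII list and the grants are plain data, part of that review can be automated. The sketch below is a hypothetical CI check in Python. The column sets are transcribed by hand from policies.sql (parsing the SQL itself is out of scope here), and the `PII_ALLOWED_ROLES` allow-list is an assumption about who is approved to read raw PII.

```python
# Values transcribed from policies.sql.
PII_COLUMNS = {"email", "phone", "address_line1", "address_line2"}

GRANTS = {
    "analyst": {"customer_id", "signup_date", "lifetime_value", "segment"},
    "finance": {"customer_id", "email", "phone", "signup_date", "lifetime_value"},
}

# Hypothetical allow-list: roles approved to read raw PII.
PII_ALLOWED_ROLES = {"finance"}


def pii_violations(grants, pii_columns, allowed_roles):
    """Return {role: sorted PII columns} for roles that should not see PII."""
    violations = {}
    for role, columns in grants.items():
        exposed = sorted(columns & pii_columns)
        if exposed and role not in allowed_roles:
            violations[role] = exposed
    return violations


violations = pii_violations(GRANTS, PII_COLUMNS, PII_ALLOWED_ROLES)
print(violations)  # {} means no unapproved role can read PII
```

A check like this fails the build when someone grants a PII column to a new role without also adding that role to the allow-list, so the exception is visible in review.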