Saturday, May 2, 2026

Microsoft Graph API & Development — Complete Guide

Graph Fundamentals · Authentication · OData Queries · Webhooks · Delta Query · Batch API · SDKs · Scenarios · Cheat Sheet


Table of Contents

  1. Core Concepts — Basics
  2. Authentication & Authorisation
  3. OData Queries & Key Endpoints
  4. Advanced Graph Features
  5. Graph SDKs & Power Platform Integration
  6. Scenario-Based Questions
  7. Cheat Sheet — Quick Reference

1. Core Concepts — Basics

What is Microsoft Graph and what problem does it solve?

Microsoft Graph is a unified REST API providing access to data, relationships, and intelligence across all Microsoft 365 services through a single endpoint: https://graph.microsoft.com.

The problem it solved — before Graph:

  • Each service had its own API: Exchange → EWS, SharePoint → CSOM/REST, Azure AD → AAD Graph, Teams → Teams API, OneDrive → its own endpoint
  • Developers had to learn multiple auth models, endpoint patterns, and SDKs per service
  • Cross-service scenarios (create a Teams meeting + add to calendar + notify by email) required calls to multiple separate APIs

With Microsoft Graph:

  1. Single endpoint: one URL, one SDK, one auth flow for all M365 services
  2. Rich relationships: traverse entity relationships — user → mailbox → messages → attachments, user → joined Teams → channels → messages
  3. Intelligence layer: access M365 AI insights — trending documents, people you work with, meeting insights

What Graph covers: Users, Groups, Mail, Calendar, Contacts, Teams, SharePoint, OneDrive, Planner, To-Do, OneNote, Security, Identity, Intune, Compliance, Reports, and more.


What are the two versions of Microsoft Graph API?

Version   Endpoint                            Use
v1.0      https://graph.microsoft.com/v1.0/   Stable, production-ready, backward compatibility guaranteed. Use for all production apps.
beta      https://graph.microsoft.com/beta/   Preview APIs under development. May change without notice. Prototyping only — never production.

Warning: Using beta APIs in production is a common and dangerous mistake. If a beta API changes or is removed, your production application breaks without warning.


What are delegated permissions vs application permissions?

Delegated permissions: app acts on behalf of a signed-in user. App can only do what the user can do — user's permissions are the ceiling.

Application permissions: app acts as itself with no user context. Can access all tenant data for the granted scope. Admin consent required.

Delegated (user context):
→ Requires a signed-in user
→ Token contains user claims
→ Access scoped to what the signed-in user can access
→ Consent: user or admin can consent
→ Use for: apps with interactive user sign-in
→ Example: GET /me/messages  ← signed-in user's emails only

Application (service/daemon):
→ No user — app authenticates with its own identity
→ Token contains app claims (no user)
→ Access to ALL tenant data in the granted scope
→ Consent: ADMIN ONLY
→ Use for: background services, automation, scheduled jobs
→ Example: GET /users/{userId}/messages  ← any user's emails

Critical: Application permissions are extremely powerful
Mail.Read (application) = read EVERY mailbox in the entire tenant
Always apply least privilege — request minimum permissions needed

Critical: Admin consent is required for ALL application permissions. Never grant broader permissions than the app actually needs.
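
A practical way to tell which model a token was issued under is to inspect its claims: delegated tokens carry an scp (scope) claim, while app-only tokens carry a roles array instead. A minimal sketch — the decoded payloads below are illustrative, not real tokens:

```javascript
// Classify a decoded Graph access-token payload by its claims.
// Delegated tokens contain an `scp` claim (space-separated scopes);
// app-only tokens contain a `roles` array and no `scp`.
function tokenPermissionModel(claims) {
  if (typeof claims.scp === 'string') return 'delegated';
  if (Array.isArray(claims.roles)) return 'application';
  return 'unknown';
}

// Illustrative decoded payloads (not real tokens):
const delegatedClaims = { scp: 'User.Read Mail.Read', upn: 'alice@contoso.com' };
const appOnlyClaims = { roles: ['Mail.Read', 'User.Read.All'] };

console.log(tokenPermissionModel(delegatedClaims)); // delegated
console.log(tokenPermissionModel(appOnlyClaims));   // application
```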


What is Graph Explorer?

Graph Explorer (developer.microsoft.com/graph/graph-explorer) is a web-based tool for exploring and testing Graph API calls without writing code.

Key features:

  • Sign in with your M365 account to run real calls against your tenant
  • Library of 200+ sample queries across all workloads
  • Request builder with headers and request body
  • Full JSON response viewer with syntax highlighting
  • Permissions panel — see required permissions, consent for testing
  • Code snippets — auto-generate code in C#, JavaScript, Java, Python, PowerShell, Go

2. Authentication & Authorisation

What OAuth 2.0 flows are used for Microsoft Graph?

Auth Code Flow (delegated — web apps with user sign-in):
1. User clicks Sign In → app redirects to Entra ID login
2. User authenticates + consents to permissions
3. Entra ID returns authorisation code to redirect URI
4. App exchanges code for access token + refresh token
5. App calls Graph: Authorization: Bearer {access_token}
6. Token expires → refresh token used silently for new access token
→ Use for: web apps, mobile apps with user context

Client Credentials Flow (application — no user):
1. App authenticates to Entra ID with client ID + client secret/certificate
2. Entra ID returns access token (no user claims)
3. App calls Graph: Authorization: Bearer {access_token}
→ Use for: background services, daemons, scheduled jobs

Device Code Flow (delegated — devices without browser):
1. App requests device code from Entra ID
2. App shows user a code + URL (microsoft.com/devicelogin)
3. User opens URL on another device, enters code, authenticates
4. App polls Entra ID until auth complete
→ Use for: CLIs, IoT devices, scripting tools (PnP PowerShell uses this)

On-Behalf-Of (OBO) Flow (delegated — API calling Graph for a user):
1. User authenticates to API A
2. API A calls Graph on behalf of user using OBO
3. Graph sees original user identity in the token
→ Use for: middle-tier APIs that call Graph with user context

How do you register an app to call Microsoft Graph?

Steps:
1. Azure Portal → Entra ID → App registrations → New registration
   → Set: name, supported account types, redirect URI

2. Note credentials from Overview:
   Application (client) ID: unique app identifier in Entra ID
   Directory (tenant) ID: your Entra ID tenant identifier

3. Create credentials (Certificates & Secrets):
   → Client secret: simple but requires rotation
   → Certificate: recommended for production — more secure

4. Configure permissions (API permissions):
   → Add permission → Microsoft Graph
   → Select: Delegated permissions OR Application permissions
   → Choose required scopes

5. Grant admin consent:
   → Required for application permissions and high-privilege delegated scopes
   → "Grant admin consent for [tenant]" button

Token endpoint:
https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token

Scope formats:
https://graph.microsoft.com/.default         ← all granted permissions
https://graph.microsoft.com/User.Read        ← specific delegated scope
https://graph.microsoft.com/Mail.Send        ← specific delegated scope
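
Under the hood, the client credentials flow is a single form-encoded POST to the token endpoint above. A sketch of building that request body, with placeholder credentials:

```javascript
// Build the form-encoded body for a client-credentials token request
// (POST to https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token).
function buildClientCredentialsBody(clientId, clientSecret) {
  return new URLSearchParams({
    grant_type: 'client_credentials',
    client_id: clientId,
    client_secret: clientSecret,
    scope: 'https://graph.microsoft.com/.default'
  }).toString();
}

const body = buildClientCredentialsBody('app-client-id', 'app-secret');
console.log(body.includes('grant_type=client_credentials')); // true
```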

What is MSAL and why should you use it?

MSAL (Microsoft Authentication Library) is Microsoft's official library for authenticating with Entra ID and acquiring tokens. Available for .NET, JavaScript/Node.js, Python, Java, Go, iOS, Android.

Why use MSAL over raw HTTP:

  1. Token caching: automatically caches and reuses tokens until expiry
  2. Token refresh: silently refreshes expired tokens — no user re-prompt
  3. Multiple flow support: handles Auth Code, Client Credentials, Device Code, OBO flows
  4. Conditional Access handling: detects claims challenges and triggers re-authentication
  5. Security built-in: implements PKCE, state parameter, nonce automatically

// MSAL Node.js — Client Credentials (app-only):
const msal = require('@azure/msal-node');

const config = {
  auth: {
    clientId: process.env.CLIENT_ID,
    authority: `https://login.microsoftonline.com/${process.env.TENANT_ID}`,
    clientSecret: process.env.CLIENT_SECRET
  }
};

const cca = new msal.ConfidentialClientApplication(config);
const tokenResponse = await cca.acquireTokenByClientCredential({
  scopes: ['https://graph.microsoft.com/.default']
});

const accessToken = tokenResponse.accessToken;
// Use in: Authorization: Bearer {accessToken}

3. OData Queries & Key Endpoints

What are the OData query parameters in Microsoft Graph?

$select — choose which fields to return (always use this!):
GET /me/messages?$select=subject,from,receivedDateTime,isRead
→ Returns only those fields, not the full ~40-property message object

$filter — filter results:
GET /users?$filter=department eq 'Finance'
GET /me/messages?$filter=isRead eq false and importance eq 'high'
GET /groups?$filter=startswith(displayName,'Sales')
GET /me/events?$filter=start/dateTime ge '2025-01-01T00:00:00Z'

$orderby — sort results:
GET /me/messages?$orderby=receivedDateTime desc
GET /users?$orderby=displayName asc

$top — limit results returned:
GET /users?$top=50
GET /me/messages?$top=25

$expand — include related entities inline:
GET /me/messages?$expand=attachments
GET /groups?$expand=members($select=displayName,mail)
GET /users/{id}?$expand=manager($select=displayName,mail)

$count — include total count:
GET /users?$count=true
(Requires header: ConsistencyLevel: eventual)

$search — full-text search:
GET /me/messages?$search="project proposal"
GET /users?$search="displayName:John"

Tip: Always use $select in production — the default response returns every property. This dramatically reduces payload size and improves performance at scale.


How does pagination work in Microsoft Graph?

When results exceed the page size, Graph returns an @odata.nextLink property containing the URL for the next page.

// Pagination loop (JavaScript):
let url = 'https://graph.microsoft.com/v1.0/users?$top=100&$select=displayName,mail';
let allUsers = [];

while (url) {
  const response = await graphClient.api(url).get();
  allUsers = allUsers.concat(response.value);
  url = response['@odata.nextLink'] || null;
}

console.log(`Total users: ${allUsers.length}`);

Key pagination rules:

  • Never construct pagination URLs manually — always use @odata.nextLink as-is
  • $skiptoken values are opaque — never parse or modify them
  • $skip + $top is NOT reliable for large datasets — use nextLink
  • Some endpoints have max $top values (messages: 1000 max, users: 999 max)
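
The pagination rules above can be captured in a small helper that takes an injected page-fetching function, which keeps the loop testable without calling Graph; the mock pages below are illustrative:

```javascript
// Collect all items from a paged Graph response by following
// @odata.nextLink until it is absent. `fetchPage(url)` must return
// a parsed response object: { value: [...], '@odata.nextLink'?: url }.
async function collectAllPages(startUrl, fetchPage) {
  const items = [];
  let url = startUrl;
  while (url) {
    const page = await fetchPage(url);
    items.push(...page.value);
    url = page['@odata.nextLink'] || null; // never construct this URL yourself
  }
  return items;
}

// Illustrative mock: two pages of users.
const pages = {
  '/users?$top=2': { value: [{ id: 1 }, { id: 2 }], '@odata.nextLink': '/users?$skiptoken=x' },
  '/users?$skiptoken=x': { value: [{ id: 3 }] }
};
collectAllPages('/users?$top=2', async url => pages[url])
  .then(all => console.log(all.length)); // 3
```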

What are the most important Graph endpoints?

Identity / Users:
GET  /me                                    ← signed-in user profile
GET  /users                                 ← all users in tenant
GET  /users/{id}                            ← specific user
GET  /users/{id}/manager                    ← user's manager
GET  /users/{id}/directReports              ← direct reports
GET  /users/{id}/memberOf                   ← groups/roles user belongs to
GET  /me/photo/$value                       ← profile photo (binary)

Mail:
GET  /me/messages                           ← inbox messages
POST /me/sendMail                           ← send email
GET  /me/mailFolders                        ← folder list
PATCH /me/messages/{id}                     ← update (e.g., mark read)

Calendar:
GET  /me/events                             ← calendar events
POST /me/events                             ← create meeting/event
GET  /me/calendarView?startDateTime=&endDateTime=  ← events in range

Teams:
GET  /me/joinedTeams                        ← teams I'm member of
GET  /teams/{id}/channels                   ← team channels
POST /teams/{id}/channels/{id}/messages     ← send channel message
GET  /chats/{id}/messages                   ← chat messages

SharePoint / OneDrive:
GET  /me/drive                              ← my OneDrive
GET  /me/drive/root/children                ← root folder contents
GET  /sites/{siteId}/drives                 ← SharePoint document libraries
GET  /sites/{siteId}/lists                  ← SharePoint lists
GET  /sites/{siteId}/lists/{listId}/items   ← list items

Groups:
GET  /groups                                ← all M365 + security groups
POST /groups                                ← create group
GET  /groups/{id}/members                   ← group members
POST /groups/{id}/members/$ref              ← add member to group

Reports (admin only — Reports.Read.All):
GET  /reports/getEmailActivityCounts(period='D30')
GET  /reports/getTeamsUserActivityCounts(period='D30')
GET  /reports/getSharePointSiteUsageDetail(period='D30')

What is the difference between /me and /users/{id}?

/me — delegated tokens only:
→ Shortcut for the signed-in user
→ Only works when a user is signed in
→ Returns data for the authenticated user

GET /me/messages       ← my emails
GET /me/events         ← my calendar
GET /me/drive          ← my OneDrive

/users/{id} — delegated OR application tokens:
→ Explicitly specifies a user by ID or UPN
→ Required for daemon apps and admin scenarios

GET /users/alice@contoso.com/messages  ← Alice's emails
GET /users/{objectId}/events           ← specific user's calendar

Application permission scenarios always use /users/{id}:
→ Background compliance scanning of all mailboxes
→ Admin reporting tool reading all user profiles
→ Provisioning service creating calendar events for onboarded users

4. Advanced Graph Features

What are Microsoft Graph change notifications (webhooks)?

Change notifications allow your application to receive real-time notifications when M365 resources change — instead of polling repeatedly.

1. Register a subscription:
POST https://graph.microsoft.com/v1.0/subscriptions
{
  "changeType": "created,updated,deleted",
  "notificationUrl": "https://yourapp.com/webhook",
  "resource": "/me/messages",
  "expirationDateTime": "2025-04-01T00:00:00Z",
  "clientState": "secretClientState"
}

2. Validate endpoint (Graph sends validation request):
POST https://yourapp.com/webhook?validationToken=abc123
→ Your endpoint MUST echo back validationToken within 10 seconds

3. Receive change notifications:
{
  "value": [{
    "subscriptionId": "...",
    "changeType": "created",
    "resource": "users/abc/messages/xyz",
    "clientState": "secretClientState",
    "resourceData": { "@odata.type": "#microsoft.graph.message", "id": "xyz" }
  }]
}

4. Fetch the changed resource:
GET /me/messages/xyz

Key requirements:
→ Subscriptions expire (max 4,230 minutes for most resources)
→ Renew before expiry: PATCH /subscriptions/{id}
→ Always validate clientState on incoming notifications (security)
→ Respond 202 Accepted immediately, then process async
→ Rich notifications: include resource data in the payload

Tip: Always respond 202 Accepted immediately — then process the notification asynchronously. Slow responses cause Graph to mark your endpoint unhealthy and stop sending notifications.
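
The validation handshake and clientState check can be factored into a framework-independent handler, which makes the security logic easy to unit-test; the expected clientState value here is illustrative:

```javascript
// Handle an incoming Graph webhook request. Returns { status, body, accepted }
// so the logic stays framework-independent and testable.
function handleGraphWebhook(query, payload, expectedClientState) {
  // Validation handshake: Graph expects the token echoed back within 10s.
  if (query.validationToken) {
    return { status: 200, body: query.validationToken, accepted: [] };
  }
  // Keep only notifications whose clientState matches (security check),
  // and respond 202 immediately; real processing belongs in an async queue.
  const accepted = ((payload && payload.value) || [])
    .filter(n => n.clientState === expectedClientState);
  return { status: 202, body: '', accepted };
}

// Validation handshake echoes the token:
console.log(handleGraphWebhook({ validationToken: 'abc123' }, null, 's').body); // abc123
```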


What is Microsoft Graph Delta Query?

Delta Query returns only the changes (new, updated, deleted items) since the last call — without re-fetching the full dataset.

Initial full sync:
GET /users/delta?$select=displayName,mail,department

Response includes:
{
  "@odata.deltaLink": "https://graph.microsoft.com/v1.0/users/delta?$deltatoken=abc123",
  "value": [ ...all users... ]
}
→ Store the deltaLink token

Incremental sync (call every 15 minutes):
GET https://graph.microsoft.com/v1.0/users/delta?$deltatoken=abc123
→ Returns ONLY users added, updated, or deleted since last call
→ Provides new deltaLink token for next call

Deleted items appear as:
{ "id": "xyz", "@removed": { "reason": "deleted" } }

Resources supporting delta:
users, groups, messages, events, contacts,
directoryObjects, teams, channels, driveItems

Use cases:
→ User provisioning: sync new/changed/deleted users to HR system
→ Calendar sync: get only new/changed meetings
→ Teams membership: track who joined/left teams
→ Mail processing: process only new emails since last run

Tip: Delta Query is the correct answer to any "how do you efficiently sync M365 data to an external system" question. Never poll the full resource list — use delta for incremental changes only.
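
The full-then-incremental pattern can be sketched as a sync helper with an injected request function; the canned response below is illustrative:

```javascript
// One delta-sync round: follow nextLink pages, separate live items from
// tombstones (@removed), and return the deltaLink to store for next run.
// `fetchJson(url)` performs the authenticated GET and parses JSON.
async function deltaSync(startUrl, fetchJson) {
  const upserts = [], removals = [];
  let url = startUrl;
  let deltaLink = null;
  while (url) {
    const page = await fetchJson(url);
    for (const item of page.value) {
      (item['@removed'] ? removals : upserts).push(item);
    }
    deltaLink = page['@odata.deltaLink'] || deltaLink;
    url = page['@odata.nextLink'] || null;
  }
  return { upserts, removals, deltaLink };
}

// Illustrative canned response:
const canned = {
  '/users/delta': {
    value: [{ id: 'a' }, { id: 'b', '@removed': { reason: 'deleted' } }],
    '@odata.deltaLink': '/users/delta?$deltatoken=t1'
  }
};
deltaSync('/users/delta', async u => canned[u])
  .then(r => console.log(r.upserts.length, r.removals.length)); // 1 1
```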


What is the Microsoft Graph Batch API?

The Batch API combines up to 20 individual Graph requests into a single HTTP request — reducing round trips.

POST https://graph.microsoft.com/v1.0/$batch
{
  "requests": [
    {
      "id": "1",
      "method": "GET",
      "url": "/me/profile"
    },
    {
      "id": "2",
      "method": "GET",
      "url": "/me/messages?$top=5&$select=subject,from"
    },
    {
      "id": "3",
      "method": "GET",
      "url": "/me/joinedTeams"
    }
  ]
}
Response:
{
  "responses": [
    { "id": "1", "status": 200, "body": { ...profile... } },
    { "id": "2", "status": 200, "body": { ...messages... } },
    { "id": "3", "status": 200, "body": { ...teams... } }
  ]
}

Batch API rules:

  • Up to 20 requests per batch
  • Each request is independent — one failure doesn't affect others
  • Responses may be out of order — matched by id field
  • Dependency chaining via "dependsOn": ["1"] for sequential operations
  • All requests use the same auth token
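
A small builder helps enforce the 20-request cap and wire up dependsOn chaining; a sketch, not an official SDK API:

```javascript
// Build a JSON payload for POST /v1.0/$batch. Enforces the 20-request
// cap and assigns sequential string ids; `dependsOnPrevious` chains each
// request after the one before it, forcing sequential execution.
function buildBatch(requests, { dependsOnPrevious = false } = {}) {
  if (requests.length > 20) {
    throw new Error('Batch API allows at most 20 requests per call');
  }
  return {
    requests: requests.map((r, i) => ({
      id: String(i + 1),
      method: r.method || 'GET',
      url: r.url,
      ...(dependsOnPrevious && i > 0 ? { dependsOn: [String(i)] } : {})
    }))
  };
}

const batch = buildBatch(
  [{ url: '/me' }, { url: '/me/messages?$top=5' }],
  { dependsOnPrevious: true }
);
console.log(batch.requests[1].dependsOn); // logs: [ '1' ]
```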

How do you handle throttling in Microsoft Graph?

Throttling occurs when too many requests are made in a short period. Graph returns 429 Too Many Requests with a Retry-After header.

// Exponential backoff with Retry-After:
async function callGraphWithRetry(url, headers, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers });

    if (response.status !== 429) {
      return await response.json();
    }

    const retryAfter = parseInt(response.headers.get('Retry-After') || '30');
    const backoff = Math.pow(2, attempt) * 1000;
    const delay = Math.max(retryAfter * 1000, backoff);

    console.log(`Throttled. Retrying after ${delay}ms (attempt ${attempt + 1})`);
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  throw new Error('Max retries exceeded — request consistently throttled');
}

Prevention strategies:

  • Always use $select to reduce response payload
  • Use Delta Query instead of full re-fetch
  • Use Batch API to consolidate requests
  • Implement request queuing with rate limiting
  • Distribute large jobs across time periods (avoid bursts)

5. Graph SDKs & Power Platform Integration

What are the Microsoft Graph SDKs?

SDK                    Language           Package
Graph SDK for .NET     C#, F#, VB.NET     Microsoft.Graph + Azure.Identity
Graph JavaScript SDK   Node.js, browser   @microsoft/microsoft-graph-client + @azure/msal-node
Graph SDK for Python   Python             msgraph-sdk + azure-identity
Graph SDK for Java     Java, Android      microsoft-graph + azure-identity
Graph PowerShell SDK   PowerShell         Microsoft.Graph module

// C# Graph SDK — app-only (client credentials):
using Azure.Identity;
using Microsoft.Graph;

var credential = new ClientSecretCredential(tenantId, clientId, clientSecret);
var graphClient = new GraphServiceClient(credential);

var users = await graphClient.Users.GetAsync(config => {
  config.QueryParameters.Select = new[] { "displayName", "mail", "department" };
  config.QueryParameters.Filter = "department eq 'Finance'";
  config.QueryParameters.Top = 50;
});

foreach (var user in users.Value) {
  Console.WriteLine($"{user.DisplayName} - {user.Mail}");
}

// JavaScript SDK — delegated:
const { Client } = require('@microsoft/microsoft-graph-client');
const graphClient = Client.initWithMiddleware({ authProvider });

const messages = await graphClient
  .api('/me/messages')
  .select('subject,from,receivedDateTime,isRead')
  .filter('isRead eq false')
  .top(25)
  .get();

What is the Microsoft Graph PowerShell SDK?

The Graph PowerShell SDK replaces the deprecated Azure AD PowerShell (AzureAD) and MSOnline (MSOL) modules.

# Install and connect:
Install-Module Microsoft.Graph -Scope CurrentUser
Connect-MgGraph -Scopes "User.Read.All","Group.ReadWrite.All"

# Users:
Get-MgUser -UserId "alice@contoso.com"
Get-MgUser -Filter "department eq 'Finance'" -Select DisplayName,Mail
New-MgUser -DisplayName "Bob Smith" -MailNickname "bsmith" -AccountEnabled $true ...
Update-MgUser -UserId $userId -Department "Engineering"

# Groups:
Get-MgGroup -Filter "startswith(displayName,'Sales')"
Add-MgGroupMember -GroupId $groupId -DirectoryObjectId $userId
Remove-MgGroupMemberByRef -GroupId $groupId -DirectoryObjectId $userId

# Mail:
Get-MgUserMessage -UserId $userId -Filter "isRead eq false" -Top 25

# Migration from deprecated modules:
# AzureAD module → Microsoft.Graph PowerShell SDK
Get-AzureADUser         → Get-MgUser
New-AzureADUser         → New-MgUser
Get-AzureADGroup        → Get-MgGroup
Add-AzureADGroupMember  → Add-MgGroupMember
# MSOnline module → Microsoft.Graph PowerShell SDK
Get-MsolUser            → Get-MgUser

Warning: Azure AD PowerShell (AzureAD module) and MSOnline (MSOL) are fully deprecated as of March 2025. All scripts must be migrated to Microsoft.Graph PowerShell SDK — a critical migration point for every M365 admin.


How do you use Microsoft Graph in Power Automate?

Option 1: HTTP with Azure AD connector (recommended):
Action: HTTP
  Method: GET
  URI: https://graph.microsoft.com/v1.0/users?$select=displayName,mail
       &$filter=department eq 'Finance'
  Authentication: Active Directory OAuth
    Tenant: [tenant ID]
    Audience: https://graph.microsoft.com
    Client ID: [app registration client ID]
    Secret: [from Key Vault reference]
→ Follow with Parse JSON to work with response

Option 2: Office 365 Users connector (standard — no premium):
→ Get user profile, manager, direct reports, search users
→ No premium licence required
→ Limited to user-related operations only

Office 365 Users in Power Apps:
Office365Users.UserProfile("alice@contoso.com").DisplayName
Office365Users.ManagerV2({userId: User().Email}).DisplayName
Office365Users.DirectReports(User().Email)

6. Scenario-Based Questions

Scenario: Build a user provisioning system syncing Azure AD users to an external HR database.

  1. App registration: application permission User.Read.All. Certificate authentication (not client secret) for production. Admin consent granted.

  2. Initial full sync:

    GET /users/delta?$select=displayName,mail,department,jobTitle,accountEnabled
    

    Process all users. Store the final @odata.deltaLink token in your database.

  3. Incremental sync (scheduled every 15 minutes):

    GET /users/delta?$deltatoken={stored_token}
    

    Process only changed/added/deleted users. Update stored token.

  4. Deletion handling: when @removed appears → mark user inactive in HR system.

  5. Pagination: follow @odata.nextLink until all results retrieved before storing the final deltaLink.

  6. Error handling: exponential backoff for 429 responses. Dead-letter queue for failed sync items with retry logic.


Scenario: Build a Teams bot that notifies a channel when a new SharePoint list item is created.

  1. App registration: permissions Sites.Read.All + ChannelMessage.Send

  2. Subscribe to SharePoint list changes:

    POST /subscriptions
    {
      "changeType": "created",
      "notificationUrl": "https://yourbot.azurewebsites.net/webhook",
      "resource": "/sites/{siteId}/lists/{listId}/items",
      "expirationDateTime": "2025-04-30T00:00:00Z",
      "clientState": "mySecretState"
    }
    
  3. Webhook endpoint: validate clientState. Respond 202 Accepted immediately. Process async.

  4. Fetch the new item:

    GET /sites/{siteId}/lists/{listId}/items/{itemId}?$expand=fields
    
  5. Post to Teams channel:

    POST /teams/{teamId}/channels/{channelId}/messages
    {
      "body": {
        "contentType": "html",
        "content": "<b>New item:</b> {item.fields.Title}"
      }
    }
    
  6. Subscription renewal: Azure Function timer — renew subscriptions before expiry using PATCH /subscriptions/{id}.


Scenario: Send an email with attachments using Microsoft Graph.

Permission: Mail.Send (delegated = user's own email, application = any user's email)

POST /users/{userId}/sendMail
{
  "message": {
    "subject": "Q1 Project Update",
    "body": {
      "contentType": "HTML",
      "content": "<h1>Q1 Update</h1><p>Please see attached.</p>"
    },
    "toRecipients": [
      { "emailAddress": { "address": "alice@contoso.com", "name": "Alice" } }
    ],
    "ccRecipients": [
      { "emailAddress": { "address": "manager@contoso.com" } }
    ],
    "attachments": [
      {
        "@odata.type": "#microsoft.graph.fileAttachment",
        "name": "Q1_Report.pdf",
        "contentType": "application/pdf",
        "contentBytes": "base64EncodedFileContent=="
      }
    ],
    "importance": "high"
  },
  "saveToSentItems": true
}

For large attachments (> 3MB):
POST /me/messages/{id}/attachments/createUploadSession
→ Upload in chunks up to 4MB each
→ Better reliability for large files
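
Each chunked PUT in an upload session carries a Content-Range header; planning those ranges for a 4 MB chunk size can be sketched as:

```javascript
// Plan the Content-Range header for each sequential PUT in an upload
// session, given total file size and chunk size (4 MB here).
function planUploadChunks(totalBytes, chunkBytes = 4 * 1024 * 1024) {
  const chunks = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    const end = Math.min(start + chunkBytes, totalBytes) - 1; // inclusive end
    chunks.push({ start, end, contentRange: `bytes ${start}-${end}/${totalBytes}` });
  }
  return chunks;
}

// A 10 MB file in 4 MB chunks means three PUTs:
for (const c of planUploadChunks(10 * 1024 * 1024)) {
  console.log(c.contentRange);
}
// bytes 0-4194303/10485760
// bytes 4194304-8388607/10485760
// bytes 8388608-10485759/10485760
```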

Scenario: Build an M365 usage reporting dashboard.

  1. Permission: Reports.Read.All as application permission (admin consent required)

  2. Available report endpoints:

    GET /reports/getEmailActivityCounts(period='D30')
    GET /reports/getTeamsUserActivityCounts(period='D30')
    GET /reports/getSharePointSiteUsageDetail(period='D30')
    GET /reports/getOneDriveActivityUserDetail(period='D30')
    GET /reports/getM365AppUserDetail(period='D30')
    GET /reports/getMailboxUsageDetail(period='D30')
    
  3. Response format: CSV by default. Pass Accept: application/json header for JSON.

  4. Privacy note: user display names are anonymised by default. Enable "Display concealed user, group, and site names in reports" in M365 Admin → Settings → Reports to show real names.

  5. Power BI integration: use Power BI's built-in M365 Usage Analytics template or import JSON reports for custom dashboards.


7. Cheat Sheet — Quick Reference

Permission Quick Reference

Permission type:  Delegated          Application
User context:     Required           None (no user)
Token contains:   User + app claims  App claims only
Consent:          User or admin      ADMIN ONLY
Use for:          Interactive apps   Background services
Max scope:        What user can do   All tenant data

Common delegated scopes:
User.Read          → read signed-in user's profile
Mail.Read          → read signed-in user's email
Mail.Send          → send mail as signed-in user
Calendars.ReadWrite → manage signed-in user's calendar
Sites.Read.All     → read all SharePoint sites
Group.Read.All     → read all groups

Common application scopes:
User.Read.All      → read all users in tenant
Mail.Read          → read ALL mailboxes in tenant
Calendars.Read     → read ALL calendars in tenant
Sites.Read.All     → read all SharePoint content
Reports.Read.All   → read M365 usage reports

OData Quick Reference

Filter operators:
eq, ne, gt, ge, lt, le    → equals, not-equals, comparison
and, or, not              → logical operators
startswith(prop, 'value') → string starts with
endswith(prop, 'value')   → string ends with
contains(prop, 'value')   → string contains

Common filter patterns:
$filter=department eq 'Finance'
$filter=isRead eq false
$filter=startswith(displayName,'J')
$filter=createdDateTime ge '2025-01-01T00:00:00Z'
$filter=assignedLicenses/any(x:x/skuId eq {skuGuid})

Combine parameters:
/users?$select=displayName,mail&$filter=department eq 'IT'&$orderby=displayName&$top=50
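
One gotcha with string literals in $filter: a single quote inside the value must be doubled per the OData spec, or the query fails. A small escaping helper (sketch):

```javascript
// Escape a string value for use inside an OData $filter literal:
// embedded single quotes are doubled, per the OData spec.
function odataQuote(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Convenience wrapper for equality filters (illustrative helper name).
function eqFilter(property, value) {
  return `${property} eq ${odataQuote(value)}`;
}

console.log(eqFilter('displayName', "O'Brien"));
// displayName eq 'O''Brien'
```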

Webhook Subscription Limits

Resource                           Max expiration
Messages (mail)                    4,230 minutes (~3 days)
Calendar events                    4,230 minutes
Users / Groups (directory)         41,760 minutes (~29 days)
Teams channels / chats             60 minutes
SharePoint list items              4,230 minutes
driveItem (OneDrive/SPO)           4,230 minutes

Renewal: PATCH /subscriptions/{id}
  { "expirationDateTime": "new-expiry-datetime" }

Best practice: renew at 75% of subscription lifetime
Schedule a timer function to check and renew all active subscriptions
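
The 75% rule can be expressed as a predicate for the timer function to evaluate. Note the subscription resource only exposes expirationDateTime, so this sketch assumes the app records each subscription's creation time itself:

```javascript
// Decide whether a subscription should be renewed: true once 75% of its
// lifetime has elapsed. Creation time must be tracked by the app (the
// subscription resource itself only exposes expirationDateTime).
function shouldRenew(createdAt, expiresAt, now = new Date()) {
  const lifetime = expiresAt.getTime() - createdAt.getTime();
  const elapsed = now.getTime() - createdAt.getTime();
  return elapsed >= 0.75 * lifetime;
}

const created = new Date('2025-04-01T00:00:00Z');
const expires = new Date('2025-04-01T01:00:00Z'); // a 60-minute Teams subscription
console.log(shouldRenew(created, expires, new Date('2025-04-01T00:50:00Z'))); // true
console.log(shouldRenew(created, expires, new Date('2025-04-01T00:30:00Z'))); // false
```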

Delta Query Support

Supports delta:
/users/delta               → user provisioning sync
/groups/delta              → group membership sync
/me/messages/delta         → incremental mail processing
/me/events/delta           → calendar sync
/me/contacts/delta         → contacts sync
/directoryObjects/delta    → all directory changes
/teams/delta               → Teams changes
/sites/{id}/lists/{id}/items/delta  → SharePoint list sync

deltaLink: use to get changes since last call
nextLink:  use to page through large initial sync
@removed:  indicates deleted item { "reason": "deleted" | "changed" }

Graph SDK Comparison

.NET SDK:
Install-Package Microsoft.Graph Azure.Identity
var client = new GraphServiceClient(credential);
var users = await client.Users.GetAsync(...);

JavaScript SDK:
npm install @microsoft/microsoft-graph-client @azure/msal-node
const client = Client.initWithMiddleware({ authProvider });
const result = await client.api('/users').select('displayName').get();

Python SDK:
pip install msgraph-sdk azure-identity
client = GraphServiceClient(credentials=credential, scopes=scopes)
users = await client.users.get()

PowerShell SDK:
Install-Module Microsoft.Graph
Connect-MgGraph -Scopes "User.Read.All"
Get-MgUser -Filter "department eq 'Finance'"

Top 10 Tips

  1. Single endpoint, one auth flow — Graph's key value proposition over pre-Graph era. https://graph.microsoft.com covers all M365 services. Demonstrate you understand this unification.

  2. v1.0 for production, beta never in production — beta APIs can change or be removed without notice. Always a key rule to state in any Graph API design discussion.

  3. Application permissions require admin consent — and give access to ALL tenant data in the granted scope. Always request minimum permissions (least privilege). This shows security awareness.

  4. Always use $select — the default response returns every property. $select dramatically reduces payload size and improves performance at scale. Mention this in every query discussion.

  5. Delta Query for sync, not full re-fetch — the correct answer to any "how do you sync M365 data to an external system" question. Initial full sync + incremental delta = efficient synchronisation.

  6. Change notifications over polling — webhooks push changes to your app in real time. Polling the API every minute is wasteful and gets throttled. Always recommend change notifications for event-driven scenarios.

  7. 202 Accepted immediately, then process async — webhook endpoints must respond within a few seconds. Never do synchronous processing in the webhook handler — queue the work and process it asynchronously.

  8. Retry-After header on 429 — always honour the Retry-After value from throttled responses. Implement exponential backoff. Ignoring throttle responses is a common production bug.

  9. Azure AD PowerShell is deprecated (March 2025) — AzureAD and MSOnline modules are dead. All admin scripts must use Microsoft.Graph PowerShell SDK. Knowing this migration is critical for admin.

  10. Graph Explorer is your best friend — being fluent in Graph Explorer demonstrates practical experience immediately. Know how to use the sample queries, permissions panel, and code snippet generation.



Microsoft 365 Security & Compliance (Purview) — Complete Guide

Sensitivity Labels · DLP · Retention · eDiscovery · Audit · Insider Risk · Zero Trust · Conditional Access · Scenarios · Cheat Sheet


Table of Contents

  1. Core Concepts — Basics
  2. Information Protection & Sensitivity Labels
  3. Data Loss Prevention (DLP)
  4. Data Lifecycle Management & Retention
  5. eDiscovery, Audit & Insider Risk
  6. Identity Protection & Zero Trust
  7. Scenario-Based Questions
  8. Cheat Sheet — Quick Reference

1. Core Concepts — Basics

What is Microsoft Purview and what does it cover?

Microsoft Purview is the unified data governance, risk, and compliance platform in Microsoft 365 — rebranded from Microsoft 365 Compliance Centre in 2022, consolidating Microsoft Information Protection (MIP) and Azure Purview under one brand.

Core capability areas:

Capability                  Description
Information Protection      Sensitivity labels, encryption, rights management
Data Loss Prevention        Detect and prevent oversharing of sensitive information
Data Lifecycle Management   Retention policies, retention labels, records management
eDiscovery                  Search, preserve, and export content for legal investigations
Audit                       Comprehensive activity logging across M365 services
Compliance Manager          Assess compliance posture against regulations (GDPR, ISO 27001, HIPAA)
Insider Risk Management     Detect risky user behaviour before data leaks occur
Communication Compliance    Monitor communications for policy violations

What are the key Microsoft 365 compliance licence tiers?

Feature E3 E5
Audit log retention 90 days 1 year (+10yr add-on)
eDiscovery Standard Premium (Advanced)
Insider Risk Management No Yes
Communication Compliance No Yes
Advanced DLP (Endpoint) Limited Full
Auto-labelling policies Limited Full
Customer Lockbox No Yes
MailItemsAccessed audit No Yes

Warning: Always clarify licence tier before designing a compliance solution. Insider Risk, Communication Compliance, and Advanced eDiscovery require E5 or the E5 Compliance add-on.


What is the Microsoft Purview compliance portal?

The Microsoft Purview compliance portal (compliance.microsoft.com) is the central management interface.

Key features:

  • Content Explorer: shows WHERE sensitive content lives (which sites, mailboxes, folders)
  • Activity Explorer: shows WHAT is happening to labelled content (label applied, removed, file shared externally)
  • Compliance Manager: compliance score dashboard with improvement actions per regulation
  • Solutions: access to all compliance tools — Information Protection, DLP, Records Management, eDiscovery, Audit

Tip: Content Explorer and Activity Explorer are powerful diagnostic tools — the "what" and "where" of your data estate.


2. Information Protection & Sensitivity Labels

What are Sensitivity Labels and how do they work?

Sensitivity Labels are metadata tags applied to emails, documents, meetings, and Microsoft 365 containers (Teams, SharePoint sites, M365 Groups) that define how content should be protected.

Typical label taxonomy:
Public              → no restrictions — safe for external sharing
General             → internal use, no encryption
Confidential
  ├── All Employees   → internal only, no encryption
  └── Specific People → encrypted, only named recipients open
Highly Confidential
  └── Restricted     → encrypted, watermarked, no forwarding/copy/print

What a label can enforce:
→ Encryption (Azure Rights Management): who can open, copy, print, forward
→ Content marking: header ("CONFIDENTIAL"), footer, watermark
→ Access control: prevent edit, copy, print, forward, screen capture
→ Container settings (Teams/SPO sites):
   Privacy: Public or Private (enforced)
   External sharing: blocked
   Unmanaged device access: browser-only or blocked
→ Auto-labelling: apply when sensitive info types detected

Label persistence:
→ Travels with the file wherever it goes
→ Emailed externally, stored in Dropbox, opened on personal device
→ Encryption follows the document — not the location

What is the difference between manual, recommended, and auto-labelling?

  1. Manual labelling: user selects from Sensitivity button in Office apps/Outlook/Teams. Relies on user judgment.

  2. Recommended labelling: client detects sensitive content → pops up recommendation. User can accept or dismiss.

  3. Mandatory labelling: user must select a label before saving or sending — forces deliberate classification.

  4. Auto-labelling (client-side): automatically applies label when sensitive content detected. No user prompt. Works in Office apps.

  5. Auto-labelling policies (service-side): scan content at rest in SharePoint, OneDrive, and Exchange. Applies labels as a background service — catches existing content without user interaction. Requires E5.

Tip: Service-side auto-labelling policies are the most powerful — they scan ALL existing content and label it centrally. The answer to "how do you label millions of existing documents."


What is Azure Rights Management (Azure RMS) and how does it underpin sensitivity labels?

Azure Rights Management is the cloud-based encryption and access control service that enforces sensitivity label protection.

How it works:
1. User applies "Confidential – Specific People" label to a document
2. AIP client encrypts the document with AES-256
3. A use licence is stored: who can access + what they can do
4. When recipient opens: their identity verified via Entra ID
5. If authorised: document decrypts in memory, permissions enforced
   If not authorised: document stays encrypted — cannot open

Permissions Azure RMS can enforce:
VIEW (open/read)        → can read only
EDIT                    → can edit but not copy/print
COPY                    → can copy content to clipboard
PRINT                   → can print
FORWARD                 → can forward (email)
REPLY                   → can reply
EXTRACT                 → can copy/paste content

Super Users:
→ A designated group that can decrypt ANY RMS-protected content
→ Used by eDiscovery administrators and compliance officers
→ Must be explicitly configured — no default super users

3. Data Loss Prevention (DLP)

What is Data Loss Prevention (DLP) in Microsoft Purview?

DLP policies detect and prevent sharing of sensitive information across Microsoft 365. They monitor content, match against sensitive information types or labels, and take protective actions.

DLP policy structure:

1. Locations (where policy applies):
   → Exchange (email)
   → SharePoint Online
   → OneDrive for Business
   → Microsoft Teams chat and channel messages
   → Endpoint devices (Windows/macOS via MDE)
   → Power Platform
   → Microsoft Defender for Cloud Apps

2. Conditions (what triggers the policy):
   → Content contains: SITs (Credit Card, SSN, NHS No., IBAN)
   → Content is labelled: Highly Confidential
   → Being shared: externally / with specific domains
   → Instance count: ≥ 3 credit card numbers

3. Actions (what happens):
   → Block the action (prevent send/share)
   → Block with override (user overrides with justification)
   → Restrict access (accessible to owner only)
   → Show policy tip to user
   → Alert admin / send incident report
   → Quarantine the email

What are Sensitive Information Types (SITs)?

SITs are pattern definitions used to detect sensitive data in content — the "what to look for" in DLP and auto-labelling policies.

Type Description
Built-in SITs 300+ pre-defined (Credit Card, SSN, NHS No., IBAN, Passport No.) — use regex + keyword proximity + checksum
Custom SITs Organisation-specific patterns (employee IDs, project codes) — regex + keyword lists + confidence levels
Trainable classifiers AI/ML models trained on document types ("Legal Contract", "HR Document", "Source Code")
Named entities ML-based detection of personal names, addresses, medical terms — context-aware
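Built-in SITs raise confidence with checksums: the credit-card SIT, for instance, validates candidates with the Luhn algorithm, rejecting strings that merely look like card numbers. A minimal sketch (illustrative only, not Purview's actual implementation):

```python
def luhn_valid(number: str) -> bool:
    """Luhn check as used by credit-card SITs: double every second digit
    from the right, subtract 9 from any result over 9, and require the
    total to be divisible by 10."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 13:   # card numbers are at least 13 digits
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```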
Custom SIT example — Employee ID:
Pattern: EMP-[0-9]{6}   (e.g., EMP-123456)
Supporting keywords: "employee", "staff ID", "badge number"
Confidence levels:
  High:   pattern match + keyword within 200 chars
  Medium: pattern match alone

Custom SIT use in DLP:
→ Condition: content contains "Employee ID" custom SIT
→ Action: block external sharing, alert HR compliance team
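The confidence-level logic above can be sketched in Python — a hypothetical re-implementation of the Employee ID SIT for illustration, not how Purview evaluates SITs internally:

```python
import re

# Hypothetical Employee ID SIT: pattern match alone = Medium confidence;
# pattern plus a supporting keyword within 200 characters = High confidence.
PATTERN = re.compile(r"EMP-[0-9]{6}")
KEYWORDS = ("employee", "staff id", "badge number")

def classify(text):
    """Return 'High', 'Medium', or None for a piece of content."""
    match = PATTERN.search(text)
    if not match:
        return None
    # Inspect a 200-character proximity window around the match.
    lo, hi = max(0, match.start() - 200), match.end() + 200
    window = text[lo:hi].lower()
    if any(k in window for k in KEYWORDS):
        return "High"
    return "Medium"
```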

What is Endpoint DLP?

Endpoint DLP extends Purview DLP policies to Windows 10/11 and macOS devices enrolled in Microsoft Defender for Endpoint (MDE).

Activities monitored and controlled:

  1. Copy to USB/removable media: block or audit
  2. Copy to network share: prevent unauthorised locations
  3. Upload to cloud services: block personal Dropbox, Google Drive, personal OneDrive
  4. Print: block or audit printing of sensitive files
  5. Clipboard copy: block copying content from sensitive documents
  6. Unallowed apps: prevent opening sensitive files in unauthorised apps

Warning: Endpoint DLP requires devices onboarded to Microsoft Defender for Endpoint. Devices not in MDE are unprotected — a common deployment gap.


4. Data Lifecycle Management & Retention

What are retention policies vs retention labels?

Retention policy: applied to a location (all SharePoint sites, all Exchange). Blunt instrument — retains or deletes everything in that location.

Retention label: applied to individual items. Provides item-level lifecycle management. Can declare items as records.

Retention policy (location-level):
→ Applied to: All SharePoint, All Exchange, All Teams
→ Setting: Retain 3 years, then delete
→ Effect: EVERYTHING in those locations retained for 3 years
→ Users cannot delete items during retention period

Retention label (item-level):
→ Applied to: specific documents, emails, or library items
→ Setting: "Contract" label → Retain 7 years, then disposition review
→ Effect: only labelled items have 7-year retention
→ Can declare records (immutable)

Priority rules:
1. Retain wins over delete
2. Longer retention wins over shorter
3. Explicit retention label wins over implicit retention policy

Example:
Retention policy: delete after 3 years
Retention label on item: retain for 7 years
→ RESULT: item kept for 7 years (label wins, longer wins)
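The three priority rules can be expressed as a small resolver — a sketch using an assumed (source, action, years) tuple shape, not any Purview API:

```python
def effective_retention(settings):
    """Resolve the winning retention from overlapping policies and labels.
    Each setting is (source, action, years), where source is 'label' or
    'policy' and action is 'retain' or 'delete'.  Rules: retain beats
    delete; longer beats shorter; explicit label beats implicit policy."""
    # Rule 1: if anything says retain, delete-only settings are out.
    retains = [s for s in settings if s[1] == "retain"]
    pool = retains or settings
    # Rules 2 and 3: longest duration wins; on a tie, label outranks policy.
    return max(pool, key=lambda s: (s[2], s[0] == "label"))
```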

Tip: The priority rule is a guaranteed question. "Preserve wins over delete, longer wins over shorter, label wins over policy."


What is Records Management and what is a regulatory record?

Record Type Description
Record Locked from editing/deletion during retention. Label CAN be removed by site owner. Most common.
Regulatory record Strictest — label cannot be removed, content cannot be edited or deleted by ANYONE (including Global Admins) during retention period. Requires admin opt-in.
Disposition review At end of retention period — reviewers must approve deletion or extend before auto-deletion. Full audit trail.
File plan Structured classification system mapping business functions to retention labels. Exportable for regulatory review.

Critical: Regulatory records are irreversible. Once declared, no one — including Global Admins — can delete the content until the retention period expires. Test thoroughly before enabling in production.


What is Preservation Lock?

Preservation Lock locks a retention policy so it cannot be turned off or weakened — even by Global Admins. It is permanent and irreversible.

What Preservation Lock prevents:
→ Cannot decrease the retention duration
→ Cannot disable the policy
→ Cannot remove locations

What is still allowed:
→ Can ADD more locations
→ Can EXTEND the retention period (only strengthening)

Use cases:
→ SEC Rule 17a-4(f) — financial services record immutability
→ FINRA, CFTC regulations
→ Any regulation requiring "policies cannot be circumvented by insiders"

Enable via PowerShell (not available in UI):
Set-RetentionCompliancePolicy -Identity "SEC Records Policy" `
  -RestrictiveRetention $true

Critical: Preservation Lock is permanent. If you lock a policy with incorrect settings, you cannot fix it. Always test retention policies thoroughly before locking.


5. eDiscovery, Audit & Insider Risk

What are the three levels of eDiscovery in Microsoft Purview?

Level Licence Key Features
Content Search Free Search across Exchange/SPO/OneDrive/Teams. Export. No case management, no holds.
eDiscovery (Standard) E3 Case-based. Add custodians, place holds, search within case, export.
eDiscovery (Premium) E5 Custodian management, legal hold notifications, review sets, predictive coding, redaction, chain-of-custody audit.
eDiscovery Standard workflow:
1. Create case: "Smith v Contoso 2025"
2. Add custodians: relevant employees
3. Place holds: preserve Exchange/SPO/OneDrive for custodians
4. Search: keyword + date + sender/recipient filters
5. Review: preview content, identify relevant items
6. Export: PST (email) or file format for legal review

Hold types:
Query-based hold: only items matching search query preserved
Full hold:        ALL custodian content preserved

eDiscovery Premium additions:
→ Legal hold notifications: formal notice to custodians via email workflow
→ Review sets: collect evidence, apply tags/annotations/redactions
→ Predictive coding: AI scores relevance — prioritise review
→ Export formats: native, PST, PDF with Bates numbering

What is Microsoft Purview Audit?

Purview Audit captures a comprehensive log of user and admin activities across Microsoft 365.

Audit tiers:
Standard (E3): 90-day retention, standard activities
Premium (E5):  1-year default (10-year add-on), high-value events

Key activities captured:
SharePoint: FileAccessed, FileSharingInvitationCreated,
            SitePermissionsModified, SensitivityLabelApplied
Exchange:   MailItemsAccessed (Premium only!), Send,
            SoftDelete, HardDelete, RecordDelete
Teams:      MeetingParticipantDetail, ChatCreated, MessageSent
Admin:      AddUser, ResetPassword, AssignRole, ChangePolicy

Search via portal:
Purview → Audit → New search
→ Filter: User, Activity type, Date range, Workload
→ Export to CSV for analysis

PowerShell:
Search-UnifiedAuditLog `
  -StartDate "2025-01-01" -EndDate "2025-01-31" `
  -Operations "FileAccessed" `
  -UserIds "user@contoso.com" `
  -ResultSize 5000

Tip: MailItemsAccessed (Premium audit only) shows which emails a compromised account READ — not just sent. Critical for breach investigations. This is the key E5 audit differentiator.


What is Insider Risk Management?

Insider Risk Management (IRM) uses Microsoft 365 signals to detect risky user behaviour patterns before they result in data leaks.

Built-in policy templates:

Template Detects
Data theft by departing employee Mass downloads, USB copies, sharing spikes in 90 days before/after last day
General data leaks Unusual uploads to personal cloud, email to personal accounts, USB copies
Data leaks by priority users Enhanced monitoring for executives, employees with elevated access
Security policy violations Repeated disabling of security software, bypassing controls
Patient data misuse Inappropriate access to patient records (Healthcare)

Warning: IRM requires HR and Legal involvement. Only compliance investigators see alerts — not regular managers. Involve your Data Protection Officer before deployment.


What is Communication Compliance?

Communication Compliance monitors internal and external communications for policy violations.

Policy types:
Offensive language / harassment  → detect threats, discrimination in Teams/email/Yammer
Financial regulatory compliance  → detect potential SEC/FINRA violations
Confidential info disclosure     → detect inappropriate sharing of business secrets
Conflict of interest             → detect undisclosed conflicts
Custom policies                  → keyword lists, SITs, trainable classifiers

Workflow:
Policy configured → communications captured and analysed
  → ML model scores each communication
  → High-confidence matches → alert for reviewer
  → Reviewer: Resolve / Escalate / Tag / Notify user
  → Full audit trail of all review actions

Requirements:
→ E5 or Communication Compliance add-on
→ Reviewers need Communication Compliance Analyst role
→ Privacy notice to users required in most jurisdictions (legal review)

6. Identity Protection & Zero Trust

What is Microsoft Defender for Office 365?

MDO protects against advanced email threats bypassing basic spam/malware filters.

Feature How It Works
Safe Attachments Opens attachments in a virtual sandbox (detonation) before delivery. Malicious → blocked. Adds 1-5 min delay.
Safe Links Replaces URLs with Microsoft-proxied links. Re-evaluates at time-of-click for malicious content.
Anti-phishing ML detects impersonation attacks, domain lookalikes, spear phishing, BEC.
Attack Simulation Training Send simulated phishing emails. Track who clicked. Auto-enrol clickers in security training.
Threat Explorer Investigate email threats — malicious emails received, delivery actions, users targeted.

Tip: Safe Attachments = sandboxing on delivery. Safe Links = time-of-click re-evaluation. Both come up frequently — keep the distinction straight.


What is Zero Trust and how does Microsoft implement it?

Zero Trust assumes breach and requires every access request to be explicitly authorised regardless of network location.

Three Zero Trust principles:

  1. Verify explicitly: authenticate and authorise based on all available signals — identity, location, device health, data classification, anomalies
  2. Use least privilege access: just-in-time, just-enough access; risk-based adaptive policies
  3. Assume breach: minimise blast radius, segment access, verify encryption, use analytics
Microsoft Zero Trust pillars:
Identity        → Entra ID, MFA, Conditional Access, PIM
Endpoints       → Intune, Defender for Endpoint, compliance policies
Applications    → Defender for Cloud Apps, app governance
Data            → Purview sensitivity labels, DLP, encryption
Infrastructure  → Defender for Cloud, RBAC, Azure Policy
Network         → Network segmentation, Defender for Identity, Azure Firewall

Zero Trust access evaluation example:
User accesses SharePoint:
1. Is identity verified? (MFA complete) → Yes
2. Is device compliant? (Intune managed, patches current) → Yes
3. Is location allowed? (not blocked by Conditional Access) → Yes
4. Is data appropriately labelled? (sensitivity label applied) → Yes
All checks passed → access granted with minimum permissions

What is Conditional Access?

Conditional Access is the Entra ID policy engine that evaluates signals and enforces access decisions — the "if/then" engine of Zero Trust.

# Conditional Access policy structure:
WHEN (Assignments):
  Users: All users / Specific groups / Guest users
  Apps:  All cloud apps / Specific apps (SharePoint, Teams)
  Conditions:
    Sign-in risk:    Low / Medium / High (Identity Protection)
    Device platform: Windows / iOS / Android
    Location:        Trusted IP ranges / Named locations / Countries
    Client apps:     Browser / Mobile apps / Legacy auth clients
    Device state:    Compliant / Hybrid joined / Unmanaged

THEN (Access controls):
  Block access
  OR Grant with conditions:
    Require MFA
    Require device compliance (Intune)
    Require hybrid Azure AD join
    Require approved client app
  Session controls:
    Sign-in frequency (re-auth every X hours)
    Persistent browser: No (close browser = sign out)
    App-enforced restrictions (browser-only on SharePoint)

Key policies to implement:
1. Require MFA for all users (exclude break-glass accounts)
2. Block legacy authentication (highest-impact single policy)
3. Require compliant device for Office 365 apps
4. Allow SharePoint from personal devices — browser only
5. Block access from high-risk sign-in locations
6. Require MFA for admin roles always

Tip: Blocking legacy authentication is the highest-impact single security action. Legacy auth (POP, IMAP, SMTP, basic auth) cannot support MFA — it's the primary vector for credential stuffing attacks.


What is Privileged Identity Management (PIM)?

PIM provides just-in-time (JIT) privileged access — users hold elevated roles only when needed, for limited time, with approval and audit trail.

PIM role assignment types:
Eligible: user CAN activate the role but is not permanently assigned
Active:   role is live — user has elevated permissions now
Permanent active: always active (break-glass accounts only)

Activation flow:
User requests activation → provides justification → selects duration
  → Approval required from designated approver(s)
  → MFA required at activation
  → Role active for requested duration (max configurable)
  → After duration: role auto-deactivates

Without PIM:
Global Admin → permanently assigned → compromised account
= attacker has Global Admin permanently

With PIM:
Global Admin eligible → user activates for 2 hours when needed
→ Justification: "Creating new app registration for Project X"
→ Approval: IT Security manager approves
→ After 2 hours: role expires automatically
→ Compromised account: attacker has NO elevated access

Key roles to manage in PIM:
Global Administrator, SharePoint Administrator, Exchange Administrator,
Teams Administrator, Security Administrator, Compliance Administrator

Tip: PIM is one of the most impactful Microsoft 365 security controls. Nobody should have permanent standing Global Admin access — always a key recommendation in any security assessment.


7. Scenario-Based Questions

Scenario: Design a data protection strategy for a financial services firm under GDPR.

  1. Sensitivity label taxonomy: Public / Internal / Confidential (All Staff) / Confidential (Finance) / Highly Confidential (PII). Highly Confidential enforces encryption, no external sharing, watermarking.

  2. Auto-labelling policies: scan all SharePoint, OneDrive, Exchange — auto-apply "Confidential (PII)" label when EU GDPR SITs detected (EU National IDs, credit cards, IBAN, health info).

  3. DLP policies:

    • Block external sharing of content containing personal data SITs
    • Prevent emailing EU personal data to non-approved domains
    • Endpoint DLP: block USB copy of PII-containing files
  4. Retention policies:

    • 6-year financial records retention (regulatory requirement)
    • Separate 3-year personal data retention (GDPR data minimisation)
    • Disposition review at end of period — human review before deletion
  5. Audit: Enable E5 Advanced Audit. 1-year audit log retention. Alerts for mass file downloads and unusual external sharing.

  6. Conditional Access: MFA enforced for all, compliant device required, block legacy auth, browser-only for financial apps on personal devices.

  7. Compliance Manager: track GDPR assessment score, document evidence of controls, work through improvement actions.


Scenario: Suspected data exfiltration by a departing employee. How do you investigate?

  1. Insider Risk Management: check IRM "departing employee" alerts for the user. Review risk timeline showing download spikes, USB activity, cloud upload patterns.

  2. Audit log search:

    Search-UnifiedAuditLog -StartDate "2025-01-01" -EndDate "2025-03-01" `
      -UserIds "employee@contoso.com" `
      -Operations "FileDownloaded,FileSyncDownloadedFull,Send,MailItemsAccessed"
    

    Look for: volume/timing of downloads, mass sync, emails to personal accounts.

  3. eDiscovery case: create Standard/Premium case. Place hold on employee's Exchange mailbox and OneDrive to preserve evidence immediately.

  4. Content search: search for company IP (product names, internal codenames) in the employee's outbound email to personal addresses.

  5. Endpoint DLP: review DLP incident reports for USB copy activity from the employee's device.

  6. Legal hold notification: if proceeding legally, issue formal hold via eDiscovery Premium custodian management.

  7. Preserve before offboarding: place the mailbox on hold (or under retention) before deleting the account, so it becomes an Inactive Mailbox on deletion. Apply a retention hold on OneDrive. Do NOT delete the account immediately.


Scenario: A user cannot delete a document they created. Why and how do you resolve it?

Diagnose in order:

  1. Retention label: open document in SharePoint → View properties → check for applied retention label. "Record" or active retention period = deletion blocked.
  2. Retention policy: Purview → Data Lifecycle Management → Retention policies → check if any policy covers the site with active retention.
  3. eDiscovery hold: check active eDiscovery cases — if document is in scope of a hold, deletion is blocked by the hold, not the policy.
  4. Sensitivity label permissions: check if the label restricts deletion to owners only.

Resolution:

  • Retention label: site owner or records manager can remove non-regulatory labels. Regulatory records: cannot be removed until retention period expires.
  • Retention policy: wait for expiry, or modify policy scope if appropriate.
  • eDiscovery hold: must be released by the eDiscovery case manager — cannot be bypassed by admin.

Scenario: Implement comprehensive email security for a 1,000-person organisation.

  1. Block legacy authentication (highest-impact single action): Conditional Access → block POP, IMAP, SMTP AUTH, basic auth for all users.

  2. Enforce MFA: Conditional Access → require MFA for all users, all cloud apps. Use Authenticator app (not SMS for sensitive roles).

  3. Enable Defender for Office 365 Plan 2:

    • Safe Attachments: all internal + external email
    • Safe Links: email + Teams + Office apps
    • Anti-phishing: impersonation protection for all executives
  4. Configure email authentication (DNS):

    SPF:   v=spf1 include:spf.protection.outlook.com -all
    DKIM:  Add DKIM signatures via Exchange Admin Centre
    DMARC: v=DMARC1; p=reject; rua=mailto:dmarc@contoso.com
    

    DMARC p=reject prevents spoofing of your domain.

  5. DLP on email: detect and block exfiltration of sensitive data (credit cards, PII, financial data) via email.

  6. Attack Simulation Training: quarterly phishing simulations. Auto-enrol clickers in security awareness training.

  7. Mailbox audit (E5): enable MailItemsAccessed for all sensitive mailboxes (executives, finance, HR).

  8. PIM for admin roles: no standing Exchange Admin or Global Admin. Activate via PIM with approval + justification.
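The DMARC record from step 4 is just semicolon-separated tag=value pairs; a small parser can sanity-check that p=reject is actually set before relying on spoofing protection:

```python
def parse_dmarc(record):
    """Parse a DMARC TXT record string into its tag=value pairs,
    e.g. to verify that the published policy (p=) is 'reject'."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if "=" in part:
            # Split on the first '=' only, since values like
            # mailto: URIs can contain no further '=' but other
            # tags' values might.
            key, _, value = part.partition("=")
            tags[key.strip()] = value.strip()
    return tags
```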


Scenario: How do you assess and improve your organisation's compliance posture?

  1. Compliance Manager: navigate to Purview → Compliance Manager. Review compliance score across regulations (GDPR, ISO 27001, NIST, HIPAA). Score is expressed as a percentage of controls implemented.

  2. Improvement actions: Compliance Manager lists specific improvement actions — each with description, implementation guidance, points value, and test status. Prioritise high-point, high-impact actions.

  3. Assessments: create regulation-specific assessments. Map Microsoft-managed controls (what Microsoft does) and customer-managed controls (what you must do).

  4. Evidence collection: for each customer-managed control, upload evidence (policies, screenshots, certificates). Compliance Manager stores evidence for audit.

  5. Regulatory templates: Compliance Manager includes 300+ pre-built templates for global regulations. Use the template for your specific regulation(s).

  6. Action tracking: assign improvement actions to team members with due dates. Track completion status. Compliance Manager integrates with Microsoft Secure Score.


8. Cheat Sheet — Quick Reference

Sensitivity Label Hierarchy

Public → General → Confidential → Highly Confidential → Regulatory

Each level adds more protection:
Public:            No restrictions
General:           Internal only, no encryption
Confidential:      May include encryption, content marking
Highly Confidential: Encryption required, no external sharing, watermark
Regulatory:        All above + immutable record declaration

Container label (Teams/SharePoint site):
→ Privacy enforcement (Public/Private)
→ External sharing restriction
→ Unmanaged device restriction
→ Documents created inherit the container label

DLP Policy Quick Reference

Locations:
Exchange (email)         SharePoint         OneDrive
Teams chat/channel       Endpoint devices   Power Platform

Conditions:
Contains SIT             Labelled as        Shared externally
Instance count ≥ N       Recipient domain   File extension

Actions (least to most restrictive):
Policy tip only          Notify + allow     Override with justification
Block with override      Block completely   Quarantine + alert admin

Priority of DLP policies:
Lower number = higher priority
First matching policy wins (unless "Stop processing more rules" disabled)

Retention Priority Rules

Rule 1: Retain wins over delete
  Retention policy says delete after 3 years
  Retention label says retain for 7 years
  → Content retained for 7 years (retain wins)

Rule 2: Longer retention wins over shorter
  Policy 1: retain 3 years
  Policy 2: retain 5 years
  → Content retained for 5 years (longer wins)

Rule 3: Explicit label wins over implicit policy
  Retention policy applies to entire SharePoint site
  Retention label applies to specific document
  → Label settings apply to that document (explicit wins)

Hold priority:
eDiscovery hold > Retention label > Retention policy
(Holds always win — preserve for legal proceedings)

eDiscovery Levels

Content Search (Free):
→ Search across all M365 locations
→ Export results
→ No holds, no case management

eDiscovery Standard (E3):
→ Case management
→ Custodian holds (preserve content)
→ Case-scoped searches and exports

eDiscovery Premium (E5):
→ All Standard features
→ Legal hold notifications to custodians
→ Review sets with tags, annotations, redactions
→ Predictive coding (AI relevance scoring)
→ Chain-of-custody audit trail
→ Multiple export formats with Bates numbering

Conditional Access Key Policies

Policy 1 — Require MFA for all users:
  Users: All   Apps: All cloud apps
  Grant: Require MFA
  Exclude: Break-glass accounts, service accounts

Policy 2 — Block legacy authentication (HIGHEST IMPACT):
  Users: All   Apps: All cloud apps
  Conditions: Client apps = Exchange ActiveSync + Other clients
  Grant: Block access

Policy 3 — Require compliant device:
  Users: All   Apps: Office 365
  Grant: Require device compliance (Intune)

Policy 4 — Browser-only for personal devices:
  Users: All   Apps: SharePoint / OneDrive
  Conditions: Device state = Unregistered
  Session: App-enforced restrictions (browser only, no download)

Policy 5 — Admin MFA always:
  Users: All admin roles
  Apps: All cloud apps
  Grant: Require MFA + Require compliant device

PIM Quick Reference

Role states:
Eligible  → can activate, not currently active
Active    → currently has the elevated permissions
Permanent → always active (break-glass accounts only)

Activation settings (configurable per role):
Max duration:          1 hour to 24 hours
Require justification: Yes (always recommended)
Require approval:      Yes for Global Admin, Security Admin
Require MFA:           Yes always
Notification:          Email to approvers + admin

Recommended roles to manage in PIM:
Global Administrator      Security Administrator
SharePoint Administrator  Exchange Administrator
Teams Administrator       Compliance Administrator
Billing Administrator     User Administrator

Compliance Score Components

Microsoft Purview Compliance Manager score:
Total score = Points achieved / Total points possible × 100

Point categories:
Microsoft-managed controls: ~50% (what Microsoft does for you)
Customer-managed controls:  ~50% (what you must configure)

Priority improvement actions (high points):
→ Enable MFA for all users
→ Enable audit log recording
→ Configure sensitivity labels
→ Enable DLP policies
→ Configure retention policies
→ Enable Safe Attachments and Safe Links
→ Block legacy authentication
→ Enable PIM for privileged roles
→ Configure DMARC, SPF, DKIM
→ Enable Endpoint DLP

Top 10 Tips

  1. Retention priority: preserve beats delete, longer beats shorter, label beats policy — the most tested retention rule in any compliance discussion. Know it by heart.

  2. Regulatory records are truly irreversible — once declared, no one including Global Admins can delete the content until retention expires. Always emphasise this for risk-aware recommendations.

  3. Block legacy authentication first — the highest single-impact security action. Legacy auth cannot support MFA and is the primary credential stuffing vector. Always recommend this before anything else.

  4. MailItemsAccessed requires E5 audit — this is the forensic differentiator. Shows which emails a compromised account READ. Knowing this detail separates candidates in breach investigation scenarios.

  5. Service-side auto-labelling covers existing content — client-side only works on content users open. Service-side scans ALL content in SharePoint/OneDrive/Exchange in the background. The answer to labelling millions of existing documents.

  6. PIM = no standing admin access — eligible assignments + JIT activation = minimal blast radius if accounts are compromised. Always recommend PIM over permanent admin roles.

  7. eDiscovery holds trump retention policies — a legal hold preserves content regardless of any retention policy configured to delete. Know this interaction for any litigation scenario.

  8. Preservation Lock is irreversible — only enable after thorough testing. Once locked, you cannot weaken the policy even as Global Admin. Regulatory requirement (SEC Rule 17a-4) is the primary use case.

  9. Endpoint DLP requires MDE onboarding — devices not in Microsoft Defender for Endpoint are not protected. Always check device onboarding coverage when designing Endpoint DLP.

  10. Compliance Manager score is actionable — it's not just a vanity metric. Each improvement action has specific guidance, evidence requirements, and point value. Walk through Compliance Manager to show you know how to systematically improve posture.



Microsoft Azure DevOps & GitHub Actions — Complete Guide

CI/CD · Pipelines · Workflows · IaC · Bicep · GitOps · Deployment Strategies · Security · Scenarios · Cheat Sheet


Table of Contents

  1. Core Concepts — Basics
  2. Azure DevOps — Deep Dive
  3. GitHub Actions — Deep Dive
  4. CI/CD Patterns & Deployment Strategies
  5. Infrastructure as Code with Bicep
  6. Security & Governance
  7. Scenario-Based Questions
  8. Cheat Sheet — Quick Reference

1. Core Concepts — Basics

What is DevOps and what are its core principles?

DevOps is a culture and set of practices combining development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver high-quality software continuously. It is a methodology, not a tool.

Core principles (CALMS):

  • Culture: shared responsibility between dev and ops — no "throw it over the wall"
  • Automation: automate everything repeatable — builds, tests, deployments, infrastructure
  • Lean: minimise waste, reduce batch sizes, deliver value continuously
  • Measurement: data-driven decisions — track DORA metrics
  • Sharing: knowledge sharing, transparency, collaboration

The four DORA metrics (essential knowledge):

  1. Deployment Frequency: how often code is deployed to production
  2. Lead Time for Changes: time from commit to production
  3. Change Failure Rate: % of deployments causing failures
  4. Mean Time to Recovery (MTTR): time to restore service after failure

Tip: DORA metrics come up in every DevOps maturity question. Memorise them — they're the universal benchmarks.


What is the difference between Azure DevOps and GitHub Actions?

               Azure DevOps                                GitHub Actions
Type           Enterprise DevOps suite (Boards, Repos,    Code hosting + CI/CD platform
               Pipelines, Test Plans, Artifacts)
Pipelines      YAML or Classic (UI-based, deprecating)    YAML only — event-driven workflows
Audience       Enterprise IT, regulated industries        Modern dev teams, open-source, GitHub-centric
Marketplace    Azure DevOps extensions                    GitHub Marketplace (massive ecosystem)
Strengths      Mature ALM, work tracking, compliance      Modern UX, marketplace, OSS ecosystem
Future         Maintained for existing customers          Microsoft's strategic direction

Tip: For new projects in 2025, GitHub Actions is the modern recommended choice. Azure DevOps remains right for enterprises with existing investment, complex compliance needs, or deep Microsoft tooling integration.


What is Continuous Integration (CI) vs Continuous Delivery (CD) vs Continuous Deployment?

Continuous Integration (CI):
→ Developers merge code into main multiple times per day
→ Each merge triggers automated build and test
→ Detects integration issues early

Continuous Delivery (CD):
→ Every change passing CI is automatically prepared for release
→ Deployment to production is a manual decision (approval gate)
→ Always production-ready

Continuous Deployment:
→ Every change passing CI/CD is automatically deployed to production
→ No manual intervention
→ Highest automation maturity — requires extensive automated testing + feature flags

Maturity progression:
Manual deploys → CI → CI/CD (Delivery) → CI/CD (Deployment)

Most enterprises stop at CD (Delivery) due to compliance and risk tolerance.
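The distinction shows up directly in pipeline YAML: a Delivery pipeline has a human gate before production, a Deployment pipeline does not. A minimal GitHub Actions sketch (environment and script names are illustrative):

```yaml
deploy-prod:
  needs: build
  runs-on: ubuntu-latest
  # With required reviewers configured on the 'production' environment,
  # this job pauses for approval: Continuous Delivery.
  # Remove the reviewers and every green build ships: Continuous Deployment.
  environment: production
  steps:
    - run: ./scripts/deploy.sh
```

The YAML is identical in both cases; the maturity level is set by the environment's protection rules, not the pipeline code.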

What is Infrastructure as Code (IaC)?

IaC manages and provisions infrastructure through code rather than manual processes. Infrastructure definitions are stored in source control alongside application code.

Why IaC matters:

  1. Reproducibility: spin up identical environments anywhere from the same code
  2. Version control: changes tracked in Git — see who, what, when, why
  3. Code review: infrastructure changes go through PR review
  4. Drift detection: detect manual changes and remediate
  5. Disaster recovery: rebuild environments in minutes, not days

Tool                  Best For
Bicep                 Azure-native, modern syntax, recommended for Azure
ARM templates         Original Azure IaC, JSON-based, complex syntax
Terraform             Multi-cloud (Azure, AWS, GCP), industry standard
Pulumi                IaC in real programming languages (TS, Python, C#)
Azure CLI/PowerShell  Imperative scripts, one-off automation

Tip: For Microsoft-only stacks, Bicep is the modern recommendation — no state file management, native Azure integration. For multi-cloud, Terraform.
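To make the comparison concrete, here is a minimal Bicep file, a hypothetical sketch (parameter and resource names are illustrative) of the declarative style the table refers to:

```bicep
// main.bicep: declares desired state; Azure reconciles on deployment
param location string = resourceGroup().location
param storageName string    // illustrative parameter name

resource stg 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageName
  location: location
  sku: { name: 'Standard_LRS' }
  kind: 'StorageV2'
}

output storageId string = stg.id
```

Deploying the same file twice is idempotent: the second run makes no changes, which is what enables drift detection and reproducible environments.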


2. Azure DevOps — Deep Dive

What are the five core services of Azure DevOps?

Service           Purpose
Azure Boards      Work tracking — backlogs, sprints, kanban, user stories, tasks, bugs
Azure Repos       Source control — Git (or legacy TFVC), branch policies, pull requests
Azure Pipelines   CI/CD — YAML or Classic pipelines, any language, any platform
Azure Test Plans  Manual and exploratory testing, test case management
Azure Artifacts   Package management — NuGet, npm, Maven, Python, Universal

Tip: Azure DevOps services are modular. Use Pipelines with GitHub-hosted code. Or Azure Boards with GitHub Repos. The integration is built-in.


What is the structure of a YAML pipeline in Azure DevOps?

# azure-pipelines.yml
trigger:                          # When to run
  branches:
    include: [main, develop]
  paths:
    include: [src/**]
    exclude: [docs/**]

pr:                                # PR validation
  branches:
    include: [main]

variables:
  buildConfiguration: 'Release'
  vmImage: 'ubuntu-latest'

stages:                            # Top-level workflow
  - stage: Build
    jobs:
      - job: BuildJob
        pool:
          vmImage: $(vmImage)
        steps:
          - task: UseDotNet@2
            inputs:
              version: '8.0.x'
          - script: dotnet build --configuration $(buildConfiguration)
          - task: DotNetCoreCLI@2
            inputs:
              command: 'test'
              projects: '**/*.Tests.csproj'
          - task: PublishBuildArtifacts@1
            inputs:
              PathtoPublish: '$(Build.ArtifactStagingDirectory)'

  - stage: DeployToTest
    dependsOn: Build
    condition: succeeded()
    jobs:
      - deployment: DeployToTest
        environment: 'Test'        # Triggers approvals/checks
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploy to Test environment"

Hierarchy: Pipeline → Stages → Jobs → Steps (Tasks)


What are agents and agent pools?

Agent: a service running pipeline jobs. Each agent runs one job at a time. Agent pool: a collection of agents.

Microsoft-hosted agents:
→ Microsoft maintains the VMs
→ Free tier: 1,800 minutes/month per organisation
→ Pre-installed software (Node, .NET, Python, Docker)
→ Disposable: fresh VM per job
→ vmImage: ubuntu-latest, windows-latest, macos-latest
→ Best for: standard builds with no infrastructure dependencies

Self-hosted agents:
→ Run on YOUR infrastructure (Windows, Linux, macOS, Docker, K8s)
→ Connect outbound to Azure DevOps
→ Persistent — software cache persists between jobs
→ Access to internal resources (private networks, on-prem databases)
→ No build minute limits
→ Best for: regulated industries, on-prem integrations, large monorepos

Configure pool:
pool:
  vmImage: 'ubuntu-latest'    # Microsoft-hosted

# OR:
pool: 'MyOnPremPool'          # Self-hosted pool name

What are Service Connections and why are they critical?

A Service Connection is a configured authentication credential used by pipelines to connect to external systems (Azure subscriptions, GitHub, Docker registries, Kubernetes, NuGet feeds).

Authentication methods (Azure ARM):

  • Service Principal (manual): manually create SP, paste secret
  • Service Principal (automatic): Azure DevOps creates the SP
  • Managed Identity: pipeline uses managed identity (more secure)
  • Workload Identity Federation (OIDC): NO secrets — uses short-lived tokens. Modern best practice.
# OIDC-based service connection (no secrets stored):
- task: AzureCLI@2
  inputs:
    azureSubscription: 'MyAzureSub-OIDC'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      az group list
# Token issued at runtime via federation — no client secret

Warning: Workload Identity Federation (OIDC) is the modern, secure best practice. Avoid storing service principal secrets — they require rotation and are credential theft targets.


What are Environments in Azure Pipelines and how do approvals work?

Environments represent deployment targets (Dev, Test, Prod) with approval gates and resource references.

- stage: Production
  jobs:
    - deployment: DeployProd
      environment: 'Production'   # Approval check fires here
      strategy:
        runOnce:
          deploy:
            steps:
              - script: ./deploy.sh

Approval and check types:

  1. Approvals: configure approvers (any one or all required)
  2. Branch control: only allow deployments from specific branches
  3. Business hours: restrict deployments to time windows
  4. Required template: deployment must use specific YAML template
  5. Exclusive lock: only one deployment to environment at a time
  6. Invoke REST API / Azure Function check: external validation gates

3. GitHub Actions — Deep Dive

What is the GitHub Actions workflow structure?

Workflows are YAML files in .github/workflows/, triggered by GitHub events.

# .github/workflows/ci.yml
name: CI Pipeline

on:                              # Trigger events
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'         # Nightly at 2 AM
  workflow_dispatch:             # Manual trigger button

env:
  NODE_VERSION: '20'

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Run tests
        run: npm test
      
      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/

  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production        # Triggers approval if configured
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: ./scripts/deploy.sh

Hierarchy: Workflow → Jobs → Steps


What are GitHub Actions, the Marketplace, and how do you use them?

An "Action" is a reusable unit of code published as a marketplace component.

# Common marketplace actions:
actions/checkout@v4              # Clone the repo
actions/setup-node@v4            # Install Node.js
actions/cache@v4                 # Cache dependencies
actions/upload-artifact@v4       # Persist build outputs

azure/login@v2                   # Authenticate to Azure
azure/webapps-deploy@v3          # Deploy to App Service
azure/arm-deploy@v2              # Deploy ARM/Bicep

docker/build-push-action@v5      # Build and push Docker images
docker/login-action@v3           # Login to Docker registries

github/codeql-action@v3          # Run CodeQL security scan
microsoft/playwright-github-action@v1  # Run Playwright tests

# Three action types:
# 1. JavaScript actions: Node.js code on the runner
# 2. Docker container actions: run inside a container
# 3. Composite actions: multiple steps as a reusable action

Critical: Always pin actions to a specific version tag (@v4) or SHA hash. Using @main or @latest is a security risk — the action's behaviour can change without warning.


What are GitHub-hosted vs self-hosted runners?

                        GitHub-hosted                       Self-hosted
Infrastructure          GitHub-managed VMs                  Your infrastructure
Free tier               2,000 min/month for private repos   No build minute limits
                        (public repos free)
Pre-installed software  Yes (Node, .NET, Python, Docker)    What you install
Persistence             Disposable (fresh VM per job)       Persistent (cache survives)
Network access          Public internet only                Internal/private networks
Best for                Standard builds                     Regulated industries, internal resources

runs-on: ubuntu-latest                # Hosted
runs-on: [self-hosted, linux, prod]   # Self-hosted with labels
runs-on: ubuntu-latest-16-cores       # Larger hosted runner

Warning: Self-hosted runners on PUBLIC repos are dangerous — anyone creating a PR can run code on your runner. Never expose self-hosted runners to public repos without strict controls.


What are GitHub Actions Environments?

Environments represent deployment targets with protection rules.

deploy-prod:
  needs: build
  runs-on: ubuntu-latest
  environment:
    name: production
    url: https://app.contoso.com    # Display URL
  steps:
    - name: Deploy to production
      run: ./deploy.sh
      env:
        API_KEY: ${{ secrets.PROD_API_KEY }}   # Environment secret

Environment protection rules:

  1. Required reviewers: up to 6 reviewers (any/all required)
  2. Wait timer: enforce delay before deployments proceed
  3. Deployment branches: restrict which branches can deploy
  4. Environment secrets: scoped to this environment only
  5. Environment variables: non-sensitive config per environment

What are reusable workflows vs composite actions?

Reusable workflows: complete workflows callable from other workflows. Used to share entire CI/CD logic across repos.

Composite actions: a series of steps bundled as a reusable action. Smaller scope — encapsulates a few related steps.

# .github/workflows/reusable-deploy.yml
name: Deploy
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
    secrets:
      AZURE_CREDENTIALS:
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - run: ./deploy.sh ${{ inputs.environment }}

# Calling reusable workflow:
# .github/workflows/main.yml
jobs:
  call-deploy:
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: production
    secrets:
      AZURE_CREDENTIALS: ${{ secrets.AZURE_CREDENTIALS }}
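For comparison, a composite action bundles a few related steps behind one `uses:` reference. A hedged sketch (file path and input names are illustrative):

```yaml
# .github/actions/setup-build/action.yml (hypothetical path)
name: 'Setup and Build'
description: 'Install Node, restore dependencies, build'
inputs:
  node-version:
    description: 'Node.js version to install'
    required: false
    default: '20'
runs:
  using: 'composite'
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: ${{ inputs.node-version }}
    - run: npm ci
      shell: bash        # 'shell' is mandatory on run steps in composite actions
    - run: npm run build
      shell: bash

# Consumed from any workflow in the repo:
#   - uses: ./.github/actions/setup-build
#     with:
#       node-version: '20'
```

Rule of thumb: composite action for a few shared steps inside a job; reusable workflow when you need to share whole jobs, environments, or secrets plumbing.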

4. CI/CD Patterns & Deployment Strategies

What deployment strategies are available?

Strategy       Description                                               Best For
Recreate       Stop old → deploy new (downtime)                          Non-critical apps
Rolling        Update instances in batches                               Stateless web apps
Blue-Green     Two identical environments, swap traffic                  Production with instant rollback
Canary         Deploy to small subset (5%→10%→50%→100%), monitor, scale  High-traffic production
Feature flags  Deploy code with features disabled, enable gradually      Decouple deployment from release

# Azure Pipelines deployment strategies — choose exactly ONE per deployment job:
strategy:
  rolling:
    maxParallel: 25%

# or:
strategy:
  canary:
    increments: [10, 20]

# or:
strategy:
  runOnce:
    deploy:
      steps: [...]

Tip: For high-traffic production services, combine Canary + Feature Flags. The canary catches infrastructure/integration issues; flags let you control feature exposure independently of deployment.


What is GitOps and how does it differ from traditional CI/CD?

GitOps is a declarative deployment paradigm where infrastructure and application desired state is in Git, and an automated agent continuously reconciles the live environment to match.

Traditional CI/CD (push model):
Developer → Git → CI builds → CD pushes to environment
→ Pipeline has direct access to deploy
→ Deploy actions are imperative (run scripts)
→ Drift between Git and environment possible

GitOps (pull model):
Developer → Git ← Agent in target environment pulls
→ Agent runs IN target environment (Flux, ArgoCD on Kubernetes)
→ Continuously reconciles state to match Git
→ No drift — agent enforces Git as source of truth
→ Roll back = revert Git commit → agent auto-rolls back

Key tools:

  • Flux: Kubernetes-native GitOps controller (CNCF graduate)
  • ArgoCD: Kubernetes-native GitOps with rich UI
  • Azure GitOps Configuration: Microsoft-managed Flux on AKS
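On AKS, Microsoft-managed Flux is attached via the CLI. A hedged sketch (resource group, cluster, and repo names are placeholders; requires the k8s-configuration CLI extension):

```shell
az k8s-configuration flux create \
  --resource-group rg-aks-prod \
  --cluster-name aks-prod \
  --cluster-type managedClusters \
  --name apps-gitops \
  --url https://github.com/contoso/gitops-repo \
  --branch main \
  --kustomization name=apps path=./clusters/prod prune=true
```

prune=true means resources removed from Git are also removed from the cluster, completing the reconciliation loop.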

What are pipeline templates and how do they enforce standards?

Templates allow centrally maintained YAML defining standard CI/CD logic referenced by other pipelines.

# templates/build-dotnet.yml (in central pipeline-templates repo)
parameters:
  - name: configuration
    type: string
    default: 'Release'
  - name: runTests
    type: boolean
    default: true

steps:
  - task: UseDotNet@2
    inputs:
      version: '8.0.x'
  - script: dotnet restore
  - script: dotnet build --configuration ${{ parameters.configuration }}
  - ${{ if eq(parameters.runTests, true) }}:
    - script: dotnet test --collect:"XPlat Code Coverage"
  - task: PublishCodeCoverageResults@2
    inputs:
      codeCoverageTool: 'Cobertura'
      summaryFileLocation: '**/coverage.cobertura.xml'

# Consumer pipeline:
resources:
  repositories:
    - repository: templates
      type: git
      name: PlatformTeam/pipeline-templates

steps:
  - template: templates/build-dotnet.yml@templates
    parameters:
      configuration: 'Release'
      runTests: true

Tip: Templates enforce: consistent build steps, mandatory security scans, code coverage thresholds, branding/notifications. The platform team maintains gold-standard pipelines.


5. Infrastructure as Code with Bicep

How do you implement Bicep in CI/CD?

The recommended pattern: Validate → What-If → Deploy.

# GitHub Actions example:
name: Deploy Azure Infrastructure

on:
  push:
    branches: [main]
    paths: ['infra/**']

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      
      - name: Validate Bicep
        uses: azure/arm-deploy@v2
        with:
          subscriptionId: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          resourceGroupName: rg-myapp-prod
          template: ./infra/main.bicep
          parameters: ./infra/main.bicepparam
          deploymentMode: Validate    # Validate only

  what-if:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Run What-If
        run: |
          az deployment group what-if \
            --resource-group rg-myapp-prod \
            --template-file ./infra/main.bicep \
            --parameters ./infra/main.bicepparam

  deploy:
    needs: [validate, what-if]
    runs-on: ubuntu-latest
    environment: production    # Manual approval here
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Deploy Bicep
        uses: azure/arm-deploy@v2
        with:
          subscriptionId: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          resourceGroupName: rg-myapp-prod
          template: ./infra/main.bicep
          parameters: ./infra/main.bicepparam

Tip: The Validate → What-If → Deploy pattern is the IaC best practice. Validate confirms syntax. What-If shows what will change. Deploy applies. Always show What-If output before requiring approval.


6. Security & Governance

How do you securely manage secrets?

  1. Variable groups (Azure DevOps) / Secrets (GitHub): encrypted at rest, masked in logs
  2. Azure Key Vault integration: best practice — store in Key Vault, reference from pipeline
  3. Workload Identity Federation (OIDC): no stored secrets — short-lived OIDC tokens
  4. Environment-scoped secrets: production secrets only accessible from production jobs
# GitHub OIDC to Azure (no secrets stored):
- name: Azure Login
  uses: azure/login@v2
  with:
    client-id: ${{ secrets.AZURE_CLIENT_ID }}      # App registration ID
    tenant-id: ${{ secrets.AZURE_TENANT_ID }}      # Tenant GUID
    subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
# No client secret! Token issued at run-time via federation
# Azure DevOps Key Vault integration:
- task: AzureKeyVault@2
  inputs:
    azureSubscription: 'MyServiceConnection'
    KeyVaultName: 'myKeyVault'
    SecretsFilter: 'database-password,api-key'

Critical: Never hardcode secrets in YAML files or commit to Git — even private repos. Secret scanners catch credentials in commit history. Use OIDC or Key Vault references.


What is GitHub Advanced Security?

Feature            Description
CodeQL             SAST — finds vulnerabilities in source code
Secret scanning    Detects committed secrets; push protection blocks commits
Dependabot         Identifies vulnerable dependencies, auto-creates update PRs
Security overview  Org-wide vulnerability dashboard
Custom queries     Define org-specific CodeQL security patterns

# Enable CodeQL in workflow:
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: javascript, python

- name: Perform CodeQL Analysis
  uses: github/codeql-action/analyze@v3

Tip: GitHub Advanced Security features are free for public repos; for private repos they are a paid add-on on top of a GitHub Enterprise plan.


What are branch protection rules?

Recommended branch protection for "main":
☐ Require pull request reviews before merging
   → Require X reviewers (typically 1-2)
   → Dismiss stale reviews when new commits pushed
   → Require review from code owners (CODEOWNERS file)

☐ Require status checks to pass
   → CI build must succeed
   → Tests must pass
   → Code coverage threshold met
   → Security scans clean

☐ Require branches to be up to date before merging
☐ Require signed commits (advanced)
☐ Require linear history (no merge commits)
☐ Restrict who can push to matching branches
☐ Do not allow bypassing the above settings (even for admins)

Critical: Always enable "Do not allow bypassing the above settings" for production branches. Without it, admins bypass all rules — defeating compliance.
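These rules can also be applied programmatically through the GitHub REST API. A hedged sketch using the gh CLI (owner/repo and status-check names are placeholders; the call requires admin rights on the repo):

```shell
cat > protection.json <<'EOF'
{
  "required_status_checks": { "strict": true, "contexts": ["ci/build"] },
  "enforce_admins": true,
  "required_pull_request_reviews": {
    "required_approving_review_count": 2,
    "dismiss_stale_reviews": true
  },
  "restrictions": null
}
EOF
gh api -X PUT repos/contoso/myapp/branches/main/protection --input protection.json
```

Note that "enforce_admins": true is the API equivalent of "Do not allow bypassing the above settings".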


What is the principle of least privilege in DevOps?

  1. Service Principal scopes: each pipeline SP has access ONLY to needed resources. Production SPs separate from non-prod.
  2. Environment-specific credentials: production secrets only accessible from production deployments
  3. RBAC on pipelines: who can edit vs who can run — separate roles
  4. Approval gates: production requires approval from someone other than committer (two-person rule)
  5. Read-only by default: GitHub Actions GITHUB_TOKEN read-only by default
  6. Restricted runners: production deployments use dedicated runners with prod-only network access
# GitHub Actions GITHUB_TOKEN permissions:
permissions:
  contents: read              # Default for most jobs
  pull-requests: write        # Only if PR commenting needed
  packages: write             # Only if pushing packages
# All other permissions implicitly denied

7. Scenario-Based Questions

Scenario: Design a CI/CD pipeline for a .NET microservice deployed to Azure App Service.

  1. Source control: GitHub or Azure Repos with branch protection on main (1 PR review required, status checks must pass)

  2. CI workflow (on push/PR):

    • Restore NuGet packages (with cache)
    • Build with dotnet build --configuration Release
    • Run unit tests with code coverage (min 80%)
    • Static analysis: SonarCloud or CodeQL
    • Container scan if building Docker image
    • Publish artifacts
  3. CD to Test (on merge to main):

    • Deploy Bicep IaC to test resource group
    • Deploy app to Test App Service
    • Run integration tests against Test environment
    • Run smoke tests
  4. CD to Production (manual trigger or tag-based):

    • Required reviewer approval (production environment)
    • Deploy to staging slot of Production App Service
    • Smoke test the staging slot
    • Slot swap → traffic moves to new version (zero downtime)
    • Run production smoke tests
    • On failures → automatic slot swap back (rollback)
  5. Authentication: GitHub OIDC to Azure (no secrets stored)

  6. Monitoring: Application Insights integrated, alert on failure rate spike post-deploy
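The slot-swap step in point 4 maps to a single CLI call; a hedged sketch with placeholder names:

```shell
# Deployment goes to the 'staging' slot first; the swap moves traffic atomically.
az webapp deployment slot swap \
  --resource-group rg-myapp-prod \
  --name myapp-prod \
  --slot staging \
  --target-slot production

# Rollback is the same command again: swapping back restores the previous version.
```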


Scenario: Migrate from Azure DevOps to GitHub Actions.

  1. Inventory current state: pipelines, variable groups, service connections, environments, agent pools, work items, repos
  2. Migrate code first: GitHub Importer migrates Azure Repos to GitHub (preserves history, branches, tags)
  3. Use GitHub Actions Importer: Microsoft CLI tool that converts Azure DevOps YAML to GitHub Actions automatically (~80% of conversion)
  4. Manual conversion:
    • Custom tasks → find GitHub Marketplace equivalents or write composite actions
    • Variable groups → repository or environment secrets
    • Service connections → GitHub OIDC federated credentials
    • Environments → GitHub Environments with protection rules
  5. Migrate Azure Boards to GitHub Issues + Projects: use Azure DevOps Migration Tools or Azure Boards GitHub action for bidirectional integration during transition
  6. Parallel run period: 4-8 weeks running both systems, validate Actions work correctly
  7. Decommission: archive Azure DevOps project (don't delete — keep audit history)

Scenario: Implement a multi-stage CI/CD pipeline with security gates.

Stage 1: Build & Unit Test
  → Compile, run unit tests, fail if coverage < 80%

Stage 2: Security Scans (parallel jobs)
  ├─ SAST: CodeQL or SonarCloud → fail on critical/high
  ├─ Secret scan: GitHub secret scanning
  ├─ License compliance: detect non-compliant licenses
  └─ Dependency vulnerabilities: Dependabot / Snyk
  → ALL must pass to proceed

Stage 3: Container/Package Build
  → Build Docker image
  → Container scan: Trivy, Defender for Containers
  → Sign image with Cosign
  → Push to ACR with proper tags

Stage 4: IaC Deploy to Test
  → Bicep what-if (review changes)
  → Apply IaC to test resource group
  → Run integration tests

Stage 5: DAST (Dynamic security testing)
  → OWASP ZAP scan on test environment
  → Penetration testing automation
  → Fail on critical findings

Stage 6: UAT (Manual approval)
  → Deploy to UAT environment
  → Business stakeholder approval gate

Stage 7: Production Deploy
  → 2-person approval (security + product owner)
  → Deploy to staging slot
  → Smoke tests
  → Blue/green slot swap
  → Post-deploy verification

Stage 8: Post-deploy
  → Send Teams notification
  → Update change management system
  → Monitor Application Insights for 30 minutes
  → Auto-rollback if error rate spikes
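Stage 3's container scan can be a single workflow step; a hedged sketch using the community Trivy action (image name and pinned version are illustrative):

```yaml
- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@0.24.0
  with:
    image-ref: myacr.azurecr.io/myapp:${{ github.sha }}
    severity: 'CRITICAL,HIGH'
    exit-code: '1'    # non-zero exit fails the stage on findings
```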

Scenario: A pipeline run failed and rolled back production. How do you investigate and prevent recurrence?

Immediate triage:

  1. Check pipeline logs — which step failed and what error?
  2. Check Application Insights / Log Analytics — what errors are users seeing?
  3. Confirm rollback was successful — production reverted to previous version

Root cause analysis:

  1. What changed in this deploy vs the last successful one? Diff the commits.
  2. Was it code, infrastructure, or configuration drift?
  3. Was there a database migration that didn't roll back cleanly?
  4. Were there environment-specific differences not caught in test?

Document findings: post-mortem in Confluence/Wiki — timeline, root cause, customer impact, resolution actions

Preventive actions:

  • Add a smoke test that catches this specific failure mode pre-prod
  • Improve test environment to match production more closely
  • Add alerting that catches the issue before users do
  • Update runbooks with rollback procedure for this scenario

Tip: Post-mortems should be blameless. Focus on systems and processes, not individual blame. Teams that punish failures get less honest reporting and more hidden incidents.


8. Cheat Sheet — Quick Reference

DORA Metrics

Deployment Frequency  → How often you deploy to production
Lead Time for Changes → Commit to production time
Change Failure Rate   → % of deployments causing failures
MTTR                  → Mean Time to Recovery from failure

Performance levels:
Elite:    Multiple deploys/day, < 1 hour lead time, < 15% failure, < 1 hour MTTR
High:     Daily deploys, < 1 day lead time, < 30% failure, < 1 day MTTR
Medium:   Weekly deploys, < 1 week lead time, < 30% failure, < 1 week MTTR
Low:      Monthly deploys, > 1 month lead time, > 30% failure, > 1 week MTTR

Azure DevOps Services

Boards    → Work tracking (User Stories, Tasks, Bugs, Features, Epics)
Repos     → Git source control + branch policies + pull requests
Pipelines → CI/CD (YAML or Classic — Classic deprecating)
Test Plans→ Manual/exploratory testing + test case management
Artifacts → NuGet, npm, Maven, Python, Universal package management

GitHub Actions Workflow Events

# Common triggers:
on:
  push:                          # Code pushed to repo
    branches: [main, develop]
    paths: ['src/**']
  
  pull_request:                  # PR opened/updated
    branches: [main]
    types: [opened, synchronize, reopened]
  
  schedule:
    - cron: '0 2 * * *'         # Cron schedule
  
  workflow_dispatch:             # Manual UI trigger
    inputs:
      environment:
        type: choice
        options: [dev, staging, prod]
  
  workflow_call:                 # Reusable workflow
  
  release:
    types: [published]           # Release events
  
  issues:
    types: [opened, labeled]     # Issue events

Deployment Strategies Comparison

Strategy      Downtime  Rollback   Cost      Best For
Recreate      Yes       Slow       Low       Non-critical
Rolling       No        Slow       Low       Stateless apps
Blue-Green    No        Instant    2× infra  Production w/ rollback
Canary        No        Fast       Low       High-traffic prod
Feature flags No        Instant    Low       All (combine w/ above)

OIDC Authentication Pattern

1. Create Azure AD app registration with federated credential
2. Configure subject: repo:org/repo:environment:production
3. Grant role to Service Principal (e.g., Contributor on RG)
4. In GitHub: store client-id, tenant-id, subscription-id as secrets
5. In workflow: use azure/login@v2 with id-token: write permission

Workflow snippet:
permissions:
  id-token: write   # Required for OIDC
  contents: read

steps:
  - uses: azure/login@v2
    with:
      client-id: ${{ secrets.AZURE_CLIENT_ID }}
      tenant-id: ${{ secrets.AZURE_TENANT_ID }}
      subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
  
  - run: az group list

# No client secret stored — token issued at runtime

Top 10 Tips

  1. OIDC over client secrets — Workload Identity Federation is the modern best practice. No stored credentials, short-lived tokens, no rotation. Mention this in any authentication question.
  2. Branch protection is non-negotiable — production branches must require PR reviews, status checks, and "do not allow bypassing" enabled even for admins. This is the compliance answer.
  3. Blameless post-mortems — focus on systems, not individuals. Teams that punish failures get hidden incidents. The cultural answer for any incident scenario.
  4. Validate → What-If → Deploy — the IaC pipeline pattern. Validate confirms syntax, What-If previews changes, Deploy applies. Always show What-If before approval.
  5. Know the DORA metrics — Deployment Frequency, Lead Time, Change Failure Rate, MTTR. Universal benchmarks, asked in every DevOps maturity discussion.
  6. GitOps for Kubernetes — Flux or ArgoCD with Git as source of truth. Pull-based reconciliation eliminates drift. The modern Kubernetes deployment answer.
  7. Pipeline templates enforce standards — central YAML maintained by platform team, consumed by all repos. Updates propagate automatically. The governance answer.
  8. Environment-scoped secrets — production secrets only accessible from production jobs. GitHub Environments and Azure DevOps environments enforce this. Critical for least-privilege.
  9. Pin actions to a specific version — a @v4 tag is the minimum; pinning to a full commit SHA is the gold standard. Never use @main or @latest: a supply chain attack vector.
  10. GitHub Advanced Security — CodeQL (SAST), secret scanning, Dependabot (dependency vulnerabilities). The integrated security answer for GitHub-based pipelines.

