GRC Integration Academy

Start Here: Your Lab Setup

Before we write code, let's set up your workspace. ~20 minutes.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Install everything needed for this course

→Create your project folder

→Verify every tool works

📋 FOLLOW ALONG

Install Python: go to python.org/downloads and run the installer. On Windows, check "Add Python to PATH". Forgetting this is the #1 beginner mistake.

After installing, open a terminal and type:

python --version

✅ WHAT YOU SHOULD SEE

Something like Python 3.12.4. Any 3.11+ works.

Install VS Code: download it from code.visualstudio.com. After installing, click Extensions (four squares icon), search "Python", and install the one by Microsoft.

Install Git: download it from git-scm.com. Git tracks every change to your code. Verify the install, then tell Git your name and email:

git --version
git config --global user.name "Your Name"
git config --global user.email "you@email.com"

Install Postman: download it from postman.com/downloads. Postman lets you test API calls visually, like a browser for APIs.

Install the requests library from your terminal:

pip install requests

✅ WHAT YOU SHOULD SEE

"Successfully installed requests-2.x.x"

Create your project folder and turn it into a Git repository:

mkdir grc-integration-portfolio
cd grc-integration-portfolio
git init

Open this folder in VS Code: File → Open Folder

✅ WHAT YOU SHOULD SEE

VS Code shows an empty project. You're ready to write code.

✅ LAB SETUP CHECKLIST

☐ Python 3.11+ installed and verified

☐ VS Code with Python extension

☐ Git installed and configured

☐ Postman installed

☐ requests library installed

☐ Project folder created and opened in VS Code

⚠️ COMMON MISTAKE: 'python' not recognized

Re-run the Python installer and check "Add Python to PATH". On Mac, try python3 --version instead.
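Once Python itself runs, the remaining checks can be bundled into one short script. The sketch below is an optional helper, not one of the course files; the function name check_lab_setup is made up for illustration, and it only checks for Python, Git, and requests.

```python
import importlib.util
import shutil
import sys


def check_lab_setup():
    """Return a dictionary of tool name -> True/False for this machine."""
    return {
        # sys.version_info compares like a tuple: (3, 12, 4) >= (3, 11) is True
        "python_3_11_plus": sys.version_info >= (3, 11),
        # shutil.which returns the path to a command, or None if it is not on PATH
        "git": shutil.which("git") is not None,
        # find_spec returns None when a library is not installed
        "requests": importlib.util.find_spec("requests") is not None,
    }


if __name__ == "__main__":
    for tool, ok in check_lab_setup().items():
        print(f"{tool}: {'OK' if ok else 'MISSING'}")
```

Any line that prints MISSING points back to the matching install step above.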

Lesson 1: Your First Python Script

Week 1 · Technical — We'll write real code together, one line at a time. No prior experience needed.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to create and run a Python file (from zero)

→Dictionaries — how every API sends you data

→How to safely read data without crashing your script

→How to filter a list to find only the important items

→How to check if data is valid before using it

→How to call a real API and get data from the internet

💼 WHY THIS MATTERS IN A REAL GRC JOB

Every GRC integration you'll ever build does exactly four things: (1) receive data from an API, (2) check if the data is valid, (3) transform it into a different format, and (4) send it somewhere else. That's it. Today you'll learn each of those four skills, one step at a time.

Before You Start

Make sure you've completed the Lab Setup lesson. You need:

✅ CHECKLIST — DO NOT SKIP

☐ Python installed (type python --version in your terminal — you should see Python 3.x.x)

☐ VS Code installed with the Python extension

☐ The requests library installed (type pip install requests)

☐ Your project folder open in VS Code

⚠️ IF "PYTHON" DOESN'T WORK IN YOUR TERMINAL

Windows: Try py --version instead of python --version. If neither works, reinstall Python and check "Add Python to PATH".

Mac: Try python3 --version. On Mac, you may need to use python3 and pip3 everywhere this lesson says python and pip.

Part 1: Create and Run Your First Python File

Let's start with the absolute basics. We're going to create a file, type one line of code, and run it.

🔧 DO THIS NOW — STEP 1

In VS Code, create a new file: File → New File. Save it as lesson1.py inside your project folder. The .py ending tells your computer "this is a Python file."

🔧 DO THIS NOW — STEP 2

Type this single line into your file:

📄 lesson1.py — your first line of code

print("Hello, GRC world!")

Let's break that down:

LINE-BY-LINE EXPLANATION

print( — This is a command that tells Python to display something on screen. Think of it as "show me this."

"Hello, GRC world!" — This is the text you want to display. Text in Python is always wrapped in quotation marks. Python calls text a "string".

) — Closes the print command.

🔧 DO THIS NOW — STEP 3: RUN IT

Open your terminal in VS Code: Terminal → New Terminal (or press Ctrl+`). Type this and press Enter:

python lesson1.py

✅ WHAT YOU SHOULD SEE

Hello, GRC world!

If you see that, congratulations — you just ran your first Python script. If you see an error, check the troubleshooting section below.

⚠️ WHAT TO DO IF THIS BREAKS

Error: "python is not recognized" → Try python3 lesson1.py instead. Or revisit Lab Setup.

Error: "No such file or directory" → Your terminal isn't in the right folder. Type cd grc-integration-portfolio first, then try again.

Error: "SyntaxError" → Check that you typed the line exactly. Common mistake: using the wrong type of quotation marks or forgetting the closing parenthesis.

Part 2: Storing Data in Variables

A variable is a name you give to a piece of data so you can use it later. Think of it like a labeled box — you put something in and can take it out whenever you need it.

🔧 DO THIS NOW

Replace everything in lesson1.py with this:

📄 lesson1.py — variables

# A variable stores data with a name
# The "#" symbol starts a comment; Python ignores these lines
# Comments are notes for humans reading the code

severity = "High"              # A string (text)
asset_name = "web-server-01"   # Another string
days_open = 45                 # A number (integer)
is_resolved = False            # A boolean (True or False)

print("Severity:", severity)
print("Asset:", asset_name)
print("Days open:", days_open)
print("Resolved?", is_resolved)

LINE-BY-LINE EXPLANATION

# A variable stores data... — Lines starting with # are comments. Python ignores them completely. They're notes for you.

severity = "High" — Creates a variable called severity and stores the text "High" in it. The = sign means "put this value into this name."

days_open = 45 — Stores a number. Notice: no quotation marks. Numbers don't need them.

is_resolved = False — A boolean — can only be True or False. Notice the capital letter. In Python, it must be True not true.

print("Severity:", severity) — The comma lets you print multiple things on one line. Python adds a space between them automatically.
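To see which kind of data a variable holds, you can ask Python directly with the built-in type(). A small sketch reusing three of the variables above:

```python
severity = "High"
days_open = 45
is_resolved = False

# type(...).__name__ gives the short name of the data type
print(type(severity).__name__)     # str  (text)
print(type(days_open).__name__)    # int  (whole number)
print(type(is_resolved).__name__)  # bool (True or False)

# Quotation marks change the type: "45" is text, 45 is a number
print("45" == 45)                  # False
```

This matters later because API data often arrives with numbers stored as text.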

Run it: python lesson1.py

✅ EXPECTED OUTPUT

Severity: High
Asset: web-server-01
Days open: 45
Resolved? False

Part 3: Dictionaries — How APIs Send You Data

This is the most important concept in this lesson. When your integration calls an API (a security scanner, a cloud service, a GRC platform), the data comes back as a dictionary.

A dictionary is a collection of labeled values. Think of it like a form: each field has a name (the "key") and a value.

🔧 DO THIS NOW

Replace your file with this:

📄 lesson1.py — your first dictionary

# A dictionary stores labeled data, like a form with fields
# It uses curly braces { } and each field is "key": value

finding = {
    "id": "VULN-2024-0847",          # The finding's unique ID
    "severity": "High",              # How bad is it?
    "cve": "CVE-2024-3094",          # The public vulnerability ID
    "asset": "web-server-prod-03",   # Which system is affected
    "status": "open"                 # Has it been fixed yet?
}

# Read a value by its key name: use square brackets [ ]
print("Finding ID:", finding["id"])
print("Severity:", finding["severity"])
print("Asset:", finding["asset"])

LINE-BY-LINE EXPLANATION

finding = { — Start a dictionary. The curly brace { means "here come the labeled values."

"id": "VULN-2024-0847", — One field. The key is "id", the value is "VULN-2024-0847". The colon : separates key from value. The comma , separates this field from the next one.

} — End of the dictionary.

finding["severity"] — Read the value stored under the key "severity". This gives you "High".

✅ EXPECTED OUTPUT

Finding ID: VULN-2024-0847
Severity: High
Asset: web-server-prod-03

🏢 WHY THIS MATTERS ON THE JOB

This dictionary looks exactly like what a real vulnerability scanner (Tenable, Qualys, AWS Security Hub) sends back when you ask it for findings. Every field — id, severity, asset — is data your integration will pull, validate, and load into a GRC platform. You're already working with realistic GRC data.
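Dictionaries can also be changed after you create them, which is how an integration updates a record it has already received. A short sketch using a trimmed copy of the finding above (the owner value is made up for illustration):

```python
finding = {
    "id": "VULN-2024-0847",
    "severity": "High",
    "status": "open",
}

# Change an existing value by assigning to its key
finding["status"] = "closed"

# Add a brand-new field the same way
finding["owner"] = "platform-team"  # hypothetical value

# .items() walks through every key/value pair
for key, value in finding.items():
    print(f"{key}: {value}")
```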

Part 4: Reading Data Safely (Without Crashing)

Here's a problem you'll hit immediately in real integrations: sometimes a field is missing. A scanner might not include a "remediation" field for every finding. If you try to read a field that doesn't exist, Python crashes.

🔧 TRY THIS — WATCH IT BREAK

Add this line to the bottom of your script and run it:

print(finding["remediation"]) # This field doesn't exist!

💥 WHAT HAPPENS

Python crashes with: KeyError: 'remediation'

This means "I looked for a key called 'remediation' but it doesn't exist in this dictionary." In a production integration running at 2 AM, this crash means your entire pipeline stops and no data gets loaded.

The fix: use .get() instead of brackets. It returns a safe default value when the key is missing:

📄 Safe access with .get()

# DANGEROUS: crashes if the key is missing
# print(finding["remediation"])   ← DON'T DO THIS

# SAFE: returns "Not specified" if the key is missing
remediation = finding.get("remediation", "Not specified")
print("Remediation:", remediation)

HOW .get() WORKS

finding.get("remediation", "Not specified")

→ First argument "remediation" — the key to look for

→ Second argument "Not specified" — what to return if the key is missing

→ If the key exists, you get its value. If not, you get the safe default.

✅ EXPECTED OUTPUT

Remediation: Not specified

❌ DANGEROUS — crashes if missing

finding["remediation"]
→ KeyError crash! Script dies.

✅ SAFE — returns default

finding.get("remediation", "N/A")
→ Returns "N/A" safely. Script continues.
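Two details of .get() are worth seeing side by side: without a second argument it returns None, and it only falls back to the default when the key is absent, not when the value is empty. A minimal sketch (the sample finding is made up):

```python
finding = {"id": "V-001", "severity": "High", "owner": ""}

print(finding.get("remediation"))         # None: key missing, no default given
print(finding.get("remediation", "N/A"))  # N/A: key missing, default used
print(finding.get("owner", "Unknown"))    # prints an empty line: key exists, value is ""

# "in" checks whether a key exists without reading it
print("owner" in finding)                 # True
print("remediation" in finding)           # False
```

The empty-string case is why validation usually checks the value itself and not just whether the key exists.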

🧠 QUICK CHECK

Your script runs finding["owner"] but there's no "owner" key. What happens?

Explanation: Using brackets [ ] on a missing key always crashes with KeyError. Use .get("owner", "Unknown") instead to return a safe default. In GRC integrations, API responses frequently have missing fields, so .get() is essential.

✏️ MINI EXERCISE 1

Add two more .get() calls to your script to safely read:

1. A field called "assigned_to" with a default of "Unassigned"

2. A field called "due_date" with a default of "No due date set"

Print both values.

assigned = finding.get("assigned_to", "Unassigned")
due = finding.get("due_date", "No due date set")
print("Assigned to:", assigned)
print("Due date:", due)

✅ EXPECTED OUTPUT

Assigned to: Unassigned
Due date: No due date set

Part 5: Lists — Working With Multiple Findings

A scanner doesn't return one finding — it returns hundreds. In Python, a list holds multiple items in order. A list of dictionaries is exactly what a real API returns.

🔧 DO THIS NOW

Create a new file called filter_findings.py:

📄 filter_findings.py

# A list of vulnerability findings (this is what a real API returns)
# Notice the square brackets [ ]: they mean "this is a list"
# Each item in the list is a dictionary { }

findings = [
    {"id": "V-001", "severity": "Critical", "asset": "db-prod-01"},
    {"id": "V-002", "severity": "Low", "asset": "web-dev-03"},
    {"id": "V-003", "severity": "High", "asset": "api-prod-02"},
    {"id": "V-004", "severity": "Medium", "asset": "web-prod-01"},
    {"id": "V-005", "severity": "Critical", "asset": "auth-prod-01"},
]

# How many findings total?
print("Total findings:", len(findings))

# Loop through each finding and print it
for f in findings:
    print(f" {f['id']} | {f['severity']} | {f['asset']}")

LINE-BY-LINE EXPLANATION

findings = [ — Start a list. Square brackets mean "this is a list of items."

{"id": "V-001", ...}, — Each item in the list is a dictionary. The comma at the end separates it from the next item.

len(findings) — len() counts how many items are in a list. Here it returns 5.

for f in findings: — A loop. This means "take each finding one at a time, call it f, and run the indented code below for each one." The colon : is required.

print(...) — This line is indented (4 spaces). Indentation tells Python "this code belongs to the loop above." Everything indented under for runs once for each item.

f" {f['id']} | {f['severity']}" — An f-string. The f before the quote means "I want to put variables inside this text." Anything in {curly braces} gets replaced with its value.

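✅ EXPECTED OUTPUT

Total findings: 5
 V-001 | Critical | db-prod-01
 V-002 | Low | web-dev-03
 V-003 | High | api-prod-02
 V-004 | Medium | web-prod-01
 V-005 | Critical | auth-prod-01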

Now let's filter to keep only the urgent findings (Critical and High):

🔧 ADD THIS to the bottom of filter_findings.py

# Filter: keep ONLY Critical and High severity findings
# This one line does what a 10-line loop would do
urgent = [f for f in findings if f["severity"] in ["Critical", "High"]]

print(f"\nUrgent findings: {len(urgent)}")
for f in urgent:
    print(f" ⚠ {f['id']} | {f['severity']} | {f['asset']}")

THAT FILTER LINE, EXPLAINED PIECE BY PIECE

urgent = [ — We're creating a new list called urgent

f for f in findings — Go through each finding (call it f)

if f["severity"] in ["Critical", "High"] — Only keep it if the severity is Critical or High

] — End of the filter

In plain English: "Give me every finding where the severity is Critical or High."

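✅ EXPECTED OUTPUT (added to previous output)

Urgent findings: 3
 ⚠ V-001 | Critical | db-prod-01
 ⚠ V-003 | High | api-prod-02
 ⚠ V-005 | Critical | auth-prod-01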

🏢 ON THE JOB, THIS LOOKS LIKE...

This exact pattern — "pull all findings, filter to Critical and High, process only those" — is what your vulnerability-to-POA&M integration does every night. Critical and High findings become POA&M items automatically. Medium and Low might get tracked differently or just monitored.
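If the one-line filter feels dense, it may help to see the longer loop it replaces. The sketch below builds the same list both ways and confirms they match, using the same five sample findings:

```python
findings = [
    {"id": "V-001", "severity": "Critical", "asset": "db-prod-01"},
    {"id": "V-002", "severity": "Low", "asset": "web-dev-03"},
    {"id": "V-003", "severity": "High", "asset": "api-prod-02"},
    {"id": "V-004", "severity": "Medium", "asset": "web-prod-01"},
    {"id": "V-005", "severity": "Critical", "asset": "auth-prod-01"},
]

# Long form: loop, test, append
urgent_loop = []
for f in findings:
    if f["severity"] in ["Critical", "High"]:
        urgent_loop.append(f)

# Short form: the one-line filter from this lesson
urgent = [f for f in findings if f["severity"] in ["Critical", "High"]]

print(urgent_loop == urgent)      # True
print([f["id"] for f in urgent])  # ['V-001', 'V-003', 'V-005']
```

Both versions are fine in production code; pick whichever you can read at a glance.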

✏️ MINI EXERCISE 2

Write a filter that creates a list called low_risk containing only "Low" and "Medium" findings. Print how many there are.

low_risk = [f for f in findings if f["severity"] in ["Low", "Medium"]]
print(f"Low risk: {len(low_risk)}")
# Should print: Low risk: 2

Part 6: Validation — Checking Data Before You Use It

In a real integration, you never load data into the GRC platform without checking it first. What if a finding is missing its severity? What if the asset field is blank? Loading garbage data corrupts dashboards and confuses compliance teams.

A validation function is code that checks each record before it's loaded. Think of it as a security guard at the door.

🔧 DO THIS NOW

Create a new file called validate.py:

📄 validate.py — your first validation function

def validate_finding(finding):
    """Check that a finding has all required fields."""
    errors = []  # Start with an empty list of errors

    # Check each required field
    for field in ["id", "severity", "asset"]:
        if not finding.get(field):
            errors.append(f"Missing required field: {field}")

    # Check severity is a valid value
    valid = ["Critical", "High", "Medium", "Low"]
    if finding.get("severity") not in valid:
        errors.append(f"Invalid severity: {finding.get('severity')}")

    # Return: is it valid? and what were the errors?
    if errors:
        return False, errors
    return True, ["Valid"]

# Test with good data
good = {"id": "V-001", "severity": "High", "asset": "web-01"}
ok, msgs = validate_finding(good)
print(f"Good finding: valid={ok}, messages={msgs}")

# Test with bad data: wrong severity AND missing asset
bad = {"id": "V-002", "severity": "Urgent"}
ok, msgs = validate_finding(bad)
print(f"Bad finding: valid={ok}, messages={msgs}")

LINE-BY-LINE EXPLANATION

def validate_finding(finding): — def means "define a function." A function is a reusable block of code you give a name to. finding in parentheses is the input — the data you want to check.

errors = [] — Create an empty list to collect any problems we find.

errors.append(...) — append adds an item to the end of a list. If we find a problem, we add a description of it to our errors list.

if not finding.get(field): — If the field is missing (returns None) or empty (returns ""), this is True.

return False, errors — Send back two things: whether it passed (False = failed) and the list of errors. The calling code receives both.

ok, msgs = validate_finding(good) — Call the function, and capture the two return values into two variables.

✅ EXPECTED OUTPUT

Good finding: valid=True, messages=['Valid']
Bad finding: valid=False, messages=['Missing required field: asset', 'Invalid severity: Urgent']

🧠 QUICK CHECK

A finding has {"id": "V-010", "severity": "High"} but no "asset" field. What does your validation function return?

Explanation: The function uses .get() to check for the "asset" field. Since it's missing, .get("asset") returns None, which is falsy, so "Missing required field: asset" gets added to the errors list. The function returns False with that error message.
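In a pipeline you would run the validator over every record and separate the good from the bad. Here is a sketch of that pattern. It repeats a compact validate_finding so the example runs on its own, and the sample records and the names valid_findings and rejected are made up for illustration.

```python
def validate_finding(finding):
    """Same checks as validate.py, shortened."""
    errors = []
    for field in ["id", "severity", "asset"]:
        if not finding.get(field):
            errors.append(f"Missing required field: {field}")
    if finding.get("severity") not in ["Critical", "High", "Medium", "Low"]:
        errors.append(f"Invalid severity: {finding.get('severity')}")
    if errors:
        return False, errors
    return True, ["Valid"]


incoming = [
    {"id": "V-001", "severity": "High", "asset": "web-01"},
    {"id": "V-002", "severity": "Urgent"},            # bad severity, no asset
    {"id": "V-003", "severity": "Low", "asset": ""},  # asset is blank
]

valid_findings = []  # safe to load
rejected = []        # kept aside with the reasons, for a human to review

for record in incoming:
    ok, msgs = validate_finding(record)
    if ok:
        valid_findings.append(record)
    else:
        rejected.append({"id": record.get("id"), "errors": msgs})

print(f"Loaded {len(valid_findings)}, rejected {len(rejected)}")
for r in rejected:
    print(f" {r['id']}: {r['errors']}")
```

Keeping the rejected records and their error messages, instead of silently dropping them, gives the compliance team something to follow up on.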

Part 7: Your First API Call

Everything so far used data you typed by hand. In a real integration, data comes from an API — an application programming interface. It's like a window into another system: you send a request, and data comes back.

Here's exactly what happens when your script calls an API:

📊 WHAT HAPPENS WHEN YOUR SCRIPT CALLS AN API

🔧 DO THIS NOW

Create a new file called first_api.py. Type exactly this:

📄 first_api.py — calling a real API

import requests  # Load the requests library (you installed this earlier)

# Call a free practice API (this returns fake blog posts)
url = "https://jsonplaceholder.typicode.com/posts"
response = requests.get(url)

# Did it work? Status 200 means "success"
print("Status code:", response.status_code)

# Convert the raw response into Python data
posts = response.json()

# How many records did we get?
print(f"Received {len(posts)} posts")

# Look at the first one
print("\nFirst post:")
print(f" ID: {posts[0]['id']}")
print(f" Title: {posts[0]['title'][:50]}...")

LINE-BY-LINE EXPLANATION

import requests — Load the requests library. import means "I want to use code someone else wrote." You installed this library with pip install requests.

url = "https://..." — The address of the API. Like a web address, but instead of a webpage, it returns data.

response = requests.get(url) — Send a GET request to the API. "GET" means "give me data." The server sends back a response.

response.status_code — A number the server sends back: 200 means "success, here's your data." 401 means "you're not authorized." 500 means "the server is broken."

posts = response.json() — Convert the raw text response into Python data (a list of dictionaries). Before .json(), the response is just text. After, it's data you can work with.

posts[0] — Get the first item from the list. Python counts from 0, not 1. So [0] is the first item, [1] is the second, etc.

['title'][:50] — Get the title, but only the first 50 characters. The [:50] is called "slicing" — it trims long text.

Run it: python first_api.py

✅ WHAT YOU SHOULD SEE

Status code: 200
Received 100 posts

First post:
 ID: 1
 Title: sunt aut facere repellat provident occaecati excep...

If you see "Status code: 200" — you just called a real API from Python! The 200 means the server said "here's your data, everything worked."

📊 WHAT THE RAW API RESPONSE LOOKS LIKE

Before .json(), the response is raw text that looks like this:

[
  {"userId": 1, "id": 1, "title": "sunt aut facere...", "body": "quia et suscipit..."},
  {"userId": 1, "id": 2, "title": "qui est esse...", "body": "est rerum tempore..."},
  ... 98 more ...
]

After .json(), Python turns that text into a list of dictionaries — the same data structure you've been working with all lesson. Each post becomes a dictionary you can access with brackets and .get().
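You can try the text-to-data step without touching the network. Python's built-in json module does roughly what response.json() does for you. The sketch below parses a shortened, hand-typed copy of the response; the raw_text variable is an assumption for illustration, not real API output.

```python
import json

# A string that merely LOOKS like a list of dictionaries
raw_text = '[{"userId": 1, "id": 1, "title": "sunt aut facere"}, {"userId": 1, "id": 2, "title": "qui est esse"}]'

print(type(raw_text).__name__)   # str: still just text

posts = json.loads(raw_text)     # text -> Python list of dictionaries
print(type(posts).__name__)      # list

print(posts[0]["title"])         # sunt aut facere
print(posts[1]["title"][:7])     # qui est  (slicing keeps the first 7 characters)
print(posts[0].get("body", "No body"))  # No body: this short sample has no "body" key
```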

⚠️ WHAT TO DO IF THIS BREAKS

"ModuleNotFoundError: No module named 'requests'" → You haven't installed the library yet. Run pip install requests in your terminal.

"ConnectionError" or "timeout" → You might not have internet access, or the URL might be wrong. Check your connection and make sure you typed the URL exactly.

Status code is not 200 → The API might be temporarily down. Wait a minute and try again.

🧠 QUICK CHECK

Your script calls an API and gets status code 200. What does this mean?

Explanation: Status 200 = "OK — success." The server processed your request and sent back data. Other codes you'll see often: 401 = authentication failed, 429 = rate limited (too many requests), 500 = server error.

✏️ MINI EXERCISE 3

Modify your first_api.py to also print the first post's userId and body (first 80 characters of body). Use posts[0]["userId"] and posts[0]["body"][:80].

print(f" User: {posts[0]['userId']}")
print(f" Body: {posts[0]['body'][:80]}...")

📝 LESSON RECAP — WHAT YOU LEARNED TODAY

✓Variables store data with a name: severity = "High"

✓Dictionaries store labeled fields: {"key": "value"} — every API response is one

✓.get() safely reads fields without crashing: finding.get("field", "default")

✓Lists hold multiple items: [item1, item2, item3]

✓Loops process each item: for f in findings:

✓Filters keep what you need: [f for f in findings if condition]

✓Validation functions check data before loading — never skip this

✓API calls: requests.get(url) → .json() → Python data you can work with

✓Status 200 = success. Codes in the 400s and 500s = something went wrong.

📂 FILES YOU SHOULD HAVE NOW

✓ lesson1.py — variables and dictionaries

✓ filter_findings.py — lists, loops, and filtering

✓ validate.py — validation function

✓ first_api.py — your first API call

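Commit them all, then push to GitHub: git add . && git commit -m "Lesson 1: Python fundamentals and first API call" && git push. The push only works once your folder is connected to a GitHub repository (a "remote"); until then, the commit still saves your work locally.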

🏢 WHAT COMES NEXT

You just learned the exact building blocks of every production GRC integration: dictionaries (API data), safe access (.get), filtering, validation, and API calls. Next lesson, you'll learn the compliance framework (RMF) that tells you WHY you're building these integrations and WHAT data the compliance team needs. The technical and GRC tracks run in parallel because the role requires both.

Lesson 2: The Risk Management Framework

Week 1 · GRC — The process that creates demand for everything you'll build. No prior compliance knowledge needed.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What "compliance" actually means — in plain English

→The 7-step RMF process and what happens at each step

→Where YOUR integration work fits in the lifecycle

→Key vocabulary you'll hear in every GRC conversation

Let's Start With: What Is "Compliance"?

Imagine a hospital needs to prove it keeps patient data safe. A government agency needs to prove its email system can't be hacked. "Compliance" means proving you follow the security rules. Not just saying you do — actually showing evidence.

Your job as an integration specialist is to automate that evidence collection. Instead of someone manually taking screenshots every quarter, your scripts pull data from security tools automatically, every single night.

What Is RMF?

RMF stands for Risk Management Framework. It's the 7-step process the U.S. federal government uses to decide: "Is this system secure enough to turn on?"

The goal is an ATO (Authority to Operate): a formal decision by a senior official saying "I understand the security risks of this system and I accept them. It's approved to operate." Without an ATO, a federal system cannot run.

📊 THE RMF LIFECYCLE — WHERE YOUR WORK LIVES

The Seven Steps — In Plain English

Let's walk through each step. For every step, I'll explain: what happens, and what YOUR integration work contributes.

Prepare — "Get organized"

What happens: Before anything else, the organization decides: Who owns this system? What's included in it? How much risk can we accept?

Your role: The system boundary (what's "in" the system) determines what your integrations track. The boundary is the line around the system being authorized: everything inside needs security controls, and your asset inventory must track exactly these assets. Wrong boundary = wrong inventory = wrong compliance data.

Categorize — "How bad would a breach be?"

What happens: Rate the system's impact: if it got hacked, how bad would it be? Each of three dimensions gets rated Low, Moderate, or High: confidentiality (can unauthorized people see the data?), integrity (can unauthorized people change the data?), and availability (can the system go down?).

Your role: A "Moderate" system needs ~300 security controls. A "High" system needs even more. This determines how much evidence your integrations must produce.
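A common convention, assumed here since this lesson does not spell it out, is that the system's overall level is the highest of its three ratings (often called the "high-water mark"). A sketch of that rule follows; the function name and sample ratings are hypothetical.

```python
LEVELS = ["Low", "Moderate", "High"]  # ordered from least to most severe


def overall_impact(confidentiality, integrity, availability):
    """Return the highest of the three ratings (high-water mark assumption)."""
    ratings = [confidentiality, integrity, availability]
    # LEVELS.index turns each rating into a position so max() can compare them
    return max(ratings, key=LEVELS.index)


print(overall_impact("Low", "Moderate", "Low"))  # Moderate
print(overall_impact("High", "Low", "Low"))      # High
```

Under this rule a single High rating is enough to put the whole system in the High tier.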

Select — "Pick the security rules"

What happens: Choose which controls (security rules) apply from the NIST 800-53 catalog. A control is a specific security requirement: "You must review audit logs weekly" is a control, and so is "You must disable accounts within 24 hours of termination." NIST 800-53 has ~1,000 of them. Some controls are inherited from shared infrastructure (like a cloud platform).

Your role: Understanding which controls are inherited tells you WHERE to pull evidence from. If logging is centralized, your integration pulls from the central system, not each individual application.

Implement — "Build the security"

What happens: Actually put the security controls in place and document how each one works in the SSP (System Security Plan). The SSP is the main document describing a system: what it is, what controls apply, and how each control is implemented. Think of it as the "owner's manual" for the system's security.

Your role: Your integrations can auto-populate parts of the SSP — like the asset inventory section. The integration itself becomes part of the control implementation ("we use an automated pipeline to continuously monitor vulnerabilities").

Assess — "Prove it works"

What happens: An assessor (a person, internal or third-party) tests whether the security controls actually work. They review evidence, interview people, and poke at the system looking for weaknesses.

Your role: Your integrations produce the evidence assessors review. System-generated reports with timestamps are MUCH stronger than manual screenshots. Assessors love automated evidence because it's harder to fake and proves continuous practice, not just point-in-time compliance.

Authorize — "The decision"

What happens: A senior official looks at the assessment results, the SSP, and the list of known weaknesses (the POA&M, or Plan of Action & Milestones) and makes a risk-based decision: approve, deny, or approve with conditions. The POA&M lists known security weaknesses and the plan to fix them; every unresolved finding becomes a POA&M item with a severity, owner, and due date. Your first integration project will automate this.

Your role: The quality and completeness of YOUR integration data directly influences this decision. If your vulnerability feed is missing findings, the executive is making a decision based on incomplete information.
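To connect this back to Lesson 1, here is a sketch of turning one unresolved finding into a POA&M-style record with a severity, owner, and due date. The field names, the default owner, and the days-to-fix numbers are all placeholders; real deadlines come from your organization's policy.

```python
from datetime import date, timedelta

# Hypothetical remediation windows in days, not an official standard
DAYS_TO_FIX = {"Critical": 15, "High": 30, "Medium": 90, "Low": 180}


def to_poam_item(finding, opened_on):
    """Build a POA&M-style dictionary from a finding dictionary."""
    severity = finding.get("severity", "Low")
    due = opened_on + timedelta(days=DAYS_TO_FIX.get(severity, 180))
    return {
        "weakness_id": finding.get("id", "UNKNOWN"),
        "severity": severity,
        "owner": finding.get("owner", "Unassigned"),
        "due_date": due.isoformat(),  # text like "2024-01-31"
        "status": "Open",
    }


item = to_poam_item({"id": "V-001", "severity": "High"}, date(2024, 1, 1))
print(item["owner"])     # Unassigned
print(item["due_date"])  # 2024-01-31
```

Notice that every field is read with .get(), so a finding with gaps still produces a complete record for the official to review.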

Monitor — "Keep watching forever"

What happens: After authorization, continuously track: are controls still working? Did anything change? Are there new vulnerabilities? This step never ends.

This is YOUR home step. Continuous monitoring is literally what GRC integration enables. Without your pipelines, monitoring is a manual quarterly exercise. With them, the GRC platform reflects reality every day.

🧠 QUICK CHECK

Your integration pulls vulnerability scan results every night and loads them into the GRC platform. Which RMF step does this primarily support?

Explanation: Nightly automated scanning = continuous monitoring (Step 7). While scan results also help during assessments (Step 5), the continuous and automated nature makes this primarily monitoring. This is the future of compliance: always-on, not point-in-time.

✏️ MINI EXERCISE

On paper or in a notes app, draw the 7 steps as a circle (they cycle continuously). For each step, write one sentence answering: "What data could an integration specialist provide at this step?"

Example for Step 7: "An integration that pulls vulnerability scan results nightly and loads them into the GRC platform for continuous tracking."

Key Vocabulary — Words You'll Hear Every Day

ATO

Authority to Operate — The formal approval to run a system. The goal of the entire RMF process.

SSP

System Security Plan — The document describing the system and how every control is implemented. Your integrations help keep it current.

POA&M

Plan of Action & Milestones — The list of known weaknesses. Your first integration project automates creating and closing these.

800-53

NIST SP 800-53 — The catalog of ~1,000 security controls. You don't memorize it; you navigate it.

🔧 DO THIS NOW

Go to the NIST website and download SP 800-37 Rev. 2 (it's a free PDF). You don't need to read the whole thing today — just read pages 1-15 (the introduction and overview). This is the official document behind everything you just learned.

📝 LESSON RECAP

✓Compliance = proving you follow security rules with evidence

✓RMF = 7-step process to authorize federal systems

✓ATO = the formal approval to operate — the goal

✓Step 7 (Monitor) is your home step — continuous monitoring depends on your integrations

✓Key docs: SSP (security plan), POA&M (weakness tracker), 800-53 (control catalog)

🏢 INTERVIEW TIP

"How does your work fit into RMF?" → "My integrations support continuous monitoring by automating evidence collection between security tools and the GRC platform, ensuring control effectiveness is tracked continuously, not just during annual assessments." That answer shows you understand both the technical work and the compliance purpose.

Lesson 3: REST APIs — How Systems Talk to Each Other

Week 2 · Technical — Every integration you build uses APIs. Let's demystify them completely.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What an API is — explained without jargon

→The 4 actions you can take: GET, POST, PUT, PATCH

→Status codes — how the server tells you what happened

→Pagination — what to do when there's too much data for one response

→Retry logic — making your script survive failures

What Is an API? (No Jargon Version)

You know how you type a web address into a browser and get a webpage? An API works the same way, except instead of a pretty webpage, you get raw data back. It's a way for your script to ask another system a question and get a structured answer.

When your integration calls the ServiceNow API, it's saying: "Give me all the open POA&M items." ServiceNow responds with a list of records in a format your script can read (JSON — the format you learned in Lesson 1).

📊 API CALL: WHAT ACTUALLY HAPPENS

The 4 Actions (HTTP Methods)

Every API call uses a "method" that tells the server what you want to do. Think of them as verbs:

GET

Read — "Give me data"

Example: Pull all open POA&M items

Safe to repeat — reading doesn't change anything

POST

Create — "Make a new record"

Example: Create a new finding from a scan

⚠ Repeating creates duplicates!

PUT

Replace — "Overwrite this record"

Example: Replace a stale inventory entry

PATCH

Update — "Change just these fields"

Example: Change a POA&M status to "Closed"
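The difference between PUT and PATCH is easiest to see on a plain dictionary. This sketch only imitates what a server would typically do with each method; no request is sent, and the record and function names are made up.

```python
record = {"id": "POAM-12", "status": "Open", "owner": "alice"}


def apply_put(existing, body):
    """PUT: the new body replaces the whole record."""
    return dict(body)


def apply_patch(existing, body):
    """PATCH: only the fields in the body change; the rest are kept."""
    return {**existing, **body}


print(apply_put(record, {"id": "POAM-12", "status": "Closed"}))
# {'id': 'POAM-12', 'status': 'Closed'}   <- owner is gone

print(apply_patch(record, {"status": "Closed"}))
# {'id': 'POAM-12', 'status': 'Closed', 'owner': 'alice'}
```

This is why PATCH is usually the safer choice for status updates: a PUT with a partial body can wipe fields you never meant to touch.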

⚠️ THE #1 INTEGRATION MISTAKE: POST DUPLICATES

If your script crashes halfway through and you re-run it, every POST call creates a second copy of the record. Imagine 500 vulnerability findings loaded twice — 1,000 items in the GRC platform, double the risk showing on dashboards, everyone panics. Solution: Always check if a record exists (GET) before creating it (POST). This is called "upsert logic" — you'll build it in Phase 2.
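The check-before-create idea can be sketched with a dictionary standing in for the GRC platform. Nothing here calls a real API: platform, upsert_finding, and the returned labels are hypothetical, and the dictionary lookup plays the role of the GET.

```python
platform = {}  # stand-in for records already in the GRC platform, keyed by ID


def upsert_finding(finding):
    """Create the record if it is new, otherwise update the existing one."""
    finding_id = finding["id"]
    if finding_id in platform:                 # the "GET": does it exist already?
        platform[finding_id].update(finding)   # the "PATCH": update in place
        return "updated"
    platform[finding_id] = dict(finding)       # the "POST": create it once
    return "created"


batch = [
    {"id": "V-001", "severity": "Critical"},
    {"id": "V-002", "severity": "Low"},
]

print([upsert_finding(f) for f in batch])  # ['created', 'created']

# Re-running the same batch (as after a crash) creates no duplicates
print([upsert_finding(f) for f in batch])  # ['updated', 'updated']
print(len(platform))                       # 2
```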

Status Codes — The Server's Answer

Every response includes a number telling you what happened. Memorize these five:

200

OK — It worked! Parse the data and continue. This is the response you want.

401

Unauthorized — Your login credentials are wrong or expired. Fix: refresh your authentication token and retry.

403

Forbidden — You're logged in but don't have permission for this action. Fix: check your service account's roles and permissions.

429

Rate Limited — You're sending too many requests too fast. Fix: wait, then retry with increasing pauses between attempts.

500

Server Error — The remote system is broken (not your fault). Fix: log the error, wait, retry. Alert someone if it keeps happening.
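These five codes map naturally onto a small decision function. A sketch, assuming your script only needs to choose what to do next; the function name and the action labels are made up.

```python
def next_action(status_code):
    """Translate a status code into the script's next step."""
    if status_code == 200:
        return "parse"              # success: read the data
    if status_code == 401:
        return "refresh_token"      # credentials wrong or expired
    if status_code == 403:
        return "check_permissions"  # logged in, but not allowed
    if status_code in (429, 500):
        return "wait_and_retry"     # temporary: back off, then try again
    return "stop_and_log"           # anything unexpected: don't guess


for code in [200, 401, 403, 429, 500, 404]:
    print(code, "->", next_action(code))
```

Keeping this logic in one function means every API call in your integration reacts to failures the same way.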

🧠 QUICK CHECK

Your integration gets status code 429. What should your script do?

Explanation: 429 = "you're sending too many requests." Retrying instantly makes it worse. Exponential backoff waits 1 second, then 2 seconds, then 4 seconds between retries — giving the server time to recover. This is standard practice for every production integration.

Pagination — Getting ALL the Data

APIs don't return 10,000 records at once. They return a "page" (e.g., 100 records) and you ask for the next page, and the next, until you have everything. It's like reading a book — you read one page at a time.

📄 Pagination — getting every record, page by page

import requests

# Start with an empty list to collect all records
all_records = []
offset = 0   # Start at the beginning
limit = 100  # Ask for 100 records per page

while True:  # Keep going until we break out
    # Ask for one page of records (url is the API endpoint you are calling)
    resp = requests.get(url, params={"offset": offset, "limit": limit})
    records = resp.json()["result"]

    # Add this page's records to our collection
    all_records.extend(records)

    # If we got fewer than 100, this was the last page
    if len(records) < limit:
        break  # Stop looping: we have everything

    offset += limit  # Move to the next page

print(f"Total records: {len(all_records)}")

LINE-BY-LINE EXPLANATION

while True: — "Keep doing the code below forever." We use break to stop when we're done.

params={"offset": offset, "limit": limit} — Tell the API "start at record #offset and give me #limit records." First loop: start at 0, get 100. Second: start at 100, get 100. And so on.

all_records.extend(records) — Add all records from this page to our master list. extend adds multiple items; append adds one item.

if len(records) < limit: break — If we got fewer than 100 records, we've reached the last page. break exits the loop.
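
The same loop can be wrapped in a function so you can test it without a real API. The sketch below assumes a hypothetical fetch_page(offset, limit) callable standing in for the requests.get call, and a fake in-memory "API" holding 250 records.

```python
def fetch_all(fetch_page, limit=100):
    """Collect every record by calling fetch_page(offset, limit) until a short page."""
    all_records = []
    offset = 0
    while True:
        records = fetch_page(offset, limit)
        all_records.extend(records)
        if len(records) < limit:   # Short (or empty) page means we are done
            break
        offset += limit
    return all_records


# A fake API holding 250 records, used only for this demonstration
FAKE_DATA = [{"id": n} for n in range(250)]
calls = []

def fake_fetch_page(offset, limit):
    calls.append(offset)           # Remember which offsets were requested
    return FAKE_DATA[offset:offset + limit]

result = fetch_all(fake_fetch_page)
print(f"Total records: {len(result)}")   # 250
print(f"Offsets requested: {calls}")     # [0, 100, 200]
```

If the total happens to be an exact multiple of the limit, the loop makes one extra request that returns an empty page. That is harmless, and it is how the loop learns there is nothing left.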

Retry with Exponential Backoff

APIs fail temporarily: network hiccups, server overload, rate limiting. Your script must survive this. Exponential backoff is a retry strategy where you wait longer after each failure: 1 second, then 2, then 4, then 8. This gives the struggling server time to recover instead of hammering it with retries.

📄 Retry logic — your script survives failures

import random  # For random.uniform(): adding randomness to wait times
import time    # For time.sleep(): pausing your script

import requests


def api_get_with_retry(url, headers, max_retries=5):
    """Try an API call up to max_retries times, waiting longer each time."""
    for attempt in range(max_retries):  # Try 0, 1, 2, 3, 4
        resp = requests.get(url, headers=headers)

        if resp.status_code == 200:  # Success!
            return resp.json()
        elif resp.status_code in [429, 500, 503]:  # Retryable errors
            wait = (2 ** attempt) + random.uniform(0, 1)
            # attempt 0: wait 1-2s. attempt 1: 2-3s. attempt 2: 4-5s.
            print(f"Retry {attempt+1}: waiting {wait:.1f}s...")
            time.sleep(wait)  # Pause before retrying
        else:
            # Non-retryable error (401, 403, etc.)
            resp.raise_for_status()  # Crash with error details

    raise Exception("Max retries exceeded")  # All attempts failed
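
To see how the waits grow without calling any API, you can compute the schedule on its own. This is a sketch under a few assumptions: the helper name backoff_delays is made up, and the cap (a maximum wait) is an extra safety idea that the function above does not have.

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0, rng=None):
    """Return the wait (in seconds) before each retry: base * 2^attempt plus jitter."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(max_retries):
        wait = base * (2 ** attempt) + rng.uniform(0, 1)  # 1s, 2s, 4s... plus jitter
        delays.append(min(wait, cap))                      # Never wait longer than cap
    return delays

delays = backoff_delays(rng=random.Random(42))  # Seeded so runs are repeatable
for attempt, wait in enumerate(delays, start=1):
    print(f"Retry {attempt}: wait {wait:.1f}s")
print(f"Total wait if every retry is needed: {sum(delays):.1f}s")
```

The random part (often called jitter) keeps many scripts that failed at the same moment from all retrying at the same moment.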

🔧 DO THIS NOW

Open Postman. Create a new request: set the method to GET, enter the URL https://jsonplaceholder.typicode.com/posts, and click Send. Look at three things: (1) the status code (should be 200), (2) the response body (JSON data), and (3) the response time.

📝 LESSON RECAP

✓An API is a way for your script to ask another system for data

✓GET reads, POST creates (careful — duplicates!), PUT replaces, PATCH updates

✓Status 200 = success, 401 = auth failed, 429 = too fast, 500 = server broken

✓Pagination loops through pages until all data is collected

✓Exponential backoff retries with 1s, 2s, 4s, 8s waits

Lesson 4: Your First Controls — AC & AU

Week 2 · GRC — The specific security rules your integrations produce evidence for.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What a "control" is — in plain English

→AC-2: Account Management — who has access to what?

→AU-6: Audit Log Review — is anyone watching the logs?

→How to think about each control: "what data proves it works?"

What Is a "Control"?

A control is a specific security rule. Not vague like "be secure" — specific like "disable user accounts within 24 hours of an employee leaving the company." NIST 800-53 has about 1,000 of these rules organized into 20 families (groups), each identified by a two-letter code.

For every control, your job is to answer one question: "What data proves this control is working, and how do I pull it automatically?"

AC-2: Account Management

IN PLAIN ENGLISH

The organization must manage user accounts properly: create them correctly, review them regularly, disable them when people leave, and give extra scrutiny to accounts with special privileges (admin accounts).

📊 AC-2: WHERE THE DATA FLOWS

WHAT AN ASSESSOR CHECKS FOR AC-2

• Account list: A current list of all user accounts with their roles and permissions

• Access reviews: Evidence that someone reviews accounts quarterly — with dates

• Stale accounts: Accounts inactive 90+ days identified and disabled

• Terminated users: Proof accounts are removed when employees leave

• Privileged accounts: Admin accounts inventoried and justified
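
As a taste of what "stale accounts" evidence looks like in code, here is a minimal sketch. The account records, field names (username, enabled, last_login), and the function name are hypothetical; a real integration would pull this data from your identity provider's API.

```python
from datetime import date, timedelta

def find_stale_accounts(accounts, today, max_inactive_days=90):
    """Return enabled accounts whose last login was 90+ days ago (or never)."""
    cutoff = today - timedelta(days=max_inactive_days)
    stale = []
    for acct in accounts:
        if not acct.get("enabled", False):
            continue                      # Already disabled: nothing to flag
        last_login = acct.get("last_login")
        if last_login is None or last_login <= cutoff:
            stale.append(acct["username"])
    return stale

# Hypothetical sample data for illustration only
accounts = [
    {"username": "asmith", "enabled": True, "last_login": date(2025, 5, 28)},
    {"username": "bjones", "enabled": True, "last_login": date(2025, 1, 15)},
    {"username": "svc-backup", "enabled": True, "last_login": None},
    {"username": "cdavis", "enabled": False, "last_login": date(2024, 11, 2)},
]

print(find_stale_accounts(accounts, today=date(2025, 6, 1)))
# ['bjones', 'svc-backup']
```

Passing today in as a parameter, instead of calling date.today() inside the function, makes the result repeatable when you test it.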

AU-6: Audit Record Review

IN PLAIN ENGLISH

The organization must review security logs regularly, looking for suspicious activity — and act on what they find. Not just "we have logs" — someone must actually look at them and respond to problems.

WHAT AN ASSESSOR CHECKS FOR AU-6

• Log coverage: Dashboard showing which systems send logs to the SIEM

• Alert evidence: Proof that alerts are generated when anomalies are detected

• Review timestamps: Proof someone reviews logs regularly (not just "we do it")

• Response process: Documentation of what happens when a problem is found

The pattern for every control: Each control says "you must do X." Your job: (1) Which system already has the data that proves X is happening? (2) Does that system have an API? (3) How do I pull the proof automatically and deliver it to the GRC platform? If you can answer these three questions, you can design the integration.

🧠 QUICK CHECK

For AC-2 evidence, which system would your integration most likely pull data from?

Explanation: AC-2 is about managing accounts. The identity provider (Entra ID, Okta, Active Directory) is the authoritative source for user accounts, group memberships, roles, and login activity. Your integration pulls this data and delivers it to the GRC platform for access reviews.

✏️ MINI EXERCISE

Create a table (on paper or in a spreadsheet) with these columns: Control ID | Control Name | Source System | Data Type | How Often. Fill in two rows — one for AC-2 and one for AU-6.

Control	Name	Source	Data	Frequency
AC-2	Account Mgmt	Entra ID	Users, roles, last login	Weekly
AU-6	Audit Review	Splunk (SIEM)	Alert summaries, log coverage	Daily

This "control-to-integration mapping" table is a real artifact you'll create on the job for every system you integrate.
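
If you would rather keep the mapping in your project folder than on paper, the same table can be stored as Python data and exported as CSV. This is a sketch, not a required format: the column keys and the function name mapping_to_csv are my own choices.

```python
import csv
import io

mapping = [
    {"control_id": "AC-2", "control_name": "Account Mgmt", "source_system": "Entra ID",
     "data_type": "Users, roles, last login", "frequency": "Weekly"},
    {"control_id": "AU-6", "control_name": "Audit Review", "source_system": "Splunk (SIEM)",
     "data_type": "Alert summaries, log coverage", "frequency": "Daily"},
]

def mapping_to_csv(rows):
    """Render the mapping rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

print(mapping_to_csv(mapping))
```

Values that contain commas, such as "Users, roles, last login", are quoted automatically by the csv module, so the file opens cleanly in a spreadsheet.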

📝 LESSON RECAP

✓A control is a specific security rule from NIST 800-53

✓AC-2 (Account Management) → pull from identity providers → access reviews

✓AU-6 (Audit Review) → pull from SIEMs → log coverage and review evidence

✓For every control: what data proves it? what system has it? how do I pull it?

Lesson 5: JSON and Git — Data Format & Change Control

Week 3 · Technical — The language APIs speak and the system that tracks every change you make.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to read and write JSON — the format every API uses

→How to navigate nested JSON (data inside data)

→How to save data to a file and read it back

→Git basics — tracking every change for compliance (CM-3)

Part 1: JSON — What Every API Speaks

In Lesson 1, you worked with Python dictionaries. JSON looks almost identical — because Python dictionaries ARE how Python represents JSON data. When an API sends you data, it arrives as JSON text. When you call .json(), Python converts that text into dictionaries and lists you can work with.

📄 A GRC system record in JSON — annotated

{
  "system_name": "HR Portal",     ← text (string)
  "impact_level": "Moderate",     ← determines which controls apply
  "is_cloud": true,               ← true/false (boolean)
  "open_poams": 12,               ← number
  "owner": null,                  ← null means "no value" (None in Python)
  "controls": [                   ← a LIST of objects (nested data!)
    {"id": "AC-2", "status": "Implemented", "evidence": "Entra ID"},
    {"id": "AU-6", "status": "Partial", "evidence": null}
  ]
}

NAVIGATING NESTED JSON — STEP BY STEP

system["system_name"] → "HR Portal" — simple, top-level access

system["controls"] → gets the entire list of control objects

system["controls"][0] → gets the FIRST control (Python counts from 0)

system["controls"][0]["id"] → "AC-2" — the first control's ID

system["controls"][1]["evidence"] → null (None) — AU-6 has no automated evidence yet!
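
You can try these lookups yourself without an API. The sketch below uses json.loads() to turn a JSON text string into Python objects, which is roughly what happens when you call .json() on a response. The record is a trimmed copy of the HR Portal example.

```python
import json

raw_text = """
{
  "system_name": "HR Portal",
  "is_cloud": true,
  "owner": null,
  "controls": [
    {"id": "AC-2", "status": "Implemented", "evidence": "Entra ID"},
    {"id": "AU-6", "status": "Partial", "evidence": null}
  ]
}
"""

system = json.loads(raw_text)   # JSON text -> Python dictionary

print(system["system_name"])              # HR Portal
print(system["is_cloud"])                 # True  (JSON true becomes Python True)
print(system["owner"])                    # None  (JSON null becomes Python None)
print(system["controls"][0]["id"])        # AC-2
print(system["controls"][1]["evidence"])  # None

# Looping is usually safer than hardcoding index numbers
for control in system["controls"]:
    print(control["id"], "->", control.get("evidence") or "no evidence yet")
```

The loop at the end prints "AC-2 -> Entra ID" and "AU-6 -> no evidence yet", and it keeps working when the list grows or changes order.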

🔧 DO THIS NOW

Create json_practice.py — find controls that are missing automated evidence:

📄 json_practice.py

import json  # Python's built-in JSON library

system = {
    "system_name": "HR Portal",
    "controls": [
        {"id": "AC-2", "evidence": "Entra ID"},
        {"id": "AU-6", "evidence": None},  # No evidence yet!
        {"id": "RA-5", "evidence": "Tenable"},
        {"id": "CM-8", "evidence": None},  # No evidence yet!
    ]
}

# Find controls WITHOUT automated evidence
gaps = [c["id"] for c in system["controls"] if not c.get("evidence")]
print(f"Controls needing integration: {gaps}")

# Save to a JSON file
with open("system_report.json", "w") as f:
    json.dump(system, f, indent=2)  # indent=2 makes it readable
print("Saved to system_report.json")

✅ EXPECTED OUTPUT

Controls needing integration: ['AU-6', 'CM-8']
Saved to system_report.json

Check your project folder — you should see a new file system_report.json. Open it in VS Code to see the formatted JSON.
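
Reading the data back is the mirror image: json.load() takes an open file and returns Python objects. The sketch below writes a small report to a temporary folder (an assumption made so the example never touches your real project files) and then loads it again.

```python
import json
import os
import tempfile

system = {
    "system_name": "HR Portal",
    "controls": [
        {"id": "AC-2", "evidence": "Entra ID"},
        {"id": "AU-6", "evidence": None},
    ],
}

# Write to a temporary folder so this example does not touch your project files
path = os.path.join(tempfile.mkdtemp(), "system_report.json")
with open(path, "w") as f:
    json.dump(system, f, indent=2)

# Read it back: json.load() turns the file's JSON text into Python objects again
with open(path) as f:
    loaded = json.load(f)

print(loaded["system_name"])   # HR Portal
print(loaded == system)        # True: nothing was lost in the round trip
```

To read the file your own script created, replace path with "system_report.json".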

Part 2: Git — Your Change Tracking System

In regulated environments, you must track every change to your code: who changed it, when, and why. Git does this automatically. It also directly supports CM-3 (Configuration Change Control), the NIST 800-53 control requiring a formal process for proposing, approving, and tracking changes. Your Git commit history IS CM-3 evidence.

🔧 DO THIS NOW — Save your work to Git

In your terminal, run these commands one at a time:

git add .                # Stage ALL files for saving
git commit -m "Lesson 5: JSON practice and system report export"
git push origin main     # Upload to GitHub

WHAT THOSE COMMANDS DO

git add . — "Mark all changed files as ready to save." The dot . means "everything in this folder."

git commit -m "..." — "Save a snapshot of all staged files with this description." The -m flag means "here's my message."

git push origin main — "Upload my saved snapshots to GitHub."

❌ BAD COMMIT MESSAGE

git commit -m "fixed stuff"
An auditor learns nothing from this.

✅ GOOD COMMIT MESSAGE

git commit -m "Add JSON export for system controls with evidence gap detection"
Clear, specific, auditable.

📝 LESSON RECAP

✓JSON uses {} for objects and [] for lists — every API speaks this

✓Navigate nested data: system["controls"][0]["id"]

✓json.dump() writes to files; json.load() reads from files

✓Git tracks every change — add → commit → push

✓Good commit messages are compliance evidence (CM-3)

Lesson 6: The Controls You'll Automate Most

Week 3 · GRC — CM-8, RA-5, and SI-4: the three controls behind your most common integrations.

🎯 WHAT YOU'RE ABOUT TO LEARN

→CM-8: Why knowing what you have is the foundation of everything

→RA-5: The vulnerability pipeline — your first real project

→SI-4: How SIEM data proves you're monitoring for threats

In Lesson 4, you learned AC-2 and AU-6. Now let's cover the three controls that drive the most integration work. For each one, I'll explain: what it requires, where the data comes from, what YOU build, and what an assessor wants to see.

CM-8: System Component Inventory

In plain English: You must know exactly what's in your system — every server, database, application, and cloud resource. And the list must be current, not a year-old spreadsheet.

Source systems: CMDBs (ServiceNow CMDB), cloud APIs (AWS Config, Azure Resource Graph), asset discovery and endpoint management tools (Axonius, Tanium)

Your integration: Pull asset lists from cloud APIs and CMDBs into the GRC platform. Reconcile: does every cloud asset appear in the CMDB? Flag anything missing.

Assessor wants: A current, complete inventory that's refreshed automatically — not manually maintained.
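
The reconcile step is, at its core, a comparison of two lists of IDs. Here is a minimal sketch using Python sets. The asset IDs and the function name reconcile are made up for illustration; real IDs would come from the cloud API and the CMDB.

```python
def reconcile(cloud_assets, cmdb_assets):
    """Compare two lists of asset IDs and report what each side is missing."""
    cloud = set(cloud_assets)
    cmdb = set(cmdb_assets)
    return {
        "missing_from_cmdb": sorted(cloud - cmdb),   # Running in cloud, not inventoried
        "missing_from_cloud": sorted(cmdb - cloud),  # Inventoried, but not found in cloud
        "matched": sorted(cloud & cmdb),
    }

# Hypothetical asset IDs for illustration
cloud_assets = ["i-0a1", "i-0b2", "i-0c3", "db-prod-1"]
cmdb_assets = ["i-0a1", "i-0b2", "db-prod-1", "i-0old"]

report = reconcile(cloud_assets, cmdb_assets)
print("Missing from CMDB:", report["missing_from_cmdb"])    # ['i-0c3']
print("Stale CMDB entries:", report["missing_from_cloud"])  # ['i-0old']
```

Both directions matter: an asset missing from the CMDB is an inventory gap, and a CMDB entry with no matching cloud asset is probably out of date.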

RA-5: Vulnerability Monitoring and Scanning

In plain English: Scan your systems for security weaknesses regularly. Track every finding. Fix them within required timeframes. Prove you did it.

Source systems: Vulnerability scanners (Tenable, Qualys, Rapid7), cloud security (AWS Security Hub)

Your integration: Pull scan results → validate → create POA&M items → track remediation → auto-close when verified fixed. This is your first real integration project in Phase 2.

Assessor wants: Proof that scans run on schedule, all findings are tracked to closure, and the whole process is documented and automated.

SI-4: System Monitoring

In plain English: Actively watch your systems for attacks, unauthorized connections, and suspicious behavior. Don't just hope nothing bad happens — monitor for it.

Source systems: SIEMs (Splunk, Microsoft Sentinel, Elastic), EDR tools (CrowdStrike, Defender)

Your integration: Pull SIEM alert summaries and monitoring metrics into the GRC platform. Build dashboards showing monitoring coverage and response times.

Assessor wants: Evidence that monitoring is active, alerts are generated, and someone is responding to them.

🧠 QUICK CHECK

Which control does the vulnerability-to-POA&M pipeline support?

Explanation: RA-5 requires vulnerability scanning with findings tracked to remediation. The vulnerability-to-POA&M pipeline — pulling scan results, creating tracking items, monitoring fixes, closing when resolved — directly automates RA-5 compliance. This is the most common GRC integration.

🔧 DO THIS NOW

Add CM-8, RA-5, and SI-4 to the control mapping table you started in Lesson 4. You now have 5 controls mapped — this is a real deliverable that demonstrates to employers you understand the control-to-integration connection.

📝 LESSON RECAP

✓CM-8 → CMDBs and cloud APIs → asset inventory is the foundation

✓RA-5 → vulnerability scanners → your first real integration project

✓SI-4 → SIEMs → proves active threat monitoring

✓For every control: What data? What system? What API? How often?

Lesson 7: OAuth & Your First GRC Platform

Week 4 · Technical — How integrations log in, and your free lab environment.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How OAuth 2.0 works — in plain English

→Why credentials must NEVER be in your code

→Set up a free ServiceNow lab environment

→Make your first API call to a real GRC platform

How Does Your Script "Log In" to an API?

When you log into a website, you type a username and password. When your integration script connects to an API, it needs to prove its identity too. OAuth 2.0 is the standard way to do this for automated integrations. With the OAuth 2.0 client credentials method, built for machine-to-machine communication, your script uses a client ID and client secret to get a short-lived access token, then uses that token for API calls. The token expires (usually in 1 hour), forcing regular re-authentication, which is more secure than a static password.

The 4-Step Flow

Register your integration — Create an "app registration" in the target platform. You receive a client ID (like a username) and a client secret (like a password).

Request a token — Your script sends the ID and secret to the platform's login endpoint. The platform validates them and returns a short-lived access token (usually expires in 1 hour).

Use the token — Your script includes the token in every API call: Authorization: Bearer <token>

Handle expiration — When the token expires, your script requests a new one. Good integrations do this proactively.
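
The "use the token" and "handle expiration" steps can be sketched as a small cache that asks for a new token shortly before the old one runs out. Everything here is illustrative: the TokenCache class, the 60-second refresh margin, and the fake_fetch_token stand-in are assumptions, and no real platform is contacted.

```python
import time

class TokenCache:
    """Hold an access token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, refresh_margin=60, clock=time.time):
        self.fetch_token = fetch_token        # Callable returning (token, lifetime_seconds)
        self.refresh_margin = refresh_margin  # Refresh this many seconds early
        self.clock = clock
        self.token = None
        self.expires_at = 0.0

    def get(self):
        """Return a valid token, requesting a new one only when needed."""
        if self.token is None or self.clock() >= self.expires_at - self.refresh_margin:
            self.token, lifetime = self.fetch_token()
            self.expires_at = self.clock() + lifetime
        return self.token

    def auth_header(self):
        """Build the header that goes on every API call."""
        return {"Authorization": f"Bearer {self.get()}"}


# Stand-in for "Request a token": a real version would send the client ID and
# secret to the platform's token endpoint and read the token from the response.
issued = []

def fake_fetch_token():
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], 3600   # Token plus a 1-hour lifetime

cache = TokenCache(fake_fetch_token)
print(cache.auth_header())   # {'Authorization': 'Bearer token-1'}
print(cache.auth_header())   # Same token reused, no second request
print(f"Tokens requested: {len(issued)}")   # 1
```

Refreshing a little early avoids the awkward case where a token expires in the middle of a long run.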

The #1 Security Rule: Never Hardcode Credentials

❌ NEVER DO THIS

password = "MyS3cret123!"
If you push this to GitHub, the entire internet has your password.

✅ ALWAYS DO THIS

import os
password = os.environ.get("SNOW_PWD")
Credential lives in the environment, not your code.

🔧 SET YOUR ENVIRONMENT VARIABLE

Windows (Command Prompt): set SNOW_PWD=your-password-here

Windows (PowerShell): $env:SNOW_PWD = "your-password-here"

Mac/Linux (Terminal): export SNOW_PWD=your-password-here
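
A forgotten environment variable is easy to miss, because os.environ.get() quietly returns None. One option is a small helper that stops right away with a clear message. This is a sketch; the name require_env and the DEMO_ variables are invented for the example.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or stop with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set. Set it, then re-run.")
    return value

# Demonstration only: set a throwaway value so the example runs anywhere
os.environ["DEMO_SECRET"] = "not-a-real-password"
os.environ.pop("DEMO_MISSING_VARIABLE", None)

secret = require_env("DEMO_SECRET")
print("Secret loaded, length:", len(secret))   # Print the length, never the secret

try:
    require_env("DEMO_MISSING_VARIABLE")
except RuntimeError as err:
    print(err)
```

In your own scripts you would call require_env("SNOW_PWD") once at the top, so a missing credential fails immediately instead of showing up later as a confusing 401.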

Set Up Your Free ServiceNow Lab

📋 FOLLOW ALONG

Go to developer.servicenow.com and create a free account. This is 100% free — no credit card needed.

After signing up, click "Start Building" or navigate to your instances. Request a Personal Developer Instance (PDI). It takes 2-5 minutes. You'll get a URL like dev12345.service-now.com and admin credentials.

Create snow_test.py and run it:

import os

import requests

instance = "dev12345"  # Replace with YOUR instance name
url = f"https://{instance}.service-now.com/api/now/table/incident"

resp = requests.get(
    url,
    auth=("admin", os.environ.get("SNOW_PWD", "your-password")),
    params={"sysparm_limit": 3},
    headers={"Accept": "application/json"}
)

print("Status:", resp.status_code)
for inc in resp.json()["result"]:
    print(f" {inc['number']}: {inc['short_description']}")

✅ EXPECTED

Status: 200, followed by a few sample incidents from your PDI.

⚠️ TROUBLESHOOTING

Status 401: Wrong username or password. Double-check your PDI credentials.

ConnectionError: Check the instance URL — it should be devXXXXX.service-now.com (no https:// in the instance variable if you're using the f-string format shown above).

PDI is "hibernating": Free instances sleep after inactivity. Go to developer.servicenow.com and wake it up.

📝 LESSON RECAP

✓OAuth: ID+secret → token → use token → handle expiration

✓NEVER hardcode credentials — use os.environ.get()

✓ServiceNow PDI is free at developer.servicenow.com

✓The Table API: /api/now/table/{table_name}

✓This PDI is your lab for the rest of the course

Lesson 8: FISMA, SSPs & POA&Ms

Week 4 · GRC — The laws that create demand and the documents your integrations feed.

🎯 WHAT YOU'RE ABOUT TO LEARN

→FISMA and FedRAMP — why your job exists

→What an SSP contains and which parts you automate

→The POA&M lifecycle — the exact workflow your first integration automates

Why Does GRC Integration Work Exist?

FISMA

Federal Information Security Modernization Act

The U.S. law that says: every federal agency must manage cybersecurity risk. FISMA is WHY RMF exists, why NIST 800-53 matters, why agencies buy GRC platforms, and ultimately why YOU have a job.

FedRAMP

Federal Risk and Authorization Management Program

Applies RMF to cloud services. If AWS or Azure want federal government customers, they must get FedRAMP authorized. FedRAMP requires rigorous continuous monitoring — creating massive demand for exactly what you build: automated evidence collection pipelines.

The POA&M — Your First Integration Target

A POA&M (Plan of Action & Milestones) tracks every known security weakness: what it is, how severe, who's responsible for fixing it, and when it's due. Every unresolved finding becomes a POA&M item. Think of it as a to-do list for security fixes, but formal and audited. Your first real integration project automates this lifecycle:

Discovery — Scanner finds a vulnerability.
Your integration detects new findings in the scanner's API.

Creation — A POA&M item is created with all details: what's wrong, how bad, which system, who owns it, when it's due.
Your integration creates this record automatically via API.

Tracking — The responsible team works on fixing it.
Human action — but your integration tracks status.

Remediation — The fix is applied (patch, config change, etc.).
Human action.

Verification — Scanner re-scans and the finding is gone.
Your integration detects the finding no longer appears.

Closure — POA&M item is closed with evidence.
Your integration updates the status and adds verification evidence.

This is your first integration: Scanner finds vulnerability → your integration creates POA&M item. Scanner confirms fix → your integration closes the POA&M item. This is the most common and most valuable GRC integration. You'll build it in Phase 2.

⚠️ THE AUTO-CLOSE DEBATE

Should your integration automatically close POA&M items when the scanner says they're fixed? This is a policy decision, not a technical one. Some organizations allow it. Others require a human to review before closure. Safe default: move items to "Pending Verification" and let a human close them. Automate more once trust is established.
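
The safe default can be written as a tiny decision function with the policy as a switch. This is a sketch: the function name is made up, the state names follow the ones used in this lesson, and reopening an item when a finding comes back is my assumption about what most teams would want.

```python
def next_poam_state(current_state, finding_still_present, auto_close_allowed=False):
    """Decide a POA&M item's next state after a re-scan."""
    if finding_still_present:
        return "Open"                   # Still vulnerable (or vulnerable again)
    if current_state == "Closed":
        return "Closed"                 # Already closed: nothing to change
    if auto_close_allowed:
        return "Closed"                 # Policy permits closing without review
    return "Pending Verification"       # Safe default: a human confirms the fix

print(next_poam_state("Open", finding_still_present=True))
print(next_poam_state("Open", finding_still_present=False))
print(next_poam_state("Open", finding_still_present=False, auto_close_allowed=True))
```

The three calls print Open, Pending Verification, and Closed. Keeping auto_close_allowed as a parameter means the policy can change later without rewriting the logic.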

🧠 QUICK CHECK

Your integration should auto-close every POA&M when the scanner says it's fixed. True or false?

Explanation: Auto-closure is a policy decision. Some orgs allow it, others don't. Start with "Pending Verification" status and let humans close. Once trust is established, you can propose more automation.

📝 LESSON RECAP

✓FISMA = the law creating federal cybersecurity requirements

✓FedRAMP = RMF applied to cloud — drives continuous monitoring demand

✓SSP = the document describing the system and its controls

✓POA&M = the weakness tracker — your first integration automates its lifecycle

✓Discovery, Creation, Verification, and Closure are automated by your integration; Tracking and Remediation are human work

🏢 INTERVIEW TIP

The vulnerability-to-POA&M pipeline is what employers ask about most: "Walk me through how a vulnerability finding becomes a POA&M item." If you can answer with specifics — which API, what field mapping, how you handle duplicates, how you calculate due dates — you stand out from every other candidate.

Phase 1 Checkpoint

Day 30 — Test yourself honestly. Click each item you can confidently do.

Technical Skills

Can you do these without looking everything up?

PYTHON & APIs

GRC Knowledge

Can you explain these in plain English to someone non-technical?

COMPLIANCE & FRAMEWORKS

📂 YOUR PORTFOLIO AT DAY 30

✓ A GitHub repo with Python scripts from Lessons 1, 3, 5, and 7

✓ A working API call to your ServiceNow PDI

✓ A control-to-integration mapping table (5 controls)

✓ Written notes on RMF, SSPs, and POA&Ms

✓ A JSON file exported by your script

If you scored 12+ out of 16: You're ready for Phase 2!
If 8-11: Review your weak areas, then proceed.
If below 8: Spend another week on the lessons above before moving on. It's better to have a solid foundation.

WHAT'S COMING IN PHASE 2

You'll build your first real end-to-end integration: pulling security findings from AWS Security Hub, transforming them, validating them, and loading them into ServiceNow as POA&M-style records — with upsert logic, structured logging, and error handling. Everything you learned in Phase 1 comes together into a working product.

Phase 2: Days 31–60

Everything from Phase 1 comes together. You'll build a real, working integration from scratch.

🎯 THE PHASE 2 MISSION

→Build a working pipeline: AWS Security Hub → Python → ServiceNow

→Learn authentication, data mapping, validation, and upsert logic

→Add structured logging and error handling

→Understand POA&M field mapping and evidence quality

→By Day 60, you'll have done — in miniature — exactly what this role does in production

💼 WHAT CHANGES IN PHASE 2

In Phase 1, you learned skills separately — Python in one lesson, GRC concepts in another. In Phase 2, they merge. Every lesson builds toward one goal: a working integration that pulls security findings from a cloud platform, validates them, and loads them into a GRC tool. The technical and GRC tracks are no longer separate — they're one workflow.

📊 WHAT YOU'RE BUILDING

Week-by-Week Plan

ServiceNow API mastery + POA&M field mapping
Build a reusable API client. Learn the upsert pattern. Map scanner fields to POA&M fields.

AWS Security Hub + Control inheritance
Pull real cloud findings with boto3. Understand common vs hybrid vs system-specific controls.

Build the integration + Evidence quality
Wire the full Extract → Transform → Validate → Load pipeline. Understand why your pipeline IS the evidence.

Logging & error handling + SCAP/STIGs
Make it production-grade with structured logging, error levels, and exit codes.

✅ PREREQUISITES — MAKE SURE YOU HAVE

☐ Python with requests installed and working

☐ A ServiceNow PDI provisioned and accessible via API

☐ Git set up with your GitHub repo

☐ Understanding of dictionaries, .get(), lists, loops, and validation

☐ Understanding of RMF, controls, SSPs, and POA&Ms

If anything feels shaky, revisit the Phase 1 lessons first. Phase 2 builds directly on everything above.

Lesson 10: ServiceNow API Mastery

Week 5 · Technical — Build a reusable client that talks to your GRC platform. Step by step.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How ServiceNow's Table API works — the URL pattern you'll use for everything

→How to create, read, update, and delete records via API (CRUD)

→The find_or_create "upsert" pattern — the most important pattern in this course

→What sys_id is and why it trips up beginners

💼 WHY THIS MATTERS

ServiceNow is the most common GRC platform in enterprise and federal environments. The API client you build in this lesson will be the foundation of every integration that loads data into your GRC platform — vulnerability findings, asset inventories, access reviews, and more. You'll reuse this code for the rest of the course and your career.

Part 1: The Table API — One Pattern for Everything

ServiceNow stores everything in tables. Incidents are in the incident table. Users are in sys_user. CMDB assets are in cmdb_ci. GRC items are in sn_grc_item. The URL to access any table always follows the same pattern:

📄 The ServiceNow Table API pattern

# The URL pattern is always:
# https://YOUR-INSTANCE.service-now.com/api/now/table/TABLE_NAME

# Examples:
GET   /api/now/table/incident            # List all incidents
GET   /api/now/table/incident/{sys_id}   # Get ONE specific incident
POST  /api/now/table/incident            # Create a new incident
PATCH /api/now/table/incident/{sys_id}   # Update specific fields

WHAT EACH LINE MEANS

GET /table/incident — Read — "Give me a list of incidents." Returns multiple records.

GET /table/incident/{sys_id} — Read one — "Give me this specific incident." The {sys_id} is the record's unique ID.

POST /table/incident — Create — "Make a new incident." You send the data in the request body.

PATCH /table/incident/{sys_id} — Update — "Change these fields on this specific incident."

Part 2: The sys_id Trap

Every record in ServiceNow has a sys_id: a 32-character hex string (like "6816f79cc0a8016401c5a33be04be441") that uniquely identifies it. When the API returns a related field like "assigned_to," it gives you the sys_id, not the person's name. You get 6816f79cc0a8016401c5a33be04be441, not "John Smith."

😖 WHAT THE API RETURNS BY DEFAULT

"assigned_to": "6816f79cc0a80164..."

Who is that?!

😊 WITH sysparm_display_value=true

"assigned_to": "John Smith"

Much better.

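Add sysparm_display_value=true to your query parameters to get human-readable names instead of sys_ids. In raw responses, reference fields such as assigned_to usually arrive as a small object that also carries a link to the related record; the examples above show only the value for simplicity. Adding sysparm_exclude_reference_link=true drops that link.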

Part 3: Building Your Reusable Client

Instead of copying the same API call code everywhere, we'll build a class — a reusable container for related functions. Think of it like a toolbox: you build it once, then grab the right tool whenever you need it.

🔧 DO THIS NOW

Create a new file called snow_client.py. This will be your reusable ServiceNow API toolbox:

📄 snow_client.py — your reusable API client

import requests


class ServiceNowClient:
    """A reusable client for talking to the ServiceNow API."""

    def __init__(self, instance, username, password):
        """Set up the connection info. Called when you create the client."""
        self.base_url = f"https://{instance}.service-now.com/api/now"
        self.auth = (username, password)
        self.headers = {
            "Accept": "application/json",
            "Content-Type": "application/json"
        }

    def get_records(self, table, query="", limit=100):
        """Pull all records from a table, handling pagination."""
        all_records = []
        offset = 0
        while True:
            resp = requests.get(
                f"{self.base_url}/table/{table}",
                auth=self.auth,
                headers=self.headers,
                params={
                    "sysparm_query": query,
                    "sysparm_limit": limit,
                    "sysparm_offset": offset
                }
            )
            resp.raise_for_status()  # Crash with details if not 200
            records = resp.json()["result"]
            all_records.extend(records)
            if len(records) < limit:
                break  # Last page
            offset += limit
        return all_records

    def create_record(self, table, data):
        """Create a single new record."""
        resp = requests.post(
            f"{self.base_url}/table/{table}",
            auth=self.auth,
            headers=self.headers,
            json=data
        )
        resp.raise_for_status()
        return resp.json()["result"]

    def update_record(self, table, sys_id, data):
        """Update specific fields on an existing record."""
        resp = requests.patch(
            f"{self.base_url}/table/{table}/{sys_id}",
            auth=self.auth,
            headers=self.headers,
            json=data
        )
        resp.raise_for_status()
        return resp.json()["result"]

    def find_or_create(self, table, query, data):
        """The UPSERT pattern: prevents duplicates!

        Check if a record exists. If yes, update it. If no, create it.
        """
        existing = self.get_records(table, query=query, limit=1)
        if existing:
            print(" → Record exists, updating")
            return self.update_record(table, existing[0]["sys_id"], data), "updated"
        print(" → New record, creating")
        return self.create_record(table, data), "created"

KEY CONCEPTS IN THIS CODE

class ServiceNowClient: — A class is a blueprint for creating objects. Think of it like a template: you define it once, then create instances of it. Each instance remembers your connection settings.

def __init__(self, ...): — The constructor. This runs automatically when you create a new client. self refers to "this specific client instance."

self.base_url = ... — Stores the URL on the client so every method can use it. self. means "save this on the client object."

resp.raise_for_status() — If the API returned an error (401, 500, etc.), this line crashes with details instead of silently continuing with bad data.

find_or_create — The most important method. It checks if a record already exists before creating. This prevents duplicates.

Part 4: Test It

🔧 DO THIS NOW

Create test_snow.py to test your client:

📄 test_snow.py

import os

from snow_client import ServiceNowClient

# Create the client
snow = ServiceNowClient(
    instance=os.environ.get("SNOW_INSTANCE", "dev12345"),
    username="admin",
    password=os.environ.get("SNOW_PWD", "your-password")
)

# Test: Create an incident
result = snow.create_record("incident", {
    "short_description": "Test from Python - GRC Integration Course",
    "priority": "3"
})
print(f"Created: {result['number']} (sys_id: {result['sys_id']})")

# Test: find_or_create (run this script TWICE)
record, action = snow.find_or_create(
    "incident",
    query="short_description=Upsert Test Finding V-001",
    data={"short_description": "Upsert Test Finding V-001", "priority": "2"}
)
print(f"Action: {action}")

✅ FIRST RUN — EXPECTED OUTPUT

Created: INC0010043 (sys_id: a1b2c3d4...)
 → New record, creating
Action: created

🔧 NOW RUN IT AGAIN — same command

python test_snow.py

✅ SECOND RUN — EXPECTED OUTPUT

Created: INC0010044 (sys_id: e5f6g7h8...)
 → Record exists, updating
Action: updated

Notice: create_record made a NEW incident (INC0010044) — that's a duplicate! But find_or_create found the existing record and updated it instead. That's the upsert pattern working.

⚠️ WHY UPSERT IS CRITICAL — A REAL SCENARIO

Your integration runs every night at 2 AM. Monday: it loads 500 vulnerability findings. Tuesday: nothing changed, but it runs again. Without upsert: you now have 1,000 records — 500 duplicates. The CISO's dashboard shows double the risk. Compliance team panics. You get a call at 8 AM.

With upsert: Tuesday's run finds all 500 records already exist and updates them. Zero duplicates. Dashboard is accurate. You sleep peacefully.
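
You can watch the Monday and Tuesday scenario play out without touching ServiceNow. The sketch below uses a hypothetical in-memory FakeTable and a made-up finding_id field as the matching key, with 5 findings standing in for 500.

```python
class FakeTable:
    """A tiny in-memory stand-in for a ServiceNow table (illustration only)."""

    def __init__(self):
        self.records = []

    def find(self, field, value):
        return [r for r in self.records if r.get(field) == value]

    def create(self, data):
        record = dict(data, sys_id=f"id-{len(self.records) + 1}")
        self.records.append(record)
        return record

    def find_or_create(self, key_field, data):
        """Upsert: update the matching record if one exists, otherwise create it."""
        existing = self.find(key_field, data[key_field])
        if existing:
            existing[0].update(data)
            return existing[0], "updated"
        return self.create(data), "created"


findings = [{"finding_id": f"V-{n:03d}", "priority": "2"} for n in range(1, 6)]

without_upsert, with_upsert = FakeTable(), FakeTable()
for night in ("Monday", "Tuesday"):
    for finding in findings:
        without_upsert.create(finding)                     # Plain create every night
        with_upsert.find_or_create("finding_id", finding)  # Check first, then decide

print("Without upsert:", len(without_upsert.records))   # 10 (5 duplicates)
print("With upsert:   ", len(with_upsert.records))      # 5
```

The only difference between the two tables is the check before the create, which is exactly what find_or_create adds to your real client.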

🧠 QUICK CHECK

Your integration uses POST to load 200 findings on Monday. Tuesday, nothing changed, it runs again. What happens?

Explanation: POST always creates a new record. Without upsert logic (find_or_create), you get 400 records — 200 duplicates. The API doesn't check for you. YOU must check with GET before creating with POST.

📝 LESSON RECAP

✓ServiceNow Table API: /api/now/table/{table_name} for everything

✓sys_id = 32-char unique ID for every record. Use sysparm_display_value=true for readable names

✓Build a reusable client class — you'll use it for every integration

✓find_or_create = the upsert pattern. Prevents duplicates. Use it ALWAYS for recurring integrations

📂 FILES YOU SHOULD HAVE NOW

✓ snow_client.py — your reusable ServiceNow API client

✓ test_snow.py — test script proving CRUD and upsert work

git add . && git commit -m "Lesson 10: ServiceNow API client with upsert" && git push

Lesson 11: POA&M Field Mapping

Week 5 · GRC — The design document for your integration: which scanner field goes where.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Every field in a POA&M item and where its data comes from

→How to calculate remediation due dates from severity

→The auto-close debate — and your safe default

→How to create a field mapping document (a real job deliverable)

What Is a Field Mapping?

A field mapping is a document that says: "This field in System A becomes this field in System B, with this transformation." It's the blueprint for your integration. Before you write any code, you write the mapping. This is a real deliverable you'll create on the job for every integration.

Scanner Fields → POA&M Fields

Here's the mapping for your Security Hub → ServiceNow integration. Each row shows: what the scanner calls it, what ServiceNow calls it, and any transformation needed:

SCANNER FIELD (AWS Security Hub)	POAM FIELD (ServiceNow)	TRANSFORMATION
`Title`	`short_description`	Copy directly, truncate to 160 characters
`Severity.Label`	`priority`	Map: CRITICAL→1, HIGH→2, MEDIUM→3, LOW→4
`Resources[0].Id`	`u_resource_id`	Store the AWS resource ARN
`CreatedAt`	`u_first_seen`	Copy the ISO date string
`Id`	`u_correlation_id`	Store for deduplication (used by find_or_create)
`Compliance.Status`	`state`	FAILED → "Open", PASSED → trigger closure
(calculated)	`due_date`	discovery_date + days based on severity
(static)	`u_source`	Always "AWS Security Hub"

The correlation_id is the key to upsert. In Lesson 10, you learned find_or_create checks if a record already exists. HOW does it check? It searches for a record matching the correlation_id — the scanner's unique finding ID. If it finds one, it updates. If not, it creates. That's why storing the source ID is critical.
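One way to keep the mapping document and the code in step is to store the mapping as data. The sketch below covers four rows of the table above and is only an illustration: `FIELD_MAPPING`, `read_path`, and `apply_mapping` are hypothetical names, and the fallback priority of "3" for an unknown severity is an assumption.

```python
# Severity.Label -> ServiceNow priority, taken from the mapping table
PRIORITY_BY_LABEL = {"CRITICAL": "1", "HIGH": "2", "MEDIUM": "3", "LOW": "4"}

# Each row: (path inside the finding, target field, transformation)
FIELD_MAPPING = [
    (("Title",), "short_description", lambda v: (v or "")[:160]),
    (("Severity", "Label"), "priority", lambda v: PRIORITY_BY_LABEL.get(v, "3")),
    (("Id",), "u_correlation_id", lambda v: v or ""),
    (("CreatedAt",), "u_first_seen", lambda v: v or ""),
]


def read_path(finding, path):
    """Walk nested dict keys; return None if any level is missing."""
    value = finding
    for key in path:
        if not isinstance(value, dict):
            return None
        value = value.get(key)
    return value


def apply_mapping(finding):
    """Build a target record by applying every mapping row to one finding."""
    record = {target: fn(read_path(finding, path)) for path, target, fn in FIELD_MAPPING}
    record["u_source"] = "AWS Security Hub"  # the static row from the table
    return record


sample = {
    "Id": "arn:aws:securityhub:us-east-1:123456789012:finding/abc",
    "Title": "S3 bucket does not have encryption enabled",
    "Severity": {"Label": "HIGH"},
    "CreatedAt": "2024-11-15T08:30:00Z",
}
print(apply_mapping(sample))
```

Adding a row to the mapping document then means adding one tuple to the list rather than editing function logic.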

Calculating Due Dates

Different severity levels get different remediation deadlines. Most organizations follow a policy like this:

SEVERITY	REMEDIATION DEADLINE
Critical	30 days
High	90 days
Medium	180 days
Low	365 days

🔧 DO THIS NOW

Create due_dates.py — a function to calculate due dates:

📄 due_dates.py

from datetime import datetime, timedelta

# Store timeframes in a config dictionary, NOT hardcoded in the function.
# Different organizations have different policies.
REMEDIATION_DAYS = {
    "Critical": 30,
    "High": 90,
    "Medium": 180,
    "Low": 365,
}

def calculate_due_date(severity, discovery_date):
    """Calculate when a finding must be fixed, based on severity."""
    # Normalize case so Security Hub labels like "CRITICAL" also match
    days = REMEDIATION_DAYS.get(str(severity).capitalize(), 180)  # Default 180 if unknown
    # Strip the trailing "Z" (UTC marker); fromisoformat() rejects it before Python 3.11
    discovered = datetime.fromisoformat(discovery_date.replace("Z", ""))
    due = discovered + timedelta(days=days)
    return due.strftime("%Y-%m-%d")

# Test it
print(calculate_due_date("Critical", "2024-12-01T08:00:00Z"))  # 30 days later
print(calculate_due_date("High", "2024-12-01T08:00:00Z"))      # 90 days later
print(calculate_due_date("Low", "2024-12-01T08:00:00Z"))       # 365 days later

✅ EXPECTED OUTPUT

2024-12-31
2025-03-01
2025-12-01

⚠️ THE AUTO-CLOSE DEBATE

When a scanner shows a finding is resolved, should your integration automatically close the POA&M item?

This is a policy decision, not a technical one. Some organizations allow full automation. Others require a human to verify before closure.

Your safe default: Move resolved items to "Pending Verification" and let a human close them. Once the compliance team trusts your integration (usually after a few months), you can propose more automation.
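The safe default can be expressed as a small decision function. This is a sketch under assumptions: `target_state` is a hypothetical helper, and the state names "Closed" and "Needs Review" are placeholders for whatever your organization's workflow actually uses.

```python
def target_state(compliance_status, auto_close_allowed=False):
    """Map a scanner compliance status to a POA&M state name."""
    if compliance_status == "FAILED":
        return "Open"
    if compliance_status == "PASSED":
        # Policy switch: only close automatically if the organization allows it
        return "Closed" if auto_close_allowed else "Pending Verification"
    # Any other or unexpected status: leave the decision to a human
    return "Needs Review"


print(target_state("FAILED"))
print(target_state("PASSED"))
print(target_state("PASSED", auto_close_allowed=True))
```

Keeping the policy in one flag means a later move to more automation is a configuration change, not a rewrite.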

✏️ MINI EXERCISE

Create your own field mapping document — a spreadsheet or table with columns: Source Field | Target Field | Transformation | Notes. Fill in all 8 rows from the table above. This is a real deliverable you'd create on the job.

📝 LESSON RECAP

✓A field mapping documents exactly which source field becomes which target field

✓The correlation_id (source finding ID) enables upsert/deduplication

✓Due dates are calculated from severity — make the timeframes configurable

✓Auto-close is a policy decision — default to human verification

✓The mapping document is a real job deliverable — create it before writing code

Lesson 12: AWS Security Hub & boto3

Week 6 · Technical — Pulling real cloud security findings, step by step.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What AWS Security Hub is and why it's your ideal first source

→The ASFF format — how every finding is structured

→How to use boto3 (AWS's Python library) to pull findings

→Setting up least-privilege credentials

💼 WHY SECURITY HUB IS YOUR IDEAL FIRST SOURCE

Security Hub aggregates findings from dozens of AWS security services (GuardDuty, Inspector, Config, IAM Access Analyzer) into one API with a standard format. One integration gives you findings from many tools, already normalized. It's the perfect learning source because you don't need to learn 10 different scanner APIs — just one.

Part 1: What a Finding Looks Like

Every finding in Security Hub follows ASFF (AWS Security Finding Format), the standard data format for all Security Hub findings. Every finding has the same fields regardless of which service generated it: Title, Severity, Resources, Compliance status, etc. Here are the fields you'll map to POA&M items:

📄 A real Security Hub finding — the fields you'll use

{
  "Id": "arn:aws:securityhub:us-east-1:123456:finding/abc",   ← unique ID
  "Title": "S3 bucket does not have encryption enabled",      ← what's wrong
  "Severity": {
    "Label": "HIGH",        ← CRITICAL, HIGH, MEDIUM, LOW, or INFORMATIONAL
    "Normalized": 70        ← numeric score (0-100)
  },
  "Compliance": {
    "Status": "FAILED"      ← did the check pass? PASSED, FAILED, WARNING, or NOT_AVAILABLE
  },
  "Resources": [{           ← which AWS resource is affected
    "Type": "AwsS3Bucket",
    "Id": "arn:aws:s3:::my-unencrypted-bucket",
    "Region": "us-east-1"
  }],
  "CreatedAt": "2024-11-15T08:30:00Z",   ← when it was found
  "RecordState": "ACTIVE"                ← ACTIVE or ARCHIVED
}

READING NESTED FIELDS — STEP BY STEP

finding["Title"] → "S3 bucket does not have encryption enabled"

finding["Severity"]["Label"] → "HIGH" — two levels deep: first get Severity object, then get Label inside it

finding["Resources"][0]["Id"] → "arn:aws:s3:::my-unencrypted-bucket" — Resources is a list, [0] gets the first item
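Direct indexing as shown above raises an error when a key is missing or a list is empty. A minimal sketch of the defensive version, using a made-up finding whose Resources list is empty:

```python
finding = {
    "Title": "S3 bucket does not have encryption enabled",
    "Severity": {"Label": "HIGH"},
    "Resources": [],  # empty list: finding["Resources"][0] would raise IndexError
}

# Two levels deep, with a default at each level
label = finding.get("Severity", {}).get("Label", "UNKNOWN")

# "or [{}]" covers both a missing key and an empty list
resources = finding.get("Resources") or [{}]
resource_id = resources[0].get("Id", "")

print(label, repr(resource_id))
```

The "UNKNOWN" default is an arbitrary choice for this example; pick whatever default your field mapping calls for.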

Part 2: Pulling Findings with boto3

🔧 SETUP FIRST

Install boto3: pip install boto3

You need AWS credentials. Set up a free-tier account at aws.amazon.com, enable Security Hub, and create an IAM user with only AWSSecurityHubReadOnlyAccess.

Set your credentials as environment variables:

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-east-1"

⚠️ LEAST PRIVILEGE — THIS IS AC-6 APPLIED TO YOUR OWN WORK

Your IAM user should have ONLY AWSSecurityHubReadOnlyAccess. Not admin. Not power user. Why? If this credential ever leaks, an attacker can read findings but can't modify anything. This is the same AC-6 (Least Privilege) control you're building evidence for — practice what you preach.

🔧 DO THIS NOW

Create pull_findings.py:

📄 pull_findings.py — pulling findings from AWS

import boto3
import json

# Create a Security Hub client
client = boto3.client("securityhub", region_name="us-east-1")

# Use a paginator: it handles multi-page results automatically
paginator = client.get_paginator("get_findings")

# Pull only ACTIVE, FAILED, Critical/High findings
findings = []
for page in paginator.paginate(Filters={
    "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
    "ComplianceStatus": [{"Value": "FAILED", "Comparison": "EQUALS"}],
    "SeverityLabel": [
        {"Value": "CRITICAL", "Comparison": "EQUALS"},
        {"Value": "HIGH", "Comparison": "EQUALS"}
    ]
}):
    findings.extend(page["Findings"])

print(f"Found {len(findings)} active Critical/High findings")

# Show first 3
for f in findings[:3]:
    print(f"  {f['Severity']['Label']:10} | {f['Title'][:60]}")

# Save to file for the next lesson
with open("findings.json", "w") as f:
    json.dump(findings, f, indent=2)
print("Saved to findings.json")

KEY CONCEPTS

boto3.client("securityhub") — Create a connection to the Security Hub service. boto3 reads your credentials from environment variables automatically.

client.get_paginator(...) — A paginator handles multi-page results for you. Instead of writing your own pagination loop (like you did for ServiceNow), boto3 does it automatically.

Filters={...} — Tell Security Hub "only give me findings that match these criteria." We want ACTIVE (not archived), FAILED (not passing), and Critical or High severity.

⚠️ IF YOU DON'T HAVE AWS SET UP YET

No problem — you can use mock data instead. Create findings.json with 3-5 fake findings following the ASFF structure above. The rest of the course works the same whether you pull from real AWS or a local file. Real AWS is better for your portfolio, but the learning is identical.
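If you go the mock route, a short script can generate the file for you. This is a sketch, not real scanner output: the titles, bucket names, and account ID 123456789012 are made up, and `make_finding` and `write_mock_findings` are hypothetical helper names.

```python
import json


def make_finding(n, label, title):
    """Build one fake finding with the ASFF fields this course uses."""
    return {
        "Id": f"arn:aws:securityhub:us-east-1:123456789012:finding/mock-{n}",
        "Title": title,
        "Severity": {"Label": label},
        "Compliance": {"Status": "FAILED"},
        "Resources": [{
            "Type": "AwsS3Bucket",
            "Id": f"arn:aws:s3:::mock-bucket-{n}",
            "Region": "us-east-1",
        }],
        "CreatedAt": "2024-11-15T08:30:00Z",
        "RecordState": "ACTIVE",
    }


def write_mock_findings(path="findings.json"):
    """Write three fake findings to a JSON file and return them."""
    findings = [
        make_finding(1, "CRITICAL", "S3 bucket allows public read access"),
        make_finding(2, "HIGH", "S3 bucket does not have encryption enabled"),
        make_finding(3, "HIGH", "S3 bucket logging is disabled"),
    ]
    with open(path, "w") as f:
        json.dump(findings, f, indent=2)
    return findings


if __name__ == "__main__":
    created = write_mock_findings()
    print(f"Wrote {len(created)} mock findings to findings.json")
```

Because every mock finding has a distinct Id, the upsert logic in later lessons behaves the same way it would with real data.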

📝 LESSON RECAP

✓Security Hub aggregates findings from many AWS services into one API

✓ASFF is the standard format — learn its key fields: Id, Title, Severity, Resources, Compliance

✓boto3 paginators handle multi-page results automatically

✓IAM credentials should be read-only (least privilege = AC-6)

✓Save findings to a JSON file — you'll use it in the next lesson

Lesson 13: Control Inheritance

Week 6 · GRC — Where evidence comes from when systems share infrastructure.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why one control can have evidence from different sources

→Three types: common, system-specific, and hybrid

→The cloud shared responsibility model

→Why this matters for where your integration pulls data from

The Problem: 200 Systems, 1 Datacenter

Imagine an agency has 200 systems, all in the same cloud environment. Does each system independently prove that the data center has physical security guards? Of course not — physical security is handled once by the cloud provider, and all 200 systems inherit that protection.

This concept is called control inheritance: a shared service implements a security control once, and multiple systems inherit that implementation. The shared provider maintains the evidence; inheriting systems just reference it. Control inheritance determines where your integration pulls evidence from. Get it wrong, and you're pulling from the wrong source.

Three Types of Controls

Common Controls

What it means: A shared service implements the control ONCE. All systems using that service inherit it. The shared provider maintains the evidence.

Example: Physical security of a data center. The cloud provider locks the doors, runs the cameras, checks badges. You don't need to prove this for each system — you inherit it.

Your integration: Pull evidence from the shared provider's systems, NOT from each individual system.

System-Specific Controls

What it means: Each system implements the control independently. Evidence comes directly from that system.

Example: Application-level role assignments. The HR app decides who gets admin access within the app — that's specific to the HR app, not shared.

Your integration: Pull directly from the system's own APIs and tools.

Hybrid Controls

What it means: Split responsibility. The shared provider does part, the system does part. Both must produce evidence for their portion.

Example: AC-2 (Account Management) — the organization's identity provider (Entra ID) manages authentication centrally, but each application manages its own role assignments within the app.

Your integration: Pull from BOTH sources and stitch the evidence together. Identity data from Entra ID AND role data from the application.

The Cloud Shared Responsibility Model

Every cloud provider (AWS, Azure, GCP) has a model splitting responsibilities:

WHAT	CLOUD PROVIDER HANDLES	YOU HANDLE
Physical security (PE)	✓ Common — inherited	N/A
Network infrastructure (SC)	Base network fabric	VPCs, security groups, firewall rules
OS patching (SI-2)	Managed services (RDS, Lambda)	EC2 instances — you patch these
Access control (AC-2)	IAM platform	Users, roles, policies you configure
Data encryption (SC-28)	Offers encryption services	YOU must turn them on and configure them

🧠 QUICK CHECK

AC-2 (Account Management) in a cloud environment is typically which type of control?

Explanation: AC-2 is usually hybrid in cloud: the cloud provider's IAM platform handles centralized authentication (common part), but each application manages its own role assignments and access policies (system-specific part). Your integration needs to pull from BOTH sources — identity data from the provider AND role data from each application.

Why this matters for YOUR integration design: Before building any integration, ask: "Who implements this control?" If it's common, pull from the shared provider. If system-specific, pull from the system. If hybrid, pull from both. Getting this wrong means your evidence comes from the wrong source — and an assessor will catch it.
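The "who implements this control?" question can be captured as a lookup your integration consults before pulling evidence. This is a sketch: `EVIDENCE_SOURCES` and `sources_for` are hypothetical names, and "provider" and "system" are placeholders for the real systems in your environment.

```python
# Which side(s) must supply evidence for each control type
EVIDENCE_SOURCES = {
    "common": ["provider"],
    "system-specific": ["system"],
    "hybrid": ["provider", "system"],
}


def sources_for(control_type):
    """Return the evidence sources to query for a given control type."""
    try:
        return EVIDENCE_SOURCES[control_type]
    except KeyError:
        raise ValueError(f"Unknown control type: {control_type!r}") from None


print(sources_for("hybrid"))
```

Raising on an unknown type is deliberate: a control with no recorded inheritance decision should stop the design conversation, not silently default to one source.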

📝 LESSON RECAP

✓Common = shared provider implements once, everyone inherits

✓System-Specific = each system implements and provides its own evidence

✓Hybrid = split responsibility, evidence from multiple sources

✓Cloud shared responsibility maps directly to control inheritance

✓Always ask "who implements this control?" before designing the integration

Lesson 14: Build the Integration

Week 7 · Technical — This is it. You're wiring the full pipeline end to end.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Write the transform function — converting Security Hub format to ServiceNow format

→Write the validate function — rejecting bad records before loading

→Wire the complete Extract → Transform → Validate → Load pipeline

→Run it twice and prove no duplicates are created

💼 THIS IS THE REAL THING

Everything you've learned in 13 lessons comes together right now. This is not a practice exercise — this is the same pipeline architecture used in production GRC integrations. The only difference is scale: production handles thousands of findings; yours handles dozens. The code patterns are identical.

The ETL Pipeline — What You're Building

📊 THE FOUR STAGES

WHAT EACH STAGE DOES

E — Extract: Pull raw findings from Security Hub (or from your findings.json file)

T — Transform: Convert each finding from ASFF format to ServiceNow format using your field mapping

V — Validate: Check each transformed record has all required fields and valid values. Reject bad records.

L — Load: Use find_or_create to upsert each valid record into ServiceNow

Step 1: The Transform Function

🔧 DO THIS NOW

Create pipeline.py — this will be your complete integration:

📄 pipeline.py — the transform function

import json
import os
from snow_client import ServiceNowClient

# ── CONFIGURATION ──
SEVERITY_MAP = {
    "CRITICAL": "1",  # ServiceNow priority 1 = Critical
    "HIGH": "2",      # ServiceNow priority 2 = High
    "MEDIUM": "3",
    "LOW": "4",
}

# ── TRANSFORM: ASFF → ServiceNow format ──
def transform_finding(finding):
    """Convert one Security Hub finding into a ServiceNow record."""
    resource = (finding.get("Resources") or [{}])[0]  # First resource, or empty dict
    severity = finding.get("Severity", {}).get("Label", "MEDIUM")

    return {
        "short_description": finding.get("Title", "No title")[:160],
        "priority": SEVERITY_MAP.get(severity, "3"),
        "u_correlation_id": finding.get("Id", ""),
        "u_resource_id": resource.get("Id", ""),
        "u_source": "AWS Security Hub",
        "state": "1"  # 1 = New in ServiceNow
    }

LINE-BY-LINE EXPLANATION

(finding.get("Resources") or [{}])[0]: Safely get the Resources list. If it's missing or empty, fall back to [{}] (a list with one empty dict), then take the first item with [0]. The shorter form finding.get("Resources", [{}])[0] handles a missing key but still raises IndexError when Resources is an empty list.

finding.get("Severity", {}).get("Label", "MEDIUM"): Two levels of safe access. Get the Severity dict (default empty), then get Label inside it (default "MEDIUM"). Chaining .get() calls avoids a KeyError when either level is missing.

[:160]: Truncate to 160 characters. ServiceNow's short_description field has a length limit.
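The mapping table in Lesson 11 also lists u_first_seen and due_date, which the transform function above does not set. One possible way to add them is sketched below. `date_fields` is a hypothetical helper, and the uppercase keys in `REMEDIATION_DAYS` are an assumption chosen to match Security Hub's severity labels.

```python
from datetime import datetime, timedelta

# Same timeframes as Lesson 11, keyed by Security Hub's uppercase labels
REMEDIATION_DAYS = {"CRITICAL": 30, "HIGH": 90, "MEDIUM": 180, "LOW": 365}


def date_fields(finding):
    """Return u_first_seen and due_date for one finding, or {} if no date."""
    created = finding.get("CreatedAt", "")
    if not created:
        return {}
    label = finding.get("Severity", {}).get("Label", "MEDIUM")
    # Replace the trailing "Z" with an explicit UTC offset before parsing
    discovered = datetime.fromisoformat(created.replace("Z", "+00:00"))
    due = discovered + timedelta(days=REMEDIATION_DAYS.get(label, 180))
    return {"u_first_seen": created, "due_date": due.strftime("%Y-%m-%d")}


sample = {"CreatedAt": "2024-11-15T08:30:00Z", "Severity": {"Label": "HIGH"}}
print(date_fields(sample))
```

Inside transform_finding, the result could be merged into the returned dictionary with `record.update(date_fields(finding))`.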

Step 2: The Validate Function

📄 Add to pipeline.py — validation

# ── VALIDATE: check before loading ──
def validate_record(record):
    """Check that a transformed record is safe to load."""
    errors = []

    # Required fields must exist and not be empty
    for field in ["short_description", "priority", "u_correlation_id"]:
        if not record.get(field):
            errors.append(f"Missing: {field}")

    # Priority must be 1, 2, 3, or 4
    if record.get("priority") not in ["1", "2", "3", "4"]:
        errors.append(f"Invalid priority: {record.get('priority')}")

    return len(errors) == 0, errors

Step 3: Wire the Full Pipeline

📄 Add to pipeline.py — the main pipeline

# ── EXTRACT: load findings ──
with open("findings.json", "r") as f:
    findings = json.load(f)

# ── SET UP SERVICENOW CLIENT ──
snow = ServiceNowClient(
    instance=os.environ.get("SNOW_INSTANCE", "dev12345"),
    username="admin",
    password=os.environ.get("SNOW_PWD", "password")
)

# ── RUN THE PIPELINE ──
stats = {"extracted": len(findings), "valid": 0, "invalid": 0,
         "created": 0, "updated": 0, "errors": 0}

print(f"Processing {stats['extracted']} findings...")

for finding in findings:
    try:
        # T: Transform
        record = transform_finding(finding)

        # V: Validate
        is_valid, errors = validate_record(record)
        if not is_valid:
            print(f"  ❌ Invalid: {errors}")
            stats["invalid"] += 1
            continue  # Skip this record, move to the next

        # L: Load (upsert)
        query = f"u_correlation_id={record['u_correlation_id']}"
        _, action = snow.find_or_create("incident", query, record)
        stats[action] += 1
        stats["valid"] += 1

    except Exception as e:
        print(f"  💥 Error: {e}")
        stats["errors"] += 1

# ── PRINT RESULTS ──
print(f"\n{'='*40}")
print(f"Extracted: {stats['extracted']}")
print(f"Valid: {stats['valid']}")
print(f"Invalid: {stats['invalid']}")
print(f"Created: {stats['created']}")
print(f"Updated: {stats['updated']}")
print(f"Errors: {stats['errors']}")

Step 4: Run It!

🔧 DO THIS NOW — FIRST RUN

python pipeline.py

✅ FIRST RUN — EXPECTED OUTPUT

Processing 47 findings...
→ New record, creating
→ New record, creating
... (many more)
========================================
Extracted: 47
Valid: 45
Invalid: 2
Created: 45
Updated: 0
Errors: 0

🔧 NOW RUN IT AGAIN — same command

python pipeline.py

✅ SECOND RUN — THE UPSERT PROOF

Processing 47 findings...
→ Record exists, updating
→ Record exists, updating
...
========================================
Extracted: 47
Valid: 45
Invalid: 2
Created: 0     ← ZERO new records!
Updated: 45    ← All existing records updated!
Errors: 0

Created: 0, Updated: 45. That's upsert working perfectly. No duplicates.

🧠 QUICK CHECK

Your pipeline shows "Invalid: 2". What happened to those 2 findings?

Explanation: The continue statement skips invalid records and moves to the next one. This is "record-level error handling" — one bad record doesn't kill the entire pipeline. The 45 valid records were loaded successfully.

📝 LESSON RECAP

✓Transform converts source format to target format using your field mapping

✓Validate checks required fields and valid values BEFORE loading

✓The pipeline: Extract → Transform → Validate → Load (ETVL)

✓Track stats: extracted, valid, invalid, created, updated, errors

✓Second run should show 0 created, all updated — that's upsert working

Lesson 15: Evidence Quality

Week 7 · GRC — Your pipeline IS the evidence, not just the data it moves.

🎯 WHAT YOU'RE ABOUT TO LEARN

→The evidence strength spectrum — from worthless to bulletproof

→Why your pipeline itself is compliance evidence (not just the data)

→What "chain of custody" metadata to capture on every run

The Evidence Strength Spectrum

When an assessor checks a control, they want proof. Not all proof is created equal:

❌ WEAK EVIDENCE

• "We review logs" — just a claim, no proof

• Dashboard screenshot — one moment in time, could be from last year

• Manual PDF export — no timestamp, no chain of custody, could be edited

✅ STRONG EVIDENCE (WHAT YOU PRODUCE)

• Automated logs showing pipeline ran 90/90 nights

• Record counts: 1,247 extracted, 1,230 valid, 17 rejected

• Live dashboard with timestamps updated daily

• Code in Git with full change history

The key insight: Your integration pipeline IS the evidence. Not just the data it moves — the pipeline itself, its logs, its run history, and its documentation collectively prove that the organization has a continuous, reliable, automated process. This is what assessors want to see: proof of ongoing practice, not a one-time effort.

Chain-of-Custody Metadata

Every time your pipeline runs, it should record these details. Think of it as a receipt for each run:

Run timestamp: When the pipeline executed (UTC, ISO 8601 format). Proves the pipeline ran at this specific time.

Source system: Which system was queried ("AWS Security Hub, us-east-1"). Proves where the data came from.

Records extracted: How many records were pulled from the source. Proves completeness.

Records validated/rejected: How many passed or failed, with reasons. Proves data quality.

Records created/updated: What the pipeline actually did. Proves the work was done.

Integration version: Git commit hash of the code. Proves which version produced these results.
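The receipt can be assembled in a few lines. The sketch below assumes the same `stats` dictionary as pipeline.py. `git_commit_hash` and `build_run_receipt` are hypothetical helper names, and the key names in the receipt are one possible choice, not a required schema.

```python
import json
import subprocess
from datetime import datetime, timezone


def git_commit_hash():
    """Return the current Git commit hash, or 'unknown' outside a repo."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"


def build_run_receipt(stats, source):
    """Collect chain-of-custody metadata for one pipeline run."""
    return {
        "run_timestamp": datetime.now(timezone.utc).isoformat(),
        "source_system": source,
        "records_extracted": stats["extracted"],
        "records_valid": stats["valid"],
        "records_rejected": stats["invalid"],
        "records_created": stats["created"],
        "records_updated": stats["updated"],
        "integration_version": git_commit_hash(),
    }


stats = {"extracted": 47, "valid": 45, "invalid": 2, "created": 45, "updated": 0}
receipt = build_run_receipt(stats, "AWS Security Hub, us-east-1")
print(json.dumps(receipt, indent=2))
```

Appending each receipt to a log file gives you the run history that the evidence discussion above relies on.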

🧠 QUICK CHECK

An assessor asks: "How do you know your vulnerability data is complete?" Which answer is stronger?

Explanation: The second answer provides specific, verifiable metrics with a 90-day track record. This is what continuous monitoring looks like. Your pipeline's run logs and metrics ARE the proof of completeness.

🔧 DO THIS NOW

The stats dictionary in your pipeline.py already captures extracted/valid/invalid/created/updated counts. Add a timestamp at the top of your pipeline:

from datetime import datetime, timezone

run_time = datetime.now(timezone.utc).isoformat()
print(f"Pipeline run started: {run_time}")

📝 LESSON RECAP

✓Screenshots are weak evidence; automated pipelines with metrics are strong

✓Your pipeline itself — its logs, its code, its run history — IS the evidence

✓Capture chain-of-custody metadata on every run: timestamp, counts, version

✓90 consecutive nightly runs > one screenshot from last Tuesday

Lesson 16: Structured Logging & Error Handling

Week 8 · Technical — Make your pipeline production-grade: observable, debuggable, audit-ready.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why print() isn't good enough — and what to use instead

→Structured JSON logging that's searchable and audit-ready

→Three levels of error handling: record, connection, fatal

→Exit codes that tell schedulers what happened

Part 1: Why print() Isn't Enough

Your pipeline will fail. At 2 AM. On a Saturday. When it does, your logs are the only witness. print() produces logs nobody can search, filter, or feed into a monitoring tool.

❌ UNSTRUCTURED (print)

Got 50 records
Error on some record
Done

No timestamp. No context. No severity. Useless at 2 AM.

✅ STRUCTURED JSON

{"ts":"2024-12-01T02:00:15Z","level":"INFO","msg":"Extraction complete","count":247}
{"ts":"...","level":"WARN","msg":"Validation failed","id":"V-042","reason":"Missing severity"}

Timestamped. Leveled. Searchable. Audit-ready.
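Log lines like these can be produced with Python's standard logging module and a custom formatter. This is a minimal sketch, not the only way to do it: `JsonFormatter`, `get_logger`, and the convention of passing extra values under a "fields" key are assumptions made for this example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        ts = datetime.fromtimestamp(record.created, timezone.utc)
        entry = {
            "ts": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "msg": record.getMessage(),
        }
        # Values passed as extra={"fields": {...}} are merged into the entry
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def get_logger(name="pipeline"):
    """Return a logger that writes JSON lines to the console."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


logger = get_logger()
logger.info("Extraction complete", extra={"fields": {"count": 247}})
logger.warning("Validation failed",
               extra={"fields": {"id": "V-042", "reason": "Missing severity"}})
```

Note that the standard library names the level "WARNING", not "WARN", so searches and alert rules should match on that spelling.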

Part 2: Three Error Handling Levels

Not all errors are equal. Your pipeline needs different responses for different failures:

⚡ Record-Level Errors — Log and Continue

What happens: One finding out of 500 has a bad field value.

What to do: Log the error with details (which record, what was wrong), skip that record, continue processing the other 499. Don't let one bad record kill the entire run.

for finding in findings:
    try:
        record = transform_finding(finding)
        query = f"u_correlation_id={record['u_correlation_id']}"
        snow.find_or_create("incident", query, record)
    except Exception as e:
        logger.warning(f"Skipped: {e}")  # Log and move on
        stats["errors"] += 1

🔄 Connection-Level Errors — Retry with Backoff

What happens: The API returns 429 (rate limited) or 500 (server error).

What to do: Wait with exponential backoff (1s, 2s, 4s...) and retry. These are temporary problems that usually resolve themselves.
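A retry loop with exponential backoff can be sketched as follows. `RetryableError` is a hypothetical exception your client would raise on a 429 or 500 response, and `with_backoff` is an illustrative helper. The `sleep` parameter exists so the example can run without actually waiting.

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for temporary failures such as HTTP 429 or 500."""


def with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RetryableError wait 1s, 2s, 4s... and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller treat it as fatal
            sleep(base_delay * (2 ** attempt))


# Demo: a call that fails twice, then succeeds
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RetryableError("HTTP 429")
    return "ok"


delays = []
result = with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Only temporary failures should raise the retryable exception. Authentication failures belong in the fatal category described next.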

🛑 Fatal Errors — Log and Abort

What happens: Authentication fails (401), config is missing, can't reach any endpoint.

What to do: Log the error clearly and stop immediately. Do NOT retry 401 errors in a loop — you'll lock out the service account.

import sys

# resp is the HTTP response from the API call; logger is your pipeline's logger
if resp.status_code == 401:
    logger.critical("Auth failed. Check credentials. Aborting.")
    sys.exit(1)  # Exit code 1 = failure

Part 3: Exit Codes

When a scheduler (cron, CloudWatch) runs your pipeline, it checks the exit code to know what happened. An exit code is a number your script returns when it finishes: 0 = success, non-zero = something went wrong. Schedulers use this to decide whether to send alerts.

📄 Add to the end of pipeline.py

import sys

if stats["errors"] > 0:
    print("❌ COMPLETED WITH ERRORS")
    sys.exit(1)  # Failure: scheduler should alert
elif stats["invalid"] > stats["valid"] * 0.1:
    print("⚠️ HIGH REJECTION RATE")
    sys.exit(2)  # Warning: ran but something's off
else:
    print("✅ SUCCESS")
    sys.exit(0)  # All good

🔧 DO THIS NOW

Update your pipeline.py: replace all print() statements with descriptive messages that include timestamps. Add the exit code logic at the end. Push to GitHub.

git add . && git commit -m "Add structured logging and exit codes to pipeline" && git push

📝 LESSON RECAP

✓Structured JSON logs are searchable, timestamped, and audit-ready

✓Record errors: log and continue. Connection errors: retry. Fatal errors: abort.

✓Exit codes: 0 = success, 1 = failure, 2 = warning

✓Never retry 401 errors — you'll lock out the service account

Lesson 17: SCAP & STIGs

Week 8 · GRC — Configuration compliance standards that feed your pipeline.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What SCAP and STIGs are — in plain English

→How configuration scan results feed the same pipeline you just built

→STIG severity categories (CAT I, II, III)

The Problem STIGs Solve

Imagine a 200-page document that says exactly how to configure a Windows server securely: what settings to enable, what ports to close, what services to disable. Now imagine a human manually checking every setting on every server. That's impossibly slow and error-prone.

SCAP (Security Content Automation Protocol) solves this by making those rules machine-readable. It is a suite of machine-readable specifications for expressing security configuration requirements: instead of humans reading 200-page guides, tools use SCAP content to check systems automatically. STIGs (Security Technical Implementation Guides) are the DoD's configuration checklists. Each STIG defines hundreds of rules for how a specific technology (Windows, Linux, Oracle, etc.) must be configured to be secure. Together, they let scanners check hundreds of settings in minutes.

STIG Severity Categories

CATEGORY	SEVERITY	WHAT IT MEANS
CAT I	High	Directly results in loss of confidentiality, integrity, or availability
CAT II	Medium	Could result in loss if combined with other weaknesses
CAT III	Low	Degrades security measures but doesn't directly cause loss

How This Connects to Your Pipeline

Here's the good news: STIG scan results have the same shape as the vulnerability findings you've been working with — severity, rule ID, affected system, compliance status. Your existing pipeline patterns apply directly:

Scanner evaluates system against STIG rules (hundreds of configuration checks)

Non-compliant rules become findings with severity, rule ID, and affected system

Your pipeline pulls these findings, transforms, validates, and creates POA&M items

GRC platform tracks remediation — evidence for CM-6 (Configuration Settings) and CM-2 (Baseline Configuration)

You don't need to memorize STIG rules. You need to understand: STIGs exist, they define "correctly configured," tools scan against them, results produce findings in a familiar format, and those findings feed the same pipeline you just built. Same architecture, different data source.
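To make "same architecture, different data source" concrete, here is a sketch of a transform for STIG results. The input shape (host, rule_id, category, title) is hypothetical, since real scanner output varies by tool. The mapping of CAT levels to ServiceNow priorities is also an assumption, based on the High/Medium/Low equivalence above.

```python
# CAT I = High, CAT II = Medium, CAT III = Low, using the same
# priority numbers as the Security Hub severity map
CAT_TO_PRIORITY = {"CAT I": "2", "CAT II": "3", "CAT III": "4"}


def transform_stig_result(result):
    """Convert one (hypothetical) STIG scan result into a ServiceNow record."""
    host = result.get("host", "")
    rule_id = result.get("rule_id", "")
    return {
        "short_description": result.get("title", "No title")[:160],
        "priority": CAT_TO_PRIORITY.get(result.get("category"), "3"),
        # One rule can fail on many hosts, so the dedup key combines both
        "u_correlation_id": f"{host}:{rule_id}" if host and rule_id else "",
        "u_resource_id": host,
        "u_source": "STIG scan",
        "state": "1",
    }


sample = {
    "host": "web-01",
    "rule_id": "EXAMPLE-RULE-001",
    "category": "CAT I",
    "title": "Example: insecure protocol must be disabled",
}
print(transform_stig_result(sample))
```

The output has the same fields as the Security Hub transform, so the validate and load stages can stay unchanged.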

📝 LESSON RECAP

✓SCAP makes security configuration checks machine-readable

✓STIGs are DoD configuration standards — hundreds of rules per technology

✓CAT I = High, CAT II = Medium, CAT III = Low

✓STIG findings have the same shape as vulnerability findings — same pipeline applies

✓Configuration compliance supports CM-6 and CM-2 controls

Phase 2 Checkpoint

Day 60 — Do you have a working integration? Click each item you can confidently do.


Technical Skills

Can you do these with your working integration as proof?

INTEGRATION ENGINEERING

GRC Knowledge

Can you explain these in the context of your integration?

COMPLIANCE & EVIDENCE

📂 YOUR PORTFOLIO AT DAY 60

✓ snow_client.py — Reusable ServiceNow API client with CRUD and upsert

✓ pipeline.py — Complete ETL pipeline: extract, transform, validate, load

✓ pull_findings.py — AWS Security Hub data extraction

✓ due_dates.py — Remediation deadline calculator

✓ findings.json — Sample data from your source system

✓ Field mapping document (spreadsheet or table)

✓ Control-to-integration mapping table (5+ controls)

✓ All code on GitHub with clear, descriptive commit messages

🏢 YOUR INTERVIEW STATEMENT

"I built an integration that pulls security findings from AWS Security Hub, transforms them using a documented field mapping, validates each record before loading, and upserts them into ServiceNow using the Table API with deduplication logic. The pipeline tracks extraction, validation, and load metrics, uses structured logging for audit readiness, and handles record-level errors without killing the full run."

If you can say that sentence and back it up with your GitHub repo, you can interview for junior GRC integration roles right now.

What's next: Phase 3 (Days 61-90) will add data reconciliation, monitoring and alerting, dashboards, and comprehensive documentation — transforming your working pipeline into a fully portfolio-ready capstone project.

Phase 3: Days 61–90

Your pipeline works. Now make it production-ready, monitored, documented, and portfolio-worthy.

🎯 THE PHASE 3 MISSION

→Make your pipeline run on a schedule — automatically, every night

→Add data reconciliation — prove nothing gets lost between source and destination

→Build monitoring and alerting — know when something breaks before anyone asks

→Create dashboards and reports that compliance teams actually use

→Write professional documentation — README, runbooks, architecture diagrams

→Polish your GitHub portfolio for job applications

💼 WHAT CHANGES IN PHASE 3

In Phase 2, you built a pipeline that works when you manually run python pipeline.py. That's a prototype. A production integration runs unattended — nobody types a command. It runs on a schedule, handles problems gracefully, alerts you when something goes wrong, and produces evidence that auditors can verify. Phase 3 transforms your prototype into something you'd deploy at a real organization.

📊 WHAT YOU'RE ADDING IN PHASE 3

Week-by-Week Plan

Data Reconciliation + Scheduling
Prove data completeness. Run your pipeline automatically on a schedule.

Monitoring & Alerting + Advanced Controls
Know when your pipeline breaks. Expand your control knowledge beyond the basics.

Dashboards + Documentation
Build compliance dashboards. Write professional documentation and runbooks.

Portfolio Polish + Interview Prep
Finalize your GitHub portfolio. Practice explaining your work to interviewers.

Lesson 18: Data Reconciliation

Week 9 · Technical — Proving that every record from the source system made it to the destination.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What reconciliation is and why it matters for compliance

→Three reconciliation checks: count, ID, and freshness

→How to build a reconciliation report your integration produces automatically

→What to do when counts don't match

💼 WHY RECONCILIATION MATTERS

An assessor asks: "How do you know your vulnerability data is complete?" If your answer is "I assume it's fine," you fail. Reconciliation means proving that every record from the source system made it to the destination — and flagging anything that didn't.

Three Types of Reconciliation Checks

1. Count Reconciliation

The question: Did the same number of records arrive as were sent?

How it works: Compare the count from the source API against records in the destination. If the source has 500 findings and ServiceNow has 498, you have a 2-record gap to investigate.

def check_counts(source_count, dest_count, rejected_count):
    """Verify: source = destination + rejected"""
    expected = dest_count + rejected_count
    match = source_count == expected
    if not match:
        gap = source_count - expected
        print(f"⚠️ COUNT MISMATCH: {gap} records unaccounted for")
        print(f"   Source: {source_count}, Loaded: {dest_count}, Rejected: {rejected_count}")
    else:
        print(f"✅ Counts match: {source_count} = {dest_count} loaded + {rejected_count} rejected")
    return match

2. ID Reconciliation

The question: Is every specific finding from the source present in the destination?

How it works: Get the list of finding IDs from the source. Get the list of correlation IDs from ServiceNow. Compare the two sets. Any ID in the source but not in the destination is a gap.

def check_ids(source_ids, dest_ids):
    """Find specific records that exist in source but not destination."""
    source_set = set(source_ids)
    dest_set = set(dest_ids)
    missing = source_set - dest_set  # In source but not destination
    extra = dest_set - source_set    # In destination but not source (stale?)
    if missing:
        print(f"⚠️ {len(missing)} findings in source but NOT in destination")
    if extra:
        print(f"ℹ️ {len(extra)} records in destination but NOT in source (may be resolved)")
    if not missing and not extra:
        print("✅ All IDs match perfectly")
    return missing, extra

3. Freshness Check

The question: Is the data current? When did the pipeline last run successfully?

How it works: Record the timestamp of each successful run. If the last run was more than 25 hours ago (for a nightly pipeline), something is wrong — the pipeline may have silently stopped.

from datetime import datetime, timezone

def check_freshness(last_run_time, max_hours=25):
    """Alert if the pipeline hasn't run recently.

    last_run_time must be a timezone-aware datetime (UTC); subtracting a
    naive datetime from an aware one raises TypeError.
    """
    now = datetime.now(timezone.utc)
    age = now - last_run_time
    hours = age.total_seconds() / 3600
    if hours > max_hours:
        print(f"🚨 STALE DATA: last run was {hours:.1f} hours ago!")
        return False
    print(f"✅ Data is fresh: last run {hours:.1f} hours ago")
    return True

🔧 DO THIS NOW

Add the check_counts function to the end of your pipeline.py. After the main loop finishes, call it with your stats:

# Add after the pipeline loop
check_counts(stats["extracted"], stats["created"] + stats["updated"], stats["invalid"])

🧠 QUICK CHECK

Your pipeline extracted 500 findings, loaded 495, and rejected 3. What does count reconciliation tell you?

Explanation: 500 extracted - 495 loaded - 3 rejected = 2 unaccounted. These 2 records were lost somewhere — maybe they caused an uncaught exception. This is exactly the kind of gap reconciliation catches. Investigate the error logs to find what happened to those 2 records.
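The three checks can also be rolled into one report that your pipeline writes after every run, and that report can name the exact records that went missing. The sketch below is one possible shape, not an official part of the course code: the function name `build_reconciliation_report`, its parameters, and the report keys are all assumptions made for illustration. It expects `loaded_ids` to hold only the IDs loaded during this run.

```python
from datetime import datetime, timezone, timedelta


def build_reconciliation_report(source_ids, loaded_ids, rejected_ids,
                                last_run_time, max_hours=25, now=None):
    """Combine count, ID, and freshness checks into one dictionary.

    All names and keys here are illustrative, not a fixed standard.
    """
    now = now or datetime.now(timezone.utc)
    source, loaded, rejected = set(source_ids), set(loaded_ids), set(rejected_ids)

    # IDs the source sent that were neither loaded nor rejected
    unaccounted = source - loaded - rejected
    hours_since_run = (now - last_run_time).total_seconds() / 3600

    report = {
        "generated_at": now.isoformat(),
        "source_count": len(source),
        "loaded_count": len(loaded),
        "rejected_count": len(rejected),
        "counts_match": len(source) == len(loaded) + len(rejected),
        "unaccounted_ids": sorted(unaccounted),
        "hours_since_last_run": round(hours_since_run, 1),
        "fresh": hours_since_run <= max_hours,
    }
    # The run passes only if every check passes
    report["passed"] = (
        report["counts_match"] and not unaccounted and report["fresh"]
    )
    return report
```

Saving the returned dictionary with `json.dump` after each run would give you a dated reconciliation record. When `unaccounted_ids` is not empty, those are the IDs to search for in your error logs.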

📝 LESSON RECAP

✓Count reconciliation: source = loaded + rejected (no records lost)

✓ID reconciliation: every source ID exists in the destination

✓Freshness check: pipeline ran recently (data isn't stale)

✓Reconciliation proves completeness — assessors love this

Lesson 19: Scheduling & Automation

Week 9 · Technical — Making your pipeline run automatically, without you typing a command.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How cron (Linux/Mac) and Task Scheduler (Windows) work

→How to schedule your pipeline to run nightly

→Wrapper scripts that handle logging, environment variables, and error capture

→Why scheduling is itself compliance evidence (SI-4)

Why Schedule?

Right now, your pipeline only runs when you type python pipeline.py. In production, integrations run on a schedule — typically nightly at 2 AM when systems are quiet and API rate limits are less likely to trigger. Nobody types a command; the scheduler does it automatically.

Option 1: cron (Linux/Mac)

cron is the standard scheduler on Linux and Mac: a built-in tool that runs commands at specific times. You define the schedule in a "crontab", a configuration file with one line per scheduled task. The schedule format uses 5 fields:

📄 cron schedule format

# ┌─── minute (0-59)
# │ ┌─── hour (0-23)
# │ │ ┌─── day of month (1-31)
# │ │ │ ┌─── month (1-12)
# │ │ │ │ ┌─── day of week (0=Sun, 6=Sat)
# │ │ │ │ │
# * * * * * command

# Run pipeline at 2:00 AM every day:
0 2 * * * /home/you/grc-integration-portfolio/run_pipeline.sh

COMMON SCHEDULES

0 2 * * * — Every day at 2:00 AM (most common for GRC integrations)

0 */6 * * * — Every 6 hours

0 2 * * 1 — Every Monday at 2:00 AM (weekly)

*/15 * * * * — Every 15 minutes (for high-frequency monitoring)
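If the five fields still feel abstract, it can help to see them evaluated in code. The sketch below is a teaching aid, not how cron itself is implemented: `cron_matches` and `_field_matches` are made-up names, and the matcher only understands `*`, `*/n`, single numbers, and comma lists. It also treats `*/n` as "divisible by n", which is exact for the minute and hour fields.

```python
from datetime import datetime


def _field_matches(field, value):
    """Check one cron field ('*', '*/n', 'a', or 'a,b,c') against a value."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, when):
    """Return True if a 5-field cron expression would fire at `when`.

    Simplified for teaching: no ranges (1-5), no names (MON), and all
    five fields must match.
    """
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = when.isoweekday() % 7  # cron counts 0=Sunday ... 6=Saturday
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


monday_2am = datetime(2024, 12, 2, 2, 0)  # 2 December 2024 was a Monday
print(cron_matches("0 2 * * *", monday_2am))  # nightly schedule
print(cron_matches("0 2 * * 1", monday_2am))  # weekly Monday schedule
```

Both calls print `True`, because 2:00 AM on a Monday satisfies the nightly schedule and the Monday-only schedule alike.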

The Wrapper Script

Don't schedule your Python file directly. Create a wrapper script that sets up the environment, runs the pipeline, and captures the output:

📄 run_pipeline.sh — your wrapper script

#!/bin/bash
# GRC Integration Pipeline — nightly run wrapper

# Set environment variables
export SNOW_INSTANCE="dev12345"
export SNOW_PWD="$(cat /etc/secrets/snow_pwd)"  # Read from secure file
export AWS_DEFAULT_REGION="us-east-1"

# Create log directory
LOG_DIR="/var/log/grc-pipeline"
mkdir -p "$LOG_DIR"
LOG_FILE="$LOG_DIR/run_$(date +%Y%m%d_%H%M%S).log"

# Run pipeline and capture ALL output (stop if the project folder is missing)
cd /home/you/grc-integration-portfolio || exit 1
python pipeline.py > "$LOG_FILE" 2>&1
EXIT_CODE=$?

# Check result
if [ "$EXIT_CODE" -ne 0 ]; then
    echo "Pipeline failed with exit code $EXIT_CODE" | \
        mail -s "🚨 GRC Pipeline FAILED" you@company.com
fi

# Pass the pipeline's exit code on to the scheduler
exit "$EXIT_CODE"

LINE-BY-LINE EXPLANATION

#!/bin/bash — Tells the system "this is a bash script." Must be the first line.

export SNOW_PWD="$(cat /etc/secrets/snow_pwd)" — Read the password from a secure file, not typed in the script. $(command) runs a command and uses its output.

date +%Y%m%d_%H%M%S — Creates a timestamp like 20241201_020000 for the log filename. Each run gets its own log file.

> "$LOG_FILE" 2>&1 — Redirect both normal output (>) and errors (2>&1) to the log file.

EXIT_CODE=$? — Capture the exit code from your Python script (0=success, 1=failure, 2=warning).

mail -s "..." — Send an email alert if the pipeline failed. In production, this might be a Slack webhook or PagerDuty instead.

🔧 DO THIS NOW

Create run_pipeline.sh in your project folder. Make it executable: chmod +x run_pipeline.sh. Test it manually: ./run_pipeline.sh. Check that a log file was created in your log directory.

⚠️ ON WINDOWS?

Use Task Scheduler instead of cron. Open Task Scheduler → Create Basic Task → Set trigger (daily, 2 AM) → Action: Start a program → Program: python, Arguments: C:\path\to\pipeline.py, Start in: C:\path\to (your project folder, so relative paths such as run_history.json resolve there). The concept is identical; only the tool differs.
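Windows has no bash by default, so the wrapper script above will not run there as written. One option is to write the wrapper in Python instead, since it then works the same way on every platform. The sketch below is an assumption about how you might do that: `run_with_log` is a made-up helper name and the `logs` folder in the usage note is just an example location.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path


def run_with_log(command, log_dir):
    """Run a command, write all of its output to a timestamped log file,
    and return (exit_code, log_path)."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"run_{datetime.now():%Y%m%d_%H%M%S}.log"

    with open(log_path, "w", encoding="utf-8") as log_file:
        # stderr=STDOUT sends errors to the same file, like 2>&1 in bash
        result = subprocess.run(command, stdout=log_file, stderr=subprocess.STDOUT)
    return result.returncode, log_path
```

To use it, call `run_with_log([sys.executable, "pipeline.py"], "logs")` from a small script, send your alert if the returned exit code is not 0, and finish with `sys.exit(exit_code)` so Task Scheduler records the result. `sys.executable` is the path of the Python interpreter that is running the wrapper.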

Scheduling is itself compliance evidence. The fact that your pipeline runs on a reliable schedule supports SI-4 (System Monitoring) and CM-3 (Configuration Change Control). Your cron entry plus 90 days of log files demonstrates continuous operation, which is exactly what assessors want to see.

📝 LESSON RECAP

✓cron (Linux/Mac) or Task Scheduler (Windows) runs your pipeline automatically

✓Wrapper scripts handle environment, logging, and error notification

✓Each run gets its own timestamped log file

✓Exit codes tell the scheduler whether to alert

✓90 days of nightly logs = powerful compliance evidence

Lesson 20: Pipeline Monitoring & Alerting

Week 10 · Technical — Knowing something is wrong before anyone else does.

🎯 WHAT YOU'RE ABOUT TO LEARN

→The 4 things to monitor on every integration pipeline

→How to implement alerts (email, Slack webhook, or log file)

→Alert fatigue — why too many alerts is worse than no alerts

→Building a run history log for trend analysis

What to Monitor

You don't monitor "everything." You monitor the four things that tell you whether your pipeline is healthy:

🚨 1. Did the pipeline RUN?

The worst failure mode is silence — the pipeline stops running and nobody notices for weeks. Monitor: was a log file created today? If not, the scheduler or the script is broken.

# Check: has the pipeline run in the last 25 hours?
import os
import time

log_dir = "/var/log/grc-pipeline"
logs = sorted(os.listdir(log_dir))  # timestamped names sort oldest to newest
if logs:
    newest = os.path.getmtime(os.path.join(log_dir, logs[-1]))
    hours_ago = (time.time() - newest) / 3600
    if hours_ago > 25:
        print(f"🚨 Pipeline hasn't run in {hours_ago:.0f} hours!")
else:
    # No log files at all is also a failure: the pipeline has never run
    print("🚨 No pipeline logs found!")

⚠️ 2. Did it SUCCEED?

Check the exit code. 0 = success, non-zero = something went wrong. Your wrapper script already captures this.
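For the wrapper to see a meaningful exit code, pipeline.py has to set one. A minimal sketch of that decision is shown below, using the 0=success, 1=failure, 2=warning convention from the wrapper lesson. The function name `exit_code_for` and the thresholds are assumptions; tune them to what "failure" means for your pipeline. The `stats` keys follow the run statistics used elsewhere in this course.

```python
def exit_code_for(stats):
    """Map run statistics to an exit code: 0=success, 1=failure, 2=warning.

    The thresholds are illustrative assumptions, not a standard.
    """
    loaded = stats.get("created", 0) + stats.get("updated", 0)
    if stats.get("errors", 0) > 0 or (stats.get("extracted", 0) > 0 and loaded == 0):
        return 1  # something broke, or nothing made it to the destination
    if stats.get("invalid", 0) > 0:
        return 2  # run completed, but some records were rejected
    return 0
```

At the very end of pipeline.py you would then call `sys.exit(exit_code_for(stats))`, after importing `sys`.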

📊 3. Are the NUMBERS normal?

If your pipeline normally processes 500 findings and today it processed 5, something changed — even if it "succeeded." Track counts over time and alert on dramatic changes.

# Alert if extracted count drops more than 50% from yesterday
if today_count < yesterday_count * 0.5:
    print(f"⚠️ Extracted {today_count} vs {yesterday_count} yesterday — 50%+ drop")
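Comparing against yesterday alone can be noisy, because one unusual day shifts the baseline. A common alternative is to compare against the average of the last several runs. The sketch below assumes your run history is a list of entries that each contain a `stats` dictionary with an `extracted` count; the function name `count_dropped`, the 7-run window, and the 50% threshold are illustrative choices.

```python
from statistics import mean


def count_dropped(history, today_count, window=7, max_drop=0.5):
    """Return True if today's extracted count fell more than `max_drop`
    below the average of the last `window` runs in run history."""
    recent = [run["stats"]["extracted"] for run in history[-window:]]
    if not recent:
        return False  # no baseline yet, nothing to compare against
    baseline = mean(recent)
    return today_count < baseline * (1 - max_drop)


history = [{"stats": {"extracted": n}} for n in (500, 510, 490)]
print(count_dropped(history, 5))    # far below the ~500 baseline
print(count_dropped(history, 400))  # lower, but within tolerance
```

The first call prints `True` and the second prints `False`, since the baseline is 500 and only counts below 250 trigger the alert.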

✅ 4. Does the DATA reconcile?

Your reconciliation checks from Lesson 18: counts match, IDs match, data is fresh. If any check fails, alert.

Alert Fatigue — The Silent Killer

⚠️ THE RULE: EVERY ALERT MUST REQUIRE ACTION

If your pipeline sends an email for every warning, people stop reading the emails. Soon they miss the critical failures too. This is alert fatigue — and it's killed real compliance programs.

The rule: Only alert when someone needs to DO something. Informational messages go in logs, not inboxes. Reserve alerts for: pipeline didn't run, pipeline failed (exit code 1), data counts dropped dramatically, reconciliation failed.
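One way to enforce this rule is to keep the "should anyone be alerted?" decision in a single function, separate from the code that sends the message. The sketch below is an assumption about how that might look: `alerts_for_run` and `send_slack_alert` are made-up names, and the webhook URL is something you would create in your own Slack workspace. The sender uses only the standard library and posts the JSON body `{"text": ...}` that Slack incoming webhooks accept.

```python
import json
import urllib.request


def alerts_for_run(run, hours_since_last_run, count_dropped, reconciled):
    """Return alert messages only for conditions that need action.
    Anything else (warnings, info) belongs in the log file."""
    alerts = []
    if hours_since_last_run > 25:
        alerts.append(f"Pipeline has not run in {hours_since_last_run:.0f} hours")
    if run["exit_code"] == 1:
        alerts.append("Pipeline failed (exit code 1)")
    if count_dropped:
        alerts.append("Extracted count dropped dramatically")
    if not reconciled:
        alerts.append("Reconciliation failed")
    return alerts


def send_slack_alert(webhook_url, message):
    """POST a message to a Slack incoming webhook (URL supplied by you)."""
    body = json.dumps({"text": message}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A run that exits with code 2 (warning) produces an empty list here, so nothing reaches an inbox. Only the four action-worthy conditions generate a message.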

🔧 DO THIS NOW

Create a run_history.json file that your pipeline appends to after each run. Each entry should include: timestamp, extracted count, valid count, invalid count, created, updated, errors, exit code. After a week of manual runs, you'll have trend data.

📄 Appending to run history

import json
from datetime import datetime, timezone

def save_run_history(stats, exit_code):
    history_file = "run_history.json"
    try:
        with open(history_file, "r") as f:
            history = json.load(f)
    except FileNotFoundError:
        history = []
    history.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "stats": stats,
        "exit_code": exit_code
    })
    with open(history_file, "w") as f:
        json.dump(history, f, indent=2)

📝 LESSON RECAP

✓Monitor 4 things: did it run, did it succeed, are numbers normal, does data reconcile

✓Alert fatigue kills compliance programs — only alert when action is needed

✓Save run history for trend analysis — catch gradual degradation

✓The pipeline's monitoring system is itself compliance evidence

Lesson 21: Advanced Control Families

Week 10 · GRC — Expanding your control knowledge beyond the basics.

🎯 WHAT YOU'RE ABOUT TO LEARN

→CA (Assessment) — the controls about assessing OTHER controls

→SC (System & Communications Protection) — network security evidence

→IR (Incident Response) — how incident data feeds GRC

→How to quickly learn any new control family on the job

In Phase 1, you learned 5 controls: AC-2, AU-6, CM-8, RA-5, SI-4. On the job, you'll encounter many more. This lesson teaches you the pattern for learning any new control family quickly — and introduces three families you'll see often.

The Pattern for Learning Any Control

For any control, answer these 5 questions:
1. What does it require? (Read the control text in 800-53)
2. What data proves it works? (What would an assessor check?)
3. Which system has that data? (SIEM? Identity provider? Cloud API?)
4. Does that system have an API? (Can you pull it automatically?)
5. How often does the data change? (Daily? Weekly? Real-time?)
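If you keep your control mapping table in code rather than a spreadsheet, the five questions map naturally onto five fields. The sketch below is one possible structure, not a required format: `ControlMapping` and its field names are made up for illustration, and the RA-5 requirement text is a paraphrase, so check the wording against the 800-53 catalog.

```python
from dataclasses import dataclass, asdict


@dataclass
class ControlMapping:
    """One row of a control mapping table, built from the 5 questions."""
    control_id: str
    requirement: str    # 1. What does it require?
    evidence: str       # 2. What data proves it works?
    source_system: str  # 3. Which system has that data?
    has_api: bool       # 4. Does that system have an API?
    refresh: str        # 5. How often does the data change?

    def can_automate(self):
        # Evidence can only be pulled automatically if an API exists
        return self.has_api


ra5 = ControlMapping(
    control_id="RA-5",
    requirement="Scan for vulnerabilities and track remediation (paraphrased)",
    evidence="Open and closed vulnerability findings",
    source_system="AWS Security Hub",
    has_api=True,
    refresh="Daily",
)
print(asdict(ra5))
```

`asdict` turns the row into a plain dictionary, which makes it easy to write the whole table out as JSON or a Markdown table later.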

CA — Assessment, Authorization, and Monitoring

What it's about: Ensuring you regularly check whether controls actually work. Think of it as "the controls about assessing controls" — meta-compliance.

Key controls:

• CA-7: Continuous monitoring — ongoing assessment of control effectiveness. Your entire pipeline IS CA-7 evidence.

• CA-2: Control assessments — periodic formal testing. Your run history + reconciliation reports support this.

Integration angle: Your pipeline's run history, reconciliation reports, and dashboards ARE the evidence for CA-7 continuous monitoring.

SC — System and Communications Protection

What it's about: Protecting data in transit and at rest. Network segmentation, encryption, boundary protection.

Key controls:

• SC-7: Boundary protection — firewall rules, network segmentation

• SC-28: Protection of information at rest — encryption

Integration angle: Pull firewall rule sets from cloud APIs (AWS Security Groups, Azure NSGs). Pull encryption status from AWS Config or Azure Policy.
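As a concrete SC-7 example, here is a sketch that looks for inbound firewall rules open to the whole internet. It works on the `SecurityGroups` list that boto3's `ec2.describe_security_groups()` returns; the function name `find_open_ingress` and the sample values are made up, and the sample only imitates the shape of the real response. It checks IPv4 ranges only.

```python
def find_open_ingress(security_groups):
    """Return (group_id, from_port, to_port) for every inbound rule that
    allows traffic from anywhere (0.0.0.0/0).

    `security_groups` is the "SecurityGroups" list returned by boto3's
    ec2.describe_security_groups().
    """
    open_rules = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            for ip_range in rule.get("IpRanges", []):
                if ip_range.get("CidrIp") == "0.0.0.0/0":
                    open_rules.append(
                        (group["GroupId"], rule.get("FromPort"), rule.get("ToPort"))
                    )
    return open_rules


# Sample data in the same shape as the API response (values are made up)
sample = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ]},
]
print(find_open_ingress(sample))
```

Only the `sg-web` rule on port 443 is reported, because the database group is restricted to an internal address range. Each reported rule could become a finding that flows through the same transform, validate, and load steps as your Security Hub data.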

IR — Incident Response

What it's about: Being ready for security incidents: having a plan, training people, detecting incidents, analyzing them, and recovering.

Key controls:

• IR-4: Incident handling — detect, analyze, contain, recover

• IR-6: Incident reporting — report incidents to the right people

Integration angle: Pull incident ticket data from ticketing systems. Track mean time to detect (MTTD) and mean time to respond (MTTR). Feed incident counts and response metrics into GRC dashboards.
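Both metrics are simple averages once you have timestamps for each incident. The sketch below assumes hypothetical field names (`occurred_at`, `detected_at`, `resolved_at`) holding ISO-format timestamps; your ticketing system will use its own names. It also assumes "respond" is measured from detection to resolution, which is one common definition among several.

```python
from datetime import datetime


def mean_hours(incidents, start_field, end_field):
    """Average hours between two timestamp fields, skipping tickets
    where either timestamp is missing."""
    durations = []
    for incident in incidents:
        start, end = incident.get(start_field), incident.get(end_field)
        if start and end:
            delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
            durations.append(delta.total_seconds() / 3600)
    return sum(durations) / len(durations) if durations else None


incidents = [
    {"occurred_at": "2024-12-01T00:00:00", "detected_at": "2024-12-01T02:00:00",
     "resolved_at": "2024-12-01T10:00:00"},
    {"occurred_at": "2024-12-02T00:00:00", "detected_at": "2024-12-02T04:00:00",
     "resolved_at": None},  # still open, so excluded from MTTR
]
mttd = mean_hours(incidents, "occurred_at", "detected_at")
mttr = mean_hours(incidents, "detected_at", "resolved_at")
print(f"MTTD: {mttd} hours, MTTR: {mttr} hours")
```

With this sample data, MTTD is 3.0 hours (the average of 2 and 4) and MTTR is 8.0 hours, because the unresolved ticket is skipped.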

✏️ MINI EXERCISE

Pick one control you haven't learned yet (try PE-3, MP-6, or CP-9). Look it up in the NIST 800-53 catalog (free online). Answer the 5 questions above. Add it to your control mapping table.

📝 LESSON RECAP

✓The 5-question pattern works for learning any new control

✓CA-7 (Continuous Monitoring) — your pipeline IS the evidence

✓SC (System Protection) — network and encryption evidence from cloud APIs

✓IR (Incident Response) — incident metrics feed GRC dashboards

✓You don't memorize 1,000 controls — you learn the pattern for any control

Lesson 22: GRC Dashboards & Reporting

Week 11 · Technical — Build the reports that compliance teams actually use.

🎯 WHAT YOU'RE ABOUT TO LEARN

→The 5 metrics every GRC dashboard should show

→How to generate a summary report from your pipeline's run history

→The difference between operational dashboards and compliance reports

→How to create a simple HTML report your pipeline generates automatically

💼 WHY DASHBOARDS MATTER

Your pipeline produces data. But data sitting in ServiceNow isn't useful until someone can see trends, spot problems, and make decisions. Dashboards translate your raw data into actionable visibility — the thing compliance teams and CISOs actually care about.

The 5 Essential GRC Metrics

Open Findings by Severity
How many Critical, High, Medium, Low findings are currently open? Is the trend improving or worsening?

Overdue POA&M Items
How many items have passed their due date? This is often one of the first metrics CISOs and auditors look at.

Mean Time to Remediate (MTTR)
On average, how long does it take to fix a finding? Break it down by severity.

Pipeline Health
Is the integration running successfully? What's the success rate over the last 30 days?

Control Coverage
How many controls have automated evidence vs. manual-only evidence? What's the automation percentage?
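Three of these metrics can be computed with a few lines each once the findings are in a list. The sketch below uses hypothetical field names (`severity`, `status`, `due_date`, `automated`) rather than real ServiceNow columns, so treat it as a shape to adapt, not code to copy unchanged.

```python
from collections import Counter
from datetime import date


def open_by_severity(findings):
    """Count open findings per severity label."""
    return Counter(f["severity"] for f in findings if f["status"] == "open")


def overdue_items(findings, today):
    """Open findings whose due date has passed."""
    return [f for f in findings
            if f["status"] == "open" and date.fromisoformat(f["due_date"]) < today]


def automation_percentage(controls):
    """Share of controls with automated evidence, as a percentage."""
    if not controls:
        return 0.0
    automated = sum(1 for c in controls if c["automated"])
    return automated / len(controls) * 100


findings = [
    {"id": "f-1", "severity": "Critical", "status": "open", "due_date": "2024-11-15"},
    {"id": "f-2", "severity": "High", "status": "open", "due_date": "2025-01-10"},
    {"id": "f-3", "severity": "High", "status": "closed", "due_date": "2024-10-01"},
]
controls = [
    {"id": "RA-5", "automated": True},
    {"id": "CM-8", "automated": True},
    {"id": "PE-3", "automated": False},
    {"id": "MP-6", "automated": False},
]
today = date(2024, 12, 1)
print(dict(open_by_severity(findings)))
print([f["id"] for f in overdue_items(findings, today)])
print(automation_percentage(controls))
```

Passing `today` in as a parameter, instead of calling `date.today()` inside the function, keeps the overdue calculation easy to test with fixed dates.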

Building a Summary Report

Your pipeline already saves run history. Let's turn that into a readable report:

🔧 DO THIS NOW

Create generate_report.py:

📄 generate_report.py — your automated summary

import json
import sys
from datetime import datetime

# Load run history
with open("run_history.json", "r") as f:
    history = json.load(f)

# Nothing to report (and nothing to divide by) until at least one run is recorded
if not history:
    sys.exit("run_history.json contains no runs yet")

print("═" * 50)
print("GRC INTEGRATION — PIPELINE HEALTH REPORT")
print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}")
print(f"Total runs analyzed: {len(history)}")
print("═" * 50)

# Calculate metrics
successful = sum(1 for r in history if r["exit_code"] == 0)
total_extracted = sum(r["stats"]["extracted"] for r in history)
total_loaded = sum(r["stats"]["created"] + r["stats"]["updated"] for r in history)
print(f"\nSuccess rate: {successful}/{len(history)} runs ({successful/len(history)*100:.0f}%)")
print(f"Total records extracted: {total_extracted:,}")
print(f"Total records loaded: {total_loaded:,}")

# Latest run details
latest = history[-1]
print(f"\nLatest run: {latest['timestamp']}")
print(f"  Extracted: {latest['stats']['extracted']}")
print(f"  Loaded: {latest['stats']['created'] + latest['stats']['updated']}")
print(f"  Rejected: {latest['stats']['invalid']}")

Operational vs. Compliance Reports: Operational dashboards show real-time health (is the pipeline working?). Compliance reports prove continuous operation over time (it ran every night for 90 days). You need both. The report above is operational. Your 90-day run history log is compliance evidence.
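The same run history can also be turned into a simple HTML page that you can email or open in a browser. The sketch below is a minimal example under the assumption that each history entry has the `timestamp`, `stats`, and `exit_code` keys written by your run history code; the function name `render_html_report` and the page layout are illustrative only.

```python
import html


def render_html_report(history):
    """Build a small HTML page summarizing pipeline runs."""
    rows = []
    for run in history:
        stats = run["stats"]
        loaded = stats["created"] + stats["updated"]
        status = "OK" if run["exit_code"] == 0 else f"FAILED ({run['exit_code']})"
        rows.append(
            "<tr>"
            f"<td>{html.escape(run['timestamp'])}</td>"  # escape text before embedding
            f"<td>{stats['extracted']}</td>"
            f"<td>{loaded}</td>"
            f"<td>{stats['invalid']}</td>"
            f"<td>{html.escape(status)}</td>"
            "</tr>"
        )
    successful = sum(1 for run in history if run["exit_code"] == 0)
    return (
        "<html><body>"
        "<h1>GRC Integration: Pipeline Health Report</h1>"
        f"<p>Successful runs: {successful} of {len(history)}</p>"
        "<table border='1'>"
        "<tr><th>Run</th><th>Extracted</th><th>Loaded</th>"
        "<th>Rejected</th><th>Status</th></tr>"
        + "".join(rows)
        + "</table></body></html>"
    )
```

To produce the file, load run_history.json as in generate_report.py and write the returned string to a file such as report.html. Adding that step to the end of your pipeline means a fresh report exists after every run.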

📝 LESSON RECAP

✓5 essential metrics: open findings, overdue POA&Ms, MTTR, pipeline health, control coverage

✓Dashboards translate raw data into actionable visibility

✓Operational dashboards = real-time health; compliance reports = proof over time

✓Your pipeline can generate its own health report from run history

Lesson 23: Writing Integration Documentation

Week 11 · GRC — Professional documentation that makes your work understandable and maintainable.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What goes in a good README for a GRC integration project

→How to write a runbook (operational guide for your pipeline)

→Architecture diagrams that explain your integration visually

→Why documentation is a compliance requirement (CM-3, SA-5)

💼 WHY DOCUMENTATION MATTERS MORE THAN YOU THINK

You'll leave your job someday. Someone else will maintain your pipeline. If they can't understand it from the documentation, they'll rewrite it, wasting months. Good documentation also supports two controls. SA-5 (System Documentation) requires that system documentation is available and current; your README, architecture diagram, and runbook ARE SA-5 evidence. CM-3 (Configuration Change Control) requires documentation of changes; your Git history and README together help satisfy it. Your documentation IS compliance evidence.

1. The README

Every GitHub project needs a README.md. For a GRC integration, it should include:

🔧 DO THIS NOW

Create README.md in your project root with this structure:

📄 README.md — template for GRC integration projects

# GRC Integration: AWS Security Hub → ServiceNow

## What This Does
Pulls security findings from AWS Security Hub, transforms and validates them, and loads them into ServiceNow as trackable compliance items using upsert logic to prevent duplicates.

## Architecture
Source: AWS Security Hub (ASFF format)
Target: ServiceNow Table API (incident table)
Schedule: Nightly at 02:00 UTC via cron
Auth: OAuth client credentials (ServiceNow), IAM keys (AWS)

## Quick Start
1. Clone this repo
2. Install dependencies: `pip install requests boto3`
3. Set environment variables (see Configuration below)
4. Run: `python pipeline.py`

## Configuration
| Variable | Description | Example |
|----------|-------------|---------|
| SNOW_INSTANCE | ServiceNow instance name | dev12345 |
| SNOW_PWD | ServiceNow admin password | (from secrets) |
| AWS_ACCESS_KEY_ID | IAM access key | AKIA... |
| AWS_SECRET_ACCESS_KEY | IAM secret | (from secrets) |

## File Descriptions
- `pipeline.py` — Main ETL pipeline
- `snow_client.py` — Reusable ServiceNow API client
- `pull_findings.py` — AWS Security Hub extraction
- `due_dates.py` — Remediation deadline calculator
- `run_pipeline.sh` — Nightly run wrapper script

## Controls Supported
| Control | Evidence Provided |
|---------|-------------------|
| RA-5 | Vulnerability findings tracked to closure |
| CM-8 | Asset inventory via AWS resource ARNs |
| CA-7 | Continuous monitoring via nightly pipeline |
| CM-3 | Change control via Git commit history |

2. The Runbook

A runbook is an operational guide for the person who maintains the pipeline day-to-day. It answers: "What do I do when something goes wrong?"

RUNBOOK SECTIONS

Normal operation: Where logs are stored, how to verify a successful run, expected run time

Common failures: For each type of failure (auth, timeout, data quality), exact steps to diagnose and fix

Restarting after failure: Is it safe to re-run? (Yes, because of upsert logic.) How to re-run for a specific date range

Escalation: When to alert the team lead, when to contact the vendor, who to notify for compliance implications

Credential rotation: How to update API credentials when they expire. Where secrets are stored.

✏️ MINI EXERCISE

Write a "Common Failures" section for your runbook covering: (1) 401 Unauthorized — what to check, how to fix. (2) 429 Rate Limited — why it happens, what the pipeline does. (3) No findings extracted — possible causes (wrong region, filter too strict, Security Hub disabled).
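For the 429 entry, it helps to be able to point at the retry logic itself. The sketch below shows exponential backoff in isolation. It is an assumption about structure, not your pipeline's actual code: `RateLimited` and `AuthFailed` are hypothetical exception classes standing in for however your client reports 429 and 401 responses, and `call_with_backoff` is a made-up name.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API answers 429."""


class AuthFailed(Exception):
    """Hypothetical error raised when the API answers 401."""


def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry `func` on rate limiting, doubling the wait each time.
    Other exceptions (such as AuthFailed) propagate immediately."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries, let the caller handle it
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

The `sleep` parameter exists so tests can pass in a fake and run instantly. In a runbook you could summarize this as: "On 429 the pipeline waits 1, 2, then 4 seconds between retries and gives up after the fourth attempt. On 401 it stops immediately, because retrying with bad credentials cannot succeed."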

📝 LESSON RECAP

✓README: what it does, how to set up, configuration, files, controls supported

✓Runbook: normal operation, failure handling, restart procedures, escalation

✓Documentation supports SA-5 and CM-3 — it IS compliance evidence

✓Write for the person who replaces you — they'll thank you

Lesson 24: Multi-Source Integration Patterns

Week 12 · Technical — Scaling your architecture to pull from multiple systems.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to structure code for multiple source systems

→The adapter pattern — a standard way to add new sources

→Config-driven pipelines vs. hardcoded ones

→Planning your next integrations

The Problem: One Pipeline Isn't Enough

Right now you have one pipeline: Security Hub → ServiceNow. On the job, you'll need many: Tenable → ServiceNow, Splunk → ServiceNow, Entra ID → ServiceNow, and more. If you copy your pipeline code for each source, you'll have 10 slightly different scripts to maintain. When you fix a bug in one, you forget to fix it in the others.

The Adapter Pattern

Instead, create one shared pipeline and a small adapter for each source. Each adapter does only two things: extract data from its source and transform it into a common format. The shared pipeline handles validation, loading, logging, and reconciliation.

📊 THE ADAPTER PATTERN

📄 adapter_base.py — the adapter template

class SourceAdapter:
    """Base class — every source adapter follows this pattern."""

    def extract(self):
        """Pull raw records from the source system. Returns a list."""
        raise NotImplementedError("Subclass must implement extract()")

    def transform(self, raw_record):
        """Convert one source record into the common format."""
        raise NotImplementedError("Subclass must implement transform()")

    def get_correlation_id(self, raw_record):
        """Return the unique ID used for deduplication."""
        raise NotImplementedError


# Your Security Hub adapter
class SecurityHubAdapter(SourceAdapter):
    def extract(self):
        # Your existing pull_findings code goes here
        ...

    def transform(self, finding):
        # Your existing transform_finding code goes here
        ...

    def get_correlation_id(self, finding):
        return finding.get("Id", "")

WHY THIS MATTERS

Adding a new source: Create a new adapter class (e.g., TenableAdapter) with its own extract() and transform(). The rest of the pipeline doesn't change.

Fixing bugs: Fix validation, logging, or loading once in the shared pipeline. Every source benefits.

Testing: You can test each adapter independently with mock data.

You don't need to build this right now. This lesson shows you the direction your code should grow. When your employer says "now add Tenable findings too," you already know the architecture. You'll refactor your existing code into the adapter pattern rather than copying pipeline.py and changing the extract function.
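To make the "shared pipeline" half of the pattern concrete, here is a small runnable sketch. Everything in it is illustrative: `FakeScannerAdapter` is a hypothetical adapter with hardcoded data, `ADAPTERS` is a made-up registry that shows the config-driven idea, and `run_pipeline` leaves out validation and upsert to stay short.

```python
class SourceAdapter:
    """Same interface as adapter_base.py."""
    def extract(self):
        raise NotImplementedError
    def transform(self, raw_record):
        raise NotImplementedError
    def get_correlation_id(self, raw_record):
        raise NotImplementedError


class FakeScannerAdapter(SourceAdapter):
    """Hypothetical adapter with hardcoded data, standing in for a real source."""
    def extract(self):
        return [{"uid": "s-1", "sev": "HIGH"}, {"uid": "s-2", "sev": "LOW"}]
    def transform(self, raw_record):
        return {"correlation_id": self.get_correlation_id(raw_record),
                "severity": raw_record["sev"].lower()}
    def get_correlation_id(self, raw_record):
        return raw_record["uid"]


# Config-driven: adding a source means adding one entry here
ADAPTERS = {"fake_scanner": FakeScannerAdapter}


def run_pipeline(source_name, load):
    """Shared pipeline: the same steps run for every source."""
    adapter = ADAPTERS[source_name]()
    stats = {"extracted": 0, "loaded": 0}
    for raw in adapter.extract():
        stats["extracted"] += 1
        record = adapter.transform(raw)
        load(record)  # validation and upsert would go here in the real pipeline
        stats["loaded"] += 1
    return stats


destination = []
print(run_pipeline("fake_scanner", destination.append))
```

Because `run_pipeline` only talks to the adapter interface, a fake adapter like this one doubles as mock data for testing the shared pipeline without touching any real API.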

📝 LESSON RECAP

✓One shared pipeline + small adapters per source = maintainable architecture

✓Each adapter implements extract(), transform(), and get_correlation_id()

✓Fix bugs once in the shared pipeline, every source benefits

✓Config-driven pipelines scale better than copy-paste

Lesson 25: Interview Prep & Portfolio Review

Week 12 · GRC — Making your work presentable and practicing how you talk about it.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to present your GitHub portfolio for job applications

→The 5 interview questions you'll get and how to answer them

→How to talk about GRC integration work to both technical and non-technical people

Polishing Your GitHub Portfolio

✅ PORTFOLIO CHECKLIST

☐ README.md with clear description, setup instructions, architecture

☐ Clean commit history — each commit has a descriptive message

☐ No credentials or secrets in any file (check your .gitignore)

☐ Code is organized into files with clear names

☐ A .gitignore file that excludes logs, secrets, __pycache__

☐ Field mapping document (spreadsheet or markdown table)

☐ Control-to-integration mapping table

☐ At least one sample output or screenshot of the pipeline running

⚠️ BEFORE SHARING YOUR REPO — SECURITY CHECK

Search your entire repo, including past commits, for leaked secrets: git log --all -p | grep -i "password\|secret\|key\|token". If anything shows up, you need to rotate those credentials immediately AND remove them from Git history (use git filter-repo or BFG Repo-Cleaner; the older git filter-branch is no longer recommended by the Git project).
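The grep above matches broad keywords, so expect false positives. If you want a more targeted check of a file's contents, a few regular expressions go a long way. The sketch below is a rough illustration, not a replacement for a dedicated secret scanner: `SECRET_PATTERNS` and `scan_text` are made-up names, and the two patterns (an AWS access key ID shape and a quoted password assignment) are examples only. The sample key is the placeholder AWS uses in its own documentation.

```python
import re

# Patterns are examples only; real scanners use many more rules
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded password": re.compile(
        r"(?i)(password|passwd|pwd)\s*=\s*[\"'][^\"']+[\"']"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


sample = "\n".join([
    'instance = "dev12345"',
    'SNOW_PWD = "hunter2"',
    'key = "AKIAIOSFODNN7EXAMPLE"',
    'pwd = os.environ["SNOW_PWD"]',
])
print(scan_text(sample))
```

Lines 2 and 3 of the sample are flagged. Line 4 is not, because reading a password from an environment variable is exactly what you should be doing.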

The 5 Interview Questions

1. "Walk me through a GRC integration you've built."

Your answer structure:

• Source: "I pull security findings from AWS Security Hub using boto3 with paginated API calls."

• Transform: "I map ASFF fields to ServiceNow format — severity labels to priority values, resource ARNs to asset IDs."

• Validate: "Every record is validated for required fields and valid values before loading. Invalid records are logged and skipped."

• Load: "I use an upsert pattern — query by correlation ID first. If the record exists, update. If not, create. This prevents duplicates when the pipeline re-runs."

• Observe: "Structured JSON logging, run statistics, exit codes, and reconciliation checks."
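If an interviewer asks you to sketch the upsert idea on a whiteboard, a tiny in-memory version is enough to make the point. In the sketch below a plain dictionary stands in for the ServiceNow table, and the function name `upsert` and the sample records are made up for illustration.

```python
def upsert(store, record):
    """Create the record if its correlation_id is new, otherwise update it.
    `store` is a plain dict standing in for the ServiceNow table."""
    key = record["correlation_id"]
    action = "updated" if key in store else "created"
    # Merge new values over any existing ones for this ID
    store[key] = {**store.get(key, {}), **record}
    return action


table = {}
findings = [
    {"correlation_id": "finding-001", "priority": 1},
    {"correlation_id": "finding-002", "priority": 3},
]
first_run = [upsert(table, f) for f in findings]
second_run = [upsert(table, f) for f in findings]
print(first_run, second_run, len(table))
```

The first run creates two records and the second run updates the same two, so the table still holds exactly two records. That is what "idempotent" means in practice: re-running changes nothing about the end state.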

2. "How do you prevent duplicates?"

"I store the source system's unique finding ID as a correlation_id in ServiceNow. Before creating a record, I query for an existing record with that correlation_id. If found, I update it instead of creating a new one. This means the pipeline is idempotent — running it twice produces the same result as running it once."

3. "How does your work fit into the RMF lifecycle?"

"My integrations primarily support Step 7 — Continuous Monitoring. By automating evidence collection from security tools into the GRC platform, I ensure control effectiveness is tracked continuously, not just during annual assessments. The pipeline also supports RA-5 by tracking vulnerabilities to closure and CA-7 by providing evidence of ongoing monitoring."

4. "What happens when your pipeline fails?"

"I handle three levels of errors. Record-level errors — one bad record — are logged and skipped; the pipeline continues with the remaining records. Connection errors like rate limiting trigger exponential backoff retries. Fatal errors like authentication failures cause an immediate abort with a clear error message. The exit code tells the scheduler whether to alert."

5. "How do you know your data is complete?"

"Reconciliation. After every run, I compare: source count should equal loaded count plus rejected count. If there's a gap, I investigate. I also do ID-level reconciliation — every source finding ID should exist as a correlation_id in the destination. And I check freshness — if the pipeline hasn't run in 25 hours, something is wrong."

🔧 DO THIS NOW

Practice saying each answer out loud. Time yourself — each answer should be 30-60 seconds. Record yourself on your phone and listen back. You'll be surprised how much clearer you sound after 2-3 practice rounds.

📝 LESSON RECAP

✓Clean your GitHub repo: README, .gitignore, no secrets, clear commits

✓Practice the 5 core interview questions until they're natural

✓Structure answers: Source → Transform → Validate → Load → Observe

✓Connect every technical answer to a compliance purpose

Phase 3 Checkpoint

Day 90 — You've built a production-ready, documented, portfolio-worthy integration.


Production Readiness

Is your pipeline truly production-grade?

ENGINEERING

Portfolio & Communication

Could you show and explain your work to an employer?

PRESENTATION

📂 YOUR COMPLETE PORTFOLIO AT DAY 90

✓ Working pipeline: AWS Security Hub → Python → ServiceNow (with upsert)

✓ Reusable client: ServiceNow API class with CRUD operations

✓ Data quality: Transform, validate, reconcile — nothing gets lost

✓ Observability: Structured logging, exit codes, run history

✓ Automation: Scheduler wrapper script (cron or Task Scheduler)

✓ Monitoring: Freshness checks, count reconciliation, alert thresholds

✓ Reporting: Automated pipeline health report

✓ Documentation: README, runbook, field mapping, control mapping

✓ Architecture: Multi-source adapter pattern (designed, ready to implement)

✓ Interview prep: 5 core questions practiced and polished

🏢 WHERE YOU ARE NOW

You have built, from scratch, the same type of integration that GRC teams deploy in production environments. You can explain it technically (to engineers) and in compliance terms (to assessors and program managers). Your GitHub repo demonstrates hands-on skills. You are ready to interview for junior GRC integration, GRC engineering, or compliance automation roles.

90 days ago, you installed Python for the first time. Look at what you've built.

What's next: Phases 4 and 5 (months 4-12) will cover: webhooks and bi-directional sync, PowerShell and Microsoft Graph API, CI/CD pipelines, FedRAMP deep dive, OSCAL, Zero Trust architecture, advanced observability, and senior-level interview preparation. But those are growth topics — you already have enough to start applying for roles.

Phase 4: Months 4–6

Advanced integration patterns, CI/CD, FedRAMP, and multi-platform skills.

🎯 THE PHASE 4 MISSION

→Move beyond pull-based pipelines to event-driven architectures

→Learn PowerShell and Microsoft Graph API for Azure/M365 environments

→Deep dive into FedRAMP continuous monitoring requirements

→Add CI/CD and automated testing to your integration workflow

→Build bi-directional sync and advanced ServiceNow patterns

💼 WHAT CHANGES IN PHASE 4

Phases 1-3 built one integration from scratch. Phase 4 expands your toolkit: new languages (PowerShell), new platforms (Azure, M365), new patterns (webhooks, CI/CD), and deeper compliance knowledge (FedRAMP). These are the skills that separate a junior from a mid-level engineer.

Week-by-Week Plan

Webhooks & Bi-Directional Sync
Event-driven integration and keeping two systems in sync.

PowerShell & Microsoft Graph API
The second language of GRC integration + M365/Azure data.

FedRAMP & CI/CD
Deep compliance knowledge + automated deployment pipelines.

Advanced ServiceNow & Integration Testing
GRC module APIs, business rules, and writing tests that prove correctness.

Lesson 26: Webhooks & Event-Driven Integration

Week 13 · Technical — Instead of asking for data, let systems tell you when something happens.

🎯 WHAT YOU'RE ABOUT TO LEARN

→The difference between polling (pull) and webhooks (push)

→How webhooks work — registering, receiving, and processing events

→When to use webhooks vs. scheduled polling

→Security considerations: verifying webhook signatures

Pull vs. Push

Your Phase 2 pipeline uses polling (pull): every night, your script asks the source system "give me all findings." This works, but it means changes aren't reflected until the next run. A webhook flips this: it is a callback mechanism where the source system sends data TO your integration automatically, the instant something happens, such as a new finding, status change, or alert. Instead of you asking for data on a schedule, the data comes to you in real time.

⏰ POLLING (your current approach)

Your script runs at 2 AM
Asks: "Any new findings?"
Processes everything at once
Changes can wait up to 24 hours for the next run

⚡ WEBHOOKS (event-driven)

Source detects new finding
Instantly sends it to your endpoint
You process it immediately
Near-real-time updates

How Webhooks Work

Register — Tell the source system: "When event X happens, send a POST request to this URL."

Receive — Your endpoint (a small web server) listens for incoming POST requests.

Verify — Check the webhook signature to confirm it's really from the source system, not an attacker.

Process — Transform, validate, and load the data — the same pipeline steps you already know.

📄 A simple webhook receiver (Flask)

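import hashlib
import hmac
import os

from flask import Flask, request, jsonify

# transform_finding, validate_record, and the snow client come from your Phase 2 pipeline

app = Flask(__name__)
# Read the shared secret from the environment; never hardcode it
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"]

@app.route("/webhook/findings", methods=["POST"])
def receive_finding():
    # Step 3: Verify signature (computed over the raw request body)
    signature = request.headers.get("X-Signature", "")
    expected = hmac.new(
        WEBHOOK_SECRET.encode(), request.data, hashlib.sha256
    ).hexdigest()
    # compare_digest runs in constant time and copes with a missing header
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return jsonify({"error": "Invalid signature"}), 403

    # Step 4: Process, same as your pipeline
    finding = request.get_json()
    record = transform_finding(finding)
    valid, errors = validate_record(record)
    if valid:
        # Assumed query format: match on the correlation ID so a redelivered
        # webhook updates the existing record instead of duplicating it
        query = f"u_correlation_id={record['u_correlation_id']}"
        snow.find_or_create("incident", query, record)
    return jsonify({"status": "received"}), 200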

When to use which: Webhooks for real-time needs (critical alerts, incident response). Polling for bulk data loads (nightly vulnerability sync, weekly access reviews). Most GRC programs use both — webhooks for urgent events, polling for comprehensive data reconciliation.

⚠️ WEBHOOK SECURITY

Always verify signatures. Without verification, anyone who discovers your webhook URL can send fake data into your GRC platform. The source system signs each request with a shared secret; your receiver must verify that signature before processing.
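Both sides of the signature check can be sketched with the standard library alone. This is a minimal illustration, not any particular vendor's scheme: the helper names `sign_payload` and `verify_signature` are hypothetical, and it assumes the signature is a plain hex HMAC-SHA256 digest of the raw body. Real providers document their own header name and encoding.

```python
import hashlib
import hmac
from typing import Optional


def sign_payload(secret: str, body: bytes) -> str:
    """What the source system does: HMAC-SHA256 over the raw request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received: Optional[str]) -> bool:
    """What your receiver does: recompute the digest and compare in constant time."""
    if not received:
        return False  # Missing header: reject instead of crashing
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())


secret = "your-shared-secret"
body = b'{"id": "abc-123", "severity": "CRITICAL"}'
good_signature = sign_payload(secret, body)

print(verify_signature(secret, body, good_signature))         # genuine request
print(verify_signature(secret, body + b" ", good_signature))  # body was altered
print(verify_signature(secret, body, None))                   # header missing
```

Only the first call passes. Changing even one byte of the body produces a different digest, which is why the receiver must hash the raw bytes it received rather than a re-serialized copy of the JSON.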

📝 LESSON RECAP

✓Polling = you ask on a schedule; webhooks = source tells you immediately

✓Register → Receive → Verify signature → Process (same ETVL pipeline)

✓Always verify webhook signatures — unsigned webhooks are a security hole

✓Use webhooks for urgent events, polling for comprehensive reconciliation

Lesson 27: Bi-Directional Sync

Week 13 · Technical — When data needs to flow both ways between systems.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why some integrations need to write back to the source system

→Conflict resolution — what happens when both systems change the same record

→The "last write wins" problem and how to solve it

→Timestamp-based sync strategies

When Data Flows Both Ways

Your Phase 2 pipeline is one-directional: data flows from Security Hub → ServiceNow. But sometimes the GRC platform needs to write back. Example: when a POA&M item is marked "Remediated" in ServiceNow, you might need to update the finding's status in the scanner or trigger a re-scan.

📊 BI-DIRECTIONAL SYNC

The Conflict Problem

⚠️ WHAT HAPPENS WHEN BOTH SYSTEMS CHANGE THE SAME RECORD

At 2 PM, someone in ServiceNow changes a POA&M's status to "In Progress." At 2:05 PM, the scanner re-runs and your forward sync overwrites it back to "Open." The analyst's work just disappeared. This is a sync conflict.

Solution: Timestamp-Based Conflict Resolution

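from datetime import datetime, timezone

def parse_utc(value):
    """Parse '2024-05-01T14:05:00Z' or ServiceNow-style '2024-05-01 14:05:00' as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)

def should_update(source_record, dest_record):
    """Only update if the source change is newer than the destination change."""
    source_updated = source_record.get("updated_at", "")
    dest_updated = dest_record.get("sys_updated_on", "")
    if not source_updated:
        return False  # No source timestamp: don't risk overwriting
    if not dest_updated:
        return True   # Destination has no recorded change to protect
    # Compare parsed datetimes, not raw strings: the two systems format timestamps differently
    return parse_utc(source_updated) > parse_utc(dest_updated)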

Rule of thumb: Start one-directional. Only add write-back when there's a clear business need. Every direction of sync doubles the complexity and the potential for conflicts. Most junior/mid-level GRC integrations are one-directional.

📝 LESSON RECAP

✓Bi-directional sync: findings flow in, status updates flow back

✓Conflicts happen when both systems modify the same record

✓Timestamp comparison prevents overwriting newer changes

✓Start one-directional; add write-back only when truly needed

Lesson 28: PowerShell & Microsoft Graph API

Week 14 · Technical — The second language of GRC integration and the gateway to Azure/M365.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why PowerShell matters in GRC (many federal environments are Microsoft-heavy)

→PowerShell basics for someone who already knows Python

→Microsoft Graph API: one API for M365, Entra ID (formerly Azure AD), and Intune

→Pulling user data from Entra ID for AC-2 compliance

Python vs. PowerShell — Quick Translation

CONCEPT	PYTHON	POWERSHELL
Variable	`name = "value"`	`$name = "value"`
Print	`print("hello")`	`Write-Host "hello"`
API call	`requests.get(url)`	`Invoke-RestMethod -Uri $url`
Loop	`for item in list:`	`foreach ($item in $list) {`
JSON parse	`data = resp.json()`	`$data = $resp | ConvertFrom-Json`

Graph API — One API for Everything Microsoft

The Microsoft Graph API gives you access to users, groups, devices, security alerts, and more, all from one API. For GRC, the key data: user accounts (AC-2), device inventory (CM-8), and security alerts (SI-4).

Microsoft Graph API: A unified API for accessing data across Microsoft 365 services, including users, groups, mail, calendar, Teams, Intune devices, and security alerts. One authentication, one endpoint pattern.

📄 Pulling Entra ID users for AC-2 evidence (PowerShell)

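# Authenticate with client credentials (tenant ID and app secrets come from the environment)
$tenantId = $env:GRAPH_TENANT_ID
$body = @{
    grant_type    = "client_credentials"
    client_id     = $env:GRAPH_CLIENT_ID
    client_secret = $env:GRAPH_CLIENT_SECRET
    scope         = "https://graph.microsoft.com/.default"
}
$token = (Invoke-RestMethod -Uri "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token" `
    -Method POST -Body $body).access_token

# Pull all users. signInActivity is only returned when requested via $select,
# and reading it requires the AuditLog.Read.All permission.
$headers = @{ Authorization = "Bearer $token" }
$uri = 'https://graph.microsoft.com/v1.0/users?$select=displayName,signInActivity'
$users = @()
while ($uri) {
    $page = Invoke-RestMethod -Uri $uri -Headers $headers
    $users += $page.value
    $uri = $page.'@odata.nextLink'   # Follow paging links until none remain
}

# Show each account's last sign-in (input for the AC-2 stale-account review)
foreach ($user in $users) {
    $lastLogin = $user.signInActivity.lastSignInDateTime
    Write-Host "$($user.displayName) | Last login: $lastLogin"
}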

You don't need to master PowerShell. You need to read it, understand it, and write basic scripts. Many GRC environments use both Python and PowerShell — Python for cross-platform integrations, PowerShell for Microsoft-specific tasks. Being comfortable in both makes you significantly more employable.
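Listing last sign-in times is only half of the AC-2 evidence; someone still has to decide which accounts count as stale. Here is a minimal Python sketch of that decision, assuming user records shaped like the `value` array of a Graph `/users` response. The 90-day threshold, the function names, and the sample users are all hypothetical; use whatever inactivity window your own policy defines.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_graph_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a UTC timestamp such as '2024-05-20T08:15:00Z'; None if absent."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_stale_accounts(users, now, max_age_days=90):
    """Return display names of accounts with no sign-in inside the window."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for user in users:
        activity = user.get("signInActivity") or {}
        last_login = parse_graph_time(activity.get("lastSignInDateTime"))
        # Accounts that have never signed in are flagged too
        if last_login is None or last_login < cutoff:
            stale.append(user.get("displayName", "<unknown>"))
    return stale


# Hypothetical sample data for illustration only
sample_users = [
    {"displayName": "Alice Analyst",
     "signInActivity": {"lastSignInDateTime": "2024-05-20T08:15:00Z"}},
    {"displayName": "Bob Builder",
     "signInActivity": {"lastSignInDateTime": "2024-01-02T17:40:00Z"}},
    {"displayName": "svc-backup", "signInActivity": None},
]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(find_stale_accounts(sample_users, now))
```

Passing `now` in as a parameter, rather than calling `datetime.now()` inside the function, keeps the check repeatable: the same inputs always flag the same accounts, which makes it easy to unit test.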

📝 LESSON RECAP

✓PowerShell uses $ for variables, {} for blocks, | for piping

✓Graph API: one API for all Microsoft 365 data — users, devices, alerts

✓AC-2 evidence from Entra ID: user accounts, last login, group memberships

✓Being comfortable in both Python and PowerShell makes you significantly more employable

Lesson 29: FedRAMP Deep Dive

Week 14 · GRC — The compliance framework driving the largest demand for GRC integration work.

🎯 WHAT YOU'RE ABOUT TO LEARN

→FedRAMP authorization levels and what they require

→Continuous monitoring (ConMon) — the monthly deliverables

→How your integration skills map directly to FedRAMP ConMon

→Why FedRAMP jobs pay well and have high demand

FedRAMP Impact Levels

Low

~125 controls (Rev. 4 baseline; 156 under Rev. 5)

Public-facing info, no PII

Moderate

~325 controls (Rev. 4 baseline; 323 under Rev. 5)

Most common. Controlled data.

High

~421 controls (Rev. 4 baseline; 410 under Rev. 5)

Law enforcement, healthcare, financial

Monthly ConMon Deliverables

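FedRAMP-authorized cloud providers must keep these deliverables current. Scan results, POA&M updates, and inventory changes are submitted every month; significant changes and incidents are reported as they occur. Each one is an integration opportunity: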

📋 Vulnerability scan results — all systems scanned, findings tracked (your pipeline does this)

📋 POA&M updates — status of every open weakness, new items, closed items

📋 Inventory changes — what was added, removed, or changed in the system

📋 Significant changes — architecture changes, new integrations, new data flows

📋 Incident reports — any security incidents and their resolution
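The POA&M update in particular reduces to comparing this month's snapshot with last month's. A minimal sketch of that comparison follows, assuming each snapshot is a dictionary of item ID to status. The `poam_delta` name, the status labels, and the sample IDs are hypothetical, not a FedRAMP-defined format.

```python
def poam_delta(previous: dict, current: dict) -> dict:
    """Summarize what changed between two POA&M snapshots (item ID -> status)."""
    new_items = sorted(item for item in current if item not in previous)
    # Closed this period: closed now, and present but not closed last month
    closed_items = sorted(
        item for item, status in current.items()
        if status == "Closed" and previous.get(item) not in (None, "Closed")
    )
    # Open items that already existed in last month's snapshot
    carried_open = sorted(
        item for item, status in current.items()
        if status != "Closed" and item in previous
    )
    return {"new": new_items, "closed": closed_items, "carried_open": carried_open}


# Hypothetical snapshots for illustration only
last_month = {"V-1": "Open", "V-2": "In Progress", "V-3": "Closed"}
this_month = {"V-1": "Closed", "V-2": "In Progress", "V-3": "Closed", "V-4": "Open"}

print(poam_delta(last_month, this_month))
```

The three lists correspond to the "new items, closed items, status of every open weakness" breakdown in the deliverable, so the same function can feed a monthly summary without anyone diffing spreadsheets by hand.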

🏢 THE JOB MARKET

FedRAMP-related roles consistently pay 15-30% more than general GRC positions because: (1) the work is technically complex, (2) the compliance requirements are strict, (3) demand exceeds supply, and (4) federal contractors often require clearances that limit the candidate pool. Your integration skills map directly to ConMon automation — the highest-demand area.

📝 LESSON RECAP

✓FedRAMP Rev. 4 baselines: Low (~125), Moderate (~325), and High (~421) controls; Rev. 5 changes these to 156, 323, and 410

✓ConMon requires monthly vulnerability scans, POA&M updates, and inventory changes

✓Your existing pipeline skills directly automate ConMon deliverables

✓FedRAMP roles pay well due to technical complexity and limited talent pool

Lesson 30: CI/CD for Integration Pipelines

Week 15 · Technical — Automated testing and deployment for your integration code.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What CI/CD means and why it matters for GRC integrations

→Setting up GitHub Actions to test your code on every push

→Writing unit tests for your transform and validate functions

→Why automated testing supports CM-3 and SA-11

What Is CI/CD?

CI/CD means: every time you push code to GitHub, automated tests run. If they pass, the code can be deployed. If they fail, you know immediately, before broken code reaches production.

CI/CD: Continuous Integration / Continuous Deployment. CI = every code change is automatically tested. CD = tested code is automatically deployed. Together, they prevent broken code from reaching production.

📄 .github/workflows/test.yml — your first CI pipeline

name: Test Integration Pipeline
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install requests boto3 pytest
      - run: pytest tests/ -v

📄 tests/test_transform.py — unit tests for your transform function

from pipeline import transform_finding, validate_record

def test_transform_critical_finding():
    finding = {"Title": "S3 unencrypted",
               "Severity": {"Label": "CRITICAL"},
               "Id": "abc-123",
               "Resources": [{"Id": "arn:aws:s3:::bucket"}]}
    result = transform_finding(finding)
    assert result["priority"] == "1"  # Critical = priority 1
    assert result["u_correlation_id"] == "abc-123"

def test_validate_rejects_missing_fields():
    bad_record = {"short_description": "test"}  # Missing priority, correlation_id
    valid, errors = validate_record(bad_record)
    assert not valid
    assert len(errors) >= 1

Why this is compliance evidence: Automated tests support SA-11 (Developer Testing and Evaluation) and CM-3 (Configuration Change Control). Every push is tested, every test result is logged, and the green checkmark on GitHub shows that your code was validated before deployment.

📝 LESSON RECAP

✓CI/CD = automatic testing and deployment on every code push

✓GitHub Actions runs your tests on every push (free for public repositories, with a monthly free allowance for private ones)

✓Unit tests verify transform and validate functions work correctly

✓Automated testing supports SA-11 and CM-3 compliance

Lesson 31: Configuration as Code

Week 15 · GRC — Managing integration configuration the same way you manage code.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Externalizing configuration from code (config files, not hardcoded values)

→Environment-specific configs (dev vs. staging vs. production)

→Secrets management best practices

→How config-as-code supports CM-2 (Baseline Configuration)

The Problem: Hardcoded Values

Right now your pipeline probably has values like "dev12345" and REMEDIATION_DAYS = {"Critical": 30, ...} scattered through the code. When you move to production, you need to change all of these. If you miss one, things break.

❌ HARDCODED

instance = "dev12345"
DAYS = {"Critical": 30}

✅ EXTERNALIZED

config = load_config("config.yml")
instance = config["instance"]

📄 config.yml — your integration configuration

# config.yml: change this without touching code
servicenow:
  instance: "dev12345"
  table: "incident"
source:
  type: "security_hub"
  region: "us-east-1"
  severity_filter: ["CRITICAL", "HIGH"]
remediation_days:
  Critical: 30
  High: 90
  Medium: 180
  Low: 365

Why this matters for CM-2: Baseline Configuration (CM-2) requires a documented, maintained baseline. When your integration settings live in a version-controlled config file, every change is tracked in Git. The assessor can see what the config was, when it changed, and who changed it.
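Environment-specific configs and secrets handling can be layered on top of the same file. The sketch below is one possible approach, not the course's `load_config`: it assumes the YAML has already been parsed into dictionaries (for example with PyYAML's `yaml.safe_load`), and the `overrides` structure, the `build_config` name, and the `SNOW_PASSWORD` variable are hypothetical.

```python
import copy
import os


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def build_config(base, overrides, env_name, environ=os.environ):
    """Apply the overrides for one environment, then pull secrets from env vars."""
    config = deep_merge(base, overrides.get(env_name, {}))
    # Secrets never live in the file; fail fast if one is missing
    password = environ.get("SNOW_PASSWORD")
    if not password:
        raise RuntimeError("SNOW_PASSWORD is not set")
    config["servicenow"]["password"] = password
    return config


# Shared defaults plus a per-environment override (values are illustrative)
base = {
    "servicenow": {"instance": "dev12345", "table": "incident"},
    "remediation_days": {"Critical": 30, "High": 90},
}
overrides = {"prod": {"servicenow": {"instance": "agency-prod"}}}

config = build_config(base, overrides, "prod", environ={"SNOW_PASSWORD": "example"})
print(config["servicenow"]["instance"], config["servicenow"]["table"])
```

Only the values that differ per environment go in the override, so dev and prod cannot silently drift apart on everything else, and the Git history of the file still shows every baseline change.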

📝 LESSON RECAP

✓Externalize configuration: YAML or JSON file, not hardcoded in Python

✓Separate configs per environment (dev, staging, prod)

✓Secrets stay in environment variables or secrets managers — never in config files

✓Config files in Git = CM-2 evidence

Lesson 32: Advanced ServiceNow Integration

Week 16 · Technical — Beyond the Table API: GRC-specific modules and patterns.

🎯 WHAT YOU'RE ABOUT TO LEARN

→ServiceNow GRC module (Governance, Risk, Compliance) tables and APIs

→Working with custom fields (u_ prefix fields)

→Attachment API — uploading evidence files via API

→Business rules and their impact on your integrations

ServiceNow GRC Tables

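In Phase 2 you used the incident table for practice. In production, GRC data lives in specialized tables. Exact table names depend on which GRC plugins and release your instance runs, so confirm them with your ServiceNow admin: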

TABLE	PURPOSE	YOUR INTEGRATION
`sn_grc_item`	GRC items (POA&Ms, findings)	Load vulnerability findings here
`sn_compliance_policy`	Compliance policies/controls	Map controls to evidence sources
`cmdb_ci`	Configuration Items (assets)	Reconcile cloud assets with CMDB
`sn_risk_risk`	Risk register entries	Create risks from aggregated findings

Uploading Evidence via API

📄 Attaching a file to a ServiceNow record

import requests

# Method on your ServiceNow client class (self.base_url ends in /api/now)
def upload_evidence(self, table, sys_id, filepath, filename):
    """Attach an evidence file to a GRC record."""
    url = f"{self.base_url}/attachment/file"
    headers = {**self.headers, "Content-Type": "application/octet-stream"}
    params = {"table_name": table, "table_sys_id": sys_id, "file_name": filename}
    # Pass the open file so requests streams it instead of loading it all into memory
    with open(filepath, "rb") as f:
        resp = requests.post(url, auth=self.auth, headers=headers,
                             params=params, data=f, timeout=60)
    resp.raise_for_status()
    return resp.json()["result"]

Business rules warning: ServiceNow has "business rules" — server-side scripts that run when records are created or updated. These can modify your data, reject your API calls, or trigger workflows you didn't expect. Always test in your PDI before production. Ask the ServiceNow admin: "What business rules run on this table?"
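One way to notice that a business rule changed your data is to read the record back after writing it and compare. The helper below is a hypothetical sketch of that comparison step only; fetching the stored record is left to your existing client. It compares values as strings on the assumption that the Table API returns field values as strings.

```python
def find_rewritten_fields(sent: dict, stored: dict) -> dict:
    """Return fields whose stored value differs from what the integration sent."""
    return {
        field: {"sent": value, "stored": stored.get(field)}
        for field, value in sent.items()
        if str(stored.get(field, "")) != str(value)
    }


# Hypothetical example: a rule on the table recalculated the priority
sent = {"priority": "1", "short_description": "S3 unencrypted"}
stored = {"priority": "3", "short_description": "S3 unencrypted", "sys_id": "abc"}

print(find_rewritten_fields(sent, stored))
```

Logging the result during PDI testing gives you a concrete list of fields to raise with the ServiceNow admin, instead of discovering the differences during an assessment.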

📝 LESSON RECAP

✓GRC data uses specialized tables: sn_grc_item, sn_compliance_policy, cmdb_ci

✓Attachment API lets you upload evidence files programmatically

✓Custom fields (u_ prefix) are organization-specific — check the data dictionary

✓Business rules can modify your data — always test in PDI first

Lesson 33: Integration Testing

Week 16 · Technical — Proving your pipeline works correctly, automatically.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Three types of tests: unit, integration, and end-to-end

→Mocking API responses so tests don't need live systems

→Testing the upsert pattern: run-twice, same-result verification

→Test data management — creating and cleaning up test records

Three Types of Tests

Unit Tests — Test one function in isolation

"Does transform_finding() correctly map CRITICAL to priority 1?" No API calls needed — just input and output.

Integration Tests — Test functions working together

"Does the pipeline correctly extract, transform, validate, and load one finding?" Uses mocked API responses.

End-to-End Tests — Test the full pipeline against real systems

"Does the pipeline run successfully against the PDI?" Uses the actual ServiceNow API. Run in a test environment only.

The Idempotency Test

The most important test for any GRC integration: run the pipeline twice with the same data and verify the results are identical. Same number of records, no duplicates, all updates.

def test_pipeline_idempotent():
    """Running twice should produce the same result as running once."""
    # First run
    stats1 = run_pipeline(test_findings)
    # Second run, same data
    stats2 = run_pipeline(test_findings)
    assert stats2["created"] == 0  # No new records on second run
    assert stats2["updated"] == stats1["created"]  # All records updated
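Integration tests like the ones described above depend on mocked API responses, so they run without a live ServiceNow instance. The sketch below uses `unittest.mock.MagicMock` from the standard library. The `load_finding` function and the client's `find`, `create`, and `update` methods are hypothetical stand-ins, not the course's actual client interface.

```python
from unittest.mock import MagicMock


def load_finding(client, record):
    """Tiny stand-in for the load step: update if the record exists, else create."""
    existing = client.find("incident", record["u_correlation_id"])
    if existing:
        client.update("incident", existing["sys_id"], record)
        return "updated"
    client.create("incident", record)
    return "created"


def test_load_creates_when_missing():
    client = MagicMock()
    client.find.return_value = None  # Mocked response: nothing found
    record = {"u_correlation_id": "abc-123"}
    assert load_finding(client, record) == "created"
    client.create.assert_called_once_with("incident", record)
    client.update.assert_not_called()


def test_load_updates_when_present():
    client = MagicMock()
    client.find.return_value = {"sys_id": "42"}  # Mocked response: record exists
    record = {"u_correlation_id": "abc-123"}
    assert load_finding(client, record) == "updated"
    client.update.assert_called_once_with("incident", "42", record)
    client.create.assert_not_called()
```

The mock records every call made to it, so each test checks both the return value and that the right API method was called with the right arguments. No network access or test-record cleanup is needed.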

📝 LESSON RECAP

✓Unit tests verify individual functions; integration tests verify the pipeline

✓Mock API responses so tests work without live systems

✓The idempotency test: run twice, same result, no duplicates

✓Always clean up test data after end-to-end tests

Phase 4 Checkpoint

Month 6 — You've expanded from one pipeline to a professional integration toolkit.

📂 YOUR EXPANDED TOOLKIT

✓ Webhook receiver for real-time events

✓ Bi-directional sync with conflict resolution

✓ PowerShell scripts for Microsoft/Azure environments

✓ CI/CD pipeline with automated tests

✓ Externalized configuration (config-as-code)

✓ Advanced ServiceNow GRC module integration

✓ FedRAMP ConMon knowledge

✓ Unit, integration, and idempotency tests

🏢 WHERE YOU ARE NOW

You're no longer a junior building their first pipeline. You have the toolkit of a mid-level GRC integration engineer: multi-platform, multi-language, tested, automated, and FedRAMP-aware. You can design and implement integrations, not just build them to spec.

Phase 5: Months 6–12

Senior-level skills: OSCAL, Zero Trust, observability, migration, and leadership.

🎯 THE PHASE 5 MISSION

→OSCAL — the future of machine-readable compliance

→Zero Trust architecture and its integration implications

→Advanced observability — metrics, traces, and SLOs

→GRC platform migrations — the hardest integration projects

→Building and leading a GRC integration practice

💼 WHAT CHANGES IN PHASE 5

Phase 5 isn't about learning to build — you already can. It's about learning to design, lead, and modernize. These skills separate a senior engineer who architects solutions from a mid-level engineer who implements them. Many of these topics are forward-looking — you'll be ahead of most practitioners.

Week-by-Week Plan

OSCAL & Zero Trust
Machine-readable compliance and the new security architecture.

Advanced Observability & Migrations
Enterprise-grade monitoring and the hardest integration projects.

Multi-Cloud & GRC Program Management
Hybrid environments and leading a compliance automation program.

Building a Practice & Senior Interview Prep
Thought leadership, team building, and executive communication.

Lesson 34: OSCAL — Machine-Readable Compliance

Week 17 · GRC — The future of GRC is structured data, not Word documents.

🎯 WHAT YOU'RE ABOUT TO LEARN

→What OSCAL is and why it's transforming GRC

→The OSCAL data models: catalog, profile, SSP, assessment, POA&M

→How OSCAL integrations differ from traditional ones

→Why learning OSCAL now puts you ahead of most GRC practitioners

The Problem OSCAL Solves

Today, most SSPs are 300+ page Word documents. POA&Ms are spreadsheets. Control catalogs are PDFs. Updating them means editing documents, not data. OSCAL changes this by representing all compliance artifacts as structured data: JSON, XML, or YAML that machines can process.

OSCAL (Open Security Controls Assessment Language): A NIST standard for representing security plans, assessments, and POA&Ms as structured data (JSON, XML, YAML) instead of Word documents. It makes compliance artifacts machine-readable, enabling automation at scale.

📄 TODAY: DOCUMENTS

300-page SSP in Word
POA&M in Excel
Copy-paste between systems
Manual updates quarterly

🔗 FUTURE: OSCAL DATA

SSP as JSON/YAML
POA&M as structured data
API-driven updates
Real-time from your pipeline

OSCAL Models

Catalog: Machine-readable version of NIST 800-53 controls

Profile: Which controls apply to YOUR system (your baseline selection)

System Security Plan (SSP): Your implementation details as data, not prose

Assessment Results: Control test outcomes as structured records

POA&M: Weakness tracking as data your pipeline can update via API

Why this matters for you: When GRC artifacts are data instead of documents, YOUR integration skills become even more valuable. You can programmatically update SSPs, generate assessment reports, and manage POA&Ms entirely through APIs. Learning OSCAL now puts you years ahead of most GRC practitioners who still think in Word documents.
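To make "POA&M as data" concrete, here is a simplified sketch that wraps pipeline records in an OSCAL-style POA&M document. It is an illustration under stated assumptions, not a schema-complete implementation: the key names follow the OSCAL JSON format as commonly documented, the `oscal-version` value and the function names are assumptions, and real output should be validated against the official NIST schema.

```python
import json
import uuid
from datetime import datetime, timezone


def finding_to_poam_item(record: dict) -> dict:
    """Map one pipeline record to a minimal POA&M item."""
    return {
        "uuid": str(uuid.uuid4()),
        "title": record["short_description"],
        "description": record.get("description", record["short_description"]),
    }


def build_poam(title: str, records: list, oscal_version: str = "1.1.2") -> dict:
    """Wrap the items in a document with the metadata OSCAL expects."""
    return {
        "plan-of-action-and-milestones": {
            "uuid": str(uuid.uuid4()),
            "metadata": {
                "title": title,
                "last-modified": datetime.now(timezone.utc).isoformat(),
                "version": "1.0",
                "oscal-version": oscal_version,  # Assumed version string
            },
            "poam-items": [finding_to_poam_item(r) for r in records],
        }
    }


# Hypothetical record in the shape your transform step already produces
records = [{"short_description": "S3 bucket unencrypted"}]
poam = build_poam("Demo System POA&M", records)
print(json.dumps(poam, indent=2))
```

Because the result is an ordinary dictionary, the same transform and validate steps from your pipeline apply; only the output shape changes.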

📝 LESSON RECAP

✓OSCAL represents compliance artifacts as structured data (JSON/XML/YAML)

✓Five models covered here: Catalog, Profile, SSP, Assessment Results, POA&M (OSCAL also defines Component Definition and Assessment Plan models)

✓OSCAL enables API-driven compliance — exactly what you build

✓FedRAMP is moving toward OSCAL-based, machine-readable submissions, so demand for these skills is likely to grow

Lesson 35: Zero Trust Architecture

Week 17 · GRC — The security model reshaping how integrations are designed.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Zero Trust principles — "never trust, always verify"

→The 7 Zero Trust pillars and what data each one needs

→How Zero Trust changes YOUR integration design

→NIST SP 800-207 and the federal Zero Trust mandate

What Is Zero Trust?

Traditional security: "If you're inside the network, you're trusted." Zero Trust says: "Trust nobody. Verify every request. Assume the network is compromised." This creates enormous integration needs: you need data from identity, device, network, and application systems feeding a central policy engine.

Zero Trust: A security model where no user, device, or network location is automatically trusted. Every access request is verified based on identity, device health, location, and behavior, regardless of whether you're "inside" or "outside" the network.

The 7 Pillars

🔐 Identity — Who is requesting access? → Entra ID, Okta (your AC-2 integration)

💻 Device — Is their device compliant? → Intune, CrowdStrike, Tanium

🌐 Network — Micro-segmentation, encrypted traffic → AWS VPC, Azure NSG

📱 Application — Is the app authorized and patched? → CMDB, scan results

📦 Data — Is data classified, encrypted, access-controlled? → DLP, encryption APIs

📊 Visibility — Can we see everything? → SIEM, your dashboards

🤖 Automation — Can we respond automatically? → SOAR, YOUR INTEGRATIONS

Every pillar needs data integration. Zero Trust doesn't work with siloed tools. It requires data flowing between identity providers, device managers, network tools, SIEMs, and policy engines. This is exactly what you build. Zero Trust is creating more GRC integration demand, not less.

📝 LESSON RECAP

✓"Never trust, always verify" — every access request is checked

✓7 pillars: identity, device, network, application, data, visibility, automation

✓Zero Trust requires massive data integration between security tools

✓Federal agencies are mandated to adopt Zero Trust (EO 14028, OMB M-22-09)

Lesson 36: Advanced Observability

Week 18 · Technical — Enterprise-grade monitoring: metrics, SLOs, and operational excellence.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Beyond logs: metrics, traces, and structured events

→Defining SLOs for your integration pipeline

→What "observable" means and why it matters for compliance

The Three Pillars of Observability

Logs

What happened? Structured JSON records of events. You built this in Phase 2.

Metrics

How much? Numerical measurements over time: records processed per run, error rate, latency, uptime. Feed these into Prometheus, CloudWatch, or Datadog.

Traces

How does one record flow through the system? Tracing follows a single finding from extraction through transformation, validation, and loading — showing exactly where it spent time or failed.

SLOs for Your Pipeline

SLO (Service Level Objective): A target you set for your service's reliability. Example: "99% of pipeline runs complete successfully" or "findings appear in ServiceNow within 4 hours of discovery." SLOs quantify what "working" means.

SLOs define "what does 'working' mean?" for your pipeline:

📊 Availability: 99% of scheduled pipeline runs complete successfully

📊 Freshness: Findings appear in ServiceNow within 4 hours of discovery

📊 Completeness: 99.5% of source findings are loaded (reconciliation rate)

📊 Accuracy: Less than 1% data quality rejection rate
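These four targets can be measured directly from run statistics. The sketch below assumes each pipeline run logs a small record; the field names (`ok`, `source_count`, `loaded`, `rejected`, `lag_hours`) and the sample numbers are hypothetical, while the thresholds are the ones listed above.

```python
def slo_report(runs):
    """Measure the four SLOs over a list of pipeline run records."""
    total_runs = len(runs)
    source_total = sum(r["source_count"] for r in runs)
    if total_runs == 0 or source_total == 0:
        raise ValueError("need at least one run with source findings")

    availability = sum(1 for r in runs if r["ok"]) / total_runs
    completeness = sum(r["loaded"] for r in runs) / source_total
    rejection_rate = sum(r["rejected"] for r in runs) / source_total
    worst_lag = max(r["lag_hours"] for r in runs)

    return {
        "availability": {"value": availability, "met": availability >= 0.99},
        "freshness": {"value": worst_lag, "met": worst_lag <= 4},
        "completeness": {"value": completeness, "met": completeness >= 0.995},
        "accuracy": {"value": rejection_rate, "met": rejection_rate < 0.01},
    }


# Hypothetical run history: the third run failed and loaded nothing
runs = [
    {"ok": True, "source_count": 200, "loaded": 200, "rejected": 0, "lag_hours": 1.5},
    {"ok": True, "source_count": 200, "loaded": 199, "rejected": 1, "lag_hours": 2.0},
    {"ok": False, "source_count": 200, "loaded": 0, "rejected": 0, "lag_hours": 26.0},
]

for name, result in slo_report(runs).items():
    print(f"{name}: {result['value']:.3f} met={result['met']}")
```

One failed run out of three breaks availability, freshness, and completeness at once while accuracy still passes. That is the point of tracking them separately: each SLO points at a different kind of problem.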

📝 LESSON RECAP

✓Three pillars: logs (events), metrics (numbers), traces (flows)

✓SLOs define measurable reliability targets for your pipeline

✓Observable pipelines prove to auditors that you know when things break

Lesson 37: GRC Platform Migrations

Week 18 · Technical — The hardest and highest-value integration projects.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why organizations migrate GRC platforms (and how often)

→The migration pipeline: extract from old, transform, load into new

→Data mapping between two different GRC schemas

→Validation and reconciliation during migration

GRC platform migrations (e.g., Archer → ServiceNow, CSAM → eMASS, eMASS → ServiceNow) are the highest-value integration projects because they touch every compliance artifact: SSPs, POA&Ms, control assessments, evidence, risk registers, and asset inventories.

Why Migrations Are Hard

⚠️ Different schemas: Field names, data types, and relationships differ between platforms

⚠️ Historical data: You must preserve years of audit history, not just current state

⚠️ Custom fields: Every organization customizes their GRC platform differently

⚠️ Relationships: Controls link to systems, systems link to POA&Ms, POA&Ms link to findings — preserving these links is critical

⚠️ Zero downtime: Compliance doesn't pause during migration — both systems may run in parallel

The good news: A GRC migration IS an ETL pipeline — the same architecture you've been building. Extract from the old platform, transform between schemas, validate, and load into the new platform. Your skills transfer directly. The complexity is in the mapping and the volume, not in new technical patterns.
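The mapping and reconciliation steps can be sketched in a few lines. Everything named here is hypothetical: the legacy field names, the target field names, and the sample records stand in for whatever your two platforms actually use.

```python
# Hypothetical mapping: legacy field name -> new platform field name
FIELD_MAP = {
    "Weakness_ID": "u_legacy_id",
    "Weakness_Description": "short_description",
    "Scheduled_Completion": "due_date",
}


def map_record(old_record: dict):
    """Translate one record and report any fields the mapping does not cover."""
    unmapped = sorted(set(old_record) - set(FIELD_MAP))
    new_record = {new: old_record.get(old) for old, new in FIELD_MAP.items()}
    return new_record, unmapped


def reconcile(old_records: list, new_records: list) -> dict:
    """Compare record IDs on both sides after the load."""
    old_ids = {r["Weakness_ID"] for r in old_records}
    new_ids = {r["u_legacy_id"] for r in new_records}
    return {
        "matched": len(old_ids & new_ids),
        "missing_in_new": sorted(old_ids - new_ids),
        "unexpected_in_new": sorted(new_ids - old_ids),
    }


legacy = [
    {"Weakness_ID": "W-1", "Weakness_Description": "Unpatched host",
     "Scheduled_Completion": "2024-09-30", "Custom_Owner": "ops"},
    {"Weakness_ID": "W-2", "Weakness_Description": "Weak TLS config",
     "Scheduled_Completion": "2024-10-15", "Custom_Owner": "net"},
]

# Simulate a load where the second record never arrived
migrated = [map_record(legacy[0])[0]]

print(map_record(legacy[0])[1])     # fields with no mapping yet
print(reconcile(legacy, migrated))  # what the new platform is missing
```

Reporting unmapped fields up front surfaces the custom-field problem before any data moves. The reconciliation report afterwards is the evidence that every record transferred.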

📝 LESSON RECAP

✓GRC migrations are the hardest and highest-paid integration projects

✓Same ETL pattern — different challenge: schema mapping and relationships

✓Preserve historical data, custom fields, and inter-record relationships

✓Reconciliation is critical — every record must transfer correctly

Lesson 38: Multi-Cloud & Hybrid Environments

Week 19 · Technical — When your integration needs to pull from AWS, Azure, and GCP simultaneously.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Why multi-cloud is the reality, not the exception

→Normalizing findings from different cloud providers into one format

→Cloud Security Posture Management (CSPM) as an aggregation layer

Most large organizations use multiple clouds: AWS for compute, Azure for M365 and identity, maybe GCP for data analytics. Your GRC platform needs a unified view across all of them. This means your integrations pull from multiple cloud security APIs and normalize everything into one format.

The Normalization Challenge

FIELD	AWS SECURITY HUB	AZURE DEFENDER	GCP SCC
Finding ID	`Id` (ARN)	`id` (resource ID)	`name` (full path)
Severity	`Severity.Label`	`properties.severity`	`severity`
Resource	`Resources[0].Id`	`properties.resourceDetails`	`resourceName`

The adapter pattern solves this. You built this architecture in Phase 3 (Lesson 24). Each cloud gets its own adapter that extracts and normalizes into your common format. The shared pipeline handles everything after that. Multi-cloud is an architecture problem, not a code problem.
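As a reminder of what that looks like in code, here is a minimal sketch with one adapter per provider, using the field paths from the table above. The common output format, the function names, and the sample findings are hypothetical, and the `id` key inside Azure's `resourceDetails` is an assumption to check against real payloads.

```python
def from_aws(finding: dict) -> dict:
    resources = finding.get("Resources") or [{}]
    return {
        "source": "aws_security_hub",
        "finding_id": finding.get("Id"),
        "severity": (finding.get("Severity") or {}).get("Label", "").upper(),
        "resource": resources[0].get("Id"),
    }


def from_azure(finding: dict) -> dict:
    props = finding.get("properties") or {}
    details = props.get("resourceDetails") or {}
    return {
        "source": "azure_defender",
        "finding_id": finding.get("id"),
        "severity": str(props.get("severity", "")).upper(),
        "resource": details.get("id"),  # Assumed key name
    }


def from_gcp(finding: dict) -> dict:
    return {
        "source": "gcp_scc",
        "finding_id": finding.get("name"),
        "severity": str(finding.get("severity", "")).upper(),
        "resource": finding.get("resourceName"),
    }


ADAPTERS = {"aws": from_aws, "azure": from_azure, "gcp": from_gcp}


def normalize(provider: str, finding: dict) -> dict:
    """Route a raw finding to its provider adapter."""
    return ADAPTERS[provider](finding)


# Hypothetical raw findings, one per provider
raw = [
    ("aws", {"Id": "arn:aws:securityhub:example", "Severity": {"Label": "CRITICAL"},
             "Resources": [{"Id": "arn:aws:s3:::bucket"}]}),
    ("azure", {"id": "/subscriptions/example/alert-1",
               "properties": {"severity": "High", "resourceDetails": {"id": "vm-01"}}}),
    ("gcp", {"name": "organizations/1/sources/2/findings/3",
             "severity": "medium", "resourceName": "//compute/instance-1"}),
]

for provider, finding in raw:
    print(normalize(provider, finding))
```

Everything downstream (validation, upsert, reconciliation) sees only the four common keys. Adding a fourth provider means writing one more adapter and registering it in `ADAPTERS`.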

📝 LESSON RECAP

✓Multi-cloud is normal — most enterprises use 2+ providers

✓Each provider has different field names and formats — normalize to one schema

✓The adapter pattern you learned in Phase 3 handles multi-cloud cleanly

Lesson 39: GRC Program Management

Week 19 · GRC — Leading a compliance automation program, not just building pipelines.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to prioritize which integrations to build first

→Building a business case for automation investment

→Stakeholder management — working with CISOs, assessors, and system owners

→Measuring and communicating the value of your integrations

Prioritization: The Impact/Effort Matrix

You can't automate everything at once. Prioritize by: highest compliance impact × lowest implementation effort.

TYPICAL PRIORITY ORDER

🥇 Vulnerability → POA&M pipeline (high impact, you already built it)

🥈 Asset inventory sync (high impact, moderate effort — CMDB + cloud APIs)

🥉 Access review automation (high audit value, moderate effort — identity APIs)

4. Log coverage dashboards (moderate impact, low effort — SIEM API)

5. Configuration compliance (moderate impact, moderate effort — STIG scanners)
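The ranking above can be reproduced with a simple impact-to-effort ratio. The 1-10 scores below are purely illustrative numbers chosen to mirror that order, not measurements; in practice you would estimate them with your stakeholders.

```python
# Illustrative 1-10 scores only; replace with your own estimates
candidates = [
    {"name": "Configuration compliance", "impact": 6, "effort": 5},
    {"name": "Access review automation", "impact": 8, "effort": 5},
    {"name": "Vulnerability to POA&M pipeline", "impact": 9, "effort": 1},
    {"name": "Log coverage dashboards", "impact": 6, "effort": 4},
    {"name": "Asset inventory sync", "impact": 9, "effort": 5},
]


def prioritize(items):
    """Sort by impact divided by effort, highest ratio first."""
    return sorted(items, key=lambda item: item["impact"] / item["effort"], reverse=True)


for rank, item in enumerate(prioritize(candidates), start=1):
    ratio = item["impact"] / item["effort"]
    print(f"{rank}. {item['name']} (score {ratio:.2f})")
```

The scores matter less than the conversation they force: writing down an impact and an effort number for each candidate makes the prioritization something you can defend to a CISO.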

Communicating Value

❌ TO A CISO: DON'T SAY

"I built a Python ETL pipeline with upsert logic using boto3 and the ServiceNow Table API"

✅ TO A CISO: SAY

"We reduced POA&M update time from 40 hours/month to 2 hours, with 100% finding coverage and daily freshness instead of quarterly"

📝 LESSON RECAP

✓Prioritize: highest compliance impact × lowest effort

✓Translate technical work into business value for stakeholders

✓Measure: hours saved, coverage percentage, data freshness, accuracy

Lesson 40: Building & Leading a GRC Practice

Week 20 · GRC — From individual contributor to team leader and architect.

🎯 WHAT YOU'RE ABOUT TO LEARN

→How to build reusable frameworks your team can extend

→Training others — creating standards and templates

→Architecture governance — ensuring quality as the team grows

→Career paths: IC track vs management track in GRC

From Pipeline Builder to Practice Leader

Eventually, you won't build every pipeline yourself. You'll design the architecture, create the standards, review the code, and train the team. Your adapter pattern, config templates, testing framework, and documentation standards become the foundation others build on.

WHAT A GRC INTEGRATION PRACTICE INCLUDES

📐 Architecture standards: Adapter pattern, config-as-code, logging format

📋 Templates: Field mapping document, runbook, README, test file structure

🔧 Shared libraries: Reusable ServiceNow client, retry logic, validation framework

✅ Quality gates: CI/CD tests, code review checklist, reconciliation requirements

📊 Metrics: Pipeline health dashboards, SLOs, coverage reports

Career Paths

🔧 IC TRACK (Individual Contributor)

Jr. Integration Engineer → Sr. Engineer → Staff/Principal Engineer → Distinguished Engineer

Focus: deeper technical skills, architecture decisions, mentoring, thought leadership

👥 MANAGEMENT TRACK

Sr. Engineer → Team Lead → Manager → Director → VP of GRC Engineering

Focus: hiring, prioritization, stakeholder management, strategy, budget

📝 LESSON RECAP

✓Create reusable frameworks: adapter pattern, templates, shared libraries

✓Quality gates (CI/CD, code review, reconciliation) ensure consistency

✓IC track = deeper expertise; management track = team leadership

Lesson 41: Senior Interview Preparation

Week 20 · GRC — The questions change when you're not junior anymore.

🎯 WHAT YOU'RE ABOUT TO LEARN

→Senior interview questions: design, architecture, trade-offs, leadership

→How to walk through a system design for a GRC integration

→Talking about failures, lessons learned, and growth

Senior Questions Are Different

Junior interviews ask: "Can you build it?" Senior interviews ask: "How would you design it? What trade-offs would you make? What would you do differently next time?"

1. "Design a multi-source GRC integration platform from scratch."

Walk through: adapter pattern, shared validation, config-driven sources, centralized logging, reconciliation per source, monitoring dashboard. Discuss trade-offs: simplicity vs. flexibility, custom vs. off-the-shelf.

2. "Tell me about a time an integration failed in production."

Structure: Situation → what broke → how you detected it → how you fixed it → what you changed to prevent recurrence. Show you learn from failures, not just survive them.

3. "How would you prioritize automating 50 controls?"

Impact/effort matrix. Start with controls that are audit-critical AND have API-accessible data sources. Quick wins build trust. Communicate progress in business terms, not technical terms.

4. "How do you ensure data quality across 10 integration sources?"

Shared validation framework, per-source reconciliation, SLOs with alerting, automated data quality reports. The answer isn't "I check it manually."

🔧 DO THIS NOW

Write a 2-minute answer for each question. Practice out loud 3 times. Time yourself. Record on your phone. Senior interviews reward structured thinking — rambling loses points.

📝 LESSON RECAP

✓Senior interviews: design, trade-offs, failures, and leadership

✓Structure design answers: requirements → architecture → trade-offs

✓Failure stories: situation → detection → fix → prevention

✓Prioritization: impact vs. effort, quick wins first, communicate in business terms

Course Complete: Your GRC Integration Journey

12 months · 50 lessons · From "What is Python?" to senior-level GRC integration architect.

🎓 YOUR COMPLETE SKILL SET

✓ Python: APIs, data transformation, validation, structured logging

✓ PowerShell: Microsoft Graph API, Azure/M365 integration

✓ ServiceNow: Table API, GRC module, attachments, business rules

✓ AWS: Security Hub, boto3, IAM least privilege

✓ Architecture: Adapter pattern, config-as-code, multi-source, multi-cloud

✓ Quality: Unit tests, integration tests, CI/CD, idempotency

✓ Operations: Scheduling, monitoring, alerting, SLOs, observability

✓ Compliance: RMF, FedRAMP, OSCAL, Zero Trust, 800-53 controls

✓ Documentation: READMEs, runbooks, field mappings, architecture diagrams

✓ Communication: Explaining technical work to CISOs, assessors, and executives

🏢 WHERE YOU ARE NOW

Twelve months ago, you installed Python for the first time. Today, you can design, build, test, deploy, monitor, and document GRC integration pipelines. You understand both the technical implementation and the compliance context it serves. You can interview for mid-to-senior GRC integration, compliance automation, or GRC engineering roles.

You didn't just learn skills — you built a career foundation.
