Thomas Allen | June 8, 2026 | 10 min read

Why Clean Penetration Tests Miss Breaches


The Needle They Almost Missed

 
How a year of clean penetration tests left a healthcare data organization one step from a breach, and what continuous autonomous testing found in 17 hours


A healthcare organization that handles regulated member data ran annual penetration tests for more than a year. Every one came back clean. The compliance boxes were checked, the program was well run, and by every available indicator the environment was secure. Then Foresite ran an internal test using the NodeZero® autonomous penetration testing platform. In 17 hours, with no human attacker and no prior knowledge of the environment, NodeZero exploited 254 attack paths, compromised 319 credentials, and accessed 739 items of protected personally identifiable information, including Social Security Numbers and Individual Taxpayer Identification Numbers.

The exposure was not new. It had been sitting in the environment, reachable from any foothold, through every prior clean test. This is the story of why point-in-time testing missed it, why a follow-up scan two weeks later came back clean again with nothing fixed, and why the only reliable answer is to test as often as possible.

254 attack paths exploited. 319 credentials compromised. 739 PII items accessed. 17 hours.

 

What is the difference between a clean test and a clean environment?

A clean penetration test means the weakness was not found inside that engagement window. It does not mean the weakness is not there.

Traditional testing, even strong manual testing, works within constraints. Testers operate inside a defined scope and time window, follow established methodologies, and bring real judgment. They also bring human limits. No tester can simultaneously evaluate every credential, every protocol, every relay opportunity, and every attack-chain combination across roughly 100 hosts in a single engagement.

NodeZero by Horizon3.ai

The weaknesses NodeZero surfaced were not recent. SMB signing had not been required for years. Shared service account passwords had not been rotated. The NTLM coercion attack surface had existed since the domain was built. What changed was the tool used to look, and specifically its ability to autonomously chain together conditions that a human tester might evaluate one at a time but never connect.

The needle had been in the haystack for more than a year. The prior tests simply had not landed on the moment it was findable.

 

The rescan that disappeared

This is the most instructive moment in the engagement. After reviewing the NodeZero findings, the organization ran a follow-up scan two weeks later without remediating anything. It came back clean. No critical findings. No compromised credentials. No path to host compromise.

Nothing had been fixed. So what changed?

NodeZero's NTLM relay chains depend on timing, active sessions, and the real-time availability of relay targets. A domain controller processing authentication at the right moment on day one was not doing so on day fifteen. The attack path closed temporarily. The underlying misconfiguration was completely unchanged.

This is exactly how real attackers behave. They do not run one scan and accept a clean result. They probe continuously, wait, and try again when conditions shift. An organization that tests once and trusts the result is betting that an attacker gives up after the first attempt.

Clean test. Same environment. Two weeks later. Nothing fixed. The path closed because the conditions were not aligned that day, not because the vulnerability was gone.

Figure: Two scans, two weeks apart, with nothing changed in between. The day 1 scan fully exploited the environment. The attack path closed on day 15 because the conditions were not aligned, not because the vulnerability was fixed.

 

What did NodeZero actually find?

NodeZero did not find one catastrophic flaw. It found three systemic weaknesses that amplified each other, the same pattern a capable attacker would identify and chain. Each is manageable alone. Together they produced full domain-level access to live PII in under 17 hours.

Figure: Five stages, from initial foothold to cracked credentials to NTLM coercion and relay to domain controller takeover to 739 PII records exposed. Three weaknesses, none critical alone, chained into full domain access and live PII in under 17 hours. This is the same path a real attacker would follow.

Root cause 1: Weak and reused credentials

NodeZero cracked 32 credentials with standard dictionary attacks, including cleartext passwords on accounts named after internal business units and partners. Three domain service accounts responsible for SQL replication, scheduled jobs, and backups shared a single password. One privileged account was cracked in under eight minutes, and that one compromise led directly to domain user compromise across three accounts, ransomware exposure on two file stores, and sensitive data access across seven assets.

  • 32 credentials cracked via dictionary attack
  • 3 domain service accounts confirmed sharing one password
  • 140 instances of local credential reuse across database infrastructure
  • 1 shared local administrator account working across unrelated hosts
  • 1 privileged account cracked in under 8 minutes
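
The reuse counts above rest on one observation: NT hashes are unsalted, so accounts with the same password have the same hash. The following is a minimal Python sketch of that grouping, assuming a hypothetical export of account names and password hashes. The account names and hash values are invented placeholders, not data from the engagement.

```python
from collections import defaultdict


def find_shared_passwords(accounts):
    """Group account names by password hash and keep hashes used by 2+ accounts."""
    by_hash = defaultdict(list)
    for name, pw_hash in accounts:
        # Normalize case so "AAAA1111" and "aaaa1111" count as the same hash.
        by_hash[pw_hash.lower()].append(name)
    return {h: sorted(names) for h, names in by_hash.items() if len(names) > 1}


# Hypothetical sample data: (account name, placeholder hash string).
sample_accounts = [
    ("svc-sqlrepl", "AAAA1111"),
    ("svc-jobs", "aaaa1111"),
    ("svc-backup", "AAAA1111"),
    ("jdoe", "BBBB2222"),
]

shared = find_shared_passwords(sample_accounts)
for pw_hash, names in shared.items():
    print(f"{len(names)} accounts share one password: {', '.join(names)}")
```

A check like this needs no password cracking, so it can flag shared service account passwords before an attacker cracks one and inherits the rest.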


Root cause 2: SMB signing not required, combined with NTLM coercion

NodeZero confirmed three NTLM coercion techniques against the domain controllers: Authenticated PetitPotam (MS-EFSRPC), DFSCoerce (MS-DFSNM), and PrinterBug (MS-RPRN). Each forced a domain controller to authenticate back to NodeZero, handing over its machine account hash. Because SMB signing was not required on 76 services across the environment, those captured hashes were relayed straight to other hosts with no password cracking needed.

Microsoft has designated several of these coercion issues as no-fix. The defense is configuration, not patching: enforce SMB signing environment-wide and apply RPC filters. Neither had been done. The attack surface had been present and exploitable through every prior clean test.

  • 76 services found with SMB signing not required
  • 3 domain controllers confirmed vulnerable to NTLM coercion
  • Machine account credentials for all three domain controllers captured via relay
  • 19 hosts compromised through relay attack alone
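
The 76-service count above is the kind of figure a simple configuration audit can reproduce. On a Windows SMB server, the relevant setting is the RequireSecuritySignature value under the LanmanServer parameters registry key. The sketch below assumes that setting has already been collected into a hypothetical inventory. The host names and the field names smb_signing_required and smb_signing_enabled are illustrative assumptions, not NodeZero output.

```python
def relay_candidates(inventory):
    """Return hosts whose SMB service does not require signing, sorted by name.

    A missing 'smb_signing_required' value is treated as not required,
    which is a deliberately conservative assumption.
    """
    return sorted(
        host
        for host, settings in inventory.items()
        if not settings.get("smb_signing_required", False)
    )


# Hypothetical inventory collected from host configuration.
sample_inventory = {
    "dc01": {"smb_signing_required": True},
    "sql01": {"smb_signing_required": False},
    # Signing enabled but not required: a relayed session can still skip signing.
    "file02": {"smb_signing_enabled": True},
}

targets = relay_candidates(sample_inventory)
print(targets)
```

The distinction the sketch encodes is between signing that is enabled and signing that is required. Only the required setting removes a host from the relay target list.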


Root cause 3: Inadequate endpoint security controls

NodeZero acquired 137 credentials through OS credential dumping, reaching the Windows Security Account Manager database and LSASS memory on compromised hosts. Endpoint detection and response was either absent or insufficiently tuned to block common credential harvesting. This matters beyond these findings: even after passwords are reset, an environment without effective EDR will have its credentials harvested again within hours of the next attacker foothold.

 

The exposure that makes this a breach scenario

NodeZero accessed 739 items of personally identifiable information, specifically Social Security Numbers and Individual Taxpayer Identification Numbers, via compromised credentials on SMB file shares. One share held more than 51,000 files that a single compromised service account could read, write, and delete.

Under HIPAA, unauthorized access to protected health information is a breach that triggers notification to affected individuals, to the Department of Health and Human Services, and in some cases to the media. NodeZero showed that access was not only possible but trivially achievable from any foothold inside the network, and that it had been achievable throughout more than a year of clean testing.

Had this been a real attacker rather than an autonomous platform, the organization likely would not have known until members reported identity theft or a ransom note appeared on the file servers. The prior clean tests would have given no warning.

 

Why does frequency change everything?

The lesson is not that earlier testing was incompetent. It is that point-in-time testing, regardless of quality, is structurally incapable of catching conditions that are transient, timing-dependent, or that require chaining a large number of simultaneous variables. The reliable answer is frequency.

The more often you run, the more conditions you sample. A weakness that is not exploitable at 9 a.m. on a Tuesday may be perfectly exploitable on a Thursday afternoon when a domain controller is processing heavy authentication load and a service account password has not rotated in eighteen months. NodeZero running weekly catches that window. An annual test almost certainly does not.

Running weekly rather than annually means sampling 52 moments instead of one. For timing-dependent chains like NTLM relay, frequency is not a luxury. It is the mechanism by which the platform works.
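
The arithmetic behind that claim can be sketched in a few lines of Python. The model is a simplifying assumption, not something measured in the engagement: each scan independently finds the attack window open with probability p, so the chance that at least one of n scans catches it is 1 - (1 - p)^n. The values of p below are hypothetical, and real scans a week apart may not be fully independent.

```python
def detection_probability(p_window_open, scans_per_year):
    """Chance that at least one scan lands while the attack window is open.

    Assumes each scan is an independent sample with the same probability.
    """
    return 1 - (1 - p_window_open) ** scans_per_year


# Hypothetical per-scan probabilities that conditions are aligned.
for p in (0.05, 0.10, 0.25):
    annual = detection_probability(p, 1)
    weekly = detection_probability(p, 52)
    print(f"p={p:.2f}: annual {annual:.1%}, weekly {weekly:.1%}")
```

Under these assumptions, even a window that is open only a small fraction of the time is very likely to be sampled at least once by weekly testing, while a single annual test catches it only with probability p.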

Frequency also enables something traditional testing cannot: immediate remediation verification. When the team applies a fix, they rerun NodeZero that day and confirm the path is closed. The feedback loop tightens from months to hours.

Figure: An annual test taking one sample compared with weekly testing taking 52. Timing-dependent attack windows open and close. An annual test samples one moment and misses them. Weekly testing samples 52 and catches them.

 

Compliance is not the same as security

Annual penetration testing satisfies the HIPAA Security Rule requirement for technical evaluation. It produces a report, a finding list, and a remediation plan, all of which satisfy an auditor. None of it prevented this organization from carrying exploitable access to 739 PII items through more than a year of compliant testing.

Compliance and security are not the same thing. This engagement is the proof. Continuous autonomous validation is what closes the gap between them.

 

Where the vCISO program fits

Foresite serves clients as a virtual CISO, providing strategic leadership, program oversight, policy development, and compliance guidance. That is exactly the scope a vCISO engagement is built to cover, and the program here was well run by any reasonable standard. The exposure existed anyway.

The point is not that vCISO programs fall short. It is that no program, however well designed, can compensate for testing that samples conditions once or twice a year. The vCISO provides strategy, governance, and program architecture. Continuous autonomous validation provides the ground truth about whether that program is actually working at any given moment.

Together the picture changes. The vCISO gains real-time evidence of exploitable conditions rather than periodic snapshots. Remediation guidance rests on proven attack paths rather than theoretical risk scores. And when fixes are applied, they are verified the same day rather than at the next annual window.

 

Recommended remediation path

Foresite built a prioritized roadmap that addresses root causes in order of leverage. Each fix reduces the blast radius of the rest.

Immediate:

  • Rotate every compromised credential identified in the findings
  • Enable and require SMB signing via Group Policy across all domain-joined systems
  • Apply RPC filters for MS-EFSRPC and MS-DFSNM on domain controllers
  • Disable the Print Spooler service on all non-print servers

Short term, within 30 days:

  • Migrate service accounts to Group Managed Service Accounts to eliminate shared and manually managed passwords
  • Deploy LAPS for all local administrator accounts
  • Enforce a 12-character minimum password length via domain policy
  • Verify EDR deployment and enable credential-harvesting prevention on every host

Ongoing:

  • Enable Extended Protection for Authentication on AD CS
  • Disable NTLM where possible and audit NTLM usage
  • Run NodeZero on a continuous schedule, weekly at minimum, to catch configuration drift and verify closure
  • Establish a formal fix-rescan-document workflow so every remediation is verified before it is closed
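
The last item can be sketched as a small closure rule. The Python below is a minimal illustration, not Foresite's or NodeZero's actual workflow: the Finding class, its method names, and the dates are all hypothetical. The rule it encodes comes from the rescan episode earlier in this case study: a clean scan alone is not evidence of a fix, so a finding closes only when a fix is recorded and a clean rescan follows it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Finding:
    """Hypothetical record for one finding moving through fix, rescan, document."""
    name: str
    fixed_on: Optional[date] = None
    clean_rescans: List[date] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def record_fix(self, when, note):
        self.fixed_on = when
        self.notes.append(f"{when}: fix applied: {note}")

    def record_rescan(self, when, clean):
        if clean:
            self.clean_rescans.append(when)
        result = "clean" if clean else "still exploitable"
        self.notes.append(f"{when}: rescan {result}")

    def can_close(self):
        """Closable only if a fix is recorded and a clean rescan is dated on or after it."""
        if self.fixed_on is None:
            return False
        return any(d >= self.fixed_on for d in self.clean_rescans)


finding = Finding("SMB signing not required")
finding.record_rescan(date(2026, 1, 15), clean=True)  # clean scan, nothing fixed
print(finding.can_close())  # False

finding.record_fix(date(2026, 1, 20), "SMB signing required via Group Policy")
finding.record_rescan(date(2026, 1, 20), clean=True)  # same-day verification
print(finding.can_close())  # True
```

The notes list is the "document" step: every fix and rescan leaves a dated entry, so the record shows why a finding was closed rather than only that it was.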


The Foresite and NodeZero difference

Foresite delivers NodeZero-powered autonomous penetration testing as part of its vCISO and continuous security validation services. The combination provides what neither delivers alone: strategic security leadership grounded in real-time, proven evidence of what is actually exploitable right now.

The value is not the report. The value is the loop. Find, fix, verify, repeat, running continuously while the vCISO program focuses on strategy, governance, and the bigger picture.

  • Autonomous internal and external testing powered by NodeZero
  • Proof-of-exploit documentation for every finding, not theoretical risk scores
  • Immediate rescan after remediation, no scheduling and no waiting
  • Continuous coverage that catches what annual tests miss
  • vCISO integration, so strategy and ground truth work together
  • Aligned to HIPAA, SOC 2, ISO 27001, and NIST CSF validation requirements

 

To see what continuous security validation finds in your environment, contact us at soc@foresite.com or visit foresite.com/solutions/autonomous-penetration-testing.


 

Frequently asked questions

 

Can a penetration test pass and still leave an environment exploitable?

Yes. A clean test means the weakness was not found inside that engagement window. Timing-dependent conditions like NTLM relay can be exploitable one day and not the next, with nothing changed in the environment.

Why did a follow-up scan come back clean with nothing fixed?

NTLM relay chains depend on active sessions and the real-time availability of relay targets. When those conditions were not aligned, the path closed temporarily. The underlying misconfiguration was unchanged. A real attacker simply comes back when conditions shift.

How often should penetration testing run?

As often as possible. Weekly autonomous testing samples 52 different sets of conditions a year instead of one, which is what catches transient, timing-dependent attack windows. Annual testing satisfies most compliance requirements but cannot catch them reliably.

Is compliance enough for security?

No. Annual testing can satisfy the HIPAA Security Rule and still miss exploitable access to protected data, as it did here. Compliance and security are related but not the same. Continuous autonomous validation closes the gap.

What is autonomous penetration testing?

It is testing performed by a platform such as NodeZero that independently discovers, chains, and safely exploits weaknesses across an environment, with proof of exploit for every finding and no prior knowledge required.

Thomas Allen
Thomas Allen is Chief Information Security Officer – C|CISO, CRISC, CCSP, ISO 27001 LA, CISA, CISSP, HCISPP, GCCC, GCFA at Foresite Cybersecurity, a Google Cloud Premier Partner and the 2026 Google Cloud Security Partner of the Year.
