ADR-011: End-to-End Validation and Troubleshooting
- Status: accepted
- Date: 2026-03-31
- Deciders: Architecture Team
Context and Problem Statement
The skill catalog covers creating content (Showroom), scaffolding deployments (Field-Sourced Content), and provisioning infrastructure (AgnosticD). However, after a workshop environment is deployed there is no formalized strategy for verifying it actually works from the student’s perspective, or for systematically diagnosing failures.
The RHDP Skills Marketplace provides complementary validation tools that each check a slice of the stack:
| Marketplace Tool | What It Checks | Layer |
|---|---|---|
/showroom:verify-content | AsciiDoc quality, Red Hat style, structure | Content quality (pre-deployment) |
/health:deployment-validator | Pods running, routes accessible, operators installed | Infrastructure health (post-deployment) |
/agnosticv:validator | AgnosticV catalog file correctness | Catalog configuration |
/ftl:rhdp-lab-validator | Generates Solve/Validate button playbooks for ZT grading | Lab grading automation |
None of these answer the question: “Given a deployed environment and OpenShift credentials, can students actually start and complete this workshop right now?”
That requires checking the full student experience end-to-end: authentication, lab guide accessibility, terminal functionality, prerequisite resources, RBAC, and multi-user readiness. It also requires support for multiple environment types beyond OpenShift – RHEL VM labs, AAP environments, and hybrid deployments.
Additionally, when deployments fail, the AI assistant has no structured diagnostic steps. Each skill documents how to use its tool but not how to systematically troubleshoot failures across the integrated stack.
Decision Drivers
- Workshop developers need a single “go / no-go” check before handing environments to students
- The validation must work across all AgnosticD environment types (OCP shared, OCP dedicated, RHEL VM, AAP, hybrid)
- Existing marketplace tools cover slices but not the student-perspective end-to-end check
- Troubleshooting guidance should be embedded where the AI encounters the failure (in deployment skills)
- The solution should complement, not duplicate, RHDP Skills Marketplace tools
Considered Options
- Reference marketplace tools only – Point users to
/health:deployment-validatorand/showroom:verify-contentwithout building anything new - Embed validation checklists in existing skills – Add readiness checks to AgnosticD and Showroom SKILL.md files
- Create a dedicated student-readiness skill – A new process-oriented skill that runs a structured checklist against a live deployment, plus troubleshooting decision trees in deployment skills
- Build a shell-based validation script – A
validate-env.shscript that runs checks programmatically
Decision Outcome
Chosen option: “Dedicated student-readiness skill with troubleshooting decision trees” (option 3), because:
- A separate skill keeps the readiness checklist focused and environment-type agnostic
- Troubleshooting trees embedded in AgnosticD and Showroom skills give the AI diagnostic steps at the point of failure
- The marketplace tools remain the canonical validation layer for their respective domains – our skill fills the gap they don’t cover
- A feature request for rhpds/rhdp-skills-marketplace has been drafted (see
docs/marketplace-feature-request.md) to propose this concept upstream
Workshop Lifecycle
The full lifecycle now has five phases:
Create Deploy Validate Test Troubleshoot
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Showroom │ │AgnosticD │ │ Student │ │ Workshop │ │ Decision │
│ content │──────►│ + Field │──────►│Readiness │──────►│ Tester │──────►│ trees │
│ authoring│ │ Content │ │ Skill │ │ Skill │ │ in SKILL │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
verify-content deployment- student- workshop- agnosticd/
(marketplace) validator readiness tester showroom
(marketplace) (this project) (this project) troubleshoot
The workshop-tester skill (see ADR-012) extends the validation lifecycle by executing actual module exercises against the live environment. While student-readiness checks “is the env ready?”, workshop-tester answers “do the exercises actually work?” — classifying failures as Instruction Fix, Infra / Deployment Fix, or Rethink.
Student Readiness Skill
A new skill (student-readiness) that teaches the AI to verify a deployed environment is ready for students. Key characteristics:
- Process-oriented: No upstream repo – this skill defines a diagnostic process, not a tool wrapper
- Environment-type agnostic: Adapts its checklist based on what was deployed (OCP, RHEL VM, AAP, hybrid)
- Complements the marketplace: References marketplace tools for deeper validation when checks fail
The readiness checklist covers:
- Cluster/host access (oc login or SSH)
- Showroom lab guide accessibility (if deployed)
- Terminal functionality (if deployed)
- Operator readiness (OCP environments)
- Namespace and RBAC correctness (OCP environments)
- Workload resource availability
- Content-environment attribute match
- AAP controller readiness (if applicable)
- Multi-user isolation (if applicable)
Troubleshooting Decision Trees
Each deployment skill (AgnosticD, Showroom) gets an embedded troubleshooting section with a structured decision tree. The AI follows these trees when a user reports a failure, working through diagnostic steps before escalating to marketplace tools.
Positive Consequences
- Workshop developers get a single command (“is my environment ready?”) instead of checking multiple tools
- The AI can diagnose failures systematically instead of guessing
- All AgnosticD environment types are covered, not just OpenShift + Showroom
- Clear escalation path: skill troubleshooting -> student readiness -> marketplace tools
Negative Consequences
- A new skill to maintain that has no upstream source repo
- The readiness checklist must be updated as new environment types or deployment patterns emerge
- Overlap potential if the RHDP Skills Marketplace adds a similar capability in the future
Links
- RHDP Skills Marketplace – complementary validation tools
- /health:deployment-validator – infrastructure health checks
- /showroom:verify-content – content quality validation
- /ftl:rhdp-lab-validator – lab grading automation
- Related: ADR-010 (cross-skill dependencies), ADR-008 (skill updates), ADR-012 (workshop module testing)