The open-source accelerator for platform engineering and Agentic AI at enterprise scale.
Backstage on Azure, six layers of governance, the Horizons Phases, and a working Agent IDP in 90 days.
Author · Paula Silva
Role · Software Global Black Belt
Duration · 60 to 90 minutes
Date · 2026-05-22
Agenda
Four acts. Thirteen parts. Two hours end to end.
ACT 1
I-IV · 25 min
Diagnostic
I · The problem
II · What Open Horizons is
III · Platform engineering 101
IV · The rise of Agentic DevOps
ACT 2
V-VI · 30 min
The stack
V · MS Agentic DevOps stack
VI · Day in the life
VII · Live tour, ohorizons.ai
VI · Six-layer Context Platform
ACT 3
VII-X · 35 min
Adoption + ops
VII · Horizons Phases, H1/H2/H3
VIII · GitHub + ADO integration
IX · Golden Paths + Agent Catalog
X · Security + governance
ACT 4
XI-XIII · 30 min
Business + action
XI · Business value + ROI
XII · Getting started, 5 steps
XIII · Partner ecosystem
→ Close + Discovery CTA
Short on time? Watch Act 2 alone, V to VI. That covers the demo and the architecture. Then jump to Act 4 for Discovery sign-off.
Who is speaking
Paula Silva, Software Global Black Belt.
Building the future of software development with AI and Agentic DevOps.
I work with enterprise customers across the Americas on Agentic AI, platform engineering, and software modernization. This deck distills the patterns I see repeated across dozens of programs: pilots that work in a notebook and stall on a production cluster. Open Horizons is the accelerator that closes that gap by construction, on Day 1, not on Day 180.
PART
I
The problem.
Why most enterprise AI pilots stall before production, and why more tools will not fix it.
The agent cemetery
95%
is the GenAI pilot failure rate measured by MIT NANDA in 2025. Failures concentrated in context, integration, and governance gaps. Not model quality.
Source: MIT NANDA, The GenAI Divide, State of AI in Business 2025.
The cancellation curve
40%
of Agentic AI projects will be cancelled by end of 2027, according to Gartner. Cost overruns, unclear business value, inadequate risk controls.
Source: Gartner press release, June 2025. Same period when Gartner also forecast 40 percent of enterprise apps will feature task-specific AI agents by 2026, up from under 5 percent in 2025.
The inverted perception problem
Developers estimated AI tools would speed them up by 20 percent. An RCT measured they were 19 percent slower.
Estimated speedup
+20%
developer self-report of perceived productivity gain from AI tools
Inversion
39 pp
vs measured outcome
Measured speedup
-19%
actual change in throughput, 16 experienced open-source developers
Source: METR, RCT on AI-augmented development, 2025. arXiv:2507.09089. When the tool makes you feel capable, you cannot detect it has made you less effective. Trust measured trajectories, not satisfaction surveys.
The Triple Debt
Three forms of debt accumulate in AI-native development. Each is invisible until it is expensive.
Familiar
Technical debt
CMU 2026, AI agents drive +18 percent static-analysis warnings and +39 percent cognitive complexity. Liu et al. 2026, AI code adds significantly more requirement and test debt across 304,362 commits.
New
Cognitive debt
Storey 2026 names it cognitive surrender. Anthropic Fellows 2026, AI use during learning reduces library-specific skill acquisition by 17 percent. The codebase becomes orphaned knowledge.
Worst
Intent debt
Objectives, constraints, and decision rationale never captured. The Klarna 2025 case, comprehensive context, no codified intent. The agent optimizes for the wrong metric.
Open Horizons exists to eliminate the Triple Debt by construction. SDD anchors intent, Backstage anchors knowledge, scope guards anchor scope.
The four costs of the status quo
Without a platform, every team pays four taxes.
Cognitive
Six jobs in one role
Kubernetes, Terraform, Actions, observability, security scanners, and now agent frameworks. Burnout, shallow expertise, slow delivery.
Inconsistency
N teams, N paved roads
N pipelines, N base images, N security postures. Zero leverage when a CVE drops or compliance changes.
Compliance
Annual fire drill
SOC 2, ISO 27001, HIPAA, PCI-DSS enforced manually, retrospectively, and incompletely. Audits become projects, not checkpoints.
AI adoption
Shadow agents
Devs pasting code into ChatGPT, unsanctioned prototypes touching prod, no per-agent cost, no governance for which models, which tools, which data.
The wrong reflex
The instinct is to buy another tool. That makes the problem worse.
The reflex · stack another tool
CI/CD vN+1
New IaC
AI gateway
Obs vendor
Sec scanner
Agent SaaS
MCP hub
FinOps SaaS
+ N more
Every purchase adds surface area, integrations to maintain, and one more decision the developer has to make before writing code. The cognitive tax compounds.
platform shift
The answer · one platform, opinions baked in
Paved roads, Golden Paths replace per-team improvisation.
Opinions encoded, no rebuilding the wheel on each repo.
Governance baked in, security, FinOps, audit by construction.
A platform multiplies leverage. That is what Open Horizons delivers.
The solution is not another vendor. The solution is a platform with opinions, paved roads, and governance baked in. Tools serve the platform; the platform serves developers and agents.
What good looks like
An enterprise where the platform multiplies leverage, not headcount.
01 · A new service is scaffolded, deployed, monitored, and compliant in under 30 minutes.
02 · Every team uses the same paved road, but can step off it explicitly when needed.
03 · AI agents are first-class citizens with identity, RBAC, cost ceilings, and trajectory replay.
04 · Audits are continuous, not annual. Every change is traceable to a spec, a PR, and a run log.
05 · The platform team ships a product. Developers and agents are the customers.
The road from here
The problem is named. The path forward is mapped. Here is how the next 110 slides take you from the cemetery to a working platform.
Act II → V · Concept
What Open Horizons is, why platform engineering, what Agentic DevOps means, how Microsoft ships the stack
VI + VII · Demo
Day-in-the-life of a developer and an agent. Live tour through ohorizons.ai screens you can open today
VI → X · Mechanics
The six-layer architecture, Horizons Phases, GitHub + ADO integration, Golden Paths, Agent Catalog, security and compliance
XI → XIII · Action
Business value frame, the five-step engagement model, partners, references, and how to start a Discovery
If short on time, jump to the live tour at VII (10 slides), then the Horizons Phases at IX (6 slides), then Getting Started at XIV (6 slides). That is the 22-slide path.
The body of work behind this deck
Three artifacts, one continuum. The deck teaches the model. The playbook documents the patterns. Open Horizons ships them.
01 · Deck
Context Platform Stack
The diagnostic deck. Four questions, four layers, the cemetery numbers, the cost of failure. 50 slides, executive audience. The conceptual model in compressed form.
Audience: CTO, CIO, CFO, board
02 · Playbook
Open Horizons Playbook
25 chapters. Part I tells the story. Part II is the receipts: peer-reviewed research, the CNCF crosswalk, every layer deep, every claim cited. The reference architecture as a published guide.
03 · Accelerator
Open Horizons
The deck, materialized. A working Backstage + Azure + Foundry deployment in your tenant in under 3 hours. 22 Golden Paths, 17 agents, 15 MCP servers, all governed and observable on Day 1.
Audience: every team in the org · ohorizons.ai
This deck pulls from all three. Numbers and diagnostics come from the deck, mechanics come from the playbook, screenshots and live demos come from the accelerator.
The four questions
Four questions most enterprises fail to answer with precision. Open Horizons answers each one in code, not in slides.
The Context Platform Stack started as four layers. Production added two more: Integration and Harness.
2025 · Original 4-layer model
L4 Intent · what should agents optimize for
L3 Context · what can agents know
L2 Platform · what can agents access
L1 Cloud · where do agents run
prod
2026 · OH production 6-layer model
L6 Harness · wraps every model call · NEW
L5 Integration · GitHub + ADO + Argo + MCP · NEW
L4 Intent · what should agents optimize for
L3 Context · what can agents know
L2 Platform · what can agents access
L1 Cloud · where do agents run
Five layers collapses Integration into Platform and Harness into Context. Both collapses produced unmaintainable bundles in real deployments. The two extra layers are what survived production contact.
Defending the model
Two recurring pushbacks. Both have crisp answers from the field.
Pushback 1 · "Three is enough"
Merging context and intent creates drift.
If you merge "what the agent knows" with "what the agent wants," both become ambiguous. You see it when the same skill change alters expected behavior and nobody can tell if it is a bug or a feature. Separating context (facts) from intent (values) makes drift detectable. That is why L3 and L4 are different layers in Open Horizons, with different artifacts, different review processes, different change cadences.
Pushback 2 · "Six is too many"
Integration and Harness have distinct owners.
Integration (L5) is owned by the platform integrators handling GitHub + ADO + Argo + MCP coexistence. Harness (L6) is owned by SRE + FinOps + Security teams. If you collapse them into Platform or Context you put the wrong owner on a contract. Separation maps cleanly to who is on-call when each layer fails. Production survives the audit only when ownership is unambiguous.
The 4-layer model is the teaching frame. The 6-layer model is the operating frame. Both are correct. They serve different rooms.
PART
II
What Open Horizons is.
An accelerator, not a SaaS product. A platform, not a stack of tools. Two personas, one portal.
In one sentence
Open Horizons is an open-source Agentic DevOps Platform. An Azure-native accelerator that gives enterprises a production-grade Internal Developer Platform and an AI Agent Platform in one coherent stack.
Delivered by Microsoft and the certified partner network. Deployed in your tenant, your subscription, your data. No SaaS, no lock-in, no per-seat pricing.
What it is not
Four things Open Horizons is deliberately not.
Not SaaS
Your tenant
Open Horizons runs in your Azure subscription, on your AKS cluster, with your identity provider. No data leaves your tenant.
Not a fork
Upstream Backstage
Open Horizons consumes upstream Backstage and contributes back. You stay on the community release train, you keep the ecosystem.
Not lock-in
Open standards
Every layer is open source or open standard. AKS, Terraform, Argo CD, OpenTelemetry, MCP. You can take the code with you.
Not templates
Working infra
The accelerator includes live infrastructure, a working agent runtime, observability, Golden Paths, agents, skills, prompts, policies, runbooks.
Two personas, one portal
Both personas share identity, catalog, RBAC, and observability. An agent is just another component that needs to be managed, secured, and measured.
Developer IDP
For app engineers
Self-service scaffolding via Golden Paths. One-click environments. TechDocs per service. Integrated CI/CD, secrets, observability, cost. A catalog of components, APIs, resources, teams.
Agent IDP
For AI engineers
Catalog of agents with identity, ownership, governance. Trajectory logs, replayable and auditable. Per-agent cost dashboards. Skill, prompt, instruction registry with PR-based approvals.
The core components
Twelve concerns. Four families. One coherent platform.
Portal
Backstage OSS
Single pane of glass for devs and agents.
Catalog
Software Catalog
Source of truth, services + APIs + agents.
Scaffolder
Ohorizons/ohorizons-golden-paths
22 templates, sibling repo, versioned apart.
GitOps
Argo CD
Declarative deploys, no kubectl from laptops.
Runtime
Azure Kubernetes Service
Private API, autoscaler, Workload Identity.
IaC
Terraform, 18 modules
Tags mandatory, reproducible Azure infra.
Observability
Prometheus + Grafana + Loki
7 per-layer dashboards out of the box.
Identity
Entra + Workload Identity
Zero secrets in pods, per-agent identity.
Secrets
Azure Key Vault + CSI
Private endpoint, secrets as files.
AI Runtime
Microsoft AI Foundry
gpt-5-1, gpt-5-4-pro, Agent Framework.
Agent layer
.github/ · H1+H2+H3 integrators
19 agents that ship + integrate the Horizons.
Policy
OPA + Gatekeeper + Trivy + tfsec
Continuous compliance, admission control.
Developer experience
Infrastructure
AI runtime + agents
Governance
Anatomy of the accelerator
Two repositories under github.com/Ohorizons. The agents in .github/ are what makes this an accelerator, not a template.
The agents in .github/ are the implementer + integrator team that wires H1, H2, H3 together. They provision infrastructure, register Golden Paths from the sibling repo, onboard agents, configure dashboards, and verify each phase. Without these agents, this would be a template. With them, it is an accelerator.
PART
III
Platform engineering 101.
The discipline of building internal products for internal developers. Not DevOps rebranded.
Definitions that matter
Four terms the executive team must agree on before signing a platform charter.
01 · Platform engineering. The discipline of designing and building toolchains and workflows that enable self-service for software engineering in the cloud-native era.
02 · Internal Developer Platform, IDP. The product that platform engineers build. It is to developers what Azure plus GitHub plus Backstage is to an enterprise: one paved road that abstracts complexity.
03 · Golden Path. A pre-defined, opinionated way to do a common task. How a platform expresses "here is how we do it here."
04 · DevEx. The measurable quality of being a developer in your org. DORA, SPACE, time-to-first-PR. What we instrument.
The five pillars of a modern IDP
Every IDP, including Open Horizons, must deliver these.
01
Service Catalog
Single source of truth for what exists. In OH, Backstage Software Catalog.
02
Software Templates
Scaffolding new things in minutes. Golden Paths organized by H1, H2, H3.
03
TechDocs
Docs as code next to the code, published via the portal. No more wiki rot.
04
Observability
Dashboards, alerts, logs, traces in the developer context, not a separate tool.
05
Governance
Policy as code, RBAC, secrets, supply chain by default, not as an afterthought.
The product mindset
The single biggest mistake is treating platform as an infra project. It is a product. Developers and agents are both customers.
Infrastructure mindset
"We deployed Kubernetes."
"The cluster is up."
"Devs should read the docs."
Annual roadmap.
Success equals uptime.
Product mindset
"12 teams onboarded, 8 deploying daily. 14 agents in catalog, 9 running production traffic."
"Time-to-first-PR is 3 days. Time-to-first-agent-trajectory is 90 minutes."
"We watched 5 devs and 3 agent authors onboard, here is where they got stuck."
Monthly product reviews with engineering and agent-author customers.
Success equals adoption + DevEx + AgentX metrics.
DORA + SPACE, the metrics that matter
You cannot improve what you do not measure. Two frameworks, shipped out of the box.
DORA · the four keys
DELIVERY
01 · Velocity
Deployment frequency
Elite: multiple times per day.
02 · Velocity
Lead time for changes
Elite: under one day commit to prod.
03 · Stability
Change failure rate
Elite: under 15 percent.
04 · Stability
Mean time to recovery
Elite: under one hour.
OH dashboards: Backstage DORA Four Keys plugin · Grafana L2 board · alert rules on the four keys.
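The four keys above reduce to simple arithmetic over deployment records. The sketch below is a minimal illustration, not the plugin's implementation: the `Deployment` record, its field names, and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record shape; the real data source is not specified in the deck."""
    committed_at: datetime                  # first commit of the change
    deployed_at: datetime                   # change reached production
    failed: bool = False                    # deployment caused a production failure
    restored_at: Optional[datetime] = None  # service restored, if it failed


def four_keys(deployments, window_days):
    """Compute the four DORA keys over a fixed observation window."""
    if not deployments:
        raise ValueError("no deployments in the window")
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Sample data: four deployments in a two-day window, one of which failed.
t0 = datetime(2026, 5, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=6), failed=True,
               restored_at=t0 + timedelta(hours=6, minutes=30)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
keys = four_keys(sample, window_days=2)
```

Median lead time is used here rather than the mean so that one slow change does not dominate the number; that is a design choice of the sketch, not something the deck prescribes.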
SPACE · the human side
DEVEX
S
Satisfaction and well-being. Burnout signal, sentiment, retention.
P
Performance. Quality of output, defect escape rate.
A
Activity. Volume of work, PRs, reviews, commits.
C
Communication and collaboration. Cross-team flow.
E
Efficiency and flow. Focus time, context switches, time-to-merge.
DORA measures the system. SPACE measures the people running it. Both surface in the same Grafana board, both gate platform releases.
Anti-patterns to avoid
Five mistakes that kill platforms. All preventable.
01 · If we build it, they will come. They will not. Adoption is a product motion. Sell internally, train, iterate.
02 · One Golden Path to rule them all. Different stacks need different paths. Open Horizons ships multiple per Horizon.
03 · Security is someone else's problem. Bolt it on later and you never catch up. Defaults are the policy.
04 · Two SREs in a corner. Platforms need PMs, designers, engineers. Ratio target, 1 per 15 to 25 app engineers.
05 · We will add AI later. Retrofit agent governance is harder than building it in. Treat agents first-class on Day 1.
PART
IV
The rise of Agentic DevOps.
From code completion to chat to production agents. The enterprise governance problem just exploded.
A three-year arc
From autocomplete to autonomy in three release cycles.
2022
Code completion
GitHub Copilot makes autocomplete intelligent. Humans still drive every step.
2023, 2024
Chat assistants
GitHub Copilot Chat, ChatGPT enter the IDE. Each interaction one-shot. No memory, no tools, no autonomy.
2025, 2026
Agentic systems
Agents plan, call tools, persist memory, run for minutes or hours, and ship work. The governance problem just exploded.
What makes a system agentic
Four properties of a minimum-viable agent.
01
Goal-directed
It decomposes a request into a plan and reasons over it.
02
Tool use
It can call APIs, search code, read files, invoke deployments.
03
Memory
It remembers across turns, sessions, runs. The context platform.
04
Governed autonomy
It can act, but within identity, RBAC, and policy constraints.
Why "just use GitHub Copilot" is not enough
GitHub Copilot ships the primitives. Open Horizons ships the enterprise control plane that unifies them.
Enterprise need | GitHub Copilot 2026 | What GitHub Copilot ships · gap | OH Agent IDP
Per-team cost visibility | Partial | Team-level metrics API + Cost Centers (May 2026). Min 5 active users/day, no hierarchical chargeback. [1] | Yes
Trajectory logging and replay | Partial | Agent-Logs-Url trailer + OTel export (Mar 2026). Coding-agent only, not unified across IDE/CLI surfaces. [2] | Yes
RBAC: which agent touches which data | Partial | Content Exclusion + Agent Control Plane GA Feb 2026 + MCP allowlist. Not enforced on Cloud Agent, CLI, Agent Mode yet. [3] | Yes
Approved skill, prompt, instruction registry | Partial | Org custom instructions GA (Apr 2026) + BYOR MCP + custom agents repo. Federation of files, no single registry UI. [4] | Yes
Compliance-grade audit logs | Partial | SOC 2 Type 1 + ISO 27001 in scope. 180-day retention only; long-term needs Splunk/Event Hubs streaming. [5] | Yes
Custom long-running business agents | Limited | Custom Agents (Markdown profiles) bounded to dev workflows. Business-domain agents need Copilot Studio / Foundry. [6] | Yes
Integration: catalog, observability, policy | Limited | OpenTelemetry native. Backstage plugin is community-maintained. No first-party OPA integration. [7] |
GitHub Copilot ships the primitives. Open Horizons unifies them with your catalog, observability stack, policy engine, and existing IDP.
The four pillars of Agentic DevOps
Open Horizons is built around these four. Every customer inherits them.
Pillar 1
Identity for agents
Every agent has a service principal in Entra ID, a catalog entry, scoped RBAC, a cost center tag. "What is this agent allowed to do?" has an audited answer.
Pillar 2
Context engineering
The six-layer Context Platform Stack. MCP servers, three-tier memory, RAG, prompt cache, scope guards. Grounded in 25+ peer-reviewed papers.
Pillar 3
Trajectories and cost
Every agent run produces a structured, replayable, evaluable trajectory. Cost per agent, per team, per task. The black box recorder.
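To make "cost per agent, per team, per task" concrete, here is a minimal sketch of a trajectory record and a roll-up over it. The class names, fields, and tool names (`argo.sync`, `model.call`) are hypothetical; the deck does not specify the actual trajectory schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    tool: str        # tool or model invoked at this step (hypothetical names below)
    tokens: int
    cost_usd: float


@dataclass
class Trajectory:
    agent: str
    team: str
    task: str
    steps: List[Step] = field(default_factory=list)

    def record(self, tool, tokens, cost_usd):
        """Append one step; the ordered list is what makes the run replayable."""
        self.steps.append(Step(tool, tokens, cost_usd))

    @property
    def cost_usd(self):
        return sum(step.cost_usd for step in self.steps)

    def replay(self):
        """Return the tools in the order they were called."""
        return [step.tool for step in self.steps]


def cost_by(trajectories, key):
    """Roll up cost by 'agent', 'team', or 'task'."""
    totals = defaultdict(float)
    for trajectory in trajectories:
        totals[getattr(trajectory, key)] += trajectory.cost_usd
    return dict(totals)


run1 = Trajectory("@deploy", "team-platform", "T-001")
run1.record("argo.sync", tokens=1200, cost_usd=0.25)
run1.record("model.call", tokens=800, cost_usd=0.5)
run2 = Trajectory("@reviewer", "team-platform", "T-001")
run2.record("model.call", tokens=400, cost_usd=0.125)
runs = [run1, run2]
```

Calling `cost_by(runs, "agent")` splits spend between `@deploy` and `@reviewer`, while `cost_by(runs, "team")` folds both into `team-platform`.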
Working Backstage + Azure + Foundry in your tenant under 3 hours. 22 Golden Paths, 19 agents, 12 MCP servers, governed and observable on Day 1.
Next part unpacks the middle box. We name every component, show how it fits, and end with the loops that make GitHub + Azure better together than either alone.
PART
V
The Microsoft Agentic DevOps stack.
GitHub, Azure, AI Foundry. The three pieces Open Horizons orchestrates into one platform.
Evolving DevOps
Twenty years, three definitions. Each one adds, none removes.
DevOps
"Union of people, process, and technology to enable continuous delivery of value to end users."
The original. Broke down silos between Dev and Ops. Made deploys daily, not quarterly.
DevSecOps
"Union of people, process, and technology with security as a shared responsibility to enable continuous delivery of value to end users."
Security shifted left. Scans, policies, identity baked into the pipeline.
Agentic DevOps
AI-powered agents operating as members of your dev and ops teams, automating, optimizing, and accelerating every stage of the software lifecycle.
Agents shifted in. Code, review, deploy, operate alongside humans, governed.
Autonomous and semi-autonomous agents work alongside developers and operators across every stage of the software lifecycle.
Agents solve routine and complex tasks together, bringing apps to market faster, increasing code quality and security, removing repetitive development work, reducing technical debt, and reframing the economics of operating, maintaining, and modernizing apps in production.
Through Agentic DevOps, developers orchestrate a series of agentic services with the freedom to focus on higher-value creative work, while operators proactively identify, mitigate, and resolve issues in production.
Developer
Orchestrates agents. Owns intent.
Agent
Executes routine + complex tasks.
Operator
Identifies, mitigates, resolves.
Microsoft + GitHub, the platform for AI innovation
One platform across the full software lifecycle. Policy and governance wrap every stage.
Source: Microsoft. Open Horizons consumes this stack and wires it into Backstage. Every named asset above appears in the catalog with an owner.
Agentic DevOps for Azure and GitHub
Three workflows. Code, Collaborate, Operate. Each one is the agent + the human working the same surface.
GitHub and Azure work better together
Six concrete loops where GitHub + Azure remove a class of integration work.
GitHub Copilot Modernization
Build apps
GitHub Copilot App Modernization builds applications and refactors legacy code into AKS and App Services.
Foundry Apps
Manage AI services
Customers adopting AI and Agents build AI Apps and use Azure AI Foundry to manage their AI services in one place.
PGSQL preference
AI-app data layer
Developers building AI Apps prefer PostgreSQL with pgvector, leading to use of Azure Database for PostgreSQL.
Dev with GitHub Copilot
Code to prod
Developers using AI and Agents with GitHub Copilot write the code that ships to production and integrates across Azure services.
GHAS + Defender
Secure the loop
GitHub Advanced Security integrates with Microsoft Defender for Cloud to identify vulnerabilities and remediate with agents through GitHub Copilot.
SRE Agent
Incident response
SRE Agent automates incident response and monitoring, creating GitHub Issues and collaborating with GitHub Copilot's coding agent to fix problems.
Azure AI Foundry
App platform for a multi-model world. The model + agent + observability surface Open Horizons consumes.
Open Horizons consumes Foundry Models for inference, Foundry Agent Service for orchestration, and Foundry Observability as the data source for the L6 harness telemetry.
Two protocols, two scopes
MCP ♥ A2A
MCP is about Agent → Tool interactions
A2A is about Agent → Agent interactions
MCP, the Model Context Protocol from Anthropic, is the standard for how an agent calls a tool or retrieves context. A2A v1.0 is the standard for how one agent hands off to another, propagating state and trace context. Open Horizons ships 12 MCP servers and uses A2A v1.0 in the L6 harness so multi-agent workflows are observable end to end.
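For the Agent → Tool side, MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and arguments. The sketch below only builds that request body; the tool name `catalog_lookup` and its arguments are hypothetical, and transport, session setup, and the A2A hand-off are left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def mcp_tool_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical tool exposed by one of the platform's MCP servers.
request = json.loads(mcp_tool_call("catalog_lookup", {"entity": "formulation-service"}))
```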
Spec-Driven Development
Four phases that turn a vague idea into a verifiable spec, ready for agent execution.
01 · Specify
What and why
Define user stories, goals, success criteria. Not tech specs. Source of truth for the change.
02 · Plan
How
Tech stack, architecture, data models, integration with legacy systems, compliance constraints.
03 · Task
Break it down
Reviewable, testable, specific. Implementable and verifiable in isolation. TDD-friendly.
04 · Implement
Execute
Each task implemented individually. Review, test, approve. Continuous validation against the spec.
.specs/042-pricing/
  CONSTITUTION.md    # non-negotiables
  SPECIFICATION.md   # EARS reqs, user stories
  PLAN.md            # architecture, data models
  TASKS.md           # 22 reviewable units
  DIAGRAMS/          # 4 mermaid + 2 svg
  ADRs/              # decision records
@implementer picks T-001..T-022 in sequence
@reviewer gates each PR against SPEC
scope-guard blocks file edits outside .specs/042-*
Source: spec-kit pattern, used by Microsoft Specky and Open Horizons SDD pipeline. The spec is the contract between human intent and agent execution.
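The scope-guard rule can be pictured as a path check that runs before any file edit is accepted. This is a sketch under assumptions: the function name `edit_allowed` and the use of glob patterns are illustrative, not the Open Horizons implementation.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def edit_allowed(path, scope_patterns):
    """Return True only if `path` falls inside one of the allowed glob scopes."""
    candidate = PurePosixPath(path)
    # Reject absolute paths and parent-directory hops so that a path such as
    # ".specs/042-pricing/../../src/x.py" cannot sneak past the pattern match.
    if candidate.is_absolute() or ".." in candidate.parts:
        return False
    normalized = "/".join(candidate.parts)
    return any(fnmatchcase(normalized, pattern) for pattern in scope_patterns)


scope = [".specs/042-*"]
inside = edit_allowed(".specs/042-pricing/TASKS.md", scope)
outside = edit_allowed("src/pricing.py", scope)
escape = edit_allowed(".specs/042-pricing/../../src/pricing.py", scope)
other_spec = edit_allowed(".specs/043-promo/SPECIFICATION.md", scope)
```

`fnmatchcase` is used instead of `fnmatch` so the result does not depend on the host operating system's case rules.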
GitHub Copilot, ten foundational use cases
The starter menu. What every team uses on day one, before the first agent ships.
Source: Microsoft GitHub Copilot fundamentals. Open Horizons treats these as the on-ramp. Once teams adopt them, the next step is custom agents through Foundry Agent Service.
GitHub Copilot, use cases by persona
Five roles. Different prompts, same plumbing. The platform is one.
Developer
Code completion
Boilerplate generation
Multi-language support
Refactoring
Debugging
API integration
Unit test generation
QA Engineer
Test case generation
Edge case suggestions
Test data generation
Bug reproduction
Mock + stubbing
Regression tests
Coverage improvement
DBA
SQL optimization
Schema design
Stored procedures
Indexing
Data migration
Backup + recovery
Query debugging
DevOps
IaC (Terraform, Bicep)
CI/CD pipeline YAML
Log parsing
Containerization
System monitoring
Incident response
Shell scripting
Security
Secure coding
Threat modeling
Policy enforcement
Pen-testing scripts
Log analysis
IAM config
Crypto guidance
✓ Accelerates workflows
Suggestions, boilerplate, automation.
✓ Reduces cognitive load
Less syntax memorization.
✓ Enhances learning
Real-time examples and best practices.
✓ Improves efficiency
Less context switching, faster cycles.
Better together for AIOps
Four loops where GitHub + Azure AI Foundry produce outcomes neither delivers alone.
Loop 1
Code-first AI dev
Application code, model configurations, prompt engineering all in one repo. AI Toolkit + AI Foundry extension lets devs work locally before deploying.
Loop 2
Build, test, deploy
Automate AI model deployment and agent workflows. Deploy prompt flows, evaluations, and monitoring to Azure AI Foundry.
Loop 3
Enterprise security
Detect vulnerabilities early with CodeQL + Dependabot. Add observability, safety filters, evaluation frameworks for responsible AI.
Loop 4
End-to-end orchestration
Automates deployment of validated models and agents directly. Unifies AI orchestration across models and tools with chaining, memory, planning.
Faster innovation
Robust security
Enterprise governance
Scalable AI adoption
Where Open Horizons fits
Microsoft ships the building blocks. Open Horizons assembles them into one governed platform on your tenant.
GitHub gives you GitHub Copilot, Actions, Issues, Advanced Security. Azure gives you AKS, Entra, Key Vault, App Insights, Defender. Azure AI Foundry gives you Foundry Models, Foundry Agent Service, Foundry Observability. Each piece is excellent on its own and well documented. The work that takes 9 to 18 months in most enterprises is gluing them together into a single, governed, opinionated, reproducible platform with paved roads and a catalog. That is the work Open Horizons has already done. You consume the result.
What writing a spec looks like
VS Code, Specky extension, the .specs/ folder open. Intent first, then plan, then tasks, then code. In that order.
042-pricing-engine · ohorizons · main
EXPLORER
▾ .specs/
▾ 042-pricing-engine/
📋 CONSTITUTION.md
📄 SPECIFICATION.md
🏗 PLAN.md
✓ TASKS.md
📊 DIAGRAMS/
📝 ADRs/
▸ src/
▸ tests/
▸ helm/
📜 catalog-info.yaml
SPECIFICATION.md
CONSTITUTION.md
# Pricing Engine v2 · SPECIFICATION

## Goal
Reprice 1.2M SKUs in ≤4h, respecting margin floor.

## EARS Requirements
REQ-001 WHEN a SKU has margin < 8%,
        THE SYSTEM SHALL hold the prior price.
REQ-002 WHEN upstream cost increases >15%,
        THE SYSTEM SHALL route to @reviewer first.

## Out of scope
· Promo engine (separate spec 043)
· Currency conversion (handled upstream)
GitHub Copilot · Specky
P
REQ-002 mentions @reviewer. Should that fire on every price change or only when cost change > 15%?
C
Spec uses WHEN ... THE SYSTEM SHALL, so the guard fires only at the threshold. I would tighten the wording:
REQ-002 WHEN cost_delta > 0.15 THE SYSTEM SHALL route("@reviewer")
✓ Spec syntax valid. Adds to TASKS.md as T-009.
The dev writes intent in natural language. Specky validates EARS, suggests refinements, and queues implementation tasks. Code does not start until the spec compiles.
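What "the spec compiles" could mean in practice: each requirement must parse against an EARS template before tasks are queued. The sketch below covers only the event-driven `WHEN ... THE SYSTEM SHALL ...` pattern used in this spec. The `REQ-nnn` id format and the function name are assumptions, and this is not Specky's actual validator.

```python
import re

# Event-driven EARS pattern: "<id> WHEN <trigger>, THE SYSTEM SHALL <response>."
_EARS_EVENT = re.compile(
    r"(?P<id>REQ-\d{3}) WHEN (?P<trigger>.+?),? THE SYSTEM SHALL (?P<response>.+?)\.?"
)


def parse_ears(requirement):
    """Parse one requirement, or raise ValueError if it does not fit the pattern."""
    one_line = " ".join(requirement.split())  # fold line breaks and extra spaces
    match = _EARS_EVENT.fullmatch(one_line)
    if match is None:
        raise ValueError("not a valid event-driven EARS requirement: " + one_line)
    return match.groupdict()


req_001 = parse_ears(
    "REQ-001 WHEN a SKU has margin < 8%,\n"
    "        THE SYSTEM SHALL hold the prior price."
)

try:
    parse_ears("REQ-003 The system should be fast.")
    rejected = False
except ValueError:
    rejected = True
```

A vague sentence such as "The system should be fast" fails to parse, which is the point: it has no trigger and no verifiable response.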
Part VI · Day in the life
VI
How developers and agents actually use the portal.
A walkthrough with mockups, flows, and one collaboration diagram. So you can picture it before you build it.
The developer's portal, simulated
What a developer sees on a Tuesday morning. One pane. Everything wired.
backstage.ohorizons.ai/catalog
Open Horizons
Catalog
Create new
Golden Paths
TechDocs
AI Agents
FinOps
Dashboards
Software Catalog
46 entities · owned by 5 teams · all healthy
Name | Kind | Owner | DORA
formulation-service | Service | team-rnd | ● Elite
batch-tracking | Service | team-quality | ● Elite
@deploy | Agent | team-platform | ● High
@reviewer | Agent | team-platform | ● High
storefront | Service | team-commerce | ● Medium
Services and agents in one catalog. Same owner, same lifecycle, same DORA scoring. An agent is not a special creature, it is a component with a model.
How a developer uses the portal
From "I need a new microservice" to "it is live in production." Seven steps, fully self-service.
Total developer effort: ~5 minutes of clicks. Total wall-clock to production: ~12 minutes. No tickets opened.
The agent's portal, simulated
What an agent author sees. Same UI, different tab. Trajectory + tokens + verdict in plain view.
The platform refuses to record an agent action without a PR + spec link, and refuses to record a human action without a PR + reviewer. Same governance, both sides.
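The symmetric rule above can be sketched as an evidence check in front of the audit log. The `Action` shape, field names, and sample URLs are hypothetical; only the rule itself (agents need a PR plus a spec link, humans need a PR plus a reviewer) comes from the slide.

```python
from dataclasses import dataclass
from typing import Optional

# Evidence each kind of actor must attach before an action is recorded.
REQUIRED_EVIDENCE = {
    "agent": ("pr_url", "spec_link"),
    "human": ("pr_url", "reviewer"),
}


@dataclass(frozen=True)
class Action:
    actor: str
    actor_kind: str                  # "agent" or "human"
    pr_url: Optional[str] = None
    spec_link: Optional[str] = None
    reviewer: Optional[str] = None


def missing_evidence(action):
    """List the required fields that are absent for this actor kind."""
    if action.actor_kind not in REQUIRED_EVIDENCE:
        raise ValueError("unknown actor kind: " + action.actor_kind)
    return [name for name in REQUIRED_EVIDENCE[action.actor_kind]
            if not getattr(action, name)]


def record(log, action):
    """Append to the audit log, refusing actions that lack evidence."""
    missing = missing_evidence(action)
    if missing:
        raise PermissionError("refused, missing: " + ", ".join(missing))
    log.append(action)


audit_log = []
record(audit_log, Action("@deploy", "agent",
                         pr_url="https://example.invalid/pr/1",
                         spec_link=".specs/042-pricing/SPECIFICATION.md"))
unreviewed = Action("dev-1", "human", pr_url="https://example.invalid/pr/2")
```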
Portal surfaces, one click away
Six entry points covering the full developer + agent workflow.
Foundry Control
Live agents and models. The Toolbox: 12 MCP servers, 4 built-ins. 3-tier cache. Model routing.
AI Agents
Every agent: what it consumes, its model tier, its cache hit-rate, its trajectories.
One catalog. One identity model. One audit trail. Two kinds of users.
A developer scaffolds a service via a Golden Path. An agent is invoked by an event, calls tools through the harness, writes a trajectory. Both go through the same portal, the same RBAC, the same observability stack. The agent is not a side-car or a chatbot, it is a first-class platform citizen registered in the catalog with an owner, a cost center, a runbook, and an SLO. Treat it that way from Day 1 and the 95 percent pilot failure rate becomes a 5 percent problem.
From concept to pixels
You have seen the diagrams. Now look at the actual product.
The next ten slides are mockups of the real ohorizons.ai screens. Same chrome, same data shapes, same agents. You can open the URL on your laptop and recognize every panel. If a screen looks unfamiliar, that is the gap. If a screen makes you say "we need that," that is the opening for a Discovery.
→ Landing page
→ Maturity framework
→ Command dashboard
→ Create templates
→ AI Chat
→ AI Impact
→ Foundry Control
→ 17 Platform Agents
Part VII · Live tour
VII
Inside the platform.
Eight UI surfaces, simulated from the real ohorizons.ai showcase environment.
ohorizons.ai · landing
The public showcase. Same brand chrome devs see inside the portal.
ohorizons.ai
Open Horizons
Platform · Differentiators · Architecture · FAQ
● Agentic DevOps Platform · Open Horizons
The platform that accelerates the Agentic SDLC
AI-powered developer portal with Golden Paths, intelligent agents, and full observability, built on Backstage, Azure, and GitHub.
The foundry-agents service in namespace ai-services is the runtime that fronts Azure AI Foundry: serves agents, fronts the model router, aggregates the MCP Toolbox, runs the 3-tier semantic prompt cache, applies pre/postToolUse hooks, emits 21-field telemetry, writes Purview audit.
Walks new users through prerequisites, config, first deployment
docs-agent
three-horizons-platform
gpt-4o-mini CHEAP
100
92%
Technical writing + knowledge management across platform docs
Tiered model routing keeps 76% of calls on the cheap tier. Workhorse only when the task demands it. Cache hits drive total spend down by an order of magnitude.
Try it live
The whole platform is at ohorizons.ai
Sign in with GitHub, explore the Catalog, click into Foundry Control, run an AI Impact analysis, or talk to the six agents. The showcase is the platform, deployed on a public Azure subscription, with the same code you would deploy on yours.
URL
ohorizons.ai
Public showcase environment
Sign-in
GitHub OAuth
Read-only access to the showcase
Source
github.com/Ohorizons
Same code as your deployment
Maturity radar · current state
Where most LATAM enterprises score today across the four engineering layers. Pre-platform vs post-H2 Open Horizons.
Pre-platform · typical client baseline
Intent 1 · Context 2 · Platform 2 · Infra 3
Post-H2 Open Horizons · 12 weeks in
Intent 4 · Context 4 · Platform 4 · Infra 4
Maturity is decided by the weakest layer. The radar makes that visible. H1 lifts Infra, H2 lifts Platform, H3 lifts Context and Intent together.
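"Decided by the weakest layer" is a minimum over the layer scores. A minimal sketch using the two radar readings from this slide; the function name `overall_maturity` is hypothetical.

```python
def overall_maturity(scores: dict[str, int]) -> tuple[int, list[str]]:
    """Return the overall score (the minimum) and the layers that set it."""
    weakest = min(scores.values())
    limiting = [layer for layer, score in scores.items() if score == weakest]
    return weakest, limiting


pre_platform = {"Intent": 1, "Context": 2, "Platform": 2, "Infra": 3}
post_h2 = {"Intent": 4, "Context": 4, "Platform": 4, "Infra": 4}

print(overall_maturity(pre_platform))  # (1, ['Intent'])
print(overall_maturity(post_h2))       # (4, ['Intent', 'Context', 'Platform', 'Infra'])
```

Returning the limiting layers alongside the score tells the team where the next investment goes.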
What L4 and L6 actually measure
Two dials every platform team should read weekly. Intent debt and token spend distribution.
L4 · Intent debt index
MONITOR
0-40 OK · 40-60 MONITOR · 60+ ACTION
Distance between spec baseline and live agent behavior. Above 60 means the spec is stale and agents drifted.
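The three bands on the dial can be expressed as a simple classifier. This is a sketch, not the platform's scoring code. How the index itself is computed is not shown on the slide, and the boundary handling (40 counts as MONITOR, 60 counts as ACTION) is an assumption because the published ranges touch at those values.

```python
def intent_debt_status(index: float) -> str:
    """Map an intent debt index to the band shown on the dial.

    Assumption: boundaries resolve upward, so 40 is MONITOR and 60 is ACTION.
    """
    if index < 0:
        raise ValueError("intent debt index cannot be negative")
    if index < 40:
        return "OK"
    if index < 60:
        return "MONITOR"
    return "ACTION"
```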
L6 · Token spend · last 30d
5× SAVED
Cheap tier 76% · Workhorse 18% · Premium 5% · Cache hit 1%
Model routing keeps 76 percent of calls on the cheap tier. Without routing, the same workload would cost ~5× more with workhorse as the default.
L4 measures whether agents are still doing the right thing. L6 measures how much it costs. Both numbers belong in the platform team's Monday review.
PART
VIII
The Context Platform Stack.
Six layers, integrated. Cloud, platform, context, intent, integration, harness.
The stack at a glance
Six layers, top to bottom. Intent flows down, telemetry flows up.
Upstream Backstage on AKS · 22 Golden Paths · RBAC plugin · DORA Four Keys · OPA Gatekeeper · the scaffolder.
Grafana L2 · Catalog
L1 · Cloud
Cloud platform team
Terraform 18 modules · AKS · networking · Key Vault · ACR · PostgreSQL Flexible Server (pgvector) · Azure AI Foundry.
Grafana L1 · terraform/
intent
Intent flows from L4 down into Golden Paths in L2 and agent behavior in L3. The harness in L6 wraps every model call. Integration in L5 is how GitHub, Azure DevOps, Argo CD, and MCP coexist. Everything sits on the Terraform-managed Azure foundation.
Layer 1 · Cloud and Infrastructure
The compute substrate. Eighteen Terraform modules, declared and reproducible.
AKS
Kubernetes 1.34
Private API server, autoscaler min=1, max=4, Workload Identity enabled.
ACR
Container registry
Admin disabled, managed-identity pull, signed images via cosign.
Key Vault
Secrets
Private endpoint, RBAC, CSI driver projects to pods as files.
PostgreSQL
Flexible Server
Private VNet, 30-day backup, pgvector enabled for memory and RAG.
Log Analytics
Telemetry
90-day retention, Container Insights, the App Insights sink for L6.
AI Foundry
Models
gpt-5-1, gpt-5-4-pro deployed by name. Routing handled at L4.
Ingress
NGINX + cert-manager
Let's Encrypt, 4 TLS ingresses Ready by default.
All resources tagged with customer_name, environment, cost_center for L1 + L6 FinOps roll-up.
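The FinOps roll-up only works if the three tags are present everywhere. A minimal sketch of a tag check, assuming tags arrive as a plain dictionary; `missing_tags` is a hypothetical helper, not part of the Terraform modules.

```python
REQUIRED_TAGS = ("customer_name", "environment", "cost_center")


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or blank."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]
```

A check like this could run in CI against a plan output and fail the build when the list is non-empty.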
Layer 2 · Platform Engineering
The developer experience layer. Upstream Backstage on AKS, 22 Golden Paths, full observability.
Catalog discovery + scaffolder publish. App ID 3010479 in the reference deployment.
ADO connection
Workload Identity Fed
No PAT, no SP secrets. Federated credential.
Argo CD
Git-source agnostic
Consumes manifests from either source. The deployment authority.
API Center
Unified inventory
Single API inventory across GitHub + ADO repos.
Cross-links
Catalog ↔ everything
Backstage entities link to GitHub Issues, ADO Boards, Argo apps, Grafana dashboards.
Layer 6 · Harness Engineering
The runtime that wraps every model call. With L6, the agent becomes a governed production system.
How the harness wraps a model call
Observability · 3 hooks
Gateway pre/postToolUse hooks intercept every call. A2A v1.0 context with correlation IDs, spans, trace. 21-field llm.call.completed streamed to App Insights.
FinOps · budget enforcement
Per-agent, per-team, per-CC budgets enforced at 50/80/100 percent. 100 percent hard-stops the agent.
Security & compliance · 2 sinks
Microsoft Purview audits every retrieval against sensitive sources. Sentinel SIEM/SOAR receives prompt-injection, safety violations, scope-guard breaches.
Identity · per agent
Entra Workload Identity Federation per agent. Kubernetes ServiceAccount, Microsoft Entra identity, scoped RBAC.
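The FinOps card above describes thresholds at 50, 80, and 100 percent, with a hard stop at 100. A minimal sketch of that decision, assuming spend and budget are tracked in USD per agent; `BudgetState`, `budget_state`, and `may_call_model` are hypothetical names, not the harness API.

```python
from enum import Enum


class BudgetState(Enum):
    OK = "ok"
    WARN_50 = "warn_50"
    WARN_80 = "warn_80"
    HARD_STOP = "hard_stop"


def budget_state(spent_usd: float, budget_usd: float) -> BudgetState:
    """Classify spend against the 50/80/100 percent thresholds."""
    if budget_usd <= 0:
        raise ValueError("budget must be positive")
    ratio = spent_usd / budget_usd
    if ratio >= 1.0:
        return BudgetState.HARD_STOP
    if ratio >= 0.8:
        return BudgetState.WARN_80
    if ratio >= 0.5:
        return BudgetState.WARN_50
    return BudgetState.OK


def may_call_model(spent_usd: float, budget_usd: float) -> bool:
    """At 100 percent the agent is stopped; below that, calls proceed."""
    return budget_state(spent_usd, budget_usd) is not BudgetState.HARD_STOP
```

In this sketch the check would run in a pre-call hook, so a breached budget blocks the call before any tokens are spent.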
Per-layer Grafana dashboards
Six layers, dedicated dashboards each. Data sources: Prometheus, Loki, App Insights, Azure Cost API, PostgreSQL.
Layer
Highlight dashboards
L1, Cloud
AKS Cluster Health · AKS Resource Utilization · Azure Resource Inventory · Azure FinOps Spend by Service · FinOps Anomaly · Key Vault Health · PostgreSQL
L2, Platform
Backstage Service Health · Catalog Coverage · Argo CD Sync · Golden Path Adoption · DORA Four Keys · Ingress + TLS · OPA Violations
L3, Context
MCP Server Health · Tool Call Distribution · Prompt Cache Hit Ratio · Shared Context Store · 3-Tier Memory · Skill Load Heatmap · RAG Index Health
L4, Intent
SDD Pipeline Status · Model Routing Decisions · Routing Cost Savings · Scope Guard Activity · Intent Drift
L5, Integration
GitHub Actions Health · Azure DevOps Pipeline Health · Argo CD App-of-Apps · API Center Inventory · Catalog Cross-Link Coverage
L6, Harness
Agent Fleet Overview · Trajectory Volume · Token Consumption Live · Cost Live USD · Budget vs Actual · Budget Alerts 50/80/100 · Eval Scores · Content Safety · Purview Access · llm.call.completed Stream
Every dashboard has Alertmanager rules in prometheus/alerting-rules.yaml. The platform refuses to operate without observability: CI gates enforce dashboard + alert presence.
Cross-layer FinOps
The CFO view. L1 cloud spend + L6 AI spend, one board. Closes the "AI is unaffordable" objection before it leaves the room.
Total spend
Cloud + AI roll-up
L1 cloud and L6 AI in one number, with month-over-month deltas, broken down by cost center, team, environment.
Forecasting
Budget vs actual
Forecast vs budget for the current month. Anomalies and breaches in the last 30 days. Top 5 cost drivers across cloud + AI.
Per agent
Cost-per-trajectory
Efficiency metric, USD per successful trajectory, per agent. Drives prompt-cache investment, model routing, and retirement decisions.
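Cost-per-trajectory divides an agent's total spend by its successful runs only, so failures make the number worse. A minimal sketch, assuming each trajectory carries one of the verdicts success, failure, or escalation; the function name is hypothetical.

```python
from typing import Optional


def cost_per_successful_trajectory(
    total_cost_usd: float, verdicts: list[str]
) -> Optional[float]:
    """USD per successful trajectory, or None when nothing succeeded."""
    successes = sum(1 for verdict in verdicts if verdict == "success")
    if successes == 0:
        return None  # undefined rather than infinite
    return total_cost_usd / successes
```

Note that the numerator includes the cost of failed and escalated runs. That is what makes the metric useful for retirement decisions.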
Why six layers and not five or seven
Five collapses. Seven over-specifies. Six survived production contact.
5
COLLAPSES
Five layers lose contracts.
Five collapses Integration into Platform (the GitHub + ADO + Argo + MCP coexistence problem becomes a Backstage problem) and Harness into Execution (telemetry + FinOps + hooks + audit become the agent author's problem). Both produced unmaintainable bundles in real deployments.
SHIPPED
6
SURVIVES
Six layers, clean ownership.
L5 Integration owned by platform integrators, not by Backstage maintainers. L6 Harness owned by SRE + FinOps + Security, not by the agent author. Each layer maps cleanly to who is on-call when it fails. Survives audits.
7
OVER-SPECIFIES
Seven adds noise.
Splitting Identity out of Cloud, or carving Telemetry out of Harness, doubles the contracts without doubling the ownership. The boundary stops matching how teams actually operate. Layers should be the minimum that survive a real audit, not the maximum a diagram can hold.
The six-layer model is not a theory. It is the residue of trying five and seven first. Clean contracts, each one with a named owner.
PART
IX
The Horizons Phases.
H1 Foundation, H2 Enhancement, H3 Innovation. A staged adoption model.
Three phases, three outcomes
Each phase builds on the previous. No customer is asked to commit to H3 before H1 works.
H1
Foundation, 4 to 8 weeks
Cloud + Platform + Portal. A working IDP on Azure with Backstage, GitOps, observability. Pilot team usable within 30 days.
H2
Enhancement, 8 to 12 weeks
Golden Paths + Governance + Self-service. Application teams ship via paved roads. Continuous compliance enabled.
OPA + Gatekeeper, tfsec, Trivy, gitleaks, Defender for Cloud surfaced in Backstage. Evidence on a schedule.
Add
FinOps starts
Cost tags on every resource. FinOps dashboards. Per-team showback or chargeback.
Success criteria, 80 percent of new services via Golden Path. Every PR gated by tfsec, Trivy, OPA. FinOps dashboard with attributed spend.
H3 · Innovation
Make agents first-class platform citizens. Move from AI assistants to production agentic systems.
01 Agent IDP, the second persona. Agent catalog in Backstage with owner, RBAC, cost center, version. Trajectory logs into Loki + Backstage. Per-agent cost dashboards in Grafana.
04 Harness (L6). The full telemetry + FinOps + Purview + Sentinel stack goes live.
Success criteria, at least 5 production agents each with >100 trajectories/week. Per-agent cost attribution is exact and auditable. A failed run replays deterministically.
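The volume criterion is easy to check mechanically. A minimal sketch, assuming weekly trajectory counts per production agent are already available from the trajectory store; the function name and input shape are hypothetical.

```python
def meets_h3_volume_criterion(
    weekly_trajectories: dict[str, int],
    min_agents: int = 5,
    min_per_week: int = 100,
) -> bool:
    """True when at least `min_agents` agents each exceed `min_per_week`."""
    qualifying = [
        agent for agent, count in weekly_trajectories.items()
        if count > min_per_week  # strictly more than 100, per the criterion
    ]
    return len(qualifying) >= min_agents
```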
Visualizing the journey
Day 0 to Day 180+, three stages, three outcomes.
Each phase delivers value on its own. Each one multiplies the value of the next.
PART
X
GitHub and Azure DevOps.
The foundation of everything. Three scenarios, one platform.
Source control is the center of gravity
If it is not in Git, it is not real.
Every workflow in Open Horizons, every deploy, every spec, every agent invocation, every audit event, begins or ends in a Git repository. GitHub and Azure DevOps are not afterthoughts. They are the foundation that makes the rest of the platform work.
Migrations in progress, divisional preferences, M&A scenarios. Single Backstage catalog, dual auth, Argo CD agnostic.
Scenario A · GitHub end to end
Seven steps from "Create new service" to live in production.
01 Developer opens Backstage, picks "Create new service" from the scaffolder.
02 Backstage uses the GitHub App to create a new repo with the chosen Golden Path.
03 Repo registered in the Backstage catalog automatically.
04 GitHub Actions runs on every push: tests, scans, image build, push to GHCR or ACR.
05 Argo CD picks up the manifest change, syncs the new version to AKS.
06 GitHub Advanced Security results stream into Backstage. tfsec, Trivy, gitleaks results gate the PR.
07 @reviewer, @sentinel, @security agents post comments on the PR.
Scenario B · Azure DevOps end to end
Six steps. Same outcome, ADO-native plumbing.
01 Developer opens Backstage, picks a Golden Path.
02 Backstage uses the ADO REST API (via Service Connection) to create the repo and pipeline.
03 Repo registered in Backstage, discovered as a catalog entity.
04 Azure Pipelines runs on every push, with the same scans as the GitHub equivalent.
05 Argo CD consumes the rendered manifests, syncs to AKS. Argo is Git-source agnostic.
06 Boards work items link to PRs and surface in Backstage via the ADO plugin.
Scenario C · Hybrid coexistence
A single Backstage catalog discovering entities from both sources. Dual auth. Single agent plumbing.
Catalog
Unified
A single Backstage catalog that discovers entities from both GitHub and ADO.
Auth
Dual
Users sign in via either provider. SSO mapping in Entra.
CI/CD
Native + converged
Each repo uses its native CI (Actions or Pipelines) but converges on the same Helm + Argo CD pattern.
Agent identity
Common
Workload identity in Azure is the common substrate. Agents authenticate the same way regardless of source repo.
Migration
Stepping stone
Hybrid is often a stepping stone to consolidation. The platform does not force a choice.
What makes the foundation strong
Five non-negotiables across A, B, and C.
01 Everything is in Git. Code, IaC, policies, specs, prompts, instructions, all version-controlled.
02 PRs are the unit of change. No deploy without a PR. No agent action without a trajectory tied to a PR or spec.
03 CODEOWNERS flows into the catalog. Ownership is never lost.
04 Branch protection is enforced. No direct pushes to protected branches. No bypass without audit.
05 Argo CD is the deployment authority. No kubectl apply from laptops.
DevSecOps tenet "everything as code" made operational, by Open Horizons defaults.
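For the CODEOWNERS non-negotiable, the first step toward a catalog owner is reading the pattern-to-owner rules out of the file. How Open Horizons performs that mapping is not shown in this deck; the sketch below only parses the file format, `parse_codeowners` is a hypothetical helper, and it deliberately ignores edge cases such as an escaped `#` inside a pattern.

```python
def parse_codeowners(text: str) -> list[tuple[str, list[str]]]:
    """Parse CODEOWNERS text into (pattern, owners) rules, in file order."""
    rules: list[tuple[str, list[str]]] = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # blank or comment-only line
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules
```

On GitHub the last matching pattern takes precedence, so a consumer of these rules would typically scan them in reverse.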
PART
XI
Golden Paths and the Agent Catalog.
Paved roads for developers. First-class governance for agents.
A Golden Path in one line
The opinionated, paved, well-lit road for the most common developer journeys. Fully scaffolded, fully wired, ready to ship.
Term coined at Spotify. Operationalized in Open Horizons as Backstage Software Templates that produce a working repo, pipeline, infrastructure, observability, and catalog entry, in a single click.
What a Golden Path produces, all at once
Production-ready on Day 1. Twelve artifacts wired together, four families.
04 Every model call: model name, tokens in/out, cost, latency
05 Memory reads and writes
06 The final output and a verdict: success, failure, escalation
Trajectories are stored in PostgreSQL + Loki, indexed, replayable, exportable to OpenTelemetry, and surfaced in Backstage. Without them, you have no debug story.
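A trajectory record covering the three items listed above might look like the following. This is a sketch with hypothetical type and field names; the real schema in PostgreSQL and Loki is not shown in this deck, and the record deliberately models only items 04 to 06.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCall:
    """One model call: model name, tokens in/out, cost, latency."""
    model: str
    tokens_in: int
    tokens_out: int
    cost_usd: float
    latency_ms: float


@dataclass
class Trajectory:
    """Hypothetical record for model calls, memory operations, and outcome."""
    model_calls: list[ModelCall] = field(default_factory=list)
    memory_ops: list[str] = field(default_factory=list)  # reads and writes
    final_output: str = ""
    verdict: str = "success"  # success, failure, or escalation

    def total_cost_usd(self) -> float:
        return sum(call.cost_usd for call in self.model_calls)

    def total_tokens(self) -> int:
        return sum(c.tokens_in + c.tokens_out for c in self.model_calls)
```

Keeping cost and tokens on each call, rather than only as totals, is what lets a dashboard drill down from a spend figure to a single step.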
Cost governance, FinOps for AI
The single fastest way to lose executive trust in AI is a surprise bill. Open Horizons prevents it with six controls.
Ledger
Per-agent
Middleware logs every model call with agent_id, team, cost_center, model, tokens, USD.
Dashboards
Per-team and per-CC
Grafana panels with filters, forecasts. Drill down from CFO view to a single trajectory.
Alerts
50, 80, 100 percent
Alertmanager fires when an agent or team crosses a threshold.
Ceilings
Hard
Per-agent monthly budgets enforced by the runtime. The agent refuses to run once its budget is exceeded.
Routing
.github/model-routing.yaml
Declares which model handles which task. Cheaper models for cheaper tasks.
Eval budget
Separate
Continuous evaluation cost is tracked separately from production traffic.
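The ledger fields named above (agent_id, team, cost_center, model, tokens, USD) are enough to build the per-team and per-cost-center views. A minimal sketch, assuming one ledger entry per model call; `LedgerEntry` and `spend_by` are hypothetical names, not the middleware's actual types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    agent_id: str
    team: str
    cost_center: str
    model: str
    tokens: int
    usd: float


def spend_by(entries: list[LedgerEntry], key: str) -> dict[str, float]:
    """Total USD grouped by one ledger field, e.g. 'team' or 'cost_center'."""
    totals: dict[str, float] = defaultdict(float)
    for entry in entries:
        totals[getattr(entry, key)] += entry.usd
    return dict(totals)
```

Because every entry carries all three attribution fields, the same ledger answers the CFO's question and the agent owner's question without a second data source.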
PART
XII
Security, compliance, governance.
Security is not a feature. It is a property of every layer. The secure way is also the easy way.
The seven security domains
Each maps to NIST CSF, ISO 27001, CIS, SOC 2.
Identity
Zero passwords
Workload Identity for every workload. Zero standing SSH. Just-in-time elevation via Entra PIM. RBAC at every layer.
Secrets
Key Vault SoT
Single source of truth, private endpoint, RBAC. Pods consume via CSI driver. gitleaks blocks PRs that contain secrets.
Network
Private by default
Private endpoints for every PaaS that supports them. Default-deny NetworkPolicies inside the cluster. HSTS on ingress.
Asset inventory in Backstage catalog. RBAC export. Runbook + on-call as incident management.
NIST CSF 2.0
Identify, Protect, Detect
Catalog + RBAC + Defender + alerts + runbooks + backups. Mapped end to end.
CIS Kubernetes
PSS restricted
CIS-aligned AKS configuration via Terraform. Enforced through OPA.
Azure WAF
Five pillars
Reliability, Security, Cost, Performance, Operational Excellence with explicit checklists per service.
NIST AI RMF
Map, Measure, Manage, Govern
Implemented across L3, L4, L5. SDD provides the Manage and Govern receipts.
ISO 42001
AI Management System
Lifecycle, risk, transparency, monitoring, supported by SDD pipeline and trajectory infrastructure.
Audit posture
"How did this change reach production?" Answered in seconds, end to end.
01 The spec in .specs/NNN-feature/ describes the intent.
02 The PR in GitHub or ADO shows the change, the reviews, the scans.
03 The CI run shows the tests and the security gates.
04 The Argo CD sync shows the actual deployment to the cluster.
05 The Grafana dashboard shows the post-deploy behavior.
06 The trajectory, if an agent was involved, shows the autonomous steps.
Every link in this chain is immutable, timestamped, and signed.
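The six-link chain can be checked for completeness before an audit. A minimal sketch with hypothetical field names; it only verifies that each link is present, not that it is immutable or signed. The trajectory is required only when an agent was involved, matching step 06.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvidence:
    """Hypothetical bundle of references for one production change."""
    spec_path: Optional[str] = None
    pr_url: Optional[str] = None
    ci_run_url: Optional[str] = None
    argo_sync_id: Optional[str] = None
    dashboard_url: Optional[str] = None
    trajectory_id: Optional[str] = None
    agent_involved: bool = False


def missing_links(evidence: ChangeEvidence) -> list[str]:
    """Return the names of the chain links that are not populated."""
    required = ["spec_path", "pr_url", "ci_run_url", "argo_sync_id", "dashboard_url"]
    if evidence.agent_involved:
        required.append("trajectory_id")
    return [name for name in required if not getattr(evidence, name)]
```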
PART
XIII
Business value and ROI.
Three buckets. Customer's own numbers. Publicly verifiable benchmarks.
The three buckets of value
A complete business case quantifies all three. Plug your own numbers in. That is the only credible business case.
Bucket 1
Productivity
Faster scaffolding. Reduced cognitive load. AI-assisted code, review, docs. Faster troubleshooting. Measured via DORA + GitHub Copilot research ranges.
Bucket 2
Risk reduction
Continuous scanning. Always-on observability. Per-agent identity + audit. Spec-driven dev. Measured via DORA CFR/MTTR, Verizon DBIR, IBM Ponemon.
Bucket 3
Cost optimization
Cost attribution by tag. AKS autoscaler. Scheduled start/stop dev envs. Per-agent budgets. Model routing. Tool consolidation. Measured via FinOps Foundation, Flexera.
TCO frame
Compare against what you would build yourself, or a closed SaaS IDP.
Status quo
Build it yourself
6 to 18 months to a working IDP. 4 to 8 platform engineers full-time during build. Custom integrations everywhere. Custom agent runtime. Maintained internally, forever.
Alternative
Closed SaaS IDP
Faster to start, but your data and code go somewhere else. Per-seat pricing. Limited customization. Often no agent IDP at all.
Recommended
Open Horizons
4 to 12 weeks to a working H1. Microsoft and certified partners deliver the heavy lift. Customer owns the code, cluster, data. Open source license, no per-seat pricing.
The accelerator typically pays for itself inside the H2 window through productivity and consolidation savings alone.
Three scales of speed
Same scope of platform. Three timeframes. Customer's choice.
9 to 18 months
Industry baseline
A platform of this scope, built from scratch. The cost of going alone. Forrester Wave Q1 2026 baseline for comparable IDPs.
90 to 180 days
Three Horizons end-to-end
H1 Foundation + H2 Enhancement + H3 Innovation integrated. First production AI workload. Observability complete. The accelerator path.
2h 30m
git clone to production H1
Backstage on AKS reachable via HTTPS, Let's Encrypt TLS, GitHub OAuth, Argo CD syncing, Grafana live, first Golden Path scaffoldable. The install-wizard path, around 3h when an agent drives it.
The compression is the entire business case. Same scope, three speeds, customer's choice.
From numbers to next steps
The business case is yours to build. The first step is small. Two weeks. Fixed scope. Walk-away clause.
Everything from here is mechanics. The Discovery is a paid, scoped, fixed-deliverable engagement that produces an H1 plan your CFO can sign off on or reject in writing. You leave with the architecture diagram, the risk register, the cost estimate, and a partner short-list whether or not you proceed. No build-trap. No retainer. The only commitment is two weeks of your platform lead's time.
Step 1
Discovery, 1 to 2 weeks
Step 2
Pilot, 2 to 4 weeks
Step 3
H1 Foundation, 4 to 8 weeks
Steps 4 + 5
H2 + H3, 20 to 36 weeks
PART
XIV
Getting started.
From discovery to innovation, five steps with decision gates at every step.
The five-step engagement model
Each step has a decision gate. Customers can stop, pause, or scale at any gate.
What you need on Day 1
To start a Discovery, the customer needs six things. Everything else is provided.
01 An Azure subscription (or willingness to create one)
02 A GitHub organization or Azure DevOps organization
03 A Microsoft Entra ID tenant, typically the same one used for M365 or Azure
04 An executive sponsor, typically CTO, head of platform, or chief architect
05 A named platform lead on the customer side
06 One to three pilot teams willing to be early adopters
A short-list of certified partners matched to your industry, region, and AI ambition.
Success plan
Milestones + gates
A 90-day plan with success criteria and decision gates.
Microsoft backstop
Architectural
Microsoft is engaged on architectural escalations across the engagement.
Common pitfalls
Six failure modes, all preventable with the staged model.
01 Trying to do H1+H2+H3 at once. Stage. Each horizon delivers value alone.
02 Building before piloting. Spend the 2 to 4 weeks on the pilot. It pays for itself.
03 Skipping platform-as-a-product. Treat developers as customers. Survey them, iterate.
04 Underfunding the platform team. A platform without a team becomes a graveyard. Budget for 2 to 4 dedicated engineers.
05 Letting agents bypass governance. Use SDD, trajectories, and cost ceilings from Day 1.
06 No exit story from a partner. Insist on knowledge transfer milestones in every SOW.
What "done" looks like
You will know Open Horizons is working when five things become true.
01 A new service goes from idea to production in hours, not weeks.
02 An application developer can answer "where does this metric come from" in the portal, without asking the platform team.
03 An auditor can trace any production change to a spec, a PR, and a trajectory in under a minute.
04 An agent invocation has a known cost, a known owner, a known SLO. Like any other service.
05 The platform team is shipping a product, not fighting fires.
PART
XV
Partner ecosystem.
You buy the accelerator once. You customize it forever, with partners who know the stack.
Certified partners do four things
Often in combination. Always with customer ownership intact.
Service 1
Deploy + onboard
Stand up the platform in the customer's Azure tenant. H1 in 4 to 8 weeks.
Service 2
Customize
New Golden Paths, plugins, agents, MCP servers, bespoke compliance mappings, custom eval pipelines.
Service 3
Operate Day-2
Team augmentation, upgrades, on-call, FinOps reviews, prompt iteration.
Service 4
Train + enable
Platform team certification, developer onboarding curricula, "train the trainer" for large enterprises.
Certification tiers
Three tiers, based on demonstrated outcomes, not paid status. Renewed annually.
Registered
Trained + signed
Code of conduct signed, disclosure rules agreed. Can deliver onboarding and basic customization.
Certified
>=3 successful H1+H2
Passed technical assessment. Can deliver Day-2, complex customization, agent work.
Strategic
H3 track record
Published reference architectures, named technical leads. Enterprise scale, regulated industries, multi-region.
What partners do not do
Four boundaries that protect the customer.
01 Partners do not own customer code or data.
02 Partners do not lock customers into a fork of Open Horizons.
03 Partners do not bypass governance, security, or audit controls.
04 Partners do not hold exclusivity over a customer. Customers can engage multiple partners, switch, or insource.
How to engage a partner
A five-step procurement pattern that keeps incentives aligned.
01 Talk to the Microsoft field team. Size the engagement, match partners to your context.
02 Request two or three partner proposals. Compare approaches, references, pricing models.
03 Run a paid Discovery, typically 1 to 2 weeks. Produces a fixed-scope H1 plan.
04 Sign an SOW tied to outcomes, not hours.
05 Insist on knowledge transfer milestones. Every engagement should reduce, not increase, dependence on the partner.
The field-friendly takeaway
"Pilots fail at 95 percent because teams build agents without the four-layer foundation. Open Horizons gives you that foundation on Day 1. The 95 percent becomes a 5 percent problem instead."
Paula Silva
Software Global Black Belt
The data is in the references at the end of the playbook. The accelerator is in the repository. The conversation starts with a Discovery.
References
The research-grounded backbone of this deck. Sixteen open citations, four families.
Every claim in this deck traces to one of these sources. The playbook has the full bibliography with annotations.
Thank you
Let's talk.
If your enterprise is stuck between pilots and production, the conversation starts with a Discovery. One to two weeks. Fixed scope. A 90-day H1 plan you can fund or walk away from.