Cody Champion — Dublin-based Applied AI Architect & GenAI Systems Lead
cody@bitsandbeakers.com · LinkedIn · GitHub
I have built and documented governance structures from scratch inside a federal agency — not advised on them from the outside. At NSF I created an AI deployment playbook, co-chaired the 100+ member AI Community of Practice, engineered the agency's first vector and graph database capabilities, and sat as a voting member on the engineering review board. The outputs were approval processes, risk classification frameworks, audit trails, AI literacy programs, and a governance stack that let the agency move fast without exposing itself to unacceptable risk.
At Accenture I translate that governance experience directly into delivery: clients in regulated sectors get systems with access control, prompt injection defense, human-in-the-loop checkpoints, cost/latency monitoring, and evidence artifacts ready for internal audit or regulatory review. I can map technical controls to governance requirements — not just describe the controls.
Keywords: AI assurance · model risk management · auditability · risk classification · governance-to-engineering · human review checkpoints · control mapping · EU AI Act readiness · AI deployment playbook · evidence artifacts · model monitoring · responsible AI · regulated GenAI deployment · AI transparency · access control · prompt injection defense
I design and build end-to-end GenAI systems that run in production, not just demos that impress in a sandbox. The architecture decisions I own most often: retrieval stack design (hybrid vector, keyword, and graph retrieval with source attribution), agent and tool orchestration (MCP-based, least-privilege, auditable), evaluation loops (ground truth plus LLM-as-judge sampling), and observability (Langfuse where it fits, conventional telemetry where it does not).
I work primarily across Azure AI Foundry and GCP, with practical LLMOps around quality, safety, latency, and token spend. The 99.6% ML infrastructure cost reduction I achieved at Accenture Federal was not a procurement decision — I personally rewrote the ML codebase and cloud architecture. I can design for scale, explain the cost model, and own the implementation.
Keywords: Applied AI Architect · GenAI Architect · Enterprise GenAI · RAG · agents · tool orchestration · LLMOps · Azure AI Foundry · GCP · cloud-native AI deployment · observability · Langfuse · MLflow · FastAPI · Docker · Kubernetes · token cost · latency optimization · AI evaluation harnesses · production GenAI · agentic workflows · embedding pipelines
My public LLM evaluation work (llm-eval-workbench and the PAEF Zenodo preprint) is built around regulated-enterprise model readiness. The harness covers capability, reliability, governance behavior, groundedness, security reasoning, cost and latency tracking, and a structured failure taxonomy. Its public scenarios are benign and designed to produce reviewable evidence rather than to bypass safeguards. Every run produces structured artifacts, not just a score.
The PAEF preprint (Zenodo DOI: 10.5281/zenodo.19848867) formalizes a parallelized atomic evaluation framework for contract compliance across multiple models, with token-level margin analysis. The embedding benchmark evaluated 33 models on 1,700 arXiv papers and was designed to be cross-domain, reproducible, and MLOps-rigorous. I think about evaluation the way an engineer thinks about CI: it runs continuously, it catches regressions, and it produces evidence you can show to a skeptic.
On the operational side I have built prompt injection defenses, sandboxed tool execution, and access control inside live enterprise GenAI systems. I also have experience explaining AI safety tradeoffs to non-technical stakeholders — governance boards, federal oversight bodies, and clients who need to sign off on deployment risk.
Keywords: LLM evaluation · model readiness · model evaluation · governance behavior · RAG groundedness · access-control reasoning · failure taxonomy · cost/latency tracking · AI observability · prompt injection resilience · contract compliance · evaluation harness · AI reliability · PAEF · atomic evaluation
Based in Dublin, Ireland. Available for roles in Dublin, remote roles within Ireland, the UK, and EMEA, and senior global remote positions. Not currently seeking US-based on-site roles.