Opinion & Analysis
Written by: Chirag Agrawal | Global Head of Data Science at Novelis
Updated 2:00 PM UTC, Wed October 29, 2025

Enterprises are shifting from static chatbots to autonomous, agentic systems that plan, call tools, merge with operational data, and execute, often without a human in the loop for each step. That change breaks the traditional “model risk” mindset: That governance must move from models to behavior, from validation to assurance, and from policy statements to auditable controls embedded in runtime.
This article outlines an actionable operating model, technical guardrails, and a 12-month plan to make effective and auditable autonomous AI possible in complicated businesses.
Autonomous agents, planning and executing actors between APIs, data stores, and business applications, are out of the laboratory. They’re being prototyped to classify tickets, extract and relay knowledge, text to SQL intelligence, and even to send emails or SAP lookups, creating tangible value and new threats such as data breach, tool misuse, hallucinated behavior, and regulatory risk.
Internally, the majority of organizations have recognized GenAI security and governance lagging in adoption. Surveys tend to reflect high interest in advanced analytics and AI, but with concerns regarding governance maturity and secure deployment pathways. Governance concerns intensify as pilots prepare to scale.
Outside of the organizations, regulators are also increasing pressure: The EU AI Act has entered into force, introducing a risk-based regime and a phased series of obligations for the deployment of models in critical systems. The regulations also require human review, risk management, and logging.
Governance requires a common language to describe the level of autonomy an AI system possesses. A productive template is to establish Levels of Autonomy (A0–A5) for enterprise agents:
To apply the framework, map each production agent to a level, controls, and evidence of proof to the level. This mirrors the way internal councils already differentiate data criticality and governance cadences, and it is coupled with regulators’ emphasis on risk classification, human oversight, and logs.
Governance becomes actionable with a lean policy stack:
This architecture aligns with today’s data catalog lineage/metadata capabilities, as well as modern operating models proposed by strategy partners, riding existing rails rather than reinventing them.
Data controls: Embrace “data as a product” with automated lineage and dynamic metadata; associate retrieval (RAG) with governed sources, chunking rules, and retrieval parameters versioned and auditable. This enhances factuality and auditability with lower IP/PII risk.
Model & prompt controls: Use an LLM mesh (abstraction layer) to decouple applications from model providers. It yields cost agility, red team testing across models, and rapid de-risking if a provider alters terms or quality. Maintain a timely registry with versioning and side-by-side testing.
Tool usage and action controls: Limit agents with allowed listed tools, parameter whitelists, environment scopes, and rate limits; for more independence, add dry run modes, dual control approvals, and canaries before complete execution. These trends demonstrate how teams securely introduced email actions, API calls, and SQL connectivity in pilots.
Observability & evaluation: Stand up an evaluation store of golden prompts, adversarial examples, and acceptance thresholds (e.g., fabrication rate, policy violations). Hook these into CI/CD and runtime monitors; alert when drift exceeds thresholds.
Incident readiness: Predefine AI incident types (safety, privacy, IP, security, bias), the kill switch path, and root cause procedures that pass through prompts, tools, data retrieval, and model output. Sync with enterprise data governance councils to ensure clean stewardship and escalation.
Executives seek returns; regulators seek certainty. Watch both:
A CAIO or a governance lead must be an owner of a balanced scorecard that reports these metrics monthly to the executive committee and quarterly to the board.
In industrial environments, autonomy manifests itself through maintenance manager agents (scheduling work orders), document intelligence on specifications and standards, text-to-SQL for operational KPIs, and scenario automation for near real-time updates. All patterns leverage the controls outlined above — specifically, tool allow lists, dataset scoping, and scenario-based refresh with auditable traces.
When agents are on the verge of execution (e.g., sending an email to stakeholders, opening tickets, or establishing parameters), treat them as A1/A2 with double controls and dry runs until guardrails and evaluation coverage are established. Those conducting email tool testing and SQL connectivity found that explicit address handling, parameter validation, and post-action logging were necessary to prevent misfires and establish trust quickly.
Autonomous AI will transform business work, provided we shape behaviors, not models, which means an accurate autonomy taxonomy; a policy stack that composes into controls; lineage-first data practices; mesh-based model agility; rigorous evaluation; and measures that prove both value and safety. With regulators issuing clear expectations and internal constituencies demanding trust, this is the moment to hardwire governance into the architecture—so autonomy increases with trust.
About the Author:
With over 15 years of leadership experience at the intersection of Artificial Intelligence, Data Science, and Cloud Transformation, Chirag Agrawal is spearheading the future of enterprise innovation with Generative AI, Agentic AI, and Autonomous Decision Systems. As the Global Data Science Head of a leading manufacturing company, Chirag has architected and developed AI ecosystems that deliver operational excellence, fuel digital transformation, and provide quantifiable business value. He holds a Bachelor of Science in Mechanical Engineering and a Master’s in Analytics with a major in Machine Learning.