
Ever had a payment stall on your banking app? That frustrating delay often traces back to hidden mainframes. These last-century systems are essential but invisible to today’s monitoring tools. This lack of mainframe observability leaves organizations struggling to deliver a seamless digital experience.
This story is all too common. Many organizations lack true end-to-end visibility because their mainframes remain “black boxes,” invisible to modern observability platforms. But what if your legacy systems could be part of your comprehensive monitoring strategy without risking reliability?
Broadcom’s Watchtower platform, along with its performance tools like AP4Z and MAT, is making this possible. We were lucky enough to hear about it during the Tech Field Day Extra event at the SHARE 2025 Cleveland conference.
The Modern Mainframe Observability Challenge
Today’s users demand instant, seamless digital experiences. Mainframes still power 68% of global production workloads, running critical business operations. Even brief downtime can cost major financial institutions up to $300,000. A four-minute payment delay like the one described above can represent potential losses exceeding $72 million in immediate transaction failures, plus long-term customer churn and reputation damage.
Most enterprises use modern observability tools like Datadog, Grafana, or New Relic to monitor distributed systems and microservices. However, these tools typically stop at the mainframe boundary, leading to a “war room” effect: incidents trigger frantic cross-team meetings with finger-pointing and fragmented diagnostics. Application, network, and storage teams each consult their dashboards, but the mainframe team is forced to dig through logs manually, delaying resolution.
Compounding the issue is the shrinking pool of mainframe experts. These SMEs (subject matter experts) are frequently pulled away from strategic work to handle basic troubleshooting and translation, which slows incident response and stalls modernization.
Breaking Down Walls: Watchtower’s Unified Approach
Broadcom’s Watchtower platform integrates mainframe visibility with enterprise-wide observability. Instead of treating mainframes as a separate silo, Watchtower unifies distributed and legacy insights, correlating real-time metrics across the stack.
The platform leverages decades of expertise embedded in proven Broadcom monitoring tools like SysView, NetMaster, and OpsMVS. Then it serves up their insights via modern interfaces and open protocols. That gives both mainframe specialists and broader IT teams the actionable data they need.
Alert Insights: Smarter Incident Response
One of Watchtower’s most powerful features is Alert Insights, which transforms how organizations handle mainframe-related incidents. Let’s say a CICS transaction starts to fail. Alert Insights won’t just generate an alert. It will also automatically pull contextual information from all relevant monitoring tools at the exact moment the alert triggered.
Let’s take this typical scenario: Your payment processing system starts experiencing delays. With traditional monitoring, different teams would check their individual tools:
- The CICS administrator examines transaction logs
- The network team reviews connection statistics
- The database team analyzes query performance
- The storage team checks I/O metrics
Watchtower’s Alert Insights automatically correlates this contextual information and presents it in a single, comprehensive alert. The system might reveal that the CICS delay correlates with increased MQ message queuing, elevated database connection times, and storage contention.
This focused view cuts mean time to resolution (MTTR) because teams don’t waste time gathering information that should have been available from the start.
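To make the idea concrete, here is a minimal Python sketch of alert enrichment: when an alert fires, it gathers the samples that each monitoring source recorded around that moment and attaches them to the alert. This is not Watchtower’s implementation or API. The `Sample` and `EnrichedAlert` types, the source names, and the 60-second window are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    source: str       # hypothetical source name, e.g. "mq", "db2", "storage"
    metric: str
    timestamp: float  # seconds since epoch
    value: float


@dataclass
class EnrichedAlert:
    name: str
    timestamp: float
    context: dict = field(default_factory=dict)


def enrich_alert(name, alert_ts, samples, window_s=60.0):
    """Attach, per source, every sample recorded within window_s of the alert."""
    alert = EnrichedAlert(name=name, timestamp=alert_ts)
    for s in samples:
        if abs(s.timestamp - alert_ts) <= window_s:
            alert.context.setdefault(s.source, []).append((s.metric, s.value))
    return alert


samples = [
    Sample("mq", "queue_depth", 1000.0, 4200),
    Sample("db2", "connect_ms", 1010.0, 850),
    Sample("storage", "io_wait_ms", 1030.0, 95),
    Sample("network", "retransmits", 400.0, 3),  # outside the window, excluded
]
alert = enrich_alert("CICS payment transaction slow", 1020.0, samples)
print(sorted(alert.context))  # ['db2', 'mq', 'storage']
```

The point of the sketch is the shape of the result: one alert object that already carries the MQ, database, and storage readings, so nobody has to collect them by hand after the fact.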
Machine Learning-Powered Anomaly Detection
Static thresholds have limited value in today’s rapidly changing environments. Watchtower goes beyond static threshold monitoring by incorporating machine learning algorithms that understand normal behavior patterns for your specific environment.
The system learns that Monday mornings look different from Friday afternoons, that month-end processing creates predictable spikes, and that seasonal events like Black Friday generate unique load patterns.
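One simple way to capture “Monday mornings look different from Friday afternoons” is to keep a separate baseline per weekday-and-hour slot and compare new readings against the matching slot. The Python sketch below does that with a z-score. It is a deliberately basic stand-in for whatever models Watchtower actually uses, which the source does not describe, and the sample values and the 3-sigma limit are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def build_baseline(history):
    """Group past values by (weekday, hour) and keep mean and std deviation."""
    buckets = defaultdict(list)
    for ts, value in history:
        buckets[(ts.weekday(), ts.hour)].append(value)
    # stdev needs at least two observations per slot
    return {k: (mean(v), stdev(v)) for k, v in buckets.items() if len(v) >= 2}


def is_anomalous(baseline, ts, value, z_limit=3.0):
    """Flag a value that is far from the norm for its own weekday/hour slot."""
    key = (ts.weekday(), ts.hour)
    if key not in baseline:
        return False  # no history for this slot yet
    mu, sigma = baseline[key]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit


# Hypothetical transactions-per-second history: busy Mondays, quiet Fridays.
history = [
    (datetime(2025, 1, 6, 9), 900), (datetime(2025, 1, 13, 9), 950),
    (datetime(2025, 1, 20, 9), 1000), (datetime(2025, 1, 27, 9), 1050),
    (datetime(2025, 1, 3, 16), 300), (datetime(2025, 1, 10, 16), 320),
    (datetime(2025, 1, 17, 16), 310), (datetime(2025, 1, 24, 16), 290),
]
baseline = build_baseline(history)
print(is_anomalous(baseline, datetime(2025, 2, 3, 9), 980))    # Monday 9:00
print(is_anomalous(baseline, datetime(2025, 1, 31, 16), 980))  # Friday 16:00
```

The same reading of 980 is ordinary on a Monday morning and highly unusual on a Friday afternoon, which is exactly what a single static threshold cannot express.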
This information enables proactive alerting on gradual performance degradation that might not trigger traditional thresholds but could indicate brewing problems. For example, if a critical COBOL program’s execution time creeps from two seconds to three seconds to four seconds over several weeks, machine learning algorithms will flag this trend long before it hits your five-second SLA threshold.
The beauty of this approach lies in buying time. Instead of reacting when performance crosses failure thresholds, teams get advance warning to investigate and resolve issues during maintenance windows rather than emergency response situations.
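The creeping-execution-time example above can be sketched with nothing more than a least-squares line: fit the recent trend and estimate how long until it reaches the SLA threshold. This is an illustrative simplification, not the algorithm Watchtower uses. The weekly sampling interval and the sample values are assumptions.

```python
def fit_line(ys):
    """Least-squares slope and intercept for evenly spaced samples 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def periods_until_breach(ys, threshold):
    """Projected periods after the last sample until the trend hits threshold.

    Returns None when the series is flat or improving.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_cross = (threshold - intercept) / slope
    return max(0.0, x_cross - (len(ys) - 1))


# Hypothetical weekly median execution times (seconds) for one COBOL program.
weekly_seconds = [2.0, 2.5, 3.0, 3.5, 4.0]
print(periods_until_breach(weekly_seconds, threshold=5.0))  # 2.0
```

Every sample here is still under the five-second SLA, so a static threshold stays silent, yet the projection says the breach is about two weeks away.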
Next-Gen Optimization with AP4Z and MAT
Broadcom’s Application Profiler for Z (AP4Z) is a breakthrough in continuous, low-overhead (0.1% of resources) mainframe application monitoring. It collects detailed module-level data on CPU usage and execution patterns across your entire mainframe workload, including COBOL programs, database calls, and system interactions, without impacting production.
This always-on monitoring helps teams act on the Pareto principle: AP4Z surfaces the applications that consume the lion’s share of resources or cause the bulk of performance pain, letting you focus optimization where it truly matters.
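As a rough illustration of that Pareto-style triage, the Python sketch below ranks applications by CPU consumption and returns the smallest set that accounts for a chosen share of the total. The application names, CPU figures, and the 80% cutoff are invented for the example and are not AP4Z output.

```python
def pareto_set(cpu_by_app, share=0.8):
    """Return the top consumers that together reach `share` of total CPU."""
    total = sum(cpu_by_app.values())
    ranked = sorted(cpu_by_app.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for app, cpu in ranked:
        if running >= share * total:
            break
        selected.append(app)
        running += cpu
    return selected


# Hypothetical CPU seconds per application over one day.
cpu_seconds = {
    "PAYAUTH": 520,
    "LEDGERUP": 310,
    "STMTGEN": 90,
    "ADDRCHG": 50,
    "RPTDAILY": 30,
}
print(pareto_set(cpu_seconds))  # ['PAYAUTH', 'LEDGERUP']
```

In this made-up data, two of five applications account for 83% of CPU, so they are the obvious first candidates for tuning.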
In legacy-heavy environments, even simple changes like recompiling COBOL apps with the latest compilers can cut CPU use by 10 to 20%. When combined with optimization parameters and architecture-specific enhancements, savings can reach 75%.
For a large financial institution processing millions of transactions daily, these improvements translate to substantial cost reductions in software licensing fees, hardware requirements, and operational overhead. AP4Z makes these opportunities visible by showing exactly which applications would benefit most from modernization efforts.
Mainframe Application Tuner (MAT) complements AP4Z by providing intensive, statement-level analysis. When AP4Z identifies a problematic application or when performance issues require deep investigation, MAT can analyze code execution down to individual program statements.
MAT supports diverse workload types such as COBOL, PL/I, Java applications, and database interactions, providing detailed attribution of delays, bottlenecks, and inefficiencies. The tool operates non-intrusively, analyzing running applications without code modifications or significant performance impact (typically 3-4% overhead during analysis periods).
Integrating these tools delivers a complete optimization ecosystem. AP4Z focuses on continuous trend detection; MAT drills deep when needed. Together with Alert Insights, this workflow streamlines investigation and remediation, weaving rapid diagnostics and long-term optimization into daily operations.
OpenTelemetry: Speaking a Common Language
One of Watchtower’s most innovative aspects is its embrace of OpenTelemetry, the industry-standard observability framework. By streaming mainframe telemetry data such as traces, metrics, and logs in OpenTelemetry format, Watchtower makes mainframe systems speak the same language as modern distributed applications.
This means Site Reliability Engineers (SREs) can see mainframe CICS transactions alongside microservices and APIs and act on those insights just as easily, incorporating them into enterprise-wide SLOs and dashboards.
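To give a feel for what “the same language” means, the Python sketch below wraps a hypothetical CICS transaction record in the JSON encoding of an OpenTelemetry (OTLP) trace payload. The envelope (`resourceSpans`, `scopeSpans`, hex-encoded IDs, nanosecond timestamps) follows the OTLP format, but the input record shape, the `cics.*` attribute names, and the scope name are assumptions for illustration and do not describe what Watchtower actually emits.

```python
import json
import secrets


def cics_record_to_otlp_span(record):
    """Wrap one hypothetical CICS transaction record as an OTLP/JSON payload."""
    def attr(key, value):
        return {"key": key, "value": {"stringValue": str(value)}}

    # Attribute names below are illustrative, not an official convention.
    attributes = [
        attr("cics.transaction.id", record["tran_id"]),
        attr("cics.region", record["region"]),
    ]
    span = {
        "traceId": record.get("trace_id") or secrets.token_hex(16),  # 32 hex chars
        "spanId": secrets.token_hex(8),                              # 16 hex chars
        "name": f"CICS {record['tran_id']}",
        "kind": 2,  # SPAN_KIND_SERVER
        "startTimeUnixNano": str(record["start_ns"]),
        "endTimeUnixNano": str(record["end_ns"]),
        "attributes": attributes,
    }
    if record.get("abend_code"):
        attributes.append(attr("cics.abend.code", record["abend_code"]))
        span["status"] = {"code": 2, "message": record["abend_code"]}  # ERROR
    return {
        "resourceSpans": [{
            "resource": {"attributes": [attr("service.name", record["region"])]},
            "scopeSpans": [{
                "scope": {"name": "example.mainframe.bridge"},  # hypothetical
                "spans": [span],
            }],
        }]
    }


payload = cics_record_to_otlp_span({
    "tran_id": "PAY1",
    "region": "CICSPROD",
    "start_ns": 1_700_000_000_000_000_000,
    "end_ns": 1_700_000_000_250_000_000,
    "abend_code": "ASRA",
})
print(json.dumps(payload, indent=2))
```

Passing an existing `trace_id` from the calling service is what would let a mainframe span appear in the same trace as the microservices in front of it.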
Consider the typical SRE workflow: They monitor service level objectives (SLOs), track error budgets, and correlate performance metrics across their technology stack. With Watchtower’s OpenTelemetry integration, mainframe services become first-class citizens in this monitoring approach.
Now they can set service level indicators (SLIs) for mainframe transactions, incorporate them into overall service level objectives, and receive alerts when error budgets are at risk, all without needing deep mainframe expertise. The system translates complex mainframe concepts into familiar observability primitives.
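The SLI, SLO, and error budget arithmetic is the same for a mainframe transaction as for any other service. Here is a small Python sketch under assumed numbers: a 99.9% success-rate objective, one million transactions in the window, and an alert once 75% of the budget is consumed. The function name and thresholds are illustrative, not part of any product.

```python
def error_budget_report(total, failed, slo_target=0.999, alert_at=0.75):
    """Compute the success-rate SLI and how much of the error budget is used."""
    if total <= 0:
        raise ValueError("total must be positive")
    allowed_failures = total * (1 - slo_target)  # the error budget, in requests
    sli = (total - failed) / total
    consumed = failed / allowed_failures
    return {
        "sli": sli,
        "budget_consumed": consumed,
        "at_risk": consumed >= alert_at,
    }


# Hypothetical 30-day window for a mainframe payment transaction.
healthy = error_budget_report(total=1_000_000, failed=400)
strained = error_budget_report(total=1_000_000, failed=800)
print(round(healthy["budget_consumed"], 2), healthy["at_risk"])
print(round(strained["budget_consumed"], 2), strained["at_risk"])
```

With a 99.9% target, one million transactions leave room for about 1,000 failures. At 400 failures the service is comfortably inside its budget, while at 800 it has used roughly 80% and would trip the alert.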
Watchtower’s observability isn’t just about exposing raw numbers. Comprehensive traces include helpful detail (error codes, descriptions, supporting documentation) so teams can act immediately, not just report incidents and wait for someone else to step in.
Mainframe Observability: Transforming Operations and Results
Watchtower eliminates the “war room” bottleneck by ensuring the first alert after an issue contains all crucial context. Instead of hours spent correlating logs across silos, teams see what failed and why in real time, tracing end-to-end flows from digital frontends to mainframe transactions.
By reducing the burden on SMEs, Watchtower enables mainframe pros to focus on modernization, optimization, and higher-value projects. Less time is wasted on simple translation, and broader teams become capable of handling routine triage and response.
Organizations using Watchtower have documented:
- 40–60% reductions in MTTR for mainframe-related incidents
- 15–30% lower software licensing costs from smarter optimizations
- Lower risk by giving teams better visibility for modernization planning
- Improved customer satisfaction and retention due to fewer critical outages
The Future of Mainframe Operations
Broadcom’s Watchtower platform, along with AP4Z and MAT, brings legacy systems into the enterprise-wide monitoring ecosystem.
You can take a closer look at how these solutions work by watching the Tech Field Day videos to see live demonstrations, real-world case studies, and detailed discussions straight from platform experts.