
Business Analyst IV - Alert Management Lead
3 days ago · Astreya
US · Full-time · $98,040 – $154,800
About this role
The Business Analyst IV serves as Alert Management and Observability Standards Lead. The position uses alert governance to support business outcomes: it defines alerting standards and ensures alerts align with service reliability goals and operational coverage models.
Day-to-day responsibilities center on rationalizing alerts for business criticality and actionability. The lead conducts regular reviews of new and existing alerts to improve the signal-to-noise ratio. Routing decisions determine whether an alert goes to the 24x7 Eyes-on-Glass team, on-call engineering, or ticket creation.
This role operates at the intersection of the IT Operations Command Center, engineering teams, platform owners, and service owners. Collaboration ensures alerts remain actionable with clear ownership and escalation paths. Standards are embedded into monitoring tooling through templates and validation rules.
The position builds a scalable knowledge system for consistent incident response. Runbooks are versioned and maintained on a defined cadence to support high-quality actions by responders. Continuous improvement efforts preserve detection of true incidents while minimizing alert fatigue.
Requirements
- Experience working with IT Operations Command Center and 24x7 monitoring teams
- Knowledge of observability platforms and alert management tooling
- Understanding of service reliability goals and operational coverage models
- Ability to define severity thresholds and routing rules for incident response
- Familiarity with runbook and playbook development for consistent remediation actions
- Skill in alert rationalization to reduce noise while preserving true incident detection
Responsibilities
- Establish and maintain a department-wide alert rationalization framework evaluating business criticality and actionability
- Define and enforce alerting standards including severity definitions, metadata requirements, and naming conventions
- Act as gatekeeper for alert routing to 24x7 Eyes-on-Glass, on-call engineering, or ticket creation
- Establish a consistent approach to cataloging response instructions covering symptoms, triage steps, and escalation triggers
- Perform regular alert reviews to ensure quality, correct routing, and alignment with operational coverage
- Create a standardized Alert Design Checklist and approval workflow for alert onboarding
- Partner with tool and platform owners to embed standards in monitoring tooling through templates and validation
- Own the runbook template ensuring versioned maintenance and review on a defined cadence