This is a hands-on operational role for someone who thrives in complex environments and can balance immediate incident management with longer-term stability and service improvement.
Key Responsibilities:
- Lead daily operational support of legacy systems, ensuring availability, performance, and resilience.
- Manage incident, problem, and change activities in line with ITIL and enterprise service standards.
- Proactively monitor and tune infrastructure, applications, messaging, and scheduling platforms.
- Act as the escalation point for critical incidents, coordinating technical resources for rapid resolution.
- Lead root cause analysis and drive service improvement initiatives.
- Define and maintain runbooks, SOPs, and operational documentation.
- Ensure backup, recovery, and disaster recovery processes are tested and aligned to business needs.
- Oversee job scheduling, batch management, and automation activities (e.g. IBM Tivoli Workload Scheduler).
- Collaborate with Infrastructure, Development, and Architecture teams on upgrades, migrations, and modernisation.
- Mentor operations engineers and manage knowledge transfer into business-as-usual (BAU) operations.
Key Competencies:
- Strong leadership and coordination across technical and non-technical stakeholders.
- Excellent analytical and diagnostic skills, with a structured approach to documentation and reporting.
- Confident communicator during high-pressure incidents.
- Ability to balance hands-on technical detail with broader architectural and system-level awareness.
- Strong incident management and prioritisation skills under pressure.
- Excellent communication skills for knowledge transfer and stakeholder engagement.
- Risk-aware mindset, balancing short-term stability with longer-term strategy.
Required Skills & Experience:
- Solid background in Java, AWS, and Kubernetes.
- Proven track record in managing and supporting enterprise-scale services.
- Experience working in a multi-vendor environment with coordinated triage and joint operational activities.
- Familiarity with ITIL-aligned service management.
- Demonstrable experience building and maintaining knowledge transfer (KT) libraries.
- Experience with system migrations, re-platforming, or legacy modernisation programmes is highly desirable.
- Strong knowledge of high-availability, disaster recovery, and enterprise integration patterns.
- Ability to prioritise effectively in complex, multi-system environments.
Why Join the Team:
- Work on a high-impact programme within central government.
- Remote-first contract with very occasional travel to client sites.
- Contribute to the transition of critical national services.
- Outside IR35 contract with a competitive day rate.
