Background: One of the nation’s largest healthcare systems seeks to hire an Azure Operations & Support Engineer to join its Cloud Infrastructure team. The engineer will provide day-to-day technical support, monitoring, and remediation of issues related to its proprietary AI Platform on Azure.

What Will You Do:

  • serve as SME and day-to-day support lead for infrastructure operations of the proprietary AI Platform on Azure
  • lead the Incident Management team, develop strategic plans to mature incident management, and build a world-class incident response function (MTTD/MTTR)
  • effectively communicate with Platform users on all issues, outages, solutions, and timelines
  • prepare incident and solution reports by collecting and analyzing data including trends and threats to the Platform
  • champion and implement automated monitoring tools to track Platform health, uptime, and outages
  • collaborate with Data Engineering, L3, and DevOps teams on automated deployment through integrated CI/CD pipelines while following DevOps best practices
  • use hands-on scripting skills, including Azure Resource Manager (ARM), Terraform, and PowerShell, to deploy fixes
  • provide recommendations on how the Platform can optimize its usage of Cloud services, including installation, testing, tuning, upgrading, loading patches, and troubleshooting problems
  • manage operations plans for data recovery and disaster recovery, adhere to established SLAs, and optimize operational costs across vendors and service providers
  • ensure compliance with security best practices and continuously assess potential vulnerabilities

More Information:

  • full-time role that is 100% remote with great benefits

Required Experience:

  • 4+ years of experience in leading production support teams
  • 4+ years of experience managing 24/7 operations for a high-volume, business-critical Cloud service
  • 2+ years of experience with Azure
  • Excellent verbal and written communication skills
