Job Description

• Design, implement, and maintain instrumentation and monitoring systems for IT infrastructure and applications.

• Develop and maintain dashboards, reports, and alerts to provide visibility into the health of IT systems.

• Analyze data collected by instrumentation and monitoring systems to identify and resolve issues before they impact users.

• Collaborate with other IT teams to improve the performance and reliability of IT systems.

• Work with developers to ensure that applications are designed to be monitored and instrumented.

• Automate the deployment of instrumentation and monitoring tools.

• Develop and maintain documentation and knowledge base articles for instrumentation and monitoring.

Qualifications

• Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

• Minimum of 5 years of experience in IT operations, infrastructure engineering, or a related role.

• Strong knowledge of IT infrastructure and applications, including networks, servers, storage, databases, and middleware.

• Experience with instrumentation and monitoring tools, such as Nagios, Zabbix, or New Relic.

• Experience with scripting and programming languages, such as Python, Ruby, or PowerShell.

• Strong analytical and problem-solving skills.

• Excellent communication skills, with the ability to work collaboratively with other IT teams and stakeholders.