• Lead the design and implementation of observability/automation solutions that provide deep insights into application performance, system health, and user experience.
• Establish and advocate for observability best practices across engineering teams.
• Work closely with the infrastructure teams to automate and optimize infrastructure provisioning and scaling using IaC tools like Terraform.
• Ensure infrastructure code is tested, reliable, and efficient.
• Champion the adoption of OpenTelemetry standards to collect, process, and export telemetry data.
• Utilize and integrate monitoring tools like Dynatrace and Splunk to provide thorough insights and analytics.
• Drive the evaluation and adoption of new tools and technologies to keep the organization at the forefront of observability and monitoring practices.
• Collaborate with various engineering teams to ensure smooth adoption and transition to new technologies.
• Analyze existing monitoring and observability practices, identifying areas for improvement or optimization.
• Foster a culture of continuous learning and improvement within the observability team and across the organization.
• Provide guidance and mentoring to the observability team.
• Foster a collaborative and inclusive environment that encourages innovation and growth.
What your background looks like
- 7+ years of DevOps engineering experience.
- Information Technology degree and/or technology certifications preferred or substantial equivalent experience.
- Design and implement an observability system for a new microservices-based application.
- Migrate an existing monitoring system to Prometheus and Grafana.
- Develop a new alerting system to detect and respond to performance issues.
- Work with the development team to instrument their code for better observability.
- Train and mentor other engineers on observability best practices.
- Strong customer-facing and communication skills for working with team members, customers, vendors, and leadership.
- Advanced shell scripting and IaC automation skills with Ansible and Terraform.
- Deep understanding of OpenTelemetry standards.
- Experience with monitoring and logging tools like Dynatrace and Splunk.
- Proactive, go-getter attitude with a passion for new technology adoption.
- Excellent communication and collaboration skills.
- Ability to lead and mentor a team of engineers.
- Working knowledge of Python and of SQL or NoSQL databases.
- Subject-matter expertise in enterprise monitoring, including APM, custom attribute implementation, synthetic monitoring, browser monitoring, and log monitoring.
- Knowledge of requirements gathering and the rollout of monitoring and observability solutions, including partnering with business and development teams to identify requirements, define monitoring solutions, and implement them.
- Experience in application performance monitoring (APM) and infrastructure monitoring for hybrid business applications and infrastructure.
- Experience providing health and performance reports, developing AIOps rules, and creating alerts and custom dashboards.
- Experience in creating workloads and user onboarding.
Experian Careers - Creating a better tomorrow together
Find out what it's like to work for Experian by clicking here