Sr Site Reliability Engineer - Observability - Open to Remote

Full Time
Santa Ana, CA
$87,945 - $162,360 a year
Job description

Who We Are

Join a team that puts its People First! Since 1889, First American (NYSE: FAF) has held an unwavering belief in its people. They are passionate about what they do, and we are equally passionate about fostering an environment where all feel welcome, supported, and empowered to be innovative and reach their full potential. Our inclusive, people-first culture has earned our company numerous accolades, including being named to the Fortune 100 Best Companies to Work For® list for eight consecutive years. We have also earned awards as a best place to work for women, diversity and LGBTQ+ employees, and have been included on more than 50 regional best places to work lists. First American will always strive to be a great place to work, for all. For more information, please visit www.careers.firstam.com.

What We Do

The team is on an exciting transformation journey, moving from the classic support model to the site reliability engineering (SRE) model. We are looking for a person with prior SRE experience across on-premises and cloud technologies. We are in search of an engineer who has lived through and thrived during such a transformation, not just by following directed work but by thinking outside the box, developing monitoring applications, and taking deep dives into issues.

Essential Functions:

  • Efficiently handle live production incidents, debug/troubleshoot application and infrastructure issues, follow and implement SRE best practices.

  • Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation.

  • Build end-to-end monitoring infrastructure (Logging, Metrics, Tracing) and work closely with the other Production Engineers to provide the right tooling to measure the reliability of our systems.

  • Establish SLIs, SLOs, error budgets, and other SRE metrics to ensure better reliability.

  • Collaborate with development and operations teams to ensure availability and reliability of the application and infrastructure.

  • Serve as an escalation point for other Systems Administrators, Engineers, and other technology teams in the resolution of server and system problems.

  • Maintain an effective knowledge base and runbooks to resolve production issues faster.

  • Participate in a weekend on-call rotation for production support.

  • Communicate with stakeholders using strong written and verbal communication.

  • Constantly update personal technical and business knowledge and skills and mentor others to increase the knowledge and skills of the team.

  • Provide stellar organizational and customer support, and self-manage project initiatives.

Technical Skills:

  • Bachelor's degree in computer science or equivalent combination of education and experience.

  • 9+ years of hands-on experience in an application and technical support role in a live production environment, following Development, DevOps, and SRE best practices.

  • 6+ years of hands-on experience with configuring and monitoring via tools such as Splunk, AppDynamics, ELK, Microsoft SCOM, Windows Processes, JavaScript Framework, etc.

  • 2+ years of experience monitoring AWS workloads using AWS CloudWatch, AWS X-Ray, etc.

  • Experience monitoring and analyzing infrastructure performance using standard performance monitoring tools (Perfmon, PerfView, ProcDump, DebugDiag, etc.) is a plus.

  • Experience with automation using PowerShell, Python scripting, or similar technologies preferred.

  • Nice to have: experience monitoring web-based applications, web services, and database-driven applications built on Microsoft technologies (C#, .NET 4.5, Azure DevOps, and SQL Server 2016).

Pay Range: $87,945.00 - $162,360.00

This hiring range is a reasonable estimate of the base pay range for this position at the time of posting. Pay is based on a number of factors which may include job-related knowledge, skills, experience, business requirements and geographic location.

#techreferral

#LI-JC2

#tcorpit

What We Offer

By choice, we don’t simply accept individuality – we embrace it, we support it, and we thrive on it! Our People First Culture celebrates diversity, equity and inclusion not simply because it’s the right thing to do, but also because it’s the key to our success. We are proud to foster an authentic and inclusive workplace For All. You are free and encouraged to bring your entire, unique self to work. First American is an equal opportunity employer in every sense of the term.

Based on eligibility, First American offers a comprehensive benefits package including medical, dental, vision, 401k, PTO/paid sick leave and other great benefits like an employee stock purchase plan.
