Site Reliability Engineer
Tata Consultancy Services
Toronto · Ontario · Canada
Full-time
5-10
100,000 – 120,000
Job description
Inclusion without Exception:
Tata Consultancy Services (TCS) is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity is reflected in our people stories across our workforce and implemented through equitable workplace policies and processes.
About TCS:
TCS is an IT services, consulting, and business solutions organization that has been partnering with many of the world’s largest businesses in their transformation journeys for over 55 years. Its consulting-led, cognitive-powered portfolio of business, technology, and engineering services and solutions is delivered through its unique Location Independent Agile delivery model, recognized as a benchmark of excellence in software development. A part of the Tata group, India's largest multinational business group, TCS operates in 55 countries and employs over 607,000 highly skilled individuals, including more than 10,000 in Canada. The company generated consolidated revenues of US$30 billion in the fiscal year ended March 31, 2025, and is listed on the BSE and the NSE in India. TCS' proactive stance on climate change and award-winning work with communities across the world have earned it a place in leading sustainability indices such as the MSCI Global Sustainability Index and the FTSE4Good Emerging Index.
We are seeking a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of platform services. The ideal candidate will bring strong expertise in SRE practices, observability, infrastructure automation, and developer platform enablement, with exposure to modern technologies including policy-as-code and emerging GenAI-driven systems.
Key Responsibilities:
- Implement and manage SRE practices, including:
  - Incident management, root cause analysis, and postmortems
  - Reliability engineering and performance optimization
  - Tracking and improving DORA metrics
- Define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
- Build and manage monitoring, logging, and distributed tracing frameworks
- Ensure platform reliability through proactive alerting, observability, and automation
- Automate infrastructure and governance using:
  - Terraform (Infrastructure as Code)
  - Policy-as-Code tools (OPA/Rego, Sentinel)
- Enhance developer experience and productivity by:
  - Designing self-service platform capabilities
  - Managing service catalogues and platform standards
  - Building reusable templates and golden paths
- Work with tools like Backstage to enable internal developer platforms
- Collaborate with engineering teams to improve system stability, deployment reliability, and operational efficiency
- Support integration and reliability considerations for GenAI-based systems (RAG, prompt workflows, model evaluation)
Required Skills:
- Strong experience in SRE practices and reliability engineering
- Hands-on expertise with:
  - Monitoring/logging platforms and distributed tracing
  - SLO/SLI frameworks and observability design
- Experience in incident management and performance engineering
- Strong understanding of DORA metrics and operational excellence
- Proficiency in:
  - Terraform (Infrastructure as Code)
  - Policy as Code (OPA/Rego, Sentinel)
- Experience with:
  - Developer platform tools (Backstage, service catalogues)
  - Golden paths and platform standardization
Nice to Have:
- Exposure to GenAI platforms, RAG, and prompt engineering concepts
Salary Range: CA$100,000 - CA$120,000 per year
Note:
TCS does not use artificial intelligence tools for candidate screening or evaluation.
This posting is for a current vacancy.
The hiring process includes an initial screening by the TCS Hiring Team, followed by a technical evaluation and managerial discussion conducted by the Business Team, and concluding with the final HR evaluation.
Tata Consultancy Services Canada Inc. is committed to meeting the accessibility needs of all individuals in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code (OHRC). Should you require accommodation during the recruitment and selection process, please inform Human Resources.
Thank you for your interest in TCS. Candidates who meet the qualifications for this position will be contacted within two weeks. We invite you to continue to apply for other opportunities that match your profile.