About Turvo: Turvo provides a collaborative Transportation Management System (TMS) application designed specifically for the supply chain. Turvo Collaboration Cloud connects freight brokers, 3PLs, shippers, and carriers to unite supply chain ecosystems, delivering outstanding customer experiences, real-time collaboration, and accelerated growth. The technology unifies internal and external systems, providing one end-to-end solution that streamlines operations, enhances analytics, and automates business processes while eliminating redundant manual tasks. Turvo’s customers include some of the world’s largest Fortune 500 logistics service providers and shippers as well as small to mid-sized freight brokers. Turvo is based in Dallas, Texas, with offices in Hyderabad, India (www.turvo.com).

Responsibilities:

Site Reliability Engineers at Turvo fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale.

Proactively monitor the production environment and respond quickly to trends or issues.

Contribute to debugging and troubleshooting the complete stack of a service, and drive the analysis of outages.

Participate actively in bug/issue triage with the feature teams and support well-informed decisions toward business and engineering goals.

Document operational processes for proactive monitoring, debugging, and resolving issues.

Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications.

Work closely with development teams to ensure that platforms are designed with "operability" in mind.

Design, write, and deliver high-quality software to improve the availability, reliability, scalability, latency, security, resiliency, and efficiency of a service.

Write software and build automation to resolve problems permanently.

Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.

Qualifications:

3+ years in a UNIX-based large-scale web operations role.

2+ years of experience with at least one programming language (Python is preferred).

Experience with relational databases (MySQL) and NoSQL databases (MongoDB, Cassandra, etc.).

Exposure to monitoring tools such as Dynatrace, ELK, or similar is an added advantage.

Familiarity with application profiling, system scalability, monitoring and performance.

Ability to understand unfamiliar code bases and debug server-side, multi-threaded, highly scalable applications.

Strong debugging, troubleshooting, and problem-solving skills.

Previous experience working with geographically distributed coworkers.

Site Reliability Engineer
