Ensure the health, monitoring, automation, and scalability of Turvo's complex web-scale systems while collaborating with development teams to enhance service reliability.
About Turvo:
Turvo provides a collaborative Transportation Management System (TMS) application designed specifically for the supply chain. Turvo Collaboration Cloud connects freight brokers, 3PLs, shippers, and carriers to unite supply chain ecosystems, delivering outstanding customer experiences, real-time collaboration, and accelerated growth. The technology unifies internal and external systems, providing one end-to-end solution that streamlines operations, enhances analytics, and automates business processes while eliminating redundant manual tasks. Turvo’s customers include some of the world’s largest Fortune 500 logistics service providers and shippers as well as small to mid-sized freight brokers.
Turvo is based in Dallas, Texas, with offices in Hyderabad, India (www.turvo.com).
Responsibilities:
Site Reliability Engineers at Turvo fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale.
Proactively monitor the production environment and respond quickly to trends or issues.
Contribute to debugging and troubleshooting the complete stack of a service, and drive the analysis of outages.
Participate actively in bug/issue triage with the feature teams and support well-informed decisions towards business and engineering goals.
Document operational processes for proactive monitoring, debugging, and resolving issues.
Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications.
Work closely with development teams to ensure that platforms are designed with "operability" in mind.
Design, write and deliver high quality software to improve the availability, reliability, scalability, latency, security, resiliency, and efficiency of a service.
Write software and build automation to resolve problems permanently.
Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
Qualifications:
3+ years in a UNIX-based large-scale web operations role.
2+ years of experience with at least one programming language (Python is preferred).
Experience with relational databases (MySQL) and NoSQL databases (MongoDB, Cassandra, etc.).
Exposure to monitoring tools such as Dynatrace, ELK, or similar is an added advantage.
Familiarity with application profiling, system scalability, monitoring and performance.
Ability to understand unfamiliar code bases and to debug server-side, multi-threaded, highly scalable applications.