Senior Site Reliability Engineer

TLDR

Design and automate SRE platforms for a Brokerage-as-a-Service platform, focusing on reducing toil and ensuring reliability for global financial markets.

About Us

DriveWealth is on a mission to make investing easier. We believe that everyone should have the ability to control their financial future, and that access to financial markets should not be limited by geography, wealth, or legacy systems. We are a global B2B financial technology organization dedicated to democratizing access to financial independence around the world. Our mission is realized through an API-based platform, empowering our partners to offer seamless investing and trading experiences to clients worldwide, all from their mobile devices. Our technology provides partners with a modern, extensible toolkit, enabling traditional investment workflows and innovative techniques like fractional share ownership. DriveWealth has evolved into a global platform offering trading of US equities, mutual funds, ETFs, fixed income, and options.

There’s never been a better time to build a category-defining business and there has rarely been a team better positioned for this opportunity. Our culture blends the pace and agility of a fintech start-up with the impact, stability, and discipline of Wall Street. We encourage creativity and experimentation while ensuring institutional-grade execution and regulatory compliance in everything we do. Join us and help build the future of global investing!

About The Role

As a Senior Site Reliability Engineer, you won’t just be "keeping the lights on." You will be an engineering force responsible for the architecture, scalability, and self-healing capabilities of our Brokerage-as-a-Service platform.

This role is centered on reducing toil through engineering. You will design and develop internal SRE platforms, automate complex workflows, and ensure our Kubernetes-based ecosystem can handle the demands of global financial markets. While this role includes critical on-call responsibilities to support our 24/7 global operations, your primary mission is to build and modernize systems that make manual intervention obsolete.

What You’ll Do

  • Engineering & Automation: Design and develop internal tools and SRE platforms to eliminate repetitive tasks (toil) and improve developer velocity.
  • Infrastructure as Code: Architect and maintain modular, reusable IaC using Terraform and manage GitOps workflows via ArgoCD.
  • Observability & Reliability: Implement OpenTelemetry standards and the Grafana stack (Alloy, Loki, Tempo, Mimir) to provide deep insights into system health. Define and manage SLIs, SLOs, and Error Budgets.
  • Platform Governance: Review software architecture and Kubernetes metrics to ensure high availability, capacity planning, and cost-optimization across AWS regions.
  • Incident Engineering: Lead incident response, perform complex root-cause analysis (RCA), and champion a blameless post-mortem culture.
  • Collaboration: Partner with engineering teams to foster the adoption of new tools, security standards, and reliability best practices.

What You'll Need

  • Linux & Networking Mastery: Proficient in Linux administration with a deep understanding of the TCP/IP stack, OSI model, DNS, and network troubleshooting.
  • FinTech Background: Experience working in highly regulated financial environments or with FIX/API connectivity.
  • Production Kubernetes: Hands-on experience managing production-grade clusters, including RBAC, autoscaling, Helm, and multi-cluster patterns.
  • Cloud Native Expertise (AWS): Strong grasp of AWS core services, security, and high-availability patterns. Proficiency with boto3 and AWS CLI for automation.
  • Modern CI/CD & GitOps: Experience building secure, automated delivery pipelines and operating GitOps workflows (ArgoCD).
  • Code Proficiency: Strong scripting and development skills in Python or Golang, along with Bash and Ansible.
  • Security Mindset: Experience with secrets management, vulnerability scanning, and securing the software supply chain.
  • AI & Prompt Engineering: Familiarity with using LLMs, public MCP servers, or Amazon Bedrock AgentCore to enhance SRE workflows.
  • Data & Middleware: Experience managing Kafka, MQ, SQS, or orchestration tools like Airflow and Rundeck.

Applicants must be authorized to work for any employer in the U.S. DriveWealth is unable to sponsor or take over sponsorship of an employment visa at this time.

Compensation
Compensation package offerings are based on candidate experience and technical qualifications as they relate to the role. These are identified and determined throughout your interviewing experience.

Please note: at this time, we are not able to hire in all states.

Remote (Most US States) Pay Range
$150,000 – $170,000 USD

Working at DriveWealth

We do our best work when we’re in the same room. To maintain the speed our partners expect, our New York and Chicago teams work in-office 4 days a week. We’ve found that being physically side-by-side is the only way to solve complex problems in real time and stay truly accountable to the products we ship. When you’re here, you’re working directly with the people making the decisions. To support that work, we provide competitive compensation, equity, and a 401(k) match. We also offer full insurance coverage, a wellness reimbursement, a company-provided phone, and a personal development allowance. Finally, we value the time you spend away from the office with generous PTO, observed holidays, and extended leave.

How We Think About AI

We leverage AI to work smarter and move faster. We seek AI-curious talent who proactively use emerging tools to increase signal quality, reduce friction, and improve outcomes so that we can deliver product faster, provide better service to our partners, and streamline processes. Your ability to leverage our internal tools and technology to drive results is as important to us as your core domain expertise.

Compensation

Pay is generally based upon the level, complexity, responsibility, location, and job duties and requirements of the specific position. We then source candidates with the requisite skills, expertise, education, training, and experience. If you are selected for an interview, please feel welcome to speak to a recruiter about our compensation philosophy and other available benefits. This role is eligible for base, bonus, equity, 401(k) match, and heavily subsidized benefits and perks.

Equal Employment Opportunity

To build technology and products that are used and loved by people and solve real-world problems, we need to build a team with many different perspectives and experiences. We are an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us at [email protected].

Agency Disclaimer

DriveWealth does not accept agency resumes. Do not forward resumes to our jobs alias, employees, or any other organization location. DriveWealth is not responsible for any fees related to unsolicited resumes.

DriveWealth builds an API-based platform that makes investing accessible to everyone, regardless of their financial background. Its services cater to businesses looking to offer their customers easy access to financial markets and promote financial independence. What sets DriveWealth apart is its commitment to democratizing investment opportunities on a global scale.

Salary
$150,000 – $170,000 per year