Senior Infrastructure & Security Engineer

TL;DR

Own the reliability, security, and operational health of a multi-region healthcare platform on AWS while contributing to infrastructure automation and incident management.

Koda Health is looking for a Senior Infrastructure & Security Engineer to own the reliability, security, and operational health of our production systems.

You'll be the person responsible for keeping our platform running, secure, and observable — owning everything from AWS infrastructure and deployment pipelines to incident response, security compliance, and production monitoring. You'll work directly with the CTO and a small engineering team.

This is a hands-on, high-ownership role. We run a multi-region healthcare platform on AWS with real uptime requirements, HIPAA obligations, and SOC 2 compliance. You'll inherit a mature CDK codebase and be expected to extend it, harden it, and build the monitoring and incident management layer.

We also want someone who can contribute to the codebase and automate operational work. You won't be a full-time software engineer, but you should be comfortable using AI coding tools like Claude Code to make small TypeScript PRs, triage Sentry errors, fix production bugs, and set up automated monitoring, triage, and recurring infrastructure health checks.

Expect roughly:

  • 60–70% infrastructure, architecture, reliability, and monitoring
  • 10–20% security, compliance, and vendor questionnaires
  • 5–10% TypeScript contributions (bug fixes, small features, Sentry triage)

What You'll Do

Production Reliability & Observability

  • Own the operational health of production across two AWS regions
  • Investigate production issues, lead root-cause analysis, and drive resolution
  • Build and maintain dashboards that give real-time visibility into application health, queue depths, API latency, and error rates
  • Monitor SQS/SNS queue health, dead-letter queues, and event processing pipelines
  • Expand observability beyond CloudWatch: evaluate and implement distributed tracing, APM, and log aggregation
  • Oversee weekly deployments to production
  • Own cost monitoring and alerting (AWS Budgets alerts, Cost Explorer)
  • Improve automated uptime and SLA reporting

AWS Infrastructure & CDK

  • Own and evolve all AWS infrastructure defined in CDK
  • Lead the migration to capturing 100% of cloud infrastructure in CDK
  • Manage and improve services: Lambda, ECS Fargate, Elastic Beanstalk, S3, CloudFront, SNS, SQS, EventBridge, WAF, Cognito
  • Support multi-region uptime, disaster recovery planning, and backup/restore practices
  • Improve cross-region replication and automated failover
  • Own deployment pipelines, release processes, and database migration safety
  • Support and evolve data pipelines used for analytics and product features
  • Set standards for how we ship, deploy, and operate software at scale

Security, Compliance & Hardening

We're a healthcare company. HIPAA and SOC 2 aren't checkboxes; they're how we operate. You'll own the security posture of our infrastructure.

  • Maintain and harden AWS infrastructure with a strong security mindset
  • Own vulnerability remediation and SLA timelines
  • Help respond to security questionnaires and vendor assessments
  • Own and improve WAF rules, security groups, IAM policies, and network configuration
  • Own Security Hub, AWS Config, VPC Flow Logs, and CloudTrail
  • Support GuardDuty malware scanning and S3 upload security
  • Ensure SOC 2 and HIPAA compliance across infrastructure
  • Manage secrets, key rotation, and access controls
  • Conduct periodic security reviews of infrastructure and application configuration

Backend Contributions

You're not a full-time developer, but you can ship some code with the help of AI coding tools.

  • Triage and fix production errors surfaced by Sentry
  • Make small TypeScript PRs to backend services
  • Debug complex production issues that span infrastructure and application code
  • Participate in architecture discussions, especially around infrastructure and deployment concerns

Requirements

  • 6+ years building and operating production systems on AWS
  • Strong experience with AWS CDK (we use CDK in TypeScript)
  • Deep knowledge of core AWS services: Lambda, ECS, S3, CloudWatch, SNS, SQS, IAM, VPC, WAF
  • Experience setting up and managing monitoring, alerting, and incident management
  • Experience with security hardening and compliance in regulated environments (HIPAA, SOC 2, or similar)
  • Working knowledge of TypeScript or Node.js: enough to read the codebase, make PRs, and debug production issues
  • Experience with CI/CD pipelines (CodePipeline, GitHub Actions, or similar)
  • Comfortable owning production systems end-to-end in a small team where you're the expert
  • Strong English fluency, with clear written and verbal communication (security questionnaire responses, etc.)
  • US-based, able to work CST/EST hours (contractual requirement)

Bonus Points

  • Healthcare industry experience (FHIR, HL7v2, Epic/Cerner integrations)
  • Experience with multi-region AWS architectures and disaster recovery
  • Experience with MongoDB operations and performance
  • Experience with cost optimization in AWS
  • Familiarity with AI-assisted development tools (e.g., Claude Code)

Benefits

  • Base salary of $160,000 - $170,000 per year
  • Fully remote role (US-based)
  • Flexible, unlimited paid time off
  • Great medical, dental, and vision coverage
  • 401(k) options
  • Yearly personal development budget that can be used for books, courses, trainings, and more
  • Office setup budget
  • Annual company and team events
  • Latest MacBook and enterprise tooling (e.g., Claude Code)
  • Opportunity to gain exposure to applied RL and SFT work on foundational AI models
  • Clear growth paths for ICs (Staff/Principal) and managers (EM/Director)

Koda Health builds an AI-driven patient decision support platform designed to enhance serious illness planning, allowing patients to articulate their treatment preferences clearly. Our solution is aimed at healthcare providers, facilitating care aligned with patient goals and improving care coordination for those with complex health needs. What sets Koda apart is our combination of digital tools and personalized support, driving better adoption of advance care planning across diverse populations.
