Senior Software Engineer
TLDR
Design, build, and own scalable backend systems with minimal supervision, while enhancing reliability and system performance for critical features.
The Opportunity
We are hiring Software Engineers with 3–4 years of experience who are ready to take meaningful ownership of backend systems and work with minimal supervision.
This role is designed for engineers who have moved beyond writing features under guidance and are now ready to own them end-to-end — from design to production. You will be expected to make sound technical decisions, debug complex production issues independently, and contribute to how our systems scale and stay reliable.
If you have run systems in production, felt the pain of a silent failure, reasoned through a double-charge incident, or rearchitected a sync job that silently dropped data, this role is for you.
You will work closely with product managers, designers, and other engineers to build systems that are scalable, maintainable, and production-ready. Your daily work will involve:
Ensuring Data Integration with Third-Party CRMs: Design and own solutions that integrate customer data seamlessly and reliably with various CRM systems.
Enhancing Event and Fundraising Management Tools: Drive improvements to our event and fundraising tools, with a focus on reliability and scale.
Owning Payment and Communication Systems: Take end-to-end ownership of systems that handle payments and user communications, including resilience and failure handling.
Maintaining and Improving System Uptime: Lead reliability efforts in your areas of ownership, proactively identifying and resolving issues before they impact customers.
Responsibilities
Own Features End-to-End: Design, build, and maintain features independently — from requirements to production — with minimal supervision.
Drive System Reliability: Proactively identify performance bottlenecks, reliability risks, and scalability gaps and address them systematically.
Debug Production Issues Independently: Investigate and resolve complex production issues using logs, metrics, and structured debugging approaches.
Design for Failure: Build systems that handle partial failures, retries, and third-party API unreliability correctly. Know when idempotency matters and apply it.
Code Review and Quality: Conduct and participate in code reviews, raise the quality bar, and help define good engineering practices within the team.
Collaborate Cross-Functionally: Work closely with product managers, designers, and other engineers to deliver high-quality software that meets user needs.
Contribute to Architecture: Participate actively in design discussions, propose solutions to technical problems, and think through trade-offs clearly.
Continuous Improvement: Stay current with engineering best practices and apply that knowledge to improve the systems you own.
Requirements
Must-Have
3–4 years of full-time software engineering experience
Hands-on experience with backend development in Java, Python, or Go
Experience with frontend development using React or similar frameworks
Strong understanding of HTTP, REST APIs, and client–server architecture — including failure cases, versioning, and idempotency
Experience designing data models and writing complex SQL queries; ability to diagnose slow queries using execution plans and reason about composite index design
Proven ability to build and own distributed systems or microservices in production — and to reason about how they fail, not just how they work
Experience designing APIs and backend systems for scale
Ability to debug and resolve complex production issues independently — using logs, metrics, and structured investigation, not guesswork
Hands-on experience with performance tuning — query optimisation (composite indexes, execution plans), caching strategies, and async processing
Experience building async or background processing systems — with an understanding of worker failures, queue behaviour, at-least-once delivery, and partial failure scenarios
Experience using Git, writing tests, and participating in code reviews
Comfortable working with minimal supervision and taking ownership of outcomes
Good-to-Have
Experience with Redis or similar in-memory data stores for rate limiting, caching, or queuing
Familiarity with observability tools — metrics, distributed tracing, alerting (e.g. Datadog, Sentry, Prometheus) — and experience using them to detect silent failures, not just crashes
Exposure to database sharding, partitioning, or replication
Experience with message queues, task queues, or event-driven architecture (e.g. Celery, RabbitMQ, SQS), including dead-letter queues and transactional outbox patterns
Prior experience in a SaaS product environment
Experience designing reconciliation mechanisms for third-party integrations — detecting and recovering from data drift between systems
Curiosity about how systems fail at scale and how to design around those failure modes
What Does Your First Year Look Like at Almabase?
First 3 Months
Ramps up quickly on the codebase, systems, and architecture
Delivers well-scoped features independently with minimal hand-holding
Identifies gaps or risks in existing systems and raises them proactively
Establishes credibility through reliable, high-quality output
3–6 Months
Owns complete features or workflows end-to-end, from design to production
Debugs production issues independently using logs, metrics, and systematic reasoning
Improves reliability and performance in areas they own
Contributes meaningfully to technical design discussions
6–12 Months
Drives architecture and design decisions for their domain with confidence
Leads incident reviews and contributes to post-mortem culture
Reduces technical debt and improves maintainability across their areas
Acts as a technical reference point for junior engineers on their team
What We Look For in Interviews
Our interview process is designed to surface engineers who think in failure modes, not just happy paths. Specifically, we look for:
Production ownership: Can you describe a system you owned end-to-end — including what broke, how you found it, and what you changed?
Failure-mode reasoning: Can you identify what breaks when traffic doubles, a worker crashes mid-job, or a third-party API returns a 200 with a partial failure in the body?
Idempotency intuition: Do you know when duplicate writes or duplicate charges can happen — and how to prevent them?
Data integrity across integrations: Have you dealt with sync jobs that silently dropped data? Do you know how to build reconciliation and alerting around unreliable external systems?
Indexing depth: Can you reason about composite index column ordering and use query execution plans to diagnose slow queries?
Async system instincts: Do you understand the failure modes of background workers and queues — not just how to set them up?
Almabase builds technology that helps universities and schools strengthen relationships with their alumni to enhance engagement and increase donations. Our solutions are designed for educational institutions seeking to provide exceptional experiences and drive funding through meaningful connections. We leverage AI and prioritize human-centered design to create impactful tools for growth and collaboration.