Verdigris

Engineering Manager, Cloud Platform

$190,000 – $240,000 per year

TLDR

Drive innovation in AI infrastructure by overseeing the cloud platform while mentoring a team of elite engineers and implementing best-in-class development practices.

GPU racks pull 120–140 kW today. By 2027, that number hits 600 kW to 1 MW per rack. The entire AI buildout, hundreds of billions in capex, is being erected on a grid that was not designed for it. Design margins have compressed from 30% to 10–15%. The monitoring systems built for the last generation of infrastructure poll at one-second intervals. GPU workloads ramp in eight milliseconds. AI is accelerating faster than the infrastructure beneath it can be understood.

The incumbent vendors (Schneider, Eaton, Vertiv) were built for a world where loads were predictable and slow. They are not broken. They are mismatched to what AI infrastructure demands. Verdigris captures continuous waveforms at 8 kHz. That is not a software improvement on existing monitoring data. It is a different measurement entirely, one that makes visible what no other system can see: hidden degradation, safe operating headroom, and the real-time electrical behavior of infrastructure running at the edge of its design limits.

We are not a monitoring solution. We are the electrical intelligence layer: the validation layer that sits between the physical environment and the autonomous control systems the industry is building toward. Solving this matters beyond the business case. Carbon-free AI, stranded capacity recovery, and the long-term reliability of the compute layer the world is betting on all depend on getting electrical intelligence right at the physical layer.

The company

Twenty people. Lean by design. We have raised serious capital, refocused the company around the most consequential problem in AI infrastructure, and come out the other side with real customers, real revenue, and hardware that has been running in colocation and owned data center facilities for more than a decade. The cloud platform processes billions of 8 kHz waveform readings and turns them into validated operating limits that operators use daily.

This unique position, built on our high-fidelity 8 kHz metering, converts the strain on electrical infrastructure into a definitive roadmap for solving the AI industry's most critical power bottleneck and driving the sector's next wave of technological improvement.

Today that means reliability and early warning. Tomorrow it means capacity optimization and machine-facing orchestration APIs that GPU schedulers consume directly.

The role

We are hiring an Engineering Manager to own the cloud platform, the system that makes all three product pillars work: Observability, Intelligence, and Orchestration.

You would manage a team of elite engineers, report to the cofounder/CTO, and hold a mandate to raise the bar on how this team builds and ships. This is a player-coach role. You will set direction, run the engineering operating cadence, and manage people. You will also read code, debug production issues, and make architectural calls. If you have not been in a codebase recently, this is not the right fit.

We are building the management layer to accelerate towards best-in-class industry standards: clear ownership, a culture of high craft, and leadership that empowers and accelerates rather than administrates. The candidate we want believes in this velocity.

One more thing: a big part of how we operate is through deliberate, opinionated use of agentic coding tools. The team is actively migrating towards an AI-native culture, learning how to adopt practices that scale. You will be instrumental in defining and coaching the next standard for AI-native development here, and you will recruit and coach to that standard.

The situation

The platform works. Customers depend on it. The 8 kHz ingestion pipeline is real and running in production.

The platform is at a strategic inflection point: we must mature the architecture and organizational structure to support the scale and velocity of our next-generation product roadmap.

We need someone who can take ownership of the platform, organize the team around clear ownership, and raise the quality bar, while also building toward future application layers that do not exist yet.

First 6 months
  • Audit the platform: reliability, scalability, observability, tech debt. Form your own view, not just ours.
  • Organize ownership across the three-pillar stack. Ingestion and the 8 kHz pipeline. ML signal processing and validated operating limits. The APIs, MCPs, and workflows that deliver them.
  • Stand up an engineering operating cadence: roadmap reviews, incident reviews, delivery planning, architecture reviews.
  • Get your hands dirty on the hardest reliability and performance problems. Ship fixes, not just plans.
  • Establish AI-native development practices on the team. Not a policy — real tooling norms, a shared view on where agentic coding accelerates, and where it creates new risk.
  • Identify hiring gaps and start filling them. Raise the bar on who we bring in.

By 12 months, here is what success looks like
  • Platform reliability and deployment velocity are measurably better. Fewer fires, faster fixes.
  • The team ships consistently with clear ownership. They do not need you in every decision.
  • There is an engineering roadmap people trust — one that connects today’s reliability work to the capacity optimization and orchestration capabilities we are building toward.
  • You have made at least two hires who made the team noticeably stronger.
  • The foundations are well architected, letting us move up the value chain with our customers through a suite of well-thought-out applications.
  • The platform is positioned to support machine-facing orchestration APIs: the layer where validated intelligence feeds directly into GPU schedulers and demand response systems.

What we are looking for
  • Real technical depth in cloud infrastructure, data systems, or ML platforms. You can review architecture, debug production, and make tradeoffs — not just delegate them.
  • You have inherited or built a small team before and made it better. You set expectations, build ownership, and coach people up.
  • You can operate without a clean roadmap. You turn ambiguity into a plan with owners and timelines.
  • You care about production quality. Observability, incident response, release discipline. You build the habits, not just the systems.
  • You have strong opinions about how agentic coding tools change what a small team can build. You are actively shaping how your team works with AI — and you have the judgment to know where it helps and where it introduces new failure modes.
  • You are pulled by the mission. AI infrastructure is being built on a foundation that was not designed for it. Verdigris is the layer that makes it trustworthy. That framing should feel meaningful to you, not just interesting.

Why this role
  • You would work directly with the founding team and own the platform that makes the product work.
  • The company is small enough that your decisions show up in the product and the culture within months. A lean team, operating with the right practices and the right people, can build like a team ten times its size. You will define what that looks like here.
  • The 8 kHz ingestion pipeline is already running in production. You are not starting from zero. You are taking something real and making it significantly better — on infrastructure that actually matters.
  • If you are at a bigger company wondering whether you will ever get to build something from a position of real ownership, this is that role.

Verdigris creates intelligent energy solutions for data centers, focusing on efficiency and sustainability. Our technology harnesses AI to optimize energy usage, aiming for carbon-neutral electricity and resilient infrastructure. We're dedicated to shaping the future of energy systems for a better world.

Founded: 2011
Employees: 11-50
Industry: Internet Software & Services
Total raised: $22M