Senior Staff Systems Engineer - Performance Engineer - JVM

About Us

Nu is one of the largest digital financial platforms in the world, with more than 127 million customers across Brazil, Mexico, and Colombia. Guided by our mission to fight complexity and empower people, we are redefining financial services in Latin America, and this is still just the beginning of the purple future we're building.

Listed on the New York Stock Exchange (NYSE: NU), we combine proprietary technology, data intelligence, and an efficient operating model to deliver financial products that are simple, accessible, and human.

Our impact has been recognized by global rankings such as Time 100 Companies, Fast Company’s Most Innovative Companies, and Forbes World’s Best Bank. Visit our institutional page: https://international.nubank.com.br/careers/

About the Role

The Systems Performance team is part of the Computing Squad, which consists of two distinct workstreams, Orchestration and System Performance, each with its own approach and challenges in managing and improving the foundational infrastructure where the majority of Nubank's workloads run.

The Performance team is focused on building deep diagnostic tools and performing high-level analysis to reduce latency and infrastructure costs and to increase service efficiency.

You will be responsible for leading complex performance investigations, identifying systemic bottlenecks, and driving efficiency across one of the largest JVM-based microservice architectures in the world.

Our core principles and behaviors include ownership, simplicity, veracity-first, teamwork, and a focus on quality over quantity. During a normal work day, you will interact with critical infrastructure layers, from the Linux Kernel and JVM internals to cloud-wide orchestration.

You'll be responsible for

  • Leading Deep-Dive Investigations: Conduct high-level performance analysis to identify and resolve systemic bottlenecks across our global JVM-based microservices architecture.
  • Optimizing Resource Efficiency: Drive initiatives to reduce infrastructure costs and latency by fine-tuning JVM parameters, Garbage Collection (ZGC, G1), and memory management (heap and off-heap).
  • Building Diagnostic Tooling: Develop and implement advanced observability tools using eBPF, JFR, and Flamegraphs to provide real-time insights into kernel and runtime behavior.
  • Kernel & Runtime Alignment: Bridge the gap between the Linux Kernel and the JVM, optimizing thread scheduling (CFS/EEVDF) and managing resource isolation (cgroups/throttling) within our Kubernetes environment.
  • Architecting Scalable Solutions: Design and deliver innovative infrastructure improvements that address long-term performance challenges, ensuring our systems scale ahead of demand.
  • Technical Mentorship & Culture: Share expertise on JVM internals and performance best practices with the wider Engineering team, fostering a culture of technical excellence and "quality over quantity."
  • Root Cause Excellence: Deep dive into complex concurrency issues, lock contention, and memory leaks, providing definitive fixes for high-impact technical debt.
  • Strategic Collaboration: Work closely with the Computing Squad to align orchestration strategies with system performance goals, ensuring a seamless interface between infrastructure and workloads.

We are looking for a person who has

  • Expertise in JVM Internals: Deep, low-level knowledge of the JVM is essential. You must understand how the JVM works "under the hood," including JIT compilation (C1/C2), class loading, and intrinsic methods.
  • JVM Tuning & Garbage Collection: Extensive experience with GC algorithms (ZGC, G1, Shenandoah), including the ability to tune them for massive heaps and ultra-low latency requirements.
  • OpenJDK Contribution (Major Plus): Previous experience contributing to the OpenJDK project or other low-level runtime environments is a significant advantage.
  • Linux Kernel & Scheduling: Deep understanding of the Linux Scheduler (CFS/EEVDF), thread scheduling, and how the kernel manages high-concurrency Java workloads.
  • Memory Architecture: Mastery of heap and off-heap memory management, including Direct Buffers, memory-mapped files, and diagnosing complex memory leaks.
  • Advanced Diagnostics: Mastery of diagnostic tools such as Flamegraphs, JFR (Java Flight Recorder), eBPF, and performing large-scale heap dump analysis.
  • Resource Isolation: Extensive experience with cgroups and the impact of CPU quotas and throttling on JVM workloads within Kubernetes/EKS.
  • Concurrency: Proven ability to diagnose and resolve complex concurrency problems, including lock contention and race conditions at the instruction level.
  • Cloud Platforms: Knowledge of AWS infrastructure and its performance characteristics.
  • A track record of developing and delivering innovative solutions that address team-level or project-level challenges, focusing on medium- and long-term impact.
  • An understanding of the technical aspects, capabilities, and limitations of our systems, and a habit of contributing to discussions and improvements.
  • The ability to anticipate technical and product issues and make appropriate design decisions to avoid them.
  • Enthusiasm for sharing knowledge and mentoring others.
  • The ability to deep dive into a problem to identify root causes when prioritized.

Benefits

  • Opportunity to earn equity at Nu
  • Medical Insurance
  • Dental and Vision Insurance
  • Life Insurance and AD&D
  • Extended maternity and paternity leaves 
  • Nucleo - Our learning platform of courses
  • NuLanguage - Our language learning program
  • NuCare - Our mental health and wellness assistance program
  • 401(k)
  • Savings Plans - Health Savings Account and Flexible Spending Account
  • Work-from-home Allowance
  • Relocation Assistance Package, if applicable.

Location for this opportunity (City, Country)

  • Palo Alto, United States
  • Miami, United States
  • Washington DC, United States
  • Durham, United States

Work Model for this Role

Hybrid 2-3 times/week: Our hybrid work model brings us to the office at least twice a week, on strategic days designed to maximize team connection and collaboration. For more details, visit https://building.nubank.com/nu-hybrid-work-model/

Nubank is a Brazilian neobank and the largest fintech bank in Latin America.
