Staff Engineer (Data Engineer – AI & Digital Platforms)

TLDR

Design and build scalable data pipelines and GenAI/LLM-powered applications on Hadoop, Cloudera, and Teradata platforms.

Must-Have Skills

  • Hadoop and MapReduce
  • Cloudera
  • AI-enabled Application Development
  • Machine Learning – General Experience
  • LLM Application Frameworks (Capable)

Key Responsibilities

  • Design and develop scalable data pipelines across Hadoop (Hive, Impala, Spark, Kafka, Iceberg) and Teradata environments.
  • Build ingestion and transformation frameworks using Java, Spark, Python, and Shell scripts.
  • Develop full-stack applications and internal tools using Python, Shell scripting, and modern web frameworks (Flask, React).
  • Create APIs and microservices to expose data and ML models securely to downstream systems and user interfaces.
  • Collaborate with data scientists to operationalize ML models using Cloudera Machine Learning (CML).
  • Build and deploy GenAI/LLM-powered applications for intelligent data interaction, summarization, and automation.
  • Implement enterprise-grade security controls including RBAC, LDAP, Kerberos, Apache Ranger, and row-level access.
  • Tune and optimize data applications for performance across Hadoop and Teradata, ensuring efficient resource utilization.
  • Support sandbox environments for prototyping, enabling users to build ML models, dashboards, and data pipelines.

Required Skills & Experience

Data Engineering

  • Strong experience with the Hadoop ecosystem (Hive, Impala, Spark, Kafka, Iceberg, Ranger, Atlas), Teradata, and data pipeline orchestration.
  • Experience with MPP databases (e.g., Trino, Presto).
  • Proven ability in development and performance tuning of large-scale data applications.

Full Stack Development

  • Proficiency in Python, Shell scripting, REST APIs, and web frameworks (Flask, React).

Machine Learning & AI

  • Hands-on experience with ML platforms (CML), Spark MLlib, and Python ML libraries (scikit-learn, XGBoost).
  • Experience in operationalizing ML models at enterprise scale.

GenAI/LLM Applications

  • Familiarity with building applications using large language models and related tooling (OpenAI, Hugging Face, LangChain).
  • Ability to build agent workflows and support users in creating agent-based solutions.

Security & Governance

  • Experience with enterprise data security (LDAP, Kerberos, RBAC), data masking, and access control.

Performance Tuning

  • Strong expertise in optimizing data applications and queries in Hadoop and Teradata environments.

Tools & Platforms

  • Cloudera Data Platform (CDP), Informatica, Qlik Sense, Apache Oozie, Git, CI/CD pipelines.

Soft Skills

  • Strong analytical and problem-solving skills.
  • Excellent communication abilities.
  • Ability to work effectively in cross-functional teams.

Nagarro is a global digital product engineering company that specializes in building innovative products, services, and experiences across various digital mediums. With a team of over 18,000 experts in 36 countries, we empower businesses to thrive in a digital-first world by enhancing their agility and responsiveness. Our unique approach combines technology consulting and IT services, driving substantial business breakthroughs for our clients.
