Data Engineer (Full-time)

TLDR

Design and maintain scalable data pipelines and AI/ML workflows that enhance data-driven decision-making and deliver intelligent experiences on Kata's platforms.

Design, build, and maintain scalable data pipelines, streaming infrastructure, and AI/ML data workflows that power data-driven products and enterprise AI solutions. The role ensures that reliable, timely, high-quality data is available across the organization, so that AI Engineers, Product teams, and enterprise clients can make accurate, insight-driven decisions and deliver intelligent customer experiences through Kata's AI and voice platforms.

Qualifications & Education:

  • Bachelor's degree in Computer Science, Information Systems, Data Engineering, Statistics, or related field
  • Relevant certifications (GCP Professional Data Engineer, Databricks, Airflow/Astronomer, etc.) are a plus

Technical Skills:

  • Streaming: Apache Kafka — topic design, consumer groups, partitioning strategy, and real-time event processing
  • Batch Orchestration: Apache Airflow — DAG design, scheduling, dependency management, and failure handling
  • Distributed Processing: Apache Spark — batch and micro-batch transformations, DataFrame API, optimization
  • Data Warehousing: Google BigQuery (primary); Apache Hive for large-scale batch analytics
  • NoSQL / Wide-Column: Apache Cassandra — data modeling for high-write, time-series, and event-driven workloads
  • Languages: Python (required); SQL (required); Scala is a plus
  • Cloud: GCP — BigQuery, Dataflow, Cloud Storage, Pub/Sub, Vertex AI Pipelines; Azure is a plus
  • Containerization: Docker; basic Kubernetes for deploying data services
  • CI/CD: GitLab CI, GitHub Actions, or equivalent for pipeline deployment automation
  • Data Quality: Great Expectations, dbt tests, or custom validation frameworks
  • Monitoring: Prometheus, Grafana, or GCP Monitoring for pipeline observability; alerting on SLA breaches
  • Version Control: Git with feature branching and pull request workflow

Experience:

Associate Level (1–2 years)

  • 1–2 years of professional experience in data engineering, software engineering with data focus, or a related technical role
  • Hands-on experience building or maintaining data pipelines in a production environment
  • Practical exposure to at least one streaming or batch processing technology (Kafka, Spark, or Airflow)
  • Familiarity with SQL and relational or columnar databases (BigQuery, PostgreSQL, Hive, or equivalent)
  • Exposure to cloud data services on GCP or Azure
  • Experience working in Agile/Scrum teams with sprint-based delivery


Mid Level (3–5 years)

  • 3–5 years of professional experience in data engineering, with at least 2 years building and operating production-grade pipelines
  • Proven hands-on experience with Apache Kafka for real-time event streaming — including topic design, consumer group management, and at-least-once/exactly-once delivery patterns
  • Demonstrated experience designing and maintaining batch workflows using Apache Airflow and large-scale data transformations with Apache Spark
  • Experience working with BigQuery and/or Hive for large-scale analytics workloads, including query optimization and partitioning strategies
  • Hands-on experience with Cassandra or similar NoSQL wide-column stores for high-write or time-series data use cases
  • Experience supporting AI/ML data pipelines — feature engineering, training dataset preparation, or model inference data feeds
  • Experience with data quality frameworks and implementing data observability practices in production environments

We offer flexible working hours for our employees.

Most importantly, we provide a learning experience in the conversational AI industry.

Kata.ai is an Indonesian AI company that specializes in Conversational AI, enhancing how businesses understand and interact with their customers through advanced Natural Language Processing technology. Its Kata Bot Platform enables companies of all sizes to create feature-rich chatbots across various messaging platforms, helping industries such as FMCG, telecommunications, and finance automate customer interactions and improve user experiences.
