Role Overview
We are seeking an AWS Data Engineer with 4–7 years of experience to design and build cloud-native data pipelines, contribute to innovation in data engineering practices, and collaborate across teams to deliver secure, scalable, and high-quality data solutions. This role is critical to enabling real-time insights and supporting our mission to streamline enterprise operations.
Key Responsibilities
- Develop, test, deploy, orchestrate, monitor, and troubleshoot cloud-based data pipelines and automation workflows in alignment with best practices and security standards.
- Collaborate with data scientists, architects, ETL developers, and business stakeholders to capture, format, and integrate data from internal systems, external sources, and data warehouses.
- Research and experiment with batch and streaming data technologies to evaluate their business impact and suitability for current use cases.
- Contribute to the definition and continuous improvement of data engineering processes and procedures.
- Ensure data integrity, accuracy, and security across corporate data assets.
- Maintain high data quality standards for Data Services, Analytics, and Master Data Management.
- Build automated, scalable, and test-driven data pipelines.
- Apply software development practices, including Git-based version control and release management, to build and enhance CI/CD pipelines on AWS.
- Partner with DevOps engineers and architects to improve DataOps tools and frameworks.
Basic Qualifications
- Bachelor’s Degree in Computer Science, Engineering, or related field.
- 4–7 years of experience in application development and data engineering.
- 3+ years of experience with big data technologies.
- 3+ years of experience with cloud platforms (AWS preferred; Azure or GCP also acceptable).
- Proficiency in Python, SQL, Scala, or Java (3+ years).
- Experience with distributed computing tools such as Hadoop, Hive, EMR, Kafka, or Spark (3+ years).
- Hands-on experience with real-time data and streaming applications (3+ years).
- 3+ years of experience with NoSQL databases such as MongoDB or Cassandra.
- 3+ years of data warehousing experience (Redshift or equivalent).
- 3+ years of UNIX/Linux proficiency, including shell scripting.
- Familiarity with Agile engineering practices.
- 3+ years of SQL performance tuning and optimization experience.
- 2+ years of PySpark experience.
- Exposure to process orchestration tools (Airflow, AWS Step Functions, Luigi, or Kubeflow).
Preferred Qualifications
- Experience with Machine Learning workflows.
- Exposure to Data-as-a-Service platforms.
- Experience designing and deploying APIs.
- Excellent communication skills.