Design, Support and Implementation of highly scalable Data Architectures based on Lakehouse technologies, relational Databases, NoSQL and Streaming Scenarios on Kubernetes
Contributing to and establishing best practices for Data Management
Development of Data Pipelines and improvement of existing Python / PySpark Frameworks (see the pipeline sketch after this list)
Customization of common tools such as Airflow and Spark
Training and Support for Business Users
Extending the Data Ecosystem with complementary tools such as Data Catalogs and BI Tools like Apache Superset
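
As one illustration of the pipeline work above, here is a minimal PySpark sketch that aggregates raw events into an Apache Iceberg table. It assumes Spark 3.x with the iceberg-spark-runtime package on the classpath; the catalog name ("lakehouse"), S3 paths, and table and column names are hypothetical, not taken from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Register an Iceberg catalog backed by an S3 warehouse (hypothetical names;
# requires the iceberg-spark-runtime package on the classpath).
spark = (
    SparkSession.builder
    .appName("daily-orders-aggregation")
    .config("spark.sql.catalog.lakehouse", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lakehouse.type", "hadoop")
    .config("spark.sql.catalog.lakehouse.warehouse", "s3a://example-bucket/warehouse")
    .getOrCreate()
)

# Read raw events, aggregate per day, and (re)create the Iceberg target table.
orders = spark.read.parquet("s3a://example-bucket/raw/orders/")
daily = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date")
    .agg(F.count("*").alias("order_count"), F.sum("amount").alias("revenue"))
)
daily.writeTo("lakehouse.analytics.daily_orders") \
    .partitionedBy(F.col("order_date")) \
    .createOrReplace()

Note that the Iceberg catalog is wired up purely through Spark configuration, which is also how such a job would typically be parameterized when submitted to Kubernetes via spark-submit or the Spark Operator.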
Profile:
Proficiency with Lakehouse technologies, especially Spark, Python, Apache Iceberg and S3. Ideally, prior knowledge of Dremio and Iceberg Metadata Catalogs.
Multiple years of experience in Data Management and Data Modelling in highly scalable Architectures.
Very good knowledge of Data Pipeline Orchestration with Apache Airflow, including custom Operators (see the operator sketch after this list).
Very good knowledge of the Software Development Lifecycle, preferably with GitHub.
Good knowledge of Data Catalogs and related Metadata.
Good knowledge of Kubernetes, Docker and related tooling.
Very good English skills.
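
Since the profile explicitly mentions custom Airflow Operators, here is a minimal sketch of one. The operator name, its table parameter, and its behavior are hypothetical; a real implementation would submit a Spark job or call a catalog API instead of just logging.

from airflow.models.baseoperator import BaseOperator


class IcebergMaintenanceOperator(BaseOperator):
    """Hypothetical custom operator triggering maintenance on an Iceberg table."""

    # Allow the table name to be Jinja-templated in DAG definitions.
    template_fields = ("table",)

    def __init__(self, *, table: str, **kwargs):
        super().__init__(**kwargs)
        self.table = table

    def execute(self, context):
        # Placeholder body: in practice this would submit a Spark job or call
        # a catalog API to expire snapshots / compact files for self.table.
        self.log.info("Running maintenance for Iceberg table %s", self.table)
        return self.table

Usage inside a DAG would look like IcebergMaintenanceOperator(task_id="compact_daily_orders", table="lakehouse.analytics.daily_orders"), with the table argument templatable via Jinja because it is listed in template_fields.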
Other Requirements:
Contractor based in Europe (employment option in Portugal).