We offer the industry’s only platform that fuses customer identity and anti-fraud solutions: customer identity management, identity verification, and fraud prevention.
We sell to industries with large, consumer-facing businesses, such as banking, financial services, insurance, fintech, gaming, ecommerce/retail, telco/media, and utilities.
About the Opportunity:
As a Data Architect at Transmit Security, you will lead the design and architecture of large-scale data engineering solutions, ensuring seamless integration of AI/ML infrastructure, including Generative AI models. You will focus on building scalable, secure, high-performance data pipelines that enable real-time AI applications and handle massive volumes of data. This role is central to shaping our next-generation identity security and fraud prevention platform.
If you have deep expertise in distributed computing platforms and high-scale systems, and a passion for driving AI innovation, this opportunity is for you.
What You’ll Do:
- Architect Distributed Data Pipelines: Design and implement large-scale, distributed data architectures leveraging technologies such as Flink, Spark, and Apache Beam to support AI/ML workloads at scale. Ensure real-time and batch processing pipelines handle massive volumes of data efficiently and securely.
- Lead GenAI Integration: Drive the integration of Generative AI models, including LLMs, into the existing data infrastructure, ensuring seamless data flows for AI-driven customer interactions and fraud prevention.
- Enhance Real-Time AI Capabilities: Build and optimize cloud-based (GCP, AWS, Azure) data infrastructures that support real-time data ingestion, processing, and AI model inference, scaling to millions of users.
- Collaborate Across Teams: Partner with data scientists, ML engineers, and security researchers to develop AI/ML solutions that align with business goals, ensuring robust data pipelines that fuel advanced AI applications.
- Optimize for High-Scale Systems: Ensure the architecture supports distributed processing at massive scale, enabling smooth handling of large data streams for real-time decision-making in AI/ML environments.
- Promote Data Engineering Best Practices: Define and implement best practices for data engineering, focusing on data quality, governance, security, and efficient pipeline design for AI/ML integration at scale.
What You’ll Need:
- 8+ years of experience in data engineering or architecture, with a strong focus on distributed systems and AI/ML infrastructure.
- Extensive experience with distributed computing platforms like Flink, Spark, and Apache Beam, handling large-scale data processing in high-performance environments.
- Proven track record in designing and implementing data pipelines that support real-time and batch AI/ML workloads in cloud environments (GCP, AWS, Azure).
- Familiarity with Generative AI technologies and experience integrating LLMs and transformer-based models into data architectures.
- Proficiency in Python and cloud-native data tools (e.g., Kafka, Airflow) for building and maintaining scalable data systems.
- Deep understanding of data governance and data security, and experience optimizing data pipelines for AI/ML model training and inference.
- Excellent leadership and collaboration skills, with the ability to guide teams through complex data engineering challenges.
Advantages:
- Master’s degree in Computer Science, Data Engineering, or a related field.
- Hands-on experience with MLOps tools and best practices for managing AI/ML model lifecycles in production environments.
- Expertise in identity management, fraud detection, or cybersecurity applications.
- Practical experience deploying large-scale GenAI models, particularly in NLP and transformer-based architectures.