Accellor is an AI-first digital transformation partner built for the next generation of enterprise. We help global organizations turn cloud, data, and AI into real, measurable business outcomes at scale.
At Accellor, people come first. You’ll be trusted, empowered, and challenged to solve meaningful problems, collaborate with exceptional teams, and continuously grow your skills while building solutions that matter.
Trusted by Fortune 100 companies and global innovators, we work across industries delivering AI solutions, data platforms, and product engineering using modern, scalable technologies. If you want your work to create real impact and shape the future of enterprise, Accellor is where it happens.
Senior Reliability Engineer – Data & Cloud
Position Overview
We are seeking a Senior Reliability Engineer to support and enhance the reliability, performance, and stability of our retail data and cloud platforms. This role will work closely with Data Engineering, Analytics, and Retail Operations teams to ensure that store, inventory, and supply chain data is accurate, reliable, and available in real time.
You will play a key role in monitoring and maintaining Azure-based data pipelines, optimizing Fabric Lakehouse workloads, and improving automation across our retail technology ecosystem.
Key Responsibilities
- Support the reliability, performance, and uptime of data pipelines that power retail operations (POS data, supply chain feeds, inventory updates, etc.).
- Monitor production workloads using Azure Monitor, Log Analytics, and custom dashboards.
- Respond to incidents, troubleshoot failures, perform root-cause analysis, and implement prevention measures.
- Optimize ETL/ELT workflows using Azure Data Factory (ADF) for retail datasets.
- Automate integrations between source systems (e.g., POS, ERP, loyalty systems) and Power BI using Azure Logic Apps.
- Implement quality checks, data validations, and alerting to ensure data freshness and accuracy.
- Write and optimize SQL queries and stored procedures supporting operational data stores.
- Work with Azure services including Storage Accounts, Key Vault, Azure Functions, Event Grid, and App Insights.
- Support CI/CD pipelines for data integration and Fabric workloads (Azure DevOps).
- Develop and maintain Fabric Lakehouse pipelines to support reporting, forecasting, and analytics.
- Use Fabric Notebooks and PySpark for data transformations, batch processing, and scaling large retail data workloads.
- Collaborate with BI teams to ensure data is ready and reliable for dashboards and real-time insights.
- Identify opportunities to automate manual processes and improve reliability across retail systems.
Requirements
- 7+ years of experience in Reliability Engineering, Data Engineering, Cloud Engineering, or similar roles.
- Strong hands-on experience with:
  - SQL (debugging, tuning, modeling)
  - Azure Data Factory (ADF)
  - Azure Logic Apps
  - Microsoft Azure services (Storage, Functions, Key Vault, Monitor)
- Solid understanding of monitoring, observability, and incident management.
- Strong analytical, problem-solving, and communication skills.
- Ability to work in fast-paced environments with frequent data updates (common in retail).
Preferred Qualifications
- Experience in the retail industry (POS systems, inventory, supply chain, merchandising, or loyalty data).
- Familiarity with Power BI or other visualization tools.
- Experience with Git, Azure DevOps, and CI/CD workflows.
- Practical experience with Microsoft Fabric, such as working with Lakehouse, Pipelines, or Dataflows using Notebooks and PySpark.