Principal Data Engineer

Confiz

  • Lahore, Punjab
  • Multan, Punjab
  • Permanent
  • Full-time
  • 1 month ago
We are seeking a Data Engineering Lead with 8+ years of hands-on experience and a strong background in real-time and batch data processing, containerization, and cloud-based data orchestration. This role is ideal for someone who is passionate about building robust, scalable, and efficient data pipelines and who thrives in agile, collaborative environments.

Key Responsibilities

  • Design, build, and maintain real-time data pipelines using streaming frameworks such as Kafka, Apache Flink, and Spark Structured Streaming
  • Develop batch processing workflows with Apache Spark (PySpark)
  • Orchestrate and schedule data workflows using orchestration frameworks such as Apache Airflow and Azure Data Factory
  • Containerize applications using Docker, manage deployments with Helm, and run them on Kubernetes
  • Implement modern storage solutions using open formats such as Parquet, Delta Lake, and Apache Iceberg
  • Build high-performance analytics engines using tools like Trino or Presto
  • Collaborate with DevOps to manage infrastructure with Terraform and integrate with CI/CD pipelines via Azure DevOps
  • Ensure data quality and consistency using tools like Great Expectations
  • Write modular, well-tested, and maintainable Python and SQL code
  • Develop an observability layer to monitor and optimize performance across data pipelines
  • Participate in agile ceremonies and contribute to sprint planning and reviews

Required Skills & Experience

  • Advanced Python programming with a strong focus on modular and testable code
  • Strong knowledge of SQL and experience working with large-scale datasets
  • Hands-on experience with at least one major cloud platform (Azure preferred)
  • Solid experience with real-time data processing (Kafka, Flink, or Spark Streaming)
  • Expertise in Apache Spark (PySpark) for batch processing
  • Experience implementing lakehouse architectures and working with columnar storage (e.g., ClickHouse)
  • Proficiency in using Azure Data Factory or Apache Airflow for data orchestration
  • Experience in building APIs to expose large datasets
  • Solid experience with Docker, Kubernetes, and Helm
  • Familiarity with data lake open formats such as Parquet, Delta Lake, and Iceberg
  • Basic experience with Terraform for infrastructure provisioning
  • Practical experience with data quality frameworks (e.g., Great Expectations)
  • Comfort working in agile development teams
  • Proven ability in debugging and performance tuning of streaming and batch data jobs
  • Experience with AI-driven tools (e.g., text-to-SQL) is a plus

We have an amazing team of 700+ individuals working on highly innovative enterprise projects and products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growth fintech, and multiple Silicon Valley startups.

What makes Confiz stand out is our focus on processes and culture. Confiz is ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), and ISO 14001:2015 (EMS) certified. We have a vibrant culture of learning through collaboration and making the workplace fun. People who work with us use cutting-edge technologies while contributing to the company's success as well as their own.

To know more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/
