Job Summary
We are seeking a skilled Data Engineer to design, build, and maintain robust data pipeline architectures that support our business intelligence and analytics needs. The ideal candidate will have extensive experience working with large, complex datasets and cloud-based big data technologies. You will play a critical role in optimizing data delivery, automating processes, and enhancing infrastructure scalability to drive actionable insights across customer acquisition, operational efficiency, and overall business performance.
Key Responsibilities
- Design, develop, and maintain efficient data pipeline architectures that meet both functional and non-functional business requirements.
- Assemble and manage large, complex datasets from diverse sources, ensuring data integrity and quality throughout the pipeline.
- Identify opportunities to improve internal processes by automating manual tasks, optimizing data workflows, and redesigning infrastructure for enhanced scalability and performance.
- Build and maintain infrastructure for the extraction, transformation, and loading (ETL) of data using SQL and cloud-based big data platforms such as Azure or Google Cloud Platform (GCP).
- Develop analytics tools that leverage data pipelines to generate actionable insights related to customer acquisition, operational metrics, and key business indicators.
- Collaborate closely with cross-functional partners, including Data Architects, Product Owners, and the Data and Design teams, to resolve technical challenges and support evolving data infrastructure needs.
Required Qualifications
- Proficiency in object-oriented and functional programming languages such as Python, Java, or Scala.
- Hands-on experience with big data processing tools such as Apache Spark and Apache Kafka.
- Strong knowledge of relational databases (e.g., SQL Server, PostgreSQL) and NoSQL databases such as Cassandra.
- Familiarity with data pipeline orchestration and workflow management tools like Apache Airflow.
- Experience working with cloud data analytics services, including GCP, Azure, or AWS.
- Practical knowledge of stream-processing frameworks such as Spark Streaming and Apache Flink.
- Expertise in maintaining data quality within big data pipelines, including implementing unit tests and validation checks.
- Advanced skills in writing complex SQL analytical queries and optimizing database performance.
- Proven ability to build and optimize scalable big data pipelines, architectures, and datasets.
- Strong analytical skills, including the ability to perform root cause analysis on data and processes to support business decision-making and identify areas for improvement.
- Understanding of security best practices in data computing, including encryption of sensitive data both at rest and in transit.
- Experience handling unstructured datasets and extracting meaningful insights from large, disconnected data sources.
- Working knowledge of message queuing systems, stream processing, and scalable big data storage solutions.
- Demonstrated ability to collaborate effectively with cross-functional teams in fast-paced, dynamic environments.
Preferred Qualifications and Benefits
Candidates with additional certifications in cloud platforms (GCP, Azure, AWS) or experience integrating machine learning into data pipelines will be highly valued. Our company offers a collaborative work environment, opportunities for professional growth, and the chance to work on cutting-edge data projects that directly impact business outcomes.
If you are passionate about building scalable data solutions and driving business insights through innovative data engineering, we encourage you to apply.