Data Engineer Specialist in Databricks

Job Description

  • Responsibilities:
  • Design, build, and maintain scalable data platforms for data storage, processing, orchestration, and analysis.
  • Collect, process, and analyze large and complex data sets from various sources.
  • Develop and implement data processing workflows using frameworks such as Apache Spark and Apache Beam.
  • Implement scalable, performant data pipelines and data integration solutions (a minimal pipeline sketch follows the requirements list below).
  • Monitor system performance and optimize for high availability and scalability.
  • Collaborate with cross-functional teams to ensure data accuracy and integrity.
  • Ensure data security and privacy through the proper implementation of access controls and data encryption.
  • Extract data from various sources, including databases, file systems, and APIs.
  • Remain agnostic of data sources and technologies, ensuring efficient data flow and high data quality so that data scientists, analysts, and other stakeholders can access and analyze data effectively.
  • Key Requirements:
  • Experience with cloud platforms and services for data engineering (Databricks).
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Experience with big data tools such as Spark, Flink, Kafka, Elasticsearch, Hadoop, Hive, Sqoop, Flume, Impala, Kafka Streams, Kafka Connect, Druid, etc. (a Kafka streaming sketch also follows this list).
  • Knowledge of data modelling and database design principles.
  • Familiarity with data integration and ETL tools (e.g., Apache Kafka, Talend).
  • Understanding of distributed systems and data processing architectures.
  • Strong SQL skills and experience with relational and NoSQL databases.
  • Familiarity with cloud platforms and services for data engineering (e.g., AWS S3, Azure Data Factory).
  • Experience with version control tools such as Git.
  • Strong problem-solving and analytical skills.
  • Strong communication competencies.
  • Knowledge of agile methodologies such as Scrum and Kanban.
  • Fluent in English.
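
For concreteness, here is a minimal sketch of the kind of batch pipeline described under Responsibilities: extract from a relational source, transform, and load into a Delta table on Databricks. All connection details, table names, and paths are hypothetical placeholders; real credentials would come from a secret store.

```python
# Minimal batch ETL sketch (hypothetical names throughout).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

# Extract: read a relational source over JDBC (connection details assumed).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/shop")  # hypothetical host/db
    .option("dbtable", "public.orders")
    .option("user", "etl_user")                            # hypothetical user
    .option("password", "...")                             # use a secret scope in practice
    .load()
)

# Transform: keep completed orders and aggregate revenue per day.
daily_revenue = (
    orders.where(F.col("status") == "COMPLETED")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Load: append into a managed Delta table (Databricks' default table format).
daily_revenue.write.format("delta").mode("append").saveAsTable("analytics.daily_revenue")
```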
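
And a companion sketch for the streaming side of the toolset above: Spark Structured Streaming consuming JSON events from a Kafka topic and appending them to a Delta table. The broker address, topic, schema, and checkpoint path are assumptions, not part of the posting.

```python
# Minimal Kafka -> Delta streaming sketch (hypothetical broker, topic, schema).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events_stream").getOrCreate()

# Assumed shape of the JSON payload on the topic.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("value", DoubleType()),
])

# Read from Kafka and parse each message value as JSON.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "events")                     # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append the parsed stream to a Delta table, with checkpointing for recovery.
query = (
    events.writeStream.format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # hypothetical path
    .toTable("raw.events")  # available on Spark 3.1+ / Databricks
)
query.awaitTermination()
```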

Other Details:

The position offers remote work flexibility and focuses on data engineering within an agile environment.