Your role
Data engineers are the central productive force of PriceHubble. As a data engineer, your mission will be to build and maintain our extract-transform-load infrastructure, which consumes raw data and transforms it into valuable real estate insights. Your daily challenges will be to mine a wide variety of new datasets, build new datasets, and extract and create new features. These features and insights are used either directly in our product or as signals in our machine learning algorithms.
Responsibilities
- Understand real estate market and urban-related data
- Extract, clean up, structure and transform complex raw and processed datasets to derive insights from them
- Retrieve a wide variety of datasets and integrate them into the data pipeline
- Create and maintain an efficient data infrastructure
- Continuously provide new ideas to improve our engines and products
Requirements
- BSc or MSc in Computer Science or equivalent
- At least 3 years of experience in a similar position
- Proficiency in at least one object-oriented programming language (preferably Python) and at least one scripting language
- In-depth understanding of basic data structures and algorithms
- Familiarity with software engineering best practices (clean code, code review, test-driven development, …) and version control systems
- Experience with the ETL and data processing tools we’re using is a strong advantage:
  - Python, Pandas
  - Luigi and PySpark
  - PostgreSQL and PostGIS
  - Scikit-learn and TensorFlow
- Working experience with cloud providers (Google Cloud, AWS or Azure)
- Advanced knowledge of relational databases
- Experience with Docker and Kubernetes orchestration is a strong advantage
- Understanding of core machine learning concepts is an advantage
- Previous experience working in agile teams, and looking forward to doing it again
- Comfortable working in English, with strong reading skills and a good spoken command of the language