We are looking for a full-time data engineer. You will be at the core of making people’s data work for them. You will design and maintain our ETL data pipeline: pulling and parsing data from various APIs and downloaded data stores, populating normalized relational databases, and calculating cached views (usually in NoSQL form) that power our data products and services. You are not constrained in your choice of tools; our current stack includes Python, JavaScript/Node.js, PostgreSQL, Airflow, and AWS (Lambda and hosting). You will be a core part of a highly skilled and motivated team, split between Berlin and New York City, that is changing one of the most unethical sectors in our modern economy.
What We Are Looking For:
- Expertise in building out data pipelines, efficient ETL design, implementation, and maintenance
- Mastery of relational databases and the ability to derive normalized schemas from datasets
- Experience with NoSQL databases (such as MongoDB)
- Passion for creating data infrastructure technologies from scratch using the right tools for the job
- Experience building and maintaining a data warehouse in production environments
- Ability to turn vague requirements into clear deliverables with minimal guidance
- Experience with Apache Airflow, AWS tools, Git, and Linux
- Experience with systems for transforming large datasets such as Spark or Hadoop
- Familiarity with Python-based data science tools (e.g., pandas) is highly desirable
What We Offer:
- Highly competitive wages
- Top-of-the-line equipment: Laptop of choice, custom monitor setup, optional standing desk, etc.
- Opportunities to travel to both the Berlin and New York City headquarters