Background: A boutique advisory firm specializing in cutting-edge Litigation and Risk Analytics is looking to augment its Data Science practice by bringing in a Principal Data Engineer. The Principal Data Engineer will build out a best-in-class data ingestion practice for the organization and optimize data processing and transformation. The Data Engineer's data-wrangling skills will enable Data Scientists to perform analytics more efficiently, adding significant value to Litigation and Risk initiatives.
What Will the Principal Data Engineer Do:
- be absolutely hands-on and provide strategic support on best practices with regard to data ingestion, processing, transformation, and integration strategies within the firm's cloud data environment
- help build cloud infrastructure and architecture for collecting, storing, processing, and analyzing big data in batch and streaming pipelines
- develop, manage, and automate ETL processes and data pipelines to efficiently ingest data
- lead operational efforts to make the lives of Data Scientists easier by cleansing and aggregating data, including productionizing machine learning algorithms and BI reports
- provide general guidance on data management, data warehousing, data modeling, data governance, data security, and cloud infrastructure initiatives
- lead the evaluation, implementation, and deployment of emerging tools and processes for data engineering within a Data Science and Analytics environment
More Info – Principal Data Engineer Role:
- full-time role + great benefits
- hybrid work environment (onsite + remote). They have offices in the following locations: Austin, Boston, Chicago, Dallas, Denver, Houston, Los Angeles, Miami, New York, Palo Alto, San Francisco, Washington, DC. Figure you will be in the office 2–3 times per week, with the rest remote
- comfortable being the most senior Data Engineer within the Analytics practice, working alongside one other Data Engineer
Responsibilities/Experience:
- experience as a Data Engineer with strong data ingestion skills, e.g., ETL, automation of data pipelines, batch/stream processing
- exposure to leading cloud platforms (Azure, AWS, GCP) is a HUGE plus
- the following tools/stacks are desired: SQL, Python, Spark, plus an understanding of Snowflake, Databricks, Redshift, and BigQuery