Data Engineer, Analytics & DataOps | Rosetta.ai
Rosetta.ai helps fashion e-commerce businesses uncover consumers’ shopping preferences and provide personalized shopping experiences. We give e-commerce merchants enterprise-level AI support from day one.
We are looking for a Data Engineer, Analytics & DataOps, who will not only build data pipelines but also extend the next generation of our data tools. In this position, you will develop a clear sense of connection with our organization and leadership.
In this role, you will define and find solutions to complex and ambiguous problems. You will leverage your deep knowledge and experience to collaboratively define the technical vision, strategy, and architecture for ETL, Logging, Streaming, Batch/Compute engines (Presto/Spark), Semantic Data and Metadata models, ML workflows, Consumption workflows (Visualization/Notebooks), and the Big Data development lifecycle (coding, testing, deploying, discovery, etc.).
- Immerse yourself in all aspects of the product, understand the problems, and tie them back to data engineering solutions.
- Articulate strategy within teams and communicate effectively with cross-functional partners.
- Drive a comprehensive technical vision for the fundamentals and evolution of the Analytics/Data Infra foundation and tooling.
- Build and (dimensionally) model core datasets and analytics applications, and make them scalable and fault-tolerant.
- Craft and own the optimal data processing architecture and systems for new data and ETL pipelines/analytics applications.
- Drive internal process improvements and automate manual processes for data quality checks and SLA management.
- Collaborate with cross-functional partners (Data Infrastructure, Product Software Engineering, Data Engineering, and Product Management team members) on use cases to evolve the long-term, architecture-driven, end-to-end analytics development cycle.
- Build visualizations to provide insights into the data & metrics.
- Apply proven expertise to build high-performance, scalable data warehouses, online caches, and real-time systems.
- Design, build and launch new data extraction, transformation, and loading processes in production.
- Work with the data infrastructure team to triage infra issues and drive them to resolution.
- Support existing processes running in production.
- Healthcare (Medical, Dental, Vision)
- Retirement savings or 401(K)
- Paid time off (Annual)
- Maternity/Paternity leave
- Life insurance
- Tuition reimbursement and training
- Personal equipment (laptop, monitor)
- Free health screening
- Free snacks and drinks
- Open and creative environment
- Occasional team dinners, outings, and happy hours
- Extended annual time off
- Additional paid leave (volunteer, birthday, menstrual, bereavement)
- Flexible schedules and working time
- Optional remote work
- Employee stock ownership plan (ESOP)
Culture/6 Core Values:
Grit – We thrive outside of our comfort zone, pushing ourselves to go even further. We think long-term and constantly strive to be better, even if things don’t always go as expected.
Trust – We earn trust by listening to each other, following through on our commitments, and keeping our word. We practice transparency within the company and with our customers and our community.
Humility – We learn from everyone and everywhere, and we approach each new challenge knowing that we may not have all the answers.
Empathy – We stay deliberately curious about the industry, the business, and real-world scenarios so that we can distill insights and shape our approaches.
Candor – We are open and honest. We give each other praise and criticism because we believe each team member is as important as the other.
Craftsmanship – We simplify, innovate, perfect, and start over until everything we touch enhances each life it touches.
Taipei Tech Arena
- Email to firstname.lastname@example.org
- Or apply via CakeResume
Rosetta.ai is an Equal Employment Opportunity employer. Rosetta.ai conducts all employment-related activities without regard to race, religion, color, national origin, age, sex, marital status, sexual orientation, disability, citizenship status, genetics, or any other characteristic protected by law.
Background & Skills:
- BS/BA in Technical Field, Computer Science, or Mathematics.
- Experience with writing complex SQL statements.
- Experience in data engineering, applying data warehousing and ETL best practices.
- Experience with either a MapReduce or an MPP system.
- Experience with schema design and dimensional data modeling.
- Experience with anomaly/outlier detection.
- Experience with LAMP and Big Data stack environments (Hadoop, MapReduce, Hive).
- Experience with workflow management engines (e.g. Airflow, Luigi, Prefect, Dagster, digdag.io, Google Cloud Composer, AWS Step Functions, Azure Data Factory, UC4, Control-M).
- Ability to manage data warehouse plans and communicate them to internal clients.
- Communication skills including the ability to identify and communicate data-driven insights.
- Ability to analyze data to identify opportunities, deliverables, gaps, and inconsistencies.
- Flexibility and comfort working in a dynamic, possibly remote team environment with minimal documentation and process.
[Preferred Qualifications] (Optional; the more, the better)
Background & Skills:
- Knowledge of Python (including Pandas), Java, or Scala.
- Experience designing and implementing real-time pipelines.
- Experience with data quality and validation.
- Experience with SQL performance tuning and end-to-end process optimization.
- Experience with notebook-based Data Science workflows.
- Experience collaborating, defining, and communicating complex technical concepts to a broad variety of audiences.
- Experience with cloud or on-prem Big Data/MPP analytics platforms (e.g. Netezza, Teradata, AWS Redshift, Google BigQuery, Azure Data Warehouse, or similar).
- Experience scaling analytics architectures and working with open-source big-data stacks (Spark, Koalas, etc.).
- Experience with Airflow.
- Experience with ELK.
- Experience querying massive datasets using Spark, Presto, Hive, Impala, etc.
- Experience with filesystems, server architectures, and distributed systems.
850K ~ 950K TWD/year