We are seeking a Data Engineer to join our team. The ideal candidate is an enthusiastic problem-solver who excels at building scalable data systems and has hands-on experience with Databricks, Looker, AWS, MongoDB, PostgreSQL, and Terraform. You will work alongside sales, customer success, and engineering to design, implement, and maintain the operational data infrastructure that powers our analytics and platform offerings.
Key Responsibilities
- Data Pipeline & Integration
  - Design, build, and maintain end-to-end data pipelines in Databricks (Spark SQL, PySpark) for data ingestion, transformation, and processing.
  - Integrate data from structured and unstructured sources, including medical imaging systems, electronic medical records (EMRs), change data capture (CDC) feeds from SQL databases, and external APIs (see the illustrative sketch below).
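
To give candidates a concrete feel for this work, here is a minimal sketch of a CDC compaction step in PySpark. It is illustrative only; the bucket paths and the column names (patient_id, change_ts, op) are hypothetical, not our actual schema.

```python
# Illustrative only -- not production code. Paths and column names
# (patient_id, change_ts, op) are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("cdc-compaction-sketch").getOrCreate()

# Raw change-data-capture events landed by an upstream connector.
events = spark.read.json("s3://example-raw-bucket/cdc/patients/")

# Keep only the most recent change per primary key.
latest_per_key = Window.partitionBy("patient_id").orderBy(F.col("change_ts").desc())
current = (
    events
    .withColumn("rn", F.row_number().over(latest_per_key))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Drop delete tombstones and publish the compacted current-state table.
(
    current
    .filter(F.col("op") != "delete")
    .write.format("delta")
    .mode("overwrite")
    .save("s3://example-curated-bucket/patients_current/")
)
```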
- Analytics & Visualization
  - Collaborate with the analytics team to create, optimize, and maintain dashboards in Looker.
  - Apply data modeling and visualization best practices (e.g., reusable LookML models) to improve operational efficiency.
- Cloud Infrastructure
  - Deploy and manage cloud-based solutions on AWS (e.g., S3, Amazon EMR, Lambda, EC2) for scalability, availability, and cost efficiency, using infrastructure-as-code tooling (Terraform and Databricks Asset Bundles).
  - Develop and maintain CI/CD pipelines for data-related services and applications (a sample CI check follows below).
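
As a flavor of the CI/CD work above, a deployment pipeline might run lightweight unit tests against pipeline transformations on every commit. The sketch below is one possible shape of such a check; normalize_mrn and its field names are hypothetical.

```python
# Illustrative only: a unit test a CI/CD pipeline could run on every
# commit. The function and field names are hypothetical.
import pytest


def normalize_mrn(record: dict) -> dict:
    """Strip whitespace and zero-pad the medical record number to 8 digits."""
    mrn = str(record["mrn"]).strip()
    return {**record, "mrn": mrn.zfill(8)}


def test_normalize_mrn_pads_short_ids():
    assert normalize_mrn({"mrn": " 42 "})["mrn"] == "00000042"


def test_normalize_mrn_preserves_full_length_ids():
    assert normalize_mrn({"mrn": "12345678"})["mrn"] == "12345678"


def test_normalize_mrn_rejects_missing_field():
    with pytest.raises(KeyError):
        normalize_mrn({})
```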
- Database Management
  - Oversee MongoDB and PostgreSQL databases, including schema design, indexing, and performance tuning (see the tuning sketch below).
  - Ensure data integrity, availability, and optimized querying for both transactional and analytical workloads.
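
The tuning sketch below illustrates one routine task in this area: adding a non-blocking index in PostgreSQL and verifying that the planner uses it. The DSN, table, and column names are hypothetical.

```python
# Illustrative only: add a covering index without blocking writes, then
# check the query plan. DSN, table, and columns are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=example_db user=example_user")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run in a transaction

with conn.cursor() as cur:
    # Speed up the most common lookup pattern without locking the table.
    cur.execute(
        """
        CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_studies_patient_date
        ON studies (patient_id, study_date DESC)
        """
    )
    # Confirm the planner actually uses the new index.
    cur.execute(
        "EXPLAIN ANALYZE SELECT * FROM studies "
        "WHERE patient_id = %s ORDER BY study_date DESC LIMIT 10",
        ("P0001",),
    )
    for row in cur.fetchall():
        print(row[0])

conn.close()
```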
- Security & Compliance
  - Adhere to healthcare compliance requirements (e.g., HIPAA) and best practices for data privacy and security.
  - Implement error handling, logging, and monitoring frameworks to ensure reliability and transparency.
  - Implement data governance frameworks to maintain data integrity and confidentiality.
- Collaboration & Documentation
  - Work cross-functionally with data scientists, product managers, and other engineering teams to gather requirements and define data workflows.
  - Document data pipelines, system architecture, and processes for internal and external stakeholders.
Requirements
- Education & Experience
  - Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  - 3+ years of professional experience in data engineering or a similar role.
- Technical Skills
  - Databricks (Spark): Proven expertise in building large-scale data pipelines.
  - Looker: Experience creating dashboards, data models, and self-service analytics solutions.
  - AWS: Proficiency with core services such as S3, EMR, Lambda, IAM, and EC2.
  - MongoDB & PostgreSQL: Demonstrated ability to design schemas, optimize queries, and manage high-volume databases.
  - SQL & Scripting: Strong SQL skills, plus familiarity with Python, Scala, or Java for data-related tasks.
- Soft Skills
  - Excellent communication and team collaboration abilities.
  - Strong problem-solving aptitude and analytical thinking.
  - Detail-oriented, with a focus on delivering reliable, high-quality solutions.
- Preferred
  - Experience in healthcare or medical imaging (e.g., DICOM, HL7/FHIR).
  - Familiarity with DevOps tools (Docker, Kubernetes) and CI/CD pipelines.
  - Knowledge of machine learning workflows and MLOps practices.
Benefits
- Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401(k), IRA)
- Paid Time Off (Vacation, Sick & Public Holidays)
- Training & Development
- Work From Home