Short Description:
The Data Engineer - Clean Room position, based in Navi Mumbai, requires a candidate with a minimum of 4 years of experience in database development, ETL tools, and scheduler/orchestration tools. The role involves designing, developing, and supporting data engineering projects with various data sources, including structured and unstructured files, traditional relational databases, and non-traditional databases. Strong communication skills are essential for collaboration with both technical and non-technical teams. Preferred qualifications include knowledge of big data databases, experience with the Hadoop ecosystem, and familiarity with tools like Apache Airflow. The selection process involves an online assessment, recruiter screen, technical interview, and meetings with both WebMD and PulsePoint hiring managers.
Job Title: Data Engineer - Clean Room
Location: Navi Mumbai, Maharashtra
About the Company: WebMD Health Corp., an Internet Brands Company, is a prominent provider of health information services across various platforms. Our services cater to patients, physicians, healthcare professionals, employers, and health plans through online portals, mobile platforms, and health-focused publications. The WebMD Health Network encompasses a range of brands, including WebMD Health, Medscape, Jobson Healthcare Information, prIME Oncology, MediQuality, Frontline, QxMD, Vitals Consumer Services, MedicineNet, eMedicineHealth, RxList, OnHealth, Medscape Education, and other WebMD sites.
For more details about the company, please visit our website: www.webmd.com / www.internetbrands.com
Qualifications:
- Education: B.E. in Computer Science/IT or related engineering discipline.
- Experience: Minimum 4 years
- Work Timings: 2 PM to 11 PM IST
About PulsePoint: PulsePoint, now part of Internet Brands, is a leading technology company specializing in real-world data utilization to optimize campaign performance and transform health decision-making. Leveraging proprietary datasets and methodology, PulsePoint delivers unprecedented accuracy in targeting healthcare professionals and patients, providing exceptional results to our clients.
Location: This position is hybrid (3 days/week) at the Mumbai office, located at Plot No. K-10, Liberty Tower, Unit Nos. 801 and 802, 8th Floor, Kalwa Industrial Area, Airoli, Navi Mumbai, Maharashtra.
Responsibilities:
- Design, develop, and support various data engineering projects involving heterogeneous data sources such as files, traditional relational databases (e.g., Postgres DB), and non-traditional databases (e.g., Vertica DB, Hive).
- Analyze business requirements, design and implement necessary data models, and build ETL/ELT strategies.
- Lead data architecture and engineering decision-making/planning.
- Translate complex technical subjects into understandable terms for both technical and non-technical audiences.
- Support the Data Clean Room project.
Required Qualifications:
- 4+ years of experience in database development (advanced SQL) on traditional and non-traditional databases.
- 2+ years of experience with an ETL tool (Pentaho, Talend, Informatica, DataStage).
- 1+ years of experience in scheduler/orchestration tools (Control-M, Autosys, Airflow, JAMS).
- Basic Python scripting and troubleshooting skills.
- Strong communication and documentation skills, working with both technical and non-technical teams.
- Experience with infrastructure setup and architectural requirements.
- Ability to work with minimal or no direct supervision.
- Experience collaborating with teams outside of IT (e.g., Business Intelligence, Finance, Marketing, Sales).
Preferred Qualifications:
- Knowledge of big data databases such as Vertica, Snowflake, or Redshift.
- Experience with the Hadoop ecosystem, including components such as Hive, Spark, and Sqoop for processing terabyte-level data.
- Experience with GCP/BigQuery.
- Familiarity with Apache Airflow or in-depth understanding of its workings.
- Background in web analytics or business intelligence.
- Understanding of Digital Marketing Transactional Data (Click Stream Data, Ad Interaction Data, Email Marketing Data).
- Understanding of Medical/Clinical data.
- Exposure or understanding of scheduling tools such as Airflow.
- Experience working in a Linux environment.
Selection Process:
- Initial online assessment
- Recruiter screen
- Technical interview
- Interview with WebMD hiring manager
- Interview with PulsePoint hiring managers
'Red Flags' for Us: Candidates may face challenges if they lack hands-on experience with datasets or if they have primarily translated requirements into SQL without a deep understanding of how data impacts business and client success metrics.