


Data Platform Engineer - Python, PySpark Lead - DAB (Platform Ops & Delivery) (EDB)

Job ID: R0019707 | Date posted: 05/28/2020 | Location: Cambridge, Massachusetts

By clicking the “Apply” button, I understand that my employment application process with Takeda will commence and that I agree with Takeda’s Privacy Notice, Privacy Policy and Terms of Use.

Job Description

Are you looking for a patient-focused, innovation-driven company that will inspire you and empower you to shine? Join us in our Cambridge, MA office.

At Takeda, we are transforming the pharmaceutical industry through our R&D-driven market leadership and being a values-led company. To do this, we empower our people to realize their potential through life-changing work. Certified as a Global Top Employer, we offer stimulating careers, encourage innovation, and strive for excellence in everything we do. We foster an inclusive, collaborative workplace, in which our global teams are united by an unwavering commitment to deliver Better Health and a Brighter Future to people around the world.

Here, you will be a vital contributor to our inspiring, bold mission.

For the Takeda global scientific community, which aspires to deliver transformative medicines to patients, Scientific Informatics is the technology leader, with domain-specific expertise, that provides innovative platforms and technologies to improve both operational efficiency and data-driven decision making. As an embedded partner, Scientific Informatics is at the leading edge of both science and technology and responds proactively, in an agile fashion, to the changing scientific environment and the needs of researchers to deliver timely, value-adding solutions.

As a Technology Lead on the Scientific Assets & Decision Support team, you will apply artificial intelligence and machine learning to solve real and difficult problems in cheminformatics, bioinformatics and imaging informatics, and enable data-driven decision making in drug discovery and development across all modalities and therapeutic focuses.

The Digital Advisory Board programs aim to ensure our growing data is easily accessible to everyone so we can help our patients and HCPs faster and more effectively.

Enterprise Data Backbone

The Enterprise Data Backbone provides the most user-friendly real-time data for all our internal and external stakeholders, accessible through self-service tools.

We will transform the way we gather and use insights by integrating completely new types of data to better support our patients and others in the healthcare system. We will build capabilities to gather real-world data outside of clinical trials in ways we have not been able to do so far.

We will make data more easily accessible across Takeda and find a way to share it so that we don’t keep re-inventing the wheel. We will push ourselves to think beyond our normal constructs of clinical trials or today's supply chains.

We will explore a future where we get data from all corners of the world: digital communities, wearable devices, social media, surveys, forums, and consumer data, combined with hospital, pharmacy and EMR data.

DATA PLATFORM ENGINEER – Python, Pyspark – Lead (Platform Operations & Delivery)


  • The Data Platform Engineer Lead for “Python/PySpark” is an IT partner for building and delivering data platforms.
  • This lead will collaborate with the “Platform Operations & Delivery”/Enterprise Data Backbone team as the architecture and foundation for Platform Ops are being built.


  • Strategic Business Impact – The role is a key enabler of Takeda’s strategy to become a data-driven enterprise. By connecting with TET-1 and their data teams, the data platform lead will strategically align data, processes and technology to achieve faster time to market for life-saving products.
  • Operational Responsibility – The Data Platform & Ops Delivery lead takes responsibility for the quality and timeliness of customer project deliveries and internal research and development, ultimately helping Takeda make better decisions that improve the quality and efficiency of care for patients.
  • Delivery Scalability – Oversee and influence the other product leads, demand leads and solution architects to build and deliver data platforms and projects aligned with the new DAB program/initiative.
  • Develop data driven solutions utilizing current and next generation technologies to meet evolving business needs. 
  • Ability to quickly identify an opportunity and recommend possible technical solutions. 
  • Develop application systems that comply with the standard system development methodology and concepts for design, programming, backup, and recovery to deliver solutions that have superior performance and integrity. 
  • Contribute to determining programming approach, tools, and techniques that best meet the business requirements. 
  • Understand and follow the PDP process to develop, deploy and deliver the solutions. 
  • Be proactive and diligent in identifying and communicating design and development issues. 
  • Use multiple development languages and tools, such as Python, Spark, and Hive, to build prototypes and evaluate results for effectiveness and feasibility.
  • Operationalize open source data-analytic tools for enterprise use. 
  • Develop real-time data ingestion and stream-analytic solutions leveraging technologies such as Kafka, Apache Spark, Python, Scala and AWS-based solutions.
  • Develop custom data pipelines (cloud and locally hosted).
  • Work heavily within the cloud ecosystem and migrate data.
  • Support deployed data applications and analytical models as a trusted advisor to data scientists and other data consumers, identifying data problems and guiding issue resolution with partner data engineers and source data providers.
  • Provide subject matter expertise in analysis and in preparing specifications and plans for the development of data processes.
  • Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.


  • Bachelor’s degree in engineering or equivalent 
  • 8+ years of experience in the IT field, with a focus on business intelligence (BI) and data analytics
  • Extensive experience with the AWS cloud platform along with technologies including Python, PySpark, Kafka, Scala and Databricks
  • Significant experience in an analytical role in the healthcare industry 
  • Track record of successfully leading and delivering big data programs for a Fortune 500 organization
  • Experience in a leadership position, including project management and team management 
  • Strong time management and organizational skills 
  • Strong communication and presentation skills 
  • Experience working in a collaborative environment, including supporting and teaching other team members 
  • Very strong customer-facing experience 


Cambridge, MA

Worker Type


Worker Sub-Type


Time Type

Full time