September 13-16, 2022
Dublin, Ireland + Virtual

Thursday, September 15 • 16:10 - 16:50
Truly Open Lineage - Mandy Chessell, Pragmatic Data Research Ltd

One of the most requested metadata use cases is lineage: the ability to understand the origin of your data and the processing (reformatting, enrichment, merging, ...) it has gone through between that origin and your AI model. Lineage helps to build trust in your model because it shows you have used appropriate data. Many individual technologies provide some lineage support, but each covers only its own processing. Some data catalogs provide proprietary ways to gather lineage from many sources. However, this is expensive to implement and makes the lineage information available only through the data catalog. Now three open source projects from LF AI and Data have come together to create a truly open ecosystem for lineage. Egeria provides open metadata that describes the data sources, data structures, data profiling results and the data pipelines. OpenLineage provides the event mechanism that records each time a data pipeline runs. Marquez provides visualization for lineage.

In this talk you will learn about:
* What lineage is and how it is used
* What makes lineage difficult to collect and maintain
* How the open ecosystem for lineage works
* How you can use lineage in your data science tools (using Jupyter Notebooks as an example)

Speakers
Mandy Chessell

Founder, Pragmatic Data Research Ltd
Mandy Chessell CBE FREng CEng FBCS honFIED is a trusted advisor to executives from large organisations, working with them to develop their strategy and architecture relating to the governance, integration and management of information. Mandy worked for IBM for 35 years, the last 15...



Thursday September 15, 2022 16:10 - 16:50 IST
Wicklow Meeting Room 2 (Level 2)