Agenda
Two days, all things web data extraction
Join us for two action-packed days featuring expert talks, interactive workshops, and networking opportunities.
October 10th, 2024 | Austin, Texas
Main Event
Harnessing the Power of Large Language Models for Advanced Data Engineering and Data Science
Neelabh Pant | Senior Manager Data Science @ Walmart
Learn how Large Language Models (LLMs) automate data cleaning and preprocessing. Discover methods for handling missing values, outlier detection, data normalization, and feature engineering to enhance efficiency and accuracy.
Web Data Extraction Mastery: Real-World Implementations and ROI-Driven Success Stories
John Fraser | Founder @ PartsAsap
Learn practical strategies for implementing scalable web data extraction. Gain insights into strategic planning, innovation, and operational efficiency to maximize ROI and drive organizational growth.
A Practical Demonstration of How to Responsibly Use Big Data to Train LLMs
Joachim Asare | AI/ML Engineer & Master’s in Design Engineering @ Harvard University
Explore ethical methods for extracting and leveraging big data to train LLMs, focusing on privacy, transparency, and fairness throughout the AI development lifecycle.
How We Transformed Zyte's Data Business with Cutting-Edge AI Technology
Iain Lennon | CPO @ Zyte
Learn how we leveraged a composite AI approach of in-house trained models and open-source LLMs to revolutionize our data operations, and how to apply these insights to transform your own data processes.
The Future of Proxy Technology: Trends, Innovations, and Real-World Applications in Residential, Mobile, and Datacenter Proxies
Shane Evans - CEO @ Zyte | Jason Grad - CEO and Co-Founder @ Massive | Neil Emeigh - CEO @ Rayobyte | Ovidiu Dragusin - Business Development Director @ ServersFactory | Vlad Harmanescu - Proxy Department Manager @ Pubconcierge | Tal Klinger - Co-founder & CEO @ The Social Proxy
Zyte CEO Shane Evans will host a panel of proxy market leaders, discussing emerging trends, breakthrough innovations, and practical applications across residential, mobile, and datacenter proxies.
Distributed Intelligence for Distributed Data
Matthew Blumberg | Co-Founder @ Charity Engine
Discover how to find and learn from data at web scale by pairing distributed computing with distributed data collection. Explore techniques for efficiently processing and analyzing massive datasets across the web.
Navigating the Legal Landscape of Web Data Extraction | Legal Panel
Sanaea Daruwalla - Chief Legal & People Officer @ Zyte | Hope Skibitsky - Partner @ Quinn Emanuel | Stacey Brandenburg - Shareholder @ ZwillGen | Don D'Amico - Founder & CEO @ Glacier Network and former General Counsel at Neudata
Join Zyte's Chief Legal & People Officer, Sanaea Daruwalla, as she leads a panel of top legal minds sharing crucial insights on compliance, regulations, and ethical considerations in web data extraction.
Advanced Techniques and Innovations for Extracting Specific Data Attributes from Diverse Sources
Iván Sánchez | Senior Data Scientist @ Zyte
Discover the latest techniques and innovations in data extraction. Learn how to precisely pull specific data attributes from various sources using advanced methods like machine learning and natural language processing.
Cache, Cookies & Reconnects: How to Accelerate Your Scrapes with Session Management
Joel Griffith | CIO @ Browserless
Get tips on optimizing headless browsers for web scraping with Playwright and Puppeteer. Learn how Browserless enhances managed browser use in creative marketing and data extraction.
How to Feed Large Language Models (LLMs) with Data from the Web
Jan Čurn | CEO @ Apify
Gain a comprehensive understanding of feeding large language models (LLMs) with web data, addressing challenges like blocking, dynamic content rendering, and data volume.
Enabling Large Language Model Agents to Understand the Web
Asim Shrestha | Co-founder & CEO @ Reworkd AI
Discover the future of AI in web data extraction. Explore how open-source AI makes public data accessible and see demonstrations of innovative techniques transforming data extraction.
October 9th, 2024 | Austin, Texas
Extract Labs
Deep Dive into Zyte AI Spiders
This session is designed to give you in-depth knowledge of the mechanics, applications, and adaptation of Zyte AI Spiders. It is tailored for developers, data scientists, and anyone keen on leveraging the power of advanced web scraping technologies.
Efficient Data Extraction from Modern Web Apps
Dive into advanced data extraction methods, including strategies for tackling JavaScript-heavy websites, choosing between browser-based and headless approaches, and expert techniques for handling AJAX requests and network capture. Learn how to automate complex user interactions for efficient data collection.
Master Advanced Web Scraping Techniques
In this workshop, Fabien will give live demos of how to bypass many protections and share tips and techniques for doing so. The workshop will demonstrate many open-source tools and the Zyte API (+Scripting API) to tackle advanced protections.
Join Us at Extract Summit 2024!
Don't miss out on this incredible opportunity to expand your knowledge, grow your network, and be part of a thriving community of data experts. Reserve your spot today!