Arnold Alexander
1 min read
Join us for a webinar with Konstantin Lopukhin, Head of Data Science at Zyte. This session is designed for AI enthusiasts, data scientists, and machine learning engineers who want to serve large language models more efficiently.
In this session, you will learn about:
Techniques such as quantization, continuous batching, and speculative decoding to enhance efficiency.
The pros and cons of various implementations, including ExLlamaV2, vLLM, and TensorRT-LLM.
Guidance on selecting the best approach based on model size, available hardware, and target performance metrics.
Whether you are looking to improve your current model serving strategy or planning to implement a new one, this webinar will give you the insights and practical advice needed to optimize your deployment for throughput.
For any follow-up questions after watching the webinar, join our Discord community and engage directly with the team. We are a thriving community of 3,000+ web scraping enthusiasts, committed to sharing insights, exploring new technologies, and advancing web scraping.