Why Attend?
KX’s Generative AI Meetups are for developers at all stages of their AI journey. We focus on helping teams bring AI into production applications through talks from companies with specialized expertise in AI development.
This meetup focuses on “Embeddings at Scale,” a crucial topic for deploying GenAI use cases in production, especially with vector databases. We have a great lineup of speakers and hope you can join us to further your knowledge and network with peers in the industry!
Speakers
Charles Frye
AI Engineer, Modal Labs
Amog Kamsetty
Ex-Founding ML Engineer, Anyscale
Bin Fan
Founding Member, Alluxio
Agenda
5:30pm – 6:00pm Welcome & Registration
6:00pm – 7:30pm Showcase
Scaling Temporal Similarity Search for Technical Analysis – Michael Ryaboy, Developer Advocate, KDB.AI
Michael will cover how Temporal Similarity Search can be applied to technical analysis and anomaly detection, and how it scales to hundreds of millions of vectors.
Effortlessly Infinite Embeddings with Modal – Charles Frye, AI Engineer, Modal Labs
In this talk, Modal AI Engineer Charles Frye will share projects built using embeddings, including a recommendation system for a simulated Twitter, vibes-based search of California, and more.
Designing a Scalable Distributed Cache for ML Training Datasets in the Cloud – Bin Fan, Founding Member, Alluxio
In the rapidly evolving landscape of machine learning, efficiently managing and accessing large datasets is critical for training models at scale. In this session, Bin Fan, Founding Engineer at Alluxio, will share insights from the journey of creating a scalable and reliable distributed cache specifically designed for ML training datasets and checkpoints on popular frameworks like PyTorch, Ray, and TensorFlow.
The talk will explore the distinct requirements of machine learning compared to traditional big data analytics, the domain in which Alluxio was initially developed. Bin will highlight the challenges posed by ML-specific data access patterns, diverse data formats, and the need for optimal resource management in these environments, and discuss how these factors shaped the design and optimization of a distributed cache that meets the stringent demands of modern ML workloads.
The session will conclude with an analysis of benchmark results, showcasing the performance gains and scalability improvements achieved through this distributed caching solution, and how it leads to enhanced GPU utilization and overall efficiency in ML training pipelines.
Scaling LLM and Embedding Generation Workloads with Ray – Amog Kamsetty, Ex-Founding ML Engineer, Anyscale
7:30pm – 9:00pm Networking & Complimentary Food and Beverages
Venue
Marigold Event Space
194 Church St.
San Francisco, CA 94114
The Marigold entrance is just to the right of Churchill’s main door; please look for a black gate door and a security guard.