DeepikaMobileDeveloper/women-in-ai-hackathon

 
 

Women in AI Hackathon

Registration Page, Discord, Looping Deck, Final Presentation template, Team Registration form

Basic RAG locally, Basic RAG in Google Colab

Introduction

Welcome to the first Women in AI Hackathon, hosted by Zilliz and sponsored by TwelveLabs, Arize AI, OmniStack, StreamNative, AWS, and Mistral.

This repo provides all required information for the day and serves as the starting point for your submission. Direct any questions to Stefan Webb before the day of the hackathon, and to the Discord or in-person mentors on the day.

Schedule

  • 8.30-9.00: Check-in, light breakfast
  • 9.00-9.30: Kickoff
  • 9.30-10.00: Team reveal and challenge recap
  • 10.00: Let the Hacking Begin!
  • 12.00-13.00: Lunch and speakers
  • 13.00-17.30: More Hacking!
  • 17.30: Hard submission and code freeze
  • 17.30-18.00: Work on presentations
  • 18.00-19.30: Showcase your project
  • 19.30-20.00: Judges award prizes

Before the Day

There are a couple of items we recommend completing in advance of the hackathon:

GitHub, Discord

If you have not already, set up a GitHub account plus the necessary Git tooling on your system. Also, join the Discord server for the hackathon and introduce yourself.

Set Up Dev Environment

Clone this repo and set up your development environment. Your environment must allow you to develop a solution within the constraints of the prompt, that is, developing a RAG application in Python using Milvus or Zilliz Cloud.

We recommend:

Please confirm that you can run the starter notebooks on your platform:

You may also wish to confirm that you can start and use a Milvus Standalone deployment locally and access the free tier of Zilliz Cloud.

Download Datasets

We recommend downloading in advance any datasets you wish to explore with your teammates to save time and reduce stress on the on-site WiFi.

Here are some suggested open-source datasets:

Note

The choice of dataset and data modality is an excellent opportunity to showcase your creativity!

It may help to choose datasets whose vector embeddings have been pre-calculated, or else to calculate and save them in advance. Otherwise, you can calculate embeddings for the dataset locally during the hackathon, or use free credits provided by our sponsors to perform this embedding in the cloud.

Here are some suggested open-source embedding models for text:

Note

You are not restricted to working with text. Consider image, video, audio, 3D meshes, graphs, and other modalities. Twelve Labs offers some excellent models for video embedding and inference. See their website for more details.

Download Foundation Models

We also recommend downloading in advance any foundation models you plan to use locally during the hackathon. Here are some suggested open-source general-purpose foundation models (also look for quantized versions on HF):

And specialized fine-tuned models:

Important

Some foundation models on HuggingFace, for example Llama 3.x, require permission from the authors before you can download them. It can take up to several days for permission to be granted, so we recommend that you request it in advance of the hackathon.

Note

Multimodal models offer many avenues for creativity, and a technically sophisticated solution is likely to make use of several fine-tuned models for specific parts of the pipeline.

Tip

As an alternative, see here for free credits provided by our sponsors to perform model inference.

Zilliz, AWS, and Mistral each offer a generous free tier for their cloud services.

Twelve Labs has kindly provided 10 free hours of credit for their inference service, including video foundation models.

OmniStack is providing over $500 in credits for their inference, monitoring, and deployment services.

StreamNative is offering $200 in free credits for their cloud data platform.

Let's Hack!

Overview

Between 9.30 and 10am, we will reveal the team assignments. Teams comprise 3-5 hackers of varying experience and backgrounds. Of course, you may negotiate a team change with your fellow hackers if you wish, although we encourage you to pair with people you have not previously met.

After settling on your teams, please decide on a team lead and complete the Team Registration form. You will have from 10am to 5.30pm to develop a submission with your team. Before 5.30pm, push your final submission to your cloned repo.

Important

After 5.30pm, no further code changes will be considered by the judges.

Additional time from 5.30-6.00pm is provided to work on your presentation (see submission instructions below). Finally, each team will make a short presentation before the judges make a decision and announce the results!

Prompt

Build a retrieval-augmented generation (RAG) system for one of the following applications:

  • A recommender system;
  • A question-answering system for a specialized domain;
  • A product review summarizer;
  • A personalized job recruiter; or,
  • Something of your own imagination!

Your submission must run in Python and use Milvus (any deployment type) or Zilliz Cloud as the underlying vector database. We recommend but do not require your submission to use Jupyter Notebook or Gradio.
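For teams new to RAG, the shape of the pipeline (retrieve relevant context, then build a prompt for the model) can be sketched in a few lines. This is only a toy illustration using the standard library. The bag-of-words `embed` function and the in-memory `docs` list are stand-ins, chosen for this sketch, for a real embedding model and for Milvus or Zilliz Cloud, which your actual submission must use. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list, top_k: int = 2) -> list:
    """Return the top_k documents most similar to the query.

    In a real submission this step is a vector search against Milvus.
    """
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]


def build_prompt(query: str, context: list) -> str:
    """Augment the question with retrieved context before sending it to a model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical mini-corpus for illustration.
docs = [
    "Milvus is an open-source vector database.",
    "Gradio builds simple web demos in Python.",
    "Reciprocal Rank Fusion combines several rankings.",
]
question = "What is a vector database?"
prompt = build_prompt(question, retrieve(question, docs, top_k=1))
print(prompt)
```

The generation step is left out on purpose: the resulting `prompt` string would be passed to whichever foundation model your team chooses.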

You may use agentic steps in your RAG pipeline. Free credits from our sponsors are available for embedding and foundation model inference.

Note

We provide suggested RAG applications, datasets, models, etc. to give some structure to your starting point. However, we want to emphasize that these are only suggestions: follow your creativity and passion!

Submission Instructions

Your chosen team lead submits your team's code via their fork of this GitHub repo.

Important

Set the necessary permissions so that the judges have access both to your GitHub repo and the final presentation slides.

  • 10am - 5.30pm: Hack, hack, hack! Submit your code via pushes to your forked GitHub repo throughout the day.

Important

Ensure your final code is submitted before 5.30pm!

  • 5.30pm - 6pm: Finalize your presentation slides, saving them to your copy of the Google Slides template.
  • 6pm - 7.30pm: Each team presents their project via Jupyter notebook, Gradio app, or some other way.
  • 7.30pm - 8pm: Judges announce results!

Judging Criteria

The judges will rank the teams' submissions on 3 criteria separately:

  • creativity;
  • technical sophistication; and,
  • potential business impact.

In the spirit of RAG, each judge's rankings of the teams will be combined into a single per-judge score with Reciprocal Rank Fusion (RRF). The per-judge score of a team is:

k = 10
score = 1 / (rank_creativity + k) + 1 / (rank_technical + k) + 1 / (rank_business + k)

where the rank terms denote the team's ranking for a given judge and criterion. The final score per team is the average of team scores across judges. What this means is that the winning team must score highly across all 3 criteria with a consensus across judges.
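As a rough illustration of the scoring described above, the following sketch computes a per-judge RRF score and the average across judges. It assumes ranks are 1-based (1 = best), which the formula does not state explicitly. The function names and the example rankings are hypothetical.

```python
from statistics import mean

K = 10  # constant k from the formula above


def judge_score(rank_creativity: int, rank_technical: int,
                rank_business: int, k: int = K) -> float:
    """RRF score that one judge gives a team, from its rank on each criterion."""
    ranks = (rank_creativity, rank_technical, rank_business)
    return sum(1 / (rank + k) for rank in ranks)


def final_score(ranks_per_judge: list, k: int = K) -> float:
    """Average of the per-judge RRF scores for one team."""
    return mean(judge_score(*ranks, k=k) for ranks in ranks_per_judge)


# Hypothetical example: two judges, each giving
# (creativity, technical, business) ranks to two teams.
team_a = [(1, 2, 1), (1, 1, 3)]
team_b = [(2, 1, 2), (2, 2, 1)]

print(f"Team A: {final_score(team_a):.4f}")
print(f"Team B: {final_score(team_b):.4f}")
```

Because each term is 1 / (rank + k), a single poor rank costs relatively little, while consistently high ranks across criteria and judges add up.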

We will provide a breakdown of team scores by final score and score per criterion separately (naturally, with error bars).

Prizes

  • First prize: $1,000, $10,000 AWS credits, Zilliz Blog Opportunity, Social Mentions, Swag
  • Second prize: $700, Zilliz Blog Opportunity, Social Mentions, Swag
  • Third prize: $500, Social Mentions, Swag
  • Top score using Mistral models: $500 Mistral credits
  • Everybody: Satisfaction from a job well done!

Resources

Sponsors (Alphabetical Order)

More details of our sponsors and how to use their free cloud credits are provided here.

Gold

Silver

Prizes

About

Starting point for the Women in AI RAG Hackathon, Jan 25 2025
