leetcode-difficulty-estimator

Semestral project for Neural Networks course at MFF, Charles University.

Authors: Mihal Filip, Trappl Juraj 2024.

The work is summarized in slides.pdf.

Task

Given a text description of a programming problem, predict its difficulty: Easy, Medium, or Hard. We tried both classification and regression approaches.

Data

Our dataset consists of 2366 free programming problems from LeetCode, obtained by querying the LeetCode GraphQL API. The classes are imbalanced: about 50% of the problems are Medium, and the rest are split roughly equally between Easy and Hard.
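Because of this imbalance, a classifier that always answers Medium is already right about half the time, which is a useful reference point when reading the accuracies under Results. A minimal sketch of that baseline follows; the label list is hypothetical and only mirrors the rough proportions stated above, it is not the real dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)


# Hypothetical labels with roughly the proportions described in this section
# (about half Medium, the rest split evenly between Easy and Hard).
labels = ["Medium"] * 50 + ["Easy"] * 25 + ["Hard"] * 25
baseline = majority_baseline_accuracy(labels)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```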

Models

Classification

Perceptron, Linear SVM, MLP classifier, MLP with BERT embeddings as features

Regression

MLP regressor
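For the regression approach, the three classes have to be mapped to numbers and the predicted score mapped back to a class. The sketch below shows one common way to do this, assuming an ordinal encoding Easy=0, Medium=1, Hard=2 and round-to-nearest decoding; the encoding and the helper names `encode` and `decode` are illustrative assumptions, not necessarily what the notebooks do.

```python
import math

# Assumed ordinal encoding (hypothetical): Easy=0, Medium=1, Hard=2.
LABELS = ["Easy", "Medium", "Hard"]


def encode(label):
    """Turn a difficulty label into a regression target."""
    return float(LABELS.index(label))


def decode(score):
    """Round a predicted score to the nearest class, clipping out-of-range values."""
    idx = math.floor(score + 0.5)
    idx = min(max(idx, 0), len(LABELS) - 1)
    return LABELS[idx]


print(encode("Hard"))  # regression target for a Hard problem
print(decode(1.4))     # a regressor output of 1.4 is read as Medium
```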

Results

The best hyperparameters are listed in slides.pdf.

Shallow learning models on tf-idf features. Scores are F1 values averaged over stratified 5-fold cross-validation:

model	average f1 macro	average f1 micro
Perceptron	58.26	58.56
SVM linear kernel	60.94	61.15
MLP classifier	59.42	59.61
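The two columns differ in how per-class results are combined: macro F1 averages the per-class F1 scores with equal weight, while micro F1 pools true positives, false positives and false negatives over all classes (for single-label multiclass problems this equals plain accuracy). A minimal pure-Python sketch on made-up labels, not the project's data:

```python
def f1_scores(y_true, y_pred):
    """Return (macro F1, micro F1) for single-label multiclass predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    tp_sum = fp_sum = fn_sum = 0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    macro = sum(per_class) / len(per_class)
    micro = 2 * tp_sum / (2 * tp_sum + fp_sum + fn_sum)
    return macro, micro


# Made-up example: one Medium problem is misclassified as Hard.
y_true = ["Easy", "Medium", "Medium", "Hard"]
y_pred = ["Easy", "Medium", "Hard", "Hard"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, micro F1 = {micro:.4f}")
```

With scikit-learn, the same quantities are available through `f1_score(y_true, y_pred, average="macro")` and `average="micro"`.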

Classifier on top of contextualized BERT embeddings:

bert-base-uncased
- no fine-tuning (not enough training examples)
- mean pooling, final feature size is (768,)
TensorFlow 2.12
- HyperBand hyperparameter optimization from KerasTuner

model	test accuracy
1. layer embeddings	51.7
2. layer embeddings	49.1
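The mean pooling step mentioned above collapses the per-token embeddings of a problem description into one fixed-size vector. A minimal sketch in plain Python on toy 4-dimensional vectors follows; with bert-base-uncased each token vector has 768 entries, so the pooled vector has 768 as well. Skipping padding tokens via an attention mask is an assumption here, and `mean_pool` is a hypothetical helper name.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens into one fixed-size vector."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three token vectors of size 4; the last one is padding.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # one vector with the same size as a single token vector
```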

In-context learning classification

We used Llama-13b-chat from Hugging Face. We selected one representative problem for each difficulty (aiming for problems with an acceptance rate of ~25%) and built a few-shot prompt from them with LangChain.

from langchain.prompts import PromptTemplate

# Few-shot prompt in the Llama 2 chat format: one worked example per
# difficulty, followed by the problem to classify.
prompt = PromptTemplate.from_template(
    """
    <s>[INST] <<SYS>>
    Task: Given a programming problem description, predict its difficulty.
    The difficulty can be one of easy, medium and hard.

    Example:
    Given a programming problem description: {programming_problem_example_1}, the difficulty is:
    easy

    Example:
    Given a programming problem description: {programming_problem_example_2}, the difficulty is:
    medium

    Example:
    Given a programming problem description: {programming_problem_example_3}, the difficulty is:
    hard

    <</SYS>>
    Now, given a programming problem description: {programming_problem}, the difficulty is:
    [/INST]
    """
)

With 3 in-context examples and 2360 test examples, the model reached ~40% accuracy.
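To score a chat model this way, its free-text reply has to be mapped to one of the three labels before accuracy can be computed. The sketch below shows one simple way to do that, assuming the first label word found in the reply counts as the prediction; `parse_difficulty` and `accuracy` are hypothetical helpers and the replies are made up, so this is not necessarily how the notebook evaluates.

```python
import re


def parse_difficulty(reply):
    """Return the first difficulty label mentioned in the reply, or None."""
    match = re.search(r"\b(easy|medium|hard)\b", reply.lower())
    return match.group(1) if match else None


def accuracy(replies, gold):
    """Fraction of replies whose parsed label matches the gold label."""
    hits = sum(parse_difficulty(r) == g.lower() for r, g in zip(replies, gold))
    return hits / len(gold)


# Made-up replies for illustration; unparseable replies count as errors.
replies = ["easy", "The difficulty is: Hard.", "I am not sure."]
gold = ["Easy", "Hard", "Medium"]
print(f"accuracy: {accuracy(replies, gold):.2f}")
```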

Final words

This is generally a hard task, and a problem's difficulty label is not entirely objective.
The hyperparameters were not optimized extensively.
- They were optimized only for the dense classifiers, which did not have enough training data.
There was not enough training data for the models to generalize well.
Fun project :-)
