Our Tasks
QuantumCLEF 2025 addresses three different tasks involving computationally intensive problems that are closely related to the Information Access field: Feature Selection, Instance Selection, and Clustering. There is one problem for each task, and each problem is solvable with the QA paradigm. For each task, participants are asked to submit solutions obtained with both Quantum Annealing and Simulated Annealing, so that the two methods can be compared in terms of efficiency and effectiveness.
Task 1: Feature Selection
Apply quantum annealers to find the most relevant subset of features to train a learning model, e.g., for ranking. This problem is very impactful, since many IR and RS systems involve the optimization of learning models, and reducing the dimensionality of the input data can improve their performance.
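To give an idea of how such a problem can be cast into the QUBO form required by annealers, the following Python sketch encodes a common feature-selection objective (reward features correlated with the target, penalize redundant pairs, constrain the number of selected features) and solves it with the dwave-neal Simulated Annealing sampler. This is only an illustrative sketch with toy data and hand-picked weights, not the official task formulation; on the QPU side, the same binary quadratic model could be submitted to a D-Wave sampler instead.

# Illustrative QUBO sketch for feature selection (not the official task formulation).
# Assumptions: toy data, a budget of k features, hand-picked penalty weights alpha and lam.
import numpy as np
import dimod
from neal import SimulatedAnnealingSampler

rng = np.random.default_rng(0)
X = rng.random((200, 20))                        # toy feature matrix
y = (X[:, 0] + X[:, 1] > 1).astype(float)        # toy relevance target
n, k, alpha, lam = X.shape[1], 5, 1.0, 2.0

rel = np.abs([np.corrcoef(X[:, i], y)[0, 1] for i in range(n)])   # feature-target correlation
red = np.abs(np.corrcoef(X, rowvar=False))                        # feature-feature correlation

Q = {}
for i in range(n):
    # Linear terms: reward relevance, plus the expansion of the lam * (sum_i x_i - k)^2 penalty.
    Q[(i, i)] = -rel[i] + lam * (1 - 2 * k)
    for j in range(i + 1, n):
        # Quadratic terms: penalize redundant pairs and enforce the cardinality constraint.
        Q[(i, j)] = alpha * red[i, j] + 2 * lam

bqm = dimod.BinaryQuadraticModel.from_qubo(Q)
best = SimulatedAnnealingSampler().sample(bqm, num_reads=100).first.sample
selected = sorted(i for i, v in best.items() if v == 1)
print("Selected features:", selected)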
Task 1A: The IR Task
Select the most relevant features in the considered datasets to train a LambdaMART model and thus achieve the highest score. A baseline using Recursive Feature Elimination (RFE) with a Logistic Regression classifier will be used as an overall alternative.
Datasets
- MQ2007 (one of the LETOR datasets)
- ISTELLA (this dataset poses an additional challenge, since its number of features cannot fit directly on the QPU)
Metrics
The selected features will then be used to train a LambdaMART model, whose performance will be measured on the Test Dataset in terms of nDCG@10.
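For reference, nDCG@10 compares the Discounted Cumulative Gain of the produced ranking with that of the ideal ranking of the same documents. A minimal, self-contained version is sketched below; the official evaluation may use a slightly different gain or discount convention.

import numpy as np

def ndcg_at_k(relevances_in_ranked_order, k=10):
    # nDCG@k with exponential gain and log2 discount (a common convention).
    rel = np.asarray(relevances_in_ranked_order, dtype=float)[:k]
    dcg = np.sum((2 ** rel - 1) / np.log2(np.arange(2, rel.size + 2)))
    ideal = np.sort(np.asarray(relevances_in_ranked_order, dtype=float))[::-1][:k]
    idcg = np.sum((2 ** ideal - 1) / np.log2(np.arange(2, ideal.size + 2)))
    return dcg / idcg if idcg > 0 else 0.0

# Example: graded relevance labels of the top documents returned for one query.
print(ndcg_at_k([2, 3, 0, 1, 2, 0, 0, 1, 0, 0]))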
Task 1B: The RS Task
The task is to select the subset of features that produces the best recommendation quality when used by an Item-Based KNN recommendation model. The KNN model computes the item-item similarity as the cosine of the feature vectors, applying a shrink term of 5 to the denominator; the number of neighbours k is 100. The baselines for this task are the same Item-Based KNN recommendation model trained using all the features, and the same model trained using the features selected by a Bayesian search that optimizes the model's recommendation effectiveness.
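To make the baseline concrete, the sketch below computes a shrunk cosine item-item similarity from an item-feature matrix and keeps the 100 most similar neighbours per item, mirroring the description above; the actual baseline implementation may differ in its details.

# Illustrative shrunk cosine item-item similarity for an Item-Based KNN recommender.
import numpy as np
import scipy.sparse as sp

def shrunk_cosine_topk(icm, shrink=5, k=100):
    # icm: sparse (n_items x n_features) matrix; returns a sparse top-k similarity matrix.
    icm = sp.csr_matrix(icm, dtype=np.float64)
    norms = np.sqrt(np.asarray(icm.multiply(icm).sum(axis=1)).ravel())
    sim = (icm @ icm.T).toarray()                       # item-item dot products (dense, for a sketch)
    sim /= np.outer(norms, norms) + shrink              # cosine denominator plus the shrink term
    np.fill_diagonal(sim, 0.0)                          # no self-similarity
    for row in sim:                                     # keep only the k largest values per row
        if row.size > k:
            row[row < np.partition(row, -k)[-k]] = 0.0
    return sp.csr_matrix(sim)

# Recommendation scores for all users are then URM @ similarity (users x items).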
Datasets
The dataset is private and concerns a music recommendation task. It contains both collaborative data and two different sets of item features:
- 150_ICM: Contains 150 features for each item.
- 500_ICM: Contains 500 features for each item.
The User Rating Matrix (URM) contains tuples in the form (UserID, ItemID), listing which user interacted with which item. The Item Content Matrix (ICM) contains tuples in the form (ItemID, FeatureID, Value); note that the ICM is sparse, and any missing (ItemID, FeatureID) pair should be treated as missing data, commonly assumed to have value 0. The features refer to different types of descriptors and tags associated with the songs; some of them have been normalized. The Training Dataset can be downloaded HERE. Note that a private holdout of the data will be used for testing.
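A minimal way to load the two matrices into sparse format is sketched below. The file names and column names are assumptions; only the (UserID, ItemID) and (ItemID, FeatureID, Value) layouts come from the description above, and missing pairs are implicitly treated as 0.

# Hypothetical loader: file names and column names are assumptions.
import numpy as np
import pandas as pd
import scipy.sparse as sp

urm_df = pd.read_csv("URM_train.csv")     # expected columns: UserID, ItemID
icm_df = pd.read_csv("ICM_150.csv")       # expected columns: ItemID, FeatureID, Value

n_users = urm_df["UserID"].max() + 1
n_items = max(urm_df["ItemID"].max(), icm_df["ItemID"].max()) + 1
n_features = icm_df["FeatureID"].max() + 1

# Implicit interactions: 1 where the user interacted with the item.
urm = sp.csr_matrix((np.ones(len(urm_df)), (urm_df["UserID"], urm_df["ItemID"])),
                    shape=(n_users, n_items))
# Sparse item-feature matrix: missing (ItemID, FeatureID) pairs are simply 0.
icm = sp.csr_matrix((icm_df["Value"], (icm_df["ItemID"], icm_df["FeatureID"])),
                    shape=(n_items, n_features))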
Metrics
The selected features will be used to train an Item-Based KNN recommendation model, whose performance will be measured on the Test Dataset in terms of nDCG@10.
Submissions
Participants should submit the final set of features selected by their own solution, using only the provided Training Datasets. Each participating team can provide at most 5 different subsets of features, so that it is possible to try different alternatives and achieve the best selection. Submissions should be produced with both Quantum Annealing and Simulated Annealing, in order to compare the performance of quantum annealers against a possible traditional hardware alternative.
The submissions should be txt files, named according to the following format:
[Task]_[Dataset]_[Method]_[Groupname]_[SubmissionID].txt
and containing, for example:
1
4
5
8
...
44
45
['id1', 'id2', ..., 'idn']
where each line reports one of the features that is kept; the removed features should not be listed in this file. In this example, features 1, 4, 5, 8, ..., 44, 45 are kept while features 2, 3, 6, 7, ... are removed. The last line of the file reports the ids of the solved problems that relate to this submission. For example, if you solved 3 different problems with QA or SA to obtain the final submission (e.g., you split the problem into subproblems and solved them separately), you should provide their ids in a list at the end of the submission file. These ids can be retrieved directly from the code or through the dashboard.
The submission files should be placed in your workspace, in the directory called /config/workspace/submissions, and the file names should comply with the following rules:
- [Task]: it should be either 1A or 1B based on the task the submission refers to
- [Dataset]: it should be either MQ2007, ISTELLA, 150_ICM or 500_ICM based on the dataset used
- [Method]: it should be either QA or SA based on the method used
- [Groupname]: the name of your group
- [SubmissionID]: a submission ID that must be the same for the submissions using the same algorithm but performed with different methods (e.g., QA or SA)
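A small helper that writes a submission file following the format and naming rules above may look like the sketch below; the feature ids and problem ids in the example call are placeholders.

# Writes a Task 1 submission file following the format described above (placeholder values).
import os

def write_feature_submission(task, dataset, method, group, submission_id,
                             kept_features, problem_ids,
                             base_dir="/config/workspace/submissions"):
    name = f"{task}_{dataset}_{method}_{group}_{submission_id}.txt"
    path = os.path.join(base_dir, name)
    with open(path, "w") as f:
        for feat in kept_features:
            f.write(f"{feat}\n")              # one kept feature per line
        f.write(str(problem_ids) + "\n")      # last line: ids of the solved problems
    return path

# Example:
# write_feature_submission("1A", "MQ2007", "SA", "MyGroup", "1", [1, 4, 5, 8], ['id1', 'id2'])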
Task 2: Instance Selection
This section is under construction...
Task 3: Clustering
Use QA to cluster documents, represented as embeddings, to ease the browsing of large collections. Clustering can help organize large collections, support users in exploring a collection, and provide search results similar to a given query. Furthermore, it can be used to divide users according to their interests or to build user models from the cluster centroids, speeding up the runtime of the system or improving its effectiveness for users with limited data. Clustering is, however, a very complex task for QA, since the architecture of quantum annealers only allows clustering a limited number of items into a limited number of clusters. A baseline using K-medoids clustering with cosine distance will be used as an overall alternative.
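One way to phrase such a clustering problem in QUBO form, loosely following the K-medoids baseline, is to select k medoids that are central and mutually far apart, with a penalty enforcing the number of medoids, and then assign every document to its nearest medoid. The sketch below solves this with the dwave-neal Simulated Annealing sampler on toy embeddings; the weights and the formulation itself are illustrative assumptions, not the official approach.

# Illustrative medoid-selection QUBO solved with Simulated Annealing (not the official approach).
import numpy as np
import dimod
from neal import SimulatedAnnealingSampler
from sklearn.metrics.pairwise import cosine_distances

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 768))        # toy stand-in for the provided sentence embeddings
D = cosine_distances(emb)
n, k = len(emb), 10
gamma, lam = 1.0 / n, 2.0                # hand-picked weights (assumptions)

Q = {}
for i in range(n):
    # Centrality term plus the expansion of the lam * (sum_i z_i - k)^2 cardinality penalty.
    Q[(i, i)] = D[i].mean() + lam * (1 - 2 * k)
    for j in range(i + 1, n):
        # Reward medoids that are far apart from each other; keep roughly k of them.
        Q[(i, j)] = -gamma * D[i, j] + 2 * lam

bqm = dimod.BinaryQuadraticModel.from_qubo(Q)
best = SimulatedAnnealingSampler().sample(bqm, num_reads=200).first.sample
medoids = [i for i, v in best.items() if v == 1]
assignment = np.argmin(D[:, medoids], axis=1)     # assign each document to its nearest medoid
print(len(medoids), "medoids selected")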
Task 3A: The IR Task
Obtain a list of representative centroids of the given dataset of embeddings (each embedding corresponds to a sentence taken from Yahoo). The cluster quality will then be measured both with standard evaluation measures for clustering and with suitable test queries that will be used to retrieve the most relevant documents for each query. Instead of comparing the query embedding with every document embedding in the corpus, the search will be restricted to the clusters that are most likely to contain relevant documents, thereby reducing the search space and improving retrieval speed.
Datasets
A split of the ANTIQUE dataset in which each sentence, taken from Yahoo, is turned into an embedding using a transformer model. The split contains roughly 6500 sentences; another, smaller dataset of roughly 2200 sentences is also provided to test the clustering algorithm.
Metrics
- the Davies-Bouldin Index will be used to measure the overall cluster quality;
- nDCG@10 will be used to measure the overall retrieval quality, based on a set of 50 queries. Each query will be transformed into its corresponding embedding; Cosine Similarity is then used to find the closest centroid and its corresponding cluster of documents; finally, all the documents belonging to that cluster are retrieved and ranked by their Cosine Similarity to the query (a sketch of this procedure follows the list).
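The two evaluation steps can be sketched as follows; the function and variable names are illustrative, and the official evaluation scripts may differ.

# Sketch of the evaluation described above: cluster quality plus cluster-restricted retrieval.
import numpy as np
from sklearn.metrics import davies_bouldin_score
from sklearn.metrics.pairwise import cosine_similarity

def cluster_quality(embeddings, labels):
    # Lower Davies-Bouldin values indicate more compact, better-separated clusters.
    return davies_bouldin_score(embeddings, labels)

def retrieve(query_emb, centroids, clusters, doc_embs, doc_ids, top_k=10):
    # clusters: one list of document indices per centroid, in the same order as centroids.
    c = int(np.argmax(cosine_similarity(query_emb.reshape(1, -1), centroids)))
    members = clusters[c]
    # Rank only the documents of the selected cluster by similarity to the query.
    sims = cosine_similarity(query_emb.reshape(1, -1), doc_embs[members]).ravel()
    return [doc_ids[members[i]] for i in np.argsort(-sims)[:top_k]]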
Submissions
Participants should submit lists of 10, 25 and 50 vectors representing the final centroids obtained with their clustering algorithm. Each centroid should be followed by the list of documents that belong to its cluster. Each team can provide at most 5 different submissions. Submissions should be produced with both Quantum Annealing and Simulated Annealing, in order to compare the performance of quantum annealers against a possible traditional hardware alternative.
The submissions should be txt files, named according to the following format:
[Centroids]_[Method]_[Groupname]_[SubmissionID].txt
and containing, for example:
[
{'centroid' : [coord1, coord2, ..., coord767, coord768], 'docs': ['id1', 'id2', ..., 'idn']},
{'centroid' : [coord1, coord2, ..., coord767, coord768], 'docs': ['id1', 'id2', ..., 'idn']},
...
{'centroid' : [coord1, coord2, ..., coord767, coord768], 'docs': ['id1', 'id2', ..., 'idn']},
]
['id1', 'id2', ..., 'idn']
where each line reports one of the centroids with its coordinates and the documents associated with it, identified only by their ids. The last line of the file reports the ids of the solved problems that relate to this submission. For example, if you solved 3 different problems with QA or SA to obtain the final submission (e.g., you split the problem into subproblems and solved them separately), you should provide their ids in a list at the end of the submission file. These ids can be retrieved directly from the code or through the dashboard.
The submission files should be placed in your workspace, in the directory called /config/workspace/submissions, and the file names should comply with the following rules:
- [Centroids]: it should be either 10, 25 or 50 based on the number of centroids
- [Method]: it should be either QA or SA based on the method used
- [Groupname]: the name of your group
- [SubmissionID]: a submission ID that must be the same for the submissions using the same algorithm but performed with different methods (e.g., QA or SA)
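A small helper that writes a clustering submission file following the format and naming rules above may look like the sketch below; the directory and the dictionary-per-line layout mirror the example, while the values are placeholders.

# Writes a clustering submission file matching the example format above (placeholder values).
import os

def write_clustering_submission(n_centroids, method, group, submission_id,
                                centroids, clusters, problem_ids,
                                base_dir="/config/workspace/submissions"):
    # centroids: list of coordinate lists; clusters: list of document-id lists (same order).
    name = f"{n_centroids}_{method}_{group}_{submission_id}.txt"
    path = os.path.join(base_dir, name)
    with open(path, "w") as f:
        f.write("[\n")
        for centroid, docs in zip(centroids, clusters):
            f.write(str({'centroid': list(centroid), 'docs': list(docs)}) + ",\n")
        f.write("]\n")
        f.write(str(problem_ids) + "\n")      # last line: ids of the solved problems
    return path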
Deadlines
Here you can find all the important deadlines; all of them are strict:
- Registration closes: April 22, 2024
- Runs submission deadline: May 6, 2024
- Evaluation results out: May 20, 2024
- Participants' papers submission deadline: May 31, 2024. Follow the instructions. Click here for the LaTeX template.
- Notification of acceptance for participants' papers: June 24, 2024
- Camera-ready participants' papers submission: July 8, 2024
- QuantumCLEF Workshop: September 9-12, 2024 during the CLEF Conference