immuneML is a software platform for machine learning analysis of adaptive immune receptors and repertoires (AIRR). This dataset contains the original specification files and complete results for immuneML use case 2: Extending immuneML with a deep learning component for predicting antigen specificity of paired receptor data. For more information about immuneML, see the documentation: https://docs.immuneml.uio.no/

The immuneML specification files in this dataset (use_case_AVFDRKSDAK.yaml, use_case_GILGFVFTL.yaml, use_case_KLGGALQAK.yaml) are compatible with immuneML version 1.1.1. Results (AVFDRKSDAK.zip, GILGFVFTL.zip, KLGGALQAK.zip) were generated with immuneML version 1.1.1 and TCRdist3 version 0.1.9 (doi: 10.1101/2020.12.24.424260).

In this use case, several methods are compared for predicting antigen binding of immune receptors and for clustering immune receptor sequences. The three data files (KLGGALQAK.tsv, AVFDRKSDAK.tsv and GILGFVFTL.tsv) contain examples of immune receptors that bind and do not bind to the epitopes KLGGALQAK, AVFDRKSDAK and GILGFVFTL, respectively.
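As a rough sketch of how the three specification files could be re-run, the snippet below builds one immuneML command line per epitope. It assumes immuneML 1.1.1 is installed and exposes its `immune-ml` command, which takes a specification file and a result folder. The `results` folder and the `_rerun` suffix are hypothetical names chosen for illustration, and the paths to the data files inside each specification may need to be adjusted to the local setup.

```python
import subprocess
from pathlib import Path

# Epitope names taken from the file names listed in this dataset.
EPITOPES = ["AVFDRKSDAK", "GILGFVFTL", "KLGGALQAK"]


def build_command(epitope: str, output_root: Path) -> list[str]:
    """Return the immune-ml call for one epitope's specification file."""
    spec_file = f"use_case_{epitope}.yaml"
    # Hypothetical layout: one result folder per epitope under output_root.
    result_dir = output_root / f"{epitope}_rerun"
    return ["immune-ml", spec_file, str(result_dir)]


def run_all(output_root: Path, dry_run: bool = True) -> list[list[str]]:
    """Build the commands for all epitopes; execute them only if dry_run is False."""
    commands = [build_command(epitope, output_root) for epitope in EPITOPES]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # Requires immuneML to be installed in the active environment.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all(Path("results"), dry_run=True)
```

With `dry_run=True` the commands are only printed, which makes it possible to check file names and paths before starting a potentially long analysis.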
For detailed information about this use case, the input datasets, and versions of the specification files compatible with the latest version of immuneML, see the documentation for this use case: https://docs.immuneml.uio.no/usecases/extendability_use_case.html

This use case consists of two parts:

- First, immuneML was used to compare the following methods for predicting antigen binding of paired-chain immune receptors, and to produce antigen specificity-determining motifs:
  - Logistic regression with k-mer frequency encoding
  - CNN with one-hot encoding
  - A k-nearest neighbors classifier using the TCRdist3 distance metric

  This was done using the configuration files use_case_AVFDRKSDAK.yaml, use_case_GILGFVFTL.yaml and use_case_KLGGALQAK.yaml. The results produced by immuneML and accompanying HTML files for navigating the results can be found in AVFDRKSDAK.zip, GILGFVFTL.zip and KLGGALQAK.zip.
- Furthermore, GLIPH version 2 (doi: 10.1038/s41587-020-0505-4) was used to cluster the immune receptor sequences that bind to the epitope GILGFVFTL, to produce antigen specificity-determining motifs. The input file for this was GLIPH2_input_GILGFVFTL.tsv, and the output file is GLIPH2_output_GILGFVFTL.csv.

For details, see the documentation for this use case: https://docs.immuneml.uio.no/usecases/extendability_use_case.html
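To give an idea of what the k-mer frequency encoding named in the first part refers to, the following minimal sketch counts overlapping k-mers in a single amino acid sequence and normalizes the counts to relative frequencies. It is a generic illustration, not immuneML's own implementation. The function name `kmer_frequencies`, the choice of k = 3 and the toy sequence are assumptions made for this example, and the sequence is not taken from the data files.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of each overlapping k-mer in an amino acid sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of length k over the sequence, one position at a time.
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    if not kmers:
        # Sequence shorter than k: no k-mers to count.
        return {}
    counts = Counter(kmers)
    total = len(kmers)
    return {kmer: count / total for kmer, count in counts.items()}


# Toy sequence chosen for illustration only; it is not taken from the data files.
features = kmer_frequencies("CASSASSF", k=3)
for kmer, frequency in sorted(features.items()):
    print(kmer, round(frequency, 3))
```

A classifier such as logistic regression can then be trained on vectors of these frequencies, with one feature per distinct k-mer observed across the dataset.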