A comprehensive benchmark for estimating the accuracy of protein complex structural models, a task known as estimation of model accuracy (EMA)
PSBench consists of 4 complementary datasets:
- CASP15_inhouse_dataset
- CASP15_community_dataset
- CASP16_inhouse_dataset
- CASP16_community_dataset
Category | Quality scores / features |
---|---|
Global Quality Scores | tmscore (4 variants), rmsd |
Local Quality Scores | lddt |
Interface Quality Scores | ics, ics_precision, ics_recall, ips, qs_global, qs_best, dockq_wave |
Additional Input Features (CASP15_inhouse_dataset and CASP16_inhouse_dataset) | type, afm_confidence_score, af3_ranking_score, iptm, num_inter_pae, mpDockQ/pDockQ |
For detailed explanations of each quality score and feature, please refer to Quality_Scores_Definitions
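As a rough illustration of how the score categories in the table above might be used when reading a label file, the sketch below groups column names by category and pulls one category out of a CSV. The column names come from the table; `tmscore_usalign`, `tmscore_usalign_aligned`, and `mmalign_tmscore` are the three tmscore variants named in this README, so the fourth is left out. The CSV layout (one row per model, one column per score), the sample values, and the helper `select_category` are assumptions made for the example, not part of PSBench.

```python
import csv
import io

# Score columns grouped as in the table above. Only three of the four
# tmscore variants are named in this README, so the list is incomplete.
SCORE_CATEGORIES = {
    "global": ["tmscore_usalign", "tmscore_usalign_aligned", "mmalign_tmscore", "rmsd"],
    "local": ["lddt"],
    "interface": ["ics", "ics_precision", "ics_recall", "ips",
                  "qs_global", "qs_best", "dockq_wave"],
}


def select_category(rows, category):
    """Keep only the columns of one category that are present in each row."""
    wanted = SCORE_CATEGORIES[category]
    return [{name: row[name] for name in wanted if name in row} for row in rows]


# Hypothetical two-model label file, used only to exercise the helper.
example_csv = io.StringIO(
    "model,lddt,ics,ips,dockq_wave\n"
    "model_1,0.81,0.70,0.65,0.55\n"
    "model_2,0.62,0.40,0.38,0.21\n"
)
rows = list(csv.DictReader(example_csv))
interface_scores = select_category(rows, "interface")
```

Values stay as strings because `csv.DictReader` does not convert types; cast them with `float` before computing anything numeric.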
CASP15_inhouse_dataset consists of a total of 7,885 models generated by MULTICOM3 during the 2022 CASP15 competition.
CASP15_community_dataset consists of a total of 10,942 models generated by all the participating groups during the 2022 CASP15 competition.
CASP16_inhouse_dataset consists of a total of 1,009,050 models generated by MULTICOM4 during the 2024 CASP16 competition.
CASP16_community_dataset consists of a total of 12,904 models generated by all the participating groups during the 2024 CASP16 competition.
Generate various evaluation scores
The following are the prerequisites for generating labels for a new benchmark dataset:
- Predicted structures
- Native structure
- Fasta file
- Openstructure
- USalign
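Before starting, it can help to confirm that the items listed above are actually in place. The following is a minimal sketch, not part of the PSBench scripts: the function name `missing_prerequisites` and its arguments are hypothetical, and the Docker check only looks for a `docker` executable on `PATH` (OpenStructure is pulled as a Docker image in this README), not for the image itself.

```python
import shutil
from pathlib import Path


def missing_prerequisites(predicted_dir, native_pdb, fasta_file, usalign_binary):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if not Path(predicted_dir).is_dir():
        problems.append(f"predicted structures directory not found: {predicted_dir}")
    if not Path(native_pdb).is_file():
        problems.append(f"native structure not found: {native_pdb}")
    if not Path(fasta_file).is_file():
        problems.append(f"fasta file not found: {fasta_file}")
    if not Path(usalign_binary).is_file():
        problems.append(f"USalign binary not found: {usalign_binary}")
    # Only checks that a docker executable exists on PATH; it does not
    # verify that the OpenStructure image has been pulled.
    if shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    return problems
```

Calling `missing_prerequisites(...)` with the paths for one target and printing the returned list gives a quick checklist before any of the scoring scripts are run.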
Download the PSBench repository and cd into scripts
git clone https://github.com/BioinfoMachineLearning/PSBench.git
cd PSBench
cd scripts
docker pull registry.scicore.unibas.ch/schwede/openstructure:latest
Check the docker installation with
# should print the latest version of openstructure
docker run -it registry.scicore.unibas.ch/schwede/openstructure:latest --version
filter_pdb.py requires 6 arguments:
- -f : path to the fasta file for the target
- -pp : path to the predicted pdbs directory for the target
- -np : path to the native pdb file for the target
- -o : path to the output directory
- -tmp : path to the temporary directory
- -c : path to the clustalw binary (available in tools/clustalw1.83/clustalw)
python filter_pdb.py -f /path/to/fasta_file -pp /path/to/predicted_pdbs_directory -np /path/to/native_pdb_file -o /path/to/output_directory -tmp /path/to/temporary_directory -c /path/to/clustalw_binary_file
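When the filtering step has to be run for many targets, it may be convenient to assemble the call from Python rather than typing it each time. The sketch below is one possible wrapper under stated assumptions: `build_filter_command` and `run_filter` are hypothetical helper names, the flags are the six documented in the argument list (with `-f` for the fasta file), and the script is assumed to be run from the `scripts` directory.

```python
import subprocess
import sys


def build_filter_command(fasta, predicted_dir, native_pdb, out_dir, tmp_dir, clustalw):
    """Assemble the filter_pdb.py call using the six documented flags."""
    return [
        sys.executable, "filter_pdb.py",
        "-f", str(fasta),
        "-pp", str(predicted_dir),
        "-np", str(native_pdb),
        "-o", str(out_dir),
        "-tmp", str(tmp_dir),
        "-c", str(clustalw),
    ]


def run_filter(scripts_dir, **paths):
    """Run the filter step inside scripts_dir; raises CalledProcessError on failure."""
    subprocess.run(build_filter_command(**paths), cwd=scripts_dir, check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.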
Run openstructure (required for ics, ics_precision, ics_recall, ips, qs_global, qs_best, lddt, rmsd, dockq_wave, mmalign_tmscore)
Requires 3 arguments:
- --indir : path to the folder containing predicted pdbs
- --nativedir : path to the corresponding native pdb
- --outdir : path to the output folder
python run_openstructure.py --indir /path/to/predicted_pdb_folder/ --nativedir /path/to/native_pdb_file --outdir /path/to/output_folder
Run USalign on the original predicted structures and the original native structure (required for tmscore_usalign)
Requires 4 arguments:
- --indir : path to the folder containing original predicted pdbs
- --nativedir : path to the corresponding original native pdb
- --outdir : path to the output folder
- --usalign_program : path to the USalign binary (available at tools/USalign)
python run_usalign.py --indir /path/to/predicted_pdb_folder/ --nativedir /path/to/native_pdb_file --outdir /path/to/output_folder --usalign_program /path/to/USalign_binary
Run USalign on the filtered predicted structures and the filtered native structure (required for tmscore_usalign_aligned)
Requires 4 arguments:
- --indir : path to the folder containing filtered predicted pdbs
- --nativedir : path to the corresponding filtered native pdb
- --outdir : path to the output folder
- --usalign_program : path to the USalign binary (available at tools/USalign)
python run_usalign.py --indir /path/to/filtered_predicted_pdb_folder/ --nativedir /path/to/filtered_native_pdb_file --outdir /path/to/output_folder --usalign_program /path/to/USalign_binary
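Because the two USalign runs use the same script with different inputs, it is easy to point both at the same output folder by accident. The sketch below shows one way to keep them apart by writing each run into a subfolder named after the score it produces. The helper `usalign_commands` and the subfolder layout are assumptions for illustration; only the four flags come from this README.

```python
import sys
from pathlib import Path


def usalign_commands(orig_pred, orig_native, filt_pred, filt_native, out_root, usalign):
    """Build one run_usalign.py command per score, each with its own output folder."""
    out_root = Path(out_root)
    runs = {
        # original structures -> tmscore_usalign
        "tmscore_usalign": (orig_pred, orig_native),
        # filtered structures -> tmscore_usalign_aligned
        "tmscore_usalign_aligned": (filt_pred, filt_native),
    }
    return {
        score: [
            sys.executable, "run_usalign.py",
            "--indir", str(pred),
            "--nativedir", str(native),
            "--outdir", str(out_root / score),
            "--usalign_program", str(usalign),
        ]
        for score, (pred, native) in runs.items()
    }
```

Each returned command can be passed to `subprocess.run(..., check=True)`; the two output folders can then be supplied to the later CSV step as the `-tm_u` and `-tm_ua` arguments.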
create_csv.py requires 5 arguments:
- -pp : path to the predicted pdbs directory for the target
- -os : path to the openstructure results for the target
- -tm_u : path to the tmscore_usalign results for the target
- -tm_ua : path to the tmscore_usalign_aligned results for the target
- -oc : path where the output csv is to be saved
python create_csv.py -pp /path/to/predicted_pdbs_directory -os /path/to/openstructure_results_directory/ -tm_u /path/to/tmscore_usalign_results_directory -tm_ua /path/to/tmscore_usalign_aligned_results_directory -oc /path/to/output_csv_file
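After the CSV is written, a quick consistency check can catch models that were silently dropped by an earlier step. The following is a minimal sketch under loud assumptions: the CSV is assumed to contain one row per model with a column identifying the model, and the column name `model` is hypothetical, so adjust `name_column` to match the header that create_csv.py actually writes. The helper `models_missing_from_csv` is not part of PSBench.

```python
import csv
from pathlib import Path


def models_missing_from_csv(predicted_dir, csv_path, name_column="model"):
    """Return predicted file names that have no matching row in the CSV.

    name_column is a hypothetical column name; change it to match the
    header actually present in the output CSV.
    """
    with open(csv_path, newline="") as handle:
        listed = {row[name_column] for row in csv.DictReader(handle)}
    predicted = {p.name for p in Path(predicted_dir).iterdir() if p.is_file()}
    # Accept either the full file name or the name without its extension.
    return sorted(
        name for name in predicted
        if name not in listed and Path(name).stem not in listed
    )
```

An empty list means every file in the predicted pdbs directory has a row in the CSV.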