CandidateSearch 1.1.2
Proof-of-concept implementation of a search engine that uses sparse matrix multiplication to identify the best peptide candidates for a given mass spectrum.
The following are benchmarks of the different sparse matrix/vector multiplication methods of Eigen and cuSPARSE that are implemented in CandidateSearch using real mass spectrometry data.
For all benchmarks we search the spectra from file benchmarks/XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf against the database benchmarks/Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta (Cas9 + human SwissProt sequences) with settings benchmarks/settings.txt.
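To make concrete what kind of operation is being timed, here is a minimal sketch of scoring peptide candidates with a sparse matrix * dense vector product. It is not the CandidateSearch implementation (which relies on Eigen and cuSPARSE): the CSR layout, the m/z bin indices, the toy values, and the helper names `spmv_csr` and `top_n` are all assumptions made for illustration.

```python
def spmv_csr(indptr, indices, data, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x."""
    scores = []
    for row in range(len(indptr) - 1):
        total = 0
        # Only the stored (non-zero) entries of this row are visited.
        for k in range(indptr[row], indptr[row + 1]):
            total += data[k] * x[indices[k]]
        scores.append(total)
    return scores


def top_n(scores, n):
    """Return (row, score) pairs for the n highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Hypothetical candidate matrix: 3 peptides (rows) x 6 m/z bins (columns).
# Row 0 has ions in bins 0, 2, 5; row 1 in bins 1, 2; row 2 in bins 3, 4, 5.
INDPTR = [0, 3, 5, 8]
INDICES = [0, 2, 5, 1, 2, 3, 4, 5]
DATA = [1, 1, 1, 1, 1, 1, 1, 1]  # integer entries, in the spirit of the Int32 variants

# Hypothetical spectrum with peaks in bins 0, 2 and 5, stored as a dense vector.
SPECTRUM = [1, 0, 1, 0, 0, 1]

SCORES = spmv_csr(INDPTR, INDICES, DATA, SPECTRUM)
BEST = top_n(SCORES, 2)
print(SCORES, BEST)
```

Each score counts how many of a candidate's bins coincide with a peak in the spectrum, so the highest-scoring rows are the best candidates. Multiplying by several spectrum vectors at once turns the same operation into a matrix * matrix product.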
We ran every benchmark five times to get a more reliable picture of computation times. The averages are plotted below, with error bars denoting standard deviation. All benchmarks were conducted during light background usage (e.g. open browser and text editor).
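As a worked example of how the Mean and SD columns in the tables below relate to the five runs, the following sketch recomputes both for one row. The reported SD values appear consistent with the sample standard deviation (n - 1 denominator); that is an inference from the numbers, not something stated by the benchmark scripts.

```python
import statistics

# Run times in seconds for f32CPU_DV with NORMALIZE = False and
# USE_GAUSSIAN = False, copied from the first results table.
RUNS = [284.598, 282.308, 281.566, 281.179, 285.348]

MEAN = statistics.mean(RUNS)
# Sample standard deviation (n - 1 denominator); assumed, see note above.
SD = statistics.stdev(RUNS)

print(f"mean = {MEAN:.3f} s, sd = {SD:.3f} s")
```

The results agree with the tabulated mean of 283 and SD of 1.86529 to within rounding of the run times.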
The following abbreviations and their long forms are used interchangeably throughout the document:
- f32CPU_SV: Float32-(CPU-)based sparse matrix * sparse vector search (using Eigen)
- i32CPU_SV: Int32-(CPU-)based sparse matrix * sparse vector search (using Eigen)
- f32CPU_DV: Float32-(CPU-)based sparse matrix * dense vector search (using Eigen)
- i32CPU_DV: Int32-(CPU-)based sparse matrix * dense vector search (using Eigen)
- f32CPU_SM: Float32-(CPU-)based sparse matrix * sparse matrix search (using Eigen)
- i32CPU_SM: Int32-(CPU-)based sparse matrix * sparse matrix search (using Eigen)
- f32CPU_DM: Float32-(CPU-)based sparse matrix * dense matrix search (using Eigen)
- i32CPU_DM: Int32-(CPU-)based sparse matrix * dense matrix search (using Eigen)
- f32GPU_DV: Float32-(GPU-)based sparse matrix * dense vector search (using cuSPARSE)
- f32GPU_DM: Float32-(GPU-)based sparse matrix * dense matrix search (using cuSPARSE)
- f32GPU_SM: Float32-(GPU-)based sparse matrix * sparse matrix search (using cuSPARSE)

The system we tested this on was a desktop PC with the following hardware:
*_Note:_ "Dual" is part of the name; this is a single graphics card!
We ran four different parameter sets, covering every combination of the parameters NORMALIZE and USE_GAUSSIAN, to also investigate their effects on performance.
Figure 1 (NORMALIZE = False, USE_GAUSSIAN = False): Int32-based sparse matrix * dense matrix search using Eigen yields the fastest mean computation time of 170.17 seconds.
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Min | Max | Mean | SD | Rank | Normalize | Use Gaussian | Spectra | Database |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
f32CPU_DV | 284.598 | 282.308 | 281.566 | 281.179 | 285.348 | 281.179 | 285.348 | 283 | 1.86529 | 6 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DV | 261.492 | 262.781 | 262.77 | 265.513 | 256.328 | 256.328 | 265.513 | 261.777 | 3.38113 | 5 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_SM | 373.172 | 379.473 | 377.16 | 379.847 | 382.457 | 373.172 | 382.457 | 378.422 | 3.48481 | 8 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_SM | 357.109 | 375.115 | 357.663 | 369.649 | 356.206 | 356.206 | 375.115 | 363.148 | 8.66346 | 7 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_DM | 194.913 | 197.195 | 198.649 | 194.847 | 197.228 | 194.847 | 198.649 | 196.567 | 1.64775 | 4 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DM | 171.416 | 170.329 | 170.542 | 169.433 | 169.146 | 169.146 | 171.416 | 170.173 | 0.909477 | 1 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DV | 183.134 | 190.537 | 194.355 | 189.926 | 186.384 | 183.134 | 194.355 | 188.867 | 4.27386 | 2 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DM | 188.206 | 188.611 | 194.606 | 192.397 | 195.066 | 188.206 | 195.066 | 191.777 | 3.23992 | 3 | False | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
Figure 2 (NORMALIZE = False, USE_GAUSSIAN = True): Int32-based sparse matrix * dense matrix search using Eigen yields the fastest mean computation time of 211.33 seconds.
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Min | Max | Mean | SD | Rank | Normalize | Use Gaussian | Spectra | Database |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
f32CPU_DV | 342.748 | 337.409 | 330.165 | 331.124 | 326.996 | 326.996 | 342.748 | 333.689 | 6.31853 | 6 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DV | 299.689 | 299.897 | 309.623 | 309.983 | 310.848 | 299.689 | 310.848 | 306.008 | 5.69166 | 5 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_SM | 423.513 | 431.445 | 429.494 | 427.879 | 429.016 | 423.513 | 431.445 | 428.269 | 2.95429 | 8 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_SM | 394.987 | 400.611 | 401.15 | 399.615 | 399.258 | 394.987 | 401.15 | 399.124 | 2.43362 | 7 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_DM | 242.766 | 245.519 | 244.166 | 244.985 | 244.774 | 242.766 | 245.519 | 244.442 | 1.05473 | 2 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DM | 210.471 | 211.451 | 211.719 | 212.283 | 210.703 | 210.471 | 212.283 | 211.325 | 0.742546 | 1 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DV | 254.05 | 252.647 | 255.355 | 251.401 | 254.514 | 251.401 | 255.355 | 253.593 | 1.57015 | 4 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DM | 240.989 | 247.472 | 247.143 | 247.778 | 246.566 | 240.989 | 247.778 | 245.989 | 2.83103 | 3 | False | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
Figure 3 (NORMALIZE = True, USE_GAUSSIAN = False): Int32-based sparse matrix * dense matrix search using Eigen yields the fastest mean computation time of 441.45 seconds.
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Min | Max | Mean | SD | Rank | Normalize | Use Gaussian | Spectra | Database |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
f32CPU_DV | 649.063 | 661.19 | 650.95 | 675.843 | 677.927 | 649.063 | 677.927 | 662.995 | 13.5136 | 7 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DV | 548.815 | 571.154 | 571.506 | 556.422 | 574.498 | 548.815 | 574.498 | 564.479 | 11.2317 | 3 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_SM | 711.401 | 712.366 | 708.906 | 710.551 | 709.708 | 708.906 | 712.366 | 710.586 | 1.36252 | 8 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_SM | 614.278 | 630.73 | 600.277 | 592.49 | 591.84 | 591.84 | 630.73 | 605.923 | 16.5517 | 5 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_DM | 530.933 | 508.778 | 515.837 | 512.128 | 527.792 | 508.778 | 530.933 | 519.094 | 9.76414 | 2 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DM | 432.976 | 449.078 | 449.505 | 435.932 | 439.756 | 432.976 | 449.505 | 441.449 | 7.55284 | 1 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DV | 582.093 | 600.802 | 595.453 | 589.298 | 602.231 | 582.093 | 602.231 | 593.975 | 8.36706 | 4 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DM | 560.338 | 658.734 | 675.252 | 669.258 | 619.287 | 560.338 | 675.252 | 636.574 | 47.8698 | 6 | True | False | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
Figure 4 (NORMALIZE = True, USE_GAUSSIAN = True): Int32-based sparse matrix * dense matrix search using Eigen yields the fastest mean computation time of 485.12 seconds.
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Min | Max | Mean | SD | Rank | Normalize | Use Gaussian | Spectra | Database |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
f32CPU_DV | 722.099 | 725.386 | 718.058 | 719.054 | 724.449 | 718.058 | 725.386 | 721.809 | 3.22115 | 7 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DV | 626.118 | 630.296 | 637.837 | 621.074 | 622.894 | 621.074 | 637.837 | 627.644 | 6.68956 | 4 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_SM | 782.141 | 784.777 | 787.899 | 785.991 | 779.819 | 779.819 | 787.899 | 784.126 | 3.18703 | 8 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_SM | 675.703 | 667.53 | 665.649 | 669.617 | 666.607 | 665.649 | 675.703 | 669.021 | 4.01312 | 5 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32CPU_DM | 592.399 | 593.383 | 592.434 | 588.955 | 590.425 | 588.955 | 593.383 | 591.519 | 1.79273 | 2 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
i32CPU_DM | 484.45 | 481.504 | 481.025 | 484.537 | 494.108 | 481.025 | 494.108 | 485.125 | 5.27773 | 1 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DV | 666.539 | 669.565 | 679.34 | 664.082 | 666.39 | 664.082 | 679.34 | 669.183 | 6.00249 | 6 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
f32GPU_DM | 623.122 | 612.436 | 619.752 | 622.259 | 605.058 | 605.058 | 623.122 | 616.526 | 7.66539 | 3 | True | True | XLpeplib_Beveridge_QEx-HFX_DSS_R1_deconvoluted.mgf | Cas9+uniprotkb_proteome_UP000005640_AND_revi_2024_03_22.fasta |
Generally, on this machine Int32-based sparse matrix * dense matrix multiplication on the CPU (i32CPU_DM) is the fastest method in all four parameter sets. Int32-based approaches also consistently outperform their Float32-based counterparts. Both statements also hold when using the HeLa dataset for benchmarking, as recorded in benchmark B (data not shown here, see benchmarks/vis_B). Note that this system features a high-end CPU and a low-end GPU. Claiming that CPU-based approaches are faster than GPU-based approaches for the problem at hand would be true for this particular system, but it definitely does not hold as a general statement.
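The Int32 versus Float32 comparison can be quantified from the Mean column of the four tables. The sketch below computes the Float32 / Int32 run-time ratio for each pair of CPU methods; the dictionary layout and the function name `int32_speedups` are illustrative choices, while the numbers are copied from the tables.

```python
# Mean run times in seconds, copied from the Mean column of the four tables.
# Keys are (NORMALIZE, USE_GAUSSIAN).
MEANS = {
    (False, False): {
        "f32CPU_DV": 283.0, "i32CPU_DV": 261.777,
        "f32CPU_SM": 378.422, "i32CPU_SM": 363.148,
        "f32CPU_DM": 196.567, "i32CPU_DM": 170.173,
    },
    (False, True): {
        "f32CPU_DV": 333.689, "i32CPU_DV": 306.008,
        "f32CPU_SM": 428.269, "i32CPU_SM": 399.124,
        "f32CPU_DM": 244.442, "i32CPU_DM": 211.325,
    },
    (True, False): {
        "f32CPU_DV": 662.995, "i32CPU_DV": 564.479,
        "f32CPU_SM": 710.586, "i32CPU_SM": 605.923,
        "f32CPU_DM": 519.094, "i32CPU_DM": 441.449,
    },
    (True, True): {
        "f32CPU_DV": 721.809, "i32CPU_DV": 627.644,
        "f32CPU_SM": 784.126, "i32CPU_SM": 669.021,
        "f32CPU_DM": 591.519, "i32CPU_DM": 485.125,
    },
}


def int32_speedups(means):
    """Return {(params, kind): float32_mean / int32_mean} for the CPU methods."""
    ratios = {}
    for params, by_method in means.items():
        for kind in ("DV", "SM", "DM"):
            f32 = by_method["f32CPU_" + kind]
            i32 = by_method["i32CPU_" + kind]
            ratios[(params, kind)] = f32 / i32
    return ratios


SPEEDUPS = int32_speedups(MEANS)
for (params, kind), ratio in sorted(SPEEDUPS.items()):
    print(f"NORMALIZE={params[0]!s:5} USE_GAUSSIAN={params[1]!s:5} {kind}: {ratio:.3f}x")
```

Every ratio is above 1, ranging from roughly 1.04 (sparse matrix * sparse matrix, both parameters off) to roughly 1.22 (sparse matrix * dense matrix, both parameters on).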