MSCI: Mass Spectrometry Content Information Python Library
MSCI is a Python package for assessing the information content of peptide fragmentation spectra. It helps researchers identify peptides in a given proteome that cannot be distinguished from one another, based on spectral similarity scores.
Contents
- Installation Guide
- Example of usage
- MSCI Documentation
  - Functions
    - generate_variable_length_peptides(protein_sequence, min_length=8, max_length=11)
    - extract_peptides_from_fasta(fasta_path, min_length=8, max_length=11)
    - keep_top_n_peaks(spectrum, n)
    - filter_spectra_by_top_peaks(input_file_path, output_file_path, n_peaks)
  - Reading MS spectra
    - read_msp_file
    - read_mgf_file
    - read_mzml_file
    - read_ms_file
  - Grouping MS1 Module
    - make_data_compatible(index_df)
    - within_ppm(pair, ppm_tolerance1, ppm_tolerance2)
    - within_tolerance(pair, tolerance1, tolerance2)
    - find_combinations_kdtree(data, tolerance1, tolerance2, use_ppm=True)
  - Similarity Module
    - ndotproduct(x, y, m=0, n=0.5, na_rm=True)
    - nspectraangle(x, y, m=0, n=0.5, na_rm=True)
    - joinPeaks(tolerance=0, ppm=0)
  - Mutation Module
    - ProteinMutator
    - tryptic_digest(sequence)
- Contributing
API
The MSCI package offers functionalities for:
- Data Import: Load proteomes and spectral libraries.
- Spectra Prediction & Processing: Predict peptide spectra and filter fragments.
- Spectra Grouping: Group peptides based on m/z and iRT values.
- Similarity Measurement: Compute spectral similarity using different scoring functions.
- Output & Visualization: Export similarity results and generate fragmentation plots.
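To make the Spectra Grouping step more concrete, here is a minimal sketch of the idea in plain Python. It does not call MSCI: the helper names `mz_within_ppm` and `candidate_pairs`, the tuple layout, and the default tolerances are assumptions made for illustration. The Contents list mentions a KD-tree based function; this sketch uses a brute-force pairwise comparison instead, which is easier to read but slower on large peptide sets.

```python
from itertools import combinations


def mz_within_ppm(mz_a, mz_b, ppm):
    """Return True when two m/z values differ by at most `ppm` parts per million."""
    return abs(mz_a - mz_b) <= ppm * 1e-6 * max(mz_a, mz_b)


def candidate_pairs(peptides, mz_ppm=10.0, irt_tol=5.0):
    """Find peptide pairs that are close in both precursor m/z and iRT.

    `peptides` is a list of (name, mz, irt) tuples (hypothetical layout).
    Every pair is compared once, so the cost grows quadratically.
    """
    pairs = []
    for (name_a, mz_a, irt_a), (name_b, mz_b, irt_b) in combinations(peptides, 2):
        if mz_within_ppm(mz_a, mz_b, mz_ppm) and abs(irt_a - irt_b) <= irt_tol:
            pairs.append((name_a, name_b))
    return pairs


# Made-up values, chosen only to exercise both tolerance checks.
peptides = [
    ("PEP_A", 500.2500, 42.0),
    ("PEP_B", 500.2520, 43.5),  # about 4 ppm from PEP_A, 1.5 iRT units apart
    ("PEP_C", 500.2510, 80.0),  # close in m/z, far in iRT
    ("PEP_D", 650.3000, 42.5),  # far in m/z
]
print(candidate_pairs(peptides))
```

Only pairs that pass both checks would need a fragment-level similarity comparison afterwards, which keeps the number of spectrum comparisons small.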
For full API documentation, see MSCI Documentation.
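As an illustration of the Similarity Measurement step, the sketch below computes a weighted normalized dot product between two spectra whose peaks have already been aligned to a shared list of fragment m/z values. The function name `weighted_ndotproduct` and the input layout are hypothetical. The parameters `m` and `n` echo the `ndotproduct` signature listed in Contents, and the formula is the commonly used weighted form (peak weight = m/z^m * intensity^n); whether MSCI computes its score in exactly this way is an assumption.

```python
def weighted_ndotproduct(mz, intensity_x, intensity_y, m=0.0, n=0.5):
    """Weighted normalized dot product of two aligned spectra.

    `mz` holds the shared fragment m/z values; the two intensity lists have
    the same length, with 0 where a spectrum has no peak at that m/z.
    Returns a value between 0 (no shared signal) and 1 (identical profile).
    """
    wx = [(z ** m) * (i ** n) for z, i in zip(mz, intensity_x)]
    wy = [(z ** m) * (i ** n) for z, i in zip(mz, intensity_y)]
    cross = sum(a * b for a, b in zip(wx, wy))
    denom = sum(a * a for a in wx) * sum(b * b for b in wy)
    return (cross * cross) / denom if denom > 0 else 0.0


# Made-up fragment masses and intensities for illustration only.
mz = [175.1, 262.2, 375.2, 488.3]
spec_a = [100.0, 50.0, 0.0, 25.0]
spec_b = [0.0, 50.0, 100.0, 25.0]

same = weighted_ndotproduct(mz, spec_a, spec_a)
partial = weighted_ndotproduct(mz, spec_a, spec_b)
disjoint = weighted_ndotproduct(mz, [100.0, 0.0, 0.0, 0.0], [0.0, 0.0, 100.0, 0.0])
print(same, partial, disjoint)
```

A score close to 1 for two different peptides would suggest that their fragmentation spectra carry too little information to tell them apart.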
Usage
Example workflow: Visit https://msci.readthedocs.io/en/latest/tutorial.html
For a full tutorial, visit our Colab notebook: https://colab.research.google.com/drive/1ny97RNgvnpD7ZrHW8TTRXWCAQvIcavkk
For code and datasets, visit https://github.com/proteomicsunitcrg/MSCI
Graphical User Interface (GUI)
MSCI includes a web-based GUI for non-programmers, accessible at: https://msci–proteomicsunit.streamlit.app/
License
MSCI is released under the MIT License.