Copy Number Segment

The copy_number_segment module provides functions for the Copy Number Segment section of the cBioPortal Web Public API.

pybioportal.copy_number_segment.fetch_copy_number_segments(sample_study_ids, chromosome=None, projection='SUMMARY')

Fetch copy number segments from cBioPortal by sample ID.

Parameters:
  • sample_study_ids (list of dict) –

    List of sample identifiers.

    Each dict pairs a study ID with the sample IDs from that study, in the following format:

    sample_study_ids = [
        {"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"],
         "study_id": "msk_met_2021"},
        {"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"],
         "study_id": "brca_tcga"}
    ]

  • chromosome (str) – Chromosome to restrict the results to (e.g., “1”). Defaults to None, which applies no chromosome filter.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • “DETAILED”: Detailed information.

    • “ID”: Information with only IDs.

    • “META”: Metadata information.

    • “SUMMARY”: Summary information (default).

Returns:

A DataFrame containing the fetched copy number segments.

Return type:

pandas.DataFrame
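
When sample identifiers arrive as flat (study, sample) pairs, for example from a spreadsheet, they need to be grouped by study before they match the sample_study_ids format described above. The following is a minimal sketch of that grouping; build_sample_study_ids is a hypothetical helper written for this illustration and is not part of pybioportal.

```python
from collections import defaultdict


def build_sample_study_ids(pairs):
    """Group (study_id, sample_id) pairs into the list-of-dicts format.

    Hypothetical helper for illustration; not part of pybioportal.
    """
    grouped = defaultdict(list)
    for study_id, sample_id in pairs:
        # Skip exact duplicates so each sample appears once per study.
        if sample_id not in grouped[study_id]:
            grouped[study_id].append(sample_id)
    # Dicts keep insertion order, so studies appear in first-seen order.
    return [
        {"sample_ids": sample_ids, "study_id": study_id}
        for study_id, sample_ids in grouped.items()
    ]


pairs = [
    ("msk_met_2021", "P-0000004-T01-IM3"),
    ("brca_tcga", "TCGA-5T-A9QA-01"),
    ("msk_met_2021", "P-0000950-T01-IM3"),
    ("brca_tcga", "TCGA-A1-A0SB-01"),
]
sample_study_ids = build_sample_study_ids(pairs)
```

The resulting list can then be passed as the sample_study_ids argument of fetch_copy_number_segments.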

pybioportal.copy_number_segment.get_copy_number_segments_in_sample_in_study(study_id, sample_id, chromosome=None, direction='ASC', pageNumber=0, pageSize=20000, projection='SUMMARY', sortBy='chromosome')

Get copy number segments in a sample in a study.

Parameters:
  • study_id (str) – Study ID (e.g., “acc_tcga”).

  • sample_id (str) – Sample ID (e.g., “TCGA-OR-A5J2-01”).

  • chromosome (str) – Chromosome to restrict the results to (e.g., “1”). Defaults to None, which applies no chromosome filter.

  • direction (str) –

    Direction of the sort.

    Possible values:

    • “ASC”: Ascending (default).

    • “DESC”: Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 20000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • “DETAILED”: Detailed information.

    • “ID”: Information with only IDs.

    • “META”: Metadata information.

    • “SUMMARY”: Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • “chromosome”: Sort by chromosome (default).

    • “end”: Sort by end position.

    • “numberOfProbes”: Sort by the number of probes.

    • “segmentMean”: Sort by segment mean.

    • “start”: Sort by start position.

Returns:

A DataFrame containing copy number segments for the specified sample in the study.

Return type:

pandas.DataFrame
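
Because pageSize is capped at 20000, a sample with more segments than one page holds would need several calls with increasing pageNumber. The sketch below shows one way to loop over pages. It assumes that a page shorter than the requested size marks the end of the results; fetch_all_pages and its fetch_page argument are hypothetical names introduced for this illustration, not part of pybioportal.

```python
def fetch_all_pages(fetch_page, page_size=20000, max_pages=1000):
    """Collect pages until one comes back shorter than page_size.

    fetch_page is any callable taking (page_number, page_size) and
    returning a sized object such as a list or a DataFrame.
    Hypothetical helper for illustration; not part of pybioportal.
    """
    pages = []
    for page_number in range(max_pages):
        page = fetch_page(page_number, page_size)
        pages.append(page)
        # Assumption: a short (or empty) page means nothing is left.
        if len(page) < page_size:
            break
    return pages


# Possible usage with pybioportal (needs network access, so it is
# left commented out here):
#
# import pandas as pd
# from pybioportal import copy_number_segment as cns
#
# pages = fetch_all_pages(
#     lambda number, size: cns.get_copy_number_segments_in_sample_in_study(
#         study_id="acc_tcga",
#         sample_id="TCGA-OR-A5J2-01",
#         pageNumber=number,
#         pageSize=size,
#     )
# )
# df = pd.concat(pages, ignore_index=True)
```

Passing the fetch function in as an argument keeps the loop independent of the network call, which makes it easy to try out with a stand-in function first.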


Examples

from pybioportal import copy_number_segment as cns

# Fetch segments for samples from two studies in a single call.
df1a = cns.fetch_copy_number_segments(
    sample_study_ids=[
        {"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"], "study_id": "msk_met_2021"},
        {"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"], "study_id": "brca_tcga"},
    ]
)
df1a
     uniqueSampleKey                           uniquePatientKey                patientId     start      end        segmentMean  studyId       sampleId           chromosome  numberOfProbes
0    VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ        VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  3218610    247813706  0.0034       brca_tcga     TCGA-A1-A0SB-01    1           129072
1    VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ        VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  484222     174313755  0.0014       brca_tcga     TCGA-A1-A0SB-01    2           92446
2    VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ        VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  174314142  174314161  -1.2680      brca_tcga     TCGA-A1-A0SB-01    2           2
3    VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ        VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  174315778  194887369  0.0019       brca_tcga     TCGA-A1-A0SB-01    2           10897
4    VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ        VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  194888052  194892814  -0.9361      brca_tcga     TCGA-A1-A0SB-01    2           4
...  ...                                       ...                             ...           ...        ...        ...          ...           ...                ...         ...
443  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950     36164670   37954722   0.3705       msk_met_2021  P-0000950-T01-IM3  21          11
444  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950     39625779   45660694   0.0544       msk_met_2021  P-0000950-T01-IM3  21          48
445  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950     19053406   49370834   -0.2394      msk_met_2021  P-0000950-T01-IM3  22          101
446  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950     451142     1331488    -0.5502      msk_met_2021  P-0000950-T01-IM3  X           7
447  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950     3085464    149924738  0.3045       msk_met_2021  P-0000950-T01-IM3  X           312

448 rows × 10 columns

# Restrict the same request to chromosome 5.
df1b = cns.fetch_copy_number_segments(
    sample_study_ids=[
        {"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"], "study_id": "msk_met_2021"},
        {"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"], "study_id": "brca_tcga"},
    ],
    chromosome="5",
)
df1b.head(10)
   uniqueSampleKey                     uniquePatientKey                patientId     start      end        segmentMean  studyId    sampleId         chromosome  numberOfProbes
0  VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ  VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  914233     31655645   -0.0041      brca_tcga  TCGA-A1-A0SB-01  5           18310
1  VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ  VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  31656726   31656768   -1.7983      brca_tcga  TCGA-A1-A0SB-01  5           2
2  VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ  VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  31657098   53873836   -0.0002      brca_tcga  TCGA-A1-A0SB-01  5           9892
3  VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ  VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  53873994   53874176   -1.5816      brca_tcga  TCGA-A1-A0SB-01  5           2
4  VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ  VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ  TCGA-A1-A0SB  53875569   180360469  0.0025       brca_tcga  TCGA-A1-A0SB-01  5           72240
5  VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ  VENHQS01VC1BOVFBOmJyY2FfdGNnYQ  TCGA-5T-A9QA  914233     111111798  0.0329       brca_tcga  TCGA-5T-A9QA-01  5           58513
6  VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ  VENHQS01VC1BOVFBOmJyY2FfdGNnYQ  TCGA-5T-A9QA  111112994  111119490  -1.0722      brca_tcga  TCGA-5T-A9QA-01  5           4
7  VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ  VENHQS01VC1BOVFBOmJyY2FfdGNnYQ  TCGA-5T-A9QA  111119554  112443378  0.0489       brca_tcga  TCGA-5T-A9QA-01  5           1080
8  VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ  VENHQS01VC1BOVFBOmJyY2FfdGNnYQ  TCGA-5T-A9QA  112450364  112450424  -1.4587      brca_tcga  TCGA-5T-A9QA-01  5           2
9  VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ  VENHQS01VC1BOVFBOmJyY2FfdGNnYQ  TCGA-5T-A9QA  112453010  116143516  0.0538       brca_tcga  TCGA-5T-A9QA-01  5           2225

# Get the segments of a single sample with the default sorting and paging.
df2 = cns.get_copy_number_segments_in_sample_in_study(
    study_id="msk_met_2021", sample_id="P-0000950-T01-IM3"
)
df2.head(10)
   uniqueSampleKey                           uniquePatientKey                patientId  start      end        segmentMean  studyId       sampleId           chromosome  numberOfProbes
0  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  2488138    245977996  -0.2332      msk_met_2021  P-0000950-T01-IM3  1           545
1  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  104375092  104389868  -0.5508      msk_met_2021  P-0000950-T01-IM3  10          5
2  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  1510047    104359246  -0.2373      msk_met_2021  P-0000950-T01-IM3  10          113
3  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  105517162  134199067  -0.1375      msk_met_2021  P-0000950-T01-IM3  10          34
4  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  67198913   68758509   -0.4173      msk_met_2021  P-0000950-T01-IM3  11          12
5  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  532696     67197032   -0.0058      msk_met_2021  P-0000950-T01-IM3  11          91
6  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  69069538   133495556  0.0651       msk_met_2021  P-0000950-T01-IM3  11          235
7  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  394725     133263871  0.2584       msk_met_2021  P-0000950-T01-IM3  12          372
8  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  21004738   111119077  0.3586       msk_met_2021  P-0000950-T01-IM3  13          207
9  UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx  UC0wMDAwOTUwOm1za19tZXRfMjAyMQ  P-0000950  38055061   38068061   0.5235       msk_met_2021  P-0000950-T01-IM3  14          5
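
In the output above, the default sortBy="chromosome" ordering runs 1, 10, 11, 12, ..., which suggests that chromosome labels are compared as strings rather than as numbers. If karyotype order (1, 2, ..., 22, X, Y) is needed, the labels can be mapped to integers on the client side. The following is a minimal sketch that assumes human chromosome labels; chromosome_rank is a hypothetical helper written for this illustration and is not part of pybioportal.

```python
def chromosome_rank(chromosome):
    """Map a chromosome label to an integer that sorts in karyotype order.

    Hypothetical helper for illustration; assumes human labels such as
    "1".."22", "X", "Y", "MT", with or without a "chr" prefix.
    """
    label = str(chromosome).upper()
    if label.startswith("CHR"):
        label = label[3:]
    if label.isdigit():
        return int(label)
    special = {"X": 23, "Y": 24, "M": 25, "MT": 25}
    # Unknown labels are placed after all known chromosomes.
    return special.get(label, 26)


labels = ["1", "10", "11", "2", "X", "22", "Y"]
ordered = sorted(labels, key=chromosome_rank)
# ordered == ["1", "2", "10", "11", "22", "X", "Y"]
```

With a DataFrame such as df2, one option is to add a helper column, for example df2.assign(chrom_rank=df2["chromosome"].map(chromosome_rank)).sort_values(["chrom_rank", "start"]), which orders segments by chromosome and then by start position.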