Copy Number Segment
The copy_number_segment module provides functions for the Copy Number Segment section of the
cBioPortal Web Public API.
- pybioportal.copy_number_segment.fetch_copy_number_segments(sample_study_ids, chromosome=None, projection='SUMMARY')
Fetch copy number segments from cBioPortal by sample ID.
- Parameters:
sample_study_ids (list of dict) –
List of sample identifiers.
The list should contain one dict per study, in the following format:
sample_study_ids = [
{"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"],
"study_id": "msk_met_2021"},
{"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"],
"study_id": "brca_tcga"}
]
chromosome (str) – Chromosome (e.g., “1”).
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
- Returns:
A DataFrame containing the fetched copy number segments.
- Return type:
pandas.DataFrame
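The sample_study_ids format described above can be assembled from a flat list of (study ID, sample ID) pairs. The following is a minimal sketch of one way to do that; the helper name build_sample_study_ids is hypothetical and is not part of pybioportal.

```python
from collections import defaultdict


def build_sample_study_ids(pairs):
    """Group (study_id, sample_id) pairs into the list-of-dicts format.

    Hypothetical helper for illustration; not part of pybioportal.
    """
    grouped = defaultdict(list)
    for study_id, sample_id in pairs:
        # Skip duplicate sample IDs within the same study.
        if sample_id not in grouped[study_id]:
            grouped[study_id].append(sample_id)
    # Studies appear in the order they were first seen.
    return [
        {"sample_ids": sample_ids, "study_id": study_id}
        for study_id, sample_ids in grouped.items()
    ]


pairs = [
    ("msk_met_2021", "P-0000004-T01-IM3"),
    ("brca_tcga", "TCGA-5T-A9QA-01"),
    ("msk_met_2021", "P-0000950-T01-IM3"),
    ("brca_tcga", "TCGA-A1-A0SB-01"),
]
sample_study_ids = build_sample_study_ids(pairs)
```

The resulting list can then be passed as the sample_study_ids argument of fetch_copy_number_segments.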
- pybioportal.copy_number_segment.get_copy_number_segments_in_sample_in_study(study_id, sample_id, chromosome=None, direction='ASC', pageNumber=0, pageSize=20000, projection='SUMMARY', sortBy='chromosome')
Get copy number segments in a sample in a study.
- Parameters:
study_id (str) – Study ID (e.g., “acc_tcga”).
sample_id (str) – Sample ID (e.g., “TCGA-OR-A5J2-01”).
chromosome (str) – Chromosome (e.g., “1”).
direction (str) –
Direction of the sort.
Possible values:
"ASC": Ascending (default).
"DESC": Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 20000.
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
"chromosome": Sort by chromosome (default).
"end": Sort by end position.
"numberOfProbes": Sort by the number of probes.
"segmentMean": Sort by segment mean.
"start": Sort by start position.
- Returns:
A DataFrame containing copy number segments for the specified sample in the study.
- Return type:
pandas.DataFrame
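The pageNumber and pageSize parameters suggest that results can be requested one page at a time. The sketch below shows a generic paging loop under that assumption. The names collect_pages and fake_fetch are hypothetical, and the in-memory data source is a stand-in so the example runs without network access. Whether pybioportal already combines pages internally is not stated here, so treat this as an illustration of the paging logic only.

```python
def collect_pages(fetch_page, page_size=20000):
    """Call fetch_page(page_number, page_size) until a short page comes back.

    Hypothetical helper; fetch_page is any callable returning a sized object.
    """
    # Documented limits for pageSize: minimum 1, maximum 20000.
    if not 1 <= page_size <= 20000:
        raise ValueError("page_size must be between 1 and 20000")
    pages = []
    page_number = 0  # documented minimum page number is 0
    while True:
        page = fetch_page(page_number, page_size)
        if len(page) > 0:
            pages.append(page)
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        page_number += 1
    return pages


# Stand-in data source (assumption) so the sketch runs offline.
records = list(range(45))


def fake_fetch(page_number, page_size):
    start = page_number * page_size
    return records[start:start + page_size]


pages = collect_pages(fake_fetch, page_size=20)
```

With the real function, fetch_page could be a small wrapper that forwards page_number and page_size to the pageNumber and pageSize arguments of get_copy_number_segments_in_sample_in_study.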
Examples
from pybioportal import copy_number_segment as cns
df1a = cns.fetch_copy_number_segments(sample_study_ids=[
{"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"], "study_id": "msk_met_2021"},
{"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"], "study_id": "brca_tcga"}
])
df1a
| | uniqueSampleKey | uniquePatientKey | patientId | start | end | segmentMean | studyId | sampleId | chromosome | numberOfProbes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 3218610 | 247813706 | 0.0034 | brca_tcga | TCGA-A1-A0SB-01 | 1 | 129072 |
| 1 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 484222 | 174313755 | 0.0014 | brca_tcga | TCGA-A1-A0SB-01 | 2 | 92446 |
| 2 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 174314142 | 174314161 | -1.2680 | brca_tcga | TCGA-A1-A0SB-01 | 2 | 2 |
| 3 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 174315778 | 194887369 | 0.0019 | brca_tcga | TCGA-A1-A0SB-01 | 2 | 10897 |
| 4 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 194888052 | 194892814 | -0.9361 | brca_tcga | TCGA-A1-A0SB-01 | 2 | 4 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 443 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 36164670 | 37954722 | 0.3705 | msk_met_2021 | P-0000950-T01-IM3 | 21 | 11 |
| 444 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 39625779 | 45660694 | 0.0544 | msk_met_2021 | P-0000950-T01-IM3 | 21 | 48 |
| 445 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 19053406 | 49370834 | -0.2394 | msk_met_2021 | P-0000950-T01-IM3 | 22 | 101 |
| 446 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 451142 | 1331488 | -0.5502 | msk_met_2021 | P-0000950-T01-IM3 | X | 7 |
| 447 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 3085464 | 149924738 | 0.3045 | msk_met_2021 | P-0000950-T01-IM3 | X | 312 |
448 rows × 10 columns
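As a rough illustration of working with the segmentMean column, the sketch below labels the first five segments from the output above. It assumes that segmentMean behaves like a log-ratio where values near 0 mean no change. The ±0.3 cutoffs and the label_segment helper are arbitrary illustrative choices, not values defined by cBioPortal or pybioportal.

```python
# First five rows of the output above (sample TCGA-A1-A0SB-01).
segments = [
    {"chromosome": "1", "start": 3218610, "end": 247813706, "segmentMean": 0.0034},
    {"chromosome": "2", "start": 484222, "end": 174313755, "segmentMean": 0.0014},
    {"chromosome": "2", "start": 174314142, "end": 174314161, "segmentMean": -1.2680},
    {"chromosome": "2", "start": 174315778, "end": 194887369, "segmentMean": 0.0019},
    {"chromosome": "2", "start": 194888052, "end": 194892814, "segmentMean": -0.9361},
]

# Assumed cutoffs for illustration only; choose thresholds suited to your data.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def label_segment(segment_mean, gain=GAIN_THRESHOLD, loss=LOSS_THRESHOLD):
    """Return "gain", "loss", or "neutral" for a segment mean (hypothetical helper)."""
    if segment_mean >= gain:
        return "gain"
    if segment_mean <= loss:
        return "loss"
    return "neutral"


labels = [label_segment(segment["segmentMean"]) for segment in segments]
```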
df1b = cns.fetch_copy_number_segments(sample_study_ids=[
{"sample_ids": ["P-0000004-T01-IM3", "P-0000950-T01-IM3"], "study_id": "msk_met_2021"},
{"sample_ids": ["TCGA-5T-A9QA-01", "TCGA-A1-A0SB-01"], "study_id": "brca_tcga"}
], chromosome="5")
df1b.head(10)
| | uniqueSampleKey | uniquePatientKey | patientId | start | end | segmentMean | studyId | sampleId | chromosome | numberOfProbes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 914233 | 31655645 | -0.0041 | brca_tcga | TCGA-A1-A0SB-01 | 5 | 18310 |
| 1 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 31656726 | 31656768 | -1.7983 | brca_tcga | TCGA-A1-A0SB-01 | 5 | 2 |
| 2 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 31657098 | 53873836 | -0.0002 | brca_tcga | TCGA-A1-A0SB-01 | 5 | 9892 |
| 3 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 53873994 | 53874176 | -1.5816 | brca_tcga | TCGA-A1-A0SB-01 | 5 | 2 |
| 4 | VENHQS1BMS1BMFNCLTAxOmJyY2FfdGNnYQ | VENHQS1BMS1BMFNCOmJyY2FfdGNnYQ | TCGA-A1-A0SB | 53875569 | 180360469 | 0.0025 | brca_tcga | TCGA-A1-A0SB-01 | 5 | 72240 |
| 5 | VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ | VENHQS01VC1BOVFBOmJyY2FfdGNnYQ | TCGA-5T-A9QA | 914233 | 111111798 | 0.0329 | brca_tcga | TCGA-5T-A9QA-01 | 5 | 58513 |
| 6 | VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ | VENHQS01VC1BOVFBOmJyY2FfdGNnYQ | TCGA-5T-A9QA | 111112994 | 111119490 | -1.0722 | brca_tcga | TCGA-5T-A9QA-01 | 5 | 4 |
| 7 | VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ | VENHQS01VC1BOVFBOmJyY2FfdGNnYQ | TCGA-5T-A9QA | 111119554 | 112443378 | 0.0489 | brca_tcga | TCGA-5T-A9QA-01 | 5 | 1080 |
| 8 | VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ | VENHQS01VC1BOVFBOmJyY2FfdGNnYQ | TCGA-5T-A9QA | 112450364 | 112450424 | -1.4587 | brca_tcga | TCGA-5T-A9QA-01 | 5 | 2 |
| 9 | VENHQS01VC1BOVFBLTAxOmJyY2FfdGNnYQ | VENHQS01VC1BOVFBOmJyY2FfdGNnYQ | TCGA-5T-A9QA | 112453010 | 116143516 | 0.0538 | brca_tcga | TCGA-5T-A9QA-01 | 5 | 2225 |
df2 = cns.get_copy_number_segments_in_sample_in_study(study_id="msk_met_2021", sample_id="P-0000950-T01-IM3")
df2.head(10)
| | uniqueSampleKey | uniquePatientKey | patientId | start | end | segmentMean | studyId | sampleId | chromosome | numberOfProbes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 2488138 | 245977996 | -0.2332 | msk_met_2021 | P-0000950-T01-IM3 | 1 | 545 |
| 1 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 104375092 | 104389868 | -0.5508 | msk_met_2021 | P-0000950-T01-IM3 | 10 | 5 |
| 2 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 1510047 | 104359246 | -0.2373 | msk_met_2021 | P-0000950-T01-IM3 | 10 | 113 |
| 3 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 105517162 | 134199067 | -0.1375 | msk_met_2021 | P-0000950-T01-IM3 | 10 | 34 |
| 4 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 67198913 | 68758509 | -0.4173 | msk_met_2021 | P-0000950-T01-IM3 | 11 | 12 |
| 5 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 532696 | 67197032 | -0.0058 | msk_met_2021 | P-0000950-T01-IM3 | 11 | 91 |
| 6 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 69069538 | 133495556 | 0.0651 | msk_met_2021 | P-0000950-T01-IM3 | 11 | 235 |
| 7 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 394725 | 133263871 | 0.2584 | msk_met_2021 | P-0000950-T01-IM3 | 12 | 372 |
| 8 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 21004738 | 111119077 | 0.3586 | msk_met_2021 | P-0000950-T01-IM3 | 13 | 207 |
| 9 | UC0wMDAwOTUwLVQwMS1JTTM6bXNrX21ldF8yMDIx | UC0wMDAwOTUwOm1za19tZXRfMjAyMQ | P-0000950 | 38055061 | 38068061 | 0.5235 | msk_met_2021 | P-0000950-T01-IM3 | 14 | 5 |
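In the output above, the chromosome values appear to be ordered as strings ("1", "10", "11", ...) rather than numerically. If numeric ordering is needed, a sort key along the following lines can be applied afterwards. This is a minimal sketch; chromosome_sort_key is a hypothetical helper, and the list of labels is illustrative.

```python
def chromosome_sort_key(chromosome):
    """Sort numeric chromosome names by value, then other names alphabetically.

    Hypothetical helper for illustration; not part of pybioportal.
    """
    if chromosome.isdigit():
        return (0, int(chromosome), "")
    # Non-numeric names such as "X" or "Y" sort after all numeric ones.
    return (1, 0, chromosome)


# Illustrative chromosome labels in string (lexicographic) order.
chromosomes = ["1", "10", "11", "12", "13", "14", "2", "21", "22", "X"]
ordered = sorted(chromosomes, key=chromosome_sort_key)
```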