Samples
The samples module provides functions for the Samples section of the
cBioPortal Web Public API.
- pybioportal.samples.fetch_samples(sample_identifiers=None, sample_list_ids=None, unique_sample_keys=None, projection='SUMMARY')
Fetch samples by ID.
- Parameters:
sample_identifiers (list of dict) –
List of dicts, each pairing a list of sample IDs with the study ID they belong to.
Each dict should have the following format:
sample_identifiers=[
    {"sample_ids": ["TCGA-AR-A1AR-01", "TCGA-BH-A1EO-01", "TCGA-BH-A1ES-01"],
     "study_id": "brca_tcga"},
    {"sample_ids": ["TCGA-A2-A0T2-01", "TCGA-A2-A04P-01"],
     "study_id": "brca_tcga_pub"}
]
sample_list_ids (list of str) – List of Sample List IDs (e.g., ["brca_tcga_cna", "brca_tcga_mrna", "brca_tcga_pub_cna"]).
unique_sample_keys (list of str) – List of Unique Sample Keys (e.g., ["VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ", "VENHQS1CNi1BMElRLTAxOmJyY2FfdGNnYV9wdWI", "VENHQS1CSC1BMUZELTAxOmJyY2FfdGNnYQ"]).
projection (str) –
Level of detail of the response.
Possible values:
“DETAILED”: Detailed information.
“ID”: Information with only IDs.
“META”: Metadata information.
“SUMMARY”: Summary information (default).
- Returns:
A DataFrame containing the requested samples.
- Return type:
pandas.DataFrame
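Sample IDs often arrive as a flat list of (study, sample) pairs rather than grouped by study. The sketch below shows one way to build the `sample_identifiers` structure described above from such pairs. The helper name `build_sample_identifiers` is hypothetical and is not part of pybioportal.

```python
from collections import defaultdict


def build_sample_identifiers(pairs):
    """Group (study_id, sample_id) pairs into the list-of-dict format.

    Hypothetical helper, not part of pybioportal.
    """
    grouped = defaultdict(list)
    for study_id, sample_id in pairs:
        grouped[study_id].append(sample_id)
    # Dicts keep insertion order, so studies appear in first-seen order.
    return [
        {"sample_ids": sample_ids, "study_id": study_id}
        for study_id, sample_ids in grouped.items()
    ]


pairs = [
    ("brca_tcga", "TCGA-AR-A1AR-01"),
    ("brca_tcga", "TCGA-BH-A1EO-01"),
    ("brca_tcga_pub", "TCGA-A2-A0T2-01"),
]
sample_identifiers = build_sample_identifiers(pairs)
print(sample_identifiers)
```

The resulting list can then be passed as the `sample_identifiers` argument of `fetch_samples`.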
- pybioportal.samples.get_all_samples_in_study(study_id, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all samples in a study.
- Parameters:
study_id (str) – Study ID (e.g., “acc_tcga”).
direction (str) –
Direction of the sort.
Possible values:
“ASC”: Ascending (default).
“DESC”: Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
“DETAILED”: Detailed information.
“ID”: Information with only IDs.
“META”: Metadata information.
“SUMMARY”: Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
“sampleId”: Sort by sample ID.
“sampleType”: Sort by sample type.
- Returns:
A DataFrame containing samples in the specified study.
- Return type:
pandas.DataFrame
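The `pageNumber` and `pageSize` parameters allow a large study to be read in chunks instead of one request. The following is a minimal sketch of such a loop, assuming that `pageNumber` is a zero-based page index and that a page past the end comes back empty; neither detail is confirmed by this page. The names `iter_pages` and `fake_fetch` are hypothetical, and `fake_fetch` only stands in for the real function so the sketch runs offline.

```python
def iter_pages(fetch_page, page_size=500, **kwargs):
    """Yield successive pages until one comes back empty or short.

    fetch_page is any callable that accepts pageNumber and pageSize keyword
    arguments and returns a sized result (len() also works on a DataFrame).
    """
    page_number = 0
    while True:
        page = fetch_page(pageNumber=page_number, pageSize=page_size, **kwargs)
        if len(page) == 0:
            break
        yield page
        if len(page) < page_size:
            break  # a short page is taken to be the last one
        page_number += 1


# Offline stand-in for sp.get_all_samples_in_study (assumption: zero-based pages).
def fake_fetch(study_id, pageNumber=0, pageSize=10000000):
    rows = [f"{study_id}-sample-{i}" for i in range(7)]
    start = pageNumber * pageSize
    return rows[start:start + pageSize]


pages = list(iter_pages(fake_fetch, page_size=3, study_id="demo_study"))
page_lengths = [len(page) for page in pages]
print(page_lengths)
```

With the real library, the same loop would presumably be called as `iter_pages(sp.get_all_samples_in_study, page_size=500, study_id="brca_tcga")`, with the pages combined afterwards using `pandas.concat`.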
- pybioportal.samples.get_all_samples_of_patient_in_study(study_id, patient_id, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all samples of a patient in a study.
- Parameters:
study_id (str) – Study ID (e.g., “acc_tcga”).
patient_id (str) – Patient ID (e.g., “TCGA-OR-A5J2”).
direction (str) –
Direction of the sort.
Possible values:
“ASC”: Ascending (default).
“DESC”: Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
“DETAILED”: Detailed information.
“ID”: Information with only IDs.
“META”: Metadata information.
“SUMMARY”: Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
“sampleId”: Sort by sample ID.
“sampleType”: Sort by sample type.
- Returns:
A DataFrame containing samples of the specified patient in the study.
- Return type:
pandas.DataFrame
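The sorting, paging, and projection parameters accept only the values listed above, so it can help to check them before sending a request. The sketch below is one possible client-side check built from the documented values. The function `check_query_args` and the constants are hypothetical; whether pybioportal performs its own validation is not stated on this page.

```python
# Allowed values as documented for the query functions of this module.
VALID_DIRECTIONS = ("ASC", "DESC")
VALID_PROJECTIONS = ("DETAILED", "ID", "META", "SUMMARY")
VALID_SORT_BY = (None, "sampleId", "sampleType")
MAX_PAGE_SIZE = 10000000


def check_query_args(direction="ASC", pageNumber=0, pageSize=MAX_PAGE_SIZE,
                     projection="SUMMARY", sortBy=None):
    """Return the arguments as a dict, or raise ValueError if one is invalid."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {VALID_DIRECTIONS}, got {direction!r}")
    if pageNumber < 0:
        raise ValueError(f"pageNumber must be >= 0, got {pageNumber}")
    if not 1 <= pageSize <= MAX_PAGE_SIZE:
        raise ValueError(f"pageSize must be between 1 and {MAX_PAGE_SIZE}, got {pageSize}")
    if projection not in VALID_PROJECTIONS:
        raise ValueError(f"projection must be one of {VALID_PROJECTIONS}, got {projection!r}")
    if sortBy not in VALID_SORT_BY:
        raise ValueError(f"sortBy must be one of {VALID_SORT_BY}, got {sortBy!r}")
    return {
        "direction": direction,
        "pageNumber": pageNumber,
        "pageSize": pageSize,
        "projection": projection,
        "sortBy": sortBy,
    }


query = check_query_args(direction="DESC", pageSize=100, sortBy="sampleId")
print(query)
```

Because the keys match the keyword names in the signatures above, the checked dict can be unpacked into a call, for example `sp.get_all_samples_of_patient_in_study(study_id="brca_tcga", patient_id="TCGA-AR-A1AR", **query)`.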
- pybioportal.samples.get_sample_in_study(study_id, sample_id)
Get information about a specific sample in a study.
- Parameters:
study_id (str) – Study ID (e.g., “acc_tcga”).
sample_id (str) – Sample ID (e.g., “TCGA-OR-A5J2-01”).
- Returns:
A DataFrame containing information about the specified sample.
- Return type:
pandas.DataFrame
- pybioportal.samples.get_samples_by_keyword(keyword=None, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all samples matching a keyword.
- Parameters:
keyword (str) – Search keyword that applies to the study ID.
direction (str) –
Direction of the sort.
Possible values:
“ASC”: Ascending (default).
“DESC”: Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
“DETAILED”: Detailed information.
“ID”: Information with only IDs.
“META”: Metadata information.
“SUMMARY”: Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
“sampleId”: Sort by sample ID.
“sampleType”: Sort by sample type.
- Returns:
A DataFrame containing samples matching the keyword.
- Return type:
pandas.DataFrame
Examples
from pybioportal import samples as sp
df1 = sp.get_samples_by_keyword(keyword="TCGA")
df1
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS0wMi0wMDAxLTAxOmdibV90Y2dhX3Bhbl9jYW5fYX... | VENHQS0wMi0wMDAxOmdibV90Y2dhX3Bhbl9jYW5fYXRsYX... | Primary Solid Tumor | TCGA-02-0001-01 | TCGA-02-0001 | gbm_tcga_pan_can_atlas_2018 |
| 1 | VENHQS0wMi0wMDAxLTAxOmxnZ2dibV90Y2dhX3B1Yg | VENHQS0wMi0wMDAxOmxnZ2dibV90Y2dhX3B1Yg | Primary Solid Tumor | TCGA-02-0001-01 | TCGA-02-0001 | lgggbm_tcga_pub |
| 2 | VENHQS0wMi0wMDAxLTAxOmdibV90Y2dhX3B1Yg | VENHQS0wMi0wMDAxOmdibV90Y2dhX3B1Yg | Primary Solid Tumor | TCGA-02-0001-01 | TCGA-02-0001 | gbm_tcga_pub |
| 3 | VENHQS0wMi0wMDAxLTAxOmdibV90Y2dhX3B1YjIwMTM | VENHQS0wMi0wMDAxOmdibV90Y2dhX3B1YjIwMTM | Primary Solid Tumor | TCGA-02-0001-01 | TCGA-02-0001 | gbm_tcga_pub2013 |
| 4 | VENHQS0wMi0wMDAxLTAxOmdibV90Y2dh | VENHQS0wMi0wMDAxOmdibV90Y2dh | Primary Solid Tumor | TCGA-02-0001-01 | TCGA-02-0001 | gbm_tcga |
| ... | ... | ... | ... | ... | ... | ... |
| 33581 | SURUQ0dBLTAyOm1peGVkX21za190Y2dhXzIwMjE | SURUQ0dBLTAyOm1peGVkX21za190Y2dhXzIwMjE | Primary Solid Tumor | IDTCGA-02 | IDTCGA-02 | mixed_msk_tcga_2021 |
| 33582 | SURUQ0dBLTAzOm1peGVkX21za190Y2dhXzIwMjE | SURUQ0dBLTAzOm1peGVkX21za190Y2dhXzIwMjE | Primary Solid Tumor | IDTCGA-03 | IDTCGA-03 | mixed_msk_tcga_2021 |
| 33583 | SURUQ0dBLTA0Om1peGVkX21za190Y2dhXzIwMjE | SURUQ0dBLTA0Om1peGVkX21za190Y2dhXzIwMjE | Primary Solid Tumor | IDTCGA-04 | IDTCGA-04 | mixed_msk_tcga_2021 |
| 33584 | SURUQ0dBLTA1Om1peGVkX21za190Y2dhXzIwMjE | SURUQ0dBLTA1Om1peGVkX21za190Y2dhXzIwMjE | Primary Solid Tumor | IDTCGA-05 | IDTCGA-05 | mixed_msk_tcga_2021 |
| 33585 | SURUQ0dBLTA2Om1peGVkX21za190Y2dhXzIwMjE | SURUQ0dBLTA2Om1peGVkX21za190Y2dhXzIwMjE | Primary Solid Tumor | IDTCGA-06 | IDTCGA-06 | mixed_msk_tcga_2021 |
33586 rows × 6 columns
df2a = sp.fetch_samples(sample_identifiers=[
    {"sample_ids": ["TCGA-AR-A1AR-01", "TCGA-BH-A1EO-01", "TCGA-BH-A1ES-01"],
     "study_id": "brca_tcga"},
    {"sample_ids": ["TCGA-A2-A0T2-01", "TCGA-A2-A04P-01"],
     "study_id": "brca_tcga_pub"}
])
df2a
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ | VENHQS1BUi1BMUFSOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |
| 1 | VENHQS1CSC1BMUVPLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVPOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1EO-01 | TCGA-BH-A1EO | brca_tcga |
| 2 | VENHQS1CSC1BMUVTLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVTOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1ES-01 | TCGA-BH-A1ES | brca_tcga |
| 3 | VENHQS1BMi1BMFQyLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1BMi1BMFQyOmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-A2-A0T2-01 | TCGA-A2-A0T2 | brca_tcga_pub |
| 4 | VENHQS1BMi1BMDRQLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1BMi1BMDRQOmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-A2-A04P-01 | TCGA-A2-A04P | brca_tcga_pub |
df2b = sp.fetch_samples(sample_list_ids=["brca_tcga_cna", "brca_tcga_mrna", "brca_tcga_pub_cna"])
df2b
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ | VENHQS1BUi1BMUFSOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |
| 1 | VENHQS1CSC1BMUVPLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVPOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1EO-01 | TCGA-BH-A1EO | brca_tcga |
| 2 | VENHQS1CSC1BMUVTLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVTOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1ES-01 | TCGA-BH-A1ES | brca_tcga |
| 3 | VENHQS1CSC1BMUVULTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVUOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1ET-01 | TCGA-BH-A1ET | brca_tcga |
| 4 | VENHQS1CSC1BMUVVLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVVOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1EU-01 | TCGA-BH-A1EU | brca_tcga |
| ... | ... | ... | ... | ... | ... | ... |
| 2382 | VENHQS1BQy1BMkZGLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1BQy1BMkZGOmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-AC-A2FF-01 | TCGA-AC-A2FF | brca_tcga_pub |
| 2383 | VENHQS1BQy1BMkZCLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1BQy1BMkZCOmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-AC-A2FB-01 | TCGA-AC-A2FB | brca_tcga_pub |
| 2384 | VENHQS1BQy1BMkZHLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1BQy1BMkZHOmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-AC-A2FG-01 | TCGA-AC-A2FG | brca_tcga_pub |
| 2385 | VENHQS1HSS1BMkM4LTAxOmJyY2FfdGNnYV9wdWI | VENHQS1HSS1BMkM4OmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-GI-A2C8-01 | TCGA-GI-A2C8 | brca_tcga_pub |
| 2386 | VENHQS1FOS1BMjk1LTAxOmJyY2FfdGNnYV9wdWI | VENHQS1FOS1BMjk1OmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-E9-A295-01 | TCGA-E9-A295 | brca_tcga_pub |
2387 rows × 6 columns
df2c = sp.fetch_samples(unique_sample_keys=[
    "VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ",
    "VENHQS1CNi1BMElRLTAxOmJyY2FfdGNnYV9wdWI",
    "VENHQS1CSC1BMUZELTAxOmJyY2FfdGNnYQ"
])
df2c
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ | VENHQS1BUi1BMUFSOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |
| 1 | VENHQS1CSC1BMUZELTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUZEOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1FD-01 | TCGA-BH-A1FD | brca_tcga |
| 2 | VENHQS1CNi1BMElRLTAxOmJyY2FfdGNnYV9wdWI | VENHQS1CNi1BMElROmJyY2FfdGNnYV9wdWI | Primary Solid Tumor | TCGA-B6-A0IQ-01 | TCGA-B6-A0IQ | brca_tcga_pub |
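The unique sample keys in these tables look opaque, but the ones shown here decode as unpadded Base64 of the string `sampleId:studyId`. The sketch below decodes a key on that basis. This is an observation from the example values, not a documented guarantee of the API, and the helper `decode_unique_sample_key` is hypothetical.

```python
import base64


def decode_unique_sample_key(key):
    """Decode a unique sample key into (sample_id, study_id).

    Assumes the key is unpadded Base64 of "sampleId:studyId", as observed in
    the example tables; this format is not confirmed by the documentation.
    """
    padded = key + "=" * (-len(key) % 4)  # restore the stripped '=' padding
    text = base64.urlsafe_b64decode(padded).decode("utf-8")
    # Split on the last colon, assuming study IDs never contain one.
    sample_id, _, study_id = text.rpartition(":")
    return sample_id, study_id


decoded = decode_unique_sample_key("VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ")
print(decoded)
```

For the first key in the table above this gives the sample ID `TCGA-AR-A1AR-01` and the study ID `brca_tcga`, matching the `sampleId` and `studyId` columns of that row.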
df3 = sp.get_all_samples_of_patient_in_study(study_id="brca_tcga", patient_id="TCGA-AR-A1AR")
df3
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ | VENHQS1BUi1BMUFSOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |
df4 = sp.get_all_samples_in_study(study_id="brca_tcga")
df4
| | uniqueSampleKey | uniquePatientKey | sampleType | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | VENHQS1BUi1BMUFSLTAxOmJyY2FfdGNnYQ | VENHQS1BUi1BMUFSOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |
| 1 | VENHQS1CSC1BMUVPLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVPOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1EO-01 | TCGA-BH-A1EO | brca_tcga |
| 2 | VENHQS1CSC1BMUVTLTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVTOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1ES-01 | TCGA-BH-A1ES | brca_tcga |
| 3 | VENHQS1CSC1BMUVTLTA2OmJyY2FfdGNnYQ | VENHQS1CSC1BMUVTOmJyY2FfdGNnYQ | Metastatic | TCGA-BH-A1ES-06 | TCGA-BH-A1ES | brca_tcga |
| 4 | VENHQS1CSC1BMUVULTAxOmJyY2FfdGNnYQ | VENHQS1CSC1BMUVUOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-BH-A1ET-01 | TCGA-BH-A1ET | brca_tcga |
| ... | ... | ... | ... | ... | ... | ... |
| 1103 | VENHQS1FMi1BMUI0LTAxOmJyY2FfdGNnYQ | VENHQS1FMi1BMUI0OmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-E2-A1B4-01 | TCGA-E2-A1B4 | brca_tcga |
| 1104 | VENHQS1FMi1BMUI1LTAxOmJyY2FfdGNnYQ | VENHQS1FMi1BMUI1OmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-E2-A1B5-01 | TCGA-E2-A1B5 | brca_tcga |
| 1105 | VENHQS1FMi1BMUI2LTAxOmJyY2FfdGNnYQ | VENHQS1FMi1BMUI2OmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-E2-A1B6-01 | TCGA-E2-A1B6 | brca_tcga |
| 1106 | VENHQS1FMi1BMUJDLTAxOmJyY2FfdGNnYQ | VENHQS1FMi1BMUJDOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-E2-A1BC-01 | TCGA-E2-A1BC | brca_tcga |
| 1107 | VENHQS1FMi1BMUJELTAxOmJyY2FfdGNnYQ | VENHQS1FMi1BMUJEOmJyY2FfdGNnYQ | Primary Solid Tumor | TCGA-E2-A1BD-01 | TCGA-E2-A1BD | brca_tcga |
1108 rows × 6 columns
df5 = sp.get_sample_in_study(study_id="brca_tcga", sample_id="TCGA-AR-A1AR-01")
df5
| | sampleType | sequenced | copyNumberSegmentPresent | sampleId | patientId | studyId |
|---|---|---|---|---|---|---|
| 0 | Primary Solid Tumor | True | True | TCGA-AR-A1AR-01 | TCGA-AR-A1AR | brca_tcga |