Studies
The studies module provides functions for the Studies section of the
cBioPortal Web Public API.
- pybioportal.studies.fetch_studies(study_ids=[], projection='SUMMARY')
Fetch studies by IDs.
- Parameters:
study_ids (list of str) – List of study identifiers (e.g., ["brca_tcga", "acc_tcga"]).
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
- Returns:
A DataFrame listing the studies that match the given IDs.
- Return type:
pandas.DataFrame
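Because study_ids is documented as a list of strings, it can help to tidy the identifiers before passing them in. The following is a minimal sketch, not part of pybioportal: normalize_study_ids is a hypothetical helper that wraps a bare string in a list, strips whitespace, and drops duplicates while preserving order.

```python
def normalize_study_ids(study_ids):
    """Return a clean list of study IDs suitable for a study_ids argument.

    Hypothetical helper for illustration; it is not provided by pybioportal.
    """
    # A bare string would otherwise be iterated character by character.
    if isinstance(study_ids, str):
        study_ids = [study_ids]
    cleaned = []
    seen = set()
    for study_id in study_ids:
        study_id = study_id.strip()
        if not study_id:
            raise ValueError("study IDs must be non-empty strings")
        if study_id not in seen:  # keep the first occurrence, preserve order
            seen.add(study_id)
            cleaned.append(study_id)
    return cleaned


ids = normalize_study_ids(["brca_tcga", " acc_tcga", "brca_tcga"])
print(ids)  # ['brca_tcga', 'acc_tcga']
```

The resulting list can then be passed on, for example as fetch_studies(study_ids=ids).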
- pybioportal.studies.fetch_tags_for_studies(study_ids=[])
Get the tags of multiple studies by ID.
- Parameters:
study_ids (list of str) – List of study identifiers.
- Returns:
A DataFrame containing study tags for multiple studies.
- Return type:
pandas.DataFrame
- pybioportal.studies.get_all_studies(keyword=None, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all studies.
- Parameters:
keyword (str) – Search keyword matched against the name and cancer type of each study.
direction (str) –
Direction of the sort.
Possible values:
"ASC": Ascending (default).
"DESC": Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
"cancerTypeId"
"citation"
"description"
"groups"
"importDate"
"name"
"pmid"
"publicStudy"
"status"
"studyId"
- Returns:
A DataFrame listing the studies.
- Return type:
pandas.DataFrame
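The documented value ranges for get_all_studies can be checked locally before a request is sent. The sketch below is an illustration only: build_study_query and the constant names are hypothetical and not part of pybioportal, and the allowed values are copied from the parameter list above.

```python
# Allowed values, copied from the parameter documentation above.
DIRECTIONS = {"ASC", "DESC"}
PROJECTIONS = {"DETAILED", "ID", "META", "SUMMARY"}
SORT_FIELDS = {
    "cancerTypeId", "citation", "description", "groups", "importDate",
    "name", "pmid", "publicStudy", "status", "studyId",
}
MAX_PAGE_SIZE = 10_000_000


def build_study_query(keyword=None, direction="ASC", pageNumber=0,
                      pageSize=MAX_PAGE_SIZE, projection="SUMMARY",
                      sortBy=None):
    """Validate arguments and return them as a keyword-argument dict.

    Hypothetical helper for illustration; it is not provided by pybioportal.
    """
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(DIRECTIONS)}")
    if pageNumber < 0:
        raise ValueError("pageNumber must be at least 0")
    if not 1 <= pageSize <= MAX_PAGE_SIZE:
        raise ValueError(f"pageSize must be between 1 and {MAX_PAGE_SIZE}")
    if projection not in PROJECTIONS:
        raise ValueError(f"projection must be one of {sorted(PROJECTIONS)}")
    if sortBy is not None and sortBy not in SORT_FIELDS:
        raise ValueError(f"sortBy must be one of {sorted(SORT_FIELDS)}")
    return {
        "keyword": keyword,
        "direction": direction,
        "pageNumber": pageNumber,
        "pageSize": pageSize,
        "projection": projection,
        "sortBy": sortBy,
    }


query = build_study_query(keyword="TCGA", sortBy="name", direction="DESC")
# With pybioportal installed, the dict can be unpacked into the call:
#     from pybioportal import studies as std
#     df = std.get_all_studies(**query)
```

Validating up front turns a misspelled sort field or an out-of-range page size into an immediate ValueError instead of a failed request.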
- pybioportal.studies.get_study(study_id)
Get a study by ID.
- Parameters:
study_id (str) – Study ID (e.g., "acc_tcga").
- Returns:
A DataFrame containing information about the study.
- Return type:
pandas.DataFrame
- pybioportal.studies.get_tags_of_study(study_id)
Get the tags of a study.
- Parameters:
study_id (str) – Study ID (e.g., "acc_tcga").
- Returns:
A DataFrame containing tags associated with the study.
- Return type:
pandas.DataFrame
Examples
from pybioportal import studies as std
df1 = std.get_all_studies(keyword="TCGA", projection="DETAILED")
df1
| | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome | pmid | citation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Acute Myeloid Leukemia (TCGA, Firehose Legacy) | TCGA Acute Myeloid Leukemia. Source data from ... | True | PUBLIC | 0 | 2023-06-19 09:43:19 | 200 | 197 | 191 | 179 | ... | laml_tcga | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | NaN | NaN |
| 1 | Acute Myeloid Leukemia (TCGA, NEJM 2013) | Whole-genome or whole-exome sequencing analysi... | True | PUBLIC | 0 | 2023-06-19 23:07:48 | 200 | 200 | 191 | 0 | ... | laml_tcga_pub | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | 23634996 | TCGA, NEJM 2013 |
| 2 | Acute Myeloid Leukemia (TCGA, PanCancer Atlas) | Acute Myeloid Leukemia TCGA PanCancer data. Th... | True | PUBLIC;PANCAN | 0 | 2023-06-19 12:05:01 | 200 | 200 | 191 | 0 | ... | laml_tcga_pan_can_atlas_2018 | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 3 | Adrenocortical Carcinoma (TCGA, Firehose Legacy) | TCGA Adrenocortical Carcinoma. Source data fro... | True | PUBLIC | 0 | 2023-06-19 09:42:47 | 92 | 90 | 90 | 0 | ... | acc_tcga | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 | NaN | NaN |
| 4 | Adrenocortical Carcinoma (TCGA, PanCancer Atlas) | Adrenocortical Carcinoma TCGA PanCancer data. ... | True | PUBLIC;PANCAN | 0 | 2023-08-12 08:25:24 | 92 | 91 | 89 | 0 | ... | acc_tcga_pan_can_atlas_2018 | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 86 | Uterine Corpus Endometrial Carcinoma (TCGA, Na... | Whole exome sequencing of 373 endometrial carc... | True | PUBLIC | 0 | 2023-06-20 17:02:23 | 373 | 248 | 363 | 0 | ... | ucec_tcga_pub | ucec | Endometrial Carcinoma | PeachPuff | UCEC | uterus | ucec | hg19 | 23636398 | TCGA, Nature 2013 |
| 87 | Uterine Corpus Endometrial Carcinoma (TCGA, Pa... | Uterine Corpus Endometrial Carcinoma TCGA PanC... | True | PUBLIC;PANCAN | 0 | 2023-08-15 17:17:50 | 529 | 517 | 523 | 0 | ... | ucec_tcga_pan_can_atlas_2018 | ucec | Endometrial Carcinoma | PeachPuff | UCEC | uterus | ucec | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 88 | Uveal Melanoma (TCGA, Firehose Legacy) | TCGA Uveal Melanoma. Source data from <A HREF=... | True | PUBLIC | 0 | 2023-06-19 11:16:24 | 80 | 80 | 80 | 0 | ... | uvm_tcga | um | Uveal Melanoma | Green | UM | om | um | hg19 | NaN | NaN |
| 89 | Uveal Melanoma (TCGA, PanCancer Atlas) | Uveal melanoma TCGA PanCancer data. The origin... | True | PUBLIC;PANCAN | 0 | 2023-08-15 21:52:13 | 80 | 80 | 80 | 0 | ... | uvm_tcga_pan_can_atlas_2018 | um | Uveal Melanoma | Green | UM | om | um | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 90 | RAD51B Associated Mixed Cancers (Mandelker 2021) | Germline RAD51B loss-of-function variants conf... | True | | 0 | 2023-06-20 10:33:20 | 19 | 19 | 0 | 0 | ... | mixed_msk_tcga_2021 | mixed | Mixed Cancer Types | Black | MIXED | other | mixed | hg19 | NaN | NaN |
91 rows × 29 columns
df2 = std.get_study(study_id="brca_tcga")
df2
| | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | readPermission | treatmentCount | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Breast Invasive Carcinoma (TCGA, Firehose Legacy) | TCGA Breast Invasive Carcinoma. Source data fr... | True | PUBLIC | 0 | 2023-11-09 17:45:45 | 1108 | 982 | 1080 | 0 | ... | True | 0 | brca_tcga | brca | Invasive Breast Carcinoma | HotPink | BRCA | breast | brca | hg19 |
1 rows × 27 columns
df3 = std.get_tags_of_study(study_id="bowel_colitis_msk_2022")
df3
df4 = std.fetch_studies(study_ids=["brca_tcga","acc_tcga"], projection="DETAILED")
df4
| | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | readPermission | treatmentCount | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Adrenocortical Carcinoma (TCGA, Firehose Legacy) | TCGA Adrenocortical Carcinoma. Source data fro... | True | PUBLIC | 0 | 2023-06-19 09:42:47 | 92 | 90 | 90 | 0 | ... | True | 0 | acc_tcga | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 |
| 1 | Breast Invasive Carcinoma (TCGA, Firehose Legacy) | TCGA Breast Invasive Carcinoma. Source data fr... | True | PUBLIC | 0 | 2023-11-09 17:45:45 | 1108 | 982 | 1080 | 0 | ... | True | 0 | brca_tcga | brca | Invasive Breast Carcinoma | HotPink | BRCA | breast | brca | hg19 |
2 rows × 27 columns
df5 = std.fetch_tags_for_studies(study_ids=["brca_tcga","acc_tcga"])
df5
Response is empty. No data available.
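As the message above shows, a tag request can come back with no data. When looping over several studies, one option is to keep only the results that contain something. The sketch below is an illustration under assumptions: collect_tags is a hypothetical helper, not part of pybioportal, and it assumes that "no data" appears as None or as an object of length 0, which this page does not confirm. A stand-in fetcher with made-up study names is used so that the block runs offline.

```python
def collect_tags(study_ids, fetch_tags):
    """Map each study ID to its tags, skipping studies with no tag data.

    fetch_tags is any callable that takes a study ID and returns its tags.
    Hypothetical helper for illustration; it is not provided by pybioportal.
    """
    tags_by_study = {}
    for study_id in study_ids:
        result = fetch_tags(study_id)
        # Assumption: "no data" shows up as None or as an object of length 0.
        if result is None or len(result) == 0:
            continue
        tags_by_study[study_id] = result
    return tags_by_study


# Stand-in for the real API call; the study names and tags are invented.
def fake_fetch_tags(study_id):
    sample = {"study_a": [{"tag": "example"}], "study_c": []}
    return sample.get(study_id)


found = collect_tags(["study_a", "study_b", "study_c"], fake_fetch_tags)
print(sorted(found))  # ['study_a']
```

With the library installed, fetch_tags could be lambda s: std.get_tags_of_study(study_id=s), provided the return value for an empty response matches the assumption stated above.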