Studies

The studies module provides functions for the Studies section of the cBioPortal Web Public API.

pybioportal.studies.fetch_studies(study_ids=[], projection='SUMMARY')

Fetch studies by IDs.

Parameters:
  • study_ids (list of str) – List of study identifiers (e.g., ["brca_tcga", "acc_tcga"]).

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • "DETAILED": Detailed information.

    • "ID": Information with only IDs.

    • "META": Metadata information.

    • "SUMMARY": Summary information (default).

Returns:

A DataFrame containing the studies that match the given IDs.

Return type:

pandas.DataFrame
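
The projection values documented for this function can be checked on the client side before a request is sent. The following is a minimal sketch; `check_projection` and `PROJECTIONS` are hypothetical names introduced here for illustration and are not part of pybioportal.

```python
# Allowed values, copied from the parameter description of fetch_studies.
PROJECTIONS = ("DETAILED", "ID", "META", "SUMMARY")


def check_projection(projection="SUMMARY"):
    """Return projection unchanged if it is a documented value, else raise ValueError."""
    if projection not in PROJECTIONS:
        raise ValueError(
            f"projection must be one of {PROJECTIONS}, got {projection!r}"
        )
    return projection
```

The checked value could then be passed on, for example as `std.fetch_studies(study_ids=["brca_tcga"], projection=check_projection("ID"))`. The comparison is case-sensitive by design, which is an assumption of this sketch rather than documented behavior.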

pybioportal.studies.fetch_tags_for_studies(study_ids=[])

Get the tags of multiple studies by their IDs.

Parameters:

study_ids (list of str) – List of study identifiers.

Returns:

A DataFrame containing the tags of the requested studies.

Return type:

pandas.DataFrame

pybioportal.studies.get_all_studies(keyword=None, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)

Get all studies.

Parameters:
  • keyword (str) – Search keyword matched against the name and cancer type of each study.

  • direction (str) –

    Direction of the sort.

    Possible values:

    • "ASC": Ascending (default).

    • "DESC": Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 10000000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • "DETAILED": Detailed information.

    • "ID": Information with only IDs.

    • "META": Metadata information.

    • "SUMMARY": Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • "cancerTypeId"

    • "citation"

    • "description"

    • "groups"

    • "importDate"

    • "name"

    • "pmid"

    • "publicStudy"

    • "status"

    • "studyId"

Returns:

A DataFrame listing all studies that match the query.

Return type:

pandas.DataFrame
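
The pageNumber and pageSize parameters allow the study list to be read in smaller chunks instead of a single large request. The following is a minimal sketch of such a loop; `iter_pages` and `fetch_page` are hypothetical names, and the stop condition assumes that a page past the end of the list comes back as `None` or as an object with zero rows, which the documentation above does not confirm.

```python
def iter_pages(fetch_page, max_pages=1000):
    """Yield pages from fetch_page(page_number), starting at page 0.

    Stops at the first page that is None or has length 0 (assumed
    end-of-data signal), or after max_pages pages as a safety limit.
    """
    for page_number in range(max_pages):
        page = fetch_page(page_number)
        if page is None or len(page) == 0:
            break
        yield page


# Possible use with pybioportal (requires network access, not run here):
# from pybioportal import studies as std
# pages = iter_pages(
#     lambda n: std.get_all_studies(pageNumber=n, pageSize=50, sortBy="studyId")
# )
# for page in pages:
#     print(len(page))
```

Passing the request as a callable keeps the loop independent of the library, so it can be exercised with a stub before it is pointed at the real API.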

pybioportal.studies.get_study(study_id)

Get a study by ID.

Parameters:

study_id (str) – Study ID (e.g., "acc_tcga").

Returns:

A DataFrame containing information about the study.

Return type:

pandas.DataFrame

pybioportal.studies.get_tags_of_study(study_id)

Get the tags of a study.

Parameters:

study_id (str) – Study ID (e.g., "acc_tcga").

Returns:

A DataFrame containing tags associated with the study.

Return type:

pandas.DataFrame


Examples

from pybioportal import studies as std
df1 = std.get_all_studies(keyword="TCGA", projection="DETAILED")
df1
|  | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome | pmid | citation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Acute Myeloid Leukemia (TCGA, Firehose Legacy) | TCGA Acute Myeloid Leukemia. Source data from ... | True | PUBLIC | 0 | 2023-06-19 09:43:19 | 200 | 197 | 191 | 179 | ... | laml_tcga | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | NaN | NaN |
| 1 | Acute Myeloid Leukemia (TCGA, NEJM 2013) | Whole-genome or whole-exome sequencing analysi... | True | PUBLIC | 0 | 2023-06-19 23:07:48 | 200 | 200 | 191 | 0 | ... | laml_tcga_pub | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | 23634996 | TCGA, NEJM 2013 |
| 2 | Acute Myeloid Leukemia (TCGA, PanCancer Atlas) | Acute Myeloid Leukemia TCGA PanCancer data. Th... | True | PUBLIC;PANCAN | 0 | 2023-06-19 12:05:01 | 200 | 200 | 191 | 0 | ... | laml_tcga_pan_can_atlas_2018 | aml | Acute Myeloid Leukemia | LightSalmon | AML | mnm | aml | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 3 | Adrenocortical Carcinoma (TCGA, Firehose Legacy) | TCGA Adrenocortical Carcinoma. Source data fro... | True | PUBLIC | 0 | 2023-06-19 09:42:47 | 92 | 90 | 90 | 0 | ... | acc_tcga | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 | NaN | NaN |
| 4 | Adrenocortical Carcinoma (TCGA, PanCancer Atlas) | Adrenocortical Carcinoma TCGA PanCancer data. ... | True | PUBLIC;PANCAN | 0 | 2023-08-12 08:25:24 | 92 | 91 | 89 | 0 | ... | acc_tcga_pan_can_atlas_2018 | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 86 | Uterine Corpus Endometrial Carcinoma (TCGA, Na... | Whole exome sequencing of 373 endometrial carc... | True | PUBLIC | 0 | 2023-06-20 17:02:23 | 373 | 248 | 363 | 0 | ... | ucec_tcga_pub | ucec | Endometrial Carcinoma | PeachPuff | UCEC | uterus | ucec | hg19 | 23636398 | TCGA, Nature 2013 |
| 87 | Uterine Corpus Endometrial Carcinoma (TCGA, Pa... | Uterine Corpus Endometrial Carcinoma TCGA PanC... | True | PUBLIC;PANCAN | 0 | 2023-08-15 17:17:50 | 529 | 517 | 523 | 0 | ... | ucec_tcga_pan_can_atlas_2018 | ucec | Endometrial Carcinoma | PeachPuff | UCEC | uterus | ucec | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 88 | Uveal Melanoma (TCGA, Firehose Legacy) | TCGA Uveal Melanoma. Source data from <A HREF=... | True | PUBLIC | 0 | 2023-06-19 11:16:24 | 80 | 80 | 80 | 0 | ... | uvm_tcga | um | Uveal Melanoma | Green | UM | om | um | hg19 | NaN | NaN |
| 89 | Uveal Melanoma (TCGA, PanCancer Atlas) | Uveal melanoma TCGA PanCancer data. The origin... | True | PUBLIC;PANCAN | 0 | 2023-08-15 21:52:13 | 80 | 80 | 80 | 0 | ... | uvm_tcga_pan_can_atlas_2018 | um | Uveal Melanoma | Green | UM | om | um | hg19 | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 |
| 90 | RAD51B Associated Mixed Cancers (Mandelker 2021) | Germline RAD51B loss-of-function variants conf... | True |  | 0 | 2023-06-20 10:33:20 | 19 | 19 | 0 | 0 | ... | mixed_msk_tcga_2021 | mixed | Mixed Cancer Types | Black | MIXED | other | mixed | hg19 | NaN | NaN |

91 rows × 29 columns
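
In the output above, groups holds semicolon-separated values (e.g., PUBLIC;PANCAN) and pmid holds comma-separated PubMed IDs, with NaN where no value is present. The following is a minimal sketch of splitting such cells into lists; `split_field` is a hypothetical helper and not part of pybioportal, and it assumes that populated cells are strings.

```python
def split_field(value, sep):
    """Split a delimited string into a list of non-empty parts.

    Non-string inputs such as NaN (a float) or None give an empty list.
    """
    if not isinstance(value, str):
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


print(split_field("PUBLIC;PANCAN", ";"))  # ['PUBLIC', 'PANCAN']
print(split_field(float("nan"), ","))     # []
```

Applied to the DataFrame, this could look like `df1["groups"].apply(lambda v: split_field(v, ";"))`, which would produce one list per study.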

df2 = std.get_study(study_id="brca_tcga")
df2
|  | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | readPermission | treatmentCount | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Breast Invasive Carcinoma (TCGA, Firehose Legacy) | TCGA Breast Invasive Carcinoma. Source data fr... | True | PUBLIC | 0 | 2023-11-09 17:45:45 | 1108 | 982 | 1080 | 0 | ... | True | 0 | brca_tcga | brca | Invasive Breast Carcinoma | HotPink | BRCA | breast | brca | hg19 |

1 rows × 27 columns

df3 = std.get_tags_of_study(study_id="bowel_colitis_msk_2022")
df3

df4 = std.fetch_studies(study_ids=["brca_tcga","acc_tcga"], projection="DETAILED")
df4
|  | name | description | publicStudy | groups | status | importDate | allSampleCount | sequencedSampleCount | cnaSampleCount | mrnaRnaSeqSampleCount | ... | readPermission | treatmentCount | studyId | cancerTypeId | cancerType_name | cancerType_dedicatedColor | cancerType_shortName | cancerType_parent | cancerType_cancerTypeId | referenceGenome |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Adrenocortical Carcinoma (TCGA, Firehose Legacy) | TCGA Adrenocortical Carcinoma. Source data fro... | True | PUBLIC | 0 | 2023-06-19 09:42:47 | 92 | 90 | 90 | 0 | ... | True | 0 | acc_tcga | acc | Adrenocortical Carcinoma | Purple | ACC | adrenal_gland | acc | hg19 |
| 1 | Breast Invasive Carcinoma (TCGA, Firehose Legacy) | TCGA Breast Invasive Carcinoma. Source data fr... | True | PUBLIC | 0 | 2023-11-09 17:45:45 | 1108 | 982 | 1080 | 0 | ... | True | 0 | brca_tcga | brca | Invasive Breast Carcinoma | HotPink | BRCA | breast | brca | hg19 |

2 rows × 27 columns

df5 = std.fetch_tags_for_studies(study_ids=["brca_tcga","acc_tcga"])
df5
Response is empty. No data available.