Molecular Profiles
The molecular_profiles module provides functions for the Molecular Profiles section of the
cBioPortal Web Public API.
- pybioportal.molecular_profiles.fetch_molecular_profiles(molecular_profile_ids=None, study_ids=None, projection='SUMMARY')
Fetch molecular profiles.
- Parameters:
molecular_profile_ids (list of str) – List of Molecular Profile IDs (e.g., ["brca_tcga_mrna", "acc_tcga_rna_seq_v2_mrna"]).
study_ids (list of str) – List of Study IDs (e.g., ["brca_tcga", "acc_tcga"]).
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
- Returns:
A DataFrame containing fetched molecular profiles.
- Return type:
pandas.DataFrame
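As a minimal sketch of how the arguments above fit together, the hypothetical helpers below (not part of pybioportal) check a projection value against the documented options and wrap a bare ID string in a list before delegating to fetch_molecular_profiles. The import is deferred so the local checks can be tried without the package installed; whether the library performs similar checks itself is not stated in this reference.

```python
# Hypothetical helpers for illustration; they are not part of pybioportal.
PROJECTIONS = ("DETAILED", "ID", "META", "SUMMARY")  # values documented above


def check_projection(projection="SUMMARY"):
    """Return the projection unchanged, or raise if it is not a documented value."""
    if projection not in PROJECTIONS:
        raise ValueError(f"projection must be one of {PROJECTIONS}, got {projection!r}")
    return projection


def as_id_list(ids):
    """Wrap a single ID string in a list; pass None through; copy other iterables."""
    if ids is None:
        return None
    if isinstance(ids, str):
        return [ids]
    return list(ids)


def fetch_profiles_checked(molecular_profile_ids=None, study_ids=None, projection="SUMMARY"):
    """Validate the arguments locally, then call fetch_molecular_profiles."""
    # Imported lazily so the checks above work without pybioportal installed.
    from pybioportal import molecular_profiles as mf

    return mf.fetch_molecular_profiles(
        molecular_profile_ids=as_id_list(molecular_profile_ids),
        study_ids=as_id_list(study_ids),
        projection=check_projection(projection),
    )
```

With this wrapper, a call such as fetch_profiles_checked(study_ids="brca_tcga") passes ["brca_tcga"] to the library, and a misspelled projection fails before any request is made.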
- pybioportal.molecular_profiles.get_all_molecular_profiles(direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all molecular profiles.
- Parameters:
direction (str) –
Direction of the sort.
Possible values:
"ASC": Ascending (default).
"DESC": Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
sortBy (str) –
Name of the property that the result list is sorted by.
Possible values:
"datatype": Sort by datatype.
"description": Sort by description.
"molecularAlterationType": Sort by molecular alteration type.
"molecularProfileId": Sort by molecular profile ID.
"name": Sort by name.
"showProfileInAnalysisTab": Sort by profile visibility.
- Returns:
A DataFrame containing molecular profiles.
- Return type:
pandas.DataFrame
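The pageNumber and pageSize parameters can be combined to walk the result in smaller chunks instead of one very large request. The sketch below is one possible way to do that; iter_pages and iter_profile_pages are hypothetical names, and the loop assumes that a page past the end of the data comes back empty (an assumption, not something this reference confirms).

```python
def iter_pages(fetch_page, page_size, start_page=0):
    """Yield pages from fetch_page(page_number, page_size) until the data runs out."""
    page_number = start_page
    while True:
        page = fetch_page(page_number, page_size)
        if len(page) == 0:  # len() works for lists and for pandas DataFrames
            return
        yield page
        if len(page) < page_size:  # a short page means this was the last one
            return
        page_number += 1


def iter_profile_pages(page_size=500):
    """Page through get_all_molecular_profiles with a fixed page size."""
    # Deferred import so iter_pages can be used and tested on its own.
    from pybioportal import molecular_profiles as mf

    return iter_pages(
        lambda number, size: mf.get_all_molecular_profiles(pageNumber=number, pageSize=size),
        page_size,
    )
```

Stopping on a short page avoids an extra request in most cases; only when the total is an exact multiple of the page size does the loop rely on the empty-page assumption.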
- pybioportal.molecular_profiles.get_all_molecular_profiles_in_study(study_id, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)
Get all molecular profiles in a study.
- Parameters:
study_id (str) – Study ID (e.g., "acc_tcga").
direction (str) –
Direction of the sort.
Possible values:
"ASC": Ascending (default).
"DESC": Descending.
pageNumber (int) –
Page number of the result list.
Minimum value is 0.
pageSize (int) –
Page size of the result list.
Minimum value is 1, maximum value is 10000000.
projection (str) –
Level of detail of the response.
Possible values:
"DETAILED": Detailed information.
"ID": Information with only IDs.
"META": Metadata information.
"SUMMARY": Summary information (default).
sortBy (str) –
Name of the property that the result dataframe is sorted by.
Possible values:
"datatype": Sort by datatype.
"description": Sort by description.
"molecularAlterationType": Sort by molecular alteration type.
"molecularProfileId": Sort by molecular profile ID.
"name": Sort by name.
"showProfileInAnalysisTab": Sort by profile visibility.
- Returns:
A DataFrame containing molecular profiles in the specified study.
- Return type:
pandas.DataFrame
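A common follow-up to listing a study's profiles is to group the profile IDs by alteration type. The sketch below works on plain row dictionaries so it runs without pandas; group_profile_ids is a hypothetical helper, and the sample rows are abridged from the brca_tcga output in the Examples section.

```python
from collections import defaultdict


def group_profile_ids(records, only_analysis_tab=False):
    """Group molecularProfileId values by molecularAlterationType."""
    groups = defaultdict(list)
    for row in records:
        # Optionally keep only profiles flagged for the analysis tab.
        if only_analysis_tab and not row["showProfileInAnalysisTab"]:
            continue
        groups[row["molecularAlterationType"]].append(row["molecularProfileId"])
    return dict(groups)


# Sample rows (abridged) standing in for a real query result.
sample = [
    {"molecularAlterationType": "COPY_NUMBER_ALTERATION",
     "showProfileInAnalysisTab": False,
     "molecularProfileId": "brca_tcga_linear_CNA"},
    {"molecularAlterationType": "MRNA_EXPRESSION",
     "showProfileInAnalysisTab": False,
     "molecularProfileId": "brca_tcga_mrna"},
    {"molecularAlterationType": "MRNA_EXPRESSION",
     "showProfileInAnalysisTab": True,
     "molecularProfileId": "brca_tcga_mrna_median_all_sample_Zscores"},
    {"molecularAlterationType": "METHYLATION",
     "showProfileInAnalysisTab": False,
     "molecularProfileId": "brca_tcga_methylation_hm450"},
]

grouped = group_profile_ids(sample)
visible = group_profile_ids(sample, only_analysis_tab=True)
```

With a live result, the rows could be obtained from the returned DataFrame with its to_dict("records") method and passed to the same helper.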
- pybioportal.molecular_profiles.get_molecular_profile(molecular_profile_id)
Get a single molecular profile by its ID.
- Parameters:
molecular_profile_id (str) – Molecular Profile ID (e.g., "acc_tcga_mutations").
- Returns:
A DataFrame containing molecular profile information.
- Return type:
pandas.DataFrame
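Judging by the example output in the Examples section, the single-row result carries study details in columns prefixed with study_ (for example study_name and study_cancerTypeId). The sketch below shows one way to separate those from the profile's own fields; split_study_fields is a hypothetical helper, and the sample record is abridged from that example output.

```python
def split_study_fields(record, prefix="study_"):
    """Split a flat record into (profile_fields, study_fields) by key prefix."""
    profile, study = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            study[key[len(prefix):]] = value  # drop the prefix for the nested dict
        else:
            profile[key] = value
    return profile, study


# Abridged sample record; a real one has more columns.
sample_record = {
    "molecularProfileId": "gbm_tcga_pan_can_atlas_2018_armlevel_cna",
    "studyId": "gbm_tcga_pan_can_atlas_2018",
    "study_name": "Glioblastoma Multiforme (TCGA, PanCancer Atlas)",
    "study_cancerTypeId": "gbm",
    "study_referenceGenome": "hg19",
}

profile, study = split_study_fields(sample_record)
```

Note that studyId has no underscore prefix, so it stays with the profile fields. For a live result, the first row of the returned DataFrame could be converted with .iloc[0].to_dict() before being passed in.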
Examples
from pybioportal import molecular_profiles as mf
df1 = mf.get_all_molecular_profiles()
df1.head(5)
| | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PROTEIN_LEVEL | LOG2-VALUE | Protein expression (RPPA) | Protein expression measured by reverse-phase p... | False | False | acc_tcga_rppa | acc_tcga | NaN | NaN | NaN |
| 1 | PROTEIN_LEVEL | Z-SCORE | Protein expression z-scores (RPPA) | Protein expression, measured by reverse-phase ... | True | False | acc_tcga_rppa_Zscores | acc_tcga | NaN | NaN | NaN |
| 2 | COPY_NUMBER_ALTERATION | DISCRETE | Putative copy-number alterations from GISTIC | Putative copy-number calls on 90 cases determi... | True | False | acc_tcga_gistic | acc_tcga | NaN | NaN | NaN |
| 3 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | acc_tcga_linear_CNA | acc_tcga | NaN | NaN | NaN |
| 4 | MUTATION_EXTENDED | MAF | Mutations | Mutation data from whole exome sequencing. Mut... | True | False | acc_tcga_mutations | acc_tcga | NaN | NaN | NaN |
df2 = mf.get_molecular_profile("gbm_tcga_pan_can_atlas_2018_armlevel_cna")
df2
| | molecularAlterationType | genericAssayType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | study_name | ... | study_publicStudy | study_pmid | study_citation | study_groups | study_status | study_importDate | study_readPermission | study_studyId | study_cancerTypeId | study_referenceGenome |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GENERIC_ASSAY | ARMLEVEL_CNA | CATEGORICAL | Putative arm-level copy-number from GISTIC | Putative arm-level copy-number from GISTIC 2.0. | True | False | gbm_tcga_pan_can_atlas_2018_armlevel_cna | gbm_tcga_pan_can_atlas_2018 | Glioblastoma Multiforme (TCGA, PanCancer Atlas) | ... | True | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 | PUBLIC;PANCAN | 0 | 2023-08-14 08:28:47 | True | gbm_tcga_pan_can_atlas_2018 | gbm | hg19 |
1 rows × 21 columns
df3a = mf.fetch_molecular_profiles(molecular_profile_ids=["brca_tcga_mrna", "acc_tcga_rna_seq_v2_mrna"])
df3a
| | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId |
|---|---|---|---|---|---|---|---|---|
| 0 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (RNA Seq V2 RSEM) | mRNA gene expression (RNA Seq V2 RSEM) | False | False | acc_tcga_rna_seq_v2_mrna | acc_tcga |
| 1 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (microarray) | Expression levels for 17155 genes in 590 brca ... | False | False | brca_tcga_mrna | brca_tcga |
df3b = mf.fetch_molecular_profiles(study_ids=["brca_tcga", "acc_tcga"])
df3b.head(5)
| | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PROTEIN_LEVEL | LOG2-VALUE | Protein expression (RPPA) | Protein expression measured by reverse-phase p... | False | False | acc_tcga_rppa | acc_tcga | NaN | NaN | NaN |
| 1 | PROTEIN_LEVEL | Z-SCORE | Protein expression z-scores (RPPA) | Protein expression, measured by reverse-phase ... | True | False | acc_tcga_rppa_Zscores | acc_tcga | NaN | NaN | NaN |
| 2 | COPY_NUMBER_ALTERATION | DISCRETE | Putative copy-number alterations from GISTIC | Putative copy-number calls on 90 cases determi... | True | False | acc_tcga_gistic | acc_tcga | NaN | NaN | NaN |
| 3 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | acc_tcga_linear_CNA | acc_tcga | NaN | NaN | NaN |
| 4 | MUTATION_EXTENDED | MAF | Mutations | Mutation data from whole exome sequencing. Mut... | True | False | acc_tcga_mutations | acc_tcga | NaN | NaN | NaN |
df4 = mf.get_all_molecular_profiles_in_study(study_id="brca_tcga", sortBy="description")
df4.head(5)
| | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | brca_tcga_linear_CNA | brca_tcga | NaN | NaN | NaN |
| 1 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (microarray) | Expression levels for 17155 genes in 590 brca ... | False | False | brca_tcga_mrna | brca_tcga | NaN | NaN | NaN |
| 2 | MRNA_EXPRESSION | Z-SCORE | mRNA expression z-scores relative to all sampl... | Log-transformed mRNA expression z-scores compa... | True | False | brca_tcga_mrna_median_all_sample_Zscores | brca_tcga | NaN | NaN | NaN |
| 3 | MRNA_EXPRESSION | Z-SCORE | mRNA expression z-scores relative to all sampl... | Log-transformed mRNA expression z-scores compa... | True | False | brca_tcga_rna_seq_v2_mrna_median_all_sample_Zs... | brca_tcga | NaN | NaN | NaN |
| 4 | METHYLATION | CONTINUOUS | Methylation (HM450) | Methylation (HM450) beta-values for genes in 8... | False | False | brca_tcga_methylation_hm450 | brca_tcga | NaN | NaN | NaN |