Molecular Profiles

The molecular_profiles module provides functions for the Molecular Profiles section of the cBioPortal Web Public API.

pybioportal.molecular_profiles.fetch_molecular_profiles(molecular_profile_ids=None, study_ids=None, projection='SUMMARY')

Fetch molecular profiles by molecular profile IDs or by study IDs.

Parameters:
  • molecular_profile_ids (list of str) – List of Molecular Profile IDs (e.g., ["brca_tcga_mrna", "acc_tcga_rna_seq_v2_mrna"]).

  • study_ids (list of str) – List of Study IDs (e.g., ["brca_tcga", "acc_tcga"]).

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • "DETAILED": Detailed information.

    • "ID": Information with only IDs.

    • "META": Metadata information.

    • "SUMMARY": Summary information (default).

Returns:

A DataFrame containing fetched molecular profiles.

Return type:

pandas.DataFrame
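
As a rough illustration of how the arguments above fit together, the sketch below validates them and builds a request filter. The helper name build_profile_filter is hypothetical, and the payload keys molecularProfileIds and studyIds, as well as the rule that exactly one of the two ID lists should be given, are assumptions modelled on the cBioPortal fetch endpoint rather than documented pybioportal behaviour.

```python
VALID_PROJECTIONS = {"DETAILED", "ID", "META", "SUMMARY"}


def build_profile_filter(molecular_profile_ids=None, study_ids=None,
                         projection="SUMMARY"):
    """Hypothetical helper: return (query_params, json_body) for a fetch call."""
    if projection not in VALID_PROJECTIONS:
        raise ValueError(f"projection must be one of {sorted(VALID_PROJECTIONS)}")
    # Assumption: the filter accepts either profile IDs or study IDs, not both.
    if (molecular_profile_ids is None) == (study_ids is None):
        raise ValueError("pass exactly one of molecular_profile_ids or study_ids")
    if molecular_profile_ids is not None:
        body = {"molecularProfileIds": list(molecular_profile_ids)}
    else:
        body = {"studyIds": list(study_ids)}
    return {"projection": projection}, body


params, body = build_profile_filter(study_ids=["brca_tcga", "acc_tcga"])
print(params, body)
```

Checking the arguments locally like this gives a clearer error message than a failed HTTP request would.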

pybioportal.molecular_profiles.get_all_molecular_profiles(direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)

Get all molecular profiles.

Parameters:
  • direction (str) –

    Direction of the sort.

    Possible values:

    • "ASC": Ascending (default).

    • "DESC": Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 10000000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • "DETAILED": Detailed information.

    • "ID": Information with only IDs.

    • "META": Metadata information.

    • "SUMMARY": Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • "datatype": Sort by datatype.

    • "description": Sort by description.

    • "molecularAlterationType": Sort by molecular alteration type.

    • "molecularProfileId": Sort by molecular profile ID.

    • "name": Sort by name.

    • "showProfileInAnalysisTab": Sort by profile visibility.

Returns:

A DataFrame containing molecular profiles.

Return type:

pandas.DataFrame
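
The documented ranges and allowed values for the sorting and paging parameters can be checked before a call is made. The following is a minimal sketch; check_paging_args is a hypothetical helper, not part of pybioportal, and it only encodes the constraints listed above.

```python
VALID_DIRECTIONS = {"ASC", "DESC"}
VALID_SORT_KEYS = {
    "datatype", "description", "molecularAlterationType",
    "molecularProfileId", "name", "showProfileInAnalysisTab",
}
MAX_PAGE_SIZE = 10_000_000


def check_paging_args(direction="ASC", pageNumber=0,
                      pageSize=MAX_PAGE_SIZE, sortBy=None):
    """Hypothetical helper: validate the arguments and return them as a dict."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError("direction must be 'ASC' or 'DESC'")
    if pageNumber < 0:
        raise ValueError("pageNumber must be at least 0")
    if not 1 <= pageSize <= MAX_PAGE_SIZE:
        raise ValueError(f"pageSize must be between 1 and {MAX_PAGE_SIZE}")
    if sortBy is not None and sortBy not in VALID_SORT_KEYS:
        raise ValueError(f"sortBy must be one of {sorted(VALID_SORT_KEYS)}")
    return {"direction": direction, "pageNumber": pageNumber,
            "pageSize": pageSize, "sortBy": sortBy}


kwargs = check_paging_args(direction="DESC", pageSize=50, sortBy="name")
print(kwargs)
```

The returned dictionary uses the same keyword names as the signature above, so it could be unpacked into the call, for example get_all_molecular_profiles(**kwargs).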

pybioportal.molecular_profiles.get_all_molecular_profiles_in_study(study_id, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)

Get all molecular profiles in a study.

Parameters:
  • study_id (str) – Study ID (e.g., "acc_tcga").

  • direction (str) –

    Direction of the sort.

    Possible values:

    • "ASC": Ascending (default).

    • "DESC": Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 10000000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • "DETAILED": Detailed information.

    • "ID": Information with only IDs.

    • "META": Metadata information.

    • "SUMMARY": Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • "datatype": Sort by datatype.

    • "description": Sort by description.

    • "molecularAlterationType": Sort by molecular alteration type.

    • "molecularProfileId": Sort by molecular profile ID.

    • "name": Sort by name.

    • "showProfileInAnalysisTab": Sort by profile visibility.

Returns:

A DataFrame containing molecular profiles in the specified study.

Return type:

pandas.DataFrame
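
With the default pageSize of 10000000 a single call normally returns everything, but pageNumber and pageSize also allow results to be read in smaller chunks. The sketch below shows one way to loop over pages. The iter_pages helper and the fake_fetch stand-in are hypothetical, and the stopping rule assumes that a page past the end of the results comes back empty or short.

```python
def iter_pages(fetch_page, page_size):
    """Yield pages from fetch_page(page_number, page_size) until results run out."""
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        if len(page) == 0:
            return  # nothing left to read
        yield page
        if len(page) < page_size:
            return  # a short page means this was the last one
        page_number += 1


# Offline stand-in for a paged query, so the sketch runs without network access.
FAKE_PROFILES = [f"profile_{i}" for i in range(5)]


def fake_fetch(page_number, page_size):
    start = page_number * page_size
    return FAKE_PROFILES[start:start + page_size]


pages = list(iter_pages(fake_fetch, page_size=2))
print(pages)
```

In real use, fetch_page could be a small wrapper that calls get_all_molecular_profiles_in_study with the given pageNumber and pageSize; len() works the same way on the returned DataFrame.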

pybioportal.molecular_profiles.get_molecular_profile(molecular_profile_id)

Get a single molecular profile by its ID.

Parameters:

molecular_profile_id (str) – Molecular Profile ID (e.g., "acc_tcga_mutations").

Returns:

A DataFrame containing molecular profile information.

Return type:

pandas.DataFrame


Examples

from pybioportal import molecular_profiles as mf
df1 = mf.get_all_molecular_profiles()
df1.head(5)

|   | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PROTEIN_LEVEL | LOG2-VALUE | Protein expression (RPPA) | Protein expression measured by reverse-phase p... | False | False | acc_tcga_rppa | acc_tcga | NaN | NaN | NaN |
| 1 | PROTEIN_LEVEL | Z-SCORE | Protein expression z-scores (RPPA) | Protein expression, measured by reverse-phase ... | True | False | acc_tcga_rppa_Zscores | acc_tcga | NaN | NaN | NaN |
| 2 | COPY_NUMBER_ALTERATION | DISCRETE | Putative copy-number alterations from GISTIC | Putative copy-number calls on 90 cases determi... | True | False | acc_tcga_gistic | acc_tcga | NaN | NaN | NaN |
| 3 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | acc_tcga_linear_CNA | acc_tcga | NaN | NaN | NaN |
| 4 | MUTATION_EXTENDED | MAF | Mutations | Mutation data from whole exome sequencing. Mut... | True | False | acc_tcga_mutations | acc_tcga | NaN | NaN | NaN |

df2 = mf.get_molecular_profile("gbm_tcga_pan_can_atlas_2018_armlevel_cna")
df2

|   | molecularAlterationType | genericAssayType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | study_name | ... | study_publicStudy | study_pmid | study_citation | study_groups | study_status | study_importDate | study_readPermission | study_studyId | study_cancerTypeId | study_referenceGenome |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GENERIC_ASSAY | ARMLEVEL_CNA | CATEGORICAL | Putative arm-level copy-number from GISTIC | Putative arm-level copy-number from GISTIC 2.0. | True | False | gbm_tcga_pan_can_atlas_2018_armlevel_cna | gbm_tcga_pan_can_atlas_2018 | Glioblastoma Multiforme (TCGA, PanCancer Atlas) | ... | True | 29625048,29596782,29622463,29617662,29625055,2... | TCGA, Cell 2018 | PUBLIC;PANCAN | 0 | 2023-08-14 08:28:47 | True | gbm_tcga_pan_can_atlas_2018 | gbm | hg19 |

1 rows × 21 columns
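
The output above mixes profile-level columns with columns carrying a study_ prefix. A small sketch of separating the two groups follows. The split_study_columns helper is hypothetical, the column list is a hand-copied subset of the names shown above, and reading the prefix as a marker for nested study fields is an inference from that output.

```python
def split_study_columns(columns, prefix="study_"):
    """Partition column names into profile-level and study-level groups."""
    profile_cols = [c for c in columns if not c.startswith(prefix)]
    study_cols = [c for c in columns if c.startswith(prefix)]
    return profile_cols, study_cols


# Subset of the column names from the output above, typed in by hand.
columns = [
    "molecularAlterationType", "datatype", "molecularProfileId", "studyId",
    "study_name", "study_cancerTypeId", "study_referenceGenome",
]

profile_cols, study_cols = split_study_columns(columns)
print(profile_cols)
print(study_cols)
```

With a real result, list(df2.columns) could be passed instead, and df2[study_cols] would then select only the study-level fields.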

df3a = mf.fetch_molecular_profiles(molecular_profile_ids=["brca_tcga_mrna", "acc_tcga_rna_seq_v2_mrna"])
df3a

|   | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId |
|---|---|---|---|---|---|---|---|---|
| 0 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (RNA Seq V2 RSEM) | mRNA gene expression (RNA Seq V2 RSEM) | False | False | acc_tcga_rna_seq_v2_mrna | acc_tcga |
| 1 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (microarray) | Expression levels for 17155 genes in 590 brca ... | False | False | brca_tcga_mrna | brca_tcga |

df3b = mf.fetch_molecular_profiles(study_ids=["brca_tcga", "acc_tcga"])
df3b.head(5)

|   | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PROTEIN_LEVEL | LOG2-VALUE | Protein expression (RPPA) | Protein expression measured by reverse-phase p... | False | False | acc_tcga_rppa | acc_tcga | NaN | NaN | NaN |
| 1 | PROTEIN_LEVEL | Z-SCORE | Protein expression z-scores (RPPA) | Protein expression, measured by reverse-phase ... | True | False | acc_tcga_rppa_Zscores | acc_tcga | NaN | NaN | NaN |
| 2 | COPY_NUMBER_ALTERATION | DISCRETE | Putative copy-number alterations from GISTIC | Putative copy-number calls on 90 cases determi... | True | False | acc_tcga_gistic | acc_tcga | NaN | NaN | NaN |
| 3 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | acc_tcga_linear_CNA | acc_tcga | NaN | NaN | NaN |
| 4 | MUTATION_EXTENDED | MAF | Mutations | Mutation data from whole exome sequencing. Mut... | True | False | acc_tcga_mutations | acc_tcga | NaN | NaN | NaN |

df4 = mf.get_all_molecular_profiles_in_study(study_id="brca_tcga", sortBy="description")
df4.head(5)

|   | molecularAlterationType | datatype | name | description | showProfileInAnalysisTab | patientLevel | molecularProfileId | studyId | genericAssayType | pivotThreshold | sortOrder |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | COPY_NUMBER_ALTERATION | CONTINUOUS | Capped relative linear copy-number values | Capped relative linear copy-number values for ... | False | False | brca_tcga_linear_CNA | brca_tcga | NaN | NaN | NaN |
| 1 | MRNA_EXPRESSION | CONTINUOUS | mRNA expression (microarray) | Expression levels for 17155 genes in 590 brca ... | False | False | brca_tcga_mrna | brca_tcga | NaN | NaN | NaN |
| 2 | MRNA_EXPRESSION | Z-SCORE | mRNA expression z-scores relative to all sampl... | Log-transformed mRNA expression z-scores compa... | True | False | brca_tcga_mrna_median_all_sample_Zscores | brca_tcga | NaN | NaN | NaN |
| 3 | MRNA_EXPRESSION | Z-SCORE | mRNA expression z-scores relative to all sampl... | Log-transformed mRNA expression z-scores compa... | True | False | brca_tcga_rna_seq_v2_mrna_median_all_sample_Zs... | brca_tcga | NaN | NaN | NaN |
| 4 | METHYLATION | CONTINUOUS | Methylation (HM450) | Methylation (HM450) beta-values for genes in 8... | False | False | brca_tcga_methylation_hm450 | brca_tcga | NaN | NaN | NaN |