Clinical Attributes

The module clinical_attributes provides functions related to the Clinical Attributes section of the cBioPortal Web Public API.

pybioportal.clinical_attributes.fetch_clinical_attributes(study_ids, projection='SUMMARY')

Fetch clinical attributes from cBioPortal for a list of study IDs.

Parameters:
  • study_ids (list of str) – List of Study IDs.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • “DETAILED”: Detailed information.

    • “ID”: Information with only IDs.

    • “META”: Metadata information.

    • “SUMMARY”: Summary information (default).

Returns:

A DataFrame containing the fetched clinical attributes.

Return type:

pandas.DataFrame
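Because the returned DataFrame mixes attributes from every requested study (each row carries a studyId column, as the Examples section shows), it can be convenient to split the result per study. The following is a minimal sketch and not part of pybioportal: group_by_study is a hypothetical helper that works on plain row dictionaries, such as those produced by pandas' DataFrame.to_dict("records"), and the sample rows are copied from the example output on this page.

```python
from collections import defaultdict


def group_by_study(records):
    """Map each studyId to the clinicalAttributeId values found for it.

    `records` is an iterable of dicts that contain at least the keys
    "studyId" and "clinicalAttributeId" (hypothetical helper, not a
    pybioportal function).
    """
    grouped = defaultdict(list)
    for row in records:
        grouped[row["studyId"]].append(row["clinicalAttributeId"])
    return dict(grouped)


# Sample rows taken from the example output on this page (other columns omitted).
sample_records = [
    {"clinicalAttributeId": "AGE", "studyId": "brca_bccrc"},
    {"clinicalAttributeId": "CANCER_TYPE", "studyId": "brca_bccrc"},
    {"clinicalAttributeId": "TUMOR_STATUS", "studyId": "brca_tcga"},
]

by_study = group_by_study(sample_records)
print(by_study)
```

With a real result, the rows could be obtained from the returned DataFrame via to_dict("records") before being passed to the helper.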

pybioportal.clinical_attributes.get_all_clinical_attributes(direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)

Get all clinical attributes from cBioPortal.

Parameters:
  • direction (str) –

    Direction of the sort.

    Possible values:

    • “ASC”: Ascending (default).

    • “DESC”: Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 10000000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • “DETAILED”: Detailed information.

    • “ID”: Information with only IDs.

    • “META”: Metadata information.

    • “SUMMARY”: Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • “clinicalAttributeId”: Sort by clinical attribute ID.

    • “datatype”: Sort by datatype.

    • “description”: Sort by description.

    • “displayName”: Sort by display name.

    • “patientAttribute”: Sort by patient attribute.

    • “priority”: Sort by priority.

    • “studyId”: Sort by study ID.

Returns:

A DataFrame containing the list of clinical attributes.

Return type:

pandas.DataFrame
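The allowed values and ranges listed above can be checked locally before a request is sent. The sketch below is an illustration only: check_query_args is a hypothetical helper, and whether pybioportal performs similar validation itself is not stated in this documentation. The constants simply restate the values documented for direction, pageNumber, pageSize, projection, and sortBy.

```python
ALLOWED_DIRECTIONS = {"ASC", "DESC"}
ALLOWED_PROJECTIONS = {"DETAILED", "ID", "META", "SUMMARY"}
ALLOWED_SORT_BY = {
    "clinicalAttributeId", "datatype", "description", "displayName",
    "patientAttribute", "priority", "studyId",
}
MAX_PAGE_SIZE = 10_000_000


def check_query_args(direction="ASC", pageNumber=0, pageSize=MAX_PAGE_SIZE,
                     projection="SUMMARY", sortBy=None):
    """Return a list of problems found in the arguments (empty if none).

    Hypothetical helper; it only mirrors the constraints documented above.
    """
    problems = []
    if direction not in ALLOWED_DIRECTIONS:
        problems.append(f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}")
    if pageNumber < 0:
        problems.append("pageNumber must be >= 0")
    if not 1 <= pageSize <= MAX_PAGE_SIZE:
        problems.append(f"pageSize must be between 1 and {MAX_PAGE_SIZE}")
    if projection not in ALLOWED_PROJECTIONS:
        problems.append(f"projection must be one of {sorted(ALLOWED_PROJECTIONS)}")
    if sortBy is not None and sortBy not in ALLOWED_SORT_BY:
        problems.append(f"sortBy must be one of {sorted(ALLOWED_SORT_BY)}")
    return problems


print(check_query_args(sortBy="displayName", direction="DESC"))  # no problems
print(check_query_args(direction="UP", pageSize=0))              # two problems
```

Returning a list of messages rather than raising on the first error lets a caller report every invalid argument at once.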

pybioportal.clinical_attributes.get_all_clinical_attributes_in_study(study_id, direction='ASC', pageNumber=0, pageSize=10000000, projection='SUMMARY', sortBy=None)

Get all clinical attributes in the specified study.

Parameters:
  • study_id (str) – Study ID (e.g., “acc_tcga”).

  • direction (str) –

    Direction of the sort.

    Possible values:

    • “ASC”: Ascending (default).

    • “DESC”: Descending.

  • pageNumber (int) –

    Page number of the result list.

    • Minimum value is 0.

  • pageSize (int) –

    Page size of the result list.

    • Minimum value is 1, maximum value is 10000000.

  • projection (str) –

    Level of detail of the response.

    Possible values:

    • “DETAILED”: Detailed information.

    • “ID”: Information with only IDs.

    • “META”: Metadata information.

    • “SUMMARY”: Summary information (default).

  • sortBy (str) –

    Name of the property that the result list is sorted by.

    Possible values:

    • “clinicalAttributeId”

    • “datatype”

    • “description”

    • “displayName”

    • “patientAttribute”

    • “priority”

    • “studyId”

Returns:

A DataFrame containing clinical attributes in the specified study.

Return type:

pandas.DataFrame
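The pageNumber and pageSize parameters make it possible to retrieve a long result in smaller chunks instead of relying on the very large default page size. The sketch below shows one way to walk through pages. It rests on assumptions that this page does not confirm: that page n covers rows n * pageSize onward (zero-based, consistent with the documented minimum of 0), and that a request past the last page yields None or an empty result. iter_pages and fake_fetch are hypothetical names; fake_fetch stands in for the real call so the example runs without network access.

```python
def iter_pages(fetch_page, page_size):
    """Yield successive non-empty pages.

    `fetch_page(page_number, page_size)` must return a sized object
    (for example a list or a DataFrame) or None. Iteration stops at the
    first None/empty page, or right after a page shorter than `page_size`.
    """
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        if page is None or len(page) == 0:
            return
        yield page
        if len(page) < page_size:
            return  # a short page is assumed to be the last one
        page_number += 1


# Local stand-in for a paged query (attribute IDs taken from this page).
ATTRIBUTE_IDS = ["AGE", "CANCER_TYPE", "CANCER_TYPE_DETAILED",
                 "TUMOR_STATUS", "VIAL_NUMBER"]


def fake_fetch(page_number, page_size):
    start = page_number * page_size
    return ATTRIBUTE_IDS[start:start + page_size]


pages = list(iter_pages(fake_fetch, page_size=2))
print(pages)
```

With the library, fetch_page could be a small wrapper that forwards the two values to the pageNumber and pageSize arguments of get_all_clinical_attributes_in_study for a fixed study_id.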

pybioportal.clinical_attributes.get_clinical_attribute_in_study(study_id, clinical_attribute_id)

Get the specified clinical attribute in the study.

Parameters:
  • study_id (str) – Study ID (e.g., “acc_tcga”).

  • clinical_attribute_id (str) – Clinical Attribute ID (e.g., “CANCER_TYPE”).

Returns:

A DataFrame containing information about the specified clinical attribute in the study.

Return type:

pandas.DataFrame


Examples

from pybioportal import clinical_attributes as ca
df1 = ca.get_all_clinical_attributes()
df1
|   | displayName | description | datatype | patientAttribute | priority | clinicalAttributeId | studyId |
|---|---|---|---|---|---|---|---|
| 0 | Adjuvant Chemotherapy | Adjuvant Chemotherapy | STRING | True | 1 | ADJUVANT_CHEMO | acbc_mskcc_2015 |
| 1 | Adjuvant Treatment | Adjuvant treatment. | STRING | True | 1 | ADJUVANT_TX | acbc_mskcc_2015 |
| 2 | Diagnosis Age | Age at which a condition or disease was first ... | NUMBER | True | 1 | AGE | acbc_mskcc_2015 |
| 3 | Cancer Type | Cancer Type | STRING | False | 1 | CANCER_TYPE | acbc_mskcc_2015 |
| 4 | Cancer Type Detailed | Cancer Type Detailed | STRING | False | 1 | CANCER_TYPE_DETAILED | acbc_mskcc_2015 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 14752 | Race Category | The text for reporting information about race. | STRING | True | 1 | RACE | wt_target_2018_pub |
| 14753 | Number of Samples Per Patient | Number of Samples Per Patient | STRING | True | 1 | SAMPLE_COUNT | wt_target_2018_pub |
| 14754 | Sex | Sex | STRING | True | 1 | SEX | wt_target_2018_pub |
| 14755 | Somatic Status | Somatic Status | STRING | False | 1 | SOMATIC_STATUS | wt_target_2018_pub |
| 14756 | TMB (nonsynonymous) | TMB (nonsynonymous) | NUMBER | False | 1 | TMB_NONSYNONYMOUS | wt_target_2018_pub |

14757 rows × 7 columns

df2 = ca.fetch_clinical_attributes(study_ids=["brca_tcga", "brca_bccrc"])
df2
|   | displayName | description | datatype | patientAttribute | priority | clinicalAttributeId | studyId |
|---|---|---|---|---|---|---|---|
| 0 | Diagnosis Age | Age at which a condition or disease was first ... | NUMBER | True | 1 | AGE | brca_bccrc |
| 1 | Angiolymphatic Invasion | Presence of angiolymphatic invasion. | STRING | True | 1 | ANGIOLYMPHATIC_INVASION | brca_bccrc |
| 2 | Cancer Type | Cancer Type | STRING | False | 1 | CANCER_TYPE | brca_bccrc |
| 3 | Cancer Type Detailed | Cancer Type Detailed | STRING | False | 1 | CANCER_TYPE_DETAILED | brca_bccrc |
| 4 | Neoplasm American Joint Committee on Cancer Cl... | Extent of the distant metastasis for the cance... | STRING | True | 1 | CLIN_M_STAGE | brca_bccrc |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 156 | Time between excision and freezing | Time between excision and freezing | STRING | False | 1 | TIME_BETWEEN_EXCISION_AND_FREEZING | brca_tcga |
| 157 | Tissue Source Site | A Tissue Source Site collects samples (tissue,... | STRING | True | 1 | TISSUE_SOURCE_SITE | brca_tcga |
| 158 | TMB (nonsynonymous) | TMB (nonsynonymous) | NUMBER | False | 1 | TMB_NONSYNONYMOUS | brca_tcga |
| 159 | Person Neoplasm Status | The state or condition of an individual's neop... | STRING | True | 1 | TUMOR_STATUS | brca_tcga |
| 160 | Vial number | Vial number | STRING | False | 1 | VIAL_NUMBER | brca_tcga |

161 rows × 7 columns

df3a = ca.get_all_clinical_attributes_in_study(study_id="brca_tcga", projection="DETAILED")
df3a
|   | displayName | description | datatype | patientAttribute | priority | clinicalAttributeId | studyId |
|---|---|---|---|---|---|---|---|
| 0 | Diagnosis Age | Age at which a condition or disease was first ... | NUMBER | True | 1 | AGE | brca_tcga |
| 1 | American Joint Committee on Cancer Metastasis ... | Code to represent the defined absence or prese... | STRING | True | 1 | AJCC_METASTASIS_PATHOLOGIC_PM | brca_tcga |
| 2 | Neoplasm Disease Lymph Node Stage American Joi... | The codes that represent the stage of cancer b... | STRING | True | 1 | AJCC_NODES_PATHOLOGIC_PN | brca_tcga |
| 3 | Neoplasm Disease Stage American Joint Committe... | The extent of a cancer, especially whether the... | STRING | True | 1 | AJCC_PATHOLOGIC_TUMOR_STAGE | brca_tcga |
| 4 | American Joint Committee on Cancer Publication... | The version or edition of the American Joint C... | STRING | True | 1 | AJCC_STAGING_EDITION | brca_tcga |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 133 | Time between excision and freezing | Time between excision and freezing | STRING | False | 1 | TIME_BETWEEN_EXCISION_AND_FREEZING | brca_tcga |
| 134 | Tissue Source Site | A Tissue Source Site collects samples (tissue,... | STRING | True | 1 | TISSUE_SOURCE_SITE | brca_tcga |
| 135 | TMB (nonsynonymous) | TMB (nonsynonymous) | NUMBER | False | 1 | TMB_NONSYNONYMOUS | brca_tcga |
| 136 | Person Neoplasm Status | The state or condition of an individual's neop... | STRING | True | 1 | TUMOR_STATUS | brca_tcga |
| 137 | Vial number | Vial number | STRING | False | 1 | VIAL_NUMBER | brca_tcga |

138 rows × 7 columns

If the requested data is not present on cBioPortal, a notification message is reported.

df3b = ca.get_all_clinical_attributes_in_study(study_id="brca_tcga", projection="META")
df3b
Response is empty. No data available.
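
Code that post-processes a result may want to guard against this empty case before touching any rows. What the function returns alongside the message is not documented here, so the sketch below makes no assumption either way and treats both None and an empty container as "no data". has_rows and summarize are hypothetical helpers, and plain Python values stand in for library results.

```python
def has_rows(result):
    """Return True only when `result` is a non-empty, sized object."""
    return result is not None and len(result) > 0


def summarize(result):
    """Give a short, printable description of a query result."""
    if not has_rows(result):
        return "no data"
    return f"{len(result)} row(s)"


# Stand-ins for library results: None, an empty container, and one row.
print(summarize(None))
print(summarize([]))
print(summarize([{"clinicalAttributeId": "AGE", "studyId": "brca_tcga"}]))
```

Because len() of a pandas DataFrame is its number of rows, the same check applies unchanged to a returned DataFrame.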

df4 = ca.get_clinical_attribute_in_study(study_id="brca_tcga", clinical_attribute_id="AGE")
df4
|   | displayName | description | datatype | patientAttribute | priority | clinicalAttributeId | studyId |
|---|---|---|---|---|---|---|---|
| 0 | Diagnosis Age | Age at which a condition or disease was first ... | NUMBER | True | 1 | AGE | brca_tcga |