
Expression Recipes

The recipes below assume the following context:

{
    "record": {
        "variant": "GRCH37-5-36241637-36241637-C",
        "gene": "NADK2",
        "genomic_coordinates": {
            "build": "GRCH37",
            "chromosome": "5",
            "start": 36241637,
            "stop": 36241637
        }
    }
}

Retrieve a variant's allele frequency from ExAC

dataset_field_values('solvebio:public:/ExAC/1.3.0-r0.3/ExAC-GRCh37', 'af', entities=[('variant', record.variant)])

Retrieve a variant's clinical significance from ClinVar

dataset_field_values('solvebio:public:/ClinVar/3.7.4-2017-01-30/Combined-GRCh37', 'clinical_significance', entities=[('variant', record.variant)])

Calculate the prevalence of a gene within a multi-sample dataset

prevalence('solvebio:public:/TCGA/1.2.0-2015-02-11/SomaticMutations-GRCh37', entity=('gene', record.gene), sample_field='patient_barcode')

Normalize a variant (trim and left-shuffle the variant)

normalize_variant(record.variant)
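To illustrate what trimming and left-shuffling mean, here is a minimal Python sketch of the commonly used normalization procedure for a VCF-style (position, ref, alt) variant. It is not SolveBio's implementation of normalize_variant. The function name normalize_alleles, the toy reference string, and the 1-based coordinates are assumptions made for this example.

```python
def normalize_alleles(reference, pos, ref, alt):
    """Trim and left-align one variant against a reference string.

    reference: toy reference sequence (hypothetical, for illustration)
    pos: 1-based position of the first base of `ref`
    Returns the normalized (pos, ref, alt) tuple.
    """
    changed = True
    while changed:
        changed = False
        # Drop a trailing base shared by both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 1:
                break  # cannot extend past the start of the reference
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Deleting one "CA" unit from a CA repeat: the deletion is shifted to the
# leftmost equivalent position and anchored on the preceding base.
toy_reference = "GGGCACACACAGGG"
result = normalize_alleles(toy_reference, 8, "CA", "")
print(result)
```

The same deletion can be written at several positions inside the repeat. Normalization picks one canonical form so that equivalent variants compare equal.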

Beacon public datasets for a variant

beacon(record.variant, 'variant', visibility='public')

Calculate the top terms for a string field in a dataset

dataset_field_top_terms('solvebio:public:/ClinVar/3.7.4-2017-01-30/Combined-GRCh37', 'clinical_significance')
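As a rough mental model, "top terms" are the most frequent values of a field. The Python sketch below computes them for a small in-memory list. The sample values and the top_terms helper are hypothetical, and the exact return shape of dataset_field_top_terms (for example, whether counts are included) is not stated here.

```python
from collections import Counter

# Illustrative values only; these are not taken from ClinVar.
values = [
    "Pathogenic", "Benign", "Benign",
    "Uncertain significance", "Benign", "Pathogenic",
]


def top_terms(field_values, limit=10):
    """Return the `limit` most frequent non-null values with their counts."""
    counts = Counter(v for v in field_values if v is not None)
    return counts.most_common(limit)


top = top_terms(values, limit=2)
print(top)
```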

Calculate statistics about a numeric field in a dataset

dataset_field_stats('solvebio:public:/ExAC/1.3.0-r0.3/ExAC-GRCh37', 'af')

Predict the effects of a variant on genes, transcripts, and proteins

predict_variant_effects(record.variant)

Retrieve the sequence of a particular genomic region

genomic_sequence('GRCH37-5-36241600-36241660')

# output
# CCAGCTGCTTCAGGTCCTCCTCCGAGAGCTCCGCGTAACGGTACCGCTGCTGCTCGAACTC

Get the reverse complemented sequence of a particular genomic region

''.join(reversed([{
    'A': 'T',
    'T': 'A',
    'C': 'G',
    'G': 'C'
}.get(nuc) for nuc in genomic_sequence('GRCH37-5-36241600-36241660')]))

# output
# GAGTTCGAGCAGCAGCGGTACCGTTACGCGGAGCTCTCGGAGGAGGACCTGAAGCAGCTGG
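For comparison, the same reverse complement can be computed in plain Python outside the expression engine. This is a minimal sketch: the reverse_complement helper is hypothetical, and passing N and lowercase bases through the translation table is an added assumption (the expression above maps any base other than A, C, G, and T to None).

```python
# Translation table for complementing bases; N is kept as N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Complement every base, then reverse the string."""
    return sequence.translate(COMPLEMENT)[::-1]


# Sequence returned by genomic_sequence('GRCH37-5-36241600-36241660') above.
region_sequence = "CCAGCTGCTTCAGGTCCTCCTCCGAGAGCTCCGCGTAACGGTACCGCTGCTGCTCGAACTC"
rc = reverse_complement(region_sequence)
print(rc)
```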

Find the GENCODE genes that overlap a genomic region

dataset_field_values('solvebio:public:/GENCODE/2.2.0-24/GENCODE-GRCh37', 'gene_symbol', entities=[('genomic_region', record.genomic_coordinates)], filters=[('feature', 'gene'), ('gene_type', 'protein_coding'), ('gene_status', 'KNOWN')])

Split a string by whitespace or specific delimiter

split(record.variant, '-')
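The sketch below shows, in plain Python, how the pieces produced by splitting the context variant on '-' could be given names. The VariantParts type and parse_variant helper are hypothetical. The meaning of the first four components is inferred from the genomic_coordinates in the context record, and treating the last component as an allele is an assumption.

```python
from typing import NamedTuple


class VariantParts(NamedTuple):
    build: str
    chromosome: str
    start: int
    stop: int
    allele: str  # assumed meaning of the trailing component


def parse_variant(variant_id):
    """Split a variant ID on '-' and convert the coordinates to integers."""
    # maxsplit=4 keeps any further hyphens inside the last component.
    build, chromosome, start, stop, allele = variant_id.split("-", 4)
    return VariantParts(build, chromosome, int(start), int(stop), allele)


parts = parse_variant("GRCH37-5-36241637-36241637-C")
print(parts)
```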

YAML format

To create a new recipe, write it in YAML format, as in the example below:

recipes:
  - name: cDNA Change
    version: 1.0.0
    description: Gets the cDNA change from the variant field
    is_public: true
    fields:
      name: cdna_change
      data_type: string
      expression: |
          get(translate_variant(record.variant),'cdna_change')
          if record.variant else None
  - name: dbSNP
    description: Adds dbsnp ids using the variant record from a dataset - supports both GRCh37 and GRCh38
    version: 1.0.0
    is_public: true
    fields:
      name: gene
      data_type: string
      entity_type: gene
      is_list: true
      expression: |
        dataset_field_values(
                'solvebio:public:/dbSNP/2.0.0-b151/Variants-{}'.format(get(record, 'genomic_coordinates.build')),
                field='row_id',
                entities=[('variant', record.variant)])
        if get(record, 'variant') else None

Publishing new and updating existing recipes

To publish recipes to SolveBio, use the solvebio-recipes client from the command line. To create or update only one of the recipes in a YAML file, pass the --name option:

solvebio-recipes sync --name "dbSNP (v1.0.0)" path/to/yaml/file
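The --name value in this example looks like the recipe's name followed by its version, in the form "name (vversion)". Assuming that pattern holds (it is inferred from this single example, not from documented behavior), the value could be built from a recipe's YAML fields as sketched below. The recipe_sync_name helper is hypothetical.

```python
def recipe_sync_name(recipe):
    """Build a --name value from a recipe's name and version fields.

    Assumption: the client matches recipes by "<name> (v<version>)".
    """
    return "{} (v{})".format(recipe["name"], recipe["version"])


# Name and version as they appear in the YAML example.
recipe = {"name": "dbSNP", "version": "1.0.0"}
name = recipe_sync_name(recipe)
print(name)
```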

To create or update all of the recipes in the YAML file, pass the --all option:

solvebio-recipes sync --all path/to/yaml/file

You can also delete an existing recipe with the following command:

solvebio-recipes delete --name "dbSNP (v1.0.0)" path/to/yaml/file

In all of these cases, you will be prompted to confirm your choice.

Exporting existing recipes

Existing recipes can be exported from SolveBio with the solvebio-recipes export command. Both public recipes and account recipes can be exported to a YAML file.

To export public recipes to a YAML file, use the following command:

solvebio-recipes export --public-recipes path/to/yaml/file

To export account recipes, use:

solvebio-recipes export --account-recipes path/to/yaml/file

Make sure that you specify different YAML files for public and account recipes.
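If you script the two exports, a small guard can catch the case where both are pointed at the same file. The Python sketch below only builds the argument lists, using the flags shown above, and does not run anything. The build_export_commands helper and the file names are hypothetical.

```python
import os


def build_export_commands(public_path, account_path):
    """Return argument lists for both exports, refusing identical targets."""
    if os.path.abspath(public_path) == os.path.abspath(account_path):
        raise ValueError(
            "Use different YAML files for public and account recipes"
        )
    return [
        ["solvebio-recipes", "export", "--public-recipes", public_path],
        ["solvebio-recipes", "export", "--account-recipes", account_path],
    ]


# Hypothetical file names, for illustration only.
commands = build_export_commands("public_recipes.yml", "account_recipes.yml")
for command in commands:
    print(" ".join(command))
```

Each list can be passed to subprocess.run once the solvebio-recipes client is installed and configured.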