Skip to content

repseqio fromFasta

Creates boilerplate JSON library from existing fasta file.

Usage:

repseqio fromPaddedFasta [-f] \
    [--chains <chain>] \
    [--functionality-index <index>] \
    [--functionality-regexp <regexp>] \
    [--gene-feature <geneFeature>] \
    [--gene-type <gt>] \
    [--ignore-duplicates] \
    [--name-index <index>] \
    [--padding-character <char>] \
    [--species-name <species>] \
    [--taxon-id <id>] \
    [-Lkey=value]... \
    [-Pkey=value]... \
    input_padded.fasta \
    [output.fasta] \
    output.json[.gz]

Options:

-f, --force
Force overwrite of output file(s).
-c, --chain
Chain
-j, --functionality-index
Functionality mark index (0-based) in |-separated FASTA description line (e.g. 3 for IMGT files). If this option is omitted, all genes are considered functional.
--functionality-regexp
Functionality regexp, gene is considered functional if field defined by -j / --functionality-index parameter matches this expression. Default: [\(\[]?[Ff].?.
--gene-feature
Defines gene feature which sequences are contained in the file (e.g. VRegion, VGene, JRegion etc..).
-g, --gene-type
Gene type (V/D/J/C)
-i, --ignore-duplicates
Ignore duplicate genes. By default duplicate genes are not ignored.
-n, --name-index
Gene name index (0-based) in |-separated FASTA description line (e.g. 1 for IMGT files). Default: 0.
-s, --species-name
Species names (can be used multiple times). Default: [].
-t, --taxon-id
Taxon id
-L
Amino-acid pattern of anchor point. Has higher priority than -P for the same anchor point. The syntax is -Lkey=value.
-P
Positions of anchor points in padded / non-padded file. To define position relative to the end of sequence use negative values: -1 = sequence end, -2 = last but one letter. Example: -PFR1Begin=0 -PVEnd=-1, equivalent of --gene-feature VRegion.