Command-Line

This section lists the arguments used to run biohansel. The required and additional (optional) arguments are described below so that you can see what must be included in a run.

Required

Either run the command from the directory that contains all of the input data, or give the path to the input data after the relevant argument.

  • Genotyping Scheme

    • use -s “scheme”
  • Output/Results Files (any combination, as long as at least one is specified; see Output for details)

    • use -S “filename.tab” | for tech_results.tab

    • use -o “filename.tab” | for results.tab

    • use -O “filename.tab” | for match_results.tab

      • You can also use “.tsv” as the file extension
  • Input data

    • use -i <path/to/fasta> | to specify fasta file to analyze
    • use -p <path/to/forward_reads> <path/to/reverse_reads> | to analyze paired reads
    • use -D <path/to/directory> | to analyze all of the data in a directory and write the results to a single file
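
The required arguments above can be combined into a single command. The following is a minimal sketch, not part of biohansel itself: `build_hansel_command` is a hypothetical helper, the scheme name `heidelberg` and the file names are placeholder assumptions, and the flags are assumed to behave as described in the list above. It only assembles and prints the command so that it can be checked before running it.

```python
import shlex


def build_hansel_command(scheme, input_args, results=None,
                         match_results=None, tech_results=None):
    """Assemble a hansel argument list from the required arguments.

    Hypothetical helper for illustration; it does not run biohansel.
    """
    # Map each output flag described above to the file name requested for it.
    outputs = [("-o", results), ("-O", match_results), ("-S", tech_results)]
    chosen = [(flag, path) for flag, path in outputs if path]
    if not chosen:
        # At least one output/results file must be specified.
        raise ValueError("specify at least one of -o, -O or -S")

    cmd = ["hansel", "-s", scheme]
    for flag, path in chosen:
        cmd += [flag, path]
    # input_args is one of: ["-i", fasta], ["-p", forward, reverse], ["-D", directory]
    return cmd + list(input_args)


# Placeholder scheme and file names (assumptions, replace with your own).
cmd = build_hansel_command(
    scheme="heidelberg",
    input_args=["-p", "reads_1.fastq", "reads_2.fastq"],
    results="results.tab",
    match_results="match_results.tab",
)
print(shlex.join(cmd))
```

Once the printed command looks right, it can be pasted into a terminal or passed to `subprocess.run(cmd, check=True)` on a machine where biohansel is installed.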

Additional

If any of these arguments are omitted from the command used to run biohansel, their default values are used for the analysis.


-M “metadata_scheme.tsv” –> Supplies a metadata scheme that meets all of the requirements
described in Input

--force –> Forces the existing outputs to be overwritten

--json –> Output JSON representation of output files

--min-kmer-freq <#> –> Minimum k-mer coverage needed for a raw reads fastq file to be
considered acceptable by the quality control module (default is 8)

--max-kmer-freq <#> –> Maximum k-mer coverage for a raw reads fastq file to be considered
acceptable (default is 10,000)

--low-cov-depth-freq <#> –> Coverage frequencies of raw read fastq files below this value are
considered as low coverage (default is 20)

--max-missing-kmers <#> –> Decimal proportion of maximum allowable missing kmers before
being considered an error (0.0 - 1.0) (default is 0.05 or 5%)

--min-ambiguous-kmers <#> –> Minimum number of missing kmers to be considered an ambiguous
result (default is 3)

--low-cov-warning <#> –> Overall kmer coverage below this value will trigger a low coverage
warning on raw read fastq files. (default is 20)

--max-intermediate-kmers <#> –> Decimal proportion of maximum allowable missing kmers
(0.0 - 1.0) to be considered an intermediate genotype (default is 0.05)

--threads <#_CPUs> –> Number of parallel threads used to run the analysis (default is 1)

-v –> Verbose: sets the logging verbosity level, where -v shows warnings and -vv shows debug info

-V –> Displays the version of biohansel installed
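
To illustrate how the additional arguments fit onto a command, here is a minimal sketch under stated assumptions: `optional_args` is a hypothetical helper (not part of biohansel), the default values are copied from the descriptions above, and the scheme name `heidelberg`, `results.tab` and `fastq_dir` are placeholders. Values equal to the listed defaults are left off the command, since omitted arguments fall back to their defaults.

```python
import shlex

# Defaults as listed in the descriptions above.
QC_DEFAULTS = {
    "--min-kmer-freq": 8,
    "--max-kmer-freq": 10000,
    "--low-cov-depth-freq": 20,
    "--max-missing-kmers": 0.05,
    "--min-ambiguous-kmers": 3,
    "--low-cov-warning": 20,
    "--max-intermediate-kmers": 0.05,
}


def optional_args(overrides=None, threads=1, force=False, verbosity=0):
    """Build the optional part of a hansel command (hypothetical helper)."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(QC_DEFAULTS))
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))

    args = []
    for flag, value in overrides.items():
        # Only emit values that differ from the documented default.
        if value != QC_DEFAULTS[flag]:
            args += [flag, str(value)]
    if threads != 1:
        args += ["--threads", str(threads)]
    if force:
        args.append("--force")
    if verbosity:
        # verbosity=1 gives -v (warnings), verbosity=2 gives -vv (debug info)
        args.append("-" + "v" * verbosity)
    return args


# Placeholder scheme, output file and input directory (assumptions).
base = ["hansel", "-s", "heidelberg", "-o", "results.tab", "-D", "fastq_dir"]
extra = optional_args(
    {"--min-kmer-freq": 10, "--max-missing-kmers": 0.05},
    threads=4,
    force=True,
    verbosity=2,
)
print(shlex.join(base + extra))
```

In this example only `--min-kmer-freq 10` is emitted from the overrides, because `--max-missing-kmers 0.05` matches the listed default.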

Hansel Help Command

If you run hansel -h, you will be provided with additional information for most of the arguments, along with the following usage statement:
