Specification of running parameters of CalicoST
Supporting reference files
- geneticmap_file: str
The path to genetic map file.
- hgtable_file: str
The path to the file of gene locations in the genome. This file should be a tab-delimited file with the following columns: gene_name, chrom, cdsStart, cdsEnd.
- normalidx_file: str, optional
The path to the file containing the indices of normal spots in the spatial transcriptomics data. Each line is a single index without header.
- tumorprop_file: str, optional
The path to the file of inferred tumor proportions per spot. This file should be a tab-delimited file with the following column names: barcode, Tumor.
- filtergenelist_file: str, optional
The path to a file listing genes to exclude from CNA inference, based on prior knowledge.
- filterregion_file: str, optional
The path to a BED file listing genomic regions to exclude from CNA inference, e.g., HLA regions.
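The file formats described above can be checked before a run. The following is a minimal sketch, not part of CalicoST: the helper names `missing_columns` and `read_normal_indices` are hypothetical, and it assumes the tumor proportion file starts with a header row carrying the column names listed above.

```python
import csv
import io

# Column names taken from the tumorprop_file description above.
TUMORPROP_COLUMNS = ["barcode", "Tumor"]


def missing_columns(handle, required):
    """Return the required column names absent from a tab-delimited header row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])
    return [name for name in required if name not in header]


def read_normal_indices(handle):
    """Parse a normalidx-style file: one integer index per line, no header."""
    return [int(line) for line in handle if line.strip()]


# Toy in-memory files; real use would pass open(path, newline="") instead.
tumorprop = io.StringIO("barcode\tTumor\nSPOT_A\t0.8\nSPOT_B\t0.1\n")
normalidx = io.StringIO("0\n5\n12\n")

print(missing_columns(tumorprop, TUMORPROP_COLUMNS))
print(read_normal_indices(normalidx))
```

An empty list from `missing_columns` means every expected column name was found in the header.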
Phasing parameters
- logphase_shift: float, optional
Adjustment to the strength of the Markov model self-transition in phasing. The higher the value, the higher the self-transition probability. Default is -2.0.
- secondary_min_umi: int, optional
The minimum UMI count that a genome segment must have in the pseudobulk of spots during the genome segmentation step. Default is 300.
Clone inference parameters
- n_clones: int
The number of clones to infer using only BAF signals. Default is 3.
- n_clones_rdr: int, optional
The number of clones to refine for each BAF-identified clone using RDR and BAF signals. Default is 2.
- min_spots_per_clone: int, optional
The minimum number of spots a clone must contain to be called. Default is 100.
- min_avgumi_per_clone: int, optional
The minimum average UMI count required for a clone. Default is 10.
- nodepotential: str, optional
One of the following two options: “max” or “weighted_sum”. “max” uses the MLE decoding of the HMM to evaluate the probability of spots being in each clone. “weighted_sum” uses the full HMM posterior probabilities to evaluate the probability of spots being in each clone. Default is “weighted_sum”.
- spatial_weight: float, optional
The strength of spatial coherence in HMRF. The higher the value, the stronger the spatial coherence. Default is 1.0.
CNA inference parameters
- n_states: int
The number of allele-specific copy number states in the HMM for CNA inference.
- t: float, optional
The self-transition probability of the HMM. The higher the value, the higher the probability that adjacent genome segments are in the same CNA state. Default is 1-1e-5.
- max_iter: int, optional
The maximum number of Baum-Welch steps to perform in the HMM. Default is 30.
- tol: float, optional
The convergence threshold to terminate Baum-Welch steps. Default is 1e-4.
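To illustrate how n_states and t interact, here is a small sketch of a transition matrix with t on the diagonal. It is an illustration only: the function name `transition_matrix` is hypothetical, and spreading the remaining probability 1 - t evenly over the other states is an assumption, since the description above does not say how CalicoST distributes it.

```python
def transition_matrix(n_states, t=1 - 1e-5):
    """Row-stochastic matrix with t on the diagonal.

    Assumption for illustration: the leftover mass (1 - t) is shared
    evenly among the other n_states - 1 states.
    """
    if n_states < 2:
        return [[1.0]]
    off_diagonal = (1.0 - t) / (n_states - 1)
    return [
        [t if i == j else off_diagonal for j in range(n_states)]
        for i in range(n_states)
    ]


matrix = transition_matrix(n_states=7)
print(matrix[0][0])       # self-transition probability
print(sum(matrix[0]))     # each row sums to 1 up to rounding

# Under a fixed self-transition probability t, the expected number of
# consecutive segments spent in one state is 1 / (1 - t).
t = 1 - 1e-5
print(1.0 / (1.0 - t))
```

With the default t, a state is expected to persist for roughly 100,000 consecutive segments, which is why larger values of t yield fewer state changes along the genome.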
Merging clones with similar CNAs
- np_threshold: float, optional
The threshold of the Neyman-Pearson statistic for deciding that two clones have distinct CNA events. The higher the value, the more readily two clones are merged. Default is 1.0.
- np_eventminlen: int, optional
The minimum number of consecutive genome segments to be considered as a CN event. Default is 10.
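The role of np_eventminlen can be pictured as a minimum run length filter over genome segments. The sketch below is not CalicoST's implementation: the function `long_runs` and the boolean input are hypothetical, and how two clones are judged to differ on a segment is outside what the descriptions above specify.

```python
from itertools import groupby


def long_runs(flags, min_len=10):
    """Return (start, end) pairs, end exclusive, of True runs with length >= min_len."""
    runs = []
    position = 0
    for value, group in groupby(flags):
        length = len(list(group))
        if value and length >= min_len:
            runs.append((position, position + length))
        position += length
    return runs


# Toy input: True marks a segment where two hypothetical clones differ.
differs = [False] * 3 + [True] * 4 + [False] * 2 + [True] * 12
print(long_runs(differs, min_len=10))  # [(9, 21)]
```

Here the run of 4 differing segments is too short to count as an event, while the run of 12 is kept.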