Specification of running parameters of CalicoST
===============================================

Supporting reference files
--------------------------
geneticmap_file: str
    The path to the genetic map file.

hgtable_file: str
    The path to the file listing the locations of genes in the genome. This file should be a tab-delimited file with the following columns: gene_name, chrom, cdsStart, cdsEnd.

normalidx_file: str, optional
    The path to the file containing the indices of normal spots in the spatial transcriptomics data. Each line is a single index, and the file has no header.

tumorprop_file: str, optional
    The path to the file of inferred tumor proportions per spot. This file should be a tab-delimited file with the following column names: barcode, Tumor.

filtergenelist_file: str, optional
    The path to a file listing genes to exclude from CNA inference, based on prior knowledge.

filterregion_file: str, optional
    The path to a BED file listing genomic regions to exclude from CNA inference, e.g., HLA regions.

Phasing parameters
------------------
logphase_shift: float, optional
    Adjustment to the strength of the Markov model self-transition in phasing. The higher the value, the higher the self-transition probability. Default is -2.0.

secondary_min_umi: int, optional
    The minimum UMI count that a genome segment must have in the pseudobulk of spots during genome segmentation. Default is 300.

Clone inference parameters
--------------------------
n_clones: int
    The number of clones to infer using only BAF signals. Default is 3.

n_clones_rdr: int, optional
    The number of clones to refine for each BAF-identified clone using RDR and BAF signals. Default is 2.

min_spots_per_clone: int, optional
    The minimum number of spots a clone must contain to be called. Default is 100.

min_avgumi_per_clone: int, optional
    The minimum average UMI count required for a clone. Default is 10.

maxspots_pooling: int, optional
    If the UMI counts per spot are too low, CalicoST will pool this number of adjacent spots to infer the clone assignment at each HMRF step. Default is 7.

nodepotential: str, optional
    One of the following two options: "max" or "weighted_sum". "max" uses the MLE decoding of the HMM to evaluate the probability of spots being in each clone. "weighted_sum" uses the full HMM posterior probabilities to evaluate the probability of spots being in each clone. Default is "weighted_sum".

spatial_weight: float, optional
    The strength of spatial coherence in the HMRF. The higher the value, the stronger the spatial coherence. Default is 1.0.

construct_adjacency_method: str, optional
    One of two methods to construct the adjacency graph for the HMRF: "hexagon" or "KNN". "hexagon" assumes the spots form a hexagonal grid, as in the Visium platform. "KNN" assumes the spot locations are arbitrary and uses K-nearest neighbors to construct the adjacency graph. Default is "hexagon".

construct_adjacency_w: float, optional
    If using KNN to construct the adjacency matrix, CalicoST allows combining spatial similarity with expression similarity. This weight, ranging between 0 and 1, specifies the weight of spatial similarity. Default is 1.0.

CNA inference parameters
------------------------
n_states: int
    The number of allele-specific copy number states in the HMM for CNA inference.

t: float, optional
    The self-transition probability of the HMM. The higher the value, the higher the probability that adjacent genome segments are in the same CNA state. Default is 1-1e-5.

max_iter: int, optional
    The number of Baum-Welch steps to perform in the HMM. Default is 30.

tol: float, optional
    The convergence threshold to terminate Baum-Welch steps. Default is 1e-4.

Merging clones with similar CNAs
--------------------------------
np_threshold: float, optional
    The threshold of the Neyman-Pearson statistic used to decide whether two clones have distinct CNA events. The higher the value, the more easily two clones are merged. Default is 1.0.

np_eventminlen: int, optional
    The minimum number of consecutive genome segments to be considered as a CN event. Default is 10.
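
The defaults and allowed values listed in this document can be collected into a small checking helper before a run. The following is a minimal sketch, not part of CalicoST: ``resolve_parameters`` and the constant names are hypothetical, the file paths are placeholders, and ``n_states = 7`` is an arbitrary example value. Only the defaults, the two sets of string options, and the 0 to 1 ranges are taken from the descriptions above.

```python
# Hypothetical helper (not part of CalicoST): the defaults and constraints
# below are copied from the parameter descriptions in this document.
DOCUMENTED_DEFAULTS = {
    "logphase_shift": -2.0,
    "secondary_min_umi": 300,
    "n_clones": 3,
    "n_clones_rdr": 2,
    "min_spots_per_clone": 100,
    "min_avgumi_per_clone": 10,
    "maxspots_pooling": 7,
    "nodepotential": "weighted_sum",
    "spatial_weight": 1.0,
    "construct_adjacency_method": "hexagon",
    "construct_adjacency_w": 1.0,
    "t": 1 - 1e-5,
    "max_iter": 30,
    "tol": 1e-4,
    "np_threshold": 1.0,
    "np_eventminlen": 10,
}

# Parameters documented without the "optional" marker and without a default.
REQUIRED = ("geneticmap_file", "hgtable_file", "n_states")

# Optional input files, which have no default value.
OPTIONAL_FILES = (
    "normalidx_file",
    "tumorprop_file",
    "filtergenelist_file",
    "filterregion_file",
)

# String parameters restricted to the options named in this document.
ALLOWED_CHOICES = {
    "nodepotential": {"max", "weighted_sum"},
    "construct_adjacency_method": {"hexagon", "KNN"},
}


def resolve_parameters(user_params):
    """Overlay user values on the documented defaults and validate the result."""
    known = set(DOCUMENTED_DEFAULTS) | set(REQUIRED) | set(OPTIONAL_FILES)
    errors = [
        f"unknown parameter: {name}"
        for name in sorted(set(user_params) - known)
    ]

    # User-supplied values take precedence over the documented defaults.
    params = {**DOCUMENTED_DEFAULTS, **user_params}

    errors += [
        f"missing required parameter: {name}"
        for name in REQUIRED
        if name not in params
    ]

    for name, choices in ALLOWED_CHOICES.items():
        if params[name] not in choices:
            errors.append(
                f"{name} must be one of {sorted(choices)}, got {params[name]!r}"
            )

    # construct_adjacency_w is documented as ranging between 0 and 1,
    # and t is a probability, so both must lie in [0, 1].
    for name in ("construct_adjacency_w", "t"):
        if not 0.0 <= params[name] <= 1.0:
            errors.append(f"{name} must be between 0 and 1, got {params[name]!r}")

    if errors:
        raise ValueError("; ".join(errors))
    return params


# Example call with placeholder paths and an arbitrary n_states value.
example = resolve_parameters(
    {
        "geneticmap_file": "genetic_map.tsv",
        "hgtable_file": "hgTables.txt",
        "n_states": 7,
        "construct_adjacency_method": "KNN",
        "construct_adjacency_w": 0.5,
    }
)
```

Collecting all problems into one ``ValueError`` rather than stopping at the first makes it easier to fix a parameter set in a single pass. How CalicoST itself reads and checks its parameters is not described here, so this helper should be treated as a pre-flight check only.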