scrm uses a syntax compatible with the popular program ms. There are, however, a few differences to ms:
- scrm does not support gene conversion (-c in ms) or fixing the number of segregating sites (-s),
- the -L option produces a slightly different output,
- scrm adds the options -l (approximation), -sr (changing recombination rate), -st (changing mutation rate), -eI (sampling haplotypes at multiple time points) and -oSFS (generates frequency spectra),
- ms requires the number of populations as an extra argument to -ema. Our version of the command is just
  -ema <t> <M11> <M12> ...
  instead of
  -ema <t> <npop> <M11> <M12> ...

For all other options, you can also refer to the ms manual to get a detailed description of what the commands are doing.
scrm should happily execute any ms command that does not contain -c, -s or -ema. Also, scrm has somewhat stricter requirements regarding the order of arguments if population admixture (-es) is involved.
The arguments for calling scrm are
scrm <nhap> <nrep> [...]
where nhap is the total number of haplotypes (in all
populations and at all times) that are simulated at each locus, and
nrep is the number of independent loci that will be produced.
The [...] is an optional placeholder for an arbitrary number of command line flags described below.
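As a rough illustration of this call signature, the following Python sketch assembles the argument list for a scrm call. The helper name `scrm_command` is hypothetical and not part of scrm, and the numeric values are arbitrary. The resulting list could be handed to `subprocess.run`, assuming an executable named `scrm` is on the search path.

```python
def scrm_command(nhap, nrep, *flags):
    """Assemble the argument list for `scrm <nhap> <nrep> [...]`.

    Hypothetical helper for illustration; it only formats arguments
    and does not check them against scrm's own rules.
    """
    if nhap < 1 or nrep < 1:
        raise ValueError("nhap and nrep must be positive integers")
    # Every argument is passed to the program as a string.
    return ["scrm", str(nhap), str(nrep)] + [str(f) for f in flags]


# 10 haplotypes, 5 independent loci, no further flags
basic = scrm_command(10, 5)
# The same call with two of the flags documented in this section
with_flags = scrm_command(10, 5, "-r", 20, 1000, "-T")
print(" ".join(with_flags))  # scrm 10 5 -r 20 1000 -T
```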
-r <R> <L>
: Set the recombination rate to R = 4N0r and the length of all loci to L base pairs. r is the expected number of recombinations on the locus per generation.

-l <l>
: Use approximation rather than simulating the exact ARG. Within a sliding window of length l base pairs, all linkage information is considered when building the genealogy. For positions outside of this window, some linkage is ignored. Setting l=0 produces the SMC’ and l=-1 deactivates the approximation. Since v1.6.0, it is also possible to specify the window’s length in number of recombinations. To do so, use -l <x>r, where x is the number of recombinations (e.g. -l 100r for a window spanning 100 recombinations). Also starting with version 1.6.0, approximation is turned on by default using a conservative window length of 500 recombinations. For most applications, it should be fine to reduce this value to 100 to 250 recombinations if runtime is a critical factor.

In all commands, migration rates are given as M = 4N0m, where m is the fraction of a population that is replaced with migrants from other populations each generation (looking forwards in time).
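The scaled rates used on the command line (R = 4N0r for recombination, M = 4N0m for migration) can be derived from per-generation quantities. The sketch below is a minimal illustration. The function names, the reference size N0 and all numeric values are assumptions chosen for the example, not values taken from scrm.

```python
def scaled_recombination_rate(n0, r):
    """R = 4 * N0 * r, where r is the expected number of
    recombinations on the locus per generation."""
    return 4 * n0 * r


def scaled_migration_rate(n0, m):
    """M = 4 * N0 * m, where m is the fraction of a population
    replaced by migrants each generation."""
    return 4 * n0 * m


# Hypothetical values, chosen only for illustration
N0 = 10_000
r_locus = 1e-8 * 100_000  # assumed 1e-8 per base pair on a 100 kb locus
R = scaled_recombination_rate(N0, r_locus)
M = scaled_migration_rate(N0, 2.5e-5)
print(R, M)
```

With these assumed inputs R is close to 40 and M is close to 1, which are the values that would be passed to -r and to the migration options.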
-I <npop> <s1> ... <sn> [<M>]
: Use an island model with npop populations, where s1 to sn haplotypes are sampled from population 1 to n, respectively. Optionally assume a symmetric migration rate of M.

-M <M>
: Assume a symmetric migration rate of M/(npop-1).

-m <i> <j> <M>
: Set the migration rate from population j to population i to M (looking forward in time) [since v1.3.1].

-ma <M11> <M12> ... <M21> ...
: Set the migration matrix (dimension is npop x npop). Diagonal elements are ignored but required (you can use x or 0).

For exponential growth/decline of a population, the parameter a changes the size of a population according to the formula N(s) = N(0)exp(-as), where N(0) is the population’s size at the time of the command (e.g. 0 for -g <i> <a> and -G <a>, and t for -eg <t> <i> <a> and -eG <t> <a>) and N(s) is the size of the population s time units in the past. Looking forwards in time, a positive a leads to population growth, while a negative one generates a decline in population size.
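The growth formula can be checked numerically. The following sketch evaluates N(s) = N(0)exp(-as) for a positive and a negative rate. The function name and the example values are illustrative assumptions.

```python
import math


def population_size(n_start, a, s):
    """N(s) = N(0) * exp(-a * s): the size s time units further in the
    past than the time at which the growth rate a was set."""
    return n_start * math.exp(-a * s)


# Positive a: the population was smaller in the past,
# which means growth when looking forwards in time.
past_growth = population_size(1.0, 2.0, 0.5)
# Negative a: the population was larger in the past,
# which means decline when looking forwards in time.
past_decline = population_size(1.0, -2.0, 0.5)
print(past_growth, past_decline)
```

Here `past_growth` equals exp(-1), roughly 0.37 times the starting size, and `past_decline` equals exp(1), roughly 2.72 times the starting size.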
-n <i> <n>
: Set the present day size of population i to n*N0.

-G <a>
: Set the exponential growth rate of all populations to a.

-g <i> <a>
: Set the exponential growth rate of population i to a.

-t <$\theta$>
: Set the mutation rate to \(\theta = 4N_0u\), where u is the neutral mutation rate per locus. If this option is given, scrm generates the segregating sites output.

-transpose-segsites or --transpose-segsites
: If given, the segregating sites are printed with each row representing a mutation and each column representing a haplotype, rather than the other way round. Additionally, the time at which a mutation occurred is reported (in units of 4 * N0 generations) [since v1.7.0].

-T
: Print the local genealogies in Newick format.

-O
: Print the local genealogies in the oriented forest format as described in Kelleher et al. (2014) [since v1.2].

-L
: Print the TMRCA and the local tree length for each segment (behaves differently from ms). Both values are scaled in coalescent time units, i.e. in units of 4 * N0 generations.

-oSFS
: Print the site frequency spectrum. Requires that the mutation rate \(\theta\) is given with the -t option.

-SC [ms|rel|abs]
: Scaling of sequence positions. Either relative to the locus length between 0 and 1 (rel), absolute in base pairs (abs) or ms’s scaling (ms, the default), where the positions in the segregating sites output are relative and the positions in the trees output are absolute [since v1.3.0].

-seed <SEED> [<SEED2> <SEED3>]
: Specifies a seed for the simulation. You can input up to three non-negative numbers. If no seed is given, scrm generates one using entropy provided by the operating system. To reproduce a previous simulation, use the single number in the second line of the output.

-print-model, --print-model
: Prints information about the model defined by the command line arguments, including calculated population sizes. Can be useful for debugging or verifying the model [since v1.5.0].

-p <digits>
: Sets the number of significant digits used in the output [since v1.4.0].

-h, --help
: Prints a help text.

-v, --version
: Prints version information.

The commands in this section all take a time t as their first parameter. Changes made by the commands affect the time from t further back into the past. All times are in units of 4*N0 generations.
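Since all event times are measured in units of 4*N0 generations, times known in generations have to be rescaled before they are passed to these commands. A minimal sketch of that conversion follows. The function names and the value of N0 are assumptions made for the example.

```python
def generations_to_scaled_time(generations, n0):
    """Convert a time in generations to units of 4 * N0 generations."""
    return generations / (4 * n0)


def scaled_time_to_generations(t, n0):
    """Convert a time in units of 4 * N0 generations back to generations."""
    return t * 4 * n0


N0 = 10_000  # hypothetical reference population size
t = generations_to_scaled_time(2_000, N0)
print(t)  # 0.05
```

An event that happened 2000 generations ago would therefore be given the time 0.05 under this assumed N0.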
-eI <t> <s1> ... <sn>
: Sample s1 to sn haplotypes from population 1 to n, respectively, at time t.

-eM <t> <M>
: Assume a symmetric migration rate of M/(npop-1) at time t.

-em <t> <i> <j> <M>
: Set the migration rate from population j to population i to M (looking forward in time) at time t [since v1.3.1].

-ema <t> <M11> <M12> ... <M21> ...
: Set the migration matrix at time t (dimension is npop x npop). Diagonal elements are ignored but required (use x or 0). The rates apply pastwards from time t.

-eN <t> <n>
: Set the size of all populations to n*N0 at time t.

-en <t> <i> <n>
: Set the size of population i to n*N0 at time t.

-eg <t> <i> <a>
: Set the exponential growth rate of population i to a at time t.

-eG <t> <a>
: Set the exponential growth rate of all populations to a at time t.

-es <t> <i> <p>
: Population admixture. Replaces a fraction 1-p of population i with haplotypes from a population npop + 1. Technically (and looking backwards in time), a new population npop + 1 with size N0 is created at time t. Migration rates (to and from) and growth rates for this population are initially 0. Each line in population i is moved to the new population with probability 1-p. Please sort multiple -es arguments by their time to avoid confusion about the numbering of populations. Please give the arguments that affect the whole population (-M, -N, -G & -ma) before giving the first -es. Their timed equivalents (-eM, -eN, -eG, -eI & -ema) must also be sorted by time on the command line, at least relative to the -es arguments. scrm throws an error if any of these conditions is not met. If in doubt, just sort all command line arguments by their time.

-eps <t> <i> <j> <p>
: Partial admixture. Similar to -es, but replaces a fraction 1-p of population i with haplotypes from population j at time t. Unlike with -es, population j is a normal population that continues to exist at times more recent than t. Viewed backwards in time, this moves a fraction 1-p of the lineages in population i to population j. This does not change the number of populations, population sizes, growth or migration rates in any way [since v1.5.0].

-ej <t> <j> <i>
: Adds a speciation event in population i that creates population j (forwards in time). Technically (and looking backwards in time), it moves all lines from population j into population i at time t. Migration rates into population j are set to 0 for the time further back into the past.

When multiple -es, -eps or -ej arguments are given for the same time t, the migrations are executed in the order in which the commands are given. For example, if we have -es 0.08 2 .2 -ej 0.08 3 1, first 80% of pop 2 move to a newly created pop 3 (viewed backwards in time), then everyone that just moved to pop 3 moves on to pop 1. This is equivalent to -eps 0.08 2 1 .2, except that the latter does not create the empty population 3.
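One way to follow the advice of sorting time-stamped arguments, while keeping the given order of events that share the same time, is a stable sort by time. The sketch below assumes events are held as (flag, time, further arguments) tuples. This representation and the helper names are hypothetical and not part of scrm.

```python
def sort_events(events):
    """Sort (flag, time, args) tuples by time. Python's sort is stable,
    so events with equal times keep the order in which they were given."""
    return sorted(events, key=lambda event: event[1])


def to_arguments(events):
    """Flatten the time-sorted events into command line arguments."""
    argv = []
    for flag, time, args in sort_events(events):
        argv += [flag, str(time)] + [str(a) for a in args]
    return argv


events = [
    ("-es", 0.08, (2, 0.2)),  # admixture, creates population 3
    ("-ej", 0.08, (3, 1)),    # same time, stays after the -es above
    ("-en", 0.02, (1, 0.5)),  # more recent event that was listed last
]
print(" ".join(to_arguments(events)))
# -en 0.02 1 0.5 -es 0.08 2 0.2 -ej 0.08 3 1
```

The -es and -ej events at time 0.08 keep their relative order, so the lineages are still moved to population 3 before being moved on to population 1.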
The following commands change the model parameters from a sequence position s onwards. You should still set the initial rate with -r or -t, respectively, and then use the commands prefixed with s for all changes. Note that -r also takes the total length of the sequence as its second argument, while -sr takes only the position and the new rate.
-sr <s> <R>
: Set the recombination rate to R starting at position s.

-st <s> <$\theta$>
: Set the mutation rate to \(\theta\) starting at position s.
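For illustration, a piecewise-constant rate along the sequence could be translated into arguments as sketched below. The helper name and all rate values are hypothetical, and ordering the changes by position is a choice made in this sketch rather than a documented requirement. Only the -r, -t, -sr and -st syntax described here is used.

```python
def rate_arguments(initial_flag, change_flag, initial_args, changes):
    """Build the initial rate flag followed by one change flag per
    (position, rate) pair, ordered by sequence position."""
    argv = [initial_flag] + [str(a) for a in initial_args]
    for position, rate in sorted(changes):
        argv += [change_flag, str(position), str(rate)]
    return argv


# Recombination: R = 10 on a 10000 bp locus, R = 20 from position 5000 on
recombination = rate_arguments("-r", "-sr", (10, 10000), [(5000, 20)])
# Mutation: theta = 5 initially, theta = 2.5 from position 2000 on
mutation = rate_arguments("-t", "-st", (5,), [(2000, 2.5)])
print(" ".join(["scrm", "4", "1"] + recombination + mutation))
# scrm 4 1 -r 10 10000 -sr 5000 20 -t 5 -st 2000 2.5
```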