
Recalc

(screenshot: recalc_example.png)

Recalc is short for “recalculate”, and is basically the command that tells AnkiMorphs to work all its magic. When you run Recalc, AnkiMorphs will go through the cards that match any ‘Note Filter’ and do the following:

  • Update the ankimorphs.db with any new seen morphs, known morphs, etc.
  • Calculate the score of the cards, and then sort the cards based on that score.
  • Update any cards’ extra fields and tags.

Basically, when you run Recalc, AnkiMorphs will go through your collection, recalculate the difficulty of your cards based on your new knowledge, and reorder your new cards in a way that’s optimal for the new you: the you who knows more than you did yesterday.

You can run Recalc as often as you like, but you should run it at least once before or after every study session so that your new cards will appear in the optimal order.

It’s easy to forget to run Recalc, so you can also enable the “Recalc on sync” settings option, which takes care of this for you by running Recalc automatically before Anki syncs your collection.

Note: Recalc can potentially reorganize all your cards, which can cause long sync times. The Anki FAQ has some tricks you can try if this poses a significant problem.

Scoring Algorithm

TL;DR: Low scores are good, high scores are bad.

The order in which new cards are displayed depends on their due value: a card with due = 1 will be shown before a card with due = 2, and so on. Leveraging this property, we can implement the following strategy: assign higher due values to cards with more complex text, pushing them further back in the card queue. Here are some examples of what that might look like:

  • “She walked home”
    • due = 600
  • “Asymmetric catalysis for the enantioselective synthesis of chiral molecules”
    • due = 100 000 000
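
To make the strategy concrete, here is a minimal Python sketch (the card texts and due values are just the hypothetical examples above) showing how ascending due order determines the queue:

```python
# Hypothetical (text, due) pairs; Anki shows new cards in ascending
# due order, so larger due values are pushed further back in the queue.
cards = [
    ("Asymmetric catalysis for the enantioselective synthesis of chiral molecules", 100_000_000),
    ("She walked home", 600),
]

for text, due in sorted(cards, key=lambda card: card[1]):
    print(due, text)
# 600 She walked home
# 100000000 Asymmetric catalysis for the enantioselective synthesis ...
```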

Now, let’s define some properties that we want our cards to have:

  • Few unknown morphs (comprehensibility)
  • High priority morphs (significance)
  • Ideal length (low deviation)

We can now invert these properties to calculate a “penalty” score, which then replaces the due values of the cards. At the highest level, the formula is:

\[ \large \text{score} = \text{incomprehensibility} + \text{insignificance} + \text{deviation} \]

Let’s break it down into smaller components.

incomprehensibility

In practice, the comprehensibility of a given text is determined by a combination of known grammar points and vocabulary. However, evaluating grammar is non-trivial, especially in a general language learning context, so we will not make any explicit attempts to do so.

Determining which morphs are known is relatively easy, so our incomprehensibility score will be the product of the number of unknown morphs and a constant penalty factor.

\[ \large \text{incomprehensibility} = PU \times \left| M_{\text{U}} \right| \]

where \[ {\large \begin{align*} & PU: \text{penalty for unknown} = 10^6\\ & M: \text{set of identified morphs} \\ & m_{\text{li}}: \text{morph learning interval} \\ & M_U: \text{unknown morphs} = \{ m \in M \mid m_{\text{li}} = 0 \} \\ \end{align*} } \]
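
As a minimal sketch (the function name and data shape are illustrative, not AnkiMorphs’ actual code), the term could be computed like this:

```python
PENALTY_UNKNOWN = 10**6  # PU

def incomprehensibility(morph_intervals: list[int]) -> int:
    """morph_intervals holds each identified morph's learning interval;
    an interval of 0 means the morph is unknown."""
    unknown_count = sum(1 for li in morph_intervals if li == 0)  # |M_U|
    return PENALTY_UNKNOWN * unknown_count
```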

insignificance

Each morph has a priority value, which AnkiMorphs aggregates into the following metrics: \[ {\Large \begin{align*} P_{\text{total}}^{\text{all}} &= \sum_{m \in M}{m_p} & P_{\text{average}}^{\text{all}} &= \frac{P_{\text{total}}^{\text{all}}}{\left| M \right|} \\[20pt] P_{\text{total}}^{\text{unknown}} &= \sum_{m \in M_{\text{U}}}{m_p} & & \\[20pt] P_{\text{total}}^{\text{learning}} &= \sum_{m \in M_{\text{L}}}{m_p} & P_{\text{average}}^{\text{learning}} &= \frac{P_{\text{total}}^{\text{learning}}}{\left| M_{\text{L}} \right|} \end{align*} } \]

where \[ {\large \begin{align*} & m_{\text{p}}: \text{morph priority} \\ & M_{\text{L}}: \text{learning morphs} = \{ m \in M \mid 0 < m_{\text{li}} < \text{known threshold} \} \\ \end{align*} } \]


Note: \(\large P_{\text{average}}^{\text{unknown}}\) is not included since it would not have any meaningful impact on 1T cards (cards with exactly one unknown morph, where the average equals the total).
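
A sketch of how these metrics could be computed, assuming each morph is represented as a (priority, learning interval) pair and using a hypothetical known-interval threshold:

```python
KNOWN_THRESHOLD = 21  # assumption; the real threshold is configurable

def priority_metrics(morphs: list[tuple[int, int]]) -> list[float]:
    """Return [P_total_all, P_total_unknown, P_total_learning,
    P_average_all, P_average_learning] for (priority, interval) pairs."""
    all_p = [p for p, li in morphs]
    unknown_p = [p for p, li in morphs if li == 0]
    learning_p = [p for p, li in morphs if 0 < li < KNOWN_THRESHOLD]
    return [
        sum(all_p),
        sum(unknown_p),
        sum(learning_p),
        sum(all_p) / len(all_p) if all_p else 0.0,
        sum(learning_p) / len(learning_p) if learning_p else 0.0,
    ]
```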

You can customize the algorithm by selecting any combination of these metrics and adjusting their influence on the result by changing their corresponding weights. This is done using two column vectors: one for the weights and one for the aggregated metrics. The final score is computed by taking the scalar product of these vectors:


\[ {\large \mathbf{W_P} = \begin{pmatrix} \begin{array}{l} W_{\text{total}}^{\text{all}} \\[10pt] W_{\text{total}}^{\text{unknown}} \\[10pt] W_{\text{total}}^{\text{learning}} \\[10pt] W_{\text{average}}^{\text{all}} \\[10pt] W_{\text{average}}^{\text{learning}} \\[10pt] \end{array} \end{pmatrix} \quad \mathbf{P} = \begin{pmatrix} \begin{array}{l} P_{\text{total}}^{\text{all}} \\[10pt] P_{\text{total}}^{\text{unknown}} \\[10pt] P_{\text{total}}^{\text{learning}} \\[10pt] P_{\text{average}}^{\text{all}} \\[10pt] P_{\text{average}}^{\text{learning}} \\[10pt] \end{array} \end{pmatrix} } \]

which gives us: \[ \large \text{insignificance} = W_P \cdot P = w_1 p_1 + w_2 p_2 + \cdots + w_n p_n = \sum_{i=1}^{n} w_i p_i \]

Example:

\[ {\large \mathbf{W_P} = \begin{pmatrix} \begin{array}{c} 10 \\[10pt] 0 \\[10pt] 0 \\[10pt] 0 \\[10pt] 5 \\[10pt] \end{array} \end{pmatrix} \quad \mathbf{P} = \begin{pmatrix} \begin{array}{c} 600 \\[10pt] 30 \\[10pt] 20 \\[10pt] 100 \\[10pt] 10 \\[10pt] \end{array} \end{pmatrix} } \]
\[ \large W_P \cdot P = 10 \times 600 + 0 \times 30 + 0 \times 20 + 0 \times 100 + 5 \times 10 = 6050 \]
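
In code, the insignificance term is just this dot product; a short sketch reproducing the example:

```python
weights = [10, 0, 0, 0, 5]        # W_P
metrics = [600, 30, 20, 100, 10]  # P

insignificance = sum(w * p for w, p in zip(weights, metrics))
print(insignificance)  # 6050
```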

deviation

Learning can be easier with more surrounding context, e.g., other known words. However, if a sentence contains too many words, learning may become more challenging. This is because the complexity of the grammar often increases, along with the likelihood of not perfectly remembering all the surrounding words. Ideally, we want our cards to have sentences that fall within an optimal length range.

Having the ability to bias our sentences towards a certain length is also beneficial; you might find it easier to learn from shorter sentences compared to longer ones, or vice versa.

To achieve this, we use a piecewise equation where we define the following:

  • How much to penalize excessive morphs
  • How much to penalize insufficient morphs
  • The ideal range (target) of morphs

Here is an example of what that might look like:

(figure: all_taget_diff.png)

Playground: https://www.geogebra.org/graphing/ta3eqb8y

This graph shows:

  • Penalty for excessive morphs: squared in relation to the target difference
  • Penalty for insufficient morphs: linear in relation to the target difference
  • Ideal range (target): 4-6 morphs

AnkiMorphs provides the following metrics, which you can influence by changing the coefficients:


\[ {\large D_{\text{target}}^{\text{all}} = \begin{cases} \lceil a_H (|n - T_H|^2) + b_H |n - T_H| + c_H\rceil & \text{if } n > T_H \\ \lceil a_L (|n - T_L|^2) + b_L |n - T_L| + c_L\rceil & \text{if } n < T_L \\ 0 & \text{otherwise} \end{cases} } \]


\[ {\large D_{\text{target}}^{\text{learning}} = \begin{cases} \lceil a_H (|n_L - T_H|^2) + b_H |n_L - T_H| + c_H\rceil & \text{if } n_L > T_H \\ \lceil a_L (|n_L - T_L|^2) + b_L |n_L - T_L| + c_L\rceil & \text{if } n_L < T_L \\ 0 & \text{otherwise} \end{cases} } \]

where \[ {\large \begin{align*} & D_{\text{target}}: \text{target difference} \\ & \lceil \; \rceil: \text{round up to the nearest integer} \\ & T_H: \text{high target} \\ & T_L: \text{low target} \\ & n: \text{number of morphs} = \left| M \right|\\ & n_L: \text{number of learning morphs} = \left| M_{\text{L}} \right| \\ & a_H, b_H, c_H : \text{coefficients when } n \text{ or } n_L \text{ is greater than } T_H \\ & a_L, b_L, c_L : \text{coefficients when } n \text{ or } n_L \text{ is less than } T_L \\ \end{align*} } \]

with the following vectors:

\[ {\large \mathbf{W_D} = \begin{pmatrix} \begin{array}{l} W_{\text{target}}^{\text{all}} \\[10pt] W_{\text{target}}^{\text{learning}} \\[10pt] \end{array} \end{pmatrix} \quad \mathbf{D} = \begin{pmatrix} \begin{array}{l} D_{\text{target}}^{\text{all}} \\[10pt] D_{\text{target}}^{\text{learning}} \\[10pt] \end{array} \end{pmatrix} } \]

which gives us: \[ \large \text{deviation} = W_D \cdot D = \sum_{i=1}^{2} w_i d_i \]
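
Here is a sketch of the whole deviation term, with default coefficients mirroring the example graph above (quadratic above the range, linear below, target 4-6 morphs) and both weights set to a hypothetical 1:

```python
import math

def target_difference(n: int, t_low: int = 4, t_high: int = 6,
                      a_h: float = 1, b_h: float = 0, c_h: float = 0,
                      a_l: float = 0, b_l: float = 1, c_l: float = 0) -> int:
    """Piecewise penalty: 0 inside [t_low, t_high], polynomial outside."""
    if n > t_high:
        diff = abs(n - t_high)
        return math.ceil(a_h * diff**2 + b_h * diff + c_h)
    if n < t_low:
        diff = abs(n - t_low)
        return math.ceil(a_l * diff**2 + b_l * diff + c_l)
    return 0

n_all, n_learning = 9, 2  # |M| and |M_L| of a hypothetical card
deviation = 1 * target_difference(n_all) + 1 * target_difference(n_learning)
print(deviation)  # ceil(3^2) + ceil(2) = 9 + 2 = 11
```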

Constraints

We have now refined the formula to:

\[ {\large \begin{align*} \text{score} &= \text{incomprehensibility} + \text{insignificance} + \text{deviation}\\ \text{score} &= PU \times \left| M_{\text{U}} \right| + W_P \cdot P + W_D \cdot D \end{align*} } \]

However, there are a few practical concerns we have to address.

First, we need to ensure that cards are primarily ordered by the number of unknown morphs they contain. This means that all 1T cards (one unknown morph) should appear before any MT cards (multiple unknown morphs), regardless of their insignificance or deviation.

To do this, we apply a min function to ensure that the sum of the last two terms does not exceed \(PU - 1\): \[ \large \text{score} = PU \times \left| M_{\text{U}} \right| + \min \left(W_P \cdot P + W_D \cdot D,\ PU - 1\right) \]

where \[ {\large \begin{align*} & \min: \text{select the smaller of the two arguments} \\ \end{align*} } \]

Lastly, we have to make sure that the score does not exceed the maximum card due value allowed by Anki. The due value is stored as a signed 32-bit integer, with a maximum value of \(2^{31} - 1\). To prevent overflow when cards are shifted, we include a safety margin of \(10^8\). This results in the upper bound:

\[ \large \text{score}_{\text{max}} = 2^{31} - 1 - 10^8 \]
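
Numerically, that works out to \( \large \text{score}_{\text{max}} = 2\,147\,483\,647 - 100\,000\,000 = 2\,047\,483\,647 \).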

Now we wrap the entire expression in another min function to get our final formula:

\[ \large \text{score} = \min \left(PU \times \left| M_{\text{U}} \right| + \min \left(W_P \cdot P + W_D \cdot D,\ PU - 1\right),\ \text{score}_{\text{max}} \right) \]
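
Putting it all together, here is a compact sketch of the final formula, reusing the illustrative numbers from the earlier examples (this mirrors the math above, not AnkiMorphs’ literal implementation):

```python
PENALTY_UNKNOWN = 10**6        # PU
SCORE_MAX = 2**31 - 1 - 10**8  # 2047483647

def card_score(unknown_count: int, insignificance: int, deviation: int) -> int:
    """score = min(PU * |M_U| + min(W_P.P + W_D.D, PU - 1), score_max)"""
    inner = min(insignificance + deviation, PENALTY_UNKNOWN - 1)
    return min(unknown_count * PENALTY_UNKNOWN + inner, SCORE_MAX)

# A 1T card is always due before a 2T card, however large its other terms:
print(card_score(1, 6050, 11))  # 1006061
print(card_score(2, 0, 0))      # 2000000
```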