Introduction
AnkiMorphs is an Anki add-on that can rearrange your cards based on how well you know the words on them and how important the words are to learn. This ensures that your cards are arranged in the best order for optimal language learning.
AnkiMorphs goes through the text on the cards you specify, and parses the text into morphs (basically words). It assumes you already know all the morphs contained within the cards you’ve learned. In this way, it creates a database of your current knowledge and uses that database to analyze how many unknown morphs are contained within each of your new cards.
It then reorders your new cards based on their score so that you see the easiest cards (i.e., the cards with the fewest number of unknown morphs) first. AnkiMorphs only reorders your new cards; it doesn’t touch the scheduling of cards you’ve already learned. You can tell AnkiMorphs to re-analyze and reorder your cards as often as you like. This allows you to always learn new cards in a 1T fashion.
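The idea described above can be illustrated with a minimal sketch. This is not AnkiMorphs' actual scoring code: the function names and the card dictionaries are hypothetical, and the real score also takes morph priority into account. It only shows the core idea of counting unknown morphs per new card and sorting by that count.

```python
# Hypothetical sketch, not the add-on's real implementation.

def count_unknown(morphs, known_morphs):
    """Return how many distinct morphs on a card are not yet known."""
    return len(set(morphs) - known_morphs)


def order_new_cards(new_cards, known_morphs):
    """Sort new cards so that those with the fewest unknown morphs come first."""
    return sorted(
        new_cards,
        key=lambda card: count_unknown(card["morphs"], known_morphs),
    )


# Morphs gathered from cards that have already been learned.
known = {"I", "like", "cats"}

new_cards = [
    {"id": 1, "morphs": ["I", "adore", "fluffy", "cats"]},  # 2 unknown morphs
    {"id": 2, "morphs": ["I", "like", "dogs"]},             # 1 unknown morph (1T)
    {"id": 3, "morphs": ["I", "like", "cats"]},             # 0 unknown morphs
]

ordered = order_new_cards(new_cards, known)
print([card["id"] for card in ordered])  # [3, 2, 1]
```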
This guide is an attempt to explain how AnkiMorphs functions as simply as possible. Feel free to skip straight to Installation, Setup, or Usage, and refer back to the Glossary whenever clarification is needed.
Glossary
1T Sentence
Abbreviation for “one-target sentence”. A sentence that contains one unknown word or grammar structure. The unknown word or structure is referred to as the “target word” or “target structure”.
Learning through 1T sentences can be thought of as “picking low-hanging fruit”. It makes the target word/structure easy to understand and retain. As you continue to learn, sentences that were previously one-target will become zero-target, and sentences that were previously multi-target will become one-target. In this way, one-target sentences can take you all the way to fluency.
Learn about the "Input Hypothesis"
MT Sentence
Abbreviation for “multi-target sentence”. A sentence that contains more than one unknown word or grammar structure.
Morph
A morph is a basic unit of meaning in language. It's short for the word "morpheme," which is the smallest grammatical unit of speech. A morpheme can be a whole word, like "book" or "run," or a part of a word, like prefixes (re- in "rewrite") or suffixes (-ed in "walked").
Lemma
A lemma is the base form of a word. It's the version you would typically find in a dictionary. For example:
- The lemma for "running," "ran," and "runs" is "run."
- The lemma for "better" and "best" is "good."
Inflection
An inflection is a variation of the base form that shows different grammatical features such as tense, case, voice, aspect, person, number, gender, mood, or comparison. For example:
- "run" (base form) can change to "running," "ran," or "runs" to show different tenses.
- "good" (base form) can change to "better" or "best" to show comparison.
Morphs as tuples
In many language learning systems, morphs are considered as tuples containing two values: a lemma (base form) and an inflection. Here's a simple example table showing different morphs for the verb "to break":
Lemma | Inflection |
---|---|
break | break |
break | broke |
break | breaking |
break | broken |
Understanding and breaking down morphs into lemmas and inflections can be incredibly useful for language learning. It allows you to focus on the fundamental building blocks of words, making it easier to grasp new vocabulary and grammatical structures. This approach can help in creating more effective and personalized study methods, potentially leading to faster and more efficient learning.
sub2srs
You can get automatically generated Anki cards from TV shows or movies by using a tool called sub2srs. Generating decks with sub2srs is pretty technical, so I recommend finding sub2srs decks other people have already made.
You can download many different anime sub2srs decks from this site.
New cards
A card is considered 'new' by Anki if it hasn't been reviewed yet, meaning you have never answered the card with 'Again', 'Hard', 'Good', or 'Easy'.
You can tell if a card is in the 'new' state when its due value looks like this: New #....
After reviewing a card, you can change its state back to 'new' by using the reset option.
Reviewed cards
Once a card has been reviewed, i.e., answered with 'Again', 'Hard', 'Good', or 'Easy', it moves from the 'new' state into the 'review' state.
Profile folder
For AnkiMorphs to work, it needs to use some dedicated files and folders, namely:
ankimorphs.db
names.txt
priority-files/
known-morphs/
Those can be found in the Anki profile folder. The path to the Anki profile folder depends on your operating system:
- Windows:
C:\Users\[user]\AppData\Roaming\Anki2\[profile_name]
- Mac:
/Users/[user]/Library/Application Support/Anki2/[profile_name]
- Linux:
/home/[user]/.local/share/Anki2/[profile_name]
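If you want to locate these files from a script, the paths above can be built as in the following sketch. It assumes Anki uses its default data location on each operating system; the function name and the example profile name "User 1" are only illustrations.

```python
import sys
from pathlib import Path


def anki_profile_folder(profile_name, platform=sys.platform, home=None):
    """Build the default Anki profile folder path for the given platform.

    Assumes the default Anki data location listed in this guide.
    """
    home = Path(home) if home is not None else Path.home()
    if platform.startswith("win"):
        base = home / "AppData" / "Roaming" / "Anki2"
    elif platform == "darwin":  # macOS
        base = home / "Library" / "Application Support" / "Anki2"
    else:  # Linux and other Unix-like systems
        base = home / ".local" / "share" / "Anki2"
    return base / profile_name


# The dedicated files and folders AnkiMorphs uses inside the profile folder.
ANKIMORPHS_ITEMS = ["ankimorphs.db", "names.txt", "priority-files", "known-morphs"]

profile = anki_profile_folder("User 1")  # "User 1" is only an example name
for item in ANKIMORPHS_ITEMS:
    print(profile / item)
```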
Installation
You can download the latest version of AnkiMorphs from AnkiWeb. You can find previous versions on GitHub releases.
AnkiMorphs parses text into morphs by using external morphemizers, and different languages will require different morphemizers. Below are the currently supported morphemizers:
Japanese morphemizers
Japanese has two available morphemizers:
MeCab morphemizer (recommended)
This can be added by installing the ankimorphs-japanese-mecab companion add-on (installation code: 1974309724). Once this add-on has been installed and Anki has been restarted, the morphemizer will show up as the option AnkiMorphs: Japanese
Install spaCy with Japanese models
Chinese morphemizers
Chinese has two available morphemizers:
Jieba morphemizer (recommended)
This can be added by installing the ankimorphs-chinese-jieba companion add-on (installation code: 1857311956). Once this add-on has been installed and Anki has been restarted, the morphemizer will show up as the option AnkiMorphs: Chinese
Install spaCy with Chinese models
Morphemizers for other languages
For other languages you can install spaCy, which currently supports:
Catalan, Chinese, Croatian, Danish, Dutch, English, Finnish, French, German, Greek (Modern), Italian, Japanese, Korean, Lithuanian, Macedonian, Norwegian (Bokmål), Polish, Portuguese, Romanian, Russian, Slovenian, Spanish, Swedish, Ukrainian.
After the installation is complete, some setup is required to get AnkiMorphs to work. After that you can run Recalc and you will be good to go!
Here is an overview of the changes that are made to Anki after installing AnkiMorphs.
Choosing spaCy Models
First you need to find out which spaCy model(s) you want to download. A spaCy model determines which language is used and how to interpret that language. Find the names of the models you want to use from the spaCy website, e.g., en_core_web_sm.
Note that there are usually four different types of models to choose from, each with their distinct suffixes:
- sm (small model)
- md (medium model)
- lg (large model)
- trf (transformer model) NOT SUPPORTED BY ANKIMORPHS!
Larger models are slower, but they might produce better results.
If you go to the spaCy website: Language support section, you can click on the packages of a language to see which models are available.
Installing spaCy
There is, unfortunately, no super simple way to integrate spaCy with Anki, so we have to perform a little bit of terminal magic. This is because spaCy has a relatively large size (usually ~400 MB), so it can't be included as part of AnkiMorphs itself.
Another problem is that we have to install spaCy with the same Python version that Anki uses. If you downloaded Anki from their website, the Python version will be 3.9. This version of Python is considered outdated, which complicates the process somewhat.
If your Anki is from another source, then read the section below:
Non-standard Anki builds
Note: If you are using a non-standard Anki build (e.g. anki-bin from AUR), then the Python version will probably not be 3.9. To check which Python version your Anki is using, go to Help -> About, and you will find something like this: Python 3.9.15 Qt 6.6.1 PyQt 6.6.1
Because of the way the Python packaging system works, we have to install spaCy with a Python version that has the same first two number groups, i.e. if your Anki shows Python 3.11.xx, you can install spaCy using any Python 3.11.yy version.
The rest of this guide assumes Anki uses Python 3.9, but if that is not the case, then substitute 3.9 in the terminal commands with whatever your Anki is using.
With all that being said, this only needs to be done once, so hopefully it's not too bad.
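The "same first two number groups" rule can be expressed as a small check. This is only an illustrative sketch; the function name is made up, and the version strings are the kind you would read from Help -> About and from your own Python installation.

```python
import sys


def same_minor_version(anki_python, candidate=None):
    """Return True if both versions share the same first two number groups.

    anki_python: the version shown in Anki's Help -> About, e.g. "3.9.15".
    candidate:   the version you plan to install spaCy with; defaults to
                 the interpreter running this script.
    """
    if candidate is None:
        candidate = "{}.{}.{}".format(*sys.version_info[:3])
    return anki_python.split(".")[:2] == candidate.split(".")[:2]


print(same_minor_version("3.9.15", "3.9.13"))  # True: both are 3.9
print(same_minor_version("3.9.15", "3.11.4"))  # False: 3.9 vs 3.11
```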
Windows
First, we need to have Python 3.9 on our system. Go to the start menu, open a Command Prompt, and type in:
py -3.9 --version
If your output is not Python 3.9.x, then 3.9 has to be installed.
Note: If you install Python 3.9 in a different way than the instructions below, then you might encounter important differences that could prevent you from accessing the spaCy morphemizers in Anki.
Installing Python
Go to https://www.python.org/downloads/release/python-3913/ and download the Windows installer (64-bit) at the bottom of the page.
Note: When you start the installer, make sure to select the Add python.exe to PATH checkbox at the very bottom.
Install with the default settings ("Install Now").
After the installation, go back to the command prompt and type in py -3.9 --version again. You should now see the new Python version you installed.
Now we are ready to install spaCy and the models you want to use. Paste these commands into the command prompt:
cd %HOMEPATH%\AppData\Roaming\Anki2\addons21
py -3.9 -m pip install --upgrade pip virtualenv
py -3.9 -m venv spacyenv
spacyenv\Scripts\activate
py -m pip install --upgrade pip setuptools wheel
py -m pip install --upgrade spacy six
In the same command prompt, we now want to download the models. Here I'll use the Korean model ko_core_news_sm and the Russian model ru_core_news_sm.
py -m spacy download ko_core_news_sm
py -m spacy download ru_core_news_sm
deactivate
Now those spaCy models should be available as morphemizers in AnkiMorphs!
macOS
First, we need to have Python 3.9 on our system. Open a terminal and type:
python3.9 --version
If your output is not Python 3.9.x, then 3.9 has to be installed.
Note: If you install Python 3.9 in a different way than the instructions below, then you might encounter important differences that could prevent you from accessing the spaCy morphemizers in Anki.
Installing Python
Go to https://www.python.org/downloads/release/python-3913/ and download the macOS 64-bit universal2 installer at the bottom of the page.
Install with the default settings ("Install Now").
After the installation, open a new terminal and type in python3.9 --version again. You should now see the new Python version you installed.
Now we are ready to install spaCy and the models you want to use. Paste this into the terminal:
cd ~/Library/Application\ Support/Anki2/addons21
python3.9 -m pip install --upgrade pip virtualenv
python3.9 -m venv spacyenv
. spacyenv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install --upgrade spacy six
In the same terminal, we now want to download the models. Here I'll use the Korean model ko_core_news_sm and the Russian model ru_core_news_sm.
python -m spacy download ko_core_news_sm
python -m spacy download ru_core_news_sm
deactivate
Now those spaCy models should be available as morphemizers in AnkiMorphs!
Linux
First, we need to have Python 3.9 on our system. Open a terminal and type:
python3.9 --version
If your output is not Python 3.9.x, then 3.9 has to be installed.
Installing Python
This is the hardest part of the installation process because Python 3.9 is considered dead, and it can therefore be tricky to download and install.
If you are on Ubuntu or another Debian-based distro that supports PPAs, you can install it from the deadsnakes PPA:
sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.9
sudo apt install python3.9-venv
Another alternative that also works on other distros is pyenv.
After the installation, open a new terminal and type in python3.9 --version again. You should now see the new Python version you installed.
Now we are ready to install spaCy and the models you want to use. Open a terminal and cd to the addons21 directory, e.g.:
cd ~/.local/share/Anki2/addons21/
Then install spaCy:
python3.9 -m pip install --upgrade pip virtualenv
python3.9 -m venv spacyenv
source spacyenv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install --upgrade spacy six
In the same terminal, we now want to download the models. Here I'll use the Korean model ko_core_news_sm and the Russian model ru_core_news_sm.
python -m spacy download ko_core_news_sm
python -m spacy download ru_core_news_sm
deactivate
Now those spaCy models should be available as morphemizers in AnkiMorphs!
Potential problems
PowerShell Execution Policy Error
This is a safeguard against running malicious scripts, which is generally a good thing. To allow an exception for this one time, you can use the command:
Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Scope Process
If you want to permanently remove this restriction for your user, then use the command:
Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Scope CurrentUser
ImportError: cannot import name 'ModelMetaclass' from 'pydantic.main'
Some people on Arch have experienced problems getting spaCy to work when they have already installed the python-spacy and python-thinc AUR packages. Uninstalling those packages can potentially fix this import error. For more info see issue #239.
Changes To Anki
After installing AnkiMorphs you will find that some changes have been made to Anki.
Toolbar
The toolbar now has three new items:
- Recalc
- L, which stands for Known Morph Lemmas
- I, which stands for Known Morph Inflections
English examples of L and I
Each column in the table contains a morph lemma, and every row in a column contains a different inflection of that lemma.
go | break | read | walk |
---|---|---|---|
went | broke | read | walked |
going | breaking | reading | walking |
gone | broken | read | walked |
- Knowing one inflection of one lemma would give you L: 1 and I: 1
- Knowing two different inflections of the same lemma would give you L: 1 and I: 2
- Knowing two different inflections of one lemma and one inflection of another lemma would give you L: 2 and I: 3
Japanese examples of L and I
Each column in the table contains a morph lemma, and every row in a column contains a different inflection of that lemma.
ない | 物 | 奴 | 出 |
---|---|---|---|
ねぇ | もの | やつ | 出る |
ね | もん | ヤツ | 出よう |
- Knowing one inflection of one lemma would give you L: 1 and I: 1
- Knowing two different inflections of the same lemma would give you L: 1 and I: 2
- Knowing two different inflections of one lemma and one inflection of another lemma would give you L: 2 and I: 3
The L and I numbers are updated after every Recalc.
Note: Chinese and other languages that don't have inflections will result in L and I having equal numbers.
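As a rough illustration of how the two numbers relate, here is a sketch that treats each known morph as a (lemma, inflection) tuple, as described in the Glossary. The function name is hypothetical, and the sketch ignores the setting that decides when a morph counts as known.

```python
def known_counts(known_morphs):
    """Return (L, I) for a set of known (lemma, inflection) tuples.

    L is the number of distinct lemmas, I is the number of distinct
    (lemma, inflection) pairs.
    """
    lemmas = {lemma for lemma, _inflection in known_morphs}
    return len(lemmas), len(known_morphs)


print(known_counts({("go", "went")}))                                     # (1, 1)
print(known_counts({("go", "went"), ("go", "going")}))                    # (1, 2)
print(known_counts({("go", "went"), ("go", "going"), ("break", "broke")}))  # (2, 3)
```

For a language without inflections, every lemma has exactly one form, so both numbers come out equal.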
Browse
AnkiMorphs adds new options in the Browse window.
These options can be accessed either from the context menu when right-clicking cards, or from the AnkiMorphs menu at the top of the Browse window:
These features are explained here.
Tools Menu
An AnkiMorphs submenu is added to the Anki Tools menu:
You can find info about the options here:
Setup
The setup guide is separated into the following sections:
- Decks: which deck-options to use and other miscellaneous deck tips
- Settings: details about the AnkiMorphs settings options
- Prioritizing: how to give priority to morphs
- Names: how to specify names to ignore
- Setting Known Morphs: how to import known morphs
Decks
The more cards you have with unique morph combinations, the more likely AnkiMorphs is to find the ideal next learning card for you. Large decks like those generated by sub2srs are therefore well-suited for AnkiMorphs. The drawbacks of large decks are that they require more disk space and the first sync might take a while, so find a size that works best for you.
In this guide I will be using this Japanese to English deck with 37K sub2srs cards.
If your deck has sub-decks like mine does then you also need to change one deck-options setting to get it working properly.
First Sync
The first time you sync a deck from/to a device, the media files are downloaded/uploaded which can take a long time depending on how many files there are. Note that sub2srs cards usually have two media files (screenshot image and sentence audio), which results in more files being synced than there are cards.
Errors can occur during the sync process, but the progress is usually saved, and you just have to click the sync button again.
Every subsequent sync should be lightning fast. Even if AnkiMorphs makes changes to all your cards, it will usually only take a couple of seconds to sync with a normal internet connection because no media files were changed.
Deck-options
If your deck has sub-decks like mine does, then you need to configure the deck option in Anki to gather new cards from all the sub-decks, otherwise Anki only pulls new cards from one sub-deck at a time until that deck is empty. To fix this do the following:
- Go to the deck options of the deck that has sub-decks
- Go to Display Order
- Set New card gather order to Ascending position
Settings
To display the settings dialog either use Ctrl+Shift+S or go to Tools -> AnkiMorphs -> Settings
The settings are separated into the following sections:
- General: miscellaneous settings
- Note Filter: set which cards you want AnkiMorphs to analyze and sort
- Extra Fields: have AnkiMorphs add extra information to your cards
- Tags: rename the tags AnkiMorphs uses
- Preprocess: adjust the text AnkiMorphs analyzes
- Card Handling: adjust how AnkiMorphs handles cards
- Algorithm: adjust the scoring algorithm
- Toolbar: adjust the look of the toolbar
- Shortcuts: adjust keyboard shortcuts
General
- Evaluate morphs based on their lemma or inflection: This impacts two things:
  - scoring algorithm: use the morph priority associated with the inflection or the lemma.
  - skipping: skip morphs based on their lemma or inflection.
- Morphs are considered known when [...]: This variable is used when text is highlighted, and it determines the L and I numbers.
- Read files in 'known-morphs' folder and register morphs as known: Import known morphs from the known-morphs folder. Read more in Setting Known Morphs.
- Automatically Recalc before Anki sync: Recalc automatically runs before Anki syncs your card collection.
  Note: If you use the FSRS4Anki Helper add-on with an Auto [...] after sync option enabled, then this can cause a bug where sync and recalc occur simultaneously.
Note Filter
AnkiMorphs only analyzes and sorts cards that match at least one note filter; if you don't specify any note filters, then AnkiMorphs won't do anything, so this is a necessary step. This can seem overly complicated and overwhelming, but hopefully things will make sense after reading to the end of this page. This is really the heart of the add-on, and it has some powerful options (notably tags), so having a good understanding of how note filters work might significantly improve how much you benefit from AnkiMorphs.
Each note filter contains:
- Note Type
- Tags (optional)
- Field
- Morphemizer
- Morph Priority
- Read & Modify (optional)
Note Type
To find a card's note type do the following:
- Go to Browse
- Find a card you want AnkiMorphs to analyze and sort
- Right-click the card
- Click Info
- See Note Type
All the cards in my Japanese Sentences deck (and sub-decks) have the same note type, but that might not be the case for your decks.
Another thing you can do is look through the Note Types in the left sidebar until you find the cards you are after.
Tags
You can further filter AnkiMorphs to only work on cards with a certain note type and with/without specific tag(s).
Let's use an example of having a note type: anime_sub2srs. The card break-down of the note type is the following:
- Total cards: 20K
- Cards with the tag demon-slayer: 6K
- Cards with the tag movie: 3K
- Cards with the tag fight-scene: 2K
- Cards that have both demon-slayer and movie tags: 1K
If you want all the 20K cards of note type anime_sub2srs then leave the tags empty (default):
If you want the 6K cards with the demon-slayer tag:
If you want the 1K cards that have both demon-slayer and movie tags, i.e. the intersection:
If you want the 8K cards that have either demon-slayer or movie tags, i.e. the union, then you have to create two note-filters like this:
If you want the 18K cards that don't have the fight-scene tag:
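The tag logic in these examples can be sketched with plain sets. This is an assumption-laden illustration rather than the add-on's code: the include and exclude parameter names are hypothetical, and a tiny five-card collection stands in for the 20K cards. Note that the union count in the example above follows from 6K + 3K - 1K = 8K, since the 1K cards with both tags would otherwise be counted twice.

```python
def matches_tags(card_tags, include=(), exclude=()):
    """A card matches when it has every included tag and no excluded tag."""
    tags = set(card_tags)
    return set(include) <= tags and not (set(exclude) & tags)


def select(cards, note_filters):
    """Return the ids of cards that match at least one of the note filters."""
    return sorted(
        card_id
        for card_id, tags in cards.items()
        if any(matches_tags(tags, **note_filter) for note_filter in note_filters)
    )


# A miniature stand-in for the anime_sub2srs collection.
cards = {
    1: {"demon-slayer"},
    2: {"demon-slayer", "movie"},
    3: {"movie"},
    4: {"fight-scene"},
    5: set(),
}

print(select(cards, [{}]))                                      # all cards
print(select(cards, [{"include": ["demon-slayer"]}]))           # one tag
print(select(cards, [{"include": ["demon-slayer", "movie"]}]))  # intersection
print(select(cards, [{"include": ["demon-slayer"]},
                     {"include": ["movie"]}]))                  # union: two filters
print(select(cards, [{"exclude": ["fight-scene"]}]))            # without a tag
```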
Field
This is the field on the card AnkiMorphs reads and analyzes, which is then used to sort the card.
- Go to Browse
- Find the note type in the left sidebar
- Find the field you care about
In my case the field I'm interested in is Japanese
Note: Fields with complete sentences are preferable over fields that only have isolated words. The more context the morphemizers are given, the less likely they are to produce false positives.
Morphemizer
This is the tool AnkiMorphs uses to split text into morphs. See the installation section for how to add morphemizers.
AnkiMorphs comes bundled with two morphemizers: Simple Space Splitter and SSS + Punctuation. These morphemizers are very basic and do not perform any linguistic analysis, meaning they won't provide accurate lemmas. Therefore, you should only use them if no other morphemizers are available for your particular language.
Simple Space Splitter
As the name suggests, this morphemizer just splits words based on whitespace.
Simple Space Splitter + Punctuation
This morphemizer extends the Simple Space Splitter to preserve words containing hyphens (-) and apostrophes ('), ensuring that words like these are not split apart:
mother-in-law
quelqu'un
Note: This morphemizer may not work correctly for some languages, such as Arabic. If you encounter this issue, try using the Simple Space Splitter instead.
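To make the difference between the two bundled morphemizers concrete, here is a rough sketch of the behavior described above. It is not the add-on's actual code: the function names and the regular expression are assumptions, and the real morphemizers may treat punctuation differently.

```python
import re


def simple_space_splitter(text):
    """Split text on whitespace only; punctuation stays attached to words."""
    return text.split()


def space_splitter_with_punctuation(text):
    """Keep words whole when they contain internal hyphens or apostrophes.

    Assumption: other punctuation is dropped. The pattern matches a run of
    word characters, optionally followed by more runs joined by - or '.
    """
    return re.findall(r"\w+(?:[-'’]\w+)*", text)


sentence = "My mother-in-law saw quelqu'un, yesterday."
print(simple_space_splitter(sentence))
# ['My', 'mother-in-law', 'saw', "quelqu'un,", 'yesterday.']
print(space_splitter_with_punctuation(sentence))
# ['My', 'mother-in-law', 'saw', "quelqu'un", 'yesterday']
```

Neither function produces lemmas, which is why a proper morphemizer is preferable whenever one exists for your language.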
Morph Priority
The calculated score of the card, and as a result, the sorting of the card, depends on the priority you give the morphs. You can either set the priorities to be Collection frequency (how often the morphs occur in your card collection), or you could use a custom priority file that specifies the priorities of the morphs. AnkiMorphs automatically finds .csv files placed in [anki profile folder]/priority-files/.
Note: using Collection frequency is not recommended because it can be volatile; if you make any changes to your cards (delete, suspend, move, etc.), then it can cause a cascade of sorting changes.
Read & Modify
If, for whatever reason, you don't want AnkiMorphs to read one of the note filters you have set up, then you can uncheck the Read option.
If you uncheck Modify, AnkiMorphs will analyze the specified fields of cards (and update the database of known morphs based on them), but won’t reorder or change the cards in any way.
Usage
There are some nuances that are important to be aware of when it comes to note filters:
Order of the note filters
Order matters. In the image above all the cards that have note type ankimorphs_sub2srs will have the text found in the Japanese field analyzed, and then those cards will be sorted based on the score of that text.
After those cards are analyzed and sorted then the next note filter will take effect: All the cards that have the note type Kanji will have the text found in the Front field analyzed and then those cards will be sorted based on the score of that text.
Overlapping filters
If a card matches multiple filters, then it will only be analyzed and sorted based on the first matching filter. Any subsequent filters will not analyze and sort the card.
If you were to do something like this:
Then the second filter would do nothing because all the cards would have already been used by the first filter.
If you find yourself in a situation where you have overlapping note filters, then there are two things you can do:
- Make the filters more restrictive by using tags
- Create a new note type.
Tip: All your cards should follow the minimum information principle, not only will that help you remember them better, but it might make the note filters less complicated.
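The first-match rule for overlapping filters can be sketched as follows. The dictionaries and the function name are hypothetical; the sketch only shows that a card is claimed by the first filter it matches and that later filters never see it.

```python
def first_matching_filter(card, note_filters):
    """Return the index of the first note filter the card matches, or None."""
    for index, note_filter in enumerate(note_filters):
        if card["note_type"] != note_filter["note_type"]:
            continue
        # An empty tag list matches every card of that note type.
        required_tags = set(note_filter.get("tags", ()))
        if not required_tags <= set(card.get("tags", ())):
            continue
        return index
    return None


note_filters = [
    {"note_type": "ankimorphs_sub2srs", "field": "Japanese"},
    # Overlaps with the filter above, so it never receives any cards.
    {"note_type": "ankimorphs_sub2srs", "field": "Japanese", "tags": ["movie"]},
    {"note_type": "Kanji", "field": "Front"},
]

movie_card = {"note_type": "ankimorphs_sub2srs", "tags": ["movie"]}
kanji_card = {"note_type": "Kanji", "tags": []}

print(first_matching_filter(movie_card, note_filters))  # 0, not 1
print(first_matching_filter(kanji_card, note_filters))  # 2
```

Putting the more restrictive tagged filter first would let it claim its cards before the general filter does.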
Extra Fields
The text found in the note filter: field is extracted and analyzed by AnkiMorphs. AnkiMorphs can then place information about that text into dedicated fields on your cards.
Note: The first time you select an extra field, you will need to perform a full sync upload to AnkiWeb. If you have a large number of cards (500K+), syncing might become an issue. For more details, refer to the Anki FAQ.
Important: Extra fields add more data to your collection, so only select the fields that will be useful to you.
The fields contain the following:
- am-all-morphs: A list of the morphs.
- am-all-morphs-count: The number of morphs.
- am-unknown-morphs: A list of the morphs that are still unknown to you.
- am-unknown-morphs-count: The number of morphs that are still unknown to you.
- am-highlighted: An HTML version of the text that highlights the morphs based on learning status.
- am-score: The score AnkiMorphs determined the card to have.
- am-score-terms: The individual score terms.
- am-study-morphs: A list of the morphs that were unknown to you when you first studied the card.
The following fields will only update on new cards:
- am-all-morphs
- am-all-morphs-count
- am-score
- am-score-terms
- am-study-morphs
and these fields will always update, even on reviewed cards:
- am-unknown-morphs
- am-unknown-morphs-count
- am-highlighted
Here is an example card where all the extra-fields have been selected:
The extra fields display morphs in this form:
You can choose to display morphs in their inflected forms:
"walking and talking" -> [walking, and, talking]
or their lemma (base) forms:
"walking and talking" -> [walk, and, talk]
This affects the following three fields:
- am-all-morphs
- am-study-morphs
- am-unknown-morphs
Using am-study-morphs
Adding this field to your card-template can give you a quick way to see which morphs are/were unknown to you on the first encounter. Here is a simplified version of the card template used in the example above:
Using am-*-morphs-count
This is useful if you want to sort your cards in the browser based on how many total/unknown morphs they have.
Using am-highlighted
AnkiMorphs can automatically color-code morphs based on their learning status, i.e., how well you know them.
I recommend only putting the highlighted-field on the back of cards. The reason for this is that, in order to get the best results, you want your SRS experience to simulate real life as much as possible. When reading in real life, you aren’t going to be told which words you know and which you don’t. So, it makes sense to have your sentence cards reflect this.
Here is a simplified example of some of the changes you need to make to your card template to get the results shown above. Notice that the am-highlighted field is substituted for the Japanese field on the back of the card.
You also need to add the following to the Styling section (choose any color you want):
[morph-status=unknown] { color: blue; }
[morph-status=learning] { color: #8bb33d; } /* light-green */
[morph-status=known] { color: green; }
.nightMode [morph-status=unknown] { color: red; }
.nightMode [morph-status=learning] { color: #ffff99; } /* yellow */
.nightMode [morph-status=known] { color: #8bb33d; } /* light-green */
You can pick and choose among these; if you only want unknown morphs to be highlighted, and you don't care about dark-mode, then only adding the first line would be enough.
It’s also possible to use background-color:
[morph-status=unknown] { background-color: #f7867e; } /* red */
[morph-status=learning] { background-color: #ffff99; } /* yellow */
[morph-status=known] { background-color: #49f53e; } /* green */
.nightMode [morph-status=unknown] { background-color: #b74d4d; } /* red */
.nightMode [morph-status=learning] { background-color: #ccad50; } /* yellow */
.nightMode [morph-status=known] { background-color: #27961f; } /* green */
Furigana and other ruby characters
The am-highlighted field supports ruby characters such as furigana. To have furigana displayed properly, you have to prepend furigana: to the field in the card template, e.g.:
{{furigana:am-highlighted}}
You also have to have the Ignore content in square brackets preprocess setting activated.
Note: This does not always work flawlessly. The known problems section has more details on how to fix ruby character highlighting problems.
Duplicate Audio Problem
When the back of a card also has an audio field and not just the front, then both might play after each other when you press Show Answer on the card. To prevent both playing you can do the following:
- Go to deck-options
- Scroll down to the Audio section
- Activate Skip question when replaying answer
Tags
As AnkiMorphs processes cards, it automatically adds and removes various tags. You can customize the names of the different tags if you want, or you can leave them as they are and move on.
Note: Avoid reusing tags from other sources. Mixing different tags can quickly become complicated and confusing.
- One unknown morph: Cards that only have one unknown morph will be given this tag.
- Multiple unknown morphs: Cards that have more than one unknown morph will be given this tag.
- Fresh morphs: Cards that have one or more morphs in a learning state will be given this tag.
- Learn card now: When you use the Learn Card Now feature on a card, it will be given this tag. The purpose of this tag is to make the internal process of the Learn Card Now feature simpler. Do not manually assign this tag to cards, as it will have no effect.
- Set known and skip: When you use the Set Known and Skip feature on a card, it will be given this tag. Do not delete cards that have this tag, as AnkiMorphs relies on them to track which morphs you know.
- All morphs known: New cards that only have morphs you already know will be given this tag. Cards with this tag can safely be deleted without AnkiMorphs losing track of which morphs you know. This can be useful if you want to trim down your card collection.
Preprocess
Here are some options that can preprocess the text on your cards, potentially removing uninteresting morphs for you.
- Ignore content in square brackets []: Ignore content such as furigana readings and pitch
- Ignore content in round brackets (): Ignore content such as character names and readings in scripts
- Ignore content in slim round brackets ( ): Ignore content such as character names and readings in Japanese scripts
- Ignore content in suspended cards: Ignore text found on suspended cards except for suspended cards that have the Set known and skip tag. This exception makes it so that you can safely suspend cards with known morphs without AnkiMorphs losing track of which morphs you know.
  Note: if you use collection frequency in any note filters, then you should not use this option because it will affect the morph priorities.
- Ignore names found by the morphemizer: Some morphemizers are able to recognize some names.
  Note: This can have mixed results; some morphemizers produce a non-trivial amount of false-positives, the German spaCy models in particular. If you find that there are missing morphs, then this is likely the cause. In that case you are probably better off only using the names.txt feature.
- Ignore names found in names.txt: Ignore names that are placed in names.txt
- Ignore custom characters: Any characters you specify (e.g., ,.?@) will be ignored. This is especially useful when working with basic morphemizers like the Simple Space Splitter.
Card Handling
Encountering cards during study sessions:
This is where you can make AnkiMorphs really efficient. AnkiMorphs sorts your cards based on how well you know their content; the more you know, the sooner the card will be shown. The downside of this is that it might take a long time before you see a card with any unknown morphs, i.e., you don't learn anything new.
To overcome this problem and speed up the learning process, we can use the options found here.
-
Skip cards with only known morphs:
If AnkiMorphs has determined that you know all the morphs on the card, then it will be buried and skipped. -
Skip cards that have unknown morphs already seen today:
If you have already studied a card earlier today with the same unknown morph, then any subsequent cards with that unknown morph will be buried and skipped, which reduces the need to Recalc. -
Show "skipped x cards" notification:
After cards are skipped, a notification in the lower left corner displays how many cards were skipped and for what reason. If you don't want to see this notification, you can uncheck this option.
On recalc:
-
Suspend new cards with only known morphs:
Cards that have either the 'All morphs known' tag or the 'Set known and skip' tag will be suspended on Recalc. -
Shift new cards that are not the first to have the unknown morph:
This option is an alternative to the skip options that are only available on desktop, potentially making it easier to study new cards on mobile.
There are two parameters you can adjust:
- How much to shift/offset the due of the affected cards
- How many unknown morphs to perform this shift/offset on
Here is an example card order without this option activated:
Card ID   Unknown Morph   Due
Card_1    break           50 001
Card_2    break           50 002
Card_3    walk            50 003
Card_4    walk            50 004
Here are the same cards but with this option activated (due_shift = 50 000, first_morphs = 2):
Card ID   Unknown Morph   Due
Card_1    break           50 001
Card_3    walk            50 003
Card_2    break           100 002
Card_4    walk            100 004
-
Move new cards without unknown morphs to the end of the due queue:
New cards that do not contain any unknown morphs will be given a due value of 2047483647, which is the max score given by AnkiMorphs.
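The "Shift new cards that are not the first to have the unknown morph" example above can be reproduced with a short Python sketch. This is only an illustration, not the add-on's actual code: the function name is made up, cards are simplified to one unknown morph each, and first_morphs is assumed to mean "the first N distinct unknown morphs in due order".

```python
def shift_repeated_unknowns(cards, due_shift, first_morphs):
    """cards: list of (card_id, unknown_morph, due) tuples (hypothetical layout)."""
    seen = []      # distinct unknown morphs, in the order they are first encountered
    shifted = []
    for card_id, morph, due in sorted(cards, key=lambda c: c[2]):
        if morph not in seen:
            # First card with this unknown morph: keep its position.
            seen.append(morph)
            shifted.append((card_id, morph, due))
        elif seen.index(morph) < first_morphs:
            # Later card for one of the first N morphs: push it back.
            shifted.append((card_id, morph, due + due_shift))
        else:
            shifted.append((card_id, morph, due))
    return sorted(shifted, key=lambda c: c[2])


cards = [
    ("Card_1", "break", 50_001),
    ("Card_2", "break", 50_002),
    ("Card_3", "walk", 50_003),
    ("Card_4", "walk", 50_004),
]
for card in shift_repeated_unknowns(cards, due_shift=50_000, first_morphs=2):
    print(card)
```

With the values from the example, Card_1 and Card_3 keep their due values while Card_2 and Card_4 move to 100 002 and 100 004.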
Algorithm
For these settings to make sense you have to read the scoring algorithm section first.
Weights:
Morph Targets:
These are explained in the scoring algorithm: deviation section.
Play around with the variables here: https://www.geogebra.org/graphing/ta3eqb8y
This is what the default looks like:
And the default looks like this:
Toolbar
Here you can adjust the look of the toolbar, and which stats it shows.
'L' and 'I' displays:
-
Seen morphs:
Shows all morphs that have been reviewed at least once. This can be more motivating than only seeing known morphs since it goes up every time you study new cards, but it can also give you a false sense of confidence. -
Known morphs:
Only shows known morphs, as determined by the Morphs are considered known when [...] option in the general settings.
Hide:
-
Recalc:
Recalc will not be displayed in the toolbar -
L:
Known lemmas will not be displayed in the toolbar -
I:
Known inflections will not be displayed in the toolbar
Shortcuts
Change the AnkiMorphs shortcuts if you prefer different ones. Be aware that changing these might lead to unexpected behavior.
Prioritizing
The more frequently a morph occurs in a language, the more useful it is to learn. This is the fundamental principle behind AnkiMorphs--learn a language in the order that will be the most useful.
AnkiMorphs is a general purpose language learning tool, so it has to be told which morphs occur most often. You can do this in two ways: either have AnkiMorphs calculate the morph frequencies found in your cards (Collection frequency), or specify a custom .csv file that contains that information.
Any .csv file located in the folder [anki profile folder]/priority-files/ is available for selection in note filters: morph priority.
But before we outline the custom priority files, we have to discuss morph lemmas and inflections.
Lemmas or Inflections?
There are scenarios where you might not want to give each individual inflection a separate priority:
-
Chinese technically does not have inflections, so any inflection data is artificial and leads to a wasteful use of resources.
-
Korean has an extreme number of inflections, leading to an explosion of priorities, which creates disproportionate penalties.
-
You might feel that you have a good enough grasp of the grammar of inflections, making it unnecessary to prioritize one over another.
If you never want to give separate priorities to inflections then you should choose a lemma only priority file. If you do care about inflection priorities, or if you might want to switch to lemma priorities on the fly, then choose an inflection priority file.
Custom Priority Files
You can use the Priority File Generator or the Study Plan Generator to create your own custom priority file, or you can download some pre-made ones at the bottom of this page.
Note:
- The Occurrences column is optional
- Any lines after 1 million will be ignored by AnkiMorphs
Custom Lemma Priority Files
The lemma only priority files follow this format:
- The first row contains column headers.
- The second row and down contain morph lemmas in descending order of frequency.
Custom Inflection Priority Files
The inflection priority files follow this format:
- The first row contains column headers.
- The second row and down contain morphs in descending order of inflection frequency.
- The first column contains morph-lemmas, the second column contains morph-inflections (this is done to prevent morph collisions).
- The third column contains lemma priorities, the fourth column contains inflection priorities (this is done so you can switch morph evaluation on the fly).
Morph Collision
Inflected morphs can be identical even if they are derived from different lemmas (base), e.g.:
Lemma : Inflection
有る ある
或る ある
To prevent misinterpretation of the inflected morphs, we also store the lemmas.
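As a rough illustration of why storing the lemma avoids collisions, here is a minimal Python sketch that reads an inflection priority file into a dictionary keyed by (lemma, inflection). The column order follows the format described above; the header names for the two priority columns, the priority values, and the function name are assumptions made for this example, as is the choice to count the 1 million line limit from the first data row.

```python
import csv
import io


def load_inflection_priorities(csv_text, max_rows=1_000_000):
    """Map (lemma, inflection) -> (lemma_priority, inflection_priority)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # the first row contains column headers
    priorities = {}
    for row_number, row in enumerate(reader, start=1):
        if row_number > max_rows:
            break  # later lines are ignored
        lemma, inflection, lemma_priority, inflection_priority = row[:4]
        # Keying on the pair keeps identical inflections of different lemmas apart.
        priorities[(lemma, inflection)] = (int(lemma_priority), int(inflection_priority))
    return priorities


# Hypothetical file content; only the column order is taken from this guide.
sample = """Morph-Lemma,Morph-Inflection,Lemma-Priority,Inflection-Priority
有る,ある,1,1
或る,ある,2,2
"""
print(load_inflection_priorities(sample))
```

Had the dictionary been keyed on the inflection alone, the second ある row would have silently overwritten the first.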
Downloadable Priority Files
Unless otherwise stated, these are inflection priority files, generated using a 90% comprehension cutoff.
Cantonese
Note: This is a lemma only priority file that was not generated using AnkiMorphs, so it might not work very well (or at all).
- zhh-freq.csv
- Source: existingwordcount.csv found on words.hk - analysis
Catalan
- ca-news-priority.csv
- Source: cat_news_2022_300K-sentences.txt found on wortschatz - catalan corpora
- Morphemizer: spaCy: ca-core-news-sm
Chinese
Note: this is a lemma only priority file.
- zh-news-lemma-priority.csv
- Source: zho_news_2020_300K-sentences.txt found on wortschatz - chinese corpora
- Morphemizer: AnkiMorphs: Chinese
Croatian
- hr-news-priority.csv
- Source: hrv_news_2020_300K-sentences.txt found on wortschatz - croatian corpora
- Morphemizer: spaCy: hr-core-news-sm
Danish
- da-news-priority.csv
- Source: dan_news_2022_300K-sentences.txt found on wortschatz - danish corpora
- Morphemizer: spaCy: da-core-news-sm
Dutch
- nl-news-priority.csv
- Source: nld_news_2022_300K-sentences.txt found on wortschatz - dutch corpora
- Morphemizer: spaCy: nl-core-news-sm
English
- en-wiki-priority.csv
- Source: eng_wikipedia_2016_300K-sentences.txt found on wortschatz - english corpora
- Morphemizer: spaCy: en-core-web-sm
Finnish
- fi-news-priority.csv
- Source: fin_news_2022_300K-sentences.txt found on wortschatz - finnish corpora
- Morphemizer: spaCy: fi-core-news-sm
French
- fr-news-priority.csv
- Source: fra_news_2022_300K-sentences.txt found on wortschatz - french corpora
- Morphemizer: spaCy: fr-core-news-sm
German
- de-news-priority.csv
- Source: deu_news_2022_300K-sentences.txt found on wortschatz - german corpora
- Morphemizer: spaCy: de-core-news-md
Greek (Modern)
- el-news-priority.csv
- Source: ell_news_2022_300K-sentences.txt found on wortschatz - modern greek corpora
- Morphemizer: spaCy: el-core-news-sm
Italian
- it-news-priority.csv
- Source: ita_news_2022_300K-sentences.txt found on wortschatz - italian corpora
- Morphemizer: spaCy: it-core-news-sm
Japanese
- ja-news-priority.csv
- Source: jpn_news_2011_300K-sentences.txt found on wortschatz - japanese corpora
- Morphemizer: AnkiMorphs: Japanese
- ja-anime-priority.csv
- Source: NanakoRaws
- Morphemizer: AnkiMorphs: Japanese
Korean
Note: this is a lemma only priority file.
- ko-news-lemma-priority.csv
- Source: kor_news_2022_300K-sentences.txt found on wortschatz - korean corpora
- Morphemizer: spaCy: ko-core-news-sm
Lithuanian
- lt-news-priority.csv
- Source: lit_news_2020_300K-sentences.txt found on wortschatz - lithuanian corpora
- Morphemizer: spaCy: lt-core-news-sm
Macedonian
- mk-news-priority.csv
- Source: mkd_newscrawl_2011_300K-sentences.txt found on wortschatz - macedonian corpora
- Morphemizer: spaCy: mk-core-news-sm
Norwegian (Bokmål)
- nb-news-priority.csv
- Source: nob_news_2013_300K-sentences.txt found on wortschatz - norwegian corpora
- Morphemizer: spaCy: nb-core-news-sm
Polish
- pl-news-priority.csv
- Source: pol_news_2022_300K-sentences.txt found on wortschatz - polish corpora
- Morphemizer: spaCy: pl-core-news-sm
Portuguese
- pt-news-priority.csv
- Source: por_news_2022_300K-sentences.txt found on wortschatz - portuguese corpora
- Morphemizer: spaCy: pt-core-news-sm
Romanian
- ro-news-priority.csv
- Source: ron_news_2022_300K-sentences.txt found on wortschatz - romanian corpora
- Morphemizer: spaCy: ro-core-news-sm
Russian
- ru-web-priority.csv
- Source: rus-ru_web-public_2019_300K-sentences.txt found on wortschatz - russian corpora
- Morphemizer: spaCy: ru-core-news-sm
Slovenian
- sl-news-priority.csv
- Source: slv_news_2020_300K-sentences.txt found on wortschatz - slovenian corpora
- Morphemizer: spaCy: sl-core-news-sm
Spanish
- es-news-priority.csv
- Source: spa_news_2022_300K-sentences.txt found on wortschatz - spanish corpora
- Morphemizer: spaCy: es-core-news-sm
Swedish
- sv-news-priority.csv
- Source: swe_news_2022_300K-sentences.txt found on wortschatz - swedish corpora
- Morphemizer: spaCy: sv-core-news-sm
Ukrainian
- uk-news-priority.csv
- Source: ukr_news_2022_300K-sentences.txt found on wortschatz - ukrainian corpora
- Morphemizer: spaCy: uk-core-news-sm
Names
Note:
- Memory Usage: AnkiMorphs loads the entire list of names into memory and compares against it each time you review a card. To avoid slowdowns, keep the list of names as small as possible.
- Loading Changes: If you manually edit the names.txt file, you must restart Anki for the changes to take effect. However, if you use the Mark as name feature, no restart is required.
You can have AnkiMorphs automatically filter out specified names found on your cards. This feature is designed so users won't have to learn the names of places or individuals, as these words lack inherent meaning that can be acquired.
You can activate the feature by selecting Ignore names found in names.txt in the preprocess settings.
The names.txt file is located in your Anki profile folder.
You can either update this file manually, or add names to the list during a review by selecting a word, right-clicking it, and choosing Mark as name from the dropdown menu.
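To make the idea concrete, here is a minimal Python sketch of name filtering. It is not the add-on's implementation: the function names are invented, and it assumes names.txt holds one name per line and that matching is case-insensitive.

```python
def load_names(names_text):
    """Build a lookup set from the contents of a names file (assumed one name per line)."""
    return {line.strip().lower() for line in names_text.splitlines() if line.strip()}


def remove_names(morphs, names):
    """Drop every morph that matches a listed name."""
    return [morph for morph in morphs if morph.lower() not in names]


names = load_names("Alexander\nTokyo\n")
print(remove_names(["Alexander", "walked", "home"], names))
```

A set keeps each lookup cheap, but the whole list still lives in memory, which is why a short names list is preferable.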
Setting Known Morphs
AnkiMorphs determines which morphs you know by analyzing the cards you specify. However, if you delete any of those cards, that information can be lost. To address this issue, you can store known morphs in .csv files in the [anki profile]/known-morphs folder.
Any .csv file that has the priority file format (like those produced by the Known Morphs Exporter), and is placed within this folder, can be read during Recalc and saved to the database.
You can activate this feature by selecting Read files in 'known-morphs' folder and register morphs as known in the general settings tab.
Usage
After you have finished installing and setting up, you can run Recalc and finally start using AnkiMorphs with your cards! Delve into how to use AnkiMorphs with the following sections:
- Reviewing cards with AnkiMorphs.
- Using the Browser Options.
- Generating priority files to change morph priorities.
- Generating readability reports to find out how much of the specified files you will be able to read.
- Exporting known morphs so you can trim down your card collection.
- Gauging your overall progress in terms of morph priorities.
Recalc
Recalc is short for “recalculate”, and is basically the command that tells AnkiMorphs to work all its magic. When you run Recalc, AnkiMorphs will go through the cards that match any 'Note Filter' and do the following:
- Update the ankimorphs.db with any new seen morphs, known morphs, etc.
- Calculate the score of the cards, and then sort the cards based on that score.
- Update any cards' extra fields and tags.
Basically, when you run Recalc, AnkiMorphs will go through your collection, recalculate the difficulty of your cards based on your new knowledge, and reorder your new cards in a way that’s optimal for the new you: the you who knows more than you did yesterday.
You can run Recalc as often as you like, but you should run it at least once before or after every study session so that your new cards will appear in the optimal order.
It's easy to forget to run Recalc, so you can also check the Recalc on sync settings option, which will take care of Recalc for you by running it automatically before Anki syncs your collection.
Note: Recalc can potentially reorganize all your cards, which can cause long sync times. The Anki FAQ has some tricks you can try if this poses a significant problem.
Scoring Algorithm
TL;DR: Low scores are good, high scores are bad.
The order in which new cards are displayed depends on their due value: a card with due = 1 will be shown before a card with due = 2, and so on. Leveraging this property, we can implement the following strategy: assign higher due values to cards with more complex text, pushing them further back in the card queue. Here are some examples of what that might look like:
- "She walked home": due = 600
- "Asymmetric catalysis for the enantioselective synthesis of chiral molecules": due = 100 000 000
Now, let's define some properties that we want our cards to have:
- Few unknown morphs (comprehensibility)
- High priority morphs (significance)
- Ideal length (low deviation)
We can now invert these properties to calculate a "penalty" score, which will then replace the due values of the cards.
That formula at the highest level is:
score = incomprehensibility + insignificance + deviation
Let’s break it down into smaller components.
incomprehensibility
In practice, the comprehensibility of a given text is determined by a combination of known grammar points and vocabulary. However, evaluating grammar is non-trivial, especially in a general language learning context, so we will not make any explicit attempts to do so.
Determining which morphs are known is relatively easy, so our incomprehensibility score will be the product of the number of unknown morphs and a constant penalty factor.
where
insignificance
Each morph has a priority value, which AnkiMorphs aggregates into the following metrics:
where
Note: is not included since it would not have any meaningful impact on 1T cards.
You can customize the algorithm by selecting any combination of these metrics and adjusting their influence on the result by changing their corresponding weights. This is done using two column vectors: one for the weights and one for the aggregated metrics. The final score is computed by taking the scalar product of these vectors:
which gives us:
Example:
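The weighting step can be sketched in a few lines of Python. The metric names and numbers below are hypothetical placeholders chosen for illustration; only the idea of taking the scalar product of a weight vector and a metric vector comes from the description above.

```python
def weighted_sum(weights, metrics):
    """Scalar (dot) product of a weight vector and a metric vector."""
    if len(weights) != len(metrics):
        raise ValueError("weights and metrics must have the same length")
    return sum(w * m for w, m in zip(weights, metrics))


# Hypothetical metrics for one card, in a fixed order:
# [total priority of unknown morphs, total priority of all morphs, average priority of all morphs]
metrics = [120, 900, 300]
# A weight of 0 disables a metric; a weight above or below 1 scales its influence.
weights = [1, 0, 0.5]

print(weighted_sum(weights, metrics))  # 1*120 + 0*900 + 0.5*300 = 270.0
```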
Deviation
Learning can be easier with more surrounding context, e.g., other known words. However, if a sentence contains too many words, learning may become more challenging. This is because the complexity of the grammar often increases, along with the likelihood of not perfectly remembering all the surrounding words. Ideally, we want our cards to have sentences within this optimal range.
Having the ability to bias our sentences towards a certain length is also beneficial; you might find it easier to learn from shorter sentences compared to longer ones, or vice versa.
To achieve this, we use a piecewise equation where we define the following:
- How much to penalize excessive morphs
- How much to penalize insufficient morphs
- The ideal range (target) of morphs
Here is an example of what that might look like:
Playground: https://www.geogebra.org/graphing/ta3eqb8y
This graph shows:
- Penalty for excessive morphs: squared in relation to the target difference
- Penalty for insufficient morphs: linear in relation to the target difference
- Ideal range (target): 4-6 morphs
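The shape described for this graph can be written as a small piecewise function. This is a sketch under assumptions: the function name, parameter names, and the coefficient defaults of 1.0 are made up here, and only the squared/linear behavior and the 4-6 target come from the description of the graph.

```python
def deviation_penalty(morph_count, target_min=4, target_max=6,
                      excess_coeff=1.0, insufficient_coeff=1.0):
    """Penalty for a card whose morph count falls outside the ideal range."""
    if morph_count > target_max:
        # Too many morphs: penalty grows with the square of the difference.
        return excess_coeff * (morph_count - target_max) ** 2
    if morph_count < target_min:
        # Too few morphs: penalty grows linearly with the difference.
        return insufficient_coeff * (target_min - morph_count)
    return 0.0  # inside the ideal range


for count in (2, 5, 9):
    print(count, deviation_penalty(count))
```

Here a 2-morph sentence is penalized by 2 (linear), a 5-morph sentence by 0, and a 9-morph sentence by 9 (squared), so long sentences are pushed back faster than short ones.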
AnkiMorphs provides the following metrics, whose variables you can adjust, and you can disable or amplify them by changing their weights:
where
with the following vectors:
which gives us:
Constraints
We have now refined the formula to:
However, there are a few practical concerns we have to address.
First, we need to ensure that cards are primarily ordered by the number of unknown morphs they contain. This means that all 1T cards should appear before any MT cards, regardless of their insignificance or deviation.
To do this, we apply a min function to ensure that the sum of the last two terms does not exceed
where
Lastly, we have to make sure that the score does not exceed the maximum card due value allowed by Anki. The due value is stored as a signed 32-bit integer, with a maximum value of 2 147 483 647 (2^31 - 1). To prevent overflow when cards are shifted, we include a safety margin of 100 000 000. This results in the upper bound:
2 147 483 647 - 100 000 000 = 2 047 483 647
Now we wrap the entire expression in another min function to get our final formula:
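The two constraints can be illustrated with a minimal Python sketch. Several values here are assumptions rather than facts from this guide: the per-unknown-morph penalty of 1 000 000 and the cap of "penalty minus one" on the last two terms are hypothetical, and the safety margin of 100 000 000 is inferred from the max score of 2047483647 mentioned in the card handling settings.

```python
MAX_DUE = 2**31 - 1                           # largest signed 32-bit integer: 2 147 483 647
SAFETY_MARGIN = 100_000_000                   # inferred: MAX_DUE - 2 047 483 647
SCORE_UPPER_BOUND = MAX_DUE - SAFETY_MARGIN   # 2 047 483 647

UNKNOWN_MORPH_PENALTY = 1_000_000             # hypothetical constant penalty factor


def card_score(unknown_count, insignificance, deviation):
    """Combine the three terms while keeping the unknown-morph count dominant."""
    incomprehensibility = unknown_count * UNKNOWN_MORPH_PENALTY
    # Inner min: the last two terms can never add up to a full unknown-morph penalty,
    # so every 1T card scores lower than every MT card.
    other_terms = min(insignificance + deviation, UNKNOWN_MORPH_PENALTY - 1)
    # Outer min: never exceed the largest due value that is safe to store.
    return min(incomprehensibility + other_terms, SCORE_UPPER_BOUND)


print(card_score(1, 10**9, 0))   # 1T card with very rare morph
print(card_score(2, 0, 0))       # 2T card with ideal priority and length
```

Even with an enormous insignificance value, the 1T card stays below the 2T card.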
Reviewing Cards
Starting Out
When you first start using AnkiMorphs, you will probably come across many variations of interjections (e.g. Aaah!, umm..., Wow!) and other uninteresting words. Just tag them as known and move on. Once you reach a critical mass of known morphs, usually around 50–100, you will start encountering useful sentences.
Stuttered names or words might accidentally produce morphs that don't make any sense in the context, and you should probably suspend these cards or mark them as known if there are many of them.
AnkiMorphs might seem error-prone at first, like mixing up two (seemingly) different morphs, but the more data it accumulates, the more accurate it becomes, so try not to get discouraged! It becomes much more enjoyable to use after you know 100+ morphs.
It is a good idea to frequently Recalc when you are first starting out, maybe every 10 cards or so, to make sure you get the best possible new cards.
Encountering Morphs You Already Know
If you already know the morphs in a card you are presented, then use the hotkey K (for Known) to add the am-known-manually tag to the card and skip it. The morphs on this card will be considered known the next time you Recalc.
Encountering Cards You Don't Understand
There will also be times when AnkiMorphs says a card is 1T, but you aren’t able to understand it. There are two reasons this may occur. The first is that, due to incorrect parsing, AnkiMorphs thinks you know a word that you don’t. Unfortunately, there is no easy way to remove morphs from the AnkiMorphs database. Luckily, this shouldn’t happen very often. When it does, your only real option is to suspend or delete the card.
The other scenario is that you aren’t able to understand a sentence deemed 1T despite it indeed containing only one unknown morph. This is simply a fact of life when it comes to language learning. Sometimes you know all the words in a sentence, but still just can’t get what it means. It could be due to many things, such as one of the words having an alternate meaning you haven’t learned yet, or the grammar being too tricky for you to parse at your current level. Basically, although the sentence appears to be 1T, it’s actually MT. By definition, any sentence that’s truly 1T shouldn’t be difficult to understand.
Whenever this happens, it's best to either find a better card or suspend/delete the card and move on. The whole point of AnkiMorphs is to help you make fast progress by collecting low-hanging fruit. If you spend time mulling over things that are above your level, you’re defeating the purpose of the add-on.
Finding A Better Card
If you want to learn a different card instead of the one you are presented, then press the hotkey L to open the browser and see all the other 1T cards in your collection with the same unknown morph. If you want to see all 1T and MT cards you can use Shift+L.
From here you can right-click your preferred card and select Learn Card Now. You can also find the same options in the AnkiMorphs menu at the top of the browse window.
The card will then go to the top of the new cards queue. If you have other due cards, then they might show up first.
Encountering Suitable 1T Cards
If you come across a new card with only one unknown and it seems reasonable, treat it like any other new Anki card and answer it accordingly. For more information on handling new cards, refer to the Anki studying guide.
Skipping Cards
There are three scenarios where AnkiMorphs will automatically skip a card:
- You have selected the Skip cards with only known morphs option in the card-handling settings:
If the next card has one of the 'known' tags, then it will be skipped.
- You have selected the Skip cards that have unknown morphs already seen today option in the card-handling settings:
Say you have three cards, card1, card2, and card3, all of which have the same unknown morph. After you have answered card1, the cards card2 and card3 will be skipped.
- You have selected the Ignore names found in names.txt option in the preprocess settings:
Let's use the same example of three cards: card1, card2, and card3. This time they all have the same unknown morph, Alexander. If you use the Mark as name feature to mark Alexander as a name on card1, then the cards card2 and card3 will be skipped.
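The three scenarios can be summarized as a small decision function. This is a simplified sketch, not the add-on's logic: the function, option keys, and reason strings are invented, the first scenario is modeled as "no unknown morphs" rather than by checking tags, and a card is assumed to be skipped only when all of its unknown morphs match the condition.

```python
def skip_reason(unknown_morphs, seen_today, names, options):
    """Return why a new card would be skipped, or None if it should be shown."""
    if options.get("skip_only_known") and not unknown_morphs:
        return "only known morphs"
    if (options.get("ignore_names") and unknown_morphs
            and all(m in names for m in unknown_morphs)):
        return "unknown morphs are names"
    if (options.get("skip_seen_today") and unknown_morphs
            and all(m in seen_today for m in unknown_morphs)):
        return "unknown morphs already seen today"
    return None


options = {"skip_only_known": True, "skip_seen_today": True, "ignore_names": True}
print(skip_reason(["Alexander"], seen_today=set(), names={"Alexander"}, options=options))
print(skip_reason(["walk"], seen_today={"walk"}, names=set(), options=options))
print(skip_reason(["walk"], seen_today=set(), names=set(), options=options))
```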
Pre-skipping Cards
The skipping features mentioned in the section above only take effect when using Anki on desktop where the AnkiMorphs addon is activated. This can make it tricky to study new cards on mobile since there might be many cards right after each other that have the same unknown morph.
To get some of the same effects on mobile, we can instead "pre-skip" cards by selectively moving some of them farther back in the queue when we Recalc.
For more info read:
Card Handling: Shift new cards that are not the first to have the unknown morph
Right-Clicking Highlighted Text
AnkiMorphs adds some additional options to the Anki context menu (right-click):
-
Mark as Name:
The highlighted text will be added to the names.txt file, and the card will be skipped. -
Browse in am-study-morphs:
This opens up the Anki Browse window with the search term: "am-study-morphs:{highlighted_text}"
This can be useful for finding cards you previously studied that contained the highlighted text as an unknown morph.
For example, you might have forgotten the nuances of the word repulse but recall having studied it before. You can then highlight repulse, select this option, and the browse window will open with the search term: "am-study-morphs:repulse"
Browser
AnkiMorphs adds new options in the Browse window that can be accessed either from the AnkiMorphs menu at the top or when right-clicking cards:
-
View Morphemes:
Opens a pop-up window showing the card's morphs -
Learn Card Now:
Raises selected cards to the top of the new cards queue. Note: If you use Learn Card Now on a card that is not in the deck you are currently studying, then it won't show up. -
Browse Same Morphs:
Searches for all the cards that have the same morphs (inflection) as the selected card. -
Browse Same Unknown Morphs:
Searches for all the cards that have the same unknown morphs (inflection) as the selected card. -
Browse Same Unknown Morphs (Lemma):
Searches for all the cards that have the same unknown morphs (lemma) as the selected card. -
Tag As Known:
Adds the Set known and skip tag to the selected cards.
Generators
AnkiMorphs provides the following three generators:
-
Readability Report Generator
A report on how well you know the text in the specified files -
Priority File Generator
A file that lists all the found morphs sorted by their frequency -
Study Plan Generator
A combination of priority files in the order you specify
To use the generators you have to follow these three steps:
Loading Files
File Formats
These are the files that the generators are (mostly) able to read. Any files that don't have these extensions will be ignored.
Note:
- Files must be encoded in UTF-8. Using other encodings may lead to parsing errors or crashes.
- EPUB files may be parsed slightly differently across operating systems due to system-specific quirks.
Selecting Root Folder
Any files that match your selected file formats and are in this folder or its sub-folders will be used by the generators.
Take, for example, the following folders and their files:
english_texts/
├── books/
│   └── The Wise Man's Fear/
│       ├── The Wise Man's Fear.pdf
│       └── The Wise Man's Fear.txt
└── subs/
    ├── Game-of-Thrones/
    │   └── season-1/
    │       └── episode_1.srt
    └── Lord_of_the_Rings/
        └── The_Fellowship_of_the_Ring.vtt
If you were to select the books folder, and you checked the .txt file format, then the generator would only use the file The Wise Man's Fear.txt.
If you were to select the folder english_texts and you checked all the file format options, then the generator would use the files:
- The Wise Man's Fear.txt
- episode_1.srt
- The_Fellowship_of_the_Ring.vtt
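The folder selection behaves like a recursive search filtered by extension. The Python sketch below mimics that with the standard library; the function name is made up, and the extensions used are only the ones that appear in this example, not a complete list of supported formats.

```python
from pathlib import Path


def find_input_files(root, extensions):
    """Return every file under root (including sub-folders) with a selected extension."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Selecting the 'books' folder with only .txt checked:
#   find_input_files("english_texts/books", {".txt"})
# Selecting the 'english_texts' folder with .txt, .srt and .vtt checked:
#   find_input_files("english_texts", {".txt", ".srt", ".vtt"})
```

Files with other extensions, such as the .pdf in the example, never pass the filter.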
After Loading
The files that will be used by the generators will be shown in the File column in the tables below, and the generator buttons are now enabled. Next, you need to specify how the generators should process the files.
Processing Files
Morphemizer
This is the tool AnkiMorphs uses to split text into morphs.
Preprocess
These options are equivalent to those found in Preprocess settings.
Generator Output
When clicking the Generate Priority File or Generate Study Plan buttons you will be presented with these options:
The output file is automatically set to be in the [anki profile folder]/priority-files/ folder. Any priority files or study plans that are placed in this folder can be selected in the note filter: morph priority settings.
You can name the file whatever you want as long as it has a .csv extension, e.g. ja-freq.csv.
File Format
- Lemma and inflection: inflection priority file
- Only lemma: lemma priority file
Minimum Occurrence
Limit the morphs to only those that occur at least x many times.
Comprehension Target
Limit the morphs to only those that occur below the specified comprehension percent. Let's take these morphs as an example:
If your target is 90%, then we get:
The morphs in the fifth and sixth rows would therefore not be included since they have an occurrence sum greater than 360.
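The cutoff can be worked through in Python. The morphs and occurrence counts below are hypothetical values chosen so that the total is 400 and the 90% threshold is 360, matching the numbers mentioned above; the function name is also made up.

```python
def apply_comprehension_target(morph_counts, target_percent):
    """Keep the most frequent morphs whose running occurrence sum stays within the target."""
    ranked = sorted(morph_counts, key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ranked)
    cutoff = total * target_percent / 100
    kept, running_sum = [], 0
    for morph, count in ranked:
        running_sum += count
        if running_sum > cutoff:
            break  # this morph and all rarer ones are left out
        kept.append(morph)
    return kept


# Hypothetical data: running sums are 150, 250, 320, 360, 390, 400.
morph_counts = [("the", 150), ("be", 100), ("to", 70), ("of", 40), ("and", 30), ("a", 10)]
print(apply_comprehension_target(morph_counts, 90))
```

The first four morphs reach a running sum of exactly 360 and are kept, while the fifth and sixth push the sum past 360 and are dropped.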
Readability Report Generator
The Readability Report Generator can give you insights into how much of the text in a file you are able to read. It produces two different outputs, one with pure numerical values, and one with percentages.
You can click on the column headers to sort the rows based on those values.
Priority File Generator
The Priority File Generator creates a priority file that is described in the prioritizing section.
Study Plan Generator
Using a study plan is convenient if you want to learn morphs from source materials in a specific sequence, e.g., TV show episodes, book series, etc.
A study plan differs from a regular priority file in the following ways:
- It is first sorted by input files, then morph frequency.
- It has extra columns:
- Learning status
- File name
The study plan generator basically does this:
- Creates a priority file for each input file
- Combines those priority files
- Removes duplicate morphs
The resulting file can be used in the note filter: morph priority settings like any other priority file.
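The three steps of the study plan generator can be sketched as follows. This is an illustrative simplification, not the generator's code: morphs are plain strings rather than lemma/inflection pairs, the learning status column is omitted, and the function name is invented.

```python
from collections import Counter


def build_study_plan(files_in_order):
    """files_in_order: list of (file_name, list_of_morphs) in the desired study order."""
    plan = []
    already_listed = set()
    for file_name, morphs in files_in_order:
        # Step 1: a per-file priority list, most frequent morphs first.
        for morph, occurrences in Counter(morphs).most_common():
            # Step 3: a morph is only listed under the first file it appears in.
            if morph in already_listed:
                continue
            already_listed.add(morph)
            # Step 2: per-file lists are appended in file order.
            plan.append((morph, occurrences, file_name))
    return plan


files = [
    ("episode_1.srt", ["a", "b", "a", "c"]),
    ("episode_2.srt", ["b", "d", "d", "a"]),
]
for row in build_study_plan(files):
    print(row)
```

Because the plan is sorted by file first, a rare morph from the first file still comes before a frequent morph that only appears in the second.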
Note: only the data from the Morph-Lemma and Morph-Inflection columns is read by AnkiMorphs, so you can delete or modify the other columns if you want.
Changing The File Order
The study plan uses the same file order as that displayed in the currently opened table at the bottom of the window. This provides more flexibility than relying solely on the alphanumeric values of the file names.
If I have this table open as I click the Generate Study Plan button:
Then the study plan will have the files in this order:
Jigokuraku-03.srt
Jigokuraku-01.srt
Jigokuraku-02.srt
Note: the Total "file" is artificial and won't be included, nor is its data used in any calculations.
With this table open:
Then the order will be this:
Jigokuraku-03.srt
Jigokuraku-02.srt
Jigokuraku-01.srt
Progression
In the beginning stages of language acquisition, your working vocabulary will consist mostly of commonly used words. As your ability increases, you will recognize a richer variety of words. As you approach native-level proficiency, you will recognize almost all words -- from the very common to the highly specialized.
Although AnkiMorphs cannot measure true language acquisition, the Progression tool can help you understand both your learning progress and the quality of your card collection with respect to morph priority.
Setup
Designating Morph Priorities
Since progression is measured with respect to morph priorities, we must first decide how morph priorities should be determined. In an identical manner to note filters, you can either use the morph frequencies of your card collection (Collection frequency) or you can designate a custom .csv file that contains this information. Any .csv file located in the folder [anki profile folder]/priority-files/ is available for selection.
Options
To gauge progression, AnkiMorphs essentially calculates a histogram. Morphs with assigned priorities are first binned into priority ranges (priorities 1-500, 501-1000, etc.).
The user can designate the bin size as well as the minimum and maximum priority considered. For example, these settings:
specify the priority bins 1-2000, 2001-4000, 4001-6000, and 6001-8000.
Note: the calculated bins may differ depending on the number of specified morph priorities.
Bins can also be cumulative:
In this mode, bin statistics will increase or decrease monotonically.
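The binning can be sketched in Python. The settings used here (minimum 1, maximum 8000, bin size 2000) are inferred from the bins listed above, and the interpretation of cumulative mode as "each bin starts at the minimum priority" is an assumption; the function name is made up.

```python
def priority_bins(min_priority, max_priority, bin_size, cumulative=False):
    """Return (first_priority, last_priority) pairs covering the requested range."""
    bins = []
    start = min_priority
    while start <= max_priority:
        end = min(start + bin_size - 1, max_priority)
        # In cumulative mode every bin also covers all earlier priorities.
        bins.append((min_priority if cumulative else start, end))
        start = end + 1
    return bins


print(priority_bins(1, 8000, 2000))
print(priority_bins(1, 8000, 2000, cumulative=True))
```

Since each cumulative bin contains the previous one, counts such as the number of known morphs can only grow from one bin to the next.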
Finally, the user can specify whether morphs should be evaluated according to lemma or inflection.
This can be freely changed regardless of the mode specified in the general settings. However, if using morph priorities from a custom .csv file, make sure that the file is compatible with the morph evaluation mode, e.g. if you are using a lemma-only priority file, then you can only evaluate by lemma.
Results
Clicking Calculate progress report will determine the current progression and populate the results.
Numerical and Percentage Tabs
The Numerical tab reports the number of unique morphs, known morphs, learning morphs, unknown morphs, and missing morphs in each priority range (bin). Unknown morphs are present in the card collection, while missing morphs are not present in the card collection.
The Percentage tab reports these same statistics as percentages of unique morphs. By examining the percentage of known morphs, learning progress can be evaluated. Meanwhile, the percentage of missing morphs is an important metric of card collection quality -- a deck should contain the most relevant morphs, after all.
Morph List Tab
The Morph List tab provides the status of each morph with a specified priority. This information can be useful to quickly zero in on critical morphs that are either unknown or missing from your card collection.
Reset Tags
When you switch to a new morphemizer or change the morph evaluation from lemma to inflection, some tags on your cards may become incorrect or misleading and these should be removed. The tags shown in the picture above are safe to remove because they will always be reapplied during recalc.
To reset these tags, go to Tools -> AnkiMorphs -> Reset Tags
Exporting Known Morphs
Exports all the morphs from ankimorphs.db
that have the specified interval or above. Useful
for setting known morphs, which allows you to trim down your card collection.
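Conceptually, the export is a filter on the learning interval. The sketch below runs such a filter against an in-memory stand-in for ankimorphs.db, using the Morph table columns documented in the Databases section. The table name `Morphs`, the sample rows, the threshold of 21, and the function name are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for ankimorphs.db (assumed table name: Morphs).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Morphs (lemma TEXT, inflection TEXT, "
    "highest_learning_interval INTEGER, PRIMARY KEY (lemma, inflection))"
)
con.executemany(
    "INSERT INTO Morphs VALUES (?, ?, ?)",
    [("食べる", "食べた", 30), ("行く", "行った", 5), ("見る", "見て", 21)],
)


def morphs_at_or_above(connection, min_interval):
    """Return (lemma, inflection) pairs whose interval is min_interval or above."""
    rows = connection.execute(
        "SELECT lemma, inflection FROM Morphs "
        "WHERE highest_learning_interval >= ? "
        "ORDER BY highest_learning_interval DESC",
        (min_interval,),
    )
    return rows.fetchall()


print(morphs_at_or_above(con, 21))
```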
Select Output
Select the folder you would like AnkiMorphs to save the file to.
Defaults to the [anki profile]/known-morphs
folder.
Resulting File
The file name will be known_morphs-{datetime}.csv
, where datetime is the time of creation, e.g.:
known_morphs-2024-01-11@18-47-19.csv
The file format will be the same as those generated by the priority file generator.
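The file name pattern shown above can be reproduced with `datetime.strftime`. This is only a sketch of the naming scheme; the function name is hypothetical and the format string is inferred from the example file name.

```python
from datetime import datetime


def known_morphs_file_name(created: datetime) -> str:
    """Build a file name like known_morphs-2024-01-11@18-47-19.csv."""
    timestamp = created.strftime("%Y-%m-%d@%H-%M-%S")
    return f"known_morphs-{timestamp}.csv"


print(known_morphs_file_name(datetime(2024, 1, 11, 18, 47, 19)))
```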
Tips & Tricks
Learning specific media
If you want to learn a specific piece of media, like a book or a movie, a targeted priority file can speed things up compared to a general one. However, you should only really do this after you have already learned at least the 2k most frequent morphs. If you start to specialize too early, you can fall into the trap of 'over-fitting' your vocabulary and understanding of the language.
Reverting AnkiMorphs changes
There are a couple of ways to revert the changes AnkiMorphs has made to your card collection:
- Restore from a previous backup you made.
- If you only want to revert how AnkiMorphs sorted the cards, then you can do the following:
Browse -> Card State -> New cards -> Select all (Ctrl + A) -> Forget -> Restore original position where possible
Known Problems
Undoing 'set known and skip'
There is a bug that occurs when you do the following:
- Open Anki
- Go to a deck and click 'Study Now'
- Only 'set known and skip' cards
If you do this then those actions cannot be undone immediately. You can work around this by answering (or basically doing anything to) the next card; you can then undo twice, and the previous 'set known and skip' will be undone.
This is a weird bug, but I suspect it is due to some guards Anki has about not being able to undo something until the user has made a change manually first ('set known and skip' only makes changes programmatically).
Redo is not supported
Redoing, i.e. undoing an undo (Ctrl+Shift+Z), is a nightmare to handle with the current Anki API. Since it is a rarely used feature, it is not worth the required time and effort to make sure it always works. Redo might work just fine, but it also might not. Use it at your own risk.
Freezing when reviewing
AnkiMorphs uses the Anki API to run in the background after you answer a card, which then displays a progress bar of how many cards have been skipped:
The Anki API has a rare bug where it sometimes gets in a deadlock and just says 'Processing...' forever.
When this happens you have to restart Anki.
Morphs don't split correctly
Anki stores text on cards as HTML, and this can cause some weird/unexpected problems. One such problem is that line breaks are actually stored as <br>. Here is how it looks on the card:
Hello. Goodbye.
This is how it is actually stored:
Hello.<br>Goodbye.
Most morphemizers completely ignore the unicode equivalent of <br>, which results in them interpreting the text as:
Hello.Goodbye.
To fix this problem, we can use the find and replace feature in Anki to add a whitespace before the <br> on all our cards.
The Find field has this:
(\S)<br>
The (\S) part finds a non-whitespace character and saves it for later.
The Replace With field has this:
${1} <br>
The ${1} part re-inserts the (\S) character that was found earlier.
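The same substitution can be tried outside Anki with Python's `re` module before running it on a whole collection. Note that Python writes the back-reference as `\g<1>` rather than `${1}`; the function name is hypothetical.

```python
import re


def add_space_before_br(field_html: str) -> str:
    """Mirror the Find/Replace pair: (\\S)<br> becomes the captured character, a space, then <br>."""
    # (\S) captures the character right before <br>; \g<1> puts it back.
    return re.sub(r"(\S)<br>", r"\g<1> <br>", field_html)


print(add_space_before_br("Hello.<br>Goodbye."))  # Hello. <br>Goodbye.
```

Text that already has whitespace before the `<br>` is left unchanged, so running the replacement more than once does no harm.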
Ruby characters (furigana, etc.) are displayed wrong in am-highlighted
When morphs are not recognized in the same way that the ruby characters intended, then we can get ugly things like this:
This is because
錬金術師
gets split into ->[錬金術, 師]
and the ruby characters are after the second morph, so they only attach to that one. Fixing this programmatically is not possible, unfortunately. If you really wanted to fix this particular card then you would have to do some manual editing to the ruby characters in the original field, e.g. splitting it into two different parts:
original: 錬金術師[れんきんじゅつし]
split: 錬金術[れんきんじゅつ]師[し]
then
am-highlighted
will produce this instead:
Incorrect highlighting of ignored names
When names are ignored, either by the morphemizer or because they are listed in names.txt, the highlighting is prone to false positives, where other morphs that also occur in the text can mistakenly get highlighted inside the names:
Readability report freezes indefinitely when input is too long
When using the
AnkiMorphs: Japanese
morphemizer, excessively long lines of text can cause the morphemizer's buffer to overflow, causing the progress bar to freeze indefinitely. To avoid this, try splitting the long lines into shorter segments.
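One way to pre-process such input is sketched below: split after Japanese sentence-ending punctuation first, then hard-chunk anything that is still too long. The function name and the default limit of 200 characters are arbitrary assumptions; the actual length at which the morphemizer runs into trouble is not stated here.

```python
import re


def split_long_line(line: str, max_length: int = 200) -> list:
    """Split a line into segments of at most max_length characters."""
    # Split after sentence-ending punctuation, keeping the punctuation attached.
    sentences = [s for s in re.split(r"(?<=[。！？])", line) if s]
    segments = []
    for sentence in sentences:
        # Hard-chunk any sentence that is still longer than the limit.
        for start in range(0, len(sentence), max_length):
            segments.append(sentence[start:start + max_length])
    return segments


print(split_long_line("あ。いう！えお", max_length=10))
```

Joining the returned segments gives back the original line, so no text is lost by the split.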
Anki crashing when opening AnkiMorphs settings
The AnkiMorphs: Japanese morphemizer doesn't handle paths with diacritical marks very well, so paths like this:
C:\Users\héroïne
can cause crashes. If you can't change the path name that is causing the crash, try using spaCy morphemizers instead.
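To check quickly whether a path might be affected, you can look for non-ASCII characters in its parts. This is only a rough proxy, since "non-ASCII" is broader than "has diacritical marks", and the function name is hypothetical.

```python
from pathlib import PureWindowsPath


def non_ascii_parts(path: str) -> list:
    """Return the path components that contain non-ASCII characters."""
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]


print(non_ascii_parts(r"C:\Users\héroïne"))  # ['héroïne']
```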
Changelog
View changelog on github: https://github.com/mortii/anki-morphs/releases
Frequently Asked Questions
Can you make AnkiMorphs work on Anki qt5?
No, supporting multiple versions of Qt is too detrimental to the project's workflow and code-base.
The last release of Qt 5 was in May 2023, so using Qt 6 is the sensible thing to do.
Transitioning from MorphMan
Should I add a note-filter row for both my sentence field and my focus morph field?
No, only use the sentence field.
Should I use the same tags in AnkiMorphs that I was using with MorphMan?
I recommend using the default AnkiMorphs tags. Mixing tags can get confusing.
Should I export all of my studied and in-progress words into a CSV spreadsheet?
AnkiMorphs determines which morphs are known in the same way MorphMan does: by how long the learning intervals of the cards are. The Known Morphs Exporter is more of a tool for trimming your card collection; it's not a requirement for transitioning from MorphMan.
If you want to retain the morphs on cards that you have tagged as known with MorphMan, then I recommend bulk tagging those cards with am-known-manually:
- Open Browse
- Select the MorphMan known tag in the sidebar
- Select all those cards
- Go to Notes in the topbar and click on Add Tags (or use Ctrl+Shift+A)
- Enter the tag am-known-manually
That approach could be overkill though. I wouldn't worry too much about losing known morphs from the cards you tagged as known with MorphMan; you can usually get them back quickly by pressing K when you encounter them in AnkiMorphs.
Should I manually delete the words in the focus morph field of my cards so that AnkiMorphs can cleanly reparse everything?
AnkiMorphs does not reuse the MorphMan focus morph field, so it makes no difference.
Setup
Linux
-
Installing python 3.9:
Anki (very unfortunately) uses python 3.9. This is considered a dead version of python, so it can't be installed automatically by most package managers. To install 3.9 on a debian system you can do something like this. Alternatively you can use pyenv.
The reason we want to install python 3.9 is that we need to make sure the dev-environment matches the real-world Anki environment--if we use newer versions of python then things might work fine in the dev environment, but Anki crashes as soon as we leave it because the python code is too new (this has happened multiple times).
When this command succeeds:
$ python3.9 --version
Python 3.9.[x]
then you are ready to move on to the next step.
-
Setting up the dev environment:
We want to use a virtual environment for a few reasons: we don't want to install the project's dependencies on the global environment (your pc) because you might end up with package conflicts or accidentally downgrading packages, etc; a virtual environment also makes sure the dependencies are consistent for all developers.
python3.9 -m pip install --upgrade pip virtualenv
python3.9 -m virtualenv venv
source venv/bin/activate  # <--- this activates the virtual environment
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
pre-commit install
Also install Xvfb on your system, e.g.:
sudo apt-get install xvfb
This prevents windows popping up when pre-commit runs pytest.
Remember to activate the virtual environment any time before you start working on the project; some IDEs do this automatically.
-
Set the project python interpreter to be anki-morphs/venv/bin/python to get your IDE to recognize the packages installed above.
-
Create a symbolic link from the cloned repo to the Anki add-ons folder so Anki starts using the cloned AnkiMorphs:
ln -s ~/path/to/cloned/repo/anki-morphs/ankimorphs ~/.local/share/Anki2/addons21/ankimorphs
-
Using pre-commit:
Pre-commit runs some commands (pylint, pytest, etc.) on the code before you commit to make sure the code is in good condition. Pre-commit is configured in .pre-commit-config.yaml and some of the commands have additional configurations in pyproject.toml.
You can run it manually with the command:
pre-commit run --all-files
If you want to make an intermediate commit without caring about pre-commit running successfully you can use the --no-verify flag, e.g.:
git commit -am "fixed abc" --no-verify
Pre-commit can be annoying to use in the same way that it can be annoying to follow traffic-laws--sure it might slow you down right now, but it is much better in the long-run when everybody does it. Pre-commit can help you in three ways:
- Automatically fix code for you
- Catch bugs earlier
- Make code more understandable
Once you get used to the pre-commit flow it no longer slows you down, and there are only upsides to using it.
Pre-commit fails in two ways:
-
Automatically fixed
When a pre-commit hook changes a file (fixing it) then you simply have to re-stage the file and re-run the commit. E.g.:
$ vim recalc_main.py
$ git commit -am "made changes"
isort (python)..........Failed
- hook id: isort
- files were modified by this hook
Fixing /home/{...}/recalc_main.py
$ git commit -am "made changes"
[main c0bd018] made changes
1 file changed, 56 insertions(+)
-
Has to be manually fixed
The majority of the hooks provide warnings that have to be handled manually. Most of the time the required fixes provide significant improvements to the code, and you might learn something new and become a better programmer in the process. Sometimes the reported errors are false positives, or the suggested fix is actually problematic in some way. When this happens then ignoring it is fine, e.g.:
from aqt.qt import QMessageBox  # pylint:disable=no-name-in-module
Optional: if you use GitKraken you have to adjust the pre-commit script (anki-morphs/.git/hooks/pre-commit) to activate the virtual environment first:
#!/usr/bin/env bash
# File generated by pre-commit: https://pre-commit.com
# ID: 138fd403232d2ddd5efb44317e38bf03

# start templated
ANKIMORPH_DIR=/home/mortii/git/anki-morphs/  # <--- ADD THIS LINE!
INSTALL_PYTHON=/home/mortii/git/anki-morphs/venv/bin/python3
ARGS=(hook-impl --config=.pre-commit-config.yaml --hook-type=pre-commit)
# end templated

HERE="$(cd "$(dirname "$0")" && pwd)"
ARGS+=(--hook-dir "$HERE" -- "$@")

if [ -x "$INSTALL_PYTHON" ]; then
    cd $ANKIMORPH_DIR && source venv/bin/activate  # <--- ADD THIS LINE!
    exec "$INSTALL_PYTHON" -mpre_commit "${ARGS[@]}"
elif command -v pre-commit > /dev/null; then
    exec pre-commit "${ARGS[@]}"
else
    echo '`pre-commit` not found. Did you forget to activate your virtualenv?' 1>&2
    exit 1
fi
Debugging
Debugging tools for Anki add-ons are unfortunately fairly limited. The simplest approach is to use print statements in the code which can then be seen in a terminal that spawned the Anki instance. Here is the guide for doing that: https://addon-docs.ankiweb.net/console-output.html#showing-the-console
Redirecting the terminal output to a file can be very useful. Here is a linux example:
anki > anki_output.txt
There is also a dedicated test function in the AnkiMorphs code that allows for faster/easier testing; you can find it here: __init__.py: test_function
Docs
This website was made using mdBook and is hosted on Github Pages.
mdBook
The official mdBook guide is pretty good and walks you through the entire setup process and how to use it.
First install rust to get the cargo package manager:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Install mdbook:
cargo install mdbook
To display latex equations and environments install mdbook-katex with the command:
cargo install mdbook-katex
Launch mdbook from the anki-morphs repo with the command:
mdbook serve docs/ --open
Github Pages
To make github automatically deploy the generated book to github pages do the following:
- Activate Github Actions:
Github repo settings → Code and automation → Pages → Build and deployment → Github Actions
- Make any necessary adjustments to branch names or paths in:
anki-morphs/.github/workflows/deploy.yml
Project sites will be available at:
http(s)://<username>.github.io/<repository>
Styling
For additional spacing between bullet points, you have to add a blank line between at least two of the points, e.g.:
* **a**:
a

* **b**:
b
* **c**:
c
Qt Designer
Creating dialogs with Qt Designer is much easier than doing it by hand. The Qt Designer packages conflict with anki-qt (aqt), so we need to use a different virtual environment.
python3.9 -m pip install --upgrade pip
python3.9 -m pip install virtualenv
python3.9 -m virtualenv designer-venv
source designer-venv/bin/activate
python3.9 -m pip install pyqt6 pyqt6-tools
Start Qt Designer with the command:
./designer-venv/lib/python3.9/site-packages/qt6_applications/Qt/bin/designer
Convert ui file to python:
pyuic6 -o ankimorphs/ui/settings_dialog_ui.py ankimorphs/ui/settings_dialog.ui
pyuic6 -o ankimorphs/ui/tag_selection_dialog_ui.py ankimorphs/ui/tag_selection_dialog.ui
pyuic6 -o ankimorphs/ui/generators_window_ui.py ankimorphs/ui/generators_window.ui
pyuic6 -o ankimorphs/ui/known_morphs_exporter_dialog_ui.py ankimorphs/ui/known_morphs_exporter_dialog.ui
pyuic6 -o ankimorphs/ui/view_morphs_dialog_ui.py ankimorphs/ui/view_morphs_dialog.ui
pyuic6 -o ankimorphs/ui/generator_output_dialog_ui.py ankimorphs/ui/generator_output_dialog.ui
pyuic6 -o ankimorphs/ui/progression_window_ui.py ankimorphs/ui/progression_window.ui
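All of the commands above follow one naming pattern: `name.ui` becomes `name_ui.py` in the same folder. The sketch below derives the command for a given ui file; the helper name is hypothetical, and it only builds the argument list rather than running pyuic6.

```python
from pathlib import PurePosixPath


def pyuic6_command(ui_file: str) -> list:
    """Build the pyuic6 argument list for one .ui file."""
    ui_path = PurePosixPath(ui_file)
    # settings_dialog.ui -> settings_dialog_ui.py, next to the .ui file
    output = ui_path.with_name(f"{ui_path.stem}_ui.py")
    return ["pyuic6", "-o", str(output), str(ui_path)]


print(" ".join(pyuic6_command("ankimorphs/ui/settings_dialog.ui")))
```

The returned list could be passed to `subprocess.run` from inside the activated designer-venv if you prefer to convert every ui file in one go.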
Useful guides:
- https://realpython.com/qt-designer-python/
- https://www.pythontutorial.net/pyqt/qt-designer/
Databases
ankimorphs.db
This is an sqlite database with three tables:
'Cards'
'Card_Morph_Map'
'Morphs'
A card can have many morphs, morphs can be on many cards, so we need a many-to-many db structure:
Cards -> Card_Morph_Map <- Morphs
Card table
card_id INTEGER PRIMARY KEY ASC,
note_id INTEGER,
note_type_id INTEGER,
card_type INTEGER,
tags TEXT
Card_Morph_Map table
card_id INTEGER,
morph_lemma TEXT,
morph_inflection TEXT,
FOREIGN KEY(card_id) REFERENCES card(id),
FOREIGN KEY(morph_lemma, morph_inflection) REFERENCES morph(lemma, inflection)
Morph table
lemma TEXT,
inflection TEXT,
highest_learning_interval INTEGER,
PRIMARY KEY (lemma, inflection)
To make sure the morphs are unique, we make the primary key the lemma AND inflection, since inflections can be identical even if they are derived from two different bases, eg:
Inflection : Lemma
ある : 有る
ある : 或る
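The many-to-many structure and the composite key can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the add-on's actual setup code: it assumes the table names Cards, Card_Morph_Map, and Morphs listed above, points the foreign keys at those names, and uses made-up card data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE Cards (
        card_id INTEGER PRIMARY KEY ASC,
        note_id INTEGER,
        note_type_id INTEGER,
        card_type INTEGER,
        tags TEXT
    );
    CREATE TABLE Morphs (
        lemma TEXT,
        inflection TEXT,
        highest_learning_interval INTEGER,
        PRIMARY KEY (lemma, inflection)
    );
    CREATE TABLE Card_Morph_Map (
        card_id INTEGER,
        morph_lemma TEXT,
        morph_inflection TEXT,
        FOREIGN KEY(card_id) REFERENCES Cards(card_id),
        FOREIGN KEY(morph_lemma, morph_inflection) REFERENCES Morphs(lemma, inflection)
    );
    """
)

# Two morphs share the inflection ある but have different lemmas.
con.execute("INSERT INTO Cards VALUES (1, 10, 100, 0, '')")
con.executemany(
    "INSERT INTO Morphs VALUES (?, ?, ?)",
    [("有る", "ある", 0), ("或る", "ある", 0)],
)
con.executemany(
    "INSERT INTO Card_Morph_Map VALUES (?, ?, ?)",
    [(1, "有る", "ある"), (1, "或る", "ある")],
)


def morphs_on_card(connection, card_id):
    """Return the (lemma, inflection) pairs mapped to a card."""
    rows = connection.execute(
        "SELECT m.lemma, m.inflection FROM Card_Morph_Map AS cmm "
        "JOIN Morphs AS m ON m.lemma = cmm.morph_lemma "
        "AND m.inflection = cmm.morph_inflection "
        "WHERE cmm.card_id = ?",
        (card_id,),
    )
    return rows.fetchall()


print(morphs_on_card(con, 1))
```

Both ある rows coexist because their lemmas differ, while inserting the same (lemma, inflection) pair a second time raises `sqlite3.IntegrityError`.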
Using an int as a primary key is preferable over text objects, but hashing the lemma and inflection into an int would introduce the risk of collisions:
# sqlite integers are max 2^(63)-1 = 9,223,372,036,854,775,807
# The chance of a hash collision reaches roughly 50% at about sqrt(2^n) = 2^(n/2) items, where n is the number of bits in the hash
# With 64 bits that is 2^(64/2) = 4,294,967,296 items; with 32 bits it is 2^(32/2) = 65,536 items
So a 32-bit hash would likely collide once we have over 65,536 morphs, and even with a 64-bit hash a single collision would cause bugs that are basically impossible to trace.
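The birthday approximation behind this kind of estimate can be evaluated directly. The sketch below uses the standard formula k ≈ sqrt(2 · 2^n · ln(1 / (1 - p))); the function name is hypothetical. Under that formula, a figure on the order of 65,536 corresponds to a 32-bit hash, while a full 64-bit hash reaches a 50% collision chance at roughly five billion items.

```python
import math


def items_for_collision_probability(bits: int, probability: float = 0.5) -> float:
    """Approximate number of hashed items at which a collision has the given probability."""
    space = 2.0 ** bits  # number of possible hash values
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


print(round(items_for_collision_probability(32)))
print(round(items_for_collision_probability(64)))
```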
Anki dbs
from aqt import mw

table_info = mw.col.db.execute("PRAGMA table_info('decks');")
print(f"table_info: {table_info}")
Anki collection db tables:
[['col'],
['notes'],
['cards'],
['revlog'],
['deck_config'],
['config'],
['fields'],
['templates'],
['notetypes'],
['decks'],
['sqlite_stat1'],
['sqlite_stat4'],
['tags'],
['graves']]
notes table:
[[0, 'id', 'INTEGER', 0, None, 1],
[1, 'guid', 'TEXT', 1, None, 0],
[2, 'mid', 'INTEGER', 1, None, 0],
[3, 'mod', 'INTEGER', 1, None, 0],
[4, 'usn', 'INTEGER', 1, None, 0],
[5, 'tags', 'TEXT', 1, None, 0],
[6, 'flds', 'TEXT', 1, None, 0],
[7, 'sfld', 'INTEGER', 1, None, 0],
[8, 'csum', 'INTEGER', 1, None, 0],
[9, 'flags', 'INTEGER', 1, None, 0],
[10, 'data', 'TEXT', 1, None, 0]]
notetypes table:
[[0, 'id', 'INTEGER', 1, None, 1],
[1, 'name', 'TEXT', 1, None, 0],
[2, 'mtime_secs', 'INTEGER', 1, None, 0],
[3, 'usn', 'INTEGER', 1, None, 0],
[4, 'config', 'BLOB', 1, None, 0]]
cards table:
'id' ID_FIELD_NUMBER: builtins.int
'nid' NOTE_ID_FIELD_NUMBER: builtins.int
'did' DECK_ID_FIELD_NUMBER: builtins.int
'ord' TEMPLATE_IDX_FIELD_NUMBER: builtins.int
'mod' MTIME_SECS_FIELD_NUMBER: builtins.int # when card was modified
'usn' USN_FIELD_NUMBER: builtins.int
'type' CTYPE_FIELD_NUMBER: builtins.int
'queue' QUEUE_FIELD_NUMBER: builtins.int
'due' DUE_FIELD_NUMBER: builtins.int
'ivl' INTERVAL_FIELD_NUMBER: builtins.int
'factor' EASE_FACTOR_FIELD_NUMBER: builtins.int
'reps' REPS_FIELD_NUMBER: builtins.int
'lapses' LAPSES_FIELD_NUMBER: builtins.int
'left' REMAINING_STEPS_FIELD_NUMBER: builtins.int
'odue' ORIGINAL_DUE_FIELD_NUMBER: builtins.int
'odid' ORIGINAL_DECK_ID_FIELD_NUMBER: builtins.int
'flags' FLAGS_FIELD_NUMBER: builtins.int
'data' custom_data builtins.str
'type' is the learning stage type:
CardType = NewType("CardType", int)
CARD_TYPE_NEW = CardType(0)
CARD_TYPE_LRN = CardType(1)
CARD_TYPE_REV = CardType(2)
CARD_TYPE_RELEARNING = CardType(3)
'queue' types:
CardQueue = NewType("CardQueue", int)
QUEUE_TYPE_MANUALLY_BURIED = CardQueue(-3)
QUEUE_TYPE_SIBLING_BURIED = CardQueue(-2)
QUEUE_TYPE_SUSPENDED = CardQueue(-1)
QUEUE_TYPE_NEW = CardQueue(0)
QUEUE_TYPE_LRN = CardQueue(1)
QUEUE_TYPE_REV = CardQueue(2)
QUEUE_TYPE_DAY_LEARN_RELEARN = CardQueue(3)
QUEUE_TYPE_PREVIEW = CardQueue(4)
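When printing rows from the cards table while debugging, it can help to turn the raw 'type' and 'queue' integers into readable labels. The mapping below copies the values listed above; the label strings and the helper name are made up for illustration.

```python
# Integer values taken from the CardType and CardQueue listings above.
CARD_TYPES = {0: "new", 1: "learning", 2: "review", 3: "relearning"}
CARD_QUEUES = {
    -3: "manually buried",
    -2: "sibling buried",
    -1: "suspended",
    0: "new",
    1: "learning",
    2: "review",
    3: "day learn/relearn",
    4: "preview",
}


def describe_card(card_type: int, queue: int) -> str:
    """Return a readable summary of a card's type and queue values."""
    type_label = CARD_TYPES.get(card_type, "unknown")
    queue_label = CARD_QUEUES.get(queue, "unknown")
    return f"type={type_label}, queue={queue_label}"


print(describe_card(2, -1))  # type=review, queue=suspended
```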
Contributors
A huge thank you to all the people who have made this project possible! Contributors are listed in (hopefully) chronological order. If you have contributed, but you are not on the list, create an issue on GitHub where you let me know, and I'll add you ;)
Code contribution
mortii, Vilhelm-Ian, xofm31, Jcuhfehl, schiozzone, Tartee, wolearyc, mdraves91, hans.
Docs contribution
Matt Vs Japan, mortii, Vilhelm-Ian, cocowash, xuiqzy, xofm31, zeroeightysix, wolearyc, Celebes.
Bug reports, feature requests, or other helpful guidance
Vilhelm-Ian, CodeWithMa, ashprice, aleksejrs, HQYang1979, soliviantar, buster-blue, rymiel, BorisNA, wrinkledeth, cocowash, asayake-b5, quietmansoath, MichaelPetre, xofm31, knoebelja, xuiqzy, Jcuhfehl, fuquasteve, pallas42, syfgk, jahnke, jsteel44, iwouldrathernotusegithub, tanhoaian01, drkthomp, Kirchheim, zeroeightysix, Gardengul, wolearyc, Pedrubik2000, RyanMcEntire, BobvanSchendel, khanguyenwk, buqamura, Rct567, rwmpelstilzchen, bie-zheng, IncontinentCell.
MorphMan (v5.0-qt6-alpha.1)
Joseph Re, Jeremy Neiman, Chang Spivey, Patrice Peterson, Greg Price, kaegi, Landon Epps, InfiniteRain, David Lõssenko, imd, Shan Rauf, derpue, ianki, ym555, Zachery Gyurkovitz, David Gaya, Sara Aimée Smiseth, Yiufung Cheong, Daniel M German, sb7297, RisingOrange, Joe Strong, Jeffrey Ying, rameauv.