basic vocabulary

Vocabulary

One of the best ways to start familiarizing yourself with kanji is by using a beginner-level vocabulary deck on Anki. These decks typically contain around 1,500 of the most common Japanese words, giving you a solid foundation to jumpstart your learning. By working through this deck, you'll get a feel for how kanji are used together in everyday words, which is a crucial step in mastering the language.

It’s important to understand that not all words carry equal weight in language learning. In fact, the top 1,000 most frequently used words in Japanese account for nearly 75% of all written material. This isn't just a coincidence—it’s a pattern found in most languages. By focusing on these high-frequency words first, you’ll accelerate your progress, building a core vocabulary that makes it easier to learn through context and immersion.

As you memorize the most essential 1,000 to 2,000 words, something transformative happens. Suddenly, you’ll find that you can understand the majority of words in sentences you read or hear. The need to constantly look up unfamiliar terms drops dramatically, and learning becomes a smoother, more efficient process. This early investment in core vocabulary pays off, making the overall experience of mastering Japanese much less daunting.

Focus on Comprehension First

We recommend focusing on understanding Japanese through immersion before attempting to speak it. Once you’ve built a solid foundation in comprehension, speaking the language will come naturally and with much less effort.

The Anki deck introduced below is designed to enhance your understanding of the language. The cards in this deck are specifically crafted to test your recognition skills rather than forcing you to recall words from memory.

Learning Through Sentences

Our preferred method for learning new words is through reading and comprehending sentences. This approach is the most natural, as sentences and phrases are encountered far more often in immersion than isolated words. Moreover, sentences offer context, which can trigger memory in situations where the meaning of a single word might elude you.

Of course, it's challenging to start reading full sentences if you're unfamiliar with even basic vocabulary. To address this temporary hurdle, we use targeted sentence cards (TSCs). A TSC is a flashcard that provides context, but understanding that context isn't necessary to pass the card. Only the target word is tested. If you grasp the context, it aids your understanding. If not, you treat the card as an isolated word. This method quickly becomes advantageous as your vocabulary grows. Even after learning just a few hundred words, your comprehension will expand noticeably. Reviewing sentences in Anki reinforces your memory, helping you understand how words are used in speech, their roles within sentences, and how they connect with other words—benefits that isolated vocabulary cards can't provide.

While learning sentences is the best way to familiarize yourself with language use and grammar structures, it's rare for premade Anki sentence decks to introduce only a single unknown word per sentence, especially for beginners. One way to manage this is by creating isolated vocabulary cards. However, these have their own limitations—they don't teach word usage. Vocabulary cards can't convey how to use a word, what collocations it typically forms, or in what contexts it's appropriate. TSCs overcome this by focusing on the target word while retaining the context and sentence structure.

Anki Decks for Vocabulary Building

The recommended beginner deck is "Kaishi 1.5k", which can be downloaded here (click the .apkg link).

Other options
  • Core 2.3K Deck: Optimized to get you reading visual novels (VNs) as quickly as possible.
  • Core 2K/6K Deck: The popular Core 2k/6k Japanese Anki deck; A | B
  • Tae Kim's Deck: Lets you learn Japanese with a strong focus on listening comprehension, using sample sentences from anime. The deck starts from zero and requires no prior reading skills.
  • Ankidrone Foundation - Learn Kanji With Vocabulary: Teaches you to recognize kanji along with the 1,000 most common words used in everyday conversations.

Effective Study Strategies

When studying basic vocabulary from a premade Anki deck, your goal should be to learn just enough words to transition into learning the rest through immersion. Once you feel comfortable understanding large chunks of native Japanese—such as in anime or movies—it’s time to set aside the premade decks and focus on learning new vocabulary from immersion and creating your own Anki cards using sentences you encounter.

Typically, you should aim to learn the first 1,000 to 2,000 cards from a premade deck. Then, start sentence mining using TV shows with Japanese subtitles, and later, manga and novels. While sentence mining, continue learning new cards from premade decks at a reduced pace. It’s important not to spend too much time on beginner decks; the higher your level, the less benefit you’ll gain from them. Refer back to premade decks when you struggle to find example sentences during mining.

Reviewing with Anki

When reviewing flashcards, follow these steps:

  1. Read the Target Word: Focus on the target word (usually in bold) or the full sentence if you prefer.
  2. Recall the Kanji Reading: If the target word includes kanji, try to recall its reading from memory.
  3. Understand the Meaning: Attempt to recall the general meaning of the target word.
  4. Use Context: Use the sentence's context to understand how the target word connects with others.
  5. Check Your Answer: Reveal the back of the card to confirm whether your recall was correct.
  6. Grading: If your guess was correct, press "Good." Otherwise, press "Again." Avoid using the "Hard" and "Easy" buttons. Install the AJT Flexible Grading add-on to hide these options.

Remember, your understanding doesn’t need to be precise. Aim for a basic grasp of each word’s meaning. English translations alone won’t fully convey the nuances of Japanese words—immersion is essential for achieving that level of comprehension.

Pro Tips

  • Dealing with Difficult Cards: If you struggle with certain cards, consider using the Mortician add-on, which postpones challenging cards, preventing them from monopolizing your study time.
  • Skipping Familiar Words: Feel free to skip words that are cognates or familiar from other languages, such as katakana words like タクシー (taxi) or エアコン (air conditioner). These can be easily learned through immersion. To suspend a card, press `@` on your keyboard; to suspend the whole note, press `!`.
  • Pace Yourself: Don’t overdo it with new cards. It might seem easy at first, but Anki reviews can quickly become overwhelming. We recommend adding 10 to 30 new cards per day.
  • Context Over Translation: Don’t take English translations too literally. They often don’t match the Japanese sentence word-for-word. To truly understand the meaning, you need a solid grasp of underlying grammar structures. Use sentences to reinforce your understanding, but refer to grammar guides or dictionaries like Jisho.org when needed.
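
To put the pacing tip above into numbers, here is a rough sketch of how long it takes to get through a premade deck at different daily rates. The function name `days_to_finish` is made up for illustration, and the calculation only counts introducing new cards; it ignores review load and any cards you skip.

```shell
# days_to_finish TOTAL_CARDS NEW_PER_DAY
# Prints how many days it takes to introduce every card once,
# rounding up so that a partial final day still counts.
days_to_finish() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

days_to_finish 1500 10   # 150
days_to_finish 1500 30   # 50
```

At 10 new cards per day a 1,500-card deck takes about five months; at 30 per day it takes under two, at the cost of a heavier daily review pile.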

How Many Words Should You Learn from a Premade Deck?

To maximize the benefits of premade decks, aim to learn between 1,000 and 2,000 cards from the deck of your choice. Once you reach this milestone, your comprehension will leap from 0% to over 75%, which is enough to begin learning independently. From there, you can dive into native media—books, movies, or any other content—and continue expanding your vocabulary naturally.

While you can continue learning new words from premade decks even as an intermediate learner, remember that they are most beneficial in the early stages.

Are 1,000 to 1,500 Words Enough?

Knowing the words that make up 75% of running text doesn’t mean you’ll understand 75% of every sentence. Most sentences will still contain 2 or 3 unknown words. While you’ll grasp the general meaning, it won’t be comfortable. A comfortable level of comprehension is when you understand 99% or more. Therefore, you’ll need to keep learning new words long after you’ve mastered the basics.
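
A back-of-the-envelope calculation illustrates the gap. The sketch below assumes, unrealistically, that each word in a sentence is known independently with the same probability. The 10-word sentence length and the function names are illustrative choices, not figures from any data set.

```shell
# expected_unknown COVERAGE LENGTH
# Expected number of unknown words in a sentence of LENGTH words.
expected_unknown() {
    awk -v c="$1" -v k="$2" 'BEGIN { printf "%.1f\n", k * (1 - c) }'
}

# fully_known_percent COVERAGE LENGTH
# Chance (in %) that every word in the sentence is known,
# under the independence assumption: COVERAGE ^ LENGTH.
fully_known_percent() {
    awk -v c="$1" -v k="$2" 'BEGIN { printf "%.1f\n", 100 * c ^ k }'
}

expected_unknown 0.75 10      # 2.5
fully_known_percent 0.75 10   # 5.6
fully_known_percent 0.99 10   # 90.4
```

Under these assumptions, 75% word coverage leaves about 2.5 unknown words in a 10-word sentence, and only around 1 sentence in 18 is fully known. At 99% coverage, roughly 9 in 10 sentences are.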

Measuring Your Comprehension

To gauge your true comprehension, try this: watch an episode of an anime with Japanese subtitles, noting every sentence where you encounter an unknown word. Count the total number of sentences and then calculate your comprehension percentage by dividing the number of sentences you fully understood by the total number.
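
The division described above can be sketched as a small shell function. The name `comprehension_percent` and the example counts (210 understood sentences out of 300) are made up for illustration.

```shell
# comprehension_percent UNDERSTOOD TOTAL
# Prints the share of fully understood sentences as a percentage.
comprehension_percent() {
    awk -v u="$1" -v t="$2" 'BEGIN {
        if (t <= 0) { print "total must be positive" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", 100 * u / t
    }'
}

comprehension_percent 210 300   # 70.0
```

Note that this measures sentences, not words, so the result will usually be lower than the word coverage figures quoted elsewhere in this guide.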

Building Your Own Vocabulary Deck

Creating your own vocabulary deck is highly recommended. No premade deck can replace the value of mining your own sentences. The example mining deck introduced earlier includes a few dozen targeted sentence cards to demonstrate how your deck should look.

However, it’s no secret that making your own cards from native Japanese content is challenging at first. Premade decks provide beginners with a valuable shortcut, helping you reach a level of comprehension where building your own mining decks becomes much easier.

You’ll learn more about sentence mining later in this guide.

Intermission: Milestones in Vocabulary Learning

Based on the BCCWJ語彙表 data set, the following table shows how much Japanese text the N most frequently used words cover:

Words     Coverage
 1,000    75%
 2,000    80%
 3,000    85%
 6,000    90%
10,000    93%
15,000    95%
32,000    98%
50,000    99%

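You can calculate the coverage for the first N=1000 words using this shell snippet, which prints a fraction (about 0.75 corresponds to 75%):

N=1000; {
        # Drop the header, keep the next N rows, and sum column 8.
        sed "1d;$((N+1))q" BCCWJ_frequencylist_suw_ver1_0.tsv | cut -f 8 | awk '{s+=$1} END {print s}'
        # Append the divisor so that bc receives "<sum>/1000000".
        echo '/1000000'
} | paste -s -d '' | bc -l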

Breaking Down the Script:

  1. N=1000;

    This sets the variable N to 1000, representing the number of words you want to analyze from the frequency list.

  2. sed "1d;$((N+1))q" BCCWJ_frequencylist_suw_ver1_0.tsv

    sed is a stream editor used to perform basic text transformations.

    • "1d" deletes the first line (header) of the file BCCWJ_frequencylist_suw_ver1_0.tsv.
    • "$((N+1))q" stops the processing after the first N+1 lines (because the first line was deleted, we stop at N+1 to get exactly N lines).

    The result is that this command extracts the top N lines (words) from the file.

  3. cut -f 8

    cut is a command used to extract specific columns from a file.

    • "-f 8" specifies that the 8th field (or column) from each line should be extracted. Given the later division by 1,000,000, this field appears to hold each word's frequency per million words.

  4. awk '{s+=$1} END {print s}'

    awk is a text processing tool that processes each line of input.

    • "{s+=$1}" sums the values in the first field of its input (the only field left after cut, i.e. the frequency values).
    • "END {print s}" outputs the total after all lines have been processed.

  5. echo '/1000000'

    This prints the string '/1000000', which becomes the divisor of the sum. It assumes that the frequencies over the whole data set add up to 1,000,000.

  6. | paste -s -d ''

    paste concatenates the output from the previous commands into a single line.

    • "-s" means to paste the output serially (i.e., into a single line).
    • "-d ''" means to use no delimiter, so the sum and the divisor are joined into one expression (s/1000000).

  7. | bc -l

    bc is a command-line calculator.

    • "-l" loads the math library and sets the default scale to 20 decimal places, so the division is not truncated to an integer.
    • This evaluates the expression produced by paste and outputs the coverage of the top N words as a fraction between 0 and 1.

Summary: This script calculates the text coverage provided by the top N most frequent words in a data set by summing their frequencies and then dividing by the total number of words in the data set (assumed to be 1,000,000). The final output is the cumulative coverage of those N words as a fraction; multiply it by 100 to get a percentage.
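
As an alternative sketch (not the original script), the same calculation can be done in a single awk program and wrapped in a function, which makes it easier to run for several values of N. It makes the same assumptions as the snippet above: a tab-separated file with one header row and per-million frequencies in column 8. The function name `coverage` is hypothetical.

```shell
# coverage FILE N
# Prints the percentage of text covered by the top N rows of FILE.
# Assumes: tab-separated, one header row, column 8 = frequency per million words.
coverage() {
    awk -F '\t' -v n="$2" '
        NR == 1    { next }          # skip the header row
        NR > n + 1 { exit }          # stop after N data rows (jumps to END)
                   { sum += $8 }     # accumulate column 8
        END        { printf "%.1f\n", sum / 1000000 * 100 }
    ' "$1"
}

# Example usage, assuming the frequency list is in the current directory:
# for n in 1000 2000 3000 6000; do
#     printf '%s\t' "$n"; coverage BCCWJ_frequencylist_suw_ver1_0.tsv "$n"
# done
```

Doing the division inside awk avoids the paste and bc steps and prints a percentage directly instead of a fraction.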

In the early months you’ll likely see rapid progress, with each new word or concept coming more quickly than the last. However, as with any learning process, there comes a point where progress slows, and the returns on your efforts begin to diminish. This is perfectly normal—the more words you add to your vocabulary, the more subtle and incremental your comprehension gains will become.
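
The diminishing returns can be read straight off the milestone table. The sketch below takes the table's figures and prints how many percentage points of coverage each additional 1,000 words buys between consecutive milestones. The function name `gain_per_1000` is made up for illustration.

```shell
# Milestones from the table above: "<words> <coverage %>" per line.
milestones='1000 75
2000 80
3000 85
6000 90
10000 93
15000 95
32000 98
50000 99'

# gain_per_1000: for each step between consecutive milestones, print the
# coverage points gained per additional 1,000 words learned.
gain_per_1000() {
    awk 'NR > 1 { printf "%d-%d: %.2f\n", w, $1, ($2 - c) / (($1 - w) / 1000) }
         { w = $1; c = $2 }'
}

printf '%s\n' "$milestones" | gain_per_1000
```

The first steps yield 5 points per 1,000 words, while the step from 32,000 to 50,000 words yields about 0.06 points per 1,000.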

But there’s a silver lining. You can think of each new milestone as a victory, a small achievement that keeps you motivated. Reaching these milestones not only breaks the monotony of language learning but also provides a tangible sense of progress, turning what might feel like a routine into an ongoing series of rewarding challenges.