Archive for category Language study

Create your own ANKI deck with automated translations and voices: Part 2

This post continues Part 1 of the series on creating your own ANKI SRS card deck for learning a foreign language. The next part is here.

Step 3. Create the translations for your wordlist

If you sourced your words from a textbook, you might already have the translations. If so, just put them in the next column. I didn’t have a translation list, however, so I copied the list of English words and pasted it into Google Translate, which gave a corresponding list of translations on the right-hand side. I also clicked the “phonetic” button underneath the translation to show the Pinyin romanisation.

Copy the list of translations to the matching positions in the spreadsheet.

Because I also wish to learn Simplified Chinese, I translated my list into Simplified Chinese too (the pronunciations are the same for Mandarin). Here is my list after adding all the translations.

At this point it is a good idea to have the translations sanity-checked by a native speaker. As I found with Taiwanese Chinese, some words differ between Taiwan and mainland China. I exported the spreadsheet to an XLS file, and a friend was kind enough to mark the incorrect entries and replace them with the correct Taiwanese words. Afterwards I updated my own list in Google with the corrections.

Step 4. Generate the sounds.

A great feature of ANKI is that you can attach media to the flashcards. Rather than spend the time recording my own samples, I wanted to use Google Translate’s voice to generate the voice samples and attach them to my cards.

** UPDATE: I created an open-source app for Mac/Windows that downloads the sounds without having to write and execute scripts. The app is available here. It greatly simplifies the process of downloading the voice files and creating data for ANKI.

Thanks to this post on Stack Overflow, it’s possible to write a script that downloads the mp3 from Google Translate in the language of your choice and saves it to a file named after the text. The script looks like this:

 #!/bin/bash
 # Save a Chinese text string as an mp3 audio file using Google Translate.
 # usage: zh2mp3.sh <text>
 # Quoting "$1" keeps text containing spaces in a single file name and query.
 wget -q -U Mozilla -O "$1.mp3" "http://translate.google.com/translate_tts?ie=UTF-8&tl=zh&q=$1"

To create the script, I used the TextWrangler Mac app: create a new text file, copy the script text into it and save it as zh2mp3.sh (zh being the code for Chinese, but it can be any filename you like) in a new empty folder.
After you save this file, give it a test run to see that it works.

 $ sh ./zh2mp3.sh 你好

This should create a file in that directory called 你好.mp3. Play the file and check that it sounds right (你好, ni-hao, hello).

Also mentioned on that page is how to install wget from here if it’s not already installed (it wasn’t on this Mac with Mountain Lion).
In the end I used a script from a site online, which installed wget for me:

 curl -O http://ftp.gnu.org/gnu/wget/wget-1.13.tar.gz
 tar -xzvf wget-1.13.tar.gz
 cd wget-1.13
 ./configure --with-ssl=openssl
 make
 sudo make install
 which wget #Should output: /usr/local/bin/wget

Try typing wget at the terminal prompt to see whether or not it’s installed. Obviously, you don’t need to install it if you already have it and the zh2mp3 script worked.
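
If you would rather check from a script than by hand, the following is a minimal sketch of that check. The helper name have_cmd is my own and not part of the original steps; it relies only on the standard shell built-in `command -v`.

```shell
#!/bin/sh
# have_cmd NAME (hypothetical helper): succeed if NAME is found on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether wget still needs to be installed.
if have_cmd wget; then
  echo "wget found at: $(command -v wget)"
else
  echo "wget not found; install it before running zh2mp3.sh"
fi
```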

The next part is to create a script that will grab the mp3 sound for every entry in your vocabulary list.
Go back to your spreadsheet and add a column called Filenames. In the next row down, type in this formula:

 =SUBSTITUTE( TRIM(C2) ; " " ; "")

Be sure to reference the column holding the text you want spoken by Google Translate (column C in this example).
Fill this formula down to every cell in the column: highlight all the cells from E2 down and press CTRL+D (Command+D on Macs), or drag the selection box down from its bottom right corner. This copies the formula and adjusts the cell reference to the correct relative position in each row. The formula removes all spaces, which is necessary for the file names. This isn’t a problem in Chinese or Japanese, but it may be problematic for languages that need spaces between words.
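
For anyone who prefers to do this outside the spreadsheet, here is a rough shell equivalent of the formula. The function name make_filename is hypothetical. Like the formula, it only removes ordinary spaces, so other whitespace such as a full-width space would be left in place.

```shell
#!/bin/sh
# make_filename TEXT (hypothetical helper): print TEXT with every ordinary
# space removed, like the SUBSTITUTE/TRIM formula in the spreadsheet.
make_filename() {
  printf '%s\n' "$1" | tr -d ' '
}

make_filename " ni hao "   # prints: nihao
```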

The next step is to create the script call using the file name. Create a new column titled “Script”. In the 2nd row, write the following formula:

 =CONCATENATE("sh ./zh2mp3.sh ",E2)

The result should be something like this: sh ./zh2mp3.sh 你好
There should be no spaces in the words to translate. If there are, they will affect the file names created.

Again, fill this down so you have one per line for your vocabulary.

The next step is to create a script file. Copy this column’s cells and paste them into a new file in TextWrangler. Save this as getmp3s.sh in the same folder as zh2mp3.sh.

Now go to the terminal window and run the script.

 $ sh ./getmp3s.sh

After some time, you should see mp3s appearing in the folder in your Finder window. When the process is complete, the terminal prompt will reappear.
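
As an alternative to pasting one line per word into getmp3s.sh, the same job can be sketched as a loop over a plain text file. This is only a sketch under assumptions of my own: a file with one word or phrase per line (called words.txt in the example), a hypothetical function name fetch_all, and a hypothetical FETCH_DELAY variable that sets the pause between downloads. By default it calls the zh2mp3.sh script from this post.

```shell
#!/bin/sh
# fetch_all WORDLIST [COMMAND...]  (hypothetical helper)
# Reads one word or phrase per line from WORDLIST, removes spaces, and runs
# COMMAND (default: sh ./zh2mp3.sh) once for each non-empty line.
fetch_all() {
  wordlist=$1
  shift
  # Fall back to the zh2mp3.sh script when no command is given.
  [ "$#" -gt 0 ] || set -- sh ./zh2mp3.sh
  while IFS= read -r line || [ -n "$line" ]; do
    # Strip spaces and any carriage returns left by a spreadsheet export.
    word=$(printf '%s' "$line" | tr -d ' \r')
    [ -n "$word" ] || continue
    # Redirect stdin so the command cannot swallow the rest of the word list.
    "$@" "$word" </dev/null
    # Pause between downloads; FETCH_DELAY is an assumed, optional setting.
    sleep "${FETCH_DELAY:-1}"
  done < "$wordlist"
}

# Example call (words.txt is a hypothetical file name):
# fetch_all words.txt
```

Passing a different command after the file name, for example `fetch_all words.txt echo`, prints the cleaned-up words without downloading anything, which is a cheap way to check the list first.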

These mp3s need to be copied manually to your ANKI media folder. Use Finder to copy every new mp3 file in this folder and paste them into your ANKI media folder. On a Mac, this is typically at /Users/username/Documents/Anki/User 1/collection.media
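
If you prefer the terminal to Finder, the copy can be sketched as below. The function name copy_mp3s is hypothetical, and the media folder path in the example is the typical Mac location mentioned above; yours may differ, for instance if your ANKI profile is not called "User 1".

```shell
#!/bin/sh
# copy_mp3s SRC_DIR MEDIA_DIR  (hypothetical helper)
# Copies every .mp3 file in SRC_DIR into MEDIA_DIR and leaves other files alone.
copy_mp3s() {
  src=$1
  dest=$2
  # Refuse to run if the media folder does not exist, rather than create it.
  [ -d "$dest" ] || { echo "media folder not found: $dest" >&2; return 1; }
  for f in "$src"/*.mp3; do
    [ -e "$f" ] || continue   # the pattern matched no mp3 files
    cp "$f" "$dest"/
  done
}

# Example call, run from the folder holding the downloaded mp3s:
# copy_mp3s . "$HOME/Documents/Anki/User 1/collection.media"
```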

You should now have the mp3s you need for each card.

The next part is here.


Create your own ANKI deck with automated translations and voices: Part 1

For the last 2 months I have been using the ANKI app to study Chinese vocabulary. It’s a learning system for words and phrases that uses electronic flash cards on PC, mobile and web. ‘Decks’ of cards can be uploaded to cloud servers and deployed to all of the devices you use. It makes use of a spaced repetition system, repeating incorrect words more often. I find the interactivity much more fun than reading through vocab lists, and it allows me to concentrate for longer on the usually boring part of word memorisation.

One problem I find is that my listening skills are significantly poorer than my speaking skills. I also found this with Japanese. It’s also a problem you can’t easily solve by yourself, as you really need a speaker to speak to you.

A solution I found was to make my own cards and attach voice samples of each word or phrase so I would recognise it being spoken. This has helped me greatly, and it is extremely convenient as I can use it on my Android phone as well as my Mac.
It might seem much easier just to go through an existing word list (ANKI provides some with voice samples of real people too), but I found that I was trudging through words I never use in conversation. It’s a pain because at this beginner stage I want to get up to a level where I can say what I want to. So I set to work on a wordlist that was meaningful to me. I did this by jotting down words I wanted to say during conversations but had to revert to English to say. My first list contained about 200 words.

Rather than manually entering these words and translations and recording voice samples for each word, I created a process to translate the words and capture the voice generated by Google Translate. The results were then imported into ANKI without my having to write a single card manually. The result is that a batch of 200 words could be translated, voiced, and added to ANKI in less than 10 minutes.

I used a Mac to do this, but it wouldn’t take much more to do it on Linux or Windows. Of course, you can swap the applications used (e.g. for Microsoft Excel, Word, etc.) with minimal changes to the steps.

** UPDATE: I made an app to help in the process of creating the flashcard data. It’s available in part 5. However, many of the steps for obtaining the data, and the theory behind creating flashcards, are still important.

The tools you need to do this (all free)

  • ANKI application version 2. You also need to register an account. You will need a PC client application in order to make use of the import feature.
  • Google Docs spreadsheet; requires a Google account.
  • Google Translate website.
  • TextWrangler: an app for editing text documents and saving out txt files, used for importing into ANKI.
  • *UPDATE: I made a Windows/Mac app to help with the process of creating card data and downloading mp3s, available here.


This tutorial has ended up being quite long, so I’ll split it up into different steps and posts:

Part 1:

Step 1. Create your word list in your native language

Step 2. Create a google spreadsheet and put in the words in the list.

Part 2:

Step 3. Create the translations for your wordlist

Step 4. Generate the sounds.

Part 3:

Step 5. Create the flashcard data

Part 4:

Step 6. Create a new deck and import the data to ANKI

 

Step 1. Create your word list in your native language

I personally came up with a word list of about 180 different words and short sentences. The list could come from other material you are studying (such as a book). In my case, I spent half an hour one day thinking of words I’d like to be able to say but couldn’t yet: everyday items, weather, hot/cold, place names, numbers, etc.

Here is the first list I came up with.

OK., Everything, once , No thank you., soon, two days from now , afterwards , Would you like that toasted ?, What is your hobby?, What is your phone number, Bus, underwear, Zoo, mouth , later , crying, spray bottle , recycle , floor , garbage , garbage bin, hair, password , chocolate , take away , often, bed , always , lazy, where can i take a shower? , I Made a mistake, hands , promise , Scooter, receipt , Nothing, Bright, Starbucks, Yes please, sometimes , chair, stairs , orange, every day, Hamburger, Car, Salad, Onion, washing machine , Mild, Pretty, Beef, Pork, dessert , username , battery, White, eyes , get divorced, Red, Carrot, get married , Green, Gondola, ears, earphones , backpack , Pepper, feet, bicycle , legs , stinky tofu , handsome, apple, Blue, Vegetables, Taxi, body, Spicy, Chili, drunk , blonde hair , keyboard , ladder, need to charge my cellphone , illegal , Bread, face, reservation , napkin , Chicken, Yellow, Dark, Black, nose, all the time , second hand , legal, eating in, cucumber, arms , speakers , table , table legs , fruit, mouse , login , bald , Blu-ray, screen, details , aisle , remote control , Change (money), tray , snack, megadrive, wifi, triangle , coupon, would you like a bag? , would you like it heated? , light, light switch , ice cream, refrigerator , knife , fork, tightfisted , listen to music , noisy , flavour, loud, lose, tasty , quiet , advert , strong , subtle, alarm, smoking, square , fashionable , wooden panel , tree, attention , flashy, fire extinguisher, toothbrush , toothpaste , sweet , camera , glasses , rectangle , pink , purple , tatoo , internet , stomach , winner , dirty , bag , warning , super Nintendo , loser , choice , bar / pub, metal, door, contact lens , biscuit , cigarette , circle , orange, air con, wrapper , potato , pasta , spoon , microwave , frying pan , nightclub , salty.

These words are all over the place, right? Right. It’s just a list of words that are useful to me in everyday conversation. They’re only for me and I don’t care that they aren’t on a specific topic.

Step 2. Create a google spreadsheet and put in the words in the list.

We will use the other columns later to hold the translations, and use cell formulas to build the data for ANKI.
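
If your list is comma-separated like mine above, it needs to become one entry per row before it goes into the spreadsheet. Here is a minimal sketch of one way to do that in the terminal; the function name split_words is hypothetical, and it assumes no single entry contains a comma of its own.

```shell
#!/bin/sh
# split_words (hypothetical helper): read a comma-separated list on stdin and
# print one trimmed entry per line, skipping empty entries.
split_words() {
  tr ',' '\n' | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example with the first few entries of the list:
printf '%s\n' 'OK., Everything, once , No thank you., soon' | split_words
```

The output can then be pasted into the first column of the spreadsheet, one word or phrase per row.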

[Image: Google spreadsheet with the English word list]

The next part is here.
