In what ways can you make Vocaloid2 (a program for singing in Japanese) sing in English more realistically?

(Where I give examples in this guide, I really recommend that you try it yourself with Vocaloid2 as you read along. It should make it more obvious what I'm talking about if you do that.)

Well, since Japanese is generally a phonetic language (i.e. you pronounce things exactly as they're spelt, with very few exceptions), put simply, you need to think of the closest sounds which exist in Japanese for the English word and use them instead.

There are actually 3 problems though. The first one is converting the normal English word into a word which can be spelt normally in Japanese. The second problem is that Vocaloid2 actually needs phonetic codes, and it converts the typed Japanese syllables into phonetic codes (which represent EXACTLY how to pronounce something). Actually, that second 'problem' will be used to our advantage later on. The third problem is that Vocaloid2 can work out these phonetic codes to pronounce ONLY 1 SYLLABLE per note. As you might know, when a single word in English is converted to the nearest-sounding Japanese, one English syllable usually becomes several Japanese syllables (e.g. 'man' -> 'ma, n'), yet we want these to be on only one note.

For example, the English word 'man' can be said completely correctly in Japanese, since 'ma' and 'n' both exist in Japanese. However, in phonetics, exactly what sound the character 'n' creates changes slightly depending on what character FOLLOWS the 'n'. There are 4 different ways (in Japanese, at least)...

(Note: Luckily, we do not need to remember these - Vocaloid2 will choose the correct phonetic code automatically, but this is just an example to show you that it's not obvious which should be used)
1 - If a sound which needs the mouth to start in an 'm' shape follows, such as an 'm' or 'p' or 'b', the actual spoken sound of the 'n' changes to that of an 'm', for example in the Japanese word 'senpai'. You can see this if you lay down 3 notes, one straight after another, in Vocaloid2, and type as their lyrics 'se', 'n' and 'pa'. The phonetic code in square brackets next to the 'n' will become 'm'.
2 - If what follows the 'n' is a sound which needs the mouth to start in an 'n' shape, such as an 'n' or 't' or 'd', the sound of the original 'n' is most similar to how 'n' is pronounced in English. You can see this if you change the last note in that example in Vocaloid2 into 'ta' - the phonetic code (in square brackets) changes to 'n'.
3 - If a vowel follows (on a second syllable), e.g. in "ren'ai" (4 syllables - re, n, a, i), then the 'n' is pronounced kind of nasally. If you change the last note in the Vocaloid2 example to an 'a', for example, you can see the phonetic code change to 'N\', which is what represents this nasal-sounding 'n'.
4 - For the single-syllable 'ni', the phonetic code becomes 'J i', the 'J' representing the n's sound in this case. I don't really hear any difference personally, but...
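
To make the first three cases concrete, here is a rough Python sketch. The function name n_code is just something I made up for illustration (it's not part of Vocaloid2), and it only knows the cases listed above - anything else raises an error rather than guessing. Case 4 is left out because it's about the whole syllable 'ni' rather than a stand-alone 'n'.

```python
# Hypothetical helper: picks the phonetic code for a stand-alone Japanese 'n'
# from the syllable that follows it, using only cases 1-3 from the list above.
def n_code(next_syllable: str) -> str:
    first = next_syllable[:1].lower()
    if first in ("m", "p", "b"):
        return "m"      # mouth starts in an 'm' shape, as in 'se n pa'
    if first in ("n", "t", "d"):
        return "n"      # like the English 'n', as in 'se n ta'
    if first in ("a", "i", "u", "e", "o"):
        return "N\\"    # nasal 'n' before a vowel, as in "ren'ai"
    # The guide does not describe any other case, so refuse to guess.
    raise ValueError("case not covered by the guide: " + repr(next_syllable))

print(n_code("pa"))  # m
print(n_code("ta"))  # n
print(n_code("a"))   # N\
```

Remember that Vocaloid2 does all of this for you - the sketch is only there to show the pattern.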

If we wanted 'man is' (in English) to be said, then I think what would sound most similar in Japanese are the syllables 'ma ni zu'. I know we don't want the 'u' sound on the end, but we can manually edit the phonetic code later to get rid of it. ;) We just need to put it there so that Vocaloid2 knows how to make the phonetic code, because the character 'zu' exists in Japanese but 'z' by itself doesn't.

So for this example we put down 3 notes. The first note's lyric we put as 'ma', the second's as 'ni', and the third we put as 'zu'. Now look at the phonetic codes which Vocaloid2's created for us - 'm a' (ma), 'J i' (ni), 'dz M' (zu). The phonetic code for the Japanese 'u' sound is 'M'. (Slightly strange, but there you go, I can't change that.)

Now listen to yourself say 'man is' in English. It turns out that I seem to say it with 'ma' as one syllable and 'nis' as a second syllable. We need to shove the third note's phonetic code (for 'zu') onto the end of the second note's phonetic code (for 'ni'), so that Miku will say it as closely as possible to how English speakers do.

Right-click on the 'zu' note and choose the bottom line with '(P)...' (Note's Properties). A new little box comes up; notice that there's a 'PHONETIC' section with 'dz M' typed below it. We need to copy-and-paste this onto the end of the previous note's PHONETIC section. Click in this note's (zu's) PHONETIC section ('dz M' will be selected), then right-click and choose Copy. Then choose Cancel (to the right of OK), just in case you'd made any changes which we didn't want to keep.

Now right-click on the second note ('ni') and choose '(P)...'. Click in its PHONETIC section twice, to get the cursor to the end of the 'J i' text. Put a space (all phonetic codes must be separated with a space), then right-click and paste the 'dz M' there.

When you 'click off' the PHONETIC section, such as by clicking on some black area of the box, notice that the check-box underneath the PHONETIC section becomes ticked. This checkbox says 'protect', and it means that Vocaloid2 will no longer work out phonetic codes automatically for this specific note, so as to protect (not change) the phonetic code which you have just manually entered. If you un-tick it and click OK, it will re-calculate the code back to 'J i' (from the LYRIC section, 'ni'). But we don't want to do this - we want the note to be pronounced as 'ni zu'. In fact, we don't want the 'u' sound either. So before you click OK, delete that 'M' (and the space before it, which is now pointless).

After you click OK, delete the 3rd note, which is now unnecessary. We were only using it as a temporary note to let Vocaloid2 work out the phonetic codes from a syllable for us, which we then took part of and put on a different note.

We're left with 2 notes: one says 'ma [m a]' and the other says 'ni [J i dz]'. Of course, the first text (outside the square brackets) means nothing to the program, and Vocaloid2 will just pronounce the phonetic code shown in square brackets while ignoring the 'ni' text. You could change this to 'nis' to help yourself remember what the phonetic code represents, if you wish, to avoid confusion. Double-click the note and type 'nis' - the phonetic code won't change automatically, because remember, the 'protect' check-box is ticked for that note. ;)

If you play right now, then you'll hear Miku (or whatever voice you're using) say something which sounds very like 'man is' in English, without any 'u' sound on the end, since we deleted it.
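
If it helps to see the 'man is' steps as plain data, here is a small Python sketch of the same thing. The helper merge_notes is hypothetical (I invented it for this illustration, Vocaloid2 has no such function); it just glues one note's codes onto another and drops the codes we don't want, which is what we did by hand in the Note's Properties box.

```python
# Hypothetical helper mirroring the manual copy-and-paste steps above.
def merge_notes(target: str, donor: str, drop=("M",)) -> str:
    """Append the donor note's codes to the target, minus unwanted ones."""
    kept = [code for code in donor.split() if code not in drop]
    # Codes must stay separated by single spaces.
    return " ".join(target.split() + kept)

# Codes which Vocaloid2 generated for the lyrics 'ma', 'ni', 'zu'.
notes = ["m a", "J i", "dz M"]

notes[1] = merge_notes(notes[1], notes[2])  # 'J i' + 'dz M' without the 'M'
del notes[2]                                # the temporary 'zu' note is deleted

print(notes)  # ['m a', 'J i dz']
```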

If you're still awake, I'm going to do a slightly harder-seeming example now (the English word 'boxes'). The phonetic codes needed to pronounce this are included in the Japanese syllables 'bo ku si zu'. However, the sound 'si' in Japanese is pronounced 'shi', so I'll put 'se' and 'i' and mix them together. Lay down 5 notes with the syllables 'bo', 'ku', 'se', 'i' and 'zu'. Note that a fast way to enter lyrics is to right-click on the first note and choose the line with '(L)...', which is for writing in whole lines of lyrics. A text-box will come up, in which you can type 'bo ku se i zu' (without the quotes, of course), and click OK. It will have entered each syllable into each note automatically.

Now the phonetics for each note have become 'b o', 'k M' (there's that 'u' sound again), 's e', 'i' and 'dz M'. If I say 'boxes' in English, to me it sounds like I'm saying the 2 syllables as 'bok siz' or 'bo ksiz'. I'll use the second one in this example. In this case, the first note's phonetic code is already correct - 'b o'. We want the second note to consist of a bit of all of the other notes. We want the 'k' sound from 'ku', the 's' sound from 'se', the 'i' sound, and the 'dz' sound from 'zu'.

So if you just right-click on the second note and choose '(P)...', then while looking at the other notes' phonetic codes in the background, you can enter them into this note's PHONETIC section. When done, you'll have made the second note's PHONETIC section be 'k s i dz'. Remember that spaces and correct capitalisation are vital. Now you can delete the last 3 notes because, again, we were just using them temporarily to steal their phonetic codes. :P If you make your chosen voice read this now, you'll hear a pretty clear English word 'boxes' sung.
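
Here is the 'boxes' example again as a short Python sketch, in case seeing it written out helps. The dictionary holds the codes which Vocaloid2 showed for each temporary note, and first_code is a made-up helper name that simply takes the first code of a syllable (the consonant) and throws away the rest.

```python
# Codes Vocaloid2 generated for the five temporary notes in this example.
generated = {"bo": "b o", "ku": "k M", "se": "s e", "i": "i", "zu": "dz M"}

def first_code(syllable: str) -> str:
    """Hypothetical helper: keep only the first code (the consonant)."""
    return generated[syllable].split()[0]

note1 = generated["bo"]  # already correct as it is
note2 = " ".join([
    first_code("ku"),    # 'k' from 'k M'
    first_code("se"),    # 's' from 's e'
    generated["i"],      # the 'i' sound, unchanged
    first_code("zu"),    # 'dz' from 'dz M'
])

print(note1, "|", note2)  # b o | k s i dz
```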

Note that in this second 'boxes' example, we were getting very close to the maximum number of phonetic codes able to be sung at once (that is, on one note). If you enter a lot of separate codes in the PHONETIC section, then the voice will try to sing as many as it can, but it often won't manage more than 4 (which is what this example is at) or 5. In this case, try to move some of the burden off this note, e.g. in this example, it would've sounded similar if we'd put 'b o k' on note 1 and 's i dz' on note 2, which would've used fewer codes on the single second note.
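
A quick way to compare two ways of splitting a word is to count the codes on the busiest note. The Python sketch below does that for the two 'boxes' splits mentioned above. SAFE_LIMIT is my own assumption based on the "4 or 5" I mentioned, not an official Vocaloid2 number, and busiest is a made-up name.

```python
SAFE_LIMIT = 4  # assumption: taken from the rough "4 or 5" figure in this guide

def busiest(notes: list[str]) -> int:
    """Return the largest number of phonetic codes found on any one note."""
    return max(len(note.split()) for note in notes)

original = ["b o", "k s i dz"]    # 'bo' + 'ksiz'
rebalanced = ["b o k", "s i dz"]  # 'bok' + 'siz'

print(busiest(original), busiest(rebalanced))  # 4 3
print(busiest(rebalanced) < SAFE_LIMIT)        # True
```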

Lastly, here are a few English-to-Japanese pronunciation tips/notes:
- When a consonant sound ends a word, or 2 consonants are side-by-side, use the closest consonant in Japanese followed by a 'u'. For example, 'hubble' would become 'ha bu ru'. In this case, it would probably sound best if we used the 'ha' sound for note 1 and 'bu r' for note 2. The only exception is the 'n' sound, which is able to have a consonant put after it without needing to add a 'u' between them.
- In many programs, 'tu' will make a 'tsu' sound, 'si' will make a 'shi' sound, 'hu' will make a 'fu' sound, 'du' will make a 'zu' sound, and 'zi' will make a 'ji' sound, but Vocaloid2 doesn't suffer from these problems. ;)
- That said, it can't handle 'hu' (it just can't make any phonetic code out of it - in Japanese, there's only 'fu'), so use 'he' and 'u' and combine their phonetic codes.
- If you want it to sound best, don't assume what a phonetic code should be based on what you can remember from past codes. It's not always obvious - for example, the code for 'na' is 'n a', but the code for 'ni' is 'J i'. It's always best to type the Japanese syllable and see what Vocaloid2 does, and use the codes which it makes. ;)
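
As a sketch of that last tip, you could keep a little table of only the codes you've actually seen Vocaloid2 produce, and refuse to guess anything else. The table below contains just the syllables used in this guide; the names OBSERVED and lookup are my own inventions for illustration.

```python
# Only codes which this guide actually saw Vocaloid2 generate.
OBSERVED = {
    "ma": "m a", "na": "n a", "ni": "J i", "zu": "dz M",
    "bo": "b o", "ku": "k M", "se": "s e", "i": "i",
}

def lookup(syllable: str) -> str:
    """Return a code seen in Vocaloid2, or fail loudly instead of guessing."""
    if syllable not in OBSERVED:
        raise KeyError(syllable + ": type it into Vocaloid2 and check the code")
    return OBSERVED[syllable]

print(lookup("na"))  # n a
print(lookup("ni"))  # J i
```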

- Robbi-985 aka SomethingUnreal