
Spot your Language: 3 Nigerian Languages, 107 Others Added To Google Translate


Google Translate on Thursday added 110 new languages to its platform, including three Nigerian languages.

The Nigerian languages are Fulani, Kanuri and Tiv. On the alphabetical list released by Google, Fulani is Number 35, Kanuri is Number 44 and Tiv is Number 96.

A blog post by Google published on Thursday said the new languages represent more than 614 million speakers, opening up translations for around 8% of the world’s population.

It says, “Some are major world languages with over 100 million speakers. Others are spoken by small communities of Indigenous people, and a few have almost no native speakers but active revitalization efforts.

“About a quarter of the new languages come from Africa, representing our largest expansion of African languages to date, including Fon, Kikongo, Luo, Ga, Swati, Venda and Wolof.”

Meteor had reported earlier in the day that Google had announced its largest-ever expansion of language support for Google Translate, a move that significantly bolsters the platform’s ability to break down language barriers and facilitate global communication.

Before this expansion, Google Translate supported 133 languages.

The addition of 110 new languages is attributed to advancements in AI technology, particularly Google’s PaLM 2 large language model.

According to a Google blog post on Thursday, PaLM 2 played a pivotal role in enabling Translate to learn languages that are closely related to one another, such as Awadhi and Marwadi, which are close to Hindi, and French creoles like Seychellois Creole and Mauritian Creole.

The post also gives insight into how Google selects languages for the translation service.

It says, “There’s a lot to consider when adding new languages to Translate — everything from what varieties we offer, to what specific spellings we use.

“Languages have an immense amount of variation: regional varieties, dialects, different spelling standards.

“In fact, many languages have no one standard form, so it’s impossible to pick a ‘right’ variety. Our approach has been to prioritize the most commonly used varieties of each language.

“For example, Romani is a language that has many dialects all throughout Europe. Our models produce text that is closest to Southern Vlax Romani, a commonly used variety online. But it also mixes in elements from others, like Northern Vlax and Balkan Romani.

“PaLM 2 was a key piece to the puzzle, helping Translate more efficiently learn languages that are closely related to each other, including languages close to Hindi, like Awadhi and Marwadi, and French creoles like Seychellois Creole and Mauritian Creole.

“As technology advances, and as we continue to partner with expert linguists and native speakers, we’ll support even more language varieties and spelling conventions over time.”

The post also lists the peculiarities of some of the new languages added to its portal.

It says, “Afar is a tonal language spoken in Djibouti, Eritrea and Ethiopia. Of all the languages in this launch, Afar had the most volunteer community contributions.

“Cantonese has long been one of the most requested languages for Google Translate. Because Cantonese often overlaps with Mandarin in writing, it’s tricky to find data and train models.

“Manx is the Celtic language of the Isle of Man. It almost went extinct with the death of its last native speaker in 1974. But thanks to an island-wide revival movement, there are now thousands of speakers.”

According to the post, NKo is a standardized form of the West African Manding languages that brings numerous dialects together into a shared form. Its distinctive alphabet, introduced in 1949, is used only for this language, and a dedicated research community continues to develop resources and technology for it.

The post adds that “Punjabi (Shahmukhi) is the variety of Punjabi written in Perso-Arabic script (Shahmukhi), and is the most spoken language in Pakistan.

“Tamazight (Amazigh) is a Berber language spoken across North Africa. Although there are many dialects, the written form is generally mutually understandable. It’s written in Latin script and Tifinagh script, both of which Google Translate supports.

“Tok Pisin is an English-based creole and the lingua franca of Papua New Guinea. If you speak English, try translating into Tok Pisin — you might be able to make out the meaning!”

In 2022, Google added 24 languages using Zero-Shot Machine Translation, a method in which an AI model learns to translate into a language without ever seeing example translations for it.

Here are the 110 newly supported languages in Google Translate:

1. Abkhaz

2. Acehnese

3. Acholi

4. Afar

5. Alur

6. Avar

7. Awadhi

8. Balinese

9. Baluchi

10. Baoulé

11. Bashkir

12. Batak Karo

13. Batak Simalungun

14. Batak Toba

15. Bemba

16. Betawi

17. Bikol

18. Breton

19. Buryat

20. Cantonese

21. Chamorro

22. Chechen

23. Chuukese

24. Chuvash

25. Crimean Tatar

26. Dari

27. Dinka

28. Dombe

29. Dyula

30. Dzongkha

31. Faroese

32. Fijian

33. Fon

34. Friulian

35. Fulani

36. Ga

37. Hakha Chin

38. Hiligaynon

39. Hunsrik

40. Iban

41. Jamaican Patois

42. Jingpo

43. Kalaallisut

44. Kanuri

45. Kapampangan

46. Khasi

47. Kiga

48. Kikongo

49. Kituba

50. Kokborok

51. Komi

52. Latgalian

53. Ligurian

54. Limburgish

55. Lombard

56. Luo

57. Madurese

58. Makassar

59. Malay (Jawi)

60. Mam

61. Manx

62. Marshallese

63. Marwadi

64. Mauritian Creole

65. Meadow Mari

66. Minang

67. Nahuatl (Eastern Huasteca)

68. Ndau

69. Ndebele (South)

70. Nepalbhasa (Newari)

71. NKo

72. Nuer

73. Occitan

74. Ossetian

75. Pangasinan

76. Papiamento

77. Portuguese (Portugal)

78. Punjabi (Shahmukhi)

79. Q’eqchi’

80. Romani

81. Rundi

82. Sami (North)

83. Sango

84. Santali

85. Seychellois Creole

86. Shan

87. Sicilian

88. Silesian

89. Susu

90. Swati

91. Tahitian

92. Tamazight

93. Tamazight (Tifinagh)

94. Tetum

95. Tibetan

96. Tiv

97. Tok Pisin

98. Tongan

99. Tswana

100. Tulu

101. Tumbuka

102. Tuvan

103. Udmurt

104. Venda

105. Venetian

106. Waray

107. Wolof

108. Yakut

109. Yucatec Maya

110. Zapotec
