<div style="font-family: arial; font-size: 14px;"><div style="font-family: arial; font-size: 14px;">I'm happy that my corpus is useful.<br></div><div style="font-family: arial; font-size: 14px;"><br></div><div style="font-family: arial; font-size: 14px;">I'm trying to keep it up to date and add new sources regularly. It's great that the Klingon community is active and that so many Klingon texts are produced each year. The corpus currently has almost 500,000 words, which is a lot for a constructed language. The largest Esperanto corpus has about 10 million words, the largest Lojban corpus 7 million, so we are not there yet, but at the rate Klingon text is being produced we are not far from one million words. (To see how much text is added each year, see this picture: <a href="https://korpus.klingonia.fi/timeline.svg">https://korpus.klingonia.fi/timeline.svg</a>)<br></div><div style="font-family: arial; font-size: 14px;"><br></div><div style="font-family: arial; font-size: 14px;">The Tatoeba corpus contains a lot of typos, which is understandable given how big it is. If you find a typo, click the link to go to the sentence's page on tatoeba.org and correct the sentence (or leave a comment if you don't have the rights to edit sentences).<br></div><div style="font-family: arial; font-size: 14px;"><br></div><div style="font-family: arial; font-size: 14px;">As a general warning, many sources in my corpus contain errors. Some of the texts are very old, some are poetic, and others just have typos or grammatical errors. The corpus is useful for the scientific study of Klingon usage and could also be used as an educational tool when learning Klingon, but it's important to know that it doesn't try to be a corpus of good Klingon; it tries to be a corpus of all Klingon. There are errors. 
I have tried to organize the sources so that the ones with the most errors are shown in red.<br></div><div style="font-family: arial; font-size: 14px;"><br></div><div style="font-family: arial; font-size: 14px;">Iikka "fergusq" Hauhio<br></div><div style="font-family: arial; font-size: 14px;">https://klingonia.fi/en</div></div><div style="font-family: arial; font-size: 14px;"><br></div><div class="protonmail_quote">
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐<br>
On Saturday, January 22nd, 2022 at 01.42, James Landau &lt;savegraduation@yahoo.com&gt; wrote:<br>
<blockquote class="protonmail_quote" type="cite">
<div style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;" class="yahoo-style-wrap"><div data-setdir="false" dir="ltr"><div>I just found a link to Iikka Hauhio's Klingonia Corpus from the Klingon Wiki. For those who haven't seen it before, it's up at: https://korpus.klingonia.fi/<br><br><br>It's great to see *HarqIn* (which was my request) getting use. (I've also found *HarqIn* in a Klingon blog by googling, and I saw 'enru mentioning the word in the comments section for a chabal tetlh request lately.) I didn't get any hits for *DannI'* nor for *rosmaH*, though.<br><br><br>Very cool to see mayqel's texts on the Greek gods in the corpus! I've come across them by chance when doing Google searches before, so obviously they're from the webpages subcorpus.<br><br><br>I think I may have found a mistake in the Tatoeba sentences, though. When I did a search on *loDHom*, I found this sentence. "Both boys have autism" is translated as *ngor cha' loDHompu'vam*. Shouldn't *ngor* be *ngur*?<br><br><br>majQa', Iikka!</div><div><br></div></div></div>
</blockquote><br>
</div>