Jun. 17th, 2009

resa: (work)

Okay, now let's talk corpus. At work, we compile a text corpus, that is, a collection of texts in the variety of English spoken in Ghana. It is amazing! It is cool! It is funny! ... It is sometimes verrrrry difficult.

Allow me to introduce our corpus work in a few sentences:

We scan or type up texts (school books, novels, newspaper reports, etc.) and tag them. Headlines get <h></h> around them, paragraphs <p></p>, and so on and so forth. That's easy. It gets difficult when we have to decide what to include in the corpus and what not. For example, we have a quote mark-up (<quote></quote>, who would have thought?) for words and sentences which were not originally written by the author. We would not want to include a universal translation of an utterance by Socrates in our corpus of Ghanaian English, would we? We keep such sentences in the text, but we do not count them, and we include some more Ghanaian English words to make up for them. This all has to do with word counts, really: we have different categories of text genres, and each category includes a set number of texts of two thousand words each (that's the important part for the <quote> stuff) to finally get one million words altogether. Speaking of which, another example of the importance of the <quote> tag would be our massive bulk of self-help books, which consist 50% of Bible quotes. If we took only these books and did not subtract the word count of the Bible quotes, we would have a half-Bible, half-Ghanaian-English corpus. That's why we mark Bible quotes, for example, as so-called extra-corpus material and do not include their word counts in the corpus itself.
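For the curious, here is a minimal sketch of how such a count could work. It is not our actual tooling: the function name `count_words`, the class name, the sample text, and the choice of Python's standard `html.parser` are all assumptions made for illustration. Only the <h>, <p> and <quote> tags come from our real mark-up. Words inside <quote> are tallied separately as extra-corpus material, everything else counts towards the corpus.

```python
from html.parser import HTMLParser

# Tags whose content is treated as extra-corpus material.
# Only <quote> is listed here; this set is an assumption for the sketch.
EXTRA_CORPUS_TAGS = {"quote"}


class CorpusWordCounter(HTMLParser):
    """Counts whitespace-separated words inside and outside extra-corpus tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0         # how many extra-corpus tags are currently open
        self.corpus_words = 0  # words that count towards the corpus
        self.extra_words = 0   # words kept in the text but not counted

    def handle_starttag(self, tag, attrs):
        if tag in EXTRA_CORPUS_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in EXTRA_CORPUS_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        n = len(data.split())
        if self.depth:
            self.extra_words += n
        else:
            self.corpus_words += n


def count_words(text):
    """Return (corpus_words, extra_corpus_words) for one tagged text."""
    counter = CorpusWordCounter()
    counter.feed(text)
    counter.close()
    return counter.corpus_words, counter.extra_words


# Hypothetical sample in the style of a self-help book.
sample = (
    "<h>Be Strong</h>"
    "<p>My brother, remember this: "
    "<quote>The Lord is my shepherd</quote> and keep going.</p>"
)
print(count_words(sample))  # (9, 5)
```

A text with lots of quotes would then need more running text before it reaches its two thousand corpus words, which is exactly the effect described above.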

With this little introduction to corpus work out of the way, on to my confusing little text passage.

Right now, I'm finding my way through a handwritten exam on modern poetry (which is nothing compared to the pain-in-the-ass-ish biochemistry exams, really...) and am standing dumbfounded before the saying "Dulce et decorum est". It is not Ghanaian English, that's for sure. But there are so many possibilities still:

1. It is a sentence in a foreign language which deserves the <foreign> tag.
2. It is a syntactically complete sentence, which also deserves the <X> tag, but it is integrated into the Ghanaian student's ongoing sentence, which normally only gets the <quote> tag.
3. To complete the problem, the passage originally reads "the saying 'Dulce et decorum est'", which actually deserves an additional <mention> tag, too.

This is so not funny. U., N. and I will rack our brains talking about this together later. We can apply multiple tags to such sentences, but at a certain point it gets ridiculous to tag, tag, and tag them further. I'm curious which tag we will leave out.

And you know what? I am so thankful that Latin is not an indigenous language of West Africa. Because we have an <indig> tag as well...

resa: (Default)

Due to the bus schedule here in the middle of nowhere, which is the Philosophikum I, you get a quick real life minus work update from me, too. :-)

A few days ago, I got a postcard and some money from my grandparents, which should make me happy but sadly does not. In a very twisted way, this is not a present from my grandmother to me, but a way for her to say "look, I'm so generous, thank me, my little girl". I have a hard time dealing with it and I have a hard time dealing with the fact that I have a hard time dealing with it.

On to happier things!

Today, we started talking about The Vagina Monologues in class and I finally got a place to express my enthusiasm about the play with like-minded readers face-to-face. We read some critical texts on the play and some of the monologues in class, too, and our teacher did an amazing Queens accent.

If I'm lucky, one of my roommates was at home this morning to greet the postman bringing me my new laptop. Then tomorrow, D. will help me transfer all my stuff from the old one to the new one and my life will be much easier and more pleasurable again.

Aaaaaand the last two chapters of "Breaking Points" will be posted somewhere tonight/tomorrow morning. I'm so excited! <3

See you!
