Category Archives: Stuff

Various other stuff. For example software.

A Doodle from Work

July 5, 2012 · Stuff · abbreviations, Bible, doodles, incunabulum, Tahano Hikamu · Carsten

Since the beginning of June, I’ve got a job as a student assistant at the Medieval German Philology department at my university. As a part of this job, I have been doing some proofreading of various articles recently, and for the past couple of days I have been working on a particularly annoying one. In order not to go crazy over the umpteenth malapropism or literally translated idiom that renders the current sentence incomprehensible – the author isn’t a German native speaker and their command of the language isn’t great – my mind needs some digression once in a while. So during one of those little breaks, I doodled the note on the left today …

Beginning of John 1:1 in a copy of the Gutenberg Bible — “Gospel according to John.” *Gutenberg Bible.* Vol. 2. Mainz, ca. 1455. 235^rb. National Library of Scotland, n.d. Web. 5 Jul. 2012. ‹http://digital.nls.uk/74624118› (Image published under *CC-BY-NC-SA* license; cropped)

This isn’t supposed to become canon, but I was just fooling around, under the influence of a reproduction of page 235^r (John 1:1 ff.) of the 42-line version of the Gutenberg Bible that’s hanging framed over my desk … And a pretty doodle it became, I think, so I thought I’d share. Compare for scribal abbreviation goodness (even in early printing!) in the picture on the right, which is a snippet from said framed page.

What you can see in my doodle are abbreviations for some of the most frequently occurring case markers, which I assume would be a likely target of abbreviation if space were limited, or if writing materials were expensive – as was the case for parchment in the Middle Ages. However, since Ayeri is already written with an abugida, I guess that there would not be as much abbreviating as with the Latin alphabet, since abugidas already condense a lot of information to diacritics. What I would expect, however, is leading (/ˈlɛdɪŋ/, the space between lines) to be reduced to a hardly legible minimum, since all the diacritics need vertical breathing space that you would probably rather not waste under some circumstances.


Simple Interlinear Glosses Shortcode Plugin for WordPress

February 13, 2012 · Stuff · conlang, gloss, interlinears, linguistics, meta, PHP, plugin, software, wordpress · Carsten

I don’t actively develop this plugin anymore, since things have so far been working reasonably well for my purposes. For something more modern and JavaScript-based, you may want to have a look at Leipzig.js.

[Last updated: 2012-02-16] For a long time I’ve been slightly annoyed at having to format interlinear glosses in HTML by hand. I had hoped that there would be a plugin at least for WordPress, as a widely used content management system, that would do things automatically and to my liking. But as far as I can tell, nobody has published anything like that so far. Thus, I finally tried and programmed a shortcode plugin for very simple interlinear glosses myself, hoping that it may be useful for others, too. Especially blogging linguists and conlangers, of course 🙂

Name-dropped and UDHR Article 1 Translation

December 22, 2011 · Scripts, Stuff · Dothraki, New York Times, Omniglot, Public Relations, Tahano Hikamu, Tahano Nuvenon, Translation Challenge, UDHR, Universal Declaration of Human Rights · Carsten

Got a mention by fellow conlanger David J. Peterson (along with a few other Conlang-L/LCC4 people) in his reply to the recent New York Times article on his inventing Dothraki for the Game of Thrones TV series and the hobby of “con-langing” (their spelling), both of which I found to be good reading. Also, since Simon Ager of omniglot.com kindly updated some information about two scripts of mine – which was more than overdue after (I think) about 7 years – I uploaded a translation (PDF warning) of the first article of the United Nations’ Universal Declaration of Human Rights to serve as an example on Simon’s page on Tahano Hikamu.

Wuggy and Ayeri

July 8, 2011 · Stuff · experiment, how-to, software, Twitter, word-generation, Wuggy · Carsten

I permanently deactivated my Twitter account in January 2023. Links to the account below aren’t functional anymore.

So I’ve decided to give a go to this new-fangled web service called “Twitter,” about which the kids have been so frantic for the past couple of years.[1. You can follow me, if you wish, at @chrpistorius] To be honest, Twitter has completely passed me by so far, as I didn’t really see the sense in it when you can share links and leave short notices about ideas that have just popped into your head on Facebook just as easily.

However, this article is not supposed to be about my experiences with Twitter, but rather about something that I’ve found on Twitter thanks to @janMato: A little Python program called Wuggy, developed by folks at the Center for Reading Research of the Department of Experimental Psychology at Ghent University, Belgium. This program allows users to create pseudo-words tailored to the specific sound and syllable structure of languages, e.g. to be used for cognitive research, like English ‘wug’, which is to be pluralized in the classic test. JanMato suggested it would be useful for conlanging, too.

In my last post, I had already mentioned that I sometimes use a list of generated words to choose from when I can’t think of a suitable word for some meaning off the top of my head. The list I use currently was generated with a program called kwet, based on the data gained from a rather tedious, search-and-replace heavy analysis of my language’s dictionary that I conducted last year. I’ve been meaning to program a PHP script that can do this analysis automatically (and never did), but I thought that if this program can actually analyze a dictionary you give it in order to generate similar non-words, I might give it a go anyway.

So, how do we get our own words into the program? Wuggy is built in Python, as I said, so its code should be rather straightforward (unlike PERL code, which is simply not to be read). What you need is a bunch of files:

./plugins/subsyllabic_yourlang.py
./plugins/orth/yl.py
./data/yourlang_dictionary.txt

Fortunately, you can copy an existing subsyllabic_*.py and orth file and simply change the name of the language in the files to something more appropriate. You must also register the module for your language in ./plugins/__init__.py by simple analogy with the other entries. The dictionary files Wuggy uses are just flat TXT files where columns are divided with a tab:

word ↦ word’s syllabification ↦ some numeric value I don’t know how to obtain but just write 1

I got this file by simply dumping the ‘pronunciation’ field of my MySQL database into a CSV file which I edited to use orthography instead of phonemic transcription (not very difficult in Ayeri) etc. to fit Wuggy’s format. In order to get Wuggy to do something, you will also need to provide some input to generate words from – I just use my dictionary file with the last field (the one with ‘1’ at the end of each line) removed.
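The file preparation described above can be sketched in a few lines of Python. This is only an illustration, not part of Wuggy: the function names `dictionary_lines` and `input_lines` are made up for this example, and it assumes that the first column is simply the syllabified word with the hyphens removed and that the third column can be filled with the placeholder 1, as above.

```python
def dictionary_lines(syllabified_words, placeholder="1"):
    """Build tab-separated lines: word, syllabification, placeholder value.

    Duplicates are dropped while keeping the original order.
    (Hypothetical helper, not a Wuggy function.)
    """
    unique = dict.fromkeys(syllabified_words)
    return [f"{s.replace('-', '')}\t{s}\t{placeholder}" for s in unique]


def input_lines(dict_lines):
    """Strip the last tab-separated field to get the input list."""
    return [line.rsplit("\t", 1)[0] for line in dict_lines]


if __name__ == "__main__":
    lines = dictionary_lines(["a-ra", "da-lang", "en-van", "a-ra"])
    print("\n".join(lines))               # dictionary file contents
    print("\n".join(input_lines(lines)))  # input file contents
```

Writing the two lists to disk, one line each, gives the dictionary file and the input file, respectively.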

If all prerequisites are installed and you execute Wuggy.py, there’ll be a dialog window with some options. If the creation of Yourlang’s module has worked out, you should see “Subsyllabic Yourlang” as an option in the “Language Modules” drop-down menu and if you choose Yourlang, a progress bar thing should pop up shortly, which tells you about loading and generating some values. I ran my whole dictionary shortened to unique entries (~1800) through the program this afternoon, creating 3 alternatives per word. Here’s some of the output (left original, right generated):

a-ra kri-bay
ba-br= de-bi
ban-te-b= nil-pu-r=
da-lang kra-nu=
en-van tay-tran

This doesn’t look bad so far, but annoyingly, it would recreate the same words dozens of times for the same or similar original syllables even with the dissimilarity from the original word set to 1 out of 3. Also, this occurred frequently:

bu-rang lu-u=
e-rar me-ik
i-lon le-in
kay-ra lē-o
ma-kim ka-os

The problem here is that Ayeri allows words to begin with vowels, but the likelihood of two vowels in a row mid-word is rather small. Also, words can’t end e.g. in [p t k], and long vowels are the result of an allophonic process triggered by morphotactics rather than lexicalized. I don’t know, though, how to tweak my module files to avoid this, unfortunately.
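One workaround that leaves the module files alone would be to filter the generated list afterwards. The following is a minimal sketch under a few assumptions: the output is hyphen-syllabified as in the samples above, the three constraints just mentioned are treated as hard rules (although vowel sequences are merely unlikely, not impossible), and `is_plausible` is a made-up name, not anything Wuggy provides.

```python
import unicodedata

VOWELS = set("aeiou")
LONG_VOWELS = set("āēīōū")       # long vowels are not lexical, so reject them
FORBIDDEN_FINALS = set("ptk")    # only the finals named above; extend as needed


def is_plausible(syllabified):
    """Return True if a hyphen-syllabified word passes the three checks."""
    word = unicodedata.normalize("NFC", syllabified.lower())
    if any(ch in LONG_VOWELS for ch in word):
        return False
    syllables = word.split("-")
    if syllables[-1][-1:] in FORBIDDEN_FINALS:
        return False
    # Reject a vowel-final syllable directly followed by a vowel-initial one.
    for left, right in zip(syllables, syllables[1:]):
        if left[-1:] in VOWELS and right[:1] in VOWELS:
            return False
    return True


if __name__ == "__main__":
    generated = ["kri-bay", "de-bi", "tay-tran", "me-ik", "le-in", "lē-o", "ka-os"]
    print([w for w in generated if is_plausible(w)])
```

This does nothing about the repeated words, but removing duplicates from the filtered list takes care of those as well.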

Tahano Hinyan and Daléian alphabet

June 23, 2011 · Scripts, Stuff · alphabet, Daléian alphabet, Javanese script, software, Tahano Hikamu · Carsten

Tahano ~~Hikamu “Java”~~ Hinyan[1. The italic variant of my Tahano Hikamu font, Tagāti Book, is modelled after this style. — 2012-11-09]

For some time now I’ve played around with a style for Ayeri’s native writing system Tahano Hikamu that I loosely based on the look of the Javanese script (which I’ve already mentioned in a previous posting). I made several examples before using an experimental font, but the style has not been documented anywhere so far. However, the file is up now as a kind of brochure/leaflet/thing intended to be a supplement to the “Alphabet” page. That is, I spared myself the work of repeating the explanations, so most of the file’s content is really just a table of the different characters with their names underneath.

Download (PDF, 1.1 MB)

For those who are curious: The outlines of the characters were drawn in Illustrator, the brochure was made in InDesign, the stock photos are from the wonderful stock.xchng. This all also explains the file size, by the way.

Daléian Alphabet

Daléian script example

Note that I’ve also put up information for the Daléian script again. I just printed out the relevant page from my old website as a PDF file. The low resolution of the images is suboptimal for this purpose, but I suppose it’s still good enough to read on screen and get an idea.

Download (PDF, 143 KB)

Tahano (Nu)Veno(n) font

March 18, 2011 · Scripts, Stuff · alphabet, copyright, decoration, embellishment, experiment, font, ornamental, software, Tahano Nuvenon, Vine Script · Carsten

Tahano (Nu)veno(n) Just in case this was missed by anyone … If you’ve been following my work you may remember I used to have this Vine Script thing, which was an ornamental alphabet that took inspiration for its characters from climbing plants. Rebecca Bettencourt, fellow Conlang-L reader, made a font of it last year. We agreed that it would be freeware and that I could offer it for download. I’ve not done so up to now.

I earlier deleted information on the script here as I didn’t see it as directly related to Ayeri anymore, but I still used to receive requests about it occasionally. Since the alphabet itself is kind of pretty – although I’ve only ever seen it as a minor experiment and never actively used it to write longer passages in Ayeri because it is simply too unwieldy for that purpose – I didn’t want it to get lost completely. Essentially, you may want to think of it as the equivalent of an EP in music.

The font is self-transcribing, basically. There’s also a page on Omniglot about the script as it was digitized; you may want to check that for documentation, as well as the Readme file included in the ZIP archive.

Download (MD5: e9228a56fefadccce3c1abda8bc4456e; 59,355 bytes)

The description of Tahano Veno from the old Benung page can be downloaded as a PDF as well. It doesn’t significantly differ from the description at Omniglot, though.

Base converter shortcode WordPress plugin

March 14, 2011 · Stuff · base conversion, meta, PHP, plugin, software, wordpress · Carsten

A plugin to convert numbers in base 10 to other number bases, including their decimal places. You can insert this into pages and posts with a shortcode:

[base no="NUMBER" base="BASE TO CONVERT TO" prec="DECIMAL PLACES" show="SHOW BASE AS INDEX"]

no is a real number, decimal places are set off by a dot, so a valid input is e.g. 12.3456.
base is the base you want to convert to as an integer, e.g. 12.
prec is an integer number that defines how many places you want after the decimal point. If you leave this out, the number returned will have the same number of places after the dot as the input. A valid input would be e.g. 3.
show is a boolean value, i.e. either 0 for ‘no’ or 1 for ‘yes’, that enables or disables showing the base of the result as an index after it, like 10A₁₂. If you leave this out, 0 will be assumed.

For example, the following codes give the following results:

[base no="1234" base="2" show="1"]
→ [base no=”1234″ base=”2″ show=”1″]

[base no="123.4567" base="16" show="0"]
→ [base no=”123.4567″ base=”16″ show=”0″]

The script can also round up or down:

[base no="0.142857" base="12" prec="20"]
→ [base no=”0.142857″ base=”12″ prec=”20″]

The same, shortened to 3 places:
[base no="0.142857" base="12" prec="3"]
→ [base no=”0.142857″ base=”12″ prec=”3″]

The same, shortened to 5 places:
[base no="0.142857" base="12" prec="5"]
→ [base no=”0.142857″ base=”12″ prec=”5″]
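For those curious about the arithmetic behind such a conversion, here is a rough sketch in Python. It is not the plugin’s PHP code: the function name `to_base` is made up, the parameters merely mirror the shortcode attributes, and rounding half up at the last place is an assumption about how the rounding could be done.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_base(no, base, prec=None, show=False):
    """Convert a base-10 number given as a string to another base.

    no   -- decimal number as a string, e.g. "12.3456"
    base -- target base, an integer from 2 to 36
    prec -- places after the point; defaults to the places in `no`
    show -- append the base as a subscript index if True
    """
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if prec is None:
        prec = len(no.partition(".")[2])
    value = Fraction(no)  # exact, so no floating-point noise
    sign = "-" if value < 0 else ""
    # Shift the wanted places left of the point and round half up.
    scaled = int(abs(value) * base**prec + Fraction(1, 2))
    digits = ""
    while scaled:
        scaled, remainder = divmod(scaled, base)
        digits = DIGITS[remainder] + digits
    digits = digits.rjust(prec + 1, "0")
    whole, frac = (digits[:-prec], digits[-prec:]) if prec else (digits, "")
    result = sign + whole + ("." + frac if frac else "")
    if show:
        result += str(base).translate(SUBSCRIPTS)
    return result


if __name__ == "__main__":
    print(to_base("1234", 2, show=True))
    print(to_base("0.142857", 12, prec=3))
    print(to_base("0.142857", 12, prec=5))
```

Working with exact fractions rather than floats keeps the digits after the point from drifting when many places are requested.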

Download