Tag Archives: software

Markov-Chain Generator for Ayeri Words

April 26, 2017 | Stuff | Python, software, vocabulary, word-generation | Carsten

Since I’m sometimes a little too lazy to come up with new words, I wrote myself a little Python script which pulls a certain subset of words from the dictionary database I’m using and applies a Markov chain algorithm to it in order to generate new, similar words. The script is sophisticated enough to filter out duplicates and some other undesirable outcomes. You can adapt the code shared below to your needs if you wish.
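To give a rough idea of the technique, here is a minimal sketch of a character-level Markov chain word generator. It is not the script from the Gist: the function names, the order of 2, and the length limits are assumptions made for illustration, and a hard-coded list of a few Ayeri words (taken from the Wuggy post further down this page) stands in for the database query.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols, assumed not to occur in real words

# Stand-in for the words pulled from the dictionary database.
SAMPLE_WORDS = ["ara", "dalang", "envan", "burang", "erar", "ilon", "kayra", "makim"]


def build_model(words, order=2):
    """Map every `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def generate_word(model, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until END is drawn.

    If max_len is reached first, the word is simply cut off there.
    """
    context, letters = START * order, []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(letters)


def generate_new_words(words, count=10, order=2, max_tries=1000, seed=None):
    """Return up to `count` generated words that are neither known nor repeated."""
    rng = random.Random(seed)
    model = build_model(words, order)
    known, fresh = set(words), []
    for _ in range(max_tries):
        if len(fresh) >= count:
            break
        candidate = generate_word(model, order, rng=rng)
        # Filter out very short results, existing words, and duplicates.
        if len(candidate) >= 2 and candidate not in known and candidate not in fresh:
            fresh.append(candidate)
    return fresh


if __name__ == "__main__":
    print(generate_new_words(SAMPLE_WORDS, count=5, seed=1))
```

A higher order makes the output stick more closely to existing words, while a lower one produces wilder results; with a list this small, the number of distinct new words is naturally limited.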

See file on GitHub Gist

Tagāti Book G, Graphite and TeXLive 2014

May 24, 2015 | Asides, Stuff | font, Graphite, LaTeX, software, Tahano Hikamu | Carsten

I updated my operating system to Ubuntu 15.04 ‘Vivid’ the other day and ran into trouble when I tried using my Tahano Hikamu font, Tagāti Book G. The issue was that XeLaTeX would ignore Graphite as the text rendering engine for this font in spite of my explicitly declaring it:

\newfontfamily{\Tagati}[
    Renderer=Graphite,
    ...
]{Tagati Book G}

The result was that none of the diacritics were aligned correctly, since the font is not configured for OpenType to handle them:

Demonstration of the bug in fontspec-xetex.sty 2.4a — a: Rendering as expected; b: Graphite ignored

After some research, it turned out that the bug is in the file fontspec-xetex.sty.[1. On my system, the path to the file is /usr/share/texlive/texmf-dist/tex/latex/fontspec/fontspec-xetex.sty; otherwise, use kpsewhere fontspec-xetex.sty to find it.] Ubuntu 15.04 still ships with TeXLive 2014, which includes version 2.4a of this file as part of the fontspec package. In this version, there is a typo in the definition for Graphite which apparently makes it inaccessible through the Renderer option. You can read up on it in the bug report on GitHub.

Changing fontspec-xetex.sty according to the bug report and saving it under my home directory’s TeX tree at ~/texmf/tex/latex/fontspec/ to not overwrite the original file solved the issue for me. Another way to solve the issue for the time being is to include a snippet of code in your TeX file’s preamble that basically redefines the respective function.

The issue is already fixed in the latest version of the fontspec package, including the version available from CTAN, so I hope the fontspec package in the official Ubuntu repositories will be updated sometime as well.

Ayeri number word converter in Python

February 5, 2015

Just a little something I started programming a while ago and elaborated on to make it usable today. It’s a little Python program which turns a base-10 number into its corresponding number word in Ayeri: Ayerinumbers.py on GitHub

The code is probably a bit unnecessarily complicated, because it tries to mimic Ayeri’s grammar rather than being mathematically straightforward. I may rewrite things in the future to make more sense from an engineer’s point of view.

Tagāti Book G Font is up for Download

August 12, 2012 | Scripts, Stuff | font, Graphite, software, Tahano Hikamu | Carsten

Finally! After a couple of weeks of drawing characters (albeit in a rather lazy way) in February and March, and programming font features for the past couple of weeks, I decided to upload my Tahano Hikamu font to Github: https://github.com/carbeck/tagatibookg.

There are still some things to improve, but for the most part, the font works now. Please be aware that this font uses Graphite and that not many applications support it. Also, note that in order to use Graphite in Firefox 11+, you will need to activate it first.

The Github repository contains all files used in the making of the font so you can easily clone/download it. But if you really just want the font, you probably want to just

Download the ZIP archive

For some extra fun, here’s basically how I made it: See the video on Youtube

I had the ZIP file in my Dropbox ‘Public’ folder; however, Dropbox dropped support for the Public folder a while ago, so the link was broken. I have fixed it now.

Some Work in Progress

August 6, 2012 | Scripts, Stuff | experiment, font, Graphite, software, Tahano Hikamu, work in progress | Carsten

I’ve been reworking my font of Tahano Hikamu since February now and also drew a hinyan version (“Tahano Hikamu Java”) completely from scratch. When I felt like toying around with these things again a couple of weeks ago, I started making the files functional with Graphite – that is, I added ways to handle diacritics and I’m currently working on getting dynamic diacritic replacement and character reordering right – this is so much easier and far less brain-twisting with pen and paper!

The whole thing is still messy and highly preliminary, which is why I won’t release any font files for download just now (please be patient). However, I’m kind of pleased with how this experiment comes along, so I wanted to share the link to my current testing page here as well and not just on Twitter.

There’s no version schedule, so it’s done when it’s done. Hopefully that won’t take too long in spite of a pending term paper and other more important work. I’m looking forward to it, though, and so can you. 🙂

It’s up for download now.

Simple Interlinear Glosses Shortcode Plugin for WordPress

February 13, 2012 | Stuff | conlang, gloss, interlinears, linguistics, meta, PHP, plugin, software, wordpress | Carsten

I don’t actively develop this plugin anymore, since things have so far been working reasonably well for my purposes. For something more modern and JavaScript-based, you may want to have a look at Leipzig.js.

[Last updated: 2012-02-16] For a long time I’ve been slightly annoyed at having to format interlinear glosses in HTML by hand. I had hoped that there would at least be a plugin for WordPress, as a widely used content management system, that would do things automatically and to my liking. But as far as I can tell, nobody has published anything like that so far. Thus, I finally tried and programmed a shortcode plugin for very simple interlinear glosses myself, hoping that it may be useful for others, too, especially blogging linguists and conlangers, of course 🙂 Continue reading Simple Interlinear Glosses Shortcode Plugin for WordPress →

Digitale Typografie für fiktionale Schriftsysteme – ein Rant

September 7, 2011 | Scripts, Soapbox | alphabet, Beschwerde, Deutsch, essay, experiment, font, Leid, Schriften, software, Tahano Hikamu | Carsten

This is the translation of an English-language post (click for English version) that I wrote back in August 2011. Since there seemed to be considerable interest in that post, I thought it might make sense to translate it into German as well.
By now, I have also put together a font with the help of Graphite.
Note that I am not even a semi-professional type designer. Everything you read here is learning by doing and therefore very subjective. So far, I have taught myself no more about type design than is necessary to implement my own script.

One of my ongoing projects connected with language construction is to bring my constructed language’s writing system to the computer. I have been trying for several years to find workable solutions, but have always run into a wall sooner or later.
Continue reading Digitale Typografie für fiktionale Schriftsysteme – ein Rant →

Digital Typography for Fictional Writing Systems – A Rant

August 8, 2011 | Scripts, Soapbox | alphabet, essay, experiment, font, rant, software, Tahano Hikamu, woes | Carsten

This article still gets accessed a lot even over 10 years after publishing it. Technology, however, continuously advances, so please be aware that the information below may be outdated.

This article is now also available in German, namely here.
By now, I’ve made a font that uses Graphite.
Keep in mind that I’m not even a semi-professional font designer. All you read here is my subjective experience in learning by doing. I haven’t yet explored font-making beyond what I needed for my own stuff.

One of my ongoing language-construction related pet projects is to bring my constructed language’s writing system to the computer. I have been trying to come up with workable solutions to do this for a number of years, but always hit brick walls sooner or later. Continue reading Digital Typography for Fictional Writing Systems – A Rant →

Wuggy and Ayeri

July 8, 2011 | Stuff | experiment, how-to, software, Twitter, word-generation, Wuggy | Carsten

I permanently deactivated my Twitter account in January 2023. Links to the account below aren’t functional anymore.

So I’ve decided to give a go to this new-fangled web service called “Twitter,” about which the kids have been so frantic for the past couple of years.[1. You can follow me, if you wish, at @chrpistorius] To be honest, Twitter has completely passed me by so far, as I didn’t really see the sense in it when you can share links and leave short notices about ideas that have just popped into your head on Facebook just as easily.

However, this article is not supposed to be about my experiences with Twitter, but rather about something that I’ve found on Twitter thanks to @janMato: a little Python program called Wuggy, developed by folks at the Center for Reading Research of the Department of Experimental Psychology at Ghent University, Belgium. This program allows users to create pseudo-words tailored to the specific sound and syllable structure of languages, e.g. to be used for cognitive research, like English ‘wug’, which is to be pluralized in the classic test. JanMato suggested it would be useful for conlanging, too.

In my last post, I had already mentioned that I sometimes use a list of generated words to choose from when I can’t think of a suitable word for some meaning off the top of my head. The list I use currently was generated with a program called kwet, based on the data gained from a rather tedious, search-and-replace heavy analysis of my language’s dictionary that I conducted last year. I’ve been meaning to program a PHP script that can do this analysis automatically (and never did), but I thought that if this program can actually analyze a dictionary you give it in order to generate similar non-words, I might give it a go anyway.

So, how do we get our own words into the program? Wuggy is built in Python, as I said, so its code should be rather straightforward (unlike Perl code, which is simply not to be read). What you need is a bunch of files:

./plugins/subsyllabic_yourlang.py
./plugins/orth/yl.py
./data/yourlang_dictionary.txt

Fortunately, you can copy an existing subsyllabic_*.py and orth file and change the name of the language in them to something more appropriate. You must also register the module for your language in ./plugins/__init__.py by simple analogy with the other entries. The dictionary files Wuggy uses are just flat TXT files in which columns are separated by tabs:

word ↦ word’s syllabification ↦ some numeric value I don’t know how to obtain but just write 1

I got this file by simply dumping the ‘pronunciation’ field of my MySQL database into a CSV file which I edited to use orthography instead of phonemic transcription (not very difficult in Ayeri) etc. to fit Wuggy’s format. In order to get Wuggy to do something, you will also need to provide some input to generate words from – I just use my dictionary file with the last field (the one with ‘1’ at the end of each line) removed.
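As a rough sketch of that conversion step, the following Python snippet turns a list of syllabified words into tab-separated lines of the shape described above. The function names are made up for illustration, the hyphen as syllable separator is an assumption based on the output shown in this post, and the constant 1 in the last column is the same placeholder as above, not a value Wuggy is known to require.

```python
def wuggy_line(syllabified, frequency=1, sep="-"):
    """Build one line: word, TAB, syllabified word, TAB, numeric value."""
    word = syllabified.replace(sep, "")
    return f"{word}\t{syllabified}\t{frequency}"


def dictionary_lines(entries):
    """Return one line per unique entry, keeping the original order."""
    seen, lines = set(), []
    for entry in entries:
        if entry in seen:
            continue  # the dictionary was reduced to unique entries
        seen.add(entry)
        lines.append(wuggy_line(entry))
    return lines


def write_dictionary(entries, path):
    """Write the lines to a flat text file, e.g. ./data/yourlang_dictionary.txt."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(dictionary_lines(entries)) + "\n")


if __name__ == "__main__":
    for line in dictionary_lines(["a-ra", "da-lang", "en-van", "a-ra"]):
        print(line)
```

The input file for generating words could then be produced the same way by leaving off the last column.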

If all prerequisites are installed and you execute Wuggy.py, there’ll be a dialog window with some options. If the creation of Yourlang’s module has worked out, you should see “Subsyllabic Yourlang” as an option in the “Language Modules” drop-down menu and if you choose Yourlang, a progress bar thing should pop up shortly, which tells you about loading and generating some values. I ran my whole dictionary shortened to unique entries (~1800) through the program this afternoon, creating 3 alternatives per word. Here’s some of the output (left original, right generated):

a-ra kri-bay
ba-br= de-bi
ban-te-b= nil-pu-r=
da-lang kra-nu=
en-van tay-tran

This doesn’t look bad so far, but annoyingly, it would recreate the same words dozens of times for the same or similar original syllables even with the dissimilarity from the original word set to 1 out of 3. Also, this occurred frequently:

bu-rang lu-u=
e-rar me-ik
i-lon le-in
kay-ra lē-o
ma-kim ka-os

The problem here is that Ayeri allows words to begin with vowels, but the likelihood of two vowels following one another mid-word is rather small. Also, words can’t end in e.g. [p t k], and long vowels are the result of an allophonic process triggered by morphotactics rather than being lexicalized. Unfortunately, I don’t know how to tweak my module files to avoid this.
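One workaround that does not involve Wuggy’s module files at all would be to filter the output afterwards. The sketch below is only an illustration under a few assumptions: the function names are invented, the rules are limited to the three constraints just mentioned, and vowel sequences are rejected outright even though they are merely unlikely rather than impossible. It also drops repeated candidates, which takes care of the duplicates noted above.

```python
import re
import unicodedata

LONG_VOWELS = set("āēīōū")
HIATUS = re.compile(r"[aeiou]{2}")  # two plain vowels in a row
FORBIDDEN_FINALS = set("ptk")


def is_plausible(candidate):
    """Check a hyphen-syllabified candidate against the three constraints."""
    word = unicodedata.normalize("NFC", candidate).replace("-", "").lower()
    if not word:
        return False
    if any(ch in LONG_VOWELS for ch in word):
        return False  # long vowels are not lexical
    if word[-1] in FORBIDDEN_FINALS:
        return False  # words can't end in p, t, or k
    if HIATUS.search(word):
        return False  # strict reading: no vowel sequences at all
    return True


def filter_candidates(candidates, known=()):
    """Keep plausible candidates, skipping known words and repeats."""
    seen = {w.replace("-", "") for w in known}
    kept = []
    for candidate in candidates:
        plain = candidate.replace("-", "")
        if plain in seen or not is_plausible(candidate):
            continue
        seen.add(plain)
        kept.append(candidate)
    return kept


if __name__ == "__main__":
    generated = ["kri-bay", "de-bi", "me-ik", "le-in", "lē-o", "ka-os", "tay-tran", "de-bi"]
    print(filter_candidates(generated))  # → ['kri-bay', 'de-bi', 'tay-tran']
```

Diphthongs written with y, as in kay-ra, pass the check because y is not counted as a vowel here.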

Tahano Hinyan and Daléian alphabet

June 23, 2011 | Scripts, Stuff | alphabet, Daléian alphabet, Javanese script, software, Tahano Hikamu | Carsten

Tahano ~~Hikamu “Java”~~ Hinyan[1. The italic variant of my Tahano Hikamu font, Tagāti Book, is modelled after this style. — 2012-11-09]

For some time now I’ve played around with a style for Ayeri’s native writing system Tahano Hikamu that I loosely based on the look of the Javanese script (which I’ve already mentioned in a previous posting). I made several examples before using an experimental font, but the style has not been documented anywhere so far. However, the file is up now as a kind of brochure/leaflet/thing intended to be a supplement to the “Alphabet” page. That is, I spared myself the work of repeating explanations, so most of the file’s content is really just a table of the different characters with their names underneath.

Download (PDF, 1.1 MB)

For those who are curious: The outlines of the characters were drawn in Illustrator, the brochure was made in InDesign, the stock photos are from the wonderful stock.xchng. This all also explains the file size, by the way.

Daléian Alphabet

Daléian script example

Note that I’ve also put up information for the Daléian script again. I just printed out the relevant page from my old website as a PDF file. The low resolution of the images is suboptimal for this purpose, but I suppose it’s still good enough to read on screen and get an idea.

Download (PDF, 143 KB)
