_LEAD = re.compile(r'^[aeiouy]*[bcdfghjklmnprstvwxz]+')
_INNER = re.compile(r'\B[aeiouy]+[bcdfghjklmnprstvwxz]+\B')
_TRAIL = re.compile(r'[aeiouy]+[bcdfghjklmnprstvwxzy]?$')
_LEAD = re.compile(r'^[aeiouy]*(?:qu|[bcdfghjklmnpqrstvwxz])+')
_INNER = re.compile(r'\B[aeiouy]+(?:qu|[bcdfghjklmnpqrstvwxz])+\B')
_TRAIL = re.compile(r'[aeiouy]+(?:qu|[bcdfghjklmnpqrstvwxz])*$')
Basically, I was aiming for something pronounceable, with common letters weighted to appear more often. It produced output such as:
Votharn Eristacark Iplortidot Birtoil Udaeteahieb Aceastoherk Reloist Tharnog Wasterk Femewelav Ublyrrielic Cekird Owritothol Hoogoh Obloukajarriem Sleebont Niestart Pekev Lirtooth Efentoidagix Klyckas Yryfesat Klooton
Yeah … that's really, really awful.
So I decided to give it another whack. This time I started with the premise, 'what sounds most like a name?'
Names do!
I found a couple of files with over a thousand of the most common male and female first names on the US Census Bureau's web page and started playing. I wrote a Python script that used regular expressions to slice a batch of words into three lists:
List 1 = Zero or more vowels + One or more consonants at the start of the word
List 2 = One or more vowels + One or more consonants inside the word (not at the start or the end). We can get 0 or more of these patterns depending on the word.
List 3 = One or more vowels + Zero or more consonants at the end of the word.
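The three lists above can be sketched roughly as follows. This is a minimal illustration rather than the original script: the names `LEAD`, `INNER`, `TRAIL` and `slice_word` are made up for the example, and it assumes 'y' counts as a vowel and that each pattern is applied to the whole word independently (so an inner piece and the trailing piece can share letters).

```python
import re

VOWELS = "aeiouy"                    # assumption: 'y' is treated as a vowel
CONSONANTS = "bcdfghjklmnpqrstvwxz"

# List 1: zero or more vowels + one or more consonants at the start
LEAD = re.compile(rf"^[{VOWELS}]*[{CONSONANTS}]+")
# List 2: one or more vowels + one or more consonants strictly inside the word
INNER = re.compile(rf"\B[{VOWELS}]+[{CONSONANTS}]+\B")
# List 3: one or more vowels + zero or more consonants at the end
TRAIL = re.compile(rf"[{VOWELS}]+[{CONSONANTS}]*$")


def slice_word(word):
    """Return (lead, inners, trail) for one word; missing pieces are empty."""
    word = word.lower()
    lead = LEAD.match(word)
    trail = TRAIL.search(word)
    return (
        lead.group() if lead else "",
        INNER.findall(word),          # zero or more inner pieces
        trail.group() if trail else "",
    )


print(slice_word("Robert"))
print(slice_word("Maria"))
```

Running every name in the sample through something like `slice_word` and collecting the pieces gives the raw material for the three lists.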
Side note: If you haven't dug into regular expressions yet, I highly recommend you check them out. I avoided them for years and now they're an essential part of my programmer toolbox. Another big plus is that their utility spans multiple languages.
I also tracked the frequency of each pattern, sorting by most common first and discarding the rare ones. Finally, I dumped the output formatted as Python lists that I could paste right into the source of the next script.
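One way to do the counting, culling and dumping is with `collections.Counter`. This is only a sketch under assumptions: the helper names `common_patterns` and `dump_list`, the `min_count` cutoff and the sample pieces are all hypothetical, not taken from the original script.

```python
from collections import Counter


def common_patterns(pieces, min_count=2):
    """Sort pieces by frequency (most common first) and drop the rare ones."""
    counts = Counter(pieces)
    return [p for p, n in counts.most_common() if n >= min_count]


def dump_list(var_name, patterns):
    """Format the patterns as a Python list literal ready to paste into source."""
    return f"{var_name} = {patterns!r}"


# Made-up sample pieces, standing in for the output of the slicing step
pieces = ["an", "el", "an", "or", "el", "an", "ix"]
print(dump_list("INNER_PARTS", common_patterns(pieces)))
# prints: INNER_PARTS = ['an', 'el']
```

Raising or lowering `min_count` is one simple knob for how aggressively rare patterns get discarded.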
Next I used a script to assemble random names from these lists. Here's the one for male names:
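As a rough sketch of the assembly step (not the original script), assuming the pasted-in lists are called `LEADS`, `INNERS` and `TRAILS` and that pieces are picked uniformly at random, it might look like this. The list contents are made-up placeholders rather than the real census-derived patterns, and `make_name` and `max_inner` are hypothetical names.

```python
import random

# Placeholder pattern lists; the real ones would be pasted in from the slicing step
LEADS = ["j", "st", "br", "k", "fr"]
INNERS = ["an", "er", "ill", "or"]
TRAILS = ["on", "ey", "us", "o", "an"]


def make_name(rng, max_inner=2):
    """Build one name: a lead piece, 0..max_inner inner pieces, a trailing piece."""
    parts = [rng.choice(LEADS)]
    for _ in range(rng.randint(0, max_inner)):
        parts.append(rng.choice(INNERS))
    parts.append(rng.choice(TRAILS))
    return "".join(parts).capitalize()


rng = random.Random()  # pass random.Random(some_seed) for repeatable output
print(" ".join(make_name(rng) for _ in range(10)))
```

Keeping the number of inner pieces small keeps most names short, which seems to fit the kind of output shown here.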
Which gives output like:
Shen Grolon Mor Warred Hilluel Huel Pie Reror Jo Frewed Pin Rasy Steris Tron Chelo Nor Stomey Stennuel Sernas Golian Karrernin Varren Kichey Hemio Grerto Bas Wo Wathas Fandes Ferrey Kan Fines Lanin Clancuel Vy Pacosian Ke Shie Rathis Dor Ce Frencermis Van Nances Fones Reny Kas Shathus Brer Jilled Vio Feton Testin Hian Bed Rio Trewavan Nernel Antamalor Cance Stomio Venno Lel Grus Antel Jey Holis Kas Stie Lewo Tille Vin Hey Jon Trencor Bancis Fie Stel Fruel Brestin Javen Cancen Sel Trarrus Brarlitio Kertes Cie Perned Fris Storas Stey Grarly Dor Grio Won Gentian Stan Clichy Das Tennan
Seeding with female names, I get:
Lindi Elon Pacia Tulee Gia Koneen Shabee Stan Siannis Vory Stannin Kesses Stinie Re Genistian Krolonah Ladee Perie Detter Clel Chillistyn Angami Han Can Pesten Shulis Alie Worianon Choryn Neana Clindetis Angannian Pishia Man Chres Bralan Cilla Pissie Free Veen Vinon Paly Angancy Chran Jabi Ali Shes Cerran Sissan Ner Katia Sten Dah Mandrie Den Tilis Nancer Chey Cenny Mamah Angisheen Wian Deen Jennameen Kelie Stacian Chrerrian Tren Doson Loler Kria Ci Claces Kandree Try Tande Chianan Lishia Sadis Belon Clady Closonel Rindah Krestah Elian Son Triny Jin Chis Hianisie Angilli Tannee His Silli Vy Kiannie Chraria Lulon Wy Brissyn
I've been experimenting with different sample sizes and with how aggressively to cull infrequent patterns. Plus, it's hard to gauge success. You wouldn't want to name your children from those lists, but if I had to populate a fantasy town full of NPCs I'd be content with many of those. Two things I liked were the simplicity of the finished code and that the gender sound mostly survived the mulching process – except Stan the transvestite.