Markov Chains
Last night my friend Eric challenged me to write a Markov chaining algorithm to create fictitious Italian last names. For those of you not familiar with it, Markov chaining is a simple process whereby you construct a word by looking at the previous 2 (or 3 or 4) letters and adding a new letter that is statistically likely to come afterward. You need a reference set of words, of course, so I stole one from Wikipedia. Here's the list of results. They have been tested to ensure that none of the generated fictitious names coincides with the real names in the reference list.
abbatti sabbiano donardo fortoni fontano posito espossi argenovese spernabò areschini spintori politali tibalbi gabrielli bandretti tremonte roti lombo baiani baglianchi sola giuliano loggio vita molito robbiani gasparelliano garo rocchini ungelista morossetti moscarlo sartelli giore battei bassini barassano devitaliparelli evangenito macci | cerugini defendretti ruso manci endretta mini mazzo mily matti buonato ventierrugia codazzatti pettista silviattei pelli bocchiavoso con colo cortolamini costino coppolini annunziattei antoro anserminnella flaminnellegri aneri anfredini angenis gentino beroni turielli farro famini benis pucciolombarbierrarini belli andido sca chiavonello dano | bettista fabriz amoro schio eppola testoro davitabile sclafano ducciolombo valcandriz carlatto fierrero cantinola napoletti capinelli vavone finziatti cavalcandro sungaro caccio caetano rizzi campani bianchiaparello cali ciprio pacci ale lazzi paladini grini picciolo albernabò panelli bondini bortino ramacci pinti auriero pisantoroso |
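For the curious, the letter-by-letter process described above can be sketched in a few lines of Python. This is not the original chainer, just a minimal illustration under stated assumptions: the function names, the "^" and "$" padding markers, and the short REFERENCE_NAMES list are all made up for the example (the real reference set came from Wikipedia and is not reproduced here).

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding markers, not from the original code

# Stand-in reference set; the original post used a longer list from Wikipedia.
REFERENCE_NAMES = [
    "rossi", "russo", "ferrari", "esposito", "bianchi", "romano", "colombo",
    "ricci", "marino", "greco", "bruno", "gallo", "conti", "mancini",
    "costa", "giordano", "rizzo", "lombardi", "moretti", "barbieri",
]

def build_model(words, order=2):
    """Map each `order`-letter context to the letters seen right after it."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate_name(model, order=2, max_len=20, rng=random):
    """Grow a name one letter at a time until the end marker is drawn."""
    context = START * order
    letters = []
    while len(letters) < max_len:
        # Repeated entries in the list make frequent followers more likely.
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = context[1:] + nxt
    return "".join(letters)

def generate_new_names(words, count=10, order=2, seed=None, max_tries=10000):
    """Generate up to `count` distinct names that are not in the reference set."""
    rng = random.Random(seed)
    model = build_model(words, order)
    known = {w.lower() for w in words}
    names = set()
    for _ in range(max_tries):
        if len(names) >= count:
            break
        name = generate_name(model, order, rng=rng)
        if name and name not in known:
            names.add(name)
    return sorted(names)

if __name__ == "__main__":
    for name in generate_new_names(REFERENCE_NAMES, count=10):
        print(name.capitalize())
```

Raising `order` to 3 or 4 makes the output look more like the reference names, but with a small reference set it also tends to reproduce them verbatim, which is why the sketch filters out anything already in the list.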
Update: I ran my squelch newsflashes through the Markov Chainer. I got these:
"Bush would have to go on a group of unarmed shoppers Friday at a loss to retrieve the missing item."
"Though the 23rd meta extension of irony a level first reached by a Wisconsin machinist in June 2000 the fact that he just rambles a lot better said Kenyan orphan Mutheru Ubatto through an interpreter."
"Really what's a couple of pints of blood compared to a very difficult thing to do."
"While Greenberg's decision to purchase the single Kimberly Diaz a noted expert on irony explains because that song is still a pretty crappy song."
"Chameny blankly replied Will & Grace."
"Like many Americans who never considered African AIDS orphans to be a good topic joining the ideas of Chomsky and Searle in a bid for a diff'rent era."
"A needle was misplaced in a purely post modern constructivist framework."
"Said one Pentecostal Christian All we want is to get him to just answer your question without using knowledge only a dedicated grad student Josh Greenberg purchased the song after a child ..."
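Running whole sentences through a chainer works the same way, except that the unit is a word rather than a letter. Here is a rough Python sketch of that idea, assuming a two-word context; the function names and the SAMPLE_TEXT placeholder are invented for illustration and are not the code or the newsflash text used for the quotes above.

```python
import random
from collections import defaultdict

# Placeholder corpus; swap in any longer body of text.
SAMPLE_TEXT = (
    "the cat sat on the mat and the dog sat on the rug "
    "and the cat saw the dog on the mat"
)

def build_word_model(text, order=2):
    """Map each run of `order` consecutive words to the words that followed it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate_text(model, length=25, seed=None):
    """Start from a random context and keep appending likely next words."""
    rng = random.Random(seed)
    if not model:
        return ""
    state = rng.choice(sorted(model))
    output = list(state)
    while len(output) < length:
        followers = model.get(state)
        if not followers:  # dead end: this context only appeared at the very end
            break
        nxt = rng.choice(followers)
        output.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(output[:length])

if __name__ == "__main__":
    print(generate_text(build_word_model(SAMPLE_TEXT), length=15))
```

With a corpus this small the output mostly replays the input; the odd splices only start to appear once the text is long enough for the same word pairs to show up in several different sentences.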
Update: Here's the code for the original Markov Chainer for names. If anyone sees any problems with it, let me know.
*snip*
I'm going to move the code to a different post.

17 comments:
Dibs on Cavalcandro
"Belli" is a real surname, and "Giuliano" is a real given name, so I'd be wary of using these for any serious purpose....
I agree. I recognized many of these as real names even though they weren't in my original list of common names. I didn't mean for these to be used for serious purposes. I intended them for silly purposes only.
This is nothing new. Go to:
http://www.makewords.com/default.aspx
and select "Italian".
Fair enough. I will no longer labor under the delusion that I was the inventor of Markov Chaining.
LOL, nice riposte, Tommaso....
hmm, I didn't realize that's what it was called. I made a few word generators before that used markov chaining, but I never knew that's what it was called. I'll have to research that.
There's a (python) implementation of name generation in "Game Programming for Python".
Check out the Steampunk name generator at brassgoggles.co.uk sometime.
Yours truly,
Prince Leander Yardley
Sounds like you could run anything through the chainer and have hours of hilarity. Hours! The possibilities are endless! I want one!
Could you automate the process of acquiring reference data, so that we can all sit around, thinking of fun things to try, without it being such a pain to get the data? That would be awesome. Please express your response as the product of its word scores modulo 1000000.
Also, I just played Thought Bubble again. I think the puns, the slide whistle sound, and the jiggliness are together enough to keep me entertained.
Okay, I'm officially guilty of fizzbuzzization: I _rushed_ off to write my own after seeing this post. Here's what mine came up with:
Pelli
Pelliorin
Peni
Peno
Pentabri
Pernarretredi
Pernet
Pertonerardorvo
Petillo
Petrellucchi
Pettigararsanato
Picci
Pinte
Pintinello
Piraspaccinarruttierrugini
Pisandini
Pisanicuccio
Possini
Pucchi
Pucchianaleni
Quinarrellia
Rizzo
Robbato
Robbato
Rocchiavoso
Rocchio
Roce
Roso
Rostigliorerrazzon
Rotta
Ruffoneroccarro
Sabri
Sabus
Santino
Scario
Scarlo
Schiatoli
Scla
Sclatello
Silo
Soli
Spino
Ste
Strugiaparona
Tesori
Tozzoni
Trestoli
Ungenti
Ungentini
Vavalluschini
Vavo
Vecolinnarotilei
Ventornartelladi
For added hilarity, try running a Markov chainer on Alice in Wonderland, or Hamlet. Comedy GOLD, I tell ya ;)
Wasn't Benis the last name of Elaine on Seinfeld?
Very nice, I love it. You can actually create your own parallel universe populated by Italian names. Funnily enough, that universe overlaps with our own (e.g. I know a couple of Rizzi). Good exercise, anyway.
A better test of the fictionality of imaginary surnames would be a simple Google search. If Google pulls no results or few results, chances are very good that your made-up names are in fact original (though that is not an absolute guarantee, since my mother's name is nowhere on Google except as a typographical error!).
For example, suppose I imagined the surname "Frangiovanni". When I search for it on a search engine, I may only get one or two results, and these may not even be in precisely the same form. Another one I like to use is "Scipalma", which sounds very southern Italian, I think. Try to come up with some imaginary names and see how often they come up in an internet search engine. That in itself should be enough to gauge a name's originality.
This is interesting - have you thought about generating Markov chain music? Like in here: http://thepasqualian.com/?p=1831