
DarkBASIC Discussion / My Struggles With Procedural Generation

Author
Message
Libervurto
12
Years of Service
Joined: 30th Jun 2006
Location: On Toast
Posted: 17th May 2012 19:32 Edited at: 17th May 2012 19:40
I am a novice when it comes to procedural generation, but of course I've set myself a ridiculously difficult task of creating a system that can generate several styles of creatures and spaceships (and the names for both) with many variations within each style. The end goal is to have separate "cultures" of generated content which are clearly distinguishable. Since names require the least additional work (no graphics) I will start there.

My first attempt at PG name generation was this short function inspired by Elite:

I've yet to see anything else as short that can produce legible names, but it's a bit of a cheat as really it only generates pairs of characters and concatenates them: if character A is a vowel then B is either "D","L","R","N","S" or "T"; if A is a consonant then B is one of the five vowels. We basically flip between vowels and consonants to ensure that the word remains legible.
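The original function isn't reproduced here, so the following is only a rough sketch of the pair rule as described, written in Python rather than DarkBASIC. The names `make_pair` and `pair_name` are made up for illustration, and treating "Y" as a consonant is an assumption.

```python
import random

VOWELS = "AEIOU"
AFTER_VOWEL = "DLRNST"  # consonants allowed after a vowel, per the rule described
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumption: "Y" counts as a consonant

def make_pair(rng):
    """Pick character A freely, then pick B depending on whether A is a vowel."""
    a = rng.choice(ALPHABET)
    b = rng.choice(AFTER_VOWEL) if a in VOWELS else rng.choice(VOWELS)
    return a + b

def pair_name(pairs=3, seed=None):
    """Concatenate independent pairs into one name."""
    rng = random.Random(seed)
    return "".join(make_pair(rng) for _ in range(pairs)).capitalize()

print(pair_name(3, seed=1))
```

Every pair holds exactly one vowel and one consonant, so two of a kind can meet where pairs join, but never three in a row.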
This method produces relatively few string combinations: you cannot produce words with three consecutive consonants or vowels, such as "string" or "beauty" for example. If I'm going to create distinguishable cultures, each culture has to specialise in a particular range of character combinations, which greatly decreases the possible names within each culture, and with this method that would leave very few words available. I need a more flexible base rule-set that allows for a greater number of combinations.

I set about designing a more intelligent algorithm that would be able to produce words like "string" and "beauty" - this requires more complex rules. We can't stick with rules for individual characters and generate pairs because we'd have to allow "st" to be a pair which could produce "stststst" as a word. (If your name sounds like radio static then you're not going to fare well in interstellar combat!)
We need rules that produce entire words or word-fragments (forgotten the proper word), that are interchangeable, without producing illegible nonsense.
Notice that if we add a vowel to the end of "st" it becomes pronounceable, eg "stastestistostu".

I decided to map out the possible combinations for the letters "a", "n", "s" and "t". I chose only four to make it easier.

I highlighted the common branches and gave each unique branch a number. (Slight error: every "A" should be the same colour as there is only one A node.) I treat each unique branch like a node, pointing to other possible successive nodes.
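The diagram isn't reproduced here, so this is only a sketch of the node idea in Python, not the actual map. The fragments, the links in `NEXT` and the `STARTS` list are all invented for illustration: each node is a fragment built from "a", "n", "s" and "t", and it points to the fragments allowed to follow it.

```python
import random

# Hypothetical nodes: each fragment lists the fragments that may follow it.
# These links are invented for illustration, not taken from the diagram.
NEXT = {
    "a":  ["n", "s", "t", "nt", "st"],
    "n":  ["a"],
    "s":  ["a", "ta"],
    "t":  ["a"],
    "nt": ["a"],
    "st": ["a"],
    "ta": ["n", "s", "t", "nt", "st"],
}
STARTS = ["a", "n", "s", "t", "st", "ta"]

def walk(steps, seed=None):
    """Start on a random node and follow random links, joining the fragments."""
    rng = random.Random(seed)
    node = rng.choice(STARTS)
    word = node
    for _ in range(steps):
        node = rng.choice(NEXT[node])
        word += node
    return word

print(walk(4, seed=3))
```

Because "st" can only be followed by "a" in this table, a walk can never produce "stst".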


It now occurs to me that these are not very well constructed, as they cannot be fluently combined. I need to alter my code to produce word-fragments that are completely interchangeable.

It's quite difficult to free your mind from your own language. I want to make pronounceable words, but I have to remember that just because I don't know of a word that begins "Mbro" doesn't mean it can't be valid (it probably is in some African language).

Well this is where I'm at so far but I will be coming back to this thread as I learn more about procedural generation.

WARNING: The above comment may contain sarcasm.

BN2 Productions
15
Years of Service
Joined: 22nd Jan 2004
Location:
Posted: 17th May 2012 21:42
Don't have time to really do a lot of work on things right now, but here is a thought:

Names are comprised not just of syllables but of sounds (the inflection of the syllables themselves). What I would suggest is having it choose from discrete syllables (not letters) and each one has a stored value of 1 or -1 (upward or downward inflection). You might need more as you play with it (perhaps a 0 case).

To start generating the name, use some sine and cosine math to generate an oscillating wave and collect the data every 180 degrees (on a cos(x) curve, this will give a pattern of 1,-1,1,-1). Perhaps come up with some more intricate functions (perhaps using abs() and some other cool math stuff) and have those stored.
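As a small sketch of the sampling idea (Python, with made-up function names `inflection_pattern` and `rectified_pattern`), reading cos(x) every 180 degrees gives the alternating pattern. The abs() version is just one guess at what a "more intricate" function might look like.

```python
import math

def inflection_pattern(count, step_degrees=180):
    """Sample cos(x) every step_degrees and round each sample to -1, 0 or 1."""
    return [round(math.cos(math.radians(i * step_degrees))) for i in range(count)]

def rectified_pattern(count, step_degrees=90):
    """A variation using abs(): sampling |cos(x)| every 90 degrees gives 1s and 0s."""
    return [round(abs(math.cos(math.radians(i * step_degrees)))) for i in range(count)]

print(inflection_pattern(4))  # [1, -1, 1, -1]
print(rectified_pattern(4))   # [1, 0, 1, 0]
```

The 0 values in the second pattern would correspond to the possible "0 case" (a neutral inflection).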

Generation follows this pattern:

1) Randomize number of syllables (perhaps bias towards cultural trends)


2) Randomize the wave form of the name

3) Randomly choose from one pile of syllables (upwards or downwards inflection), then choose another from the other pile. Repeat until all slots are formed.

I would suggest coming up with some predetermined cultural rules (perhaps randomly generated at the start of the program) and biasing any randomly generated numbers towards those.
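The three steps could be sketched roughly like this in Python. The syllable piles, the `typical_syllables` parameter and the use of a normal distribution for the cultural bias are all assumptions made for the example.

```python
import math
import random

# Made-up syllable piles: 1 = upward inflection, -1 = downward inflection.
PILES = {
    1:  ["ka", "ri", "tol", "ven"],
    -1: ["um", "osh", "dar", "eth"],
}

def generate_name(rng, typical_syllables=3):
    # 1) Number of syllables, biased towards the culture's typical length.
    count = max(1, round(rng.gauss(typical_syllables, 1)))
    # 2) Wave form: a random phase decides whether the name starts up or down.
    phase = rng.choice([0, 180])
    wave = [round(math.cos(math.radians(phase + 180 * i))) for i in range(count)]
    # 3) Fill each slot from the pile that matches its inflection.
    return "".join(rng.choice(PILES[w]) for w in wave).capitalize()

rng = random.Random(42)
print([generate_name(rng) for _ in range(5)])
```

A different culture would then just be a different pair of piles plus a different typical length.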


Just a thought, take from it whatever sounds good or disregard it completely.

Great Quote:
"Time...LINE??? Time isn't made out of lines...it is made out of circles. That is why clocks are round!" -Caboose
Phaelax
DBPro Master
16
Years of Service
Joined: 16th Apr 2003
Location: Metropia
Posted: 29th Jul 2012 06:54
We've done name generation challenges before.

http://dbcodecorner.com/index.php?page=view&challenge=Random%20Name%20Generator


If you really want something effective, you'll need to understand language structure. Knowing what consonants can go together, how many you can have before a vowel is needed.

"You're not going crazy. You're going sane in a crazy world!" ~Tick
Mr909
7
Years of Service
Joined: 2nd Jun 2012
Location:
Posted: 13th Nov 2012 19:21 Edited at: 13th Nov 2012 19:47
Well, here's my thoughts, take them or leave them.

"Mbro" doesn't work (in our perceptions) due to the (generally true) idea that consonants don't soundly follow consonants in sequences of 3, as you'd basically stated. 2 consonants next to each other, maybe. Really, this is only consonant SOUNDS. Same kind of works for vowels. That's why the word "queuing" looks so weird spelled out.

Combinatorics knowledge is ESSENTIAL for this kind of process. I'm going to start this from the ground up because I'm not sure if we're on the same page as to how this is calculated.

For any sequence of non-repeating letters, the number of valid combinations (strictly speaking, orderings or permutations) is n factorial, where n is the number of letters. Mathematically, this is:

n!

No joke, there is an exclamation point there.

For n=5, n!=5*4*3*2*1. So, for any five non-repeating items, there are 120 possible combinations.

What about repeats? Well, we know that with ABC, there are 3*2*1, or 6 valid combinations. If we only take two letters, there are 3*2, or, yes, still, 6 combinations. However, we know AAA only produces 1 unique combination. Why? Since there is a repeating item, we have to divide by the factorial of the number of repeats. Since all are repeats, we solve for:

3! / 3!
Or
(3*2*1) / (3*2*1) = 1.

That's just so you know that the word "STATISTICS" or somesuch returns significantly fewer due to its repeated instances of S, I, and T: 10! / (3! * 3! * 2!) = 50,400 rather than 10! = 3,628,800.
These are only UNIQUE combinations. If we allow items to repeat (chocolate, vanilla, strawberry-type problems where chocolate-chocolate-chocolate is valid (yum!)), we just use 3*3*3 (there are 27 different triple-scoop orders, although many of them are just stacked differently).
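To check those numbers, here is a short Python sketch. The helper name `unique_orderings` is made up; it divides n! by the factorial of each letter's repeat count.

```python
from collections import Counter
from math import factorial

def unique_orderings(word):
    """n! divided by the factorial of each letter's repeat count."""
    total = factorial(len(word))
    for repeats in Counter(word).values():
        total //= factorial(repeats)
    return total

print(unique_orderings("ABC"))         # 6
print(unique_orderings("AAA"))         # 1
print(unique_orderings("STATISTICS"))  # 50400
print(3 ** 3)                          # 27 triple-scoop orders when repeats are allowed
```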

How does this apply to your problem? Well, from what I've observed, people are only looking at the lower bound, that is, regardless of the sophistication of your system, the believability is based on how much you allow the massive possible permutations under the limited system.

So honestly? Cheating with sequences like:

1st: "Sha" or "Cha" or "Ram" or "Vla"
2nd: "on" or "one" or "no" or "ni" or "in" or "lyn"

Bounds: 4*6 = 24 for this name sequence

Is perfectly okay. Just create a couple of different sound types that you can reliably expect to fit together (crunch all of them if you'd like, put them on a list, look for any outliers. It's just a bunch of FOR-NEXT loops, as I try to explain to non-coders in real-life scenarios).
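Crunching all of them really is just two nested loops. A quick Python sketch using the two lists from the example above (the variable names are made up):

```python
FIRST = ["Sha", "Cha", "Ram", "Vla"]
SECOND = ["on", "one", "no", "ni", "in", "lyn"]

# Build every first+second combination so the whole list can be eyeballed for outliers.
names = []
for a in FIRST:
    for b in SECOND:
        names.append(a + b)

print(len(names))  # 24
print(names)
```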

Another idea, useful for your length problem, is what I call critical boundary thinking, based on the World Of Darkness tabletop roleplaying game. Essentially, the logic behind the dice rolls allowed for infinite success. How? Well, every roll on the 10-sided die that came up 8 or above let you roll again. So with 7 rolls, you had a constantly decaying chance but, yes, you could theoretically continue rolling forever so long as it always hit 8 or above.

What I'm saying is, if you allow consonants to repeat, make successive repetitions have a lower and lower chance, and THEN write the upper bound if it seems necessary.

Look at the lengths of this test:



I don't know how the probability decays offhand, but it's slow, steady, and a particularly easy way to write a generator. It doesn't have an upper cap, but it hardly NEEDS one. It reflects real life fairly well: shorter names are the most common, longer names are rarer, and very long names are the rarest.
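The original test isn't reproduced here, so this is a stand-in sketch in Python, not the same code. It assumes the simplest version of the idea: start at one syllable and keep adding another while a 10-sided die comes up 8 or above. The function name `roll_length` and its parameters are made up.

```python
import random

def roll_length(rng, again_on=8, sides=10):
    """Start at one syllable; add another each time the die shows again_on or higher."""
    length = 1
    while rng.randint(1, sides) >= again_on:
        length += 1
    return length

rng = random.Random(7)
counts = {}
for _ in range(10000):
    n = roll_length(rng)
    counts[n] = counts.get(n, 0) + 1

# Print how many names came out at each length.
for n in sorted(counts):
    print(n, counts[n])
```

With a fixed threshold like this, the chance of continuing is 3 in 10 every time, so the chance of ending at exactly k syllables is 0.7 * 0.3^(k-1). Lowering the chance on each successive roll would make the tail fall off even faster.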

Also, odds-making logic helps in generators of any kind where bias is involved.
So you want every English letter to appear but "s", "c", and "t" to be favored? Well, what I would do is just create an array with one instance of every valid sound, but with repeats of the favored sounds, then roll against the table. The relative "weight" of any sound can then be adjusted by adding or removing repeats.
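A minimal Python sketch of that table, assuming two extra copies each for "s", "c" and "t" (the `EXTRA_COPIES` numbers are arbitrary):

```python
import random
import string

# Assumed weights: how many extra copies each favoured letter gets in the table.
EXTRA_COPIES = {"s": 2, "c": 2, "t": 2}

table = list(string.ascii_lowercase)      # one instance of every letter
for letter, extra in EXTRA_COPIES.items():
    table.extend([letter] * extra)        # repeats of the favoured ones

rng = random.Random(3)
picks = [rng.choice(table) for _ in range(10)]  # "roll against the table"
print("".join(picks))
```

Here the table has 32 entries, so each favoured letter comes up 3 times in 32 and every other letter 1 time in 32. In Python the same effect is also available through `random.choices` with a `weights` argument, without building the repeated list.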

I hope I didn't ramble. I kinda rambled. I hope something in my rambling helped. I'm going to go write a generator now XD

EDIT: I attached a simple generator I cooked up, source and all. To increase weight or change sounds, just modify the .ini that comes with it. Gah, I don't like doing combinatorics again. :\ BUT ITS SOOOO FUUUUUUNN...

Phaelax
DBPro Master
16
Years of Service
Joined: 16th Apr 2003
Location: Metropia
Posted: 13th Nov 2012 20:21
Quote: "Notice that if we add a vowel to the end of "st" it becomes pronounceable, eg "stastestistostu""


Well, you might be able to pronounce it, but I'm struggling. If I say it aloud, sounds like I have a stutter.

"You're not going crazy. You're going sane in a crazy world!" ~Tick
Mr909
7
Years of Service
Joined: 2nd Jun 2012
Location:
Posted: 13th Nov 2012 20:53
I got Sta-stay-sti-toss-stew. Worked here.
Fluffy Rabbit
User Banned
Posted: 13th Nov 2012 23:02
Onchaone. I suppose it all depends on the syllables entered into the ini file.

So, let's say the game were the big ambitious space adventure OBese wanted to make. The randomizer would have to be seeded in a constant way, so that when the player saved the game, he wouldn't end up in a different part of the galaxy the next time he started it up. There would have to be a way to make sure that no two exact full name combinations are repeated. I got "lynlyn" twice.
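Both points could be sketched like this in Python. This is not the attached generator: the syllable list is a placeholder, and `system_name`, `galaxy_seed` and the coordinate scheme are invented for the example. The random generator is seeded from the galaxy seed plus the location, and a set of used names rejects repeats.

```python
import random

SYLLABLES = ["on", "one", "no", "ni", "in", "lyn", "cha", "sha"]  # placeholder list

def system_name(galaxy_seed, x, y, used):
    """Same seed and coordinates give the same draws; 'used' blocks repeated names."""
    rng = random.Random(f"{galaxy_seed}:{x}:{y}")
    while True:
        count = rng.randint(2, 3)
        name = "".join(rng.choice(SYLLABLES) for _ in range(count))
        if name not in used:
            used.add(name)
            return name.capitalize()

used = set()
print([system_name(1234, x, 0, used) for x in range(5)])
```

One caveat: the retry step depends on which names are already taken, so results only stay repeatable if the systems are always generated in the same order. The loop also assumes there are far more possible names than systems.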
