From the NannyMUD documentation
2000-12-23
NAME
class2 - A log from TMI, part 2.
DESCRIPTION
This is part 2 of some logs from TMI.
Saphire says: think i am in the wrong class room.
Nightshade leaves east.
Bdeniston says: why?
Profezzorn says: ok, today's topics are regexps, and then advanced switch usage
Saphire says: that's why
Profezzorn says: after that we'll see what we can fit in
Bdeniston says: What?
Nightshade arrives.
Cejones nods solemnly.
Nightshade smiles happily.
Bdeniston says: I think so too.
Sunfire says: hmm
Saphire says: is there going to be a beginners class today?
Profezzorn says: no
Bdeniston says: I can't find a pencil!
Profezzorn says: beginners are on thursdays
Sunfire growls at bdeniston.
Saphire says: okay
Heckler sighs deeply.
Bdeniston says: What time?
Nightshade says: blah...
Terry chuckles politely.
Profezzorn says: 4pm EDT I think, now, can we get on with it?
Sunfire giggles inanely.
Nightshade says: my step classes are taught on thursdays.
Nightshade says: yes.. :)
Terry nods solemnly.
Bdeniston says: thanks teach
Saphire nods solemnly.
Sunfire grins evilly.
Sunfire says: hmm..
Profezzorn says: now what?
Sunfire says: May I code something?
Nightshade says: teach.
Nightshade smiles happily.
Profezzorn says: no
Sunfire sighs deeply.
Terry chuckles politely.
Profezzorn says: anyway, let's start
Profezzorn says: regexp is short for "regular expression"
Profezzorn says: it's really a small kind of program, but you don't have to know that
Profezzorn says: because it's coded in a string
Profezzorn says: regexps are used to 'match' against another string to see if
Profezzorn says: they match
Profezzorn says: and also, it is often used for 'search-and-replace' commands
Profezzorn says: like in 'ed'
Profezzorn says: anyway, a regexp is a string, and most characters just match themselves
Profezzorn says: like the regexp "r" matches the string "r"
Bdeniston says: so is regexp a variable of sorts?
Profezzorn says: no
Sunfire says: *knows dis stuff*
Sunfire waves good-bye.
Sunfire just left the school.
Profezzorn says: a regexp is just a string, but it contains a pattern approximately like a sscanf-string
Bdeniston nods solemnly.
Megaboz arrives.
Profezzorn says: regexps are much more powerful than sscanf strings though
Profezzorn says: for instance, if A and B are two regexps
Heckler says: could we see example of one...Ah u r..
Profezzorn says: then "A|B" is the regexp that matches either A or B
Bdeniston says: this sounds like topology, with mappings.
Profezzorn says: I wouldn't describe it like that
Saphire says: not really just basic l
Saphire says: binary algebra.
Profezzorn says: anyway, "A*" would be a regexp matching A zero or more times
Vedder arrives.
Vedder says: cool
Bdeniston says: shhh!
Profezzorn says: "AB" would be a regexp matching A then B ie. concatenation
Corin arrives.
Profezzorn says: "A+" is a regexp matching A one or more times
Profezzorn says: and "." is a character that matches any character
Profezzorn says: any questions?
Cejones shakes and quivers like a bowlful of jelly.
Profezzorn says: we gotta get that 'shake' command fixed :)
Bdeniston says: I feel like I am in a cheetos commercial and just been flattened.
Cejones says: I feel dumb
Cejones smiles happily.
Profezzorn says: let's do some examples
Bdeniston says: Where does all this apply to the LPC muds.
Rassilon says: therefore "ab*c" would match: "abc","abbc","abbbc", but would barf on "abbb"
Heckler says: could we see full syntactical example of a regexp ?
Profezzorn says: correct rassilon
Profezzorn says: it would also match "ac"
Heckler says: nevermind got it
Profezzorn says: you can also use parentheses to group things like:
Profezzorn says: "(a|b)(a|b)" matches "aa" "ab" "ba" and "bb"
Bdeniston says: Ah, like a * is a wildcard, and can replace any number of b's
Nightshade just left the school.
Profezzorn says: yes, that is correct bdeniston
Bdeniston says: this is topology!.
Profezzorn chuckles politely.
Heckler says: btw how long do we have on here before idle out ?
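
The operators introduced so far (concatenation, "|", "*", "+", "." and grouping) can be tried out with a small sketch. It uses Python's re module purely as a stand-in, since LPC drivers ship their own regexp dialect and details may differ. The helper name matches_whole is hypothetical, and anchoring to the whole string with fullmatch is an assumption made so that the results line up with the examples quoted in the log.

```python
import re

def matches_whole(pattern, text):
    """Return True when the pattern matches the entire string (assumption:
    whole-string matching, so the results mirror the examples in the log)."""
    return re.fullmatch(pattern, text) is not None

# "ab*c": an a, zero or more b, then a c
star_hits = [s for s in ["ac", "abc", "abbc", "abbbc", "abbb"]
             if matches_whole("ab*c", s)]

# "ab+c": same shape, but at least one b is required
plus_hits = [s for s in ["ac", "abc", "abbc"] if matches_whole("ab+c", s)]

# "(a|b)(a|b)": grouping plus alternation, two characters that are each a or b
pair_hits = [s for s in ["aa", "ab", "ba", "bb", "ac", "a"]
             if matches_whole("(a|b)(a|b)", s)]

# "a.c": the dot stands for exactly one character of any kind
dot_hits = [s for s in ["aXc", "a c", "ac"] if matches_whole("a.c", s)]

print(star_hits, plus_hits, pair_hits, dot_hits)
```

Here "abbb" is rejected by "ab*c" because the trailing c is missing, while "ac" is accepted because "*" allows zero repetitions.
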
Profezzorn says: I have no idea, I have never idled out :)
Bdeniston says: Hmm, my Bachelors in mathematics degree is finally paying off.
Heckler says: you wouldn't ;)
Profezzorn says: anyway, to get back to reality, let's see how this applies to lpmud
Profezzorn says: there are 2 usages:
Profezzorn says: one is the efun regexp(), and the other is ed
Guest arrives.
Heckler says: and I do love me Ed ;)
Bdeniston says: ed sucks!
Profezzorn says: regexp() has the syntax: string *regexp(string *strings,string regexp)
Rassilon enters the school.
Profezzorn says: regexp just takes an array of strings and matches them against a regexp
Profezzorn says: it returns the strings that matched in an array
Heckler says: Ed's not too bad once u r used to it. all I can code in
Profezzorn says: ed uses regexps for both the search and the replace commands
Heckler says: like g / ? and s commands in Ed
Profezzorn nods solemnly.
Cejones says: what about the "^" it doesn't match right?
Bdeniston says: it is like a bad nightmare of being trapped in some spinoff of edlin from dos.
Profezzorn says: ^ is a special character
Cejones says: ah
Profezzorn says: ^ matches the beginning of a line
Profezzorn says: $ matches the end of a line
Cejones nods solemnly.
Heckler says: any other specials ??? like/ \ | & etc...
Profezzorn says: to override these meanings you quote the character with a \
Heckler says: same specials as in Ed I guess....
Profezzorn says: when you write regexps in a string you have to use two \ as the
Heckler says: for s/g command
Cejones says: so I could search for a ^ using "\^"
Profezzorn says: compiler will eat one \
Cejones says: gotcha you
Profezzorn says: yes, correct cejones
Cejones nods solemnly.
Profezzorn says: another special is [ ]
Profezzorn says: it's the set operator
Saphire leaves east.
Bdeniston leaves east.
Profezzorn says: which means to say that [abe] matches any of the characters a, b or e
Corin says: I read that in the man to ed ;)
Profezzorn says: also [a-z] matches any character a to z
Heckler says: mebbe I should update my Ed documentation....
Profezzorn says: and [^qed] matches any character but q, e or d
Cejones nods solemnly.
Guest says: so [ and ] are special chars too ?
Uyvwnmon leaves east.
Profezzorn says: yes, if you want to match them you have to quote them
Guest nods solemnly.
Heckler says: quote or \ ?
Profezzorn says: who can give me a regexp matching an integer?
Profezzorn says: \
Profezzorn says: \ is special too, and has to be quoted with \ :)
Heckler says: *knows th
Cejones says: [0-9]
Heckler says: sorry damn control chars
Rassilon says: for example: "[A-Za-z_][A-Za-z0-9_]*" would be the regexp for variable names, right?
Profezzorn says: yes, rassilon that is right
Uyvwnmon arrives.
Profezzorn says: [0-9]+ is the right regexp for a number
Rassilon says: for an integer... :)
Profezzorn says: yes
Profezzorn says: [0-9]*\.[0-9]+ would be a float
Cejones says: you answered my next question :)
Profezzorn smiles happily.
Profezzorn says: so, any questions? should I repeat the operators quickly?
Heckler nods solemnly.
Heckler says: this mean you'll update Nanny help text on regexp ?? ;)
Profezzorn says: ok, here are the operators in precedence order:
Profezzorn says: ( ) are used for grouping regexps
Profezzorn says: A* means zero or more A
Profezzorn says: A+ means one or more A
Profezzorn says: AB means A then B
Profezzorn says: A|B means A or B
Guest just left the school.
Profezzorn says: and, then there are the atoms:
Profezzorn says: . matches any character
Profezzorn says: ^ matches the beginning of the line
Profezzorn says: $ matches the end of the line
Cejones says: what about match a word?
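
The set operator and the number patterns from the class can be checked the same way. This is a minimal sketch in Python's re module, used as a stand-in for the driver's regexp efun; the constant names and the helper whole are hypothetical, and whole-string matching is again an assumption. The last lines show the quoting rule: inside an ordinary string literal the backslash is doubled, because the compiler consumes one of them.

```python
import re

# Patterns quoted in the class, written as raw strings so no doubling is needed
INTEGER = r"[0-9]+"
FLOAT = r"[0-9]*\.[0-9]+"
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"

def whole(pattern, text):
    """True when the pattern matches the entire string (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

integer_hits = [s for s in ["42", "007", "", "4x"] if whole(INTEGER, s)]
float_hits = [s for s in ["3.14", ".5", "3.", "3"] if whole(FLOAT, s)]
name_hits = [s for s in ["foo_1", "_x", "1abc"] if whole(IDENTIFIER, s)]

# [^qed] matches any single character except q, e or d
negated_hits = [c for c in "quest" if whole("[^qed]", c)]

# In a non-raw string literal the backslash must be written twice;
# the resulting pattern is the two characters \ and ^, a literal caret.
caret_literal = "\\^"
caret_found = re.search(caret_literal, "2^8") is not None

print(integer_hits, float_hits, name_hits, negated_hits, caret_found)
```

Note that the float pattern as given in the class requires at least one digit after the point, so "3." and a bare "3" are rejected.
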
Rassilon says: note: ^ is only a special character at the beginning of a regexp, so "abc^" would only match: "abc^" and there is no reason to escape the ^ since it isn't at the beginning; the same is true with $ at the end of a regexp.
Profezzorn says: [abc] and [a-c] matches a,b or c
Profezzorn says: [^abc] matches anything but a, b and c
Profezzorn says: I think using ^ unquoted in the middle of a string is undefined
Heckler says: not true Rassilon, regexp would fail as would look for acbNEWLINE
Profezzorn says: ie. there is no telling what will happen
Heckler says: where NEWLINE means start of line
Heckler shrugs helplessly.
Profezzorn says: not true Heckler, ^ is not the same as newline
Heckler says: meant start of line... typed wrong thing ;)
Profezzorn says: actually, ^ wouldn't be necessary at all if it wasn't because regexps
Profezzorn says: match if they are a _part_ of the string
Profezzorn says: ie "a" matches any string that contains an 'a'
Profezzorn says: see what I mean?
Heckler nods solemnly.
Cejones says: If there is an error in the regular expression 0 will be
Cejones says: returned
Heckler says: but in ed each character in a line is a separate string ?
Profezzorn says: that depends on the driver cejones
Cejones says: ah
Profezzorn says: no, heckler, it just prints them that way
Profezzorn says: anyway, to make one more comparison with sscanf
Heckler says: dunno how this compiler handles strings...
Rassilon says: from man grep:
Rassilon says: ^ If the first character of string is a ^ (circumflex), the RE
Rassilon says: [^string] matches any character except the characters in string and
Rassilon says: the newline character. A ^ has this special meaning only if it occurs first in the string.
Heckler ruffles rassilons hair playfully.
Profezzorn says: ok, that manual says ^ can be used freely inside strings :)
Rassilon chuckles politely.
Profezzorn says: anyway, to compare with sscanf:
Cejones shrugs helplessly.
Profezzorn says: regexps are backtracking as opposed to sscanf
Profezzorn says: what that means is that regexps can break up a match and make another
Profezzorn says: I think it is best illustrated with an example
Profezzorn says: say that we have the input string "foo bas foo bar"
Profezzorn says: if we use sscanf with the string "%sfoo%cbar%s" it won't match fully
Profezzorn says: (%c isn't implemented in all lpmud drivers though)
Profezzorn says: but, if we match it against the regexp "^.*foo.bar.*$" it will match
Profezzorn says: because in the regexp the 'foo' will match the second foo in the string
Profezzorn says: but sscanf will match foo against the first foo and then stick to that
Profezzorn says: anybody understand what I'm saying?
Cejones nods solemnly.
Rassilon says: yep, sscanf is pretty stupid... :)
Profezzorn smiles happily.
Profezzorn says: exactly
Cejones says: it has always worked for me...
Heckler nods solemnly.
Cejones says: of course....I have never used regexp until now
Profezzorn says: regexps can be very useful, for instance if one wants to write a grep command on a mud
Profezzorn says: something like:
Profezzorn says: write(implode(regexp(explode(read_file(FILE),"\n"),match),"\n"))
A small gnome appears and cleans some of the boards that are empty.
The gnome leaves again.
Profezzorn says: also, it is very useful to use together with get_dir() to make help-commands and such
Profezzorn says: any questions?
Heckler nods solemnly.
Heckler says: like man on Nanny ?
Profezzorn says: hmm, not really :)
Heckler says: should I give up now ;)
Profezzorn says: anyway, shall we move on to some switch() usage?
Heckler says: dunno how that is coded *shrug*
Jfinch arrives.
Cejones bows to jfinch.
Jfinch smiles happily.
Heckler says: shoot
Profezzorn says: well, ok
Profezzorn says: who doesn't know how to use switch() ?
Profezzorn says: who knows how to use switch() ?
Y says: Me!
Heckler raises a hand.
Jfinch says: rais
Jfinch raises a hand.
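
The backtracking comparison and the grep idea from the regexp part of the class can be sketched as follows, with Python's re module standing in for the driver's regexp efun. The function first_foo_only is a hypothetical, deliberately naive matcher that only imitates the "commit to the first foo" behaviour the log attributes to sscanf; it is not how any driver implements sscanf. The grep function mirrors the explode / regexp / implode chain on an in-memory string rather than a file, and the sample text is made up for illustration.

```python
import re

line = "foo bas foo bar"

# The regexp engine lets ".*" give characters back until "foo.bar" fits,
# which happens at the second "foo".
backtracked = re.search(r"^.*foo.bar.*$", line) is not None

def first_foo_only(text):
    """Hypothetical non-backtracking matcher: take the first "foo",
    skip one character, then insist on "bar" and never reconsider."""
    pos = text.find("foo")
    if pos < 0:
        return False
    rest = text[pos + 3:]
    return rest[1:4] == "bar"

committed = first_foo_only(line)

def grep(text, pattern):
    """Split text into lines and keep those containing a match,
    like regexp(explode(text, "\n"), pattern) in the log."""
    return [row for row in text.split("\n") if re.search(pattern, row)]

sample = "a sword\nan axe\na shield"
grep_hits = grep(sample, "^a ")

print(backtracked, committed)
print("\n".join(grep_hits))
```

The naive matcher fails because the text after the first "foo" reads " bas", while the backtracking engine succeeds on the same input.
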
Y says: Just not with strings.
Cejones says: me me me
Profezzorn says: let's do an example of some odd switch() usage...
Profezzorn says: /* Duff's device */
Profezzorn says: void callplenty(int times,string fun,object ob)
Profezzorn says: {
Profezzorn says: int i;
Profezzorn says: i=times / 8;
Profezzorn says: switch(times % 8)
Profezzorn says: {
Profezzorn says: case 0: while(--i>=0) { ob->fun();
Profezzorn says: case 7: ob->fun();
Profezzorn says: case 6: ob->fun();
Profezzorn says: case 5: ob->fun();
Profezzorn says: case 4: ob->fun();
Profezzorn says: case 3: ob->fun();
Profezzorn says: case 2: ob->fun();
Profezzorn says: case 1: ob->fun(); }
Profezzorn says: }
Profezzorn says: }
Profezzorn says: I will put the example on the projector in a second
Profezzorn puts something on the projector.
Profezzorn says: who can tell me what that is good for?
Profezzorn says: it's on the projector if you need it
Cejones says: ummmmmmmmm
Profezzorn says: no one?
Heckler says: repeating a command....
Heckler says: or function... 8 times
Profezzorn says: close
Jfinch says: it calls the function 64 times ?
Profezzorn says: no
Cejones shrugs helplessly.
Rassilon says: yuck. talk about hard to read.
Profezzorn says: the code you see is an implementation of Duff's device
Jfinch says: nothing if times isn't a multiple of 8 ;)
Profezzorn says: it's a particularly nasty usage of the switch() construct
Profezzorn says: what it does is that it calls ob->fun() a number of times
Profezzorn says: exactly as many times as the variable times says actually
Profezzorn says: it jumps into the loop and starts executing there
Cejones says: we need a boggle emotion
Heckler says: to max of 8 times ?
Profezzorn says: nope
Rassilon says: ah... what a really silly way of doing it.
Profezzorn says: yes, it's an extremely silly way of doing it
Profezzorn says: there is no max to the number of times, except the size of the ints of course
Y says: Wow, you can jump into the middle of a while() with a switch()?
Profezzorn says: yes
Profezzorn says: not all lpc's implement it that way, but all C and most LPC does it that way
Cejones says: learn something new everyday :)
Rassilon says: well it makes sense, its just ugly looking code.
Profezzorn says: yes, the original use was to copy memory fast by unwrapping a loop
Profezzorn says: so it could copy 8 bytes per loop
Profezzorn says: and then using a switch to enter the loop at the precise location
Rassilon says: ah.
Profezzorn says: you get it heckler?
Heckler nods solemnly.
Profezzorn says: good, shall we take another example?
Rassilon nods solemnly.
Profezzorn says: write("spell cost max damage\n");
Profezzorn says: switch(level)
Profezzorn says: {
Profezzorn says: case 0..4:
Profezzorn says: write("You don't have any spells yet.\n");
Profezzorn says: break;
Profezzorn says: default: write("Fireball 20 40\n");
Profezzorn says: case 10..14: write("Shock 15 30\n");
Profezzorn says: case 5..9: write("Missile 10 20\n");
Profezzorn says: }
Profezzorn unloads the projector.
Profezzorn puts something on the projector.
Profezzorn says: it should be pretty obvious what this example does
Heckler nods solemnly.
Cejones nods solemnly.
Rassilon nods solemnly.
Profezzorn says: the new thing is that we use ranges in the case-statements
Heckler says: lists standard mud spells according to your level
Cejones says: is that standard? the ranges?
Profezzorn says: and fall-through to display all current spells (ie, no break)
Rassilon says: no.
Cejones says: I can't do that on my driver
Heckler says: looks standard to me
Profezzorn says: ?
Profezzorn says: what kind of driver you have cejones?
Rassilon says: nevermind, I'm not thinking. it ought to be.
Profezzorn says: all drivers I've tried have ranges in case
Cejones says: 'checking.....
A small gnome appears and cleans some of the boards that are empty.
The gnome leaves again.
A small gnome appears and cleans some of the boards that are empty.
The gnome leaves again.
Profezzorn says: it's not very well documented though :)
Profezzorn says: who said he couldn't use string switches?
Cejones says: CD.03.31
Y raises a hand.
Profezzorn says: ah, I don't know much about CD, so that might be true
Cejones shrugs helplessly.
Profezzorn says: Amylaar, Mudos, 3.1.2 and LPC4 all have it though
Cejones nods solemnly.
Profezzorn says: what's the problem Y?
Profezzorn says: string switches are just like any switches()
Profezzorn says: switch()es I mean :)
Y says: Actually, I just never tried it.
Y blushes.
Profezzorn says: ah
Profezzorn says: the strings have to be constant though
Profezzorn says: just as with integer switch()
Rassilon says: note: case ranges aren't standard C although gcc has them as an extension.
Profezzorn says: we're not discussing C here :)
Profezzorn says: but gcc has everything :)
Rassilon says: and uses three periods. :P cross overs might help some one... :)
Rassilon chuckles politely.
Profezzorn says: some last warnings about switch() though
Profezzorn says: most lp-drivers don't implement mixed-type switches
Profezzorn says: so if you have a command that does a switch() on the argument and
Profezzorn says: the user doesn't type an argument
Profezzorn says: the argument will be zero, and that will generate an error
Profezzorn says: another warning
Profezzorn says: if you write something like:
Profezzorn says: while(x) { switch(foo()) { ..... }; }
Profezzorn says: you _should_ be able to use continue inside the switch to continue the loop
Profezzorn says: BUT, doing so can crash older drivers
Heckler nods solemnly.
Rassilon nods solemnly.
Profezzorn says: now, any last questions before I end this class?
Cejones says: has it been an hour?
Profezzorn nods solemnly.
Rassilon nods solemnly.
Heckler says: midnight UK now...
Rassilon chuckles politely.
Cejones says: geesh, the time sure flies
Jfinch says: would it always crash or would it crash if the return value of foo() wasn't the same every time ?
Rassilon says: its a bug in the driver, nothing to do with the syntax.
Profezzorn says: it would crash the second or third time it loops
Profezzorn says: it's a bug yes, originally created by Amylaar
Rassilon chuckles politely.
Cejones says: so just make sure the player inputs something.....
Heckler says: sue him ;)
Heckler says: check input before running the switch...
Cejones nods solemnly.
Profezzorn says: correct, heckler
Rassilon says: be nice to amylaar, although I never understood the point of switching on strings, there are other ways to do that. (not to mention the inefficiency and bloody-mindedness of the switch code in 3.2.x)
Profezzorn says: ok, let's call that a wrap, shall we?
Rassilon nods solemnly.
Cejones nods solemnly.
Heckler nods solemnly.
Y says: Ok.