Intermediate LPC
Descartes of Borg
November 1993
Chapter 5: Advanced String Handling
5.1 What a String Is
The LPC Basics textbook taught strings as simple data types. LPC
generally deals with strings in such a matter. The underlying driver
program, however, is written in C, which has no string data type. The
driver in fact sees strings as a complex data type made up of an array of
characters, a simple C data type. LPC, on the other hand does not
recognize a character data type (there may actually be a driver or two out
there which do recognize the character as a data type, but in general not).
The net effect is that there are some array-like things you can do with
strings that you cannot do with other LPC data types.
The first efun regarding strings you should learn is the strlen() efun.
This efun returns the length in characters of an LPC string, and is thus
the string equivalent to sizeof() for arrays. Just from the behaviour of
this efun, you can see that the driver treats a string as if it were made up
of smaller elements. In this chapter, you will learn how to deal with
strings on a more basic level, as characters and sub strings.
5.2 Strings as Character Arrays
You can do nearly anything with strings that you can do with arrays,
except assign values on a character basis. At the most basic, you can
actually refer to character constants by enclosing them in '' (single
quotes). 'a' and "a" are therefore very different things in LPC. 'a'
represents a character which cannot be used in assignment statements or
any other operations except comparison evaluations. "a" on the other
hand is a string made up of a single character. You can add and subtract
other strings to it and assign it as a value to a variable.
With string variables, you can access the individual characters to run
comparisons against character constants using exactly the same syntax
that is used with arrays. In other words, the statement:
if(str[2] == 'a')
is a valid LPC statement comparing the second character in the str string
to the character 'a'. You have to be very careful that you are not
comparing elements of arrays to characters, nor are you comparing
characters of strings to strings.
LPC also allows you to access several characters together using LPC's
range operator ..:
if(str[0..1] == "ab")
In other words, you can look for the string which is formed by the
characters 0 through 1 in the string str. As with arrays, you must be
careful when using indexing or range operators so that you do not try to
reference an index number larger than the last index. Doing so will
result in an error.
Now you can see a couple of similarities between strings and arrays:
1) You may index on both to access the values of individual elements.
a) The individual elements of strings are characters
b) The individual elements of arrays match the data type of the
array.
2) You may operate on a range of values
a) Ex: "abcdef"[1..3] is the string "bcd"
b) Ex: ({ 1, 2, 3, 4, 5 })[1..3] is the int array ({ 2, 3, 4 })
And of course, you should always keep in mind the fundamental
difference: a string is not made up of a more fundamental LPC data type.
In other words, you may not act on the individual characters by
assigning them values.
5.3 The Efun sscanf()
You cannot do any decent string handling in LPC without using
sscanf(). Without it, you are left trying to play with the full strings
passed by command statements to the command functions. In other
words, you could not handle a command like: "give sword to leo", since
you would have no way of separating "sword to leo" into its constituent
parts. Commands such as these therefore use this efun in order to use
commands with multiple arguments or to make commands more
"English-like".
Most people find the manual entries for sscanf() to be rather difficult
reading. The function does not lend itself well to the format used by
manual entries. As I said above, the function is used to take a string and
break it into usable parts. Technically it is supposed to take a string and
scan it into one or more variables of varying types. Take the example
above:
int give(string str) {
string what, whom;
if(!str) return notify_fail("Give what to whom?\n");
if(sscanf(str, "%s to %s", what, whom) != 2)
return notify_fail("Give what to whom?\n");
... rest of give code ...
}
The efun sscanf() takes three or more arguments. The first argument is
the string you want scanned. The second argument is called a control
string. The control string is a model which demonstrates in what form
the original string is written, and how it should be divided up. The rest
of the arguments are variables to which you will assign values based
upon the control string.
The control string is made up of three different types of elements: 1)
constants, 2) variable arguments to be scanned, and 3) variable
arguments to be discarded. You must have as many of the variable
arguments in sscanf() as you have elements of type 2 in your control
string. In the above example, the control string was "%s to %s", which
is a three element control string made up of one constant part (" to "),
and two variable arguments to be scanned ("%s"). There were no
variables to be discarded.
The control string basically indicates that the function should find the
string " to " in the string str. Whatever comes before that constant will
be placed into the first variable argument as a string. The same thing
will happen to whatever comes after the constant.
Variable elements are noted by a "%" sign followed by a code for
decoding them. If the variable element is to be discarded, the "%" sign
is followed by the "*" as well as the code for decoding the variable.
Common codes for variable element decoding are "s" for strings and "d"
for integers. In addition, your mudlib may support other conversion
codes, such as "f" for float. So in the two examples above, the "%s" in
the control string indicates that whatever lies in the original string in the
corresponding place will be scanned into a new variable as a string.
A simple exercise. How would you turn the string "145" into an
integer?
Answer:
int x;
sscanf("145", "%d", x);
After the sscanf() function, x will equal the integer 145.
Whenever you scan a string against a control string, the function
searches the original string for the first instance of the first constant in
the original string. For example, if your string is "magic attack 100" and
you have the following:
int improve(string str) {
string skill;
int x;
if(sscanf(str, "%s %d", skill, x) != 2) return 0;
...
}
you would find that you have come up with the wrong return value for
sscanf() (more on the return values later). The control string, "%s %d",
is made up of to variables to be scanned and one constant. The constant
is " ". So the function searches the original string for the first instance
of " ", placing whatever comes before the " " into skill, and trying to
place whatever comes after the " " into x. This separates "magic attack
100" into the components "magic" and "attack 100". The function,
however, cannot make heads or tales of "attack 100" as an integer, so it
returns 1, meaning that 1 variable value was successfully scanned
("magic" into skill).
Perhaps you guessed from the above examples, but the efun sscanf()
returns an int, which is the number of variables into which values from
the original string were successfully scanned. Some examples with
return values for you to examine:
sscanf("swo rd descartes", "%s to %s", str1, str2) return: 0
sscanf("swo rd descartes", "%s %s", str1, str2) return: 2
sscanf("200 gold to descartes", "%d %s to %s", x, str1, str2) return: 3
sscanf("200 gold to descartes", "%d %*s to %s", x, str1) return: 2
where x is an int and str1 and str2 are string
5.4 Summary
LPC strings can be thought of as arrays of characters, yet always
keeping in mind that LPC does not have the character data type (with
most, but not all drivers). Since the character is not a true LPC data
type, you cannot act upon individual characters in an LPC string in the
same manner you would act upon different data types. Noticing the
intimate relationship between strings and arrays nevertheless makes it
easier to understand such concepts as the range operator and indexing on
strings.
There are efuns other than sscanf() which involve advanced string
handling, however, they are not needed nearly as often. You should
check on your mud for man or help files on the efuns: explode(),
implode(), replace_string(), sprintf(). All of these are very valuable
tools, especially if you intend to do coding at the mudlib level.
Copyright (c) George Reese 1993