libxmlrpc_util

This chapter describes the functions in the libxmlrpc_util function library, which is part of XML-RPC For C/C++ (Xmlrpc-c). Also see General Library Information - C.

libxmlrpc_util provides utilities not specifically related to XML-RPC. But the libraries of Xmlrpc-c use these facilities, so you need them to use the XML-RPC-specific libraries. libxmlrpc_util also contains some functions that are not documented for use outside of Xmlrpc-c, but rather are intended to be called only by other Xmlrpc-c code. libxmlrpc_util is a prerequisite of most other Xmlrpc-c libraries, which means you must link it with any program that you link with those other libraries.

The <xmlrpc-c/util.h> header file declares the interface to libxmlrpc_util and many other things described in this chapter.

You'll have to figure out where on your system this file lives and how to make your compiler look there for it. Or use xmlrpc-c-config.

Linking The Library

The classic Unix name for the file containing the libxmlrpc_util library is libxmlrpc_util.a or libxmlrpc_util.so. The classic linker option to cause the library to be linked into your program is -l xmlrpc_util. These are hints; you'll have to modify this according to conventions of your particular platform. You'll also have to figure out where the library resides and how to make your linker look there for it. Or use xmlrpc-c-config.

libxmlrpc_util has no prerequisites to link.

Assertions

Xmlrpc-c provides some macros and functions for making assertions of various things. Like the C library assert() macro, these generate the runtime assertion checking code described below. And also like assert(), they don't generate the assertion checking code if you compile with the NDEBUG macro defined, as with a -DNDEBUG compiler option.

The concept of assertions in code is widely misunderstood, because it is somewhat more abstract than C coders are used to. An assertion is a statement in code that declares a certain fact to be true. It doesn't make it true; it just declares that the coder knows it is true. In the standard C library, the assert() macro performs this duty.

But what does the statement cause the computer to do? If you're a high level coder, that's none of your business. You write code to describe the solution to a computational problem, and how the computer manipulates itself to compute the solution is beyond your concern. Your audience is the human reading your code.

So the most basic function of an assertion is to help the reader read the code. You assert that the value of variable foo is not zero, and that helps the reader see that it won't cause a problem to divide by it.

Another practical effect of an assertion could be that the compiler generates more efficient code with the additional information from your higher intelligence about how the program works. But I've never seen any actual compiler capable of that.

An assertion might also help the compiler to point out bugs in your program. You assert that the value of foo is zero, but you never initialized foo, so you obviously didn't write the program you thought you did. But I've never seen a compiler with that capability either.

Finally, there is run time checking of the assertion. This one is real. At run time, the program checks that the condition asserted really is true. If it isn't, which means the program is broken, it crashes itself. The advantages of this are twofold: 1) this makes it easier to diagnose the bug; 2) it stops the broken program from going on to damage something.

People often have trouble seeing the abstract meaning of an assertion and simply see it as a statement that tells the computer, "crash if X is not true." But in fact, it's quite the opposite: It says, "I assure you X is true."

People sometimes get the idea that assertion statements are for error checking. That is definitely not what they are for. If there is a possibility that X is false when the program is working as designed, you should not assert that X is true. Instead, check the truth of X and if it's false, issue an error message or exit the program or return with a failure return code, or something like that.

Incidentally, a common way you know something is true inside a subroutine is that you required as an entry condition to the subroutine that the caller make sure it is true. A subroutine assumes that its entry conditions are met, so if you set up the requirement that the caller pass only positive values for parameter X, you may legitimately assert inside the subroutine that X is positive. In fact, that assertion is a good formal way to state that entry condition for the reader of the code.
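To make the distinction between asserting and error checking concrete, here is a minimal sketch in plain standard C, using assert() rather than any Xmlrpc-c facility. The function names are hypothetical, made up for this illustration. The first function asserts its entry condition; the second deals with a value that can legitimately be zero in a working program, so it checks and reports failure instead.

```c
#include <assert.h>
#include <stdio.h>

/* Entry condition: the caller guarantees itemCount is not zero.
   The assertion states that fact for the reader; it is not error
   handling. */
static int
averagePerItem(int const total,
               int const itemCount) {

    assert(itemCount != 0);

    return total / itemCount;
}

/* Here a zero count is possible while the program is working as
   designed (a user might supply it), so we check it and fail cleanly
   instead of asserting.  Returns 0 on success, -1 on failure. */
static int
averageFromUserInput(int   const total,
                     int   const userSuppliedCount,
                     int * const averageP) {

    if (userSuppliedCount == 0) {
        fprintf(stderr, "Item count must not be zero\n");
        return -1;
    }
    /* We just established the callee's entry condition. */
    *averageP = averagePerItem(total, userSuppliedCount);
    return 0;
}
```

Compiling with -DNDEBUG removes the run time check in averagePerItem() but changes nothing about what the assertion tells the reader.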

You can use XMLRPC_ASSERT_PTR_OK() to assert that a pointer is a valid pointer and not a null pointer.

Error Handling

Xmlrpc-c provides facilities for handling errors. The Xmlrpc-c functions use them, and you can use them in your own code.

Example

Here is an example of how to use these facilities:


    xmlrpc_env env;

    xmlrpc_env_init(&env);

    xmlrpc_do_something(&env);
    if (env.fault_occurred)
        report_error_appropriately();

    xmlrpc_env_clean(&env);


Error Environment Variable

XML-RPC defines a "fault structure." It is represented in the protocol by a <fault> XML element. In an XML-RPC response, it describes how a request failed, with a fault code and a fault string. The fault code is supposed to be an integer from a small set agreed upon by the client and server and describe the general nature of the failure, while the fault string describes the failure in text, in a form typically suitable only for presenting to a human user. The fault string is whatever the server thinks best makes the point; there are no constraints on it.

Because an error detected in a server often turns into an error response to the client, and an error reported in a client is often what was reported in an error response from the server, Xmlrpc-c represents errors the same way as the XML-RPC fault structure. That makes it easy to pass the error information around.

Xmlrpc-c error handling facilities are based on an "error environment" variable (not to be confused with a Unix process "environment variable" -- we're talking about a C variable). An error environment variable has type xmlrpc_env. You pass one into a function which might recognize an error and fail. The function sets in the error environment variable whether or not it failed, and if it did, information about the failure analogous to an XML-RPC fault structure. If the failure has nothing to do with an XML-RPC request failing, the fault code is always XMLRPC_INTERNAL_ERROR.

I would like to point out that discrete error codes, such as XML-RPC fault codes and the classic C integer return code, are archaic -- they come from an era when a couple of bytes was all a program could spare for describing an error. They have the apparent advantage of being machine-friendly, but in practice people almost never write code to check the value of a discrete error code. The code at best checks whether a function succeeded or failed, and any more specific information than that just gets passed up to a human.

Moreover, if the fact of a particular kind of failure is informative enough that a program can take a particular action because of it, then the user ought to be able to get that information from a successful RPC instead of by analyzing the failure of one. For example, it is better to have a means of querying explicitly whether a file named foo exists than to have the user attempt to access the file and recognize a special fault for "file doesn't exist."

Therefore, I recommend you pay little attention to Xmlrpc-c's fault codes and instead treat the fault string as the primary error indication. Don't be stingy in your fault strings -- it doesn't cost you anything to return 25 words of error information and it might save the user a lot of diagnosis time. Break with the Unix tradition of providing at most 3 words of information about any failure. When a subroutine you call fails, include the subroutine's fault string plus information about the context in which you called it in the fault string you return to your caller.

Using Error Environment Variables

It is possible for a variable of type xmlrpc_env to be invalid. To assert that this is not the case, use XMLRPC_ASSERT_ENV_OK().

Before passing an error environment variable to anything, you must initialize it with xmlrpc_env_init(). This sets the environment variable to a valid state that indicates no error has occurred.

When you are done with an error environment variable, you must declare such by calling xmlrpc_env_clean() on it. After that, you cannot use it again unless you run xmlrpc_env_init() on it again.

To set an error environment variable to indicate an error has occurred, use xmlrpc_env_set_fault(). The arguments to that are a fault code and a fault string. Or use xmlrpc_env_set_fault_formatted(), which is the same except in place of the fault string argument, you have a printf-style format string followed by printf-style substitution arguments. Or use xmlrpc_faultf(), which is the same as xmlrpc_env_set_fault_formatted() except that the fault code is always XMLRPC_INTERNAL_ERROR (and its name is shorter!).

In any function call to build an error environment variable, you supply the fault string as a NUL-terminated UTF-8 string of XML characters. (If you aren't familiar with UTF-8, just use ASCII, because that meets the UTF-8 requirement. XML characters are all characters except the ASCII control characters, though CR, LF, and Tab do count as XML characters.) If your argument is not valid UTF-8 XML characters, the function edits it in arbitrary ways to make a string that is.

These are the defined fault codes and for each, the integer to which it maps should it find its way into an XML-RPC fault structure. The fault code names are defined C identifiers.

Fault code name                        XML-RPC code
-----------------------------------    ------------
XMLRPC_INTERNAL_ERROR                  -500
XMLRPC_TYPE_ERROR                      -501
XMLRPC_INDEX_ERROR                     -502
XMLRPC_PARSE_ERROR                     -503
XMLRPC_NETWORK_ERROR                   -504
XMLRPC_TIMEOUT_ERROR                   -505
XMLRPC_NO_SUCH_METHOD_ERROR            -506
XMLRPC_REQUEST_REFUSED_ERROR           -507
XMLRPC_INTROSPECTION_DISABLED_ERROR    -508
XMLRPC_LIMIT_EXCEEDED_ERROR            -509
XMLRPC_INVALID_UTF8_ERROR              -510

The XML-RPC specification does not specify the meaning of any fault codes -- it says that is up to the particular client and server or some higher standard to specify. Xmlrpc-c is one such higher standard, as expressed above. I don't know if those fault code values are used by any servers and clients other than those that use Xmlrpc-c.

Note that Xmlrpc-c does not conform to any part of the Fault Code Interoperability standard.

To determine whether an error environment variable indicates a failure or not, just look at the fault_occurred member of the C struct which is type xmlrpc_env. (This is uncharacteristically primitive and un-object-oriented. There really should be a fault_occurred() function (method) for querying this).

You must ensure that what you pass to a subroutine as an error environment variable is valid and that anything that is passed to you as an error environment variable is valid when you return. Conversely, you may assume that anything passed or returned to you as an error environment variable is valid. Bear in mind that the functions that manage error environment variables may have an argument that is an error environment variable, but not the error environment variable. E.g. the argument to xmlrpc_env_init() is an error environment variable, but its purpose isn't to return error information, and obviously, you aren't required to ensure it is valid before the call.

Throwing Errors

Xmlrpc-c provides facilities, based on error environment variables, for doing bailout style error handling -- basically, a goto around all the normal code wherever you detect that something's gone horribly wrong. This mimics what some modern object oriented languages do, but since C does not have the infrastructure that those have (mainly, automatic destructors of objects), it basically is just an ugly goto, with all the associated unreadability and opportunity for coding error.

If you like high level code, don't use these facilities.

XMLRPC_FAIL() sets the specified environment variable to indicate a failure, with the indicated fault code and fault string, and then branches to the label cleanup. You must provide a label cleanup that undoes anything your subroutine might have done and returns from your subroutine.

Example:


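#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <xmlrpc-c/util.h>

static void
mysubroutine(xmlrpc_env * const envP) {
  void * mymem;
  int fd;

  mymem = malloc(80);

  fd = open("myfile", O_RDONLY);
  if (fd < 0)
    XMLRPC_FAIL(envP, XMLRPC_INTERNAL_ERROR, "couldn't open the file");

  /* process the file ... */

cleanup:
  /* We get here both after normal processing and after a throw */
  if (fd >= 0)
    close(fd);
  free(mymem);
}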


We call what XMLRPC_FAIL does "throwing an error."

XMLRPC_FAIL1() is the same as XMLRPC_FAIL(), except that instead of the fault string, you give it a printf-style format string that contains 1 substitution and an argument for that substitution.

XMLRPC_FAIL2() is the same except with 2 substitutions, and XMLRPC_FAIL3() is the same except with 3 substitutions.

XMLRPC_FAIL_IF_NULL() is the same as XMLRPC_FAIL() except that you also pass it a pointer, and it throws an error only if that pointer is null; otherwise, it does nothing.

XMLRPC_FAIL_IF_FAULT() checks the error environment variable you supply and if it indicates no error, does nothing. If it does indicate an error, XMLRPC_FAIL_IF_FAULT() throws an error.

Fatal Error Handling

The utilities in this section have nothing particular to do with Xmlrpc-c. They're just conveniences for any C programmer.

XMLRPC_FATAL_ERROR() issues a message containing the indicated string and the location in the program that you called it, if possible. Use this where you detect a bug in your program.

Memory Blocks

Xmlrpc-c provides a data type for managing blocks of memory. Xmlrpc-c uses such blocks of memory in some of its interfaces, and you may find it convenient to use for your own purposes too.

The basic object is a "memory block." A memory block is essentially an array whose size you can change at will. The array elements can be of any type (e.g. an array of characters, an array of integers, an array of struct foo). Here's an example:


#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <xmlrpc-c/util.h>

struct mystruct {
    time_t measurementTime;
    int measurement;
    bool reported;
};
struct mystruct * myArray;
xmlrpc_mem_block * myMemBlockP;
xmlrpc_env env;
unsigned int i;

xmlrpc_env_init(&env);

/* Create an empty block (zero elements) */
myMemBlockP = XMLRPC_TYPED_MEM_BLOCK_NEW(struct mystruct, &env, 0);

for (i = 0; i < 5; ++i) {
    struct mystruct newMeasurement;
    newMeasurement.measurementTime = time(NULL);
    newMeasurement.measurement     = currentMeasurement();
    newMeasurement.reported        = false;

    /* Grow the block by one element and copy newMeasurement into it */
    XMLRPC_TYPED_MEM_BLOCK_APPEND(struct mystruct, &env,
                                  myMemBlockP,
                                  &newMeasurement, 1);
}
myArray = XMLRPC_TYPED_MEM_BLOCK_CONTENTS(struct mystruct, myMemBlockP);

printf("the 3rd measurement was %d\n", myArray[2].measurement);

myArray[2].reported = true;

...

XMLRPC_TYPED_MEM_BLOCK_FREE(struct mystruct, myMemBlockP);
xmlrpc_env_clean(&env);


xmlrpc_mem_block is the type of a memory block. This one type is used for all memory blocks regardless of the element type, and you must indicate on every call to the memory block manipulation functions what the type is.

To create a memory block, use XMLRPC_TYPED_MEM_BLOCK_NEW(). To destroy one, use XMLRPC_TYPED_MEM_BLOCK_FREE().

To find out the current size (number of elements) of a memory block, use XMLRPC_TYPED_MEM_BLOCK_SIZE(). To change the current size (to any number of elements), use XMLRPC_TYPED_MEM_BLOCK_RESIZE().

To access and update the contents of a memory block, use XMLRPC_TYPED_MEM_BLOCK_CONTENTS() to get a pointer that you can use as a C array. (See above example).

Another way to set the contents of the memory block is to use XMLRPC_TYPED_MEM_BLOCK_APPEND() to increase the size of the array by one and set the added element to the value you specify.

XMLRPC_TYPED_MEM_BLOCK_INIT() and XMLRPC_TYPED_MEM_BLOCK_CLEAN() are like the create and destroy functions, but you supply the storage for the xmlrpc_mem_block. Example:


static void
myFunction(void) {
    xmlrpc_env env;
    xmlrpc_mem_block myMemBlock;

    xmlrpc_env_init(&env);

    XMLRPC_TYPED_MEM_BLOCK_INIT(unsigned int, &env, &myMemBlock, 5);
    if (!env.fault_occurred) {
        /* use myMemBlock */
        XMLRPC_TYPED_MEM_BLOCK_CLEAN(unsigned int, &myMemBlock);
    }
    xmlrpc_env_clean(&env);
}


Base64 Encoding And Decoding

These are functions to encode byte strings into base64 and decode base64.

Base64 is a code that can represent an arbitrary string of bytes as a string of printable characters. The base64 code uses only the 64 printable characters from the ASCII code. These routines encode those characters as ASCII. The encodings are therefore necessarily UTF-8 Unicode as well.

In the XML-RPC <base64> XML element, the contents of the element are the base64 encoding of a byte string. So you can use these functions to create and interpret an XML-RPC XML stream. You don't need these functions if you use the higher level Xmlrpc-c facilities, because they take care of the encoding and decoding of XML for you. Even if you use the XML Encoding And Decoding functions, the library takes care of it. You need these functions for XML-RPC only if you're assembling your own XML character by character.

xmlrpc_base64_encode() encodes a string of bytes into base64.

xmlrpc_base64_encode_without_newlines() does the same, but encodes it as a single line. Base64 allows the characters to be spread over lines any way you like -- the line breaks are meaningless. xmlrpc_base64_encode() uses short lines to make the code easy to display. But some things that look at base64 codes, such as HTTP authentication, do not understand that newlines are meaningless, so you may need to use xmlrpc_base64_encode_without_newlines() to make sure the only newline is at the very end.

xmlrpc_base64_decode() decodes a string of bytes from base64.
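If you have never looked inside base64, the following sketch shows the idea: each group of 3 bytes becomes 4 characters from the 64-character alphabet, and '=' pads a final short group. This is plain standard C written for illustration. It is not the Xmlrpc-c implementation, and sketchBase64Encode() is a hypothetical name. It produces a single line with no newline at all.

```c
#include <stddef.h>

static const char base64Alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes 'len' bytes at 'in' as base64 into 'out', NUL-terminated,
   with no line breaks.  'out' must have room for
   4 * ((len + 2) / 3) + 1 characters. */
static void
sketchBase64Encode(const unsigned char * const in,
                   size_t                const len,
                   char *                const out) {
    size_t i, o;

    for (i = 0, o = 0; i < len; i += 3) {
        size_t const remaining = len - i;

        /* Pack up to 3 bytes into a 24 bit group, zero-filled */
        unsigned long const group =
            ((unsigned long)in[i] << 16) |
            (remaining > 1 ? (unsigned long)in[i+1] << 8 : 0) |
            (remaining > 2 ? (unsigned long)in[i+2] : 0);

        /* Emit the group as four 6 bit values, padding with '=' */
        out[o++] = base64Alphabet[(group >> 18) & 0x3f];
        out[o++] = base64Alphabet[(group >> 12) & 0x3f];
        out[o++] = remaining > 1 ?
            base64Alphabet[(group >> 6) & 0x3f] : '=';
        out[o++] = remaining > 2 ?
            base64Alphabet[group & 0x3f] : '=';
    }
    out[o] = '\0';
}
```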

Character Code Conversions

These functions deal with the various ways of representing characters and text strings in a program. See Character Encoding.

Like XML-RPC, Xmlrpc-c services mostly expect strings in UTF-8 form. Note that all ASCII strings are also UTF-8. A UTF-8 string in which no byte has the high bit on is an ASCII string.
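The high bit rule is easy to check for yourself. Here is a small sketch in standard C; isPlainAscii() is a hypothetical name, not an Xmlrpc-c function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if none of the 'len' bytes at 'data' has the high bit
   on, which means the bytes are ASCII (and therefore also UTF-8). */
static bool
isPlainAscii(const char * const data,
             size_t       const len) {
    size_t i;

    for (i = 0; i < len; ++i) {
        if ((unsigned char)data[i] & 0x80)
            return false;
    }
    return true;
}
```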

Assuming the underlying platform has "wide" character string services, Xmlrpc-c can alternatively deal with UCS-2 strings represented with the wchar_t data type.

Note that ANSI C does not precisely define the representation used by wchar_t. Xmlrpc-c always uses UCS-2, but your program may expect UTF-16, UCS-4, or just about anything else. If so, you won't be able to use these conversion subroutines.

xmlrpc_validate_utf8()

xmlrpc_validate_utf8() confirms that a text string contains valid UTF-8 characters. It fails if the string does not. More precisely, it confirms that the text string contains valid UTF-8 characters of the Basic Multilingual Plane, which means they could be represented as UCS-2.

Prototype:

  
    void 
    xmlrpc_validate_utf8(xmlrpc_env * envP,
                         const char * utf8_data,
                         size_t       utf8_len);
    
  

envP is the handle of an error environment variable.

utf8_data points to the bytes which form the alleged UTF-8 string; utf8_len is its length.

xmlrpc_utf8_to_wcs()

xmlrpc_utf8_to_wcs() converts a UTF-8 character string to a UCS-2 "wide" character string (wchar_t).

Prototype:

  
    xmlrpc_mem_block *
    xmlrpc_utf8_to_wcs(xmlrpc_env * envP,
                       const char * utf8_data,
                       size_t       utf8_len);
    
  

envP is the handle of an error environment variable.

utf8_data points to the bytes which form the UTF-8 string; utf8_len is its length. If this is not valid UTF-8, the function fails.

The function returns a newly constructed memory block which contains the bytes of the UCS-2 string.

xmlrpc_wcs_to_utf8()

xmlrpc_wcs_to_utf8() converts a UCS-2 "wide" character string (wchar_t) to UTF-8.

Prototype:

  
    xmlrpc_mem_block *
    xmlrpc_wcs_to_utf8(xmlrpc_env *    const envP,
                       const wchar_t * const wcsData,
                       size_t          const wcsLen);
    
  

envP is the handle of an error environment variable.

wcsData points to the string of UCS-2 characters. wcsLen is its length in characters (i.e. the number of bytes it occupies is twice wcsLen).

xmlrpc_force_to_utf8()

xmlrpc_force_to_utf8() is a weird but handy function. It examines the string you pass it to see if it is valid UTF-8, like xmlrpc_validate_utf8(), but if it isn't, modifies it until it is. To do this, it 1) replaces some characters that have the high bit on with DEL (0x7F) and 2) chops a few characters off the end.

This is useful because you may have some memory that is supposed to contain a UTF-8 string but in reality contains garbage. The supposed string winds its way through XML-RPC to end up being viewed by a human, and you'd like that human to see that it's garbage and deal with it. But the various Xmlrpc-c functions won't transport arbitrary garbage; they transport only UTF-8. The mutations that xmlrpc_force_to_utf8() makes are small enough that if it's only partly garbage, then the non-garbage parts probably will make it all the way through.

Prototype:

  
    void
    xmlrpc_force_to_utf8(char * const buffer);
    
  

buffer points to the quasi-UTF-8 string, which is terminated by a zero byte (both as input and output).

This function was new in Xmlrpc-c 1.12 (September 2007).

xmlrpc_force_to_xml_chars()

xmlrpc_force_to_xml_chars() is similar to xmlrpc_force_to_utf8(). It modifies a UTF-8 string to make it contain nothing but valid XML characters. All UTF-8 characters except control characters are also XML characters, and CR, LF, and tab are XML characters as well. So this function simply replaces those 29 non-XML UTF-8 characters with DEL (the XML character with code point 0x7F).

This isn't terribly useful, because of the Xmlrpc-c extension to XML-RPC that allows a string to have non-XML characters in it. But you may not want to exploit that extension because your communication partner doesn't have a similar extension. In that case, you will have to make sure you never create an XML-RPC string value that contains a non-XML character, and xmlrpc_force_to_xml_chars() can help with that.
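To show what that replacement amounts to, here is a sketch in standard C of the same kind of edit. It is not the Xmlrpc-c code, and sketchForceToXmlChars() is a hypothetical name. It relies on the fact that in UTF-8, a byte value below 0x80 never occurs inside a multibyte character, so looking at individual bytes is enough. DEL is 0x7F in ASCII.

```c
/* Replaces, in place, every ASCII control character in the
   NUL-terminated string 'buffer', other than Tab, LF, and CR, with
   DEL (0x7F). */
static void
sketchForceToXmlChars(char * const buffer) {
    char * p;

    for (p = buffer; *p; ++p) {
        unsigned char const c = (unsigned char)*p;

        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r')
            *p = 0x7f;
    }
}
```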

This function was new in Xmlrpc-c 1.12 (September 2007).