C++ FAQ Celebrating Twenty-One Years of the C++ FAQ!!!
[36.9] How do I serialize objects that contain pointers to other objects, but those pointers form a tree with no cycles and no joins?

First, you must understand that the word "tree" does not mean that the objects are stored in some sort of tree-like data structure like std::set. It simply means that your objects point to each other. The "with no cycles" part means that if you keep following pointers from one object to the next, you never return to an earlier object, and the "no joins" part means that no object is pointed to by more than one other object. Your objects aren't "inside" a tree; they are a tree. If you don't understand that, you really should read the lingo FAQ before continuing with this one.

Second, don't use this technique if the graph might someday contain cycles or joins.

Graphs with neither cycles nor joins are very common, even with "recursive composition" design patterns like Composite or Decorator. For example, the objects representing an XML document or an HTML document can be represented as a graph without joins or cycles.

The key to serializing these graphs is to ignore a node's identity and instead to focus only on its contents. A (typically recursive) algorithm dives through the tree and writes the contents as it goes. For example, if the current node happens to have an integer a, a pointer b, a float c, and another pointer d, then you first write the integer a, then recursively dive into the child pointed to by b, then write the float c, and finally recursively dive into the child pointed to by d. (You don't have to write/read them in the declaration order; the only essential rule is that the reader's order is consistent with the writer's order.)
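The writing side described above might be sketched as follows. This is only an illustration: the types Node and Leaf, the helper serialize(), and the whitespace-separated text format are assumptions made for the example, and both pointers are assumed to be non-null.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical leaf type: its only content is one integer.
struct Leaf {
    int value = 0;
    void write(std::ostream& out) const { out << value << ' '; }
};

// Hypothetical node with the layout from the text:
// an integer a, a pointer b, a float c, and another pointer d.
struct Node {
    int a = 0;
    std::unique_ptr<Leaf> b;  // assumed non-null in this sketch
    float c = 0.0f;
    std::unique_ptr<Leaf> d;  // assumed non-null in this sketch

    // Write the contents in a fixed order, diving into each child in turn.
    // No addresses or object identities are written, only contents.
    void write(std::ostream& out) const {
        out << a << ' ';
        b->write(out);
        out << c << ' ';
        d->write(out);
    }
};

// Convenience helper (an assumption for this example): serialize to a string.
std::string serialize(const Node& n) {
    std::ostringstream out;
    n.write(out);
    return out.str();
}
```

Whatever order write() uses, the reading code has to consume the fields in exactly that order.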

When unserializing, you need a constructor that takes a std::istream&. The constructor for the above object would read an integer and store the result in a, then allocate an object to be stored in pointer b (passing the std::istream to that object's constructor so it too can read the stream's contents), then read a float into c, and finally allocate an object to be stored in pointer d. Be sure to use smart pointers within your objects; otherwise you will get a leak if an exception is thrown while reading anything after the first pointed-to object has been allocated.
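A reading constructor along these lines could look like the sketch below. Node and Leaf are hypothetical types, std::unique_ptr stands in for whichever smart pointer you prefer, and throwing std::runtime_error on malformed input is an assumption of this example rather than something the FAQ prescribes.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>

// Hypothetical leaf type that reads its single integer from the stream.
struct Leaf {
    int value = 0;
    explicit Leaf(std::istream& in) {
        if (!(in >> value)) throw std::runtime_error("Leaf: bad input");
    }
};

// Hypothetical node: int a, pointer b, float c, pointer d.
struct Node {
    int a = 0;
    std::unique_ptr<Leaf> b;
    float c = 0.0f;
    std::unique_ptr<Leaf> d;

    // Read the fields in the same order the writer wrote them.
    explicit Node(std::istream& in) {
        if (!(in >> a)) throw std::runtime_error("Node: bad a");
        b = std::make_unique<Leaf>(in);  // the child reads its own contents
        if (!(in >> c)) throw std::runtime_error("Node: bad c");
        // If anything from here on throws, the already-constructed member b
        // is destroyed automatically, so the first child does not leak.
        d = std::make_unique<Leaf>(in);
    }
};
```

With raw pointers instead of std::unique_ptr, a failure while reading c or d would leave the object allocated for b with no owner.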

It is often convenient to use the Named Constructor Idiom when allocating these objects. This has the advantage that you can enforce the use of smart pointers. To do this in a class Foo, make the constructors private or protected and write a static method such as FooPtr Foo::create(std::istream& istr) { return FooPtr(new Foo(istr)); } (where FooPtr is a smart pointer to a Foo; the explicit FooPtr(...) is needed when the smart pointer's constructor is explicit, as with std::unique_ptr or std::shared_ptr). The alert reader will note how consistent this is with the technique discussed in the previous FAQ: the two techniques are completely compatible.
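Here is one way the create() method could be written out in full. The choice of std::unique_ptr for FooPtr, the single integer member, and the value() accessor are assumptions made for this sketch; the FAQ leaves the smart-pointer type open.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>

class Foo;
using FooPtr = std::unique_ptr<Foo>;  // assumption: any smart pointer would do

class Foo {
public:
    // Named constructor: the only way for outside code to build a Foo from
    // a stream, so callers always receive a smart pointer.
    static FooPtr create(std::istream& istr) {
        // std::make_unique cannot reach the private constructor, hence new.
        return FooPtr(new Foo(istr));
    }

    int value() const { return value_; }  // hypothetical accessor

private:
    // Private, so "new Foo(istr)" outside the class does not compile.
    explicit Foo(std::istream& istr) {
        if (!(istr >> value_)) throw std::runtime_error("Foo: bad input");
    }

    int value_ = 0;
};
```

A parent's reading constructor would then call Foo::create(in) for each child instead of using new directly.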

If an object can contain a variable number of children, e.g., a std::vector of pointers, then the usual approach is to write the number of children just before recursively diving into the first child. When unserializing, just read the number of children, then use a loop to allocate the appropriate number of child objects.
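The count-then-children approach might be sketched like this. The Item type and the whitespace-separated text format are assumptions made for the example.

```cpp
#include <cstddef>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical node with a variable number of children.
struct Item {
    int value = 0;
    std::vector<std::unique_ptr<Item>> children;

    explicit Item(int v) : value(v) {}

    // Reading: get the child count first, then loop that many times.
    explicit Item(std::istream& in) {
        std::size_t count = 0;
        if (!(in >> value >> count)) throw std::runtime_error("Item: bad input");
        for (std::size_t i = 0; i < count; ++i)
            children.push_back(std::make_unique<Item>(in));
    }

    // Writing: emit the child count just before diving into the first child.
    void write(std::ostream& out) const {
        out << value << ' ' << children.size() << ' ';
        for (const auto& child : children)
            child->write(out);
    }
};
```

A leaf simply writes a count of 0, so the reader needs no special case for it.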

If a child-pointer might be NULL, be sure to handle that in both the writing and the reading. This shouldn't be a problem if your objects use inheritance; see the solution for inheritance hierarchies for details. Otherwise, if the first serialized character in an object has a known range, use something outside that range. E.g., if the first character of a serialized object is always a digit, use a non-digit like 'N' to mean a NULL pointer. Unserialization can use std::istream::peek() to check for the 'N' tag. If the first character doesn't have a known range, force one; e.g., write 'x' before each object, then use something else like 'y' to mean NULL.
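The forced-tag variant ('x' before each object, 'y' for NULL) could be sketched as below. The Leaf type and the helper names writeLeafPtr and readLeafPtr are hypothetical, as is the choice to throw std::runtime_error on an unexpected tag.

```cpp
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical child type holding one integer.
struct Leaf {
    int value = 0;
};

// Write 'x' followed by the contents for a real object, or 'y' for NULL.
void writeLeafPtr(std::ostream& out, const Leaf* p) {
    if (p == nullptr) {
        out << "y ";
    } else {
        out << 'x' << p->value << ' ';
    }
}

// Peek at the tag to decide whether there is an object to read.
std::unique_ptr<Leaf> readLeafPtr(std::istream& in) {
    in >> std::ws;  // skip separators so peek() sees the tag itself
    const int tag = in.peek();
    if (tag == 'y') {
        in.get();  // consume the NULL tag
        return nullptr;
    }
    if (tag != 'x') throw std::runtime_error("expected 'x' or 'y' tag");
    in.get();  // consume the object tag
    auto leaf = std::make_unique<Leaf>();
    if (!(in >> leaf->value)) throw std::runtime_error("Leaf: bad input");
    return leaf;
}
```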

If an object physically contains another object inside itself, as opposed to containing a pointer to the other object, then nothing changes: you still recursively dive into the child-node as if it were via a pointer.
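For the contained-object case, a sketch under the same assumptions (hypothetical types Point and Segment, whitespace-separated text) might look like this. The only C++ wrinkle is that members are initialized in declaration order, so that order has to match the order used by the writer.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical value type that reads and writes two integers.
struct Point {
    int x = 0;
    int y = 0;

    explicit Point(std::istream& in) {
        if (!(in >> x >> y)) throw std::runtime_error("Point: bad input");
    }

    void write(std::ostream& out) const { out << x << ' ' << y << ' '; }
};

// Hypothetical object that physically contains two Points (no pointers).
struct Segment {
    Point from;
    Point to;

    // Members are initialized in declaration order: from first, then to.
    explicit Segment(std::istream& in) : from(in), to(in) {}

    // Dive into each contained object just as you would through a pointer.
    void write(std::ostream& out) const {
        from.write(out);
        to.write(out);
    }
};
```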