
Easily write and format collections and collection ranges in C++

Writing and formatting collections can pose quite a challenge for the C++ developer. We'll walk you through the code and explain how you can make the task more manageable.


In my previous article, I showed you how to easily format ranges and containers as you write them to Standard Template Library (STL) streams. Writing ranges is straightforward using operator <<. You use range( itFirst, itLast [, formatter]) for writing a range and container( cont [, formatter]) for writing a container. You can supply a formatter, which formats each element prior to writing it. If you don’t supply one, a default one will be provided for you.

Now it's time to see how you can easily write collections and collection ranges with familiar syntax. Table A shows examples of how you can write a collection:
Table A
Output by type of writing

Type of writing: Default
Output:
Easily -> 3, collections -> 12, write -> 5

Type of writing: With formatter (custom transformation and custom writer)
Output:
Easily -> 3
collections -> 12
write -> 5

Type of writing: With formatter (custom transformation and custom writer)
Output:
[Easily] -> [3];
[collections] -> [12];
[write] -> [5].

Type of writing: With formatter (custom transformation and custom writer)
Output:
Key: Easily, Value: 3;
Key: collections, Value: 12;
Key: write, Value: 5.

Type of writing: With formatter (custom transformation and custom writer)
Output:
Word 'Easily' appeared 3 time(s);
Word 'collections' appeared 12 time(s);
Word 'write' appeared 5 time(s).

Type of writing: With formatter (custom transformation, custom transformer, and custom writer)
Output:
Word 'EASILY' appeared 3 time(s);
Word 'COLLECTIONS' appeared 12 time(s);
Word 'WRITE' appeared 5 time(s).

Type of writing: With formatter (custom transformation, custom transformer, and custom writer)
Output:
[0] Word 'Easily' appeared 3 time(s);
[1] Word 'collections' appeared 12 time(s);
[2] Word 'write' appeared 5 time(s).

Writing a collection

Collections are associative containers, which associate a key with a value (for example, std::map and std::multimap). Since collections are containers, you might assume that the range/container functions should also work for them. But consider the following code:
// try to parse a file, and for each word write how many times
// it appeared in the file
std::map< std::string, int> collWordCounts;
// … code
std::cout << container( collWordCounts) << std::endl;


This code will generate a compile-time error. Each element in a collection is a std::pair< const Key, Value> (in our case, std::pair< const std::string, int>). There is no operator<< defined for std::pair< Key, Value>, hence the error. Even if one were defined, it would not have been satisfactory for our needs. You would expect the output to look something like:
‘easily’ -> 3
‘write’ -> 5
‘collections’ -> 12


Or, even better:
Word ‘easily’ appeared 3 time(s).
Word ‘write’ appeared 5 time(s).
Word ‘collections’ appeared 12 time(s).


To understand the difference between writing a collection and writing a sequence container, you have to look at the objects used in formatting a range/container:
  • The formatter object decides what transformation will be applied to each element and how the elements will be written.
  • The writer object controls what is written around the elements (a prefix, text after each element, and a suffix).
  • The transformation object allows you to transform each element prior to writing it.

For collections, each element is a std::pair< const Key, Value>, where Key is the collection’s key type and Value is its value type. The formatter and writer can stay the same, since those concepts are the same for collections. The transformation, however, must decide how to write a pair of Key and Value, whereas for sequence containers and ranges it handles just a single value.

Since you handle the transformation differently, the default transformation is also different. Instead of writing the value, as it does for sequence containers and ranges, the transformation writes each element as “<key> -> <value>”, for example “easily -> 3”. This means a different default formatter as well. (It uses the default transformation for collections, not the one for sequence containers.) Solely for this purpose, I have created the functions coll_range and coll_container, which resemble the range/container syntax.

The transformation
As explained above, the transformation now applies to a [key, value] pair. So the code shown in Listing A will work as expected.

But most of the time it’s cumbersome to do it this way. To make it simpler, you can pass a format string that specifies where the key and the value should go: %k stands for the key and %v for the value. Table B shows examples.
Table B
Examples of code and its output

std::cout << coll_container( collWords,
formatter( coll_transform( "%k -> %v"), "\n"));
will write:

collections -> 12
easily -> 3
write -> 5

std::cout << coll_container( collWords,
formatter( coll_transform( "Key: %k, Value: %v"), "\n"));
will write:

Key: collections, Value: 12
Key: easily, Value: 3
Key: write, Value: 5

std::cout << coll_container( collWords,
formatter( coll_transform( "[%k] -> [%v]"), "\n"));
will write:

[collections] -> [12]
[easily] -> [3]
[write] -> [5]

std::cout << coll_container( collWords,
formatter( coll_transform( "Word '%k' appeared %v time(s)."), "\n"));
will write:

Word 'collections' appeared 12 time(s).
Word 'easily' appeared 3 time(s).
Word 'write' appeared 5 time(s).


As a bonus, you don’t have to write both the key and the value, as this snippet illustrates:
// will write only the keys
std::cout << coll_container( collWords, formatter( coll_transform( "%k")));
// will write only the values
std::cout << coll_container( collWords, formatter( coll_transform( "%v")));
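
To make the %k/%v substitution concrete, here is a minimal sketch of how such a format string could be expanded for a single [key, value] pair. The function expand_format is a hypothetical helper written for this illustration; the library's coll_transform may work differently internally.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: expands a coll_transform-style format string for one
// [key, value] pair. %k is replaced by the key and %v by the value; every
// other character is copied unchanged.
template< class Key, class Value>
std::string expand_format( const std::string & format, const Key & key, const Value & value)
{
    std::ostringstream out;
    for ( std::string::size_type i = 0; i < format.size(); ++i)
    {
        const bool hasNext = ( i + 1 < format.size());
        if ( format[ i] == '%' && hasNext && format[ i + 1] == 'k')
        {
            out << key;
            ++i;    // skip the 'k'
        }
        else if ( format[ i] == '%' && hasNext && format[ i + 1] == 'v')
        {
            out << value;
            ++i;    // skip the 'v'
        }
        else
            out << format[ i];
    }
    return out.str();
}
```

Because the key and the value are only written where their placeholder appears, a format string such as "%k" or "%v" naturally writes just one of the two.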


The coll_transform function can be used like this:
  • coll_transform( strFormat) applies the strFormat transformation to each [key, value] pair from the collection.
  • coll_transform( strFormat, transformer) applies the strFormat transformation to each [key, value] pair from the collection, and applies the transformer before the string, key, and value are written. An example of a transformer is the class KeyToUpper, which we will demonstrate shortly.

Remember that for sequence containers, the transformation allowed transforming an element prior to writing it. For example, for an array of Names, you might want the names to appear like this: KEITH, Jones (first name in uppercase, last name normal). The same applies here. The transformer allows for that; the object's class must implement the following functions:
  • write_prefix( streamOut, std::pair< const Key, Value>) writes a prefix before writing the string corresponding to this element; the second argument is provided for when you need to compute the prefix based on the element.
  • transform_value( streamOut, Value) transforms the value and writes it.
  • transform_key( streamOut, Key) transforms the key and writes it.

The default transformer is called coll_transformation_base. It does not write any prefix (write_prefix is empty) and writes the value and the key unchanged. If you want to refine this, derive your class from coll_transformation_base and redefine only the functions you need.

For example, here’s how you write the key in uppercase:
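#include <locale>
#include <string>

// writes the Key (which MUST be a string)
// in upper case
class KeyToUpper
    : public coll_transformation_base
{
public:
    template< class StreamType, class CharType>
        void transform_key( StreamType & streamOut, const std::basic_string< CharType> & key)
    {
        std::basic_string< CharType> upper = key;
        // the <locale> version of std::toupper works for any character type
        // (char, wchar_t); the C toupper only handles char values
        for ( typename std::basic_string< CharType>::size_type i = 0; i < upper.size(); ++i)
            upper[ i] = std::toupper( upper[ i], std::locale());
        streamOut << upper;
    }
};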


Listing B shows what you can do by combining the transformation (format string and transformer) with the writer. The comments show the output of the code.

Here’s what you need to remember when writing collections:
  • Use coll_range/coll_container instead of range/container.
  • When using a formatter, use coll_transform as its first parameter. The only exception to this is when you write something like class write_key_and_value, as shown in Listing A.
  • When using coll_transform, its first parameter is the format string. You can provide a transformer as a second parameter, which will transform the string, key, and value prior to writing them.
  • The format string does not need to include both key (%k) and value (%v). For instance, "Key: %k" will write only the keys, each prefixed by "Key: ", and "%v" will write only the values.

Try it out
You can download the code that allows writing ranges, containers, collection ranges/collections, as well as an example. Run the example with an argument—a text file to read from. For each word in the file, it will count how many times it has appeared and show this on the console.

 
