This implementation is not extremely efficient (string copies, temporary vectors, ...), but it's correct. Any optimization can be done if it's found to be too inefficient.