It's very hard to beat the performance of a simple pointer increment. It typically maps to a single machine instruction and executes in one clock cycle (or less, in some circumstances, when the CPU can issue several instructions per cycle).

Iterators are nice because they provide a level of abstraction between the operation and the implementation; this brings several advantages, including the ability to swap in a new implementation without disturbing existing code, and the provision of a consistent interface for similar functionality across a broad range of objects. But like all things, abstraction comes at a cost, and often incurs a performance penalty. Which is more important - raw performance or maintainability - is up to the programmer.
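To make the abstraction point concrete, here is a minimal sketch. The helper name `sum_range` and the example function are made up for illustration and are not from your code. The loop is written once against the iterator interface and works unchanged for a `std::vector`, a `std::list`, or a raw array, since plain pointers satisfy the same interface.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical helper: sums any range described by a pair of iterators.
// Only !=, ++ and * are used, so any input iterator (or raw pointer) works.
template <typename Iter>
long sum_range(Iter first, Iter last) {
    long total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// Hypothetical usage check: the same loop runs over three different
// underlying representations without modification.
inline void sum_range_example() {
    std::vector<int> v{1, 2, 3};
    std::list<int> l{1, 2, 3};
    int raw[] = {1, 2, 3};

    assert(sum_range(v.begin(), v.end()) == 6);  // contiguous storage
    assert(sum_range(l.begin(), l.end()) == 6);  // linked nodes
    assert(sum_range(raw, raw + 3) == 6);        // plain pointers
}
```

Swapping the container changes what `++` costs under the hood (a pointer bump for the vector, a node hop for the list), but the calling code stays the same, which is exactly the trade described above.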

Honestly, though, the cost of loop overhead is normally very, very small compared with the time spent doing the actual work inside the loop body. You will usually get much more benefit from using the clearer, more portable abstractions and simply kicking the compiler's optimization setting up a notch or two than you will from hand-tuning the loop mechanics.

Note, also, that your tests are not valid. In the STL case, you call list.end() on each iteration of the loop, incurring an unnecessary function call. You also mix direct dereference with calls to list.at(), two very different accessor methods (at() typically adds a bounds check that a plain dereference does not). These inconsistencies make your results meaningless.
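A fairer comparison might look something like the sketch below. It assumes the container is a `std::vector<int>` (I'm guessing here, since `std::list` has no `at()`), and the function names are made up for illustration. Both loops read each element by plain dereference, and the iterator loop evaluates `end()` once before the loop starts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pointer version: one increment and one dereference per element.
long sum_by_pointer(const int* data, std::size_t n) {
    long total = 0;
    for (const int* p = data, *end = data + n; p != end; ++p) {
        total += *p;
    }
    return total;
}

// Iterator version: end() is evaluated once, outside the loop condition,
// and elements are read with *it rather than at(), so both loops do the
// same kind of work per element.
long sum_by_iterator(const std::vector<int>& v) {
    long total = 0;
    for (auto it = v.begin(), end = v.end(); it != end; ++it) {
        total += *it;
    }
    return total;
}

// Hypothetical sanity check: both versions must agree before timing them.
inline void fair_comparison_example() {
    std::vector<int> v{1, 2, 3, 4};
    assert(sum_by_pointer(v.data(), v.size()) == 10);
    assert(sum_by_iterator(v) == 10);
}
```

With optimization turned on, a compiler will often inline `end()` and hoist it out of the loop by itself, so the difference may vanish either way; the point is only that the two loops being timed should do the same work.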