It's much easier to "visualize" the covariant derivative using a higher-dimensional Euclidean "scaffolding" into which you isometrically imbed your manifold (always possible, at least for spaces with a positive definite metric, but in any case the results are the same).  By the way, it's MUCH easier to derive most of the standard differential geometry formulas this way.
 
Say you have a continuous, regular, C-infinity, blah, blah, blah manifold (think smooth curved surface) with a vector field defined over it.  Take the (everyday, ol' Euclidean vector) derivative of this field with respect to some parameter.  The result will be a new vector field that does not necessarily reside in the tangent space of the manifold.
 
The PROJECTION of the derivative vector onto the tangent space of the manifold is the covariant derivative.  It is the derivative field with the normal component subtracted off.  It is the component of the everyday ol' Euclidean derivative that resides in the tangent space of the manifold. Think "shadow" of the derivative vector with the light directly overhead. INTRINSICALLY, the normal component doesn't really exist.  EXTRINSICALLY, it is not unique as it depends on the imbedding, which is not unique.  The covariant derivative IS unique and does not depend on the (isometric) imbedding.
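
To make the projection concrete, here is a minimal numerical sketch, assuming the unit sphere imbedded in ordinary 3D space and a vector field given by the unit tangent along a circle of latitude. The function names and the choice of field are illustrative assumptions, not anything standard. For the unit sphere the position vector doubles as the unit normal, so "subtracting off the normal component" is one line.

```python
import numpy as np

def project_to_tangent(dv, normal):
    """Subtract the component of dv along the unit normal."""
    return dv - np.dot(dv, normal) * normal

def covariant_derivative_on_latitude(theta0, t):
    """Covariant derivative of the unit tangent field along a latitude circle.

    theta0 is the colatitude of the circle and t the parameter along it.
    (Hypothetical helper, written only for this illustration.)
    """
    # Point on the unit sphere; it is also the unit normal there.
    normal = np.array([np.sin(theta0) * np.cos(t),
                       np.sin(theta0) * np.sin(t),
                       np.cos(theta0)])
    # Field: T(t) = (-sin t, cos t, 0). Its everyday Euclidean derivative:
    dT = np.array([-np.cos(t), -np.sin(t), 0.0])
    return project_to_tangent(dT, normal), normal

on_equator, n_equator = covariant_derivative_on_latitude(np.pi / 2, 0.3)
off_equator, n_off = covariant_derivative_on_latitude(np.pi / 3, 0.3)

print(np.linalg.norm(on_equator))   # approximately 0
print(np.linalg.norm(off_equator))  # approximately 0.5, i.e. |cos(theta0)|
```

On the equator (a great circle) the Euclidean derivative is purely normal, so its "shadow" vanishes. On any other latitude a tangential piece of size |cos(theta0)| survives.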
 
Now if you do this twice (take the second covariant derivative), then do it again with the order reversed, and subtract, you get the Riemann curvature tensor, i.e.:

    v_{i;j;k} - v_{i;k;j} = -v^m R_{imjk}

where v_i is the indexed vector field, ";j" and ";k" are the covariant derivatives of v with respect to the parameters indexed by j and k, and repeated indices are summed.  Depending on your bent, the above can serve as a definition of the curvature tensor.  In flat space R is of course zero, since second derivatives commute.
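
The commutator formula can be checked symbolically. The sketch below uses SymPy and again assumes the unit sphere imbedded in 3D space. The connection coefficients are obtained by projecting the Euclidean derivatives of the tangent basis vectors back onto the tangent space. Two things here are my assumptions rather than anything stated above: the particular field v, which is an arbitrary pick, and the closed form R_{imjk} = g_{ij} g_{mk} - g_{ik} g_{mj} for the unit sphere, which is the form that goes with the sign convention of the equation as written.

```python
import sympy as sp

theta, phi = sp.symbols("theta phi", real=True)
q = (theta, phi)   # coordinates on the sphere
n = len(q)

# Imbedding of the unit sphere in Euclidean 3-space.
X = sp.Matrix([sp.sin(theta) * sp.cos(phi),
               sp.sin(theta) * sp.sin(phi),
               sp.cos(theta)])
e = [X.diff(c) for c in q]   # tangent basis vectors
g = sp.Matrix([[sp.simplify(e[a].dot(e[b])) for b in range(n)]
               for a in range(n)])   # induced metric
g_inv = g.inv()

# Gamma[l][i][j]: tangential part of the Euclidean derivative d_j e_i.
Gamma = [[[sp.simplify(sum(g_inv[l, d] * e[i].diff(q[j]).dot(e[d])
                           for d in range(n)))
           for j in range(n)] for i in range(n)] for l in range(n)]

# An arbitrary covariant field v_i (illustrative choice only).
v = [theta * sp.cos(phi), sp.sin(theta) * sp.sin(phi)]
v_up = [sum(g_inv[m, l] * v[l] for l in range(n)) for m in range(n)]

def first_covariant(w):
    # w_{i;j} = d_j w_i - Gamma^l_{ij} w_l
    return [[sp.diff(w[i], q[j]) - sum(Gamma[l][i][j] * w[l] for l in range(n))
             for j in range(n)] for i in range(n)]

def second_covariant(T):
    # T_{ij;k} = d_k T_ij - Gamma^m_{ik} T_mj - Gamma^m_{jk} T_im
    return [[[sp.diff(T[i][j], q[k])
              - sum(Gamma[m][i][k] * T[m][j] + Gamma[m][j][k] * T[i][m]
                    for m in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

DDv = second_covariant(first_covariant(v))

def riemann(i, m, j, k):
    # Assumed closed form for the unit sphere (constant curvature 1).
    return g[i, j] * g[m, k] - g[i, k] * g[m, j]

# Left side minus right side of the commutator formula, per component.
residuals = {
    (i, j, k): sp.simplify(
        DDv[i][j][k] - DDv[i][k][j]
        + sum(v_up[m] * riemann(i, m, j, k) for m in range(n)))
    for i in range(n) for j in range(n) for k in range(n)
}
print(residuals)   # every entry is expected to simplify to 0
```

Swapping in a flat imbedding such as X = (theta, phi, 0) makes every Gamma vanish, so the two second derivatives commute and the left side is zero on its own.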