Let W be such a subring and let W′ be the closure of W in c(S), the set of functions that are uniformly approximated by functions in W. If g is in W′, approximated by a sequence f1, f2, f3, etc, scale the sequence by a constant, and the limit g scales accordingly. Also, add two sequences together and their limits add. Thus the functions of W′ form an R vector space.
Product is a little trickier, and that's where we need S to be compact, so that every function in c(S) is bounded. Let the functions d1, d2, d3, etc approach e, and let f1, f2, f3, etc approach g. Let b be the larger of the norms |e| and |g|, then add 1 for good measure. Build a new sequence of functions hi = di×fi. The product of continuous functions is continuous, hence each hi is in c(S). Also, W is a subring, hence each hi is in W. If j = e×g, then j(x) is the limit of hi(x) for each x. This is because the product of two limits is the limit of the product sequence in the reals. The tricky part is uniformity.
Go far enough out in the sequences d and f, so that for every x, dn(x) is within ε of e(x), and fn(x) is within ε of g(x). The product is hn, and since hn - j = (dn-e)×g + e×(fn-g) + (dn-e)×(fn-g), the difference between hn and j is no worse than 2bε+ε². This approaches 0 as ε approaches 0, hence hn approaches j uniformly, j is in W′, and W′ is a subring of c(S).
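The bound 2bε+ε² can be checked numerically. The following is only a sketch under assumptions that are not part of the proof: S is taken to be the interval [0,1], sampled on a finite grid, e and g are sine and cosine, and the perturbations that stand in for dn and fn are made up for illustration.

```python
import math

# Assumption: S = [0, 1], sampled on a finite grid.
xs = [i / 1000 for i in range(1001)]

# Illustrative limit functions e and g (hypothetical choices).
e = math.sin
g = math.cos

# b is the larger of the two sup norms, plus 1 for good measure.
b = max(max(abs(e(x)) for x in xs), max(abs(g(x)) for x in xs)) + 1

def product_error(eps):
    """Sup distance on the grid between d*f and e*g,
    where d is within eps of e and f is within eps of g."""
    d = lambda x: e(x) + eps * math.cos(7 * x)  # |d - e| <= eps everywhere
    f = lambda x: g(x) - eps * math.sin(5 * x)  # |f - g| <= eps everywhere
    return max(abs(d(x) * f(x) - e(x) * g(x)) for x in xs)

for eps in (0.1, 0.01, 0.001):
    bound = 2 * b * eps + eps ** 2
    print(eps, product_error(eps), bound)
```

Each printed error should sit below its bound, and both shrink together as ε shrinks.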
Since W′ is a ring containing the constant functions, f in W′ implies p(f) in W′ for any polynomial p. If |f| ≤ 1, compose f with p(x) from the previous lemma, and we can uniformly approximate the function that is the absolute value of f. (I'm going to call this function |f|, which is poor notation, because |f| also means the norm of f. Sorry about that.) Since W′ is closed, it contains |f|.
If f is any function in W′, divide f by b, a positive bound on f, so that the quotient is bounded by 1, take the absolute value, and multiply by b. Thus f ∈ W′ implies |f| ∈ W′.
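Here is a small numerical sketch of the divide, take absolute value, multiply back recipe. The polynomial used is an assumption: it is the classical iteration q0 = 0, q(n+1) = qn + (t² - qn²)/2, which converges uniformly to |t| on [-1,1], and it may or may not be the p(x) of the previous lemma. The interval [0,1] and the sample function f are also made up for illustration.

```python
import math

def abs_poly(t, n):
    """Evaluate the n-th iterate q_n at t, for |t| <= 1.
    Each q_n is a polynomial in t, and q_n(t) increases to |t|."""
    q = 0.0
    for _ in range(n):
        q = q + (t * t - q * q) / 2
    return q

def approx_abs(f, b, n):
    """Approximate |f| by b * q_n(f / b), where b is a positive bound on f."""
    return lambda x: b * abs_poly(f(x) / b, n)

# Assumption: S = [0, 1] on a grid, with a hypothetical sample function f.
xs = [i / 1000 for i in range(1001)]
f = lambda x: 3 * math.sin(2 * math.pi * x)
b = max(abs(f(x)) for x in xs) + 1  # bound on f, plus 1 for good measure

g = approx_abs(f, b, 400)
err = max(abs(g(x) - abs(f(x))) for x in xs)
print(err)
```

For this iteration the error on [-1,1] after n steps is below 2/n, so after scaling by b the error here stays below 2b/400.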
Note that (f+|f|)/2 extracts the positive portions of f, and leaves the negative portions on the cutting room floor. Similarly, (f-|f|)/2 retains the negative portions. Given f and g, f plus the positive portions of g-f yields the max, while f plus the negative portions of g-f yields the min. Therefore W′ is a lattice, it separates points, and it is closed under scaling and translation. This makes W′ dense in c(S).
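The max and min formulas can be tested pointwise. This is a sketch only: the grid on [0,1] and the two sample functions are hypothetical, and the helper names are mine.

```python
import math

def positive_part(u):
    """(u + |u|)/2 keeps the positive portions of u and zeroes the rest."""
    return lambda x: (u(x) + abs(u(x))) / 2

def negative_part(u):
    """(u - |u|)/2 keeps the negative portions of u and zeroes the rest."""
    return lambda x: (u(x) - abs(u(x))) / 2

def lattice_max(f, g):
    """f plus the positive portions of g - f."""
    pos = positive_part(lambda x: g(x) - f(x))
    return lambda x: f(x) + pos(x)

def lattice_min(f, g):
    """f plus the negative portions of g - f."""
    neg = negative_part(lambda x: g(x) - f(x))
    return lambda x: f(x) + neg(x)

# Assumption: sample functions on a grid over [0, 1].
xs = [i / 1000 for i in range(1001)]
f = lambda x: math.sin(6 * x)
g = lambda x: math.cos(4 * x)

upper = lattice_max(f, g)
lower = lattice_min(f, g)
max_err = max(abs(upper(x) - max(f(x), g(x))) for x in xs)
min_err = max(abs(lower(x) - min(f(x), g(x))) for x in xs)
print(max_err, min_err)
```

Both discrepancies should be at the level of floating point rounding, since the formulas are exact identities.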
Remember that W′ is closed, and dense in c(S), hence W′ = c(S). Every function in c(S) is approximated by the functions of W.