Democratizing Tensor Processors: Efficient And Generalized Tensor Computation With Architectural Support