Learning In Safety-Critical, Lifelong, And Multi-Agent Systems: Bandits And RL Approaches