Currently, Rouge lexers are written in Ruby code, which makes them flexible enough to handle complex languages like Ruby itself. However, there is a use case for allowing user-provided lexers in contexts where executing arbitrary Ruby code would be inappropriate.
Supporting this use case would involve a Rouge extension providing an alternate Lexer subclass that either consumes a custom format or implements an existing one such as tmLanguage, and attempts to map the resulting tokens back to the Rouge/Pygments token set.
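To make the idea concrete, here is a rough, self-contained sketch of what a data-only lexer definition could look like. Everything in it is hypothetical: the `DataLexer` class, the `states`/`match`/`token`/`push`/`pop` keys, and the token whitelist are invented for illustration and are not part of Rouge or tmLanguage. It deliberately does not subclass a Rouge lexer, so mapping the string token names onto Rouge's real token objects is left out.

```ruby
require "strscan"

# Hypothetical sketch: the lexer definition is plain data (it could come from
# JSON or YAML), so loading it never evaluates user-supplied Ruby.
class DataLexer
  # Assumed whitelist, loosely modeled on Pygments-style token names.
  KNOWN_TOKENS = %w[Text Keyword Name Literal.String Literal.Number Comment Error].freeze

  def initialize(definition)
    @states = definition.fetch("states").transform_values do |rules|
      rules.map do |rule|
        token = rule.fetch("token")
        raise ArgumentError, "unknown token #{token}" unless KNOWN_TOKENS.include?(token)
        # Patterns are compiled, never eval'd. They can still backtrack badly,
        # so a real implementation would also need some limit on regex cost.
        { pattern: Regexp.new(rule.fetch("match")), token: token,
          push: rule["push"], pop: rule["pop"] == true }
      end
    end
    raise ArgumentError, "missing root state" unless @states.key?("root")
    @states.each_value do |rules|
      rules.each do |rule|
        next if rule[:push].nil? || @states.key?(rule[:push])
        raise ArgumentError, "unknown state #{rule[:push]}"
      end
    end
  end

  # Returns an array of [token_name, text] pairs covering the whole input.
  def lex(source)
    scanner = StringScanner.new(source)
    stack = ["root"]
    tokens = []
    until scanner.eos?
      # First rule in the current state that matches a non-empty string wins.
      rule = @states.fetch(stack.last).find do |candidate|
        length = scanner.match?(candidate[:pattern])
        length && length > 0
      end
      if rule
        tokens << [rule[:token], scanner.scan(rule[:pattern])]
        stack.pop if rule[:pop] && stack.size > 1
        stack.push(rule[:push]) if rule[:push]
      else
        # Nothing matched: emit one character as Error so lexing always advances.
        tokens << ["Error", scanner.getch]
      end
    end
    tokens
  end
end

# Example definition for a tiny made-up language.
DEFINITION = {
  "states" => {
    "root" => [
      { "match" => '\s+', "token" => "Text" },
      { "match" => '#[^\n]*', "token" => "Comment" },
      { "match" => '(?:if|else|end)\b', "token" => "Keyword" },
      { "match" => '\d+', "token" => "Literal.Number" },
      { "match" => '"', "token" => "Literal.String", "push" => "string" },
      { "match" => '\w+', "token" => "Name" }
    ],
    "string" => [
      { "match" => '"', "token" => "Literal.String", "pop" => true },
      { "match" => '[^"]+', "token" => "Literal.String" }
    ]
  }
}.freeze

DataLexer.new(DEFINITION).lex('if x 42 "hi" # c').each do |token, text|
  puts "#{token}\t#{text.inspect}"
end
```

The state stack with `push`/`pop` mirrors the stack-based model in spirit, but a format like this can only express what its fixed set of keys allows, which is the expressiveness trade-off at the heart of this proposal.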
The main concerns here are:
- tmLanguage is not nearly as flexible as Rouge's stack-based approach (inherited from Pygments), which allows arbitrary lexer state, among other things. Very complex lexers probably could not be expressed accurately in this format.
- A custom format could place a "15th competing standard" burden on language developers and communities, who already have to implement Ruby lexers, or it could encourage people to settle for inaccurate or slower lexing rather than contribute a Ruby lexer.