A little while back I tried to write a small IRC bot, but I eventually lost interest in it. While writing the bot I needed a regex to match the raw IRC message pattern. A friend on IRC (thanks, Jobe) came up with the following regex:
^(?:[:@]([^\\s]+) )?([^\\s]+)(?: ((?:[^:\\s][^\\s]* ?)*))?(?: ?:(.*))?$
It captures 4 groups (source, command, target and parameters). The doubled backslashes are Java string escapes. A small example in Java:
Pattern pattern = Pattern.compile("^(?:[:@]([^\\s]+) )?([^\\s]+)(?: ((?:[^:\\s][^\\s]* ?)*))?(?: ?:(.*))?$");
Matcher matcher = pattern.matcher(line.subSequence(0, line.length()));
if (matcher.matches()) {
    // e.g. irc.mibbit.net
    source = matcher.group(1);
    // e.g. 433/NOTICE
    cmd = matcher.group(2);
    // e.g. RoomBot/#mibbit
    target = matcher.group(3);
    // e.g. I have 3093 clients and 1 servers
    param = matcher.group(4);
}
Would you have done it differently?
Not sure I understand how it knows how to section it, could you explain? (Never really did this sort of thing)
The sectioning is done with round brackets (parentheses). Each pair captures one group.
For example, the regex ([a-zA-Z])([0-9]) will match a3 and group it into:
group 1: a
group 2: 3
Those groups can also be used for backreferences. For more info, check: http://www.regular-expressions.info/brackets.html
Some regex engines also allow you to name those groups: http://www.regular-expressions.info/named.html
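To make the grouping concrete, here is a minimal Java sketch of the letter-plus-digit example above, once with numbered groups and once with named groups. The class name GroupDemo and the group names letter and digit are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Numbered groups: one ASCII letter followed by one digit.
    static final Pattern NUMBERED = Pattern.compile("([a-zA-Z])([0-9])");
    // The same pattern with named groups (names are illustrative).
    static final Pattern NAMED = Pattern.compile("(?<letter>[a-zA-Z])(?<digit>[0-9])");

    // Returns {group 1, group 2}, or null if the whole input does not match.
    static String[] numbered(String input) {
        Matcher m = NUMBERED.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Same as above, but the groups are looked up by name.
    static String[] named(String input) {
        Matcher m = NAMED.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("letter"), m.group("digit") };
    }

    public static void main(String[] args) {
        String[] parts = numbered("a3");
        System.out.println("group 1: " + parts[0]);
        System.out.println("group 2: " + parts[1]);
    }
}
```

The IRC regex works the same way: every pair of plain parentheses is a numbered group, while (?: ... ) groups without capturing.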
Thanks for posting this up. I’ve been trying to find one to parse raw IRC commands; I’ve been too busy to write one myself.
I’m actually writing an IRC client in PHP via PHP CGI, so I’m trying to find the most optimized way to parse each command sent to the client so it can handle it how it needs to. If you have any resources that might help me code my first IRC client, I’d love to hear from you!
Thanks again for posting this up, man.
I’m happy this post is useful!
A good resource might be the IRC RFC for Clients: http://irchelp.org/irchelp/rfc/rfc2812.txt
Personally, I always check what kind of command the given line is (PRIVMSG, NOTICE, etc.) and then pass the message to, e.g., the NoticeHandler. This handler then takes care of all the notices.
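A rough sketch of that dispatch idea in Java, using the regex from the post. The IrcDispatcher class, its register and dispatch methods and the string results are assumptions for illustration, not code from an actual bot; a real handler would do work instead of returning a description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IrcDispatcher {
    // The regex from the post: groups are source, command, target, parameters.
    static final Pattern IRC_LINE = Pattern.compile(
        "^(?:[:@]([^\\s]+) )?([^\\s]+)(?: ((?:[^:\\s][^\\s]* ?)*))?(?: ?:(.*))?$");

    // Maps a command such as NOTICE to the handler responsible for it.
    private final Map<String, Function<Matcher, String>> handlers = new HashMap<>();

    void register(String command, Function<Matcher, String> handler) {
        handlers.put(command, handler);
    }

    // Parses one raw line and hands it to the handler registered for its command.
    String dispatch(String line) {
        Matcher m = IRC_LINE.matcher(line);
        if (!m.matches()) {
            return "unparsed";
        }
        Function<Matcher, String> handler = handlers.get(m.group(2));
        if (handler == null) {
            return "ignored " + m.group(2);
        }
        return handler.apply(m);
    }

    public static void main(String[] args) {
        IrcDispatcher dispatcher = new IrcDispatcher();
        dispatcher.register("PING", m -> "PONG :" + m.group(4));
        dispatcher.register("NOTICE", m -> "notice from " + m.group(1) + ": " + m.group(4));
        System.out.println(dispatcher.dispatch("PING :irc.example.com"));
        System.out.println(dispatcher.dispatch(
            ":irc.example.com NOTICE RoomBot :I have 3093 clients and 1 servers"));
    }
}
```

Keeping the lookup in a map means a new command only needs one more register call, and unknown commands fall through without a long if/else chain.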
$source = ‘PING’, when parsing a standard ping/pong from the protocol (PING :irc.example.com)
Disregard, I misread some imploding in my PHP code.
Oops, happens to the best of us.
Hmm, it seems the regex doesn’t parse JOIN messages correctly.
Group(3) would be null.
It has been a while since I had to use this regex, so it might need improvement.
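One possible workaround, assuming the JOIN line carries the channel after a colon (for example JOIN :#channel): in that case the channel ends up in group 4, and group 3 comes back null or empty. The sketch below falls back to group 4 when that happens. It also trims group 3, because the pattern can leave a trailing space in it. The IrcTarget class and its target method are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IrcTarget {
    // The regex from the post, unchanged.
    static final Pattern IRC_LINE = Pattern.compile(
        "^(?:[:@]([^\\s]+) )?([^\\s]+)(?: ((?:[^:\\s][^\\s]* ?)*))?(?: ?:(.*))?$");

    // Returns group 3 (trimmed) when it holds something,
    // otherwise the trailing parameter from group 4.
    static String target(String line) {
        Matcher m = IRC_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String middle = m.group(3);
        if (middle != null && !middle.trim().isEmpty()) {
            return middle.trim();
        }
        return m.group(4);
    }

    public static void main(String[] args) {
        System.out.println(target(":nick!user@host JOIN :#channel"));
        System.out.println(target(":nick!user@host JOIN #channel"));
    }
}
```

This leaves the regex alone and only changes how the groups are read, so the other commands keep working as before.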