I'll let you in on a secret: my pet hamster did all the coding. I was just a channel, a `front' if you will, in my pet's grand plan. So, don't blame me if there are bugs. Blame the cute, furry one.
iptables registers on the NF_IP_LOCAL_IN, NF_IP_FORWARD and NF_IP_LOCAL_OUT hooks. It keeps an array of rules in memory (hence the name `iptables', although in fact there is only one table). The only difference between the three hooks is where they begin traversing in the table.
Inside the kernel, each rule (`struct ipt_kern_entry') consists of the following parts:
Userspace has four operations: it can read the current table, read the info (hook positions and size of table), replace the table, and add in new counters.
The kernel starts traversing at the location indicated by the particular hook. That rule is examined; if it's a match, the match function associated with that rule is called. If that function returns a negative number, that number is taken as a final result: the negated verdict minus one (unless it's IPT_RETURN, in which case the stack is popped). If the function returns a positive number, and the number is not equal to the last position plus one, that location is pushed on the stack. Traversal continues at the returned location.
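The traversal just described can be sketched as a small userspace simulation. This is only a model under stated assumptions, not the kernel code: `struct rule', `traverse()', the predicates and the example table are hypothetical, the value chosen for IPT_RETURN is invented, the location pushed on a jump is assumed to be the rule after the jumping rule, and returning with an empty stack (or running off the end of the table) is assumed to accept the packet.

```c
#include <assert.h>
#include <stddef.h>

/* Netfilter verdict numbers: NF_DROP is 0 and NF_ACCEPT is 1. */
enum { NF_DROP = 0, NF_ACCEPT = 1 };

/* A final result is "the negated verdict minus one". */
#define ENCODE_VERDICT(v) (-(v) - 1)
/* Assumed sentinel; the text does not give the real value. */
#define IPT_RETURN (-100)
#define STACK_MAX 16

/* Hypothetical rule: a predicate plus the value returned on a match. */
struct rule {
    int (*matches)(int packet);
    int result;
};

/* Walk the table from `start' and return the final verdict. */
int traverse(const struct rule *table, size_t nrules, size_t start, int packet)
{
    size_t stack[STACK_MAX];
    size_t sp = 0;
    size_t pos = start;

    while (pos < nrules) {
        const struct rule *r = &table[pos];

        if (!r->matches(packet)) {      /* no match: try the next rule */
            pos++;
            continue;
        }
        int ret = r->result;
        if (ret == IPT_RETURN) {        /* pop the stack and resume there */
            if (sp == 0)
                return NF_ACCEPT;       /* assumption: empty stack accepts */
            pos = stack[--sp];
            continue;
        }
        if (ret < 0)                    /* final result: decode the verdict */
            return -ret - 1;
        if ((size_t)ret != pos + 1) {   /* a real jump: remember where to resume */
            assert(sp < STACK_MAX);
            stack[sp++] = pos + 1;      /* assumption: resume after this rule */
        }
        pos = (size_t)ret;
    }
    return NF_ACCEPT;                   /* assumption: end of table accepts */
}

/* Example predicates and table, purely for illustration. */
int match_all(int packet) { (void)packet; return 1; }
int match_even(int packet) { return packet % 2 == 0; }

const struct rule example_table[] = {
    { match_all,  3 },                          /* 0: jump to rule 3 */
    { match_all,  ENCODE_VERDICT(NF_DROP) },    /* 1: drop */
    { match_all,  ENCODE_VERDICT(NF_ACCEPT) },  /* 2: accept */
    { match_even, ENCODE_VERDICT(NF_ACCEPT) },  /* 3: accept even packets */
    { match_all,  IPT_RETURN },                 /* 4: return to the caller */
};
```

In this table an even `packet' is accepted by rule 3, while an odd one falls through to rule 4, returns to rule 1 and is dropped.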
Because I'm lazy, iptables is fairly extensible. This is basically a scam to palm off work onto other people, which is what Open Source is all about (cf. Free Software, which as RMS would say, is about freedom, and I'm sitting in one of his talks at the moment).
Extending iptables potentially involves two parts: extending the kernel, by writing a new module, and possibly extending the userspace program iptables, by writing a new shared library.
Writing a kernel module itself is fairly simple, as you can see from the examples. The functions you need to know about are:
`init_module()': This is the entry-point of the module. It returns an error number, or 0 if it successfully registers itself with netfilter.
`cleanup_module()': This is the exit point of the module; it should unregister itself with netfilter.
`ipt_register_match()': This is used to register a new match type. You hand it a `struct ipt_match', which is usually declared as a static (file-scope) variable.
`ipt_register_target()': This is used to register a new target type. You hand it a `struct ipt_target', which is usually declared as a static (file-scope) variable.
`ipt_unregister_target()': Used to unregister your target.
`ipt_unregister_match()': Used to unregister your match.
New match functions are usually written as a standalone module. It's possible to have these modules extensible in turn, although it's usually not necessary. One way would be to use the netfilter framework's `nf_register_sockopt' function to allow users to talk to your module directly. Another way would be to export symbols for other modules to register themselves, the same way netfilter and iptables do.
The core of your new match function is the struct ipt_match which it passes to ipt_register_match(). This structure has the following fields:
This field is set to NULL.
This field is the name of the match function, as referred to by userspace. The name should match the name of the module (ie. if the name is "mac", the module must be "ipt_mac.o") for auto-loading to work.
This field is a pointer to a match function, which takes the skb, the in and out device names (one of which may be ""), the ipt_matchinfo union from the rule that was matched, the IP offset (non-zero means a non-head fragment), a pointer to the protocol header (ie. just past the IP header) and the length of the data (ie. the packet length minus the IP header length). It should return non-zero if the packet matches.
This field is a pointer to a function which checks the specifications for a rule; if this returns 0, then the rule will not be accepted from the user. For example, the "tcp" match type will only accept tcp packets, and so if the `struct ipt_ip' part of the rule does not specify that the protocol must be tcp, a zero is returned.
This field is set to `&__this_module', which gives a pointer to your module. It causes the usage-count to go up and down as rules of that type are created and destroyed. This prevents a user removing the module (and hence cleanup_module() being called) if a rule refers to it.
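To make the shape of a match module concrete, here is a userspace mock that compiles outside the kernel. Everything in it is an assumption made for illustration: the field names and layouts of `struct ipt_match', `struct ipt_ip' and `union ipt_matchinfo', the "dport" match itself, and the tiny registry standing in for the real registration calls. Only the roles of the five fields and the tcp-only check follow the descriptions above.

```c
#include <assert.h>
#include <stddef.h>

#define PROTO_TCP 6                      /* IANA protocol number for TCP */

/* Hypothetical stand-ins for the kernel types. */
struct sk_buff;                          /* opaque in this mock */
struct ipt_ip { int proto; };            /* only the protocol is modelled */
union ipt_matchinfo { unsigned short dport; };

struct ipt_match {
    void *list;                          /* set to NULL */
    const char *name;                    /* "dport" would mean ipt_dport.o */
    int (*match)(const struct sk_buff *skb, const char *indev,
                 const char *outdev, const union ipt_matchinfo *info,
                 int offset, const void *hdr, size_t datalen);
    int (*checkentry)(const struct ipt_ip *ip,
                      const union ipt_matchinfo *info);
    void *me;                            /* &__this_module in a real module */
};

/* Return non-zero if the TCP destination port equals info->dport. */
int dport_match(const struct sk_buff *skb, const char *indev,
                const char *outdev, const union ipt_matchinfo *info,
                int offset, const void *hdr, size_t datalen)
{
    const unsigned char *p = hdr;

    (void)skb; (void)indev; (void)outdev;
    assert(info != NULL && hdr != NULL);
    if (offset != 0 || datalen < 4)      /* non-head fragment or short header */
        return 0;
    /* The destination port is bytes 2..3 of the TCP header, big-endian. */
    return ((p[2] << 8) | p[3]) == info->dport;
}

/* Refuse rules that do not restrict the protocol to tcp. */
int dport_checkentry(const struct ipt_ip *ip, const union ipt_matchinfo *info)
{
    (void)info;
    return ip->proto == PROTO_TCP;
}

struct ipt_match dport_match_type = {
    .list = NULL,
    .name = "dport",
    .match = dport_match,
    .checkentry = dport_checkentry,
    .me = NULL,
};

/* Mock registry; the kernel provides the real registration functions. */
static struct ipt_match *registered;
int ipt_register_match(struct ipt_match *m) { registered = m; return 0; }
void ipt_unregister_match(struct ipt_match *m)
{
    if (registered == m)
        registered = NULL;
}

/* Module entry and exit points: register on load, unregister on unload. */
int init_module(void) { return ipt_register_match(&dport_match_type); }
void cleanup_module(void) { ipt_unregister_match(&dport_match_type); }
```

A real module would take these types from the kernel headers rather than defining them, and would set the last field to `&__this_module' so the usage count is tracked.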
New targets are also usually written as a standalone module. The discussions under the above section on `New Match Functions' apply equally here.
The core of your new target is the struct ipt_target which it passes to ipt_register_target(). This structure has the following fields:
This field is set to NULL.
This field is the name of the target function, as referred to by userspace. The name should match the name of the module (ie. if the name is "REJECT", the module must be "ipt_REJECT.o") for auto-loading to work.
This is a pointer to the target function, which takes the skbuff, the input and output device names (either of which may be ""), the ipt_targinfo union of the matching rule, and the position of the rule in the table. The target function returns a non-negative absolute position to jump to, or a negative verdict (which is the negated verdict minus one).
This field is a pointer to a function which checks the specifications for a rule; if this returns 0, then the rule will not be accepted from the user.
This field is set to `&__this_module', which gives a pointer to your module. It causes the usage-count to go up and down as rules with this as a target are created and destroyed. This prevents a user removing the module (and hence cleanup_module() being called) if a rule refers to it.
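The return convention of a target function can be sketched in the same way. The `blockdev_target()' function, the contents of `union ipt_targinfo' and the `verdict_to_return()' helper are hypothetical; the sketch only shows the two kinds of return value described above: a negative number carrying a final verdict, or a non-negative position at which traversal continues.

```c
#include <assert.h>
#include <string.h>

/* Netfilter verdict numbers: NF_DROP is 0 and NF_ACCEPT is 1. */
enum { NF_DROP = 0, NF_ACCEPT = 1 };

struct sk_buff;                                  /* opaque in this sketch */
union ipt_targinfo { char blocked_dev[16]; };    /* hypothetical contents */

/* Encode a final verdict as the negated verdict minus one. */
int verdict_to_return(int verdict)
{
    return -verdict - 1;
}

/*
 * Hypothetical target: drop packets arriving on one interface and let
 * everything else continue with the rule after this one.
 */
int blockdev_target(struct sk_buff *skb, const char *indev,
                    const char *outdev, const union ipt_targinfo *info,
                    unsigned int position)
{
    (void)skb; (void)outdev;
    assert(indev != NULL && info != NULL);
    if (strcmp(indev, info->blocked_dev) == 0)
        return verdict_to_return(NF_DROP);       /* final result */
    return (int)position + 1;                    /* continue at the next rule */
}
```

Returning the rule's own position plus one is the "keep going" case: it is the one positive return value that does not push anything on the traversal stack.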
Now you've written your nice shiny kernel module, you may want to control the options on it from userspace. Rather than have a branched version of iptables for each extension, I use the very latest 80's technology: laserdisc. Sorry, I mean shared libraries.
The shared library should have an `_init()' function, which will automatically be called upon loading: the moral equivalent of the kernel module's `init_module()' function. This should call `register_match()' or `register_target()', depending on whether your shared library provides a new match or a new target.
You only need to provide a shared library if you want to initialize part of the structure, or provide additional options. For example, the `REJECT' target doesn't require either of these, so there's no shared library.
There are useful functions described in the `iptables.h' header, especially:
`check_inverse()' checks if an argument is actually a `!', and if so, sets the `invert' flag if not already set. If it returns true, you should increment optind, as done in the examples.
`string_to_number()' converts a string into a number in the given range, returning -1 if it is malformed or out of range.
`exit_error()' should be called if an error is found. Usually the first argument is `PARAMETER_PROBLEM', meaning the user didn't use the command line correctly.
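The behaviour of the `!'-checking and number-parsing helpers described above is easy to model. The following is a sketch of what such helpers might look like, not the code from the iptables sources: the names `is_inversion()' and `parse_number_in_range()' are invented, and what should happen on a second `!' is not covered by the description, so the sketch simply leaves the flag set.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * If `arg' is exactly "!", set *invert and return 1 (the caller should
 * then step past the argument); otherwise return 0.
 */
int is_inversion(const char *arg, int *invert)
{
    assert(invert != NULL);
    if (arg != NULL && strcmp(arg, "!") == 0) {
        *invert = 1;
        return 1;
    }
    return 0;
}

/*
 * Convert `s' to a number in [min, max]. Returns -1 if the string is
 * malformed or the value is out of range, so the range must be
 * non-negative for -1 to be unambiguous.
 */
int parse_number_in_range(const char *s, int min, int max)
{
    char *end;
    long n;

    assert(s != NULL && min >= 0 && min <= max);
    errno = 0;
    n = strtol(s, &end, 0);              /* base 0: decimal, octal or hex */
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;                       /* empty, trailing junk or overflow */
    if (n < min || n > max)
        return -1;
    return (int)n;
}
```

With these, a port argument could be validated with a call such as `parse_number_in_range(arg, 0, 65535)', treating -1 as a command line error.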
When you use the `register_match()' function, you hand it a pointer to a static `struct iptables_match', which has various fields:
This pointer is used to make a linked list of matches (such as used for listing rules). It should be set to NULL initially.
The name of the match function. This should match the library name (eg "tcp" for `libipt_tcp.so').
A function which prints out the option synopsis.
A function which swaps the ipt_matchinfo union around, to match packets coming the other way. If that cannot be done, call `exit_error()'.
This can be used to initialize the ipt_matchinfo union. It will be called before `parse()'.
This is called when an unrecognized option is seen on the command line. `invert' is true if a `!' has already been seen. The `flags' pointer is for the exclusive use of your match library, and is usually used to store a bitmask of options which have been specified. It should return non-zero if the option was indeed for your library.
This is called after the command line has been parsed, and is handed the `flags' integer reserved for your library. This gives you a chance to check that any compulsory options have been specified, for example: call `exit_error()' if this is not the case.
This is used by the chain listing code to print (to standard output) the ipt_matchinfo union for a rule. The numeric flag is set if the user specified the `-n' flag.
This is a NULL-terminated list of extra options which your library offers. This is merged with the current options and handed to getopt_long; see the man page for details. The return code for getopt_long becomes the first argument (`c') to your `parse()' function.
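The interplay between the extra options, `parse()' and the final check might look roughly like this. It is a simplified sketch with invented names (`dport_opts', `dport_parse()', `dport_final_check()', `struct dport_info') and simplified signatures: the option argument is passed in directly, and errors are reported through return values where a real library would call `exit_error()'.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical payload stored in the rule's ipt_matchinfo union. */
struct dport_info { int port; int invert; };
union ipt_matchinfo { struct dport_info dport; };

#define OPT_DPORT 0x01                   /* bit in `flags': --dport was seen */

/* Extra options, terminated by an all-zero entry, for getopt_long. */
const struct option dport_opts[] = {
    { "dport", required_argument, NULL, '1' },
    { 0, 0, 0, 0 }
};

/*
 * Called for an option the main program did not recognize. `c' is the
 * value getopt_long returned. Returns 1 if the option was ours, 0 if it
 * was not, and -1 on a bad value.
 */
int dport_parse(int c, const char *arg, int invert,
                unsigned int *flags, union ipt_matchinfo *info)
{
    char *end;
    long n;

    assert(flags != NULL && info != NULL);
    switch (c) {
    case '1':
        n = strtol(arg, &end, 10);
        if (end == arg || *end != '\0' || n < 0 || n > 65535)
            return -1;                   /* real code: exit_error() */
        info->dport.port = (int)n;
        info->dport.invert = invert;
        *flags |= OPT_DPORT;             /* remember that --dport was given */
        return 1;
    default:
        return 0;                        /* not one of our options */
    }
}

/* Returns non-zero if every compulsory option was specified. */
int dport_final_check(unsigned int flags)
{
    return (flags & OPT_DPORT) != 0;     /* real code: exit_error() if not */
}
```

Keeping the "seen" bits in `flags' rather than in the union means the final check does not have to guess whether a zero in the union was set by the user or is just the initial value.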
When I grow up, I'm going to implement a compatibility layer for these old modules, and I'll include here a simple recipe for porting those modules.
I need to do this before we hit production anyway, so it will be done sooner or later.