REAPOFF is a proxying - content filtering firewall. Proxying firewalls are sometimes called application level gateways (ALGs). ALGs are different from packet filtering firewalls because they allow the enforcement of a detailed security policy at a high level.
ALGs help in this case because they ensure that communications occur in accordance with a specified protocol. ALGs understand the protocols and deny connections which are not using the correct protocols. In addition, ALGs can enforce very fine-grained control over transactions. This control is possible because the ALG is able to parse the protocol.
If an ALG is used in this case, port 80 connections must comply with the HTTP protocol, while port 25 must comply with the SMTP protocol. If the malicious user attempts to talk to their mail server listening on port 80 using SMTP, the ALG will terminate the connection due to a protocol violation.
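The protocol check described above can be illustrated with a small Python sketch. It is only meant to show the idea and is not how REAPOFF itself is implemented; the function name `first_line_matches_port` and the simplified patterns are assumptions made for this example.

```python
import re

# Simplified first-line patterns per port (assumptions for illustration only;
# a real ALG parses the whole protocol, not just the first line).
HTTP_REQUEST = re.compile(r"^(GET|HEAD|POST|PUT|DELETE|OPTIONS|TRACE) \S+ HTTP/\d\.\d$")
SMTP_COMMAND = re.compile(r"^(HELO|EHLO|MAIL FROM:|RCPT TO:|DATA|QUIT)", re.IGNORECASE)

EXPECTED = {80: HTTP_REQUEST, 25: SMTP_COMMAND}


def first_line_matches_port(port: int, first_line: str) -> bool:
    """Return True if the client's first line fits the protocol expected on port."""
    pattern = EXPECTED.get(port)
    if pattern is None:
        return False  # unknown port: deny by default
    return pattern.search(first_line.rstrip("\r\n")) is not None
```

With this sketch, an SMTP greeting such as `HELO mail.example.com` sent to port 80 is rejected, while the same line on port 25 is accepted.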
If you also want to use the GUI (recommended), change directory into the gui directory and type ./configure then make.
The deploy directory will contain all the executables and configuration files you will need. You can copy this directory to the target machine if you intend to run the firewall on a different machine from the one used to compile it.
The basic idea is that the GUI is completely separate from the configuration of the actual firewall. Thus it is possible to use a GUI on a workstation and generate a REAPOFF installation for a completely different system. By default the GUI will create a complete, pre-configured installation of REAPOFF in the ./deploy directory.
The GUI uses a template file called ``template.xml''. This file contains all the rules currently written, in XML format. The rules are classified according to a family. Thus, for example, all rules pertaining to HTTP belong to a family called HTTP. This classification is a guide only, although it's probably not a good idea to mix and match rules between families. The family ``general'' contains rules which can generally be used on any proxy.
Figure 1 shows the main GUI screen. There are a number of parts in this screen. The leftmost pane shows the currently configured proxies in a tree view. The rules defined for each proxy are also shown. The top right pane shows a description of the currently selected rule or proxy. This description helps the user decide if they want this rule present. The GUI structure is explained in more detail in the following sections.
Rules can sometimes take variables. This allows the user to configure the rule specifically for their situation. For example, the above screenshot shows a rule which accepts three variables. A variable may contain more than one instance, by having multiple lines of text. For example, in the above screenshot the variable string matches accepts multiple arguments; each match is present on its own line.
Figure 2 shows the template dialog box which is presented whenever a new rule is added. The user can then select which rule they wish to add to the current proxy. Multiple selections are allowed in this dialog box.
Usually rule dumps from such firewalls are difficult to read, since the rules don't necessarily correspond to a particular policy and it's difficult to work out what each rule is supposed to achieve. On the other hand, a firewall's configuration is supposed to reflect a security policy, which is a higher-level, clearly defined document stating what restrictions and privileges are applied to different groups of users, computers and times. In a sense there is a separation between the actual policy and the conditions under which these policies are applied.
This type of separation makes auditing the configuration of the firewall an easy task: one simply needs to establish which policies apply to whom and then look at how these policies are implemented. A firewall configuration process that is easier to understand leads to a more secure installation, one that is more likely to be configured as per the policy.
Modern firewall installations are leaning toward this type of configuration. For example, IPTables supports the concept of chains, which are collections of rules. By selecting which chain applies and when, it is possible to delegate chains to policies. REAPOFF uses the same principle when producing IPTables rules as well.
A Policy is therefore defined as a collection of rules. Rule execution can be diverted to different policies depending on certain conditions. An example can illustrate this principle best:
Consider Figure 3 above, which shows a screenshot of the HTTP proxy. This configuration may be found in the examples directory. Rule evaluation proceeds from the top to the bottom in order. When the rule evaluation reaches the ``Policy Selection by Authentication'' rule, the authentication parameters within the request are compared to the username and password list specified within the rule configuration variables (in this case a username of ``username'' and a password of ``password''). If these match, the policy given in ``Policy Name'' is selected, which in this case is power_users. Execution then continues from the power_users policy. Note that the path of execution does not actually change until the ``Execute selected policy'' rule is executed. This allows several policy selection rules to be placed in succession, with each subsequent rule overriding the previous one.
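The selection behaviour described above can be sketched in Python. This is a minimal model of the idea only, not REAPOFF's rule engine: the rule names `select_by_auth` and `execute_selected`, the `default` policy, and the second (`admins`) rule are hypothetical and exist only for this example.

```python
def run_rules(rules, request):
    """Walk a rule list top to bottom; return the name of the policy executed."""
    selected = "default"           # policy currently selected (not yet executed)
    for kind, arg in rules:
        if kind == "select_by_auth":
            users, policy = arg
            creds = (request.get("username"), request.get("password"))
            if creds in users:
                selected = policy  # a later selection overrides an earlier one
        elif kind == "execute_selected":
            return selected        # only here does the path of execution change
    return selected


# Two selection rules in succession, followed by the rule that acts on them.
RULES = [
    ("select_by_auth", ({("username", "password")}, "power_users")),
    ("select_by_auth", ({("admin", "secret")}, "admins")),
    ("execute_selected", None),
]
```

Calling `run_rules(RULES, {"username": "username", "password": "password"})` selects `power_users`, and nothing is diverted until the final rule is reached.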
The following sections will describe some of the currently available proxies and how they should be deployed in practice:
If a connection is made to the proxy from a denied IP address, the proxy will immediately tear down the connection and log the connection attempt.
Transparent HTTP proxy: Use this to enable the transparent HTTP proxy. You must have a Linux 2.4 kernel for this to work. Note that the client must also have the firewall configured as its gateway, and you will most likely need to allow DNS into the internal network.
Handoff Proxy: Sometimes a caching proxy is required in addition to REAPOFF. This rule allows REAPOFF to hand off all connections to a separate caching proxy for further processing and possibly authentication.
Limit Post Size: Often the security policy forbids the uploading of files via HTTP. This is done to protect intellectual property, for example. It is difficult, however, to enforce this policy because any site can accept a file upload, for example a web based email system. The easiest way to stop this is to limit the maximum size of a POST directive. Thus if the POST is too large, the connection will be terminated and the user will be warned. A log is also generated.
Block Active X: Active X is a problematic technology since it is basically an executable downloaded from the internet and allowed to run on the client's machine. There is no ``sand box'' environment as there is for Java, for example. Thus it is common in many security policies to deny Active X. This might break some sites, but Active X is not really used much on the internet so it's not a great loss. Note that this method is not foolproof, because an attacker can always craft malicious JavaScript that creates an Active X object on the fly without allowing REAPOFF to inspect it.
Deny HTTP methods: HTTP has quite a number of different methods, some of which are extensions. For example, file upload can also be done via the PUT method. This rule allows you to restrict the HTTP method to a specific set of allowed methods. Note that in order for WebDav to work, many other methods must be allowed, so this rule helps to stop WebDav.
CONNECT support: In order to allow SSL communications through an HTTP proxy, the CONNECT method must be allowed. This method creates an end-to-end tunnel between client and server, over which encrypted traffic can be exchanged. This represents a significant threat since it allows any internal user to completely bypass the firewall. If you need to provide SSL support for clients, you must enable this rule. Alternatively, REAPOFF will have an intercepting SSL proxy available in the next release (there is a pre-alpha version you can play with).
Block advertisers optionally: Advertising is a pain on the net. However, ad blocking rules sometimes make mistakes and accidentally block sites which are not ads. To make life easier, you can use this rule to allow people to bypass the ad blocking and get the page anyway.
FTP services: This allows the HTTP proxy to service FTP URLs over HTTP. It is probably better to use the transparent FTP support instead, though. Note that you can't use this option if you want the proxy to be transparent. Transparent proxies need to have a proper transparent FTP proxy configured instead.
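Two of the rules above, Limit Post Size and Deny HTTP methods, can be sketched in Python as simple checks on the request head. This is an illustration under stated assumptions, not the REAPOFF rule text: the limit of 4096 bytes, the allowed method set, the function names, and the decision to refuse a POST that declares no length are all hypothetical choices.

```python
import re

MAX_POST_BYTES = 4096                      # hypothetical size limit
ALLOWED_METHODS = {"GET", "HEAD", "POST"}  # hypothetical allowed set

CONTENT_LENGTH = re.compile(r"^Content-Length:\s*(\d+)\s*$",
                            re.IGNORECASE | re.MULTILINE)


def method_allowed(request_head: str) -> bool:
    """Check the method token of the request line against the allowed set."""
    parts = request_head.split()
    return bool(parts) and parts[0] in ALLOWED_METHODS


def post_size_allowed(request_head: str) -> bool:
    """Return False for a POST whose declared body size exceeds the limit."""
    if not request_head.startswith("POST "):
        return True   # the rule only concerns POST requests
    match = CONTENT_LENGTH.search(request_head)
    if match is None:
        return False  # no declared length: refuse rather than guess
    return int(match.group(1)) <= MAX_POST_BYTES
```

Because WebDav relies on extension methods such as PROPFIND, a restricted method set like the one above also rejects WebDav requests, which matches the note in the Deny HTTP methods rule.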
The plug proxy supports the following command line options:
SSL is an encrypted end-to-end protocol. This fact raises problems for network security devices, such as firewalls and IDS:
SSL also represents a major threat for networks because proxies typically need to support the CONNECT directive which allows a tunnel to be established between the client and servers on the Internet. Since the tunnel is generally encrypted, proxies are forced to allow any traffic through. A large proliferation of software packages has recently become available to exploit this flaw and allow outbound tunnels through the HTTP proxy to carry arbitrary network traffic. It is also possible to route arbitrary traffic over the SSL tunnel via pppd and effectively form a VPN terminating inside the network.
WebDav is also a major threat. Since WebDav allows the sharing of folders using the HTTP protocol, and is widely available and supported on Windows platforms, its use over SSL is very difficult to stop in real networks. The author was very surprised to discover how easy it was for malicious insiders to use WebDav to connect out to an external HTTP server on the Internet which allows WebDav connections over SSL. Almost any Windows client with Internet Explorer versions greater than about 5.5 can easily connect out over SSL and copy files in either direction unchecked.
Clearly allowing your clients to use SSL represents a major threat to your network. Also allowing your web server to communicate directly with clients using SSL negates the IDS that may be used. How can SSL be properly managed in the network? Clearly, it is very difficult to deny SSL outright, since many popular sites now require it. Users do not necessarily appreciate the dangers and are commonly focused on the need to do business on line in an e-banking or e-commerce situation.
There are currently three ways in which SSL can be managed securely on the network. REAPOFF supports all three, and the following sections describe the details. It is important to select the most appropriate strategy for the situation. Which strategy is chosen depends mainly on the security policy and the amount of computational capacity available on the gateway.
Note that this is the usual method for doing this in less capable firewalls (e.g. packet filtering firewalls) and older application level firewalls. REAPOFF offers much more powerful methods for controlling SSL and you should only select this method if you don't have enough processing capacity on the gateway machine for the network load.
There are a number of key steps in this architecture:
The Full SSL proxying architecture is shown in figure 5. This architecture requires REAPOFF to decrypt the SSL traffic, inspect it via the usual HTTP proxy rules and then re-encrypt the traffic to the client. Note that effectively REAPOFF is performing a Man in the Middle attack against the SSL connection stream. However, SSL is designed to prevent this type of attack from taking place, by requiring server certificates to be signed by trusted certificate authorities. In order for REAPOFF to transparently perform this function, REAPOFF must be trusted as a certification authority by the client. Otherwise the client will constantly issue a ``This certificate is not trusted'' message.
The main steps used in this case are:
An example of a full SSL proxy can be found in the examples directory.
If you do not have a suitably configured apache web server, or you would like to make the certificate permanently available to many machines, there is a small REAPOFF configuration file which will serve out the certificate over any chosen port (for example 8000):
plug -p 8000 -o cacert.outbound
Once REAPOFF is trusted by the client, it is possible to completely remove all other CAs from the trusted store, since REAPOFF will automatically intercept and change the certification of each site. In a large installation, it may be wise to configure the SOE to automatically trust the gateway to sign certificates.
Basically, the proxy listens on a particular port for incoming connections. When a connection is received from a client, the proxy reads some data from the listening socket. This data is processed through the set of rules, and any relevant actions are executed. After the data is processed, it is passed to the connecting socket. Outbound rules apply to traffic from the listening side to the connecting side, whereas inbound rules apply to traffic from the connecting socket to the listening socket.
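The data flow described above can be modelled in a few lines of Python. This is a rough sketch that leaves out the sockets entirely and is not REAPOFF's code; the function names and the two example rule sets are hypothetical.

```python
from typing import Callable, Iterable, List

Rule = Callable[[bytes], bytes]


def apply_rules(data: bytes, rules: Iterable[Rule]) -> bytes:
    """Run one read buffer through each rule in order."""
    for rule in rules:
        data = rule(data)
    return data


def relay(reads: Iterable[bytes], rules: Iterable[Rule]) -> List[bytes]:
    """Process each buffer read from one side and collect what is passed on."""
    rules = list(rules)
    return [apply_rules(chunk, rules) for chunk in reads]


# Hypothetical rule sets, one per direction.
outbound_rules = [lambda b: b.replace(b"hello", b"hello world")]  # listening -> connecting
inbound_rules = [bytes.upper]                                     # connecting -> listening
```

Each direction has its own rule list, and every buffer read from one socket is rewritten by that list before being written to the other socket.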
Note also that the configuration files fully define the behaviour of the proxy. This means that if the authors make an error in writing their configuration files directly, a vulnerability may result. Hence inexperienced users may want to restrict themselves to using the GUI rather than write their own configuration files.
Suppose we want to block incoming object tags from the HTTP proxy:
if str object
    s/<\s*object/<_no_object_allowed/ig
Now suppose the attacker was running their own site and wanted to slip the object tag past REAPOFF. They could first send the '<' character, then the 'o', then the 'b', and so on. Each of these characters will cause REAPOFF to process the read buffer (which is one character long). None of these buffers will match the string object, so nothing will happen.
Therefore char mode should not be considered for security critical filtering. This mode does not guarantee that the matching engine will work properly, although in most cases it will (because typically packets contain very large buffers). A similar problem also exists with distinguishing FTP port commands from error messages with the word PORT in them (A common attack against firewalls).
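The evasion described above is easy to reproduce outside REAPOFF. The following Python sketch applies the same substitution to each read buffer independently, which is an assumption about how per-read matching behaves; the function name `filter_reads` and the sample payload are made up for the example.

```python
import re

# Same pattern as the rule above: s/<\s*object/<_no_object_allowed/ig
OBJECT_TAG = re.compile(r"<\s*object", re.IGNORECASE)


def filter_reads(reads):
    """Apply the substitution to each read buffer independently."""
    return "".join(OBJECT_TAG.sub("<_no_object_allowed", chunk) for chunk in reads)


payload = '<object classid="x">'
whole = filter_reads([payload])      # one large read: the tag is rewritten
split = filter_reads(list(payload))  # one character per read: the tag slips through
```

Here `whole` no longer contains the object tag, while `split` is identical to the original payload, because no single one-character buffer ever matches the expression.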
It must be noted that in line mode, multi-line REs are not guaranteed to work for exactly the same reason that REs are not guaranteed in char mode. If you need to make a multi line match you should use the smart mode. Line mode is the default mode with REAPOFF, unless specified otherwise.
You might find that binary protocols do not work properly at all using line mode. This is because there may not be a new line separating communications between the client and server. In this case one end will send their message across without having a new line, and wait for the server to return its message. In the meantime REAPOFF will be waiting for a new line and not process the buffer. In this case a deadlock may occur and things will not work.
In order to solve this deadlock, try to identify which mode to use where and switch to it using "set mode line" or "set mode char" where necessary.
if /... start of buffering ../
    startbuff
if /... end of buffering ../
    endbuff
if /... some rule ../
    actions.....
In this case the proxy will start buffering when the first rule matches. Each new read will cause the rules to be evaluated, but no other actions will be performed until the end of buffering rule matches and the endbuff directive is executed. After that happens, other rules are evaluated and actions are executed.
It is usually best to have the startbuff/endbuff directives at the beginning of the rule set to allow all the other rules an opportunity to match after buffering ends.
The following actions will be executed while buffering: KILL, SET, GOTO, END, EVAL. The idea is that it should still be possible for the proxy to apply the right policies while buffering.
config set mode char
config set port 8080
config set remote 3127
config startbuff
if.....
    s/hello/hello world/ig
if str world
    log got world
The second condition will match if the first condition is true and the word hello is present in the buffer.
This substitution can be conditioned by a string match for optimized performance:
if str www.someserver.com
    s/((GET|POST|PUT)\s*http:\/\/www\.someserver\.com)/BLOCK $1/ig
This way the RE does not need to be executed when it obviously has no chance of matching.
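The same optimisation can be written out in Python to make the order of the two tests explicit. This is a sketch of the technique, not of REAPOFF internals; the function name `block_someserver` is hypothetical, and the substring test here is case sensitive, which may differ from how the str condition behaves.

```python
import re

BLOCK_RE = re.compile(r"((GET|POST|PUT)\s*http://www\.someserver\.com)", re.IGNORECASE)


def block_someserver(buffer: str) -> str:
    """Run the regular expression only when a cheap substring test succeeds."""
    if "www.someserver.com" not in buffer:
        return buffer  # cheap check failed, so the RE is never executed
    return BLOCK_RE.sub(r"BLOCK \1", buffer)
```

Most buffers do not mention the server at all, so they take the early return and cost only a substring search.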
if /(GET|POST|PUT)\s(\S*)/
    log Got a $1 request for URL $2
The above parses the HTTP header and extracts two captured strings, the first representing the method and the second representing the URL requested. The log action is then invoked with those captured substrings expanded appropriately.
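For readers more familiar with Python's `re` module, the capture expansion above corresponds roughly to the following sketch. The function name `log_line` is hypothetical, and `$1` and `$2` map to `group(1)` and `group(2)`.

```python
import re

REQUEST = re.compile(r"(GET|POST|PUT)\s(\S*)")


def log_line(buffer: str):
    """Build the log message from the two captured substrings, or None if no match."""
    m = REQUEST.search(buffer)
    if m is None:
        return None
    return f"Got a {m.group(1)} request for URL {m.group(2)}"
```

For the buffer `GET http://www.server.com/ HTTP/1.0`, the first capture is the method and the second is the URL, so the message reads "Got a GET request for URL http://www.server.com/".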
#First extract content type from headers
if /(^|\n)Content-type:\s*(\S*)/
    set content_type $2
..... (some more rules)
#Now do something special for particular content types:
if $content_type /(text|html)/
    log This is a text or html page
In this way it is possible to break protocols down in steps and make decisions about actions in a more systematic way.
if str hello
    printin 1
or str world
    printin 2
and str nice
    printin 3
The following table summarizes what the result would be in each of the following cases:
|Buffer contents|Actions executed|Notes|
|thats a nice world|2,3||
|hello, nice people|1,2,3|Pay particular attention to this one|
The following logical operators are supported: and, or, and_not, or_not
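The two table rows above are consistent with a model in which each operator combines the current test with the running result of the conditions before it. The Python sketch below implements that model; this evaluation order is an assumption inferred from the table rather than a statement of how REAPOFF is coded, and the function name `evaluate` is hypothetical.

```python
def evaluate(chain, buffer):
    """Evaluate (operator, needle, action) triples against a buffer.

    Each operator combines the current string test with the running result
    of the previous conditions (an assumption inferred from the table).
    """
    executed = []
    result = False
    for op, needle, action in chain:
        found = needle in buffer
        if op == "if":
            result = found
        elif op == "and":
            result = result and found
        elif op == "or":
            result = result or found
        elif op == "and_not":
            result = result and not found
        elif op == "or_not":
            result = result or not found
        else:
            raise ValueError(f"unknown operator: {op}")
        if result:
            executed.append(action)
    return executed


# The example rule set above, with each action reduced to its number.
CHAIN = [("if", "hello", 1), ("or", "world", 2), ("and", "nice", 3)]
```

Under this model, "hello, nice people" runs all three actions even though "world" is absent, because the or condition inherits the true result of the first test. That is the case the table asks the reader to note.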
#Find port specification in URL (e.g. http://www.server.com:8000/)
if $url /http:\/\/[^\/:]*:(\d*)/
    #Set remote port to connect to:
    set remote $1
    log Will connect to port $remote.
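The port extraction in the rule above uses an ordinary regular expression, so it can be tried out in Python first. In this sketch the function name `remote_port` and the fallback to port 80 are assumptions made for the example; the rule itself only sets the remote port when the expression matches.

```python
import re

# Same expression as the rule above, without the escaped slashes.
PORT_RE = re.compile(r"http://[^/:]*:(\d*)")


def remote_port(url: str, default: int = 80) -> int:
    """Return the port given in the URL, or the default when none is present."""
    m = PORT_RE.search(url)
    if m is None or m.group(1) == "":
        return default
    return int(m.group(1))
```

For `http://www.server.com:8000/` the capture is 8000, while a URL without an explicit port does not match at all.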
The following are built in variables:
if $state str pre
    action..
if $state str post
    log Connection terminated.
Note that variables are persistent across the inbound and outbound chains so they may be used as ways to communicate and synchronize the inbound and outbound rule sets.
if something
    eval variable = function arg
This will assign the result from running the arg string through the function named. The function must be compiled into the plug proxy. Functions should be modular so users may easily write their own functions. Functions are included from function.h. To see which functions are supported use plug -h. Currently the following functions are supported:
More functions to come in the next release as they become available.
Linux supports transparent proxying via the firewalling modules, ipchains, and iptables. In this case the packet filtering engine within the kernel rewrites the packets as though they were actually destined to the local host with the specified port. An example of this is:
iptables -t nat -A PREROUTING -i eth0 -p tcp \
    --dport 80 -j REDIRECT --to-port 3128
Here the kernel will redirect all packets arriving on eth0 and destined for port 80 to port 3128 on the local host. The proxy will accept connections on port 3128 and service the request.
In order for the proxy to know the original destination, a getsockname call must be performed. REAPOFF makes this call available via the special read only variable $transparent.
Hence if you want to allow transparent connections, do this:
if $state str pre
    set destination $transparent
Note that transparent proxying increases the security risk since clients do not need to be especially configured to use the gateway. In addition you need to allow DNS for internal clients so they can resolve their own addresses.
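As a point of comparison, a program written for an iptables (netfilter) kernel can recover the original destination of a redirected connection with the SO_ORIGINAL_DST socket option. The Python sketch below shows that mechanism. It is not taken from REAPOFF, whose internal call is only described above in terms of getsockname, and the function names are hypothetical.

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # value from <linux/netfilter_ipv4.h>


def parse_sockaddr_in(raw: bytes):
    """Decode a 16-byte IPv4 sockaddr_in into (address, port)."""
    # Layout: 2 bytes family, 2 bytes port (network order), 4 bytes address, 8 bytes padding.
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(packed_ip), port


def original_destination(conn: socket.socket):
    """Ask netfilter where a redirected connection was originally headed (Linux only)."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

`original_destination` only returns a meaningful answer for a connection that actually passed through a REDIRECT rule such as the one shown earlier.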
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Note that the behavior of REAPOFF is controlled by the rules used. These rules are also considered to be a modification to REAPOFF for the purpose of licensing. If you write your own rules for whatever reason, you must also distribute those rules in accordance with the GPL. If you require a special exception to these rules, you may contact the author for a special licensing arrangement. It goes without saying that any additional functions (see 10.1) written into REAPOFF constitute a modification of the source and must have a compatible license.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.
c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.
7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS