To Build a Honeypot

To Build A Honeypot

Honeynet Project
http://project.honeynet.org
Last Modified: 7 June, 2000

This article is a follow up to the "Know Your Enemy" series. Many people from the Internet community asked me how I was able to track black-hats in the act of probing for and compromising a system. This paper discusses just that. Here I describe how I built, implemented, and monitored a honeypot network designed specifically to learn how black-hats work.

What is a Honeypot?

For me, a honeypot is a system designed to teach how black-hats probe for and exploit a system. By learning their tools and methods, you can then better protect your network and systems. I do not use honeypots to capture the bad guy. I want to learn how they work without them knowing they are being watched. For me, a well designed honeypot means the black-hat never knew he was being tracked. There are a variety of different approaches on how you can do this. Mine is only one of many.

Before I continue, I would like to post a disclaimer. First, no honeypot can catch/capture all the bad guys out there. There are too many ways to spoof/hide your actions. Instead of going into detail on how this is possible, I highly recommend you check out Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection or fragrouter. Second, keep in mind that you are playing with fire. Someone far more advanced then you may compromise your honeypot, leaving you open to attack. Third, throughout this paper I use the term black-hat. I define a black-hat as anyone who is attempting un-authorized access to a system. This could be an 15 year old kid from Seattle, or a 45 year old company employee in accounting. Also, I refer to our black-hat as a he, however we have no idea what the true gender of the black-hat is.

Where to Begin?

There are a variety of different approaches to building a honeypot. Mine was based on simplicity. Build a standard box that I wanted to learn how the black-hat community was compromising. In this case it was Linux, but you can just as easily use Solaris, NT, or any other operating system. Don't do anything special to this system, build it as you would any other. Then put the system on the Internet and wait. Sooner or later someone will find the system and attack it. The system is built to be attacked and compromised, someone will gain root on that system, that is the goal. However, while they are gaining root (or Admin), you are tracking their every move.

This approach is different from other concepts. Network Associates has built a commercial product called CyberCop Sting, Designed to run on NT, this product can emulate variety of different systems at the same time, including Linux, Solaris, Cisco IOS, and NT. Fred Cohen has developed the deception toolkit, which are a variety of tools intended to make it appear to attackers as if a system has a large number of widely known vulnerabilities. One of my favorites is NFR's BackOfficer Friendly, which emulates a Back Orifice server. All of these have their advantages. However, my goal was to build a honeypot that mirrored my production systems, so I could better understand what vulnerabilities and threats existed for my production network. Also, the fewer modifications I make to the honeypot, the less chance the black-hat will find something "fishy" on the box. I do not want the black-hat to ever learn that he was on a honeypot.

The Plan

My plan was simple. Build a box I wanted to learn about, put it on the network, and then wait. However, there were several problems to this. First, how do I track the black-hats moves.? Second, how do I alert myself when the system is probed or compromised? Last, how do I stop the black-hat from compromising other systems? The solution to this was simple, put the honeypot on its own network behind a firewall. This solves a variety of problems.

First, most firewalls log all traffic going through it. This becomes the first layer of tracking the black-hat's moves. By reviewing the firewall logs, we can begin to determine how black-hats probe our honeypot and what they are looking for.
Second, most firewalls have some alerting capability. You can build simple alerts whenever someone probes your network. Since no one should be connecting to your honeypot, any packets sent to it are most likely black-hats probing the system. If there is any traffic coming FROM the honeypot out to the Internet, then the honeypot was most likely compromised. For an example on how set up alerting with Check Point FireWall-1, click here.
Third, the firewall can control what traffic comes in and what traffic goes out. In this case, the firewall lets everything from the Internet in, but only limited traffic out. This way the black-hats can find, probe, and exploit our honeypot, but they cannot compromise other systems.

The goal is to have our honeypot behind a controlled system. Most firewalls will do, as long as it can both control and log traffic going through it.

Tracking Their Moves

Now, the real trick becomes how to track their moves without them knowing it. First, you do not want to depend on a single source of information. Something can go wrong, things can be erased, etc. I prefer to track in layers. That way, if something does go wrong, you have additional sources of information. Also, you can compare different sources to paint a better picture.

Personally, I do not like to log information on the honeypot itself. There are two reasons for this. First, the fewer modification you make to the honeypot, the better. The more changes you make, the better the chance a black-hat will discover something is up. The second reason is you can easily lose the information. Don't forget, sooner or later the black-hat will have root on the honeypot. Several times I have had data altered, or in one case, the entire hard drive wiped clean. Our goal is to track the enemies moves, but log all the data on a system they cannot access. As we discussed above, our first layer of tracking is the firewall logs. Besides this, I track the black-hat's moves several other ways.

A second layer I use is the system logs on the honeypot. System logs provide valuable data, as they tell us what the kernel and user processes are doing. However, the first thing a black-hat normally does is wipe the system logs and replace syslogd. So, the challenge becomes logging syslog activity to another server, but without the black-hat knowing it. I do this by first building a dedicated syslog server, normally on a different network separated by the firewall. Then I recompile syslogd on the honeypot to read a different configuration file, such as /var/tmp/.conf. This way the black-hat does not realize where the real configuration file is. This is simply done by changing the entry "/etc/syslog.conf" in the source code to whatever file you want. We then setup our new configuration file to log both locally and to the remote log server (example). Make sure you maintain a standard copy of the configuration file, /etc/syslog.conf, which points to all local logging. Even though this configuration file is now useless, this will throw off the black-hat from realizing the true destination of our remote logging. Now, you will capture all system logs up to and including when the system is compromised. This will help tell us how the system was probed and compromised. It is also very interesting comparing these true system logs to the logs a black-hat has "cleaned" on a compromised system. This is the only time where I make a modification on the honeypot. Be advised, more advance users can detect these modifications using the command strings(1) on the syslogd binary. Then again, there are more advance ways to hide the modifications also. This is merely a suggestion you may want to consider.

The only problem with using a remote syslog server is it can be detected with a sniffer. Normally, black-hats either kill or replace syslogd when they gain root. If so, they can no longer sniff the syslog packets, since there are no longer any packets sent. However, if the black-hat does not modify nor kill the syslogd dameon, then they could sniff the packets sent. For the truly devious, you could send your syslogd traffic using a different protocol, such as IPX, which are normally not sniffed. Your level of paranoia may vary. There are also several alternatives you can use to standard syslogd. CORE-SDI has ssyslog, which implements a cryptographic protocol called PEO-1 that allows the remote auditing of system logs. For you NT users, they also have a Windows version, called slogger. There is also syslog-ng, developed by BalaBit Software, which is similiar in use to ssyslog, but uses SHA1 instead. All versions are free and open source.

My third layer of tracking (the firewall is the first, syslogd hack is the second) is to use a sniffer. I run a sniffer on the firewall that sniffs any traffic going to or from the honeypot. Since the honeypot is isolated by the firewall, you know all traffic goes through the firewall. The advantage of a sniffer is it picks up all keystrokes and screen captures, to include STDIN, STDOUT, and STDERR. This way you see exactly what the black-hat is seeing. Also, all the information is stored on the firewall, safely protected from the black-hat (I hope :). A disadvantage is the black-hat can hide his moves with encryption, such as ssh. However, if you are not running any such services on your honeypot, the blackhat may not use them. Also, a sniffer can be spoofed by advanced users, as discussed by the paper linked above.

My sniffer of choice is snort. Written by Marty Roesch, snort is a powerful ids sniffer that has all the functionality of tcpdump and much more. You can capture all the keystrokes in most plaintext sessions (example). It also has builtin IDS functionality, including customizable alerting and logging feartues. For examples of an IDS database, check out www.whitehats.com, which has an online signature database and several example config files. To check out the config file I use for snort, click here. You may want to run several different sniffers and/or IDS systems at the same time, such as Dragon, NFR, or Real Secure. Another idea I am playing with is running proxy servers on the firewall. That way specific traffic that runs through the firewall is proxied, allowing for more control and logging. I'm trying it out now with just a http proxy server on the firewall.

Another option for capturing keystrokes is to modify the shell. Most shells can be modified so that not only are all the keystrokes stored in the history file (such as .sh_history or .bash_history) but the shell can be modified to log all the keystrokes to syslog. Thus, the unknowning black-hat will have his keystrokes logged to syslog, and potentially a remote syslog server. Antonomasia has provided code to modify the bash file. Once again, use this with caution. The more modifications you make to a system, the greater the chance the modifications (and your honeypot) will be discovered. However, the advantage to this method is you will capture all keystrokes, including those from an encrypted session.

Finally, I run tripwire on the honeypot (there is also a NT version). Tripwire tells us what binaries have been altered on a compromised system (such as a new account added to /etc/passwd or a trojaned binary). I do this by running tripwire from a floppy, then storing the tripwire database to a floppy. You do NOT want any tripwire information stored locally on the system. By storing it on removable media, you can guarantee the integrity of the data. As an added precaution, I recommend compiling tripwire as statically linked. This way you are not using libraries that may be compromised on the honeypot. For the truly paranoid, boot off a floopy (such as tomsrtbt), then run tripwire. This protects against trojaned kernel modules. Tripwire is an excellent way to determine if you system has been compromised. Also, it is an excellent forensic tool that helps identify what modifications the black-hat has made.

You may find these layers as redundant. But remember, no single layer of information can capture all the traffic. Also, different sources give you different information. For example, most systems cannot detect stealth scans, however, many firewalls can. If your firewall logs your honeypot being scanned, but there is nothing in the system logs, then you were most likely scanned by a "stealth" scanner, such as nmap. Also, we are not perfect. Often while tweaking one service, you munge another. You could accidentally kill system logging or the sniffer. By having other layers of information, you still can put a picture together of what happened. If you develop any of your own methods of tracking, I highly recommend you implement them. The more layers you have, the better off you are. If you have any methods you would like to recommend, I would love to hear from. Additional methods can include hacking the system shell or kernel to log keystrokes, but to be dead honest, I haven't developed the skills yet to do that.

The Sting

Remember, our goal is to learn about the black-hat, without him ever knowing he was had. To gain a better understanding of this strategy, I highly recommend you watch one of my favorite movies, The Sting. We want to attract the black-hats, monitor them, let them gain root, and then eventually kick them off the system, all without them getting supicious. To attract black-hats, I like to name my honeypot enticing names, such as ns1.example.com (name server), mail.example.com (mail server), or intranet.example.com (internal web server). These are often primary targets for black-hats. Once we have enticed them, use the methods discussed above to track their actions.

Once the black-hat gains root, the question becomes, now what? Normally, I continue to monitor the black-hat for several days, to learn what he is up to. However, you have to be careful, eventually the black-hat will catch on that he is on a honeypot. If he does, bad things can happen. What I like to do is once I learn everything I can, I kick the black-hat off, normally by rebooting the box. I do this with the shutdown command, sending a message to all logged on users (the black-hat), stating the system is going down for routine maintenance. I then take the system off-line, remove the backdoors the black-hat made, and bring the system back online. Or, you can reinstall, building a new system. I recommend you fix the vulnerability that was used to gain access last time, so you can learn about new exploits/vulnerabilities.

The other issue is limiting the black-hat, we do not want him launching attacks from our own system. I do this by using the firewall. Remember, all traffic to and from the honeypot must go through the firewall. I use a rulebase that allows anything from the Internet to reach our firewall, but only limited traffic outbound (basically, the exact opposite of what a firewall is designed to do). The trick is, allowing enough outbound traffic so a black-hat does not get supicious, but we still have to limit their capabilities. If you block everything outbound, the black-hat will know right away that something is up. If you allow everything outbound, the black-hat can blatantly scan the Internet from your system. You now become liable for his actions, so we have to find a balance. Normally the first thing a black-hat does following access is to download their tool set. If they can't reach the Internet, they are going to cover their tracks and leave your system. What has worked for me is to allow all traffic inbound, and allow FTP, ICMP, and DNS (UDP) outbound. Normally, this is enough for the black-hat without them getting supiscious right away, but denies them utilizing most of their tools outbound. Your mileage may vary.

Thats it. All that is let left is to wait for the black-hat to strike (kind of like fishing). Ensure you have a good alerting mechanism, so you know as soon as possible when your system is being probed or has been compromised. You want to get as much information as soon as possible. You do not want the black-hat to catch on before you know he is there, bad karma may be coming your way. Good luck!

Conclusion

Honeypots are an extremely powerful tool that allows you to learn about the black-hat community. Correctly implemented, they give you an inside window on how the black-hat community works. There are a variety of different approaches to building and implementing a honeypot, mine is only one of many. My goal is to build a simple system that mirrors the production network. then sit back and wait. The key to tracking the enemy is layers. Do not depend on a single layer of information, as it can be altered or lost. By comparing different layers of information, you can also gain a better understanding of what the black-hat was doing. Happy hunting :)