Monday, May 23, 2011

Installing Apache Tomcat under Linux

After a security outage I experienced recently with JBOSS I decided to move my webapp to Apache Tomcat. The incident was an exploit of zecmd.war that needs to be removed from $JBOSS_HOME/server/default/deploy/management/ directory. Once again this reminds me of the thinning that needs to take place before JBOSS is taken to production. A good guide for thinning JBOSS can be found here.

Nonetheless, the decision was taken: we were moving to Apache Tomcat version 7. The installation we performed under Linux is not a difficult one; however, certain steps need to take place to address reliability and security. Let’s go through the installation step by step. First, we need to download and install Java, as well as Tomcat. Tomcat 7 needs Java 1.6 or later in order to function; JDK 1.6 can be found here.
After Java is in place, we may download Tomcat 7, which can be found here.
After downloading apache-tomcat-7.0.14.tar.gz (current stable version as of this writing), as root, I did the following:
Create tomcat user and group:

$>groupadd tomcat
$>useradd -m -g tomcat tomcat

Move apache-tomcat-7.0.14.tar.gz under /opt and change into that directory. Then unpack the *.tar.gz and remove it to free up space. Finally, change ownership of the newly created directory to the tomcat user:

$>mv apache-tomcat-7.0.14.tar.gz /opt
$>cd /opt
$>tar -xzvf apache-tomcat-7.0.14.tar.gz
$>rm apache-tomcat-7.0.14.tar.gz
$>chown -R tomcat:tomcat apache-tomcat-7.0.14

What we have now is a brand new installation of Apache Tomcat owned by the tomcat OS user. Directory /opt/apache-tomcat-7.0.14 is our CATALINA_HOME. Next we need to set up the tomcat user with the right environment variables, like the CATALINA_HOME mentioned above. Go to the user’s home directory and edit the .profile file. Add the following parameters:

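# JDK installation directory ($JAVA_HOME/bin/java must exist)
JAVA_HOME=/usr/java/jdk1.6.0_25
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
# Tomcat installation directory
CATALINA_BASE=/opt/apache-tomcat-7.0.14
export CATALINA_BASE
CATALINA_HOME=/opt/apache-tomcat-7.0.14
export CATALINA_HOME
# JVM options used when starting Tomcat: heap sizes and the server VM
CATALINA_OPTS="$CATALINA_OPTS -Xms128m -Xmx512m -server"
export CATALINA_OPTS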

JAVA_HOME must point to the directory where Java is installed. To be sure, check that the $JAVA_HOME/bin/java file exists. Otherwise, you will get an error in Tomcat’s log directory stating that the $JAVA_HOME/bin/java file is invalid. Also, if java can be found under /usr/bin, do not use /usr as JAVA_HOME, but rather the actual directory where Java was installed (in my case /usr/java/jdk1.6.0_25). After all, /usr/bin/java is a symbolic link pointing to the actual java binary.
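
If you are not sure where the JDK really lives, the symbolic link can be followed from the shell. The snippet below is a minimal sketch, assuming GNU readlink (for the -f option) and a JDK laid out as <home>/bin/java. The function name resolve_java_home is my own, not part of Java or Tomcat.

```shell
#!/bin/bash
# Hypothetical helper: print the home directory that a java binary
# (or a symbolic link to it) belongs to.
# Assumes GNU readlink (-f) and the usual <home>/bin/java layout.
resolve_java_home() {
    local real
    # Follow every symbolic link until the actual binary is reached.
    real=$(readlink -f "$1") || return 1
    # Make sure the target exists and is executable.
    [ -x "$real" ] || return 1
    # Strip the trailing /bin/java to get the home directory.
    dirname "$(dirname "$real")"
}

# Example: resolve_java_home /usr/bin/java
```

The printed directory is a candidate for JAVA_HOME; it is still worth confirming that it contains bin/java before putting it in .profile.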

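CATALINA_BASE and CATALINA_HOME both point to the directory where Tomcat is installed. CATALINA_BASE may be left out (it then defaults to CATALINA_HOME), but CATALINA_HOME must be set, since the service script further down relies on it to locate startup.sh and shutdown.sh.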

CATALINA_OPTS sets, among other things, the heap parameters for starting Tomcat, and the -server flag asks for the server JVM.
 
Another good idea is to run Tomcat as a service in Linux, by placing the appropriate tomcat file under /etc/init.d. Mine looks like this:

#!/bin/bash
# /etc/init.d/tomcat: start, stop or restart Tomcat

JAVA_HOME=/usr/java/jdk1.6.0_25
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
CATALINA_BASE=/opt/apache-tomcat-7.0.14
export CATALINA_BASE
CATALINA_HOME=/opt/apache-tomcat-7.0.14
export CATALINA_HOME
CATALINA_OPTS="$CATALINA_OPTS -Xms128m -Xmx512m -server"
export CATALINA_OPTS

# Print the PID(s) of any running Tomcat JVM (empty if none)
tomcat_pid() {
    ps -ef | grep tomcat | grep java | grep -v grep | grep -v stop | awk '{print $2}'
}

case $1 in
start)
    bash "$CATALINA_HOME/bin/startup.sh"
    ;;
stop)
    bash "$CATALINA_HOME/bin/shutdown.sh"
    ;;
restart)
    bash "$CATALINA_HOME/bin/shutdown.sh"
    # Wait for the old JVM to exit before starting a new one
    while [[ "$(tomcat_pid)" != "" ]]
    do
        sleep 2
        echo "shutting down app server.."
    done
    bash "$CATALINA_HOME/bin/startup.sh"
    ;;
esac
exit 0
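
One caveat with the restart branch: if Tomcat never exits, the while loop spins forever. A minimal sketch of a bounded wait follows. wait_for_exit is a hypothetical helper of my own, and it assumes you already know the PID of the Tomcat JVM (for example from the ps pipeline in the script).

```shell
#!/bin/bash
# Hypothetical helper: wait until process $1 exits, giving up after
# $2 seconds (default 30).
# Returns 0 if the process is gone, 1 if it is still alive at the timeout.
wait_for_exit() {
    local pid=$1 timeout=${2:-30} waited=0
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example: wait_for_exit "$pid" 60 || echo "Tomcat is still running"
```

When the helper returns 1, the script can report the failure instead of hanging, and leave it to the administrator to decide what to do with the stuck process.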

The script needs to be executable, so set the appropriate permissions on it:

$>chmod 755 /etc/init.d/tomcat

We are now ready to start Tomcat as a service under tomcat user, like this:

$>su - tomcat
$>service tomcat start

But beware! Let us first secure our installation. Take the following steps:

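1. Remove all files and folders under $CATALINA_HOME/webapps and, if your Tomcat version has it (it dates from older releases), under $CATALINA_HOME/server/webapps, as well as any other files not needed in production. Keep the webapps directory itself, since Tomcat deploys applications from it:

$>rm -rf $CATALINA_HOME/webapps/*
$>rm -rf $CATALINA_HOME/server/webapps/*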

   
2. Make sure the $CATALINA_HOME/conf/web.xml file has an entry called listings set to false. By default it is, but you may edit the file just to make sure:

<init-param>
     <param-name>listings</param-name>
     <param-value>false</param-value> 
</init-param>


In the same file, make sure that the following entry exists:
       
<error-page>
    <exception-type>java.lang.Throwable</exception-type>
    <location>/manos.jsp</location>
</error-page>


As you may have guessed, manos.jsp does not exist, so in case of a Java exception the user will receive a blank page instead of the stack trace, which can reveal vulnerabilities in your app. Please note, however, that in my app I have configured my own 500 and 404 error pages.
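
The listings setting can also be checked from the shell instead of opening the file. This is only a rough sketch: listings_disabled is a hypothetical function name, and the check assumes that <param-value> sits on the line right after <param-name>, as in the snippet above. A web.xml formatted differently would need a real XML parser.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the given web.xml sets the
# "listings" init-param to false.
# Assumes <param-value> is on the line right after <param-name>.
listings_disabled() {
    grep -A1 '<param-name>listings</param-name>' "$1" \
        | grep -q '<param-value>false</param-value>'
}

# Example:
# listings_disabled "$CATALINA_HOME/conf/web.xml" && echo "listings are off"
```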

3. To reveal as little as possible to a potential attacker in the HTTP headers, you may edit the $CATALINA_HOME/conf/server.xml file and add the server="Something" attribute to all Connectors. It will look like this:

<Connector port="8080" server="Something" />

4. Change the shutdown command in $CATALINA_HOME/conf/server.xml to something other than the default SHUTDOWN. Also, make sure that a proper set up of iptables hides port 8005 from the outside world.
        
<Server port="8005" shutdown="LukeSkywalker">
 
5. Tighten the permissions on certain directories and files, as follows:

$>chmod 700 $CATALINA_HOME/conf
$>chmod 700 $CATALINA_HOME/conf/*
$>chmod 700 $CATALINA_HOME/temp
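
After tightening the permissions, it is worth checking that nothing was missed. The sketch below assumes GNU find (for the -perm /mode syntax); find_loose_perms is a hypothetical name of my own.

```shell
#!/bin/bash
# Hypothetical check: list every entry under the given paths that the
# group or others can still read, write or execute.
# Assumes GNU find; symbolic links are skipped because their own mode
# bits are not meaningful.
find_loose_perms() {
    # -perm /077 matches when any group or other permission bit is set.
    find "$@" ! -type l -perm /077
}

# Example: find_loose_perms "$CATALINA_HOME/conf" "$CATALINA_HOME/temp"
```

An empty result means only the owner has access. Note that the function descends into subdirectories, so it may report entries that the chmod commands above, which are not recursive, did not touch.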
 
And after all this set up, you are pretty much done!

Some additional steps I took were to configure all Connector entries in $CATALINA_HOME/conf/server.xml to use UTF-8 by adding the attribute URIEncoding="UTF-8", to place the Connector/J library for MySQL under the $CATALINA_HOME/lib directory, and to edit $CATALINA_HOME/conf/context.xml to add the MySQL data source (the latter might not be a best practice, since this resource is globally available to all apps running under Tomcat - an alternative approach would be to place it under $CATALINA_HOME/conf/Catalina/localhost/ROOT.xml, assuming your app is called ROOT).

Friday, May 6, 2011

Everything is a query these days - Part 2

In my previous post I advocated a solution for detecting and subsequently deleting duplicate records from a Database. What about the reverse problem, i.e. when two Databases have a common table, but they are not synchronized? Let us suppose that a table in a production Database is correctly updated via an external app, but the same table in the backup Database is not updated due to a synchronization issue. Assuming that no fancy tools are out there to assist you with this issue, how would you manually update the backup table so as to be in sync with production?

The table in question is called emp (emp_backup in the backup Database) and has four columns: emp_id, first_name, last_name, dept_id. It is obvious from this structure that the primary key is emp_id, and the records missing from the backup table are those whose emp_id is not present there.

The first solution may be something like the following query: 

INSERT INTO emp_backup
SELECT * FROM emp
WHERE emp_id NOT IN (SELECT DISTINCT emp_id FROM emp_backup);

This query has two evident problems. The first is performance, given that for every row fetched from the emp table a check has to be made to determine whether or not its emp_id value is present in the backup table.
But there is a bigger issue than this: NOT IN is fragile. Some RDBMSs cap the number of values an explicit NOT IN (...) list may hold (Oracle, for instance, rejects a list of more than 1000 expressions), so spelling the ids out by hand stops working once the table grows. A subquery is not subject to that cap, but it has a trap of its own. If the query

SELECT DISTINCT emp_id FROM emp_backup

returns even a single NULL, the NOT IN condition is never true and nothing at all gets inserted. So, what is the best query to solve this? The first solution that comes to mind is to upsert the emp_backup table, i.e. insert the record if it does not exist, or make a dummy update if it does. Let us transform this into an Oracle query:

MERGE INTO emp_backup b
USING emp e
 ON (b.emp_id = e.emp_id)
WHEN NOT MATCHED THEN
  INSERT (b.emp_id, b.first_name, b.last_name, b.dept_id)
  VALUES (e.emp_id, e.first_name, e.last_name, e.dept_id);

The query above is Oracle's way of upserting data from one table to another. MySQL has a loosely similar command called REPLACE INTO, although it deletes and re-inserts a row that already exists rather than leaving it alone. More on REPLACE INTO can be found here.

Wednesday, April 13, 2011

integrating eclipse with jboss and svn

While this seems to be a trivial task for web developers working with the eclipse IDE and the Jboss Application Server in their development environment, there are a couple of traps I keep stumbling upon. The first one is generating a war file with an empty lib directory. This happens to me with eclipse Galileo and jboss 5.1. While I refuse to upgrade both my IDE and jboss version and ultimately resolve the problem, I found the answer on this thread.

The thread basically proposes to change the Java output folder in your jboss server screen, close the project, reopen it, change again the Java output folder and finally redeploy the project. This solves the problem in a non-quick but for sure dirty way.

Just to add to that thread, I noticed that the problem occurred more often after I started my laptop, opened eclipse and published my project to the server. If, on the other hand, I start my laptop, start jboss via eclipse and then do whatever (write some code, publish it to jboss, restart the server, redeploy my project, etc.), it all works as it is supposed to!

The second issue is more of a personal mental glitch rather than a real issue. It has to do with eclipse and svn locking of files. There are times where I synchronize with the repository and want to update my projects. But instead of clicking "override and update" I click "override and commit". I do not want to make other developers angry, so I quickly hit the "Cancel" button. This has the effect of locking the chosen file, not for other users but for me!

I read some threads on how to resolve this, but what worked for me is the following:

1. Close eclipse
2. Go to your workspace, one directory above the directory that contains the locked resource.
3. Rename the directory that contains your locked resource to xxx_old
4. Take an update
5. Open eclipse and check whether it's all good. If yes, delete the xxx_old dir

That's it!

Monday, April 4, 2011

Everything is a query these days - Part 1

So you have an IT problem and you are desperate to solve it. You get some sort of error description and try to figure out what the error is all about. Or you get a phone call from an angry customer complaining that your application has malfunctioned. What do you do? The answer is so easy. You google it!

Today I had an issue of duplicate records in a database. My team is in beta testing right now and the application owners found out that a particular User Interface fetched duplicate records. In the words of IT, that means either some undefined code break somewhere in the application (bad scenario) or duplicate entries in the database (good scenario). After some searching I figured out that it was the latter, i.e. duplicate row entries in a reference table. A mysql table.

So how did I solve this trivial problem? I simply typed "duplicate row entries in mysql". And the answer was magically there:

1. Duplicate Record Detection

From a vast variety of queries, this was the simplest for a mysql database:

SELECT name
FROM emp_t
GROUP BY name
HAVING COUNT(*) > 1

Assuming here that you do not allow duplicate entries of the name column in your database.

So I performed the searches and found out the duplicate records in question. Of course the next step is to delete these records. How do I do that?

2. Duplicate Record Deletion

The funny thing here is that you do not even have to figure out the mechanics behind the fancy deletion query. All you need is a tmp table just to perform a test prior to deleting the actual duplicate records from your "production db". Easy:

CREATE TABLE emp_tmp
AS SELECT * FROM emp;

The actual delete query may be the following:

-- Multi-table DELETE: remove every row that has a duplicate with a higher id,
-- so that for each name only the row with the highest id survives
DELETE t1
FROM emp_tmp t1
        INNER JOIN emp_tmp t2
            ON t1.name = t2.name
WHERE t1.id < t2.id;

For an SQL beginner this looks like a tough job. But remember, it's all about googling; someone else has done that for you! All you need to do is check whether the above query meets your needs, i.e. deletes duplicate records from your target table. After you make sure it works with the temporary table you created, go ahead and try it on the actual table!

Another point here is why and how duplicate records got inside in the first place. This is lack of design for sure. Duplicate records mean that someone should have put a unique integrity constraint in the fields now considered duplicate. But didn't. So get your designs right before the coding begins!

Sunday, April 3, 2011

Hello World

The first program you are taught in virtually all programming languages is how to print "Hello World" to the standard console. I am not going to deviate from this principle: my first article will be named after this infamous programming principle. So you get the idea. This is a blog about IT, Computer Science and Software Engineering.

Me? I am just a guy trying to write quality code. Code that gets the job done. I am lucky enough to have to deal with all sorts of programming languages almost on a daily basis. Today is C (segmentation fault, go figure). Tomorrow is Java (another null pointer exception, what is wrong here?). Some other time it is PL/SQL, perl, bash, a little xsl, just a touch of javascript. You get the idea.

Most of the time, after I finish typing a program, no matter how large or small, it just does not work as intended. It needs more testing. But then something gets in the way of my testing. I test one thing, and I realize I just broke the code of another module. Frustration comes quickly, followed by fear and anxiety. I am not going to deliver the module on time. What will my manager think? Or my module has malfunctioned, doing nasty things. What will my customer think? What will my customer say to my manager? It is a vicious circle that all programmers have to live with.

So how can we make our lives just a bit better? By controlling our fear over IT. Acknowledging the fact that when we deal with IT, matters will most definitely go wrong at some point. But there is no reason to panic over it! No reason to be afraid that the end of the world is near, just because your program has malfunctioned to the point of -you think- no return. Most of the times, the answer to your problem is right there, staring you in the face.

This blog will serve as my personal diary of fear over IT. I want to document all the times I panicked for not being able to find the solution to an IT problem. In the years to come, I want to read these stories and laugh over my frustrations and failures. I am sure that this will make me not just a better programmer, but a better individual.

Feel free to read, laugh, empathize, comment and share your stories.