Big business requires big data. For your company to handle such massive amounts of data, you’ll likely need a NoSQL database. Of the available databases, which is best suited to your task? With quite a few to consider, the choice can be a bit daunting. Fortunately, a number of those options are open source and easy to deploy, so you can quickly kick the tires and see whether one fits your needs.

One such NoSQL database is Apache Cassandra, which offers linear scalability, high availability, and no single point of failure, and can run on commodity hardware. Cassandra is used by a number of very large companies (including Reddit, Netflix, and GitHub).

I want to walk you through the process of installing Apache Cassandra on Ubuntu Server 20.04.


What you’ll need

  • A running instance of Ubuntu Server 20.04

  • A user with sudo privileges

How to install Java

Apache Cassandra requires Java. We need to make sure to install the correct version of Java, which is OpenJDK 8. For that, we turn to the standard repository.

Log in to your Ubuntu Server and install Java with the command:

sudo apt install openjdk-8-jdk -y

When that installation completes, you’re ready to install Apache Cassandra.
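
Before moving on, it can be worth confirming that the Java on your PATH really is version 8. The following is a minimal sketch, not part of the official install procedure; the function names java_major_version and check_java8 are my own, and it assumes the usual `version "1.8.0_..."` style of output that `java -version` prints to standard error.

```shell
#!/usr/bin/env bash
# Print the Java major version parsed from `java -version`-style text on stdin.
# Handles the legacy "1.8.0_292" scheme as well as "11.0.11" and "17".
java_major_version() {
  sed -n 's/.*version "\([0-9][0-9]*\)\.\{0,1\}\([0-9]*\).*/\1 \2/p' | head -n 1 |
    while read -r first second; do
      # Legacy versions start with "1.", so the major version is the second field.
      if [ "$first" = "1" ]; then echo "$second"; else echo "$first"; fi
    done
}

# Report whether the java on PATH is version 8 (java -version writes to stderr).
check_java8() {
  local major
  major="$(java -version 2>&1 | java_major_version)"
  if [ "$major" = "8" ]; then
    echo "Java 8 detected"
  else
    echo "Expected Java 8, found: ${major:-none}" >&2
    return 1
  fi
}
```

After sourcing the file, run check_java8. If it reports a different version, another JDK is probably the system default.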

How to install Apache Cassandra

Apache Cassandra can’t be installed from the standard repositories. Because of that, we need to add the official repository. In order to do this over HTTPS, we must first install apt-transport-https with the command:

sudo apt install apt-transport-https -y

Once that dependency is taken care of, download and install the necessary GPG key with the command:

wget -q -O - | sudo apt-key add -

Create an apt list file for Apache Cassandra with the command:

sudo nano /etc/apt/sources.list.d/cassandra.list

In that file, add the following:

deb 311x main

Save and close the file.

Update apt with the command:

sudo apt-get update

Finally, install Apache Cassandra with the command:

sudo apt-get install cassandra -y

When the installation completes, verify it with the command:

nodetool status

You should see the database server is running on datacenter1 (Figure A).

Figure A

Apache Cassandra is up and running on Ubuntu Server 20.04.
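
Cassandra can take a little while to start, so nodetool status may fail if you run it immediately after the install. As a rough sketch (the function names and the default retry counts are my own choices, not anything Cassandra ships), you could poll until a node shows the UN (Up/Normal) state at the start of its line:

```shell
#!/usr/bin/env bash
# Succeed if nodetool-status-style text on stdin lists a node in state UN.
has_up_normal_node() {
  grep -Eq '^UN[[:space:]]'
}

# Poll `nodetool status` until a node reports UN, or give up.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 2).
wait_for_cassandra() {
  local tries="${1:-30}" delay="${2:-2}" i
  for ((i = 0; i < tries; i++)); do
    if nodetool status 2>/dev/null | has_up_normal_node; then
      echo "Cassandra is up"
      return 0
    fi
    sleep "$delay"
  done
  echo "Cassandra did not report UN after $tries attempts" >&2
  return 1
}
```

Calling wait_for_cassandra 10 5, for example, would try ten times with five seconds between attempts.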

How to change the name of your database cluster

By default, your Apache Cassandra cluster will be named “Test Cluster.” Let’s rename that. To do so, you must first access the Apache Cassandra console with the command:

cqlsh

At the console, rename the cluster with the command:

UPDATE system.local SET cluster_name = 'NAME' WHERE KEY = 'local';

Where NAME is the new name of your cluster.

Exit the console with the command:

exit

Now we need to change the name in the Apache Cassandra configuration file. Open the file for editing with the command:

sudo nano /etc/cassandra/cassandra.yaml

Look for the line starting with cluster_name and change it to the same name you used in the UPDATE command (Figure B).

Figure B

The Apache Cassandra configuration file is where you configure the cluster name.
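
Because the name has to be identical in both places, a quick check of the configuration file can save a failed restart. This is only a sketch under the assumption that cluster_name sits on a single, uncommented line; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the cluster_name value from a cassandra.yaml-style file, quotes stripped.
configured_cluster_name() {
  sed -n 's/^cluster_name:[[:space:]]*//p' "$1" | head -n 1 | tr -d "'\""
}

# Succeed only if the configured name equals the expected one.
# $1 = path to the config file, $2 = the name used in the UPDATE command.
cluster_name_matches() {
  local file="$1" expected="$2" actual
  actual="$(configured_cluster_name "$file")"
  if [ "$actual" = "$expected" ]; then
    echo "cluster_name matches: $actual"
  else
    echo "cluster_name is '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

For example, cluster_name_matches /etc/cassandra/cassandra.yaml 'NAME' (with NAME replaced by your cluster name) should print a confirmation before you restart the service.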

Flush the system keyspace to disk with the command:

nodetool flush system

Restart Apache Cassandra with the command:

sudo systemctl restart cassandra

Now, when you log in to the Apache Cassandra console, you should see the new name of your cluster is in effect (Figure C).

Figure C

We’ve renamed our cluster.

How to open Apache Cassandra for remote connection

By default, Apache Cassandra listens only for local connections. That’s great if the applications that will use the database are hosted on the same machine. If you need to use Apache Cassandra as a remote database, you’ll need to enable remote connections. To do this, open the configuration file again with the command:

sudo nano /etc/cassandra/cassandra.yaml

In that file, you must change three lines. The first is:

start_rpc: false

Change the above to:

start_rpc: true

The next line is:

rpc_address: localhost

Change the above line to:

rpc_address: 0.0.0.0

The next line needs to be uncommented (by removing the # character) and changed to:

broadcast_rpc_address: SERVER_IP

Where SERVER_IP is the IP address of your hosting server.
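
If you’d rather script these three changes than make them by hand in nano, something like the following could work. Treat it as a sketch: set_yaml_key is a hypothetical helper of my own, it relies on GNU sed (which Ubuntu provides), it assumes each setting sits on one line, and the 0.0.0.0 value for rpc_address is my assumption for “listen on all interfaces.” It does nothing if the key is missing from the file.

```shell
#!/usr/bin/env bash
# Set KEY to VALUE in a flat "key: value" config file, uncommenting the
# line first if it is present as "# key: ...". Requires GNU sed (-i, -E).
# Limitation: KEY and VALUE must not contain the "|" character.
set_yaml_key() {
  local file="$1" key="$2" value="$3"
  sed -i -E "s|^#?[[:space:]]*${key}:.*|${key}: ${value}|" "$file"
}

# Example (hypothetical invocation; run as root and replace SERVER_IP):
#   conf=/etc/cassandra/cassandra.yaml
#   cp "$conf" "$conf.bak"                      # keep a backup first
#   set_yaml_key "$conf" start_rpc true
#   set_yaml_key "$conf" rpc_address 0.0.0.0    # assumed "all interfaces" value
#   set_yaml_key "$conf" broadcast_rpc_address SERVER_IP
```

Because the pattern is anchored to the start of the line, setting rpc_address leaves the broadcast_rpc_address line alone.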

Save and close the file. Flush and restart Apache Cassandra with these commands:

nodetool flush system
sudo systemctl restart cassandra

You should now be able to connect to Apache Cassandra from remote servers.
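
To test that from another machine without installing any Cassandra tooling, you can simply check whether the port answers. The sketch below assumes the default CQL native transport port, 9042, has not been changed, and that bash and the coreutils timeout command are available; port_open is a name I made up.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra packages are needed.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (replace SERVER_IP with the address of your Cassandra server):
#   if port_open SERVER_IP 9042; then
#     echo "Cassandra port reachable"
#   else
#     echo "Cannot reach port 9042; check the firewall and cassandra.yaml" >&2
#   fi
```

A successful connection only shows that something is listening on the port; it doesn’t prove that authentication or queries will work.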

And that’s how you can easily get this powerful, open source NoSQL database server installed. Enjoy your newfound ability to scale your data to meet enterprise-level needs.

Subscribe to TechRepublic’s How To Make Tech Work on YouTube for all the latest tech advice for business pros from Jack Wallen.
