Setting Up Neo4j on an Azure VM

NB: I’m leaving this up for continuity purposes, but MS Open Tech no longer exists, so the VM Depot is no longer being updated (see https://msopentech.com/blog/2015/04/17/nextchapter/).  Newer versions of Neo4j will need to be installed manually on a VM you set up yourself.

It’s time for me to get back to experimenting with different datastores and data structures (and to burn some Azure credits I’m otherwise wasting).  One datastore I’m interested in for my day job is the graph database Neo4j.  Relationships are fascinating to me, and a graph database stores relationships as data you can query.  There are DBaaS providers (managed, cloud-based Neo4j) such as graphstory, but for getting started and learning it’s probably cheaper to run your own instance, and here I’ll show you one way to set one up.  Fortunately, Neo Technology (the company behind Neo4j) created a VM image on Microsoft’s VM Depot, which we can use to spin up an Azure VM.
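
To make "relationships as data you can query" a bit more concrete, here is a toy Python sketch.  It is not Neo4j and says nothing about how Neo4j stores data internally; the people and relationship types are made up for illustration.  The point is only that each relationship is a record of its own, so a question like "who do my friends know?" becomes a query over those records.

```python
# Toy illustration only: relationships kept as plain data.
# All names and relationship types here are hypothetical.
from typing import List, NamedTuple


class Relationship(NamedTuple):
    start: str      # node the relationship starts from
    rel_type: str   # kind of relationship, e.g. "KNOWS"
    end: str        # node the relationship points to


relationships = [
    Relationship("Alice", "KNOWS", "Bob"),
    Relationship("Bob", "KNOWS", "Carol"),
    Relationship("Alice", "WORKS_WITH", "Carol"),
]


def neighbors(name: str, rel_type: str) -> List[str]:
    """Return nodes reachable from `name` via one `rel_type` relationship."""
    return [r.end for r in relationships
            if r.start == name and r.rel_type == rel_type]


def friends_of_friends(name: str) -> List[str]:
    """Follow KNOWS twice: a two-hop query over the relationship data."""
    return [fof
            for friend in neighbors(name, "KNOWS")
            for fof in neighbors(friend, "KNOWS")]
```

A graph database does this kind of traversal natively and on far larger data; the sketch only shows the shape of the idea.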

  1. Obviously, you need an Azure account, so create one if you don’t have one.  Despite the promise of a “Free Account”, running VMs on Azure is not free, and the cheapest option for me was $13/month (prices at https://azure.microsoft.com/en-us/pricing/details/virtual-machines/#Linux).  That’s not terrible, especially if you remember to turn off your VM when you’re not using it.  The day job gets me MSDN credits, and anyone in the same boat can probably run a small VM without worries.
  2. It would also be a good idea to know some Linux, because the image runs Ubuntu.  If you don’t know the difference between SSH and LTS, you might want to pick up a used copy of Ubuntu Unleashed for 12.04 LTS for a buck or so.  It’s scary thick, but don’t panic: it’s organized well enough to be used as a reference.
  3. In order to publish a VM Depot image to your Azure account, you need a PublishSettings file (which is similar to a WebDeploy file, if you know what those are).  Just click https://manage.windowsazure.com/publishsettings/index?client=xplat and save the file locally.  You don’t need to do anything else, even though there are additional instructions on the page.
  4. Find the Neo4j Community on Ubuntu VM on the VM Depot.  This image ships Neo4j 2.0.1 and the current Neo4j is 2.3, so it’s a little behind but good enough as a sandbox.  (This link might change if the Ubuntu OS or Neo4j version is updated, so if it’s broken let me know and I’ll update this post.)
  5. On the VM Depot page, click the “Create Virtual Machine” button.  If you haven’t logged in you’ll be prompted to do so, and then you’ll need to provide your PublishSettings file.
  6. Next you’ll get to choose your DNS name, VM username and a few more options.  Pay attention to the ADVANCED settings: the default machine size will cost you about $65/month, so this would be a good time to scale it down a bit.  This is also a good time to change the default ports for Neo4j or SSH if you want to.
  7. Now wait about 10 minutes for everything to get set up.  The publish process runs in the background, so if you close the window you’ll get an email once it’s complete.
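
The cost figures in steps 1 and 6 are easier to compare with a little arithmetic.  The sketch below assumes, as a simplification, that a VM is billed only for the hours it actually runs and that a month is roughly 730 hours.  The $13 and $65 monthly prices are the ones quoted above; the function name and the 2-hours-a-day usage pattern are mine, and a real Azure bill may include other charges.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month (365 * 24 / 12)


def estimated_monthly_cost(full_month_price: float, hours_per_day: float,
                           days_per_month: int = 30) -> float:
    """Estimate the bill if the VM only runs `hours_per_day` hours a day.

    Assumes cost scales linearly with running hours, which is a simplification.
    """
    hourly_rate = full_month_price / HOURS_PER_MONTH
    return round(hourly_rate * hours_per_day * days_per_month, 2)


if __name__ == "__main__":
    # Prices quoted in the post: smallest size vs. the default size.
    for label, price in [("smallest size", 13.0), ("default size", 65.0)]:
        part_time = estimated_monthly_cost(price, hours_per_day=2)
        print(f"{label}: ${price:.2f} always on, "
              f"about ${part_time:.2f} at 2 hours/day")
```

Under those assumptions, a sandbox you only switch on for a couple of hours a day costs a small fraction of the always-on price, which is why scaling down and shutting down are both worth the effort.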

Once you get the confirmation, you’re ready to start using Neo4j!
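
Before opening a browser, it can help to confirm the VM’s ports are reachable.  The following is a minimal sketch using only the Python standard library.  The host name in the usage comment is a hypothetical placeholder for the DNS name you chose in step 6, and the ports assume the usual defaults (7474 for Neo4j’s HTTP interface, 22 for SSH), so adjust them if you changed the ports during setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_vm(host: str, ports: dict) -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_is_open(host, port) for name, port in ports.items()}


# Example usage (hypothetical host name; replace it with your own DNS name):
# check_vm("my-neo4j-vm.example.net", {"neo4j-http": 7474, "ssh": 22})
```

If the Neo4j port reports as open, pointing a browser at that port on your VM’s DNS name should bring up the Neo4j web interface, assuming the image keeps the default configuration.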