Getting an artifact into the maven central repository

So I’m going to try to get something into the maven central repository.

There’s a description of how to do this on the sonatype.org site, but it doesn’t go into enough detail for me, so I’m going to write up some notes on it here.

The project I’m opting to upload is a fairly trivial Bandcamp API client.

The Bandcamp API (which is fairly unrelated to the topic of this blog post) allows you to get song lyrics and graphics from the Bandcamp site, which is a bit like MusicBrainz, but shinier, not as complete, and monetised.

The maven artifact that holds the Java binding for the Bandcamp API will live in the groupId:artifactId co-ordinates of com.randomnoun.bandcamp:bandcamp-api-client.

Since I’m going to have slightly higher standards for testing/documentation for these projects, it probably helps that the project isn’t that big (about 10 Java classes in total, most of which are transfer objects or POJOs).

So, in no particular order:

Choose a license

I’m going to use the BSD simplified license, since it just seems like less hassle in the long term.

The content of the maven license URL appears to be dumped directly into the generated site documentation, so for me this involved creating another page (text / html) out on the web somewhere with the contents of the license (since the standard OSI page has a fair amount of extraneous header/footer/sidebar guff that didn’t transfer cleanly).

Use of the Bandcamp API is governed by the Bandcamp API Terms of Use Agreement.
This agreement also requires compliance with the Website’s Terms of Use and
Privacy Policy, so I’ve added all of those into the project/licenses section of the project’s pom.xml.
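
As a rough sketch, the licenses section ends up looking something like the fragment below. The element names are the standard POM ones; the license names and every URL here are placeholders made up for illustration, not the values from the real pom.xml.

```xml
<!-- Sketch only: the names and URLs below are placeholders, not the project's real values -->
<licenses>
  <license>
    <name>Simplified BSD License</name>
    <!-- points at the stripped-down copy of the license text, not the OSI page -->
    <url>http://example.com/licenses/bsd-simplified.html</url>
    <distribution>repo</distribution>
  </license>
  <license>
    <name>Bandcamp API Terms of Use Agreement</name>
    <url>http://example.com/bandcamp/api-terms-of-use.html</url>
  </license>
  <license>
    <name>Bandcamp Terms of Use</name>
    <url>http://example.com/bandcamp/terms-of-use.html</url>
  </license>
  <license>
    <name>Bandcamp Privacy Policy</name>
    <url>http://example.com/bandcamp/privacy-policy.html</url>
  </license>
</licenses>
```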

Create a public-facing project site

The standard maven site:site goal / lifecycle / mojo / plugin looks fairly hideous from the vantage point of the year 2013, so I’ve used the reflow maven site skin, which gives a more modern, bootstrap-based site.

The standard maven skin (lhs) versus the reflow maven skin (rhs)

I notice bootstrap itself got bumped from 2.3.2 to 3.0.0 a couple of weeks ago so I’m looking forward to finding out how that’s going to break everything I’ve written up until this point, which is one of the perks of writing software that constantly changes whilst you’re attempting to use it.

Turns out that maven has this thing called Doxia which it uses to prevent you from writing HTML to document the thing, instead preferring you to use APT, or FML, or 10 other flavours of home-grown shit text markup which is apparently going to stop everyone from using HTML because (in the humble opinion of the Maven steering committee) these are easier to use than the language that everyone else has been using up until this time.

It creates its own document event model (called the Doxia Sink API) which is Apache’s attempt to reinvent the wheel again, causing me to write a doxia module that kind-of-almost allows me to use HTML instead. [2]

Seeing as I’m going to all this trouble of generating a site, I enabled a few maven report modules (code coverage, test/main javadocs, test/main source xref, surefire) and to keep things looking consistent, created a custom javadoc stylesheet which I had to write twice because Oracle decided to completely change the javadoc element structure in Java 1.7 for no reason whatsoever.

The standard javadoc stylesheet these days (lhs) versus the modified stylesheet (rhs)

The site is hosted at http://code.randomnoun.com/(module-name), which in this case will be http://code.randomnoun.com/bandcamp-api-client, and is updated automatically during release via the scp://code.randomnoun.com/var/www/code.randomnoun.com/(module-name) reference in the project/distributionManagement/site/url element of the pom.xml. That required a bit of stuffing around with ~/.m2/settings.xml server credentials and buggerising up the SSH known_hosts and authorized_keys files between the build server and the web server holding the site, then re-adding support for the scp protocol, which is no longer included by default in Maven 3.
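
A minimal sketch of the pom.xml side of that, assuming the wagon-ssh build extension is what supplies scp support under Maven 3. The site id and the extension version are assumptions for illustration; the id just has to match a server entry in ~/.m2/settings.xml that holds the SSH credentials.

```xml
<!-- Sketch only: the <id> and the wagon-ssh version are assumptions -->
<build>
  <extensions>
    <!-- Maven 3 no longer bundles the scp wagon, so it is added back as an extension -->
    <extension>
      <groupId>org.apache.maven.wagon</groupId>
      <artifactId>wagon-ssh</artifactId>
      <version>2.4</version>
    </extension>
  </extensions>
</build>

<distributionManagement>
  <site>
    <!-- must match a <server><id> in ~/.m2/settings.xml -->
    <id>code.randomnoun.com</id>
    <url>scp://code.randomnoun.com/var/www/code.randomnoun.com/bandcamp-api-client</url>
  </site>
</distributionManagement>
```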

Create a public CVS repository (accessible by both http and pserver protocols)

The way things used to be

So this is how I normally access CVS, using a tube map metaphor, because I think it’s more entertaining than a UML sequence diagram, and it’s the only way I’m going to get remotely near a high speed train now that Tony Abbott is in power.

Access to CVS from machines within the randomnoun corporate firewall

I, dear reader, am on the left hand side of this diagram, and wish to retrieve things from the CVS repository, on the right hand side. To do this I fire up my CVS client, point it at cvs.dev.randomnoun, which is an internal DNS record resolving to 192.168.0.13, which connects to the standard port 2401, which then serves up files from the /var/lib/cvsd folder of bnedev03, which is the VM that holds my source code.

Notice the .randomnoun TLD, which is a measure I use to prevent internal URLs from leaking onto the internet (where hostnames usually end with, say, .com or .au).

The way things are going to look from here on

Because the SCM links in the pom.xml are now public, I opted for creating a separate read-only CVS repository for things I’m making publicly available.

This allows me to avoid the horrible latency of cloud-based version control systems, whilst hopefully minimising any data leakage I’d otherwise suffer by hosting it inside the same cvs repository as the rest of my other modules of varying code quality.

I’ve created a new internal DNS entry (cvs.randomnoun.com) which resolves to the same IP address as above (192.168.0.13), and an external DNS entry of the same name (cvs.randomnoun.com) which resolves to my externally accessible IP (123.243.191.198).

The public CVS server sits on the same internal VM, but listens on a non-standard port (2402). External connection requests to 123.243.191.198:2401 are routed to 192.168.0.13:2402. The read-only cvsd daemon has its repository refreshed periodically from the read/write cvsd by a cronjob on the cvs machine (from /var/lib/cvsd to /var/lib/cvsd-public):

Access to CVS from machines outside the randomnoun corporate firewall.
The fluffy cloud image above represents the smoke and mirrors that constitute The Internet.

This has the advantage that:

  • internal updates go to the read/write cvs repository, whereas
  • external access uses the same SCM URL, but is directed to the read-only cvs repository subset, where hopefully things are less likely to go pear-shaped.

Feel free to complain that I use the same tube station icons for processes, machines and file systems above, but let me point out that these are mostly virtual machines and virtual file systems, so it’s more similar under the hood than you might at first think [1].

You should be able to grab the source using the following anonymous checkout:

$ cvs -d:pserver:anonymous@cvs.randomnoun.com:/randomnoun login
Logging in to :pserver:anonymous@cvs.randomnoun.com:2401/randomnoun
CVS password:
cvs login: CVS password file /home/user/.cvspass does not exist - creating a new file
$ cvs -d:pserver:anonymous@cvs.randomnoun.com:/randomnoun checkout bandcamp-api-client
cvs checkout: Updating bandcamp-api-client
U bandcamp-api-client/.classpath
U bandcamp-api-client/.project
U bandcamp-api-client/pom.xml
cvs checkout: Updating bandcamp-api-client/.settings
...
$

There’s also a publicly visible cvsweb installation running on the virtual host which I’ve set up at http://cvs.randomnoun.com/cvsweb/cvsweb.cgi/bandcamp-api-client/.
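
For reference, the public SCM links in the pom.xml would look roughly like the sketch below. It follows the Maven SCM URL format for CVS pserver access (scm:cvs:pserver:user@host:/repository:module) and reuses the hostnames above; treat the exact values as an illustration rather than a copy of the real pom.xml.

```xml
<!-- Sketch only: built from the anonymous checkout and cvsweb URLs above -->
<scm>
  <!-- read-only anonymous access via pserver -->
  <connection>scm:cvs:pserver:anonymous@cvs.randomnoun.com:/randomnoun:bandcamp-api-client</connection>
  <!-- browsable view of the same module -->
  <url>http://cvs.randomnoun.com/cvsweb/cvsweb.cgi/bandcamp-api-client</url>
</scm>
```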

Create some mailing lists

Which is all a bit pointless in this day and age, but the apache maven modules have them so I thought I’d try to keep up with the Joneses. The mailing lists are run by GNU Mailman (some instructions, some slightly different instructions), which involved setting it up and configuring a postfix virtual MX host out in The Cloud somewhere (part 1, part 2, part 3).

The web interface to the bandcamp-api-client mailing list is hosted here.

Ignore the ciManagement bits

Note that I’ve still got links to internal (.dev.randomnoun) JIRA/BAMBOO systems in the pom.xml, but seeing as that metadata isn’t used in the build, I’m going to leave it for now.

Create an account and raise a ticket on the OSSRH JIRA (my ticket)

The steps to raise this ticket are listed at https://docs.sonatype.org/display/Repository/Sonatype+OSS+Maven+Repository+Usage+Guide .

I’m using the following settings:

summary: bandcamp-api-client – Java bindings for the Bandcamp API
groupId: com.randomnoun.bandcamp
project URL: http://code.randomnoun.com/bandcamp-api-client
scm URL: http://cvs.randomnoun.com/cvsweb/cvsweb.cgi/bandcamp-api-client
nexus username: knoxg
already sync to central: no

The sonatype guys were pretty quick in creating access, which was nice. They also changed the applied groupId from ‘com.randomnoun.bandcamp’ to ‘com.randomnoun’, which allows me to create new subgroups automatically; so it appears the applied groupId should be the top-level writable groupId for an organisation, rather than the groupId used for an actual maven artifact.

Create a PGP signing key

I followed the instructions at http://maven.apache.org/developers/release/pmc-gpg-keys.html and https://docs.sonatype.org/display/Repository/How+To+Generate+PGP+Signatures+With+Maven , and was somewhat frustrated by the mucking about required to generate the minimum levels of randomness required.

I’ve decided to publish my keys to the SKS keyserver, which is one of the servers checked by the OSSRH verification process.

knoxg@bnestg01:~$ gpg --version
gpg (GnuPG) 1.4.9
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Home: ~/.gnupg
Supported algorithms:
Pubkey: RSA, RSA-E, RSA-S, ELG-E, DSA
Cipher: 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH
Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2
knoxg@bnestg01:~$
knoxg@bnestg01:~$ gpg --list-keys
gpg: directory `/home/knoxg/.gnupg' created
gpg: new configuration file `/home/knoxg/.gnupg/gpg.conf' created
gpg: WARNING: options in `/home/knoxg/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/home/knoxg/.gnupg/pubring.gpg' created
gpg: /home/knoxg/.gnupg/trustdb.gpg: trustdb created
knoxg@bnestg01:~$
knoxg@bnestg01:~$
knoxg@bnestg01:~$ gpg --list-keys
knoxg@bnestg01:~$ gpg --gen-key --no-use-agent
gpg (GnuPG) 1.4.9; Copyright (C) 2008 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) DSA and Elgamal (default)
   (2) DSA (sign only)
   (5) RSA (sign only)
Your selection? 1
DSA keypair will have 1024 bits.
ELG-E keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)
Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 0
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name: Greg Knox
Email address: knoxg@randomnoun.com
Comment:
You selected this USER-ID:
    "Greg Knox <knoxg@randomnoun.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.

You don't want a passphrase - this is probably a *bad* idea!
I will do it anyway.  You can change your passphrase at any time,
using this program with the option "--edit-key".

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++++++++++++++++++++++..++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++.+++++++++++++++++++++++++++++++++++.++++++++++>++++++++++.............+++++

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 300 more bytes)
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++..+++++.+++++.+++++..+++++.+++++.+++++.+++++++++++++++...+++++.+++++..+++++..+++++++++++++++.+++++++++++++++.+++++..+++++++++++++++.++++++++++.+++++.++++++++++.+++++>++++++++++......................................................................+++++^^^
gpg: key A21A1486 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
pub   1024D/A21A1486 2013-08-26
      Key fingerprint = A411 49F8 40FD A387 DF86  94F1 443D 95BD A21A 1486
uid                  Greg Knox <knoxg@randomnoun.com>
sub   2048g/9A7FC682 2013-08-26

knoxg@bnestg01:~$ gpg --list-keys
/home/knoxg/.gnupg/pubring.gpg
------------------------------
pub   1024D/A21A1486 2013-08-26
uid                  Greg Knox <knoxg@randomnoun.com>
sub   2048g/9A7FC682 2013-08-26

knoxg@bnestg01:~$ gpg --list-secret-keys
/home/knoxg/.gnupg/secring.gpg
------------------------------
sec   1024D/A21A1486 2013-08-26
uid                  Greg Knox <knoxg@randomnoun.com>
ssb   2048g/9A7FC682 2013-08-26

knoxg@bnestg01:~$
knoxg@bnestg01:~$ gpg --edit-key A21A1486
gpg (GnuPG) 1.4.9; Copyright (C) 2008 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  1024D/A21A1486  created: 2013-08-26  expires: never       usage: SC
                     trust: ultimate      validity: ultimate
sub  2048g/9A7FC682  created: 2013-08-26  expires: never       usage: E
[ultimate] (1). Greg Knox <knoxg@randomnoun.com>

Command> quit

knoxg@bnestg01:~$ gpg --keyserver hkp://pool.sks-keyservers.net --send-keys A21A1486
gpg: sending key A21A1486 to hkp server pool.sks-keyservers.net
knoxg@bnestg01:~$

Once you’ve got a PGP signing key, add signing to your build using the instructions here.
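
The usual shape of that configuration is a maven-gpg-plugin execution bound to the verify phase, sketched below. This is the commonly documented setup rather than necessarily the exact one used for this project, and the plugin version and execution id are assumptions.

```xml
<!-- Sketch only: goes inside <build><plugins>; version and execution id are assumptions -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-gpg-plugin</artifactId>
  <version>1.4</version>
  <executions>
    <execution>
      <id>sign-artifacts</id>
      <!-- sign after the artifacts are packaged, before install/deploy -->
      <phase>verify</phase>
      <goals>
        <goal>sign</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```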

If you’re anything like me, then you’ll find that you’ll need to completely rebuild your staging server with new versions of Java and Maven, but that’s OK since it was running an old, unsupported version of Debian Lenny, so you probably needed to get round to doing that anyway.

You’ll probably also find yourself trying a few dozen ways of getting that plugin to work, before realising that you need to add some undocumented elements to your ~/.m2/settings.xml file.

Copy into the OSSRH staging repository

Since you’re using maven, you’ve probably already got some horribly complex build process surrounding it just to make it more manageable. I use vmaint. It’s tops.

The steps you want to add to your release process should be something similar to the following:

  • Copy all the artifacts you’ve built into a temporary directory (because maven doesn’t let you deploy from your local repository)
  • Run these commands:
    knoxg@bnestg01:~$ /usr/bin/mvn --batch-mode -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=/tmp/bandcamp-api-client-0.0.14.pom -Dfile=/tmp/bandcamp-api-client-0.0.14.jar gpg:sign-and-deploy-file
    knoxg@bnestg01:~$ /usr/bin/mvn --batch-mode -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=/tmp/bandcamp-api-client-0.0.14.pom -Dfile=/tmp/bandcamp-api-client-0.0.14-sources.jar -Dclassifier=sources gpg:sign-and-deploy-file
    knoxg@bnestg01:~$ /usr/bin/mvn --batch-mode -Durl=https://oss.sonatype.org/service/local/staging/deploy/maven2/ -DrepositoryId=sonatype-nexus-staging -DpomFile=/tmp/bandcamp-api-client-0.0.14.pom -Dfile=/tmp/bandcamp-api-client-0.0.14-javadoc.jar -Dclassifier=javadoc gpg:sign-and-deploy-file
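
The -DrepositoryId value in those commands has to match a server entry in ~/.m2/settings.xml, which is where maven picks up the OSSRH credentials. A minimal sketch, with a placeholder password:

```xml
<!-- ~/.m2/settings.xml (sketch only; the password is a placeholder) -->
<settings>
  <servers>
    <server>
      <!-- same id as -DrepositoryId on the command line -->
      <id>sonatype-nexus-staging</id>
      <!-- the nexus username from the OSSRH ticket -->
      <username>knoxg</username>
      <password>your-ossrh-password</password>
    </server>
  </servers>
</settings>
```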
    

Alternatively, you can use the web interface to manually upload the artifacts to the staging repository. Here’s some background information on staging repositories if you’re interested.

Close and release the OSSRH staging repository

  • log into the OSSRH Nexus repository
  • check that the staging repository exists and has the files you uploaded (in this case, the pom, client jar, the sources jar and the javadoc jar)
  • select your staging repository and click the ‘Close’ button on the toolbar
  • type a message into the ‘Close Confirmation’ box
  • check that the Central Sync Requirement Rules have passed
  • click the ‘Refresh’ button on the toolbar, which should then allow you to
  • select your staging repository and click the ‘Release’ button on the toolbar
  • type a message into the ‘Release Confirmation’ box. If the ‘Automatically drop’ checkbox is selected (which it is by default), then your staging repository will be removed from the list after it has been released (it will still get synced to central though).

These steps are shown in the screenshots below (click each screenshot for a closer look):

1) Check files for release
2) Close confirmation
3) Closed repository activity tab
4) Release confirmation
5) Released repository activity tab (in-progress)



If things don’t go hunky-dory (say you’ve forgotten to document a class or include all the required licenses, or you’ve inadvertently left an API key in the source code), just click the ‘Drop’ button on the toolbar, fix the problem, re-release and deploy it to the staging repository, and repeat until everything’s looking better than average.

Once you’ve gone through that, all that’s left is to wait two hours, and see if it’s appeared in the central repository. (If you like, you can use that two hours to construct a list of verbs that convey the concept of copying a file).

For what it’s worth, the nginx index page at central appears to come up pretty quickly, but the artifact itself takes a little longer to become available.

Update the OSSRH JIRA ticket

I believe this only needs to be done after the first artifact, not for subsequent artifacts.

That’s it

So there you go. Including the com.randomnoun.bandcamp:bandcamp-api-client dependency from any old pom.xml file should now cause maven to automatically download the artifact for you.
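
For example, a dependency declaration along these lines (using the 0.0.15 release mentioned in this post) goes into the dependencies section of the consuming pom.xml:

```xml
<!-- inside the <dependencies> element of your own pom.xml -->
<dependency>
  <groupId>com.randomnoun.bandcamp</groupId>
  <artifactId>bandcamp-api-client</artifactId>
  <version>0.0.15</version>
</dependency>
```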

The release I’ve put up there (0.0.15) is reasonably complete, and should work, but will probably get a few more small changes as I come to grips with this whole central repository release process before I bump it to 1.0.0.

You’d be surprised how long that took to complete.

The Bandcamp API client mentioned above is now available via the com.randomnoun.bandcamp:bandcamp-api-client artifact, which can be directly referenced in your pom.xml from the maven central repository.

The Doxia HTML module mentioned above is now available via the com.randomnoun.maven.doxia:doxia-module-html artifact, which can also be directly referenced in your pom.xml from the maven central repository.

Update 25/9/2013: It’s in central now

[1] If I’d thought about this a fraction of a second more, I would have made bnedev03 appear on the tube map as ‘zone 1’ and external access out in the wilderness of ‘zone 2’ somewhere.

[2] I still needed to learn Apache Velocity though, which is possibly the worst templating language that has ever been devised. The Doxia Sink doesn’t allow advanced HTML usage, like, say, a DIV element contained within another DIV element, so you get to do all sorts of creative things with the handful of HTML elements it does recognise, which is the sort of constructive back-bending effort that will be familiar to anyone who has ever tried to write a page that renders correctly in more than two types of browser.
