Geo-tagging is fundamental to mapping environmental data. Luckily, GPS technology has become widely available and relatively inexpensive (roughly $10-40 per device). Many GPS models are available, and the best choice depends on the application and the level of integration desired.

  • Most smartphones are equipped with GPS and can be used for basic route logging, or can interface with another device through a more involved system.
  • GPS USB dongles, such as the Canmore GT-730FL-S, are capable of route logging or can interface with a computer through a serial port.
  • For a very tightly integrated system, one might want to go with an embedded GPS module such as those from Locosys or Canmore (or many other brands). These usually come in two flavours: surface-mount modules that require an external antenna, and modules with an integrated active antenna that plug into an on-board connector.

With the possible exception of cellphones, all these GPS devices speak a common language: a set of standardized and proprietary messages transmitted as text over a serial interface. The standardized messages are described by the NMEA standard and contain most of the useful information. Each message has a distinct label followed by a number of comma-separated fields. Two messages in particular are the most likely to contain the information needed for simple applications. Their descriptions, pasted directly from the linked website, are given below.

GGA – essential fix data which provide 3D location and accuracy data.


     GGA          Global Positioning System Fix Data
     123519       Fix taken at 12:35:19 UTC
     4807.038,N   Latitude 48 deg 07.038' N
     01131.000,E  Longitude 11 deg 31.000' E
     1            Fix quality: 0 = invalid
                               1 = GPS fix (SPS)
                               2 = DGPS fix
                               3 = PPS fix
                               4 = Real Time Kinematic
                               5 = Float RTK
                               6 = estimated (dead reckoning) (2.3 feature)
                               7 = Manual input mode
                               8 = Simulation mode
     08           Number of satellites being tracked
     0.9          Horizontal dilution of position
     545.4,M      Altitude, Meters, above mean sea level
     46.9,M       Height of geoid (mean sea level) above WGS84
     (empty field) time in seconds since last DGPS update
     (empty field) DGPS station ID number
     *47          the checksum data, always begins with *

RMC – NMEA has its own version of essential GPS PVT (position, velocity, time) data. It is called RMC, the Recommended Minimum, and looks similar to:


     RMC          Recommended Minimum sentence C
     123519       Fix taken at 12:35:19 UTC
     A            Status A=active or V=Void.
     4807.038,N   Latitude 48 deg 07.038' N
     01131.000,E  Longitude 11 deg 31.000' E
     022.4        Speed over the ground in knots
     084.4        Track angle in degrees True
     230394       Date - 23rd of March 1994
     003.1,W      Magnetic Variation
     *6A          The checksum data, always begins with *
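
Since both sentences are just comma-separated text, a first processing step is usually to split a sentence into its fields. The following is a minimal sketch of one way to do that in C; the function name nmea_split and its interface are made up for this illustration and do not come from any particular library. It splits by hand rather than with strtok(), because strtok() skips empty fields such as the two DGPS fields in the GGA example above.

```c
#include <stddef.h>

/*
 * Split an NMEA sentence in place into its comma-separated fields.
 * Commas (and the '*' that starts the checksum) are overwritten with '\0',
 * and fields[] receives a pointer to the start of each field.
 * Empty fields are kept as empty strings.
 * Returns the number of fields stored (at most max_fields).
 * Note: nmea_split is a hypothetical helper written for this example.
 */
int nmea_split(char *sentence, char *fields[], int max_fields)
{
  int n = 0;
  char *p = sentence;

  if (sentence == NULL || max_fields <= 0)
    return 0;

  if (*p == '$')            /* skip the start delimiter */
    p++;

  fields[n++] = p;          /* first field is the label, e.g. GPRMC */
  for (; *p != '\0'; p++) {
    if (*p == '*') {        /* checksum follows: end of the data fields */
      *p = '\0';
      break;
    }
    if (*p == ',') {
      *p = '\0';            /* terminate the current field */
      if (n == max_fields)
        break;              /* no room left: stop splitting */
      fields[n++] = p + 1;  /* next field starts after the comma */
    }
  }
  return n;
}
```

Applied to a writable copy of the RMC sentence described above, this yields 12 fields: fields[0] is the label GPRMC, fields[2] the status, fields[3] and fields[4] the latitude and its hemisphere, and fields[9] the date.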

One peculiarity to note is the format in which latitude and longitude are given:

Latitude:  DDMM.MMMM
Longitude: DDDMM.MMMM

The digits in the ‘D’ slots represent an integer number of degrees, while the ‘M’ slots represent a decimal number of minutes. More details about the degree/minute/second representation can be found on Wikipedia. In a nutshell, a value can be converted to decimal degrees by dividing the minute part by 60 and adding it to the degree part:

DD + MM.MMMM/60.0

for the latitude, for example.
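
As an illustration, the conversion could be sketched in C as follows. The function name nmea_to_decimal_degrees is made up for this example, and returning south and west as negative values is an assumed (though common) convention rather than something the NMEA format dictates.

```c
#include <stdlib.h>

/*
 * Convert an NMEA coordinate field (DDMM.MMMM or DDDMM.MMMM) and its
 * hemisphere letter ('N', 'S', 'E' or 'W') to signed decimal degrees.
 * Assumption: south and west are returned as negative values.
 * Note: nmea_to_decimal_degrees is a hypothetical helper for this example.
 */
double nmea_to_decimal_degrees(const char *field, char hemisphere)
{
  double value = strtod(field, NULL);        /* e.g. 4807.038 */
  int degrees = (int)(value / 100.0);        /* e.g. 48 */
  double minutes = value - degrees * 100.0;  /* e.g. 7.038 */
  double result = degrees + minutes / 60.0;

  if (hemisphere == 'S' || hemisphere == 'W')
    result = -result;

  return result;
}
```

With the values from the sentences above, 4807.038 N becomes 48 + 7.038/60, roughly 48.1173 degrees, and 01131.000 E becomes 11 + 31/60, roughly 11.5167 degrees. Dividing by 100 separates the degree digits for both latitude and longitude, so the same function handles the two-digit and three-digit cases.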

The last field of every sentence is a checksum of the whole message. The checksum is the XOR of all the ASCII character bytes between ‘$’ and ‘*’ (both excluded), written as two hexadecimal digits. Sample C code to compute the checksum is given below.

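#include <stdio.h>

/* Compute the XOR checksum of the first N bytes of the input array */
unsigned char GPS_checksum(const char *s, int N)
{
  int i;
  unsigned char chk = 0;

  for (i = 0; i < N; i++)
    chk ^= (unsigned char)s[i];

  return chk;
}

/* The example main routine */
int main(void)
{
  char sentence[] = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A";

  /* Skip the leading '$' and stop before '*': 64 payload characters */
  unsigned char chk = GPS_checksum(sentence + 1, 64);

  /* The checksum is transmitted as two hexadecimal digits */
  printf("The checksum is %02X\n", chk);

  return 0;
}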
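
On the receiving side, the same XOR can be used to reject corrupted sentences: recompute it over the characters between ‘$’ and ‘*’ and compare it with the two hexadecimal digits that follow the ‘*’. Here is a minimal sketch, assuming the whole sentence is available as a NUL-terminated string; the function name nmea_checksum_ok is made up for this example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Check a complete NMEA sentence of the form "$<payload>*HH".
 * Returns 1 if the XOR of the payload bytes equals the two hexadecimal
 * digits HH after '*', and 0 otherwise (including malformed input).
 * Note: nmea_checksum_ok is a hypothetical helper for this example.
 */
int nmea_checksum_ok(const char *sentence)
{
  const char *star;
  const char *p;
  char hex[3];
  unsigned char chk = 0;

  if (sentence == NULL || sentence[0] != '$')
    return 0;

  star = strchr(sentence, '*');
  if (star == NULL)
    return 0;

  /* Two hexadecimal digits must follow the '*' */
  if (!isxdigit((unsigned char)star[1]) || !isxdigit((unsigned char)star[2]))
    return 0;

  /* XOR every byte strictly between '$' and '*' */
  for (p = sentence + 1; p < star; p++)
    chk ^= (unsigned char)*p;

  hex[0] = star[1];
  hex[1] = star[2];
  hex[2] = '\0';

  return chk == (unsigned char)strtoul(hex, NULL, 16);
}
```

Anything after the two checksum digits, such as a trailing carriage return and line feed, is ignored by this check.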


To read the output of the GPS from an Arduino board, for example, there are two things to do. First, connect the module to the microcontroller. Most GPS modules connect through a serial interface and thus have four wires.

  • VCC connects to 5V or 3.3V depending on the module,
  • Ground connects to ground (GND),
  • Tx connects to Rx of the Arduino board (pin 0),
  • Rx connects to Tx of the Arduino board (pin 1).

The second step is to write the code to read the serial output of the GPS. Extensive examples can be found by googling a little. Notable ones are:

  • The Adafruit GPS tutorial contains help both on connecting the GPS module and on using it with an Arduino.
  • TinyGPS is a library for Arduino that makes using a GPS module relatively painless.
  • I have also written a small GPS library for a project, and the files (GPS.h, GPS.cpp) can be cherry-picked from the project repository.
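
Independently of these libraries, the core of the reading step is to collect the incoming serial bytes until a complete sentence has arrived. The following plain C sketch shows one possible way to do it; the names nmea_reader and nmea_reader_feed are made up for this example, and the 82-character buffer size assumes the commonly cited NMEA maximum sentence length. On an Arduino, the bytes would typically come from Serial.read() whenever Serial.available() reports data.

```c
#include <stddef.h>

/* Assumed maximum sentence length, including '$' and the line ending */
#define NMEA_MAX_LEN 82

/* Accumulates incoming bytes until a full sentence is available.
 * Note: nmea_reader is a hypothetical type written for this example. */
struct nmea_reader {
  char buf[NMEA_MAX_LEN + 1];
  size_t len;
};

/*
 * Feed one received byte to the reader.
 * Returns 1 when a complete sentence (without CR/LF) is ready in r->buf,
 * and 0 while a sentence is still being received.
 */
int nmea_reader_feed(struct nmea_reader *r, char c)
{
  if (c == '$') {               /* start of a new sentence: reset */
    r->len = 0;
    r->buf[r->len++] = c;
    return 0;
  }
  if (r->len == 0)              /* ignore bytes until the first '$' */
    return 0;
  if (c == '\r')                /* skipped; '\n' ends the sentence */
    return 0;
  if (c == '\n') {
    r->buf[r->len] = '\0';
    r->len = 0;
    return 1;
  }
  if (r->len < NMEA_MAX_LEN)
    r->buf[r->len++] = c;
  else
    r->len = 0;                 /* overlong line: drop it and resynchronise */
  return 0;
}
```

The reader starts from a zero-initialised struct and is fed one byte at a time. Each time it returns 1, r->buf holds one sentence starting with ‘$’, ready to be checked and parsed before the next sentence begins to arrive.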

I hope this little primer is enough to point you in the right direction for getting started with GPS.
