Some Julia Tidbits from FIT.jl: Enums, and my FIT.jl progress update.

Some things that I learned about from my most recent session: Julia Enums

Julia has an @enum macro for defining an enumerated type. Working at Amazon, I’ve dealt with enums a lot, especially in the Java world, as an easy way to represent categories of things. Coming from Python, though, I hadn’t used the concept very much.

Here are Julia’s excellent docs for enums: https://docs.julialang.org/en/v1/base/base/#Base.Enums.@enum

Here is where an enumerated type came in handy for me. The Garmin FIT protocol specifies that rows in the binary format have different types, based on certain bits in each row’s one-byte header. For example:

00000010 <- 1 in bit position 7 (yes, Julia is 1-indexed) with a 0 in bit 8 indicates that this is a definition message.
00000000 <- 0 in both bit 7 and bit 8 indicates a data message (defined by the preceding definition message).
00000001 <- 1 in bit 8 means that this is a timestamp message, and should be interpreted differently.
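To make the bit positions above concrete, here is a minimal sketch of turning a header byte into a 1-indexed bit vector, least significant bit first. The helper name `byte_to_bits` is hypothetical and only stands in for whatever FIT.jl actually uses.

```julia
# Hypothetical helper: expand one byte into eight bits, least significant bit first,
# so index 1 holds the lowest bit and index 8 holds the highest bit.
byte_to_bits(b::UInt8) = UInt8[(b >> i) & 0x01 for i in 0:7]

bits = byte_to_bits(0x40)   # 0x40 has only its second-highest bit set
println(join(bits))         # written out in index order, position 7 is the 1
```

With this ordering, the definition flag lands at index 7 and the timestamp flag at index 8, matching the three patterns listed above.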

I translated these different messages into an enum type like this:

@enum RecordType begin
    definition = 1
    data = 2
    timestamp = 3
end

mutable struct RecordHeader
  type::RecordType
  local_mesg_type::Int64
  time_offset::Union{Nothing, Int64}
end

The RecordHeader struct now holds a type field using the RecordType enum. When parsing these bits in the header I set the type of the RecordHeader struct I create accordingly:

function decode_header(v::Vector{UInt8})::RecordHeader
  type = if v[8] == 1
      timestamp
  elseif v[7] == 1
      definition
  else
      data
  end

  if type == timestamp
      local_mesg_type = sum_bits(v[6:7])
      time_offset = sum_bits(v[1:5])
      RecordHeader(type, local_mesg_type, time_offset)
  else
      local_mesg_type = sum_bits(v[1:4])
      RecordHeader(type, local_mesg_type, nothing)
  end
end

It’s not particularly complicated, but it does encode the knowledge of the Garmin FIT protocol. The function takes in a vector of UInt8 numbers (these are my bits; I could have used Bool but didn’t) and decodes it accordingly. The interesting thing here is the type assignment. Conditionals can be used on the right-hand side of an assignment in Julia because they are expressions: an if block evaluates to the value of the last expression in the branch that ran. So, after checking the appropriate bits, we assign the matching enum value to type.
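The if-as-expression pattern can be tried in isolation. The sketch below is self-contained and uses hypothetical names (`HeaderKind`, `sum_bits_sketch`) so it does not clash with the real definitions; in particular, the behaviour of `sum_bits` is an assumption (index 1 treated as the least significant bit), since its body is not shown in this post.

```julia
# Hypothetical enum for this sketch only; the post's real type is RecordType.
@enum HeaderKind kind_definition=1 kind_data=2 kind_timestamp=3

# Assumed behaviour of sum_bits: index 1 is the least significant bit.
function sum_bits_sketch(bits::AbstractVector{UInt8})::Int64
    total = 0
    for (i, b) in enumerate(bits)
        total += Int64(b) << (i - 1)
    end
    return total
end

# Bit 7 set (definition), and the low four bits 1,0,1,0 encode 1 + 4 = 5.
header_bits = UInt8[1, 0, 1, 0, 0, 0, 1, 0]

# The whole if block is an expression, so its value can be assigned directly.
kind = if header_bits[8] == 1
    kind_timestamp
elseif header_bits[7] == 1
    kind_definition
else
    kind_data
end

local_mesg_type = sum_bits_sketch(header_bits[1:4])
println(kind, ", local message type ", local_mesg_type)
```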

I use the enum values directly by name. A quick trip to the REPL shows that each value is an instance of the enum type, and that the names are defined in the scope where the @enum macro was called.

julia> @enum MyEnum value=1 value2=2 value3=3

julia> value
value::MyEnum = 1

julia> value2
value2::MyEnum = 2

The @enum I wrote is defined globally, so these values are present within global scope.
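A few functions from Base are handy when poking at an enum in the REPL. This is a small sketch using a throwaway enum (`Fruit` is a made-up name, not part of FIT.jl):

```julia
@enum Fruit apple=1 banana=2 cherry=3

println(instances(Fruit))   # every value of the enum, as a tuple
println(Int(banana))        # the integer behind a value
println(Fruit(3))           # look a value up from its integer
println(Symbol(apple))      # the value's name as a Symbol
```

Constructing a value from an integer that has no matching entry, such as `Fruit(4)`, throws an error, which is useful when the integer comes from untrusted file contents.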

Attempting to Read Rows

So, Garmin FIT files are generally structured like so: a file header (14 bytes in current files; older files may use a 12-byte header), then `n` rows, where n is however many fit within the data size declared in the file header. Each row is one of 3 types: a definition row, a data row, or a timestamp row. So far I have written code to read definition rows.
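For orientation, here is a rough sketch of reading the first 12 bytes of that file header. The field layout is my reading of the FIT SDK documentation and should be checked against the spec; the struct and function names are hypothetical and are not the ones used in FIT.jl.

```julia
# Hypothetical container for the header fields read in this sketch.
struct FitHeaderSketch
    header_size::UInt8
    protocol_version::UInt8
    profile_version::UInt16
    data_size::UInt32      # number of bytes of rows that follow the header
    data_type::String      # expected to be ".FIT"
end

function parse_header_sketch(bytes::Vector{UInt8})
    io = IOBuffer(bytes)
    header_size = read(io, UInt8)
    protocol_version = read(io, UInt8)
    # Multi-byte header fields are little-endian; ltoh converts to host order.
    profile_version = ltoh(read(io, UInt16))
    data_size = ltoh(read(io, UInt32))
    data_type = String(read(io, 4))
    FitHeaderSketch(header_size, protocol_version, profile_version, data_size, data_type)
end

# Made-up example bytes: a 14-byte header declaring 32 bytes of row data.
example = UInt8[0x0e, 0x10, 0x2c, 0x08, 0x20, 0x00, 0x00, 0x00,
                0x2e, 0x46, 0x49, 0x54, 0x00, 0x00]
hdr_sketch = parse_header_sketch(example)
println(hdr_sketch.data_type, " with ", hdr_sketch.data_size, " data bytes")
```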

Definition rows contain field definitions for all messages that come after them. So, a definition row will “define” the data rows that come later by laying out exactly what types are present in the data message, and what global message type the data is.

An example is heart rate data. The definition would tell us that all of the messages that follow are heart rate messages, and that within each one you’ll get beats per minute (in integer format), HRV (in integer format), etc. So the definition message sets both the data message type and the data message fields.

Here’s how I implemented the row reader, which doesn’t really work yet, ha.

You start with a wonderful loop, that reads the file row by row.

# Loop through bytes_read until done
while f.bytes_read < f.header.size
    # Read header byte.
    header_bits = bit_vector(read_bytes!(f, 1); little_endian=true)
    header = decode_header(header_bits)
    record = DataRecord(header, read_record_content!(f, header))

    if isnothing(f.body)
        f.body = [record]
    else
        push!(f.body, record)
    end
end

f is a struct I have written to hold all relevant information about reading the FIT file. There should be one per file when running this code. I tried to follow design patterns that I had learned in Go, about keeping necessary information within a struct and using this struct as an argument to subsequent functions. This helps with dependency injection, and a multitude of other useful things one can do as a software engineer.

So, every time `read_bytes!` is called within my code, we are incrementing f.bytes_read by that amount. In theory, we should eventually read the exact size of the file, and be able to complete the read.
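The body of `read_bytes!` is not shown in this post, so the following is only a guess at its shape: a reader struct that wraps an IO stream and counts what it has consumed. `ReaderSketch` and `read_bytes_sketch!` are hypothetical names.

```julia
# Hypothetical reader state: the stream plus a running count of consumed bytes.
mutable struct ReaderSketch
    io::IO
    bytes_read::Int64
end

function read_bytes_sketch!(r::ReaderSketch, n::Integer)::Vector{UInt8}
    bytes = read(r.io, n)                      # returns at most n bytes
    length(bytes) == n || throw(EOFError())    # fail loudly on a short read
    r.bytes_read += n
    return bytes
end

r = ReaderSketch(IOBuffer(UInt8[0x01, 0x02, 0x03, 0x04]), 0)
first_two = read_bytes_sketch!(r, 2)
println(first_two, " consumed so far: ", r.bytes_read)
```

Failing on a short read is a design choice in this sketch: it turns a truncated file into an error at the point of the read instead of a loop that never reaches the declared size.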

Now, each row has a header that consists of 1 byte. We read that and decode it to decide what type of row we are dealing with.

Finally, read the body of our row, which requires both the file itself (contained in that same struct f as a file pointer) and the header (which determines how much we are reading of that row).

if hdr.type == definition
    # Read all bytes first
    header_bytes = read_bytes!(f, 5)
    header_field_definitions = read_bytes!(f, header_bytes[end] * 3) # sz * field size
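    # Caveat: this always reads a developer field count. As far as I can tell from the
    # FIT spec, that byte is only present when the record header's developer data flag is set.
    header_developer_flag = read_bytes!(f, 1)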
    if header_developer_flag[1] > 0x00
        header_developer_fields = read_bytes!(f, header_developer_flag[1] * 3)
    else
        header_developer_fields = UInt8[]
    end

    # Parse the bytes
    global_mesg_type = get_mesg_num_string(byte_vec_to_int(header_bytes[3:4]))
    field_definitions = [parse_field(global_mesg_type, header_field_definitions[i:i+2]) for i in 1:3:length(header_field_definitions)]
    # Slice with the loop index i so that each developer field is parsed, not just the first one.
    developer_field_definitions = [parse_field(global_mesg_type, header_developer_fields[i:i+2]) for i in 1:3:length(header_developer_fields)]

    # Get the total row size from parsing the fields.
    row_sz = sum(get_row_size.([field_definitions, developer_field_definitions]))

    # Associate the global message type with this local message type.
    f.mesg_map[hdr.local_mesg_type] = LocalMesgTypeMapping(row_sz, global_mesg_type)

    DefinitionBody(
        header_bytes[2] == 0 ? "littleendian" : "bigendian",
        global_mesg_type,
        header_bytes[end],
        field_definitions
    )
else
    row_bytes = read_bytes!(f, f.next_row_sz)
end

Now we read the row based on the header type, and read all fields accordingly. Under the definition branch, I’ve split the reading and parsing into two separate parts. Definition rows always start with a fixed 5 bytes, so I read those first. Next come the field definitions: the last of those 5 bytes holds the number of field definitions, and each one is 3 bytes long. Finally, I read the developer fields if any are present.

The two things to talk about here are row_sz and the mesg_map assignment in this code block.

row_sz shows off an incredibly cool feature of Julia: broadcasting. By suffixing the function name with a dot, you apply the function to each element of a collection. Julia can also fuse chained dot calls into a single loop, so no temporary arrays are needed in between. In this case, field_definitions and developer_field_definitions hold the parsed 3-byte field definitions, and the second byte of each is always the size of the data field that you should expect. row_sz is intended to be the total number of bytes that we should read for data rows of this type.

function get_row_size(field_definitions::Vector{FieldDefinition})::Int64
    sum([def.sz_bytes for def in field_definitions])
end

Now, because my field definitions are already parsed, sum can be called on a list comprehension over those field definitions.
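The dot call and the comprehension can be tried on their own. This sketch uses a stripped-down stand-in for the field definition type (`FieldSketch` and `row_size_sketch` are hypothetical names) with made-up sizes:

```julia
# Hypothetical stand-in for FieldDefinition: only the size is kept.
struct FieldSketch
    sz_bytes::Int64
end

# Typed comprehension so that summing an empty list still gives 0.
row_size_sketch(defs::Vector{FieldSketch})::Int64 = sum(Int64[d.sz_bytes for d in defs])

fields = [FieldSketch(1), FieldSketch(2), FieldSketch(4)]
dev_fields = FieldSketch[]   # no developer fields in this example

# The dot applies row_size_sketch to each inner vector in turn.
sizes = row_size_sketch.([fields, dev_fields])
total = sum(sizes)
println(sizes, " total: ", total)
```

The broadcast here is equivalent to `map(row_size_sketch, [fields, dev_fields])`; the dot syntax is just the shorter spelling.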

To remember what these definition rows define, Garmin FIT files use a number called the Local Message Type. That local message type is part of the record header (the 1-byte bit vector from earlier) that we decode. In order to remember the row size and the global message type of a local message type, we need a mapping. I use the `f.mesg_map` field to store that mapping for the whole file so that I can reference it later.
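As a rough sketch of how such a mapping could be consulted when a data row arrives, assuming a plain Dict keyed by local message type. `MappingSketch` is a hypothetical stand-in for `LocalMesgTypeMapping`, and the values are made up.

```julia
# Hypothetical stand-in for LocalMesgTypeMapping.
struct MappingSketch
    row_sz::Int64
    global_mesg_type::String
end

mesg_map_sketch = Dict{Int64, MappingSketch}()

# While reading a definition row with local message type 0:
mesg_map_sketch[0] = MappingSketch(7, "record")

# Later, a data row with local message type 0 says how many bytes to read.
mapping = mesg_map_sketch[0]
println("read ", mapping.row_sz, " bytes of ", mapping.global_mesg_type)

# A data row whose local type was never defined has no entry to look up.
undefined = get(mesg_map_sketch, 3, nothing)
println(isnothing(undefined))
```

Using `get` with a default makes the undefined case explicit, where indexing with `mesg_map_sketch[3]` would throw a KeyError.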

And there you have it! Thanks for reading.

I run WindleWare! Feel free to reach out!
