CS 242 Fall 2010 : Assignment 3.1

This page last changed on Oct 20, 2010 by cemeyer2.

Assignment 3.1 – Binary file manipulation in C

This assignment is due in your SVN repository by 2010-10-21 08:00 CST by your section time in a directory named Assignment3.1.

You will be given a file of data. This file is a formatted set of data and data graphics transformation operations. This data file would be interpreted by a real-time system meant to display the items in the display list.

  • In Assignment 3.1, you will act as the customer writing stubs, descriptions, and tests for the functions that you want another student, your contractor, to write to aid in processing the binary data file.
  • In Assignment 3.2, you will act as a contractor to implement the functions given to you by another student, i.e. – your customer.
  • In Assignment 3.3, you will use the code written by the your contractor, written to your spec, to complete the assignment.

Tools

You are welcome to use whichever C development environment you prefer.

Mac OS

For the Mac, we recommend that you use either Apple’s Xcode, which is free if you register
with Apple, or Eclipse with CDT (C/C++ Development Tools) (which requires gcc be installed on your machine, and the easiest way to get it is to install XCode). On the eclipse webpage there is a download for a version of eclipse already set up for C/C++ developement, or you can install the CDT through the help menu->install new software->select Helios update site, then select CDT from the Programming Languages category. Once you launch Eclipse for the first time, it should automatically configure itself to use the proper libraries and compiler for C development (at least it did on my machine)

Windows

For Windows, we recommend you use either Visual Studio 2010 or Dev-C++. VS is free from the web store for engineers (http://webstore.illinois.edu click on the departments tab once you login).
You are welcome to use Eclipse, but unfortunately configuring the Eclipse C development environment can be tricky.

Linux

For Linux, we recommend using Eclipse with CDT, installed using the same method mentioned above for Mac. Before you install the CDT or download the C/C++ version of eclipse, make sure you have all the necessary build tools installed on your machine first, since it will allow eclipse to autoconfigure itself rather than you doing it manually after the fact. To install the necesary files on popular distributions:

  • Ubuntu/Debian: sudo apt-get install build-essential
  • Fedora/Red Hat: sudo yum groupinstall “Development Tools”
  • SuSE: zypper install -t pattern devel_C_C++
On all of the above mentioned platforms, try to stay away from using a text editor like notepad, textedit, or vim with manually built makefiles, since one of the goals of this course is to expose you to new tools that will hopefully make your life easier once you get to know them.

Structure of the data file

The data file will be composed of a header region followed by a data region The header region is contains two header blocks, and the data region contains a number of data blocks. Each data block represents either data coordinates, and transformations to be applied to data coordinates.

You must authenticate the file and successively trace through each data block.

Header Region

The header region contains an authentication block followed by an offset table block.

Authentication block

  1. A byte representing the authentication string length
  2. The byte-string: “CS242 Indirect Data”

Offset table block

  1. A byte representing the number of rows
  2. A byte representing the number of columns
  3. A list of (rows * columns) 2-byte offsets

The offsets contained in the table in the header are used to locate the start of the data blocks within the data region. Each offset represents the number of bytes after the end of the authentication block to the start the offset data block. The data blocks may or may not occur in memory in the order that their offsets occur.

A sequence of data blocks referenced from an offset is completed when you encounter a data return block.

Data Region

Following the header block is the data region containing the data blocks.

It is possible that there will be gaps in the data between the data blocks. It is also possible that the number of data blocks does not match the rows*columns. To successfully complete Assignment 3.x, you will need to be able to trace through the data file. It will not work to simply linearly scanning the data file – this will hurt you

Data blocks

Below are the byte formats of the possible data blocks. Each block begins with a one byte record id which determines its type:

Centroid

Centroid bytes
record id: 0x01 1
scaling factor 1
x 4
y 4
z 4
block size 14

A centroid sets the context for subsequent coordinates; i.e. coordinates are are offset relative to the current centroid stack. When you output coordinates, you should always output their absolute coordinates. If the same set of coordinated are “called” after a different centroid, you will output a different set of absolute coordinates. Whenever you encounter a new centroid it replaced the previous one.

If there were 4 successive calls to a set of coordinates you would output those coordinates 4 separate times.

Coordinates

Coordinate List bytes
record id: 0x0a 1
coords count 1
x 2
y 2
z 2
0 or more coords 6 / per coord
block size 6 * (coord count) + 2

Each list of 3D coordinates represent the raw data. Each coordinate is always relative to the current centroid.

When doing calculations with coordinates, you must cast up and down properly to handle overflow. Centroids are 32-bits and coordinates are 16-bits.

Scaling

Scaling Factor bytes
record id: 0x02 1
scaling factor 1
block size 2

When you encounter a scaling factor, you scale relative coordinates by shifting them left by the indicated number of bits. For example, a scaling factor of 5 will result in all x, y, and z coordinates to be shifted left 5 bits. Beware of shifting a coordinate value and causing an unintended overflow.

Calling & Rotation

Call Local Data bytes
record id: 0x0b 1
offset 2
block size 3

This operation jumps the trace to the data block at the specified offset. When a data return occurs, you must be able to return to the following data block.

  • The offset is a 16-bit offset in bytes from the address of the beginning of this data block to the address of the data block where you should continue your trace through the data.
Call Rotate Local Data bytes
record id: 0x1b 1
offset 2
rotation angle 2
block size 5

As with the previous operation, this operation jumps to the data block at the specified offset. However, it also causes a rotational transformation on the subsequent coordinates. When a data return occurs, it undoes the current rotational transformation, and returns like the previous operation. New rotation calls build upon previous rotation calls, so you must be able to handle the stacked rotational contexts.

  • The offset is a 16-bit offset in bytes from the address of the beginning of this data block to the address of the data block where you should continue your trace through the data.
  • The rotation angle ranges from 0 increasing around the circle clockwise through 65535, with 0 representing 12 o’clock. You will need to convert this value to radians which are measured
    counterclockwise.
Use the following equations to calculate a rotational transformation:

x' = x cos (theta) - y sin (theta)
y' = y cos (theta) + x sin (theta)

Data return

Data Return bytes
record id: 0x0f 1
block size 1

The data return pops your trace back to the position in the data block immediately following the previous call.

If there are no other return addresses on the data return stack, the trace returns to next offset in the offset table block.

Sample file

We have included a sample file, cs242data2.bin, for you to use to develop your work. It is in little endian format. We have provided a decoding of the file.

Output

Ultimately, you will need to output a human readable version of the data as if you traced an entire “frame” of a simulation. Some of the interpreted instructions will branch to other locations within the data and return to continue where they had left off.

You will need to output coordinates and indications of state changes. Coordinate values must be output in hexadecimal.

You must authenticate the binary data file and successively trace through each data block in the order in which its offset appears in the table. As you trace through the file you will output the encountered centroids, graphic transformations, and the values of absolute coordinates in a human readable form.

When a call occurs, you must:

  • convert the rotation angle (16-bit, relative to vertical, increasing clockwise) to normal radians
  • branch to the location
  • apply the rotation and scaling factor to relative coordinates
  • add to most-recent 32-bit centroid
  • output coordinates in hexadecimal

Addendum to Assignment 3

This data represents a hypothetical interpreted drawing language based on an actual drawing language.  The sizes of the data elements are intentionally unrealistic to force dealing with unusual data types.

Task:

What kind of human-readable output would be useful for data of this type?  There are two straightforward types of textual output that could be useful.  One could output a “disassembly” form of text or one could output a trace.  My intention is that you write a trace.  That is, follow the path for each data block as if it were a set of instructions to draw a picture.

A disassembly style of output could be useful for YOU if you were working with this data and needed to “see” what was actually created as opposed to what was intended by the creator of the data.

Example:

 You encounter the following data just after your verification string:

02 02 0e 00 94 00 a1 00 ae 00 1c 00 1e 00

This says there is a 2 x 2 table of 16-bit offsets (little endian) that follow.  This means that there are a total of four blocks of data at offsets: 000e, 0094, 00a1, 00ae.  These offsets are from the start of the table where the dimensions of the table are.

At the beginning of one of these data block you encounter:

01 00 00 00 0f 00 00 00 00 00 00 00 0e 00 0b 00 01 f0 00 0f

The 01 represents a “locus” with scaling factor of 0 and x,y,z = (0xf0000, 0, 0xe0000), followed by 0x0b which is a call to an offset of 0x100 (offset relative to the 0x0b) with a rotation angle of 0xf0 (which is measured clockwise from north on a range from 0 – 65536) followed by 0x0f (return from the data block).

What kind of output would be useful?

Locus  sf, x,y,z  (using actually values)
callLocal 0x100, 0xf0
<output of data in block>
“return”               // from callLocal
“return”               // from data block

When you encounter coordinate definitions,  it could be useful to output their raw coordinates followed by their transformed coordinates.  The transformed coordinates are coordinates AFTER scaling, after rotating and after the locus.  You must at least output their transformed coordinates.

Requirements for 3.1

Write specifications for a library of C functions that you will use to complete this assignment. You will give those specifications to your contractor in the form of comments describing the functionality you want and function “stubs”. A stub is the first line of the function declaration indicating the arguments and their order as well as the return type. In C these stubs are known as function prototypes. Your comments must be Doxygen compliant. Your commented prototype functions should be composed into a single valid .h header file.

Example

/**
* @file Assignment3.1.h
* @author Maurice Rabb
* @date 2010-03-02
*/

/**
* A trick to make a bool type in C.
* @see http://www.codeguru.com/forum/archive/index.php/t-238109.html
*/
typedef enum {false = 0, true = 1} bool;

/**
* Takes two integers and adds them together
* @param header a pointer to a header region
* @param isIndirect a boolean specifying whether the format is indirect data or not
* @return a boolean confirming whether or not the header is valid
*/
char authenticateHeader(byte* header, char isIndirect);

You will receive specifications from another student (your customer), and act as their contractor. Your customer and contractor should not be the same person.

In discussion section, as the customer, you will present your specification, and then pass it to your contractor. Acting as a contractor, be prepared to review your customer’s specification. During Assignment 3.2, you will act as a contractor to implement your customer’s specification. The second week you will present the code of your contractor.

What functions should you specify?

Remember, one of the core goals of this course is to learn the power of small, focused and well-named functions and methods. You should specify functions that would be both necessary and useful to process the data file, such as:

  • Functions to process each type of block.
  • Functions to transform points, such as: scaling, rotation, relative to absolute positioning, etc.
  • Functions to output the block elements as they occur, such as: that centroid or scaling factor has been encountered, display a coordinate in hexidecimal, etc.

Testing

You must write complete unit tests for this part of the assignment. This is analogous to what we asked you to do for assignment 1.1. You are free to use any unit test framework you wish, or just use assert statements in your code. Assignment 3.2 will require your contractor to implement your specification such that your unit tests all pass. It is profoundly in your best interest to write complete and thorough unit tests so that the amount of work you will have to do for assignment 3.3 will be minimized.

Rubric

Criterion Weight Notes
Code submission 2 final version in correct directory by 3pm 10/22/10
Basic Preparation 1  
Requirements Satisfaction 4  
Presentation 2  
Participation 2  
Effort 2  
Testing 6 High coverage tests
Overall Design 6 Your library could be used to easily trace through a data file described in the assignment
Naming 2  
Decomposition 2  
Documentation 6 Doxygen-compliant comments for all functions
Cleverness 1  

cs242data.bin (application/octet-stream)

cs242data2.bin (application/macbinary)
Document generated by Confluence on Feb 22, 2012 18:13

  1. No comments yet.

  1. No trackbacks yet.